Chapter 7: Engineer your code

Note

Early draft release: This chapter is not yet available.

Spark configuration and managing different environments
Call a local .py file
See the built-in libraries
Include public libraries
Create a custom library
Use a custom library from resources
Upload a custom library in an environment
Organize your code (https://github.com/josephmachado/data_engineering_best_practices_log/tree/8abf0c8a8293c7ad2ff8c7afc8d2f6f95bad6020)
A section about performance
- %%time and %%timeit
- Conditional Actions (collect, count, show, take) and logging
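
As a rough illustration of the "conditional actions" item above: Spark actions such as `count()` trigger a full job, so one option is to run them only when the resulting log line would actually be emitted. The sketch below is an assumption about how this section might be developed, not the chapter's final method. The logger name `pipeline` and the helper `log_row_count` are hypothetical.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("pipeline")


def log_row_count(df, label, log=logger):
    """Log the row count of `df` only when DEBUG logging is enabled.

    `df` is assumed to expose a Spark-style `count()` action. Because
    `count()` launches a job, it is skipped entirely when nobody would
    read the message.
    """
    if log.isEnabledFor(logging.DEBUG):
        n = df.count()  # the expensive action runs only in this branch
        log.debug("%s: %d rows", label, n)
        return n
    return None
```

With the logger set to `INFO`, calling `log_row_count(df, "after join")` returns `None` without touching the DataFrame. Switching the level to `DEBUG` runs the count and logs it. The same guard would apply to `collect`, `show`, and `take`.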
Logging
- https://learn.microsoft.com/en-us/fabric/data-engineering/azure-fabric-diagnostic-emitters-azure-storage
- Extract the logs and process them: https://fabric.guru/extracting-fabric-spark-driver-logs-using-api

mssparkutils
- https://learn.microsoft.com/en-us/fabric/data-engineering/microsoft-spark-utilities
- Run a notebook from code, with parameters
- Run several notebooks and use a DAG
- Exit value
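
A minimal sketch of what the DAG for running several notebooks could look like, assuming the dictionary layout described in the Microsoft Spark Utilities documentation linked above. The key names (`activities`, `path`, `timeoutPerCellInSeconds`, `args`, `dependencies`, `timeoutInSeconds`, `concurrency`) should be checked against that page. The notebook names `ingest` and `transform`, the `run_date` parameter, and the helper `build_dag` are hypothetical.

```python
def build_dag(notebooks, concurrency=2, timeout_seconds=3600):
    """Build a DAG dictionary for running several notebooks.

    `notebooks` is a list of (name, args, dependencies) tuples, where
    `dependencies` lists the names of notebooks that must finish first.
    """
    activities = []
    for name, args, dependencies in notebooks:
        activities.append({
            "name": name,                    # unique activity name
            "path": name,                    # notebook to run (assumed same as name)
            "timeoutPerCellInSeconds": 600,  # illustrative value
            "args": args,                    # parameters passed to the notebook
            "dependencies": dependencies,    # activities that must complete first
        })
    return {
        "activities": activities,
        "timeoutInSeconds": timeout_seconds,
        "concurrency": concurrency,
    }


# Hypothetical two-step pipeline: "transform" waits for "ingest".
dag = build_dag([
    ("ingest", {"run_date": "2024-01-01"}, []),
    ("transform", {"run_date": "2024-01-01"}, ["ingest"]),
])

# Inside a Fabric notebook, where mssparkutils is available, the calls
# would look roughly like this:
#   mssparkutils.notebook.run("ingest", 600, {"run_date": "2024-01-01"})
#   mssparkutils.notebook.runMultiple(dag)
#   mssparkutils.notebook.exit("some value")  # value returned to the caller
```

The helper only assembles a plain dictionary, so it can be inspected or unit-tested outside Fabric before being passed to `runMultiple`.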