This post explains how to calculate the total storage size of an Azure Data Lake Storage (ADLS) Gen1 or Gen2 folder in PySpark using Azure Databricks or Azure Synapse Analytics. Assumptions: ADLS Gen1 or Gen2 is already set up and mounted in Azure Databricks or Azure Synapse Analytics. The below…
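As a rough illustration of the folder-size calculation this post describes, here is a minimal sketch that walks a folder recursively and sums the file sizes. It is not the post's own code. The helper name `folder_size_bytes`, the `Entry` stand-in type, and the rule that directory names end with "/" are assumptions made for the example; the listing function is passed in so the sketch can run outside a cluster.

```python
from collections import namedtuple

# Hypothetical stand-in for the entries a file-listing call returns
# (assumed fields: path, name, size in bytes).
Entry = namedtuple("Entry", ["path", "name", "size"])


def folder_size_bytes(path, ls):
    """Recursively sum file sizes under `path`, using the listing function `ls`."""
    total = 0
    for entry in ls(path):
        # Assumption: directory entries have names ending with "/".
        if entry.name.endswith("/"):
            total += folder_size_bytes(entry.path, ls)
        else:
            total += entry.size
    return total
```

On a Databricks cluster with the storage mounted, this could be called as `folder_size_bytes("/mnt/<mount-name>/<folder>", dbutils.fs.ls)`, assuming the listing entries expose `path`, `name`, and `size` as above. Very deep or very wide folders may be slow to walk this way, since each directory is listed in turn.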
Category: databricks
Run Databricks Notebooks in Parallel - Python
Databricks is an industry-leading, cloud-based data engineering tool used for processing and transforming massive quantities of data and for exploring that data through machine learning models. You can use the Databricks dbutils library to run a single notebook or to run multiple notebooks in parallel. Run one notebook from another notebook: You can use the…
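One way to run several notebooks in parallel is to submit each run to a thread pool and collect the results. The sketch below is an assumption about how this could look, not the post's own code. The helper name `run_notebooks_in_parallel` and its parameters are hypothetical, and the function that actually runs a notebook is passed in so the example works outside Databricks.

```python
from concurrent.futures import ThreadPoolExecutor


def run_notebooks_in_parallel(notebook_paths, run_notebook, max_workers=4):
    """Run each notebook path through `run_notebook` concurrently.

    Returns a dict mapping each path to the value `run_notebook` returned.
    An exception raised by any run is re-raised when its result is read.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every run first so they execute concurrently.
        futures = {path: pool.submit(run_notebook, path) for path in notebook_paths}
        # Then wait for each result in submission order.
        return {path: future.result() for path, future in futures.items()}
```

Inside a Databricks notebook, the runner could be something like `lambda path: dbutils.notebook.run(path, 600)`, where 600 is an illustrative timeout in seconds. Threads suit this case because each call mostly waits on the child notebook rather than doing work in the driver's Python process.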