Most of the customers I talk to are directly or indirectly asking how to scale their workloads using Databricks. It has become the new normal for data processing in the cloud. If you are using or plan to use Azure Databricks, this post covers some interesting things to investigate as you start. These are not technical deep-dives, but they are good for developers and architects to know.

1. Use Interactive Clusters
Teams spend a lot of time exploring data and looking for patterns, and for those scenarios there is a real need for dedicated, readily available compute while users explore data or developers iterate on notebooks. An interactive (all-purpose) cluster — ideally with auto-termination configured so it shuts down when idle — is a great fit for these workloads.
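As a sketch, an interactive cluster can be defined through the Databricks Clusters API (or the equivalent UI fields). The specific values below — node type, autoscale range, idle timeout — are illustrative, not recommendations:

```json
{
  "cluster_name": "team-exploration",
  "spark_version": "13.3.x-scala2.12",
  "node_type_id": "Standard_DS3_v2",
  "autoscale": { "min_workers": 2, "max_workers": 8 },
  "autotermination_minutes": 60
}
```

Autoscaling keeps horsepower available during bursts of exploration, while the idle timeout keeps an always-on cluster from quietly burning budget overnight.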
2. Use Job Clusters
While your jobs or notebooks are running in production, there are cost optimizations that can be achieved using job clusters. These clusters spin up for the job runtime only, provide the compute, and decommission automatically once the job is done. Whether you are scheduling in Azure Databricks or orchestrating from tools such as Data Factory, job clusters provide a great way to optimize cost and resources in production.
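To illustrate, a scheduled job can declare its own ephemeral cluster inline via the Databricks Jobs API; the notebook path, node type, and worker count below are placeholders for your own setup:

```json
{
  "name": "nightly-etl",
  "tasks": [
    {
      "task_key": "run_etl",
      "notebook_task": { "notebook_path": "/ETL/nightly" },
      "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 4
      }
    }
  ]
}
```

Because the cluster exists only for the duration of the run, you pay for compute only while the job is actually executing — the same pattern applies when Data Factory creates a job cluster through its Databricks linked service.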
3. Use Shortcuts
There are a lot of shortcuts available in notebooks and the list can be accessed from within the notebooks. Here are my favorites:
- Shift + Enter — Run the command and switch control to the next cell. Best of all, it inserts a new cell if you are at the end of the notebook.
- Ctrl + / — By far the most used shortcut. Comments/uncomments the code in the cell. Depending on the magic command used, it applies the right comment format (//, --, or #) for the language.
- Hold Shift while deleting a cell — Skips the annoying pop-up confirmation to delete the cell.
4. Use Magic Commands
Switching cell languages as you go through data exploration is very useful. Having come from a SQL background, it just makes things easy — and for DataFrame and SQL operations, which compile to the same Spark execution plan, there is no meaningful performance difference between languages. All languages are first-class citizens in Databricks. Feel free to toggle between Scala, Python, and SQL to get the most out of Databricks.
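For example, in a notebook whose default language is Python, a %sql magic command switches a single cell to SQL (the sales table here is hypothetical):

```
%sql
-- This cell runs as SQL even in a Python notebook
SELECT country, COUNT(*) AS orders
FROM sales
GROUP BY country;
```

```
%python
# The next cell can drop straight back to Python against the same data
df = spark.table("sales")
display(df.groupBy("country").count())
```

Each cell's magic command applies only to that cell, so you can mix languages freely within one notebook.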
5. Use Databricks Delta
By far the best feature of the technology, and one that is going to change the way data lakes are perceived and implemented. Delta provides seamless capability to upsert and delete data in the lake — operations that previously meant crazy overhead, such as rewriting entire files or partitions by hand. Using Delta is going to change how lakes are designed.
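As a minimal sketch of what this looks like in practice, Delta supports MERGE INTO and DELETE directly in SQL (the table and column names below are illustrative):

```sql
-- Upsert incoming changes into a Delta table
MERGE INTO customers AS target
USING updates AS source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

-- Delete rows directly in the lake
DELETE FROM customers
WHERE is_obsolete = true;
```

Before Delta, either statement would have meant rewriting the affected files yourself; with Delta they are single atomic operations.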
For more on Delta, see Introduction to Delta Architecture.