How Deutsche Börse built a generative AI tool to tackle the mass migration of Zeppelin notebooks to Databricks

At Deutsche Börse Group, our Statistics Platform provides approximately 95% of all clearing and trading data across the group, powering self-service analysis for hundreds of business users. Keeping that data accessible and actionable is at the heart of everything we do.

For years, this meant Zeppelin notebooks running on Cloudera with access to HDFS and Oracle data systems. The platform served us well, but the landscape changed. Cloudera is shutting down Zeppelin entirely in 2027, our analytics workloads are moving to the cloud, and Databricks has been selected as our new unified analytics platform. That combination created a migration challenge that most organizations underestimate: more than 2,000 users and a very large number of notebooks, many of them deeply embedded in day-to-day business workflows, all of which need to be migrated.

Rewriting everything manually would take years. So we decided to build a better way on Databricks.

The notebook migration problem

Infrastructure migrations get a lot of attention. Notebook migrations do not, which is a big reason they slow teams down.

Our Zeppelin notebooks were not ordinary scripts. They included complex SQL and Python logic, custom interpreters, Oracle and HDFS references, visualizations, widgets, and scheduling logic built up over the years. Each reflected the institutional knowledge of the business teams that relied on it. The diversity across the notebook landscape made a rule-based rewrite engine impractical, as the logic was too heterogeneous and business-specific for automated rules to handle reliably.

That constraint led us to a neat design insight: separate structure from logic, and apply the right tool to each. Structural transformation (mapping Zeppelin’s paragraph format to Databricks cells, translating interpreter syntax, reformatting metadata) is deterministic and can be automated, while logic reconstruction is not. Thankfully, logic reconstruction is exactly the part that LLMs are good at.

Building a Converter on Databricks Apps

With that design principle in mind, we created the Zeppelin to Databricks Notebook Converter, a Databricks app designed specifically for our migration workflow.

The app handles the structural side of the transformation: Zeppelin paragraphs become Databricks cells, interpreter mappings are applied (%python, %sql, %pyspark and others are translated to their Databricks equivalents), and the notebook metadata is reformatted into valid .ipynb JSON. The original content is preserved exactly. We are not rewriting the logic at this stage, just preparing it for the next step.
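As a rough illustration of this structural step (not the converter's actual code), the following Python sketch turns a Zeppelin note export into a minimal .ipynb document. The names INTERPRETER_MAP, convert_paragraph and zeppelin_to_ipynb are hypothetical, the mapping table is an assumed subset, and the sketch assumes each paragraph carries its interpreter directive alone on the first line.

```python
import json

# Assumed, illustrative mapping from Zeppelin directives to Databricks cell magics.
INTERPRETER_MAP = {
    "%pyspark": "%python",
    "%python": "%python",
    "%sql": "%sql",
    "%md": "%md",
}


def convert_paragraph(text):
    """Swap a leading interpreter directive; leave the paragraph body untouched."""
    first, sep, rest = text.partition("\n")
    directive = first.strip()
    if directive in INTERPRETER_MAP:
        return INTERPRETER_MAP[directive] + sep + rest
    return text  # unknown or missing directive: keep the text as-is


def zeppelin_to_ipynb(note):
    """Build a minimal nbformat 4 notebook from a parsed Zeppelin note export."""
    cells = []
    for paragraph in note.get("paragraphs", []):
        text = paragraph.get("text") or ""
        if not text.strip():
            continue  # skip empty paragraphs
        cells.append({
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": convert_paragraph(text).splitlines(keepends=True),
        })
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


if __name__ == "__main__":
    example = {"name": "demo", "paragraphs": [{"text": "%pyspark\nprint(1)"}]}
    print(json.dumps(zeppelin_to_ipynb(example), indent=2))
```

Only the directive line changes; the body of each paragraph is copied through unmodified, which mirrors the "preserve, do not rewrite" principle described above.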

That next step is Genie. For each uploaded notebook, the app automatically generates a context-aware prompt that includes specific details about our Zeppelin environment, such as our custom interpreters, data sources, and configuration patterns. The prompt gives Genie the context it needs to reconstruct the logic accurately in a Databricks-native manner.
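A context-aware prompt of this kind could be assembled along the following lines. This is a minimal sketch under stated assumptions: the function build_prompt, the wording of the prompt, and the shape of environment_notes are all hypothetical and do not reflect the real prompt used in the app.

```python
def build_prompt(note, environment_notes):
    """Assemble a migration prompt from a parsed Zeppelin note and environment hints.

    `environment_notes` is a hypothetical list of free-text facts about the
    source environment (custom interpreters, data sources, configuration).
    """
    # Collect the interpreter directives that actually occur in this notebook.
    interpreters = sorted({
        (p.get("text") or "").split(None, 1)[0]
        for p in note.get("paragraphs", [])
        if (p.get("text") or "").lstrip().startswith("%")
    })

    lines = [
        "This notebook was converted from Zeppelin to Databricks.",
        "Notebook name: " + note.get("name", "(unnamed)"),
        "Interpreters used: " + (", ".join(interpreters) or "none detected"),
        "Environment details:",
    ]
    lines += ["- " + fact for fact in environment_notes]
    lines.append(
        "Reconstruct the logic in a Databricks-native way and ask "
        "clarifying questions wherever the intent is ambiguous."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    example = {"name": "demo", "paragraphs": [{"text": "%pyspark\nprint(1)"}]}
    print(build_prompt(example, ["Source data was read from HDFS and Oracle."]))
```

Deriving the interpreter list from the notebook itself keeps the prompt specific to each upload, while the environment facts stay the same across notebooks.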

The workflow for the business user is straightforward:

Export the Zeppelin notebook as JSON
Upload it to the Databricks app
Click Convert
Download the converted .ipynb file
Open Databricks, upload the notebook, launch Genie and paste the generated prompt
Genie asks clarifying questions and reconstructs the notebook

The app itself was built with a shadcn UI frontend. Originally, we created a Streamlit prototype, but we felt that shadcn gave us a more professional and scalable interface. The Databricks Apps development experience made it easy to ship rapidly without standing up separate infrastructure.

What we decided not to automate

One of the most important design decisions was determining what the tool should intentionally leave alone.

The converter does not rewrite SQL logic, Python logic, visualizations, widgets, Oracle and HDFS references, scheduling logic, or business-specific custom code. All of that content is preserved in the converted notebook, untouched, because rewriting it automatically would introduce errors and reduce confidence in the output. These are exactly the elements that vary the most across notebooks and that contain the most business-critical logic. They are handed to Genie, which can interpret context, ask clarifying questions and make judgment calls that rules cannot.

This hybrid approach of automating the deterministic part and delegating the variable part allows us to avoid the brittleness of rule-based systems and leverage AI where it really performs well.
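One way to make the "untouched" guarantee checkable is to compare paragraph bodies before and after conversion. The sketch below is an assumption about how such a check might look, not part of the tool as described; body_without_directive and content_preserved are hypothetical names, and the check assumes directives sit alone on the first line of a paragraph or cell.

```python
def body_without_directive(text):
    """Return the text minus a leading %directive line, if there is one."""
    first, _, rest = text.partition("\n")
    return rest if first.strip().startswith("%") else text


def content_preserved(note, notebook):
    """True if every non-empty Zeppelin paragraph body appears unchanged, in order."""
    originals = [
        body_without_directive(p.get("text") or "")
        for p in note.get("paragraphs", [])
        if (p.get("text") or "").strip()
    ]
    converted = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb allows a list of lines or one string
            source = "".join(source)
        converted.append(body_without_directive(source))
    return originals == converted
```

A check like this only confirms that the structural step changed nothing but the directive line; it says nothing about the logic Genie reconstructs afterwards.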

Result: hours to minutes

By combining structural transformation with AI-assisted logic reconstruction, we have reduced notebook reconstruction from hours of manual effort to 15-20 minutes per notebook, depending on complexity. For a large-scale migration of this kind, spanning multiple business domains, this approach turns a resource-intensive, time-consuming undertaking into a scalable, repeatable workflow.

The increase in speed also changes the nature of the work. Business users do not need deep Databricks expertise to migrate their notebooks. They follow a short sequence of steps, receive the prompt and let Genie rebuild the notebook. The tool is accessible enough that migration does not require a dedicated engineering team.

What we learned

Some principles have emerged from this project that we will adopt in any similar endeavor.

Avoid over-engineering. Our first attempt used a more complex agentic architecture that added overhead without solving the core problem. A simple UI and a clean backend proved to be entirely sufficient.
Rule-based rewriting is not suitable for heterogeneous content. The diversity of logic in our notebooks makes rules impractical. LLMs are essential for handling that variability, and the key is to divide the work thoughtfully between automation and AI.
Context is the difference between a good prompt and a great prompt. Generic Genie prompts produce generic results. Investing in a prompt that encoded knowledge of our specific environment (interpreters, data sources, configuration patterns) made the output actually usable.
Get your platform team involved early. Our collaboration with the Databricks team throughout the build helped us stay aligned and avoid rework.

What comes next

While the initial development of our converter tool is complete, we are now moving forward with large-scale, real-world testing. Our immediate priorities include finalizing the prompt definitions to improve accuracy, validating the tool with notebooks across multiple business entities and IT, and preparing to onboard users.

The broader implications are what excite us most. This project demonstrated that AI-assisted migration is not a future capability; it is available right now. By combining Databricks Apps with generative AI, we’ve created a repeatable workflow that turns one of the toughest problems of cloud transformation into a fast, scalable process.
