For most data teams, performance is no longer about one-time tuning. It’s about keeping analytics fast as data, users, and governance scale, without increasing costs.
With Databricks SQL (DBSQL), that expectation is built into the platform. In 2025, performance improved by up to 40% on average across production workloads, with no tuning, no query rewrites, and no manual intervention required.
The bigger story goes beyond a benchmark. Performance has improved across the platform, from faster dashboard loads and more efficient pipelines to governance checks and shared-data queries that stay responsive, while geospatial analytics and AI Functions continue to scale without added complexity.
The goal remains simple: make workloads faster and reduce total costs by default. With DBSQL Serverless, Unity Catalog managed tables, and predictive optimization, improvements are applied to your environment automatically, so existing workloads benefit as the engine evolves.
This post details the performance gains delivered in 2025 across the query engine, Unity Catalog, Delta Sharing, storage, spatial SQL, and AI Functions.
Faster query execution across every workload
Databricks SQL measures performance using millions of real customer queries run repeatedly in production. By tracking how these workloads change over time, we measure the real impact of platform improvements and optimizations, rather than relying on isolated benchmarks.
In 2025, Databricks SQL delivered consistent performance gains across all major workload types. These improvements are applied by default through engine-level optimizations such as predictive query execution and Photon vectorized shuffle, without requiring any configuration changes.
- Exploratory workloads saw the biggest gains, running up to 40% faster on average and letting analysts and data scientists iterate more quickly on large datasets.
- Business intelligence workloads improved by approximately 20%, delivering more responsive dashboards and smoother interactive analytics under concurrency.
- ETL workloads ran roughly 10% faster, shortening pipeline runtimes without any rework.
If you last evaluated Databricks SQL a year ago, your existing workloads are already running faster today.
Analytics that stay fast as governance scales with Unity Catalog
As data estates grow, governance often becomes a hidden source of latency. Permission checks, metadata access, and lineage lookups can slow down queries, especially in interactive and high-concurrency environments.
In 2025, Unity Catalog significantly reduced this overhead. End-to-end catalog latency improved by up to 10x, powered by optimizations in the catalog service, the networking stack, the Databricks Runtime client, and dependent services.
The results show up where they matter most:
- Dashboards stay responsive even with fine-grained access controls.
- High-concurrency workloads scale without being bottlenecked by metadata access.
- Interactive analysis feels faster as users explore large volumes of governed data.
Teams no longer have to choose between strong governance and performance. With Unity Catalog, analytics stay fast as governance expands to more data and more users.
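As a concrete illustration, here is a minimal sketch of the kind of fine-grained governance these optimizations keep fast, using standard Unity Catalog SQL. The catalog, table, function, and group names are all illustrative.

```sql
-- Grant read access on a governed table to an analyst group.
GRANT SELECT ON TABLE main.sales.orders TO `analysts`;

-- Restrict which rows non-admin users can see with a row filter function.
-- is_account_group_member is a built-in; the names here are illustrative.
CREATE OR REPLACE FUNCTION main.sales.emea_only(region STRING)
RETURNS BOOLEAN
RETURN is_account_group_member('admins') OR region = 'EMEA';

ALTER TABLE main.sales.orders
SET ROW FILTER main.sales.emea_only ON (region);
```

Every query against the table now passes through the permission check and the row filter, which is exactly the path the 2025 catalog optimizations accelerate.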
Delta Sharing, shared data that performs like native data
Sharing data between teams or organizations has traditionally come at a cost. Queries against shared tables often ran more slowly, and optimizations were applied unevenly compared to native data.
In 2025, Databricks SQL closes that gap. Queries against shared tables through Delta Sharing now run up to 30% faster, driven by improvements in query execution and statistics propagation, bringing shared-data performance in line with native tables.

This change matters most in scenarios where external data needs to behave like internal data. Data marketplaces, cross-organization analytics, and partner-driven reporting can now run on shared datasets without sacrificing interactivity or predictability.
With Delta Sharing, teams can share governed data broadly while preserving the performance expectations of modern analytics.
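For context, here is a hedged sketch of the Delta Sharing flow in SQL, from the provider creating a share to the recipient querying it like a native table. The share, recipient, catalog, and table names are illustrative.

```sql
-- Provider side: create a share, add a table, and grant it to a recipient.
CREATE SHARE IF NOT EXISTS partner_share
  COMMENT 'Curated metrics for partner analytics';

ALTER SHARE partner_share ADD TABLE main.sales.daily_metrics;

GRANT SELECT ON SHARE partner_share TO RECIPIENT partner_org;

-- Recipient side: once the share is mounted as a catalog, the shared table
-- is queried exactly like a native one.
SELECT region, SUM(revenue) AS revenue
FROM partner_catalog.sales.daily_metrics
GROUP BY region;
```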
Lower storage costs, automatic optimization built in
As data volumes increase, storage efficiency becomes a larger share of total cost. Compression plays an important role, but choosing a format and managing migrations has traditionally added operational overhead.
In 2025, Databricks made Zstandard compression the default for all new Unity Catalog managed tables. Zstandard is an open source compression format that delivers storage savings of up to 40% compared to older formats, without degrading query performance.

These benefits apply automatically to new tables, and existing tables can be migrated to Zstandard as well, with simple migration tooling coming soon. Large fact tables, long-retention datasets, and fast-growing domains see immediate cost reductions without any change to how queries are written or executed.
The result is lower storage costs by default, delivered without compromising performance or adding new tuning steps.
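To make this concrete, the sketch below creates a Unity Catalog managed table (no LOCATION clause), which picks up the new compression default without any codec appearing in the DDL. The names are illustrative; DESCRIBE DETAIL can then be used to inspect the table's storage metadata.

```sql
-- A new managed table: compression is handled by the platform defaults,
-- so no compression settings are needed in the DDL.
CREATE TABLE main.analytics.web_events (
  event_id   BIGINT,
  event_time TIMESTAMP,
  url        STRING,
  payload    STRING
);

-- Inspect the table's storage details.
DESCRIBE DETAIL main.analytics.web_events;
```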
Geospatial analysis without specialized systems
Geospatial analysis places heavy demands on query execution. Spatial joins, range queries, and geometric calculations are computationally intensive, and at scale they often require specialized systems or careful tuning.
In 2025, Databricks SQL significantly improved performance for these workloads. Spatial SQL queries now run up to 17x faster, powered by engine-level optimizations such as R-tree indexing, optimized spatial joins in Photon, and intelligent range optimization.

These improvements let teams work with location data using standard SQL while the engine handles the execution complexity automatically. Use cases such as real-time location analytics, large-scale geofencing, and geographic enrichment remain fast and practical as data volumes grow.
Spatial analysis no longer requires separate tooling or manual optimization. Complex geospatial workloads run directly within Databricks SQL.
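As an illustration, here is a hedged sketch of a point-in-polygon spatial join using OGC-style ST functions in spatial SQL. The trips and zones tables are illustrative, and exact function availability may vary by release.

```sql
-- Count trips per zone with a point-in-polygon join. Assumes zones.boundary
-- holds polygon geometries and trips has pickup coordinates.
SELECT z.zone_name, COUNT(*) AS trip_count
FROM main.mobility.trips AS t
JOIN main.mobility.zones AS z
  ON st_contains(z.boundary, st_point(t.pickup_lon, t.pickup_lat))
GROUP BY z.zone_name
ORDER BY trip_count DESC;
```

A join like this is where R-tree indexing and Photon's optimized spatial joins do their work, with no hints or manual partitioning required.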
AI Functions, scalable AI directly in SQL
Applying AI to data has traditionally meant working outside the warehouse. Text classification, document parsing, and translation often required building separate pipelines, managing model infrastructure, and wiring the results back into the analytical workflow.
AI Functions simplify that model by bringing AI directly into SQL. In 2025, Databricks SQL significantly expanded the scale and performance of these capabilities. New batch-optimized distributed infrastructure delivers up to 85x faster performance for functions like ai_classify, ai_summarize, and ai_translate, allowing large batch jobs that once took hours to complete in minutes.
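A minimal sketch of what this looks like in practice; the support_tickets table and label set are illustrative.

```sql
-- Classify, summarize, and translate support tickets in a single batch query.
SELECT
  ticket_id,
  ai_classify(body, ARRAY('billing', 'bug', 'feature_request')) AS category,
  ai_summarize(body) AS summary,
  ai_translate(body, 'en') AS body_en
FROM main.support.support_tickets;
```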
Databricks also introduced ai_parse_document and rapidly scaled it. Purpose-built document-understanding models hosted on Databricks Model Serving deliver up to 30x faster performance compared to general-purpose alternatives, making it practical to process large volumes of unstructured content directly within the analytics workflow.
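Here is a hedged sketch of the pattern, reading binary files from a Unity Catalog volume and parsing them in the same query; the volume path is illustrative.

```sql
-- Parse PDFs stored in a Unity Catalog volume directly from SQL.
SELECT
  path,
  ai_parse_document(content) AS parsed
FROM READ_FILES('/Volumes/main/docs/contracts/', format => 'binaryFile');
```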

These improvements enable intelligent document processing, insight extraction from unstructured data, and predictive analytics through familiar SQL interfaces. AI workloads can grow alongside analytics workloads, without separate systems or custom pipelines.
With AI Functions, Databricks expands beyond analytics into AI-powered workloads, while preserving the simplicity and performance expectations of a SQL warehouse.
Get started
All of these improvements are already live in Databricks SQL Serverless, with nothing to enable and no configuration required.
If you haven’t tried DBSQL Serverless, create a serverless warehouse and start querying. Existing workloads benefit immediately, with performance and cost improvements automatically applied as the platform continues to evolve.
