Author(s): allglen
Originally published on Towards AI.
Before we talk about where to store your registry, let’s address the elephant in the room: What about LookML?
If you’re already using Looker, you may be wondering if you need to create Completely YAML-based semantic layer – And that’s a fair question.
What is LookML?
LookML is Looker’s modeling language for defining your semantic layer. This is what makes Looker powerful: instead of writing SQL for each chart, you define your business logic once in LookML, and Looker automatically generates the SQL.
same here net_revenue The metric looks like this in LookML:
view: orders {
sql_table_name: analytics.fact_orders ;;dimension: order_id {
primary_key: yes
type: string
sql: ${TABLE}.order_id ;;
}
dimension_group: order {
type: time
timeframes: (date, week, month, year)
sql: ${TABLE}.order_date ;;
}
dimension: country {
type: string
sql: ${TABLE}.country ;;
}
dimension: net_amount {
type: number
sql: ${TABLE}.net_amount ;;
hidden: yes # Don't expose raw dimension
}
measure: net_revenue {
type: sum
sql: ${net_amount} ;;
description: "Revenue excluding tax, refunds, and cancelled orders"
value_format_name: usd
}
measure: order_count {
type: count_distinct
sql: ${order_id} ;;
description: "Total number of completed orders"
}
measure: average_order_value {
type: number
sql: ${net_revenue} / NULLIF(${order_count}, 0) ;;
description: "Average net revenue per order"
value_format_name: usd
}
}
# Explores define which views can be joined together
explore: orders {
description: "Order analytics"
join: customers {
type: left_outer
sql_on: ${orders.customer_id} = ${customers.customer_id} ;;
relationship: many_to_one
}
}
LookML gives you:
- ✅ Dimensions and measurement definitions (Just like our YAML registry)
- ✅ type security (dates, numbers, strings with verification)
- ✅ reusable logic (Derived measures refer to other measures)
- ✅ add relationships (Defines how views are joined)
- ✅ access control (field-level permissions)
- ✅ version control (LookML files reside in Git)
Integration: LookML + ADK
Here’s the good news: LookML and ADK work beautifully together.
Google to open source Looker’s semantic layer in 2024 Open SQL InterfaceWhich means you can access LookML-defined metrics programmatically from anywhere – including your ADK agents.
Architecture with LookML:
Natural language question
↓
(ADK Agent) ← Reasoning & orchestration
↓
(Looker API / Open SQL Interface) ← Query LookML metrics
↓
(LookML Models) ← Semantic layer (metrics, dimensions, joins)
↓
(BigQuery) ← Raw storage & computation
Implementation with Looker API:
from google.adk import Agent, Tool
import requests
import osclass LookerSemanticAgent:
def __init__(self):
self.looker_api_url = os.getenv("LOOKER_API_URL")
self.looker_client_id = os.getenv("LOOKER_CLIENT_ID")
self.looker_client_secret = os.getenv("LOOKER_CLIENT_SECRET")
self.access_token = self._authenticate()def _authenticate(self):
"""Authenticate with Looker API"""
auth_response = requests.post(
f"{self.looker_api_url}/login",
data={
"client_id": self.looker_client_id,
"client_secret": self.looker_client_secret
}
)
return auth_response.json()("access_token")@Tool
def query_looker_metric(
self,
model: str,
explore: str,
measures: list(str),
dimensions: list(str) = None,
filters: dict = None,
limit: int = 500
) -> dict:
"""
Query metrics defined in LookML through Looker API.Args:
""
model: LookML model name (e.g., 'analytics')
explore: Explore name (e.g., 'orders')
measures: List of measures to query (e.g., ('net_revenue'))
dimensions: List of dimensions to group by (e.g., ('country'))
filters: Dimension filters (e.g., {'order_date': '30 days'})
limit: Maximum number of rows to return
"# Build Looker query
query_body = {
"model": model,
"view": explore,
"fields": measures + (dimensions or ()),
"filters": filters or {},
"limit": limit
}# Create and run query
headers = {"Authorization": f"Bearer {self.access_token}"}# Create query
create_response = requests.post(
f"{self.looker_api_url}/queries",
json=query_body,
headers=headers
)
query_id = create_response.json()("id")# Run query
run_response = requests.get(
f"{self.looker_api_url}/queries/{query_id}/run/json",
headers=headers
)results = run_response.json()
# Get field definitions for context
explore_response = requests.get(
f"{self.looker_api_url}/lookml_models/{model}/explores/{explore}",
headers=headers
)
explore_metadata = explore_response.json()return {
"results": results,
"metadata": {
"model": model,
"explore": explore,
"measures": measures,
"dimensions": dimensions,
"filters": filters,
"field_definitions": self._extract_field_metadata(
explore_metadata,
measures + (dimensions or ())
)
}
}def _extract_field_metadata(self, explore_metadata, fields):
"""Extract descriptions and types for requested fields"""
metadata = {}for field in fields:
# Search in measures
for measure in explore_metadata.get("fields", {}).get("measures", ()):
if measure("name") == field:
metadata(field) = {
"description": measure.get("description"),
"type": measure.get("type"),
"sql": measure.get("sql")
}# Search in dimensions
for dimension in explore_metadata.get("fields", {}).get("dimensions", ()):
if dimension("name") == field:
metadata(field) = {
"description": dimension.get("description"),
"type": dimension.get("type")
}return metadata
# Create ADK agent with Looker integration
looker_agent = LookerSemanticAgent()agent = Agent(
name="LookerAnalyticsAgent",
model="gemini-2.0-flash-exp",
instruction="""You are an analytics expert with access to Looker's semantic layer.When users ask about metrics:
1. Identify which LookML model and explore to use
2. Map their question to specific measures and dimensions
3. Extract any filters from their question
4. Call query_looker_metric with the correct parameters
5. Present results clearly with metric definitionsAlways include the metric description in your response so users understand
exactly what they're seeing.Available models: 'analytics'
Available explores: 'orders', 'customers', 'products'
""",
tools=(looker_agent.query_looker_metric)
)
When to use LookML vs Custom YAML Registry
Here is a decision framework:
Use LookML when:
✅ You are already using Looker – Don’t rebuild what you have
✅ you have complex join logic – LookML’s Explore layer handles this beautifully
✅ You need a BI tool anyway – Get a semantic layer + visualization in one
✅ You have non-technical users – Looker’s UI makes metric searching easy
✅ You want enterprise features – Row-level security, caching, built-in alerts
Use custom YAML registry when:
✅ you don’t need looker — Why pay for software you won’t use?
✅ you want complete control – No vendor lock-in, complete flexibility
✅ you have simple aggregation — Most metrics are SUM/COUNT/AVG on a table
✅ You are making a special device – Embedded analytics, CLI tools, etc.
✅ You want minimal infrastructure – Only files in Git, no service for maintenance
Hybrid approach (best of both worlds):
Many teams use Both: :
- LookML for BI users — Analysts create dashboards in Looker
- YAML Registry for Agents – Subset of metrics exposed to AI agents
- shared definitions — both point to similar BigQuery ideas
# metrics.yaml - Mirrors subset of LookML
metrics:
net_revenue:
description: "Revenue excluding tax, refunds, and cancelled orders"
looker_reference:
model: analytics
explore: orders
measure: net_revenue
fallback_sql: "SUM(net_amount)" # If Looker unavailable
source: analytics.fact_orders
it gives you:
- BI tools for human users (Looker)
- Lightweight agent access (BigQuery directly via YAML)
- Single Source of Truth (LookML is official)
- Flexibility (agents can fall back directly to SQL)
Using Looker’s Open SQL interface
In August 2024, Google released Open SQL Interface For Looker, which provides JDBC/SQL access to LookML metrics. This is huge for agent integration.
import jaydebeapi# Connect to Looker via JDBC
conn = jaydebeapi.connect(
"com.looker.jdbc.LookerDriver",
"jdbc:looker://your-instance.looker.com:19999",
{"user": "your-user", "password": "your-password"}
)
cursor = conn.cursor()
# Query LookML metrics using SQL
cursor.execute("""
SELECT
orders.country,
orders.net_revenue,
orders.order_count
FROM analytics.orders
WHERE orders.order_date >= '2025-01-01'
GROUP BY orders.country
""")
results = cursor.fetchall()
The agent can now treat LookML like a database – but it’s a semantic database where column names are metric definitions.
Best of both: LookML + ADK + BigQuery
Here is the architecture I recommend for enterprises that are already using Looker:
User Question
↓
ADK Agent decides:
├─→ Complex/exploratory? → Looker API (leverages joins, caching)
├─→ Simple/real-time? → Direct BigQuery (lower latency)
└─→ New metric? → Suggest LookML addition
↓
Both paths use same BigQuery views
execution:
class HybridSemanticAgent:
def __init__(self):
self.looker = LookerSemanticAgent()
self.bigquery = SemanticLayerAgent(registry_path="./semantic-layer")@Tool
def query_metric(self, metric_name: str, filters: dict, **kwargs):
"""Route queries to optimal backend"""
# Check if metric requires joins (use Looker)
if self._requires_joins(metric_name):
return self.looker.query_looker_metric(
model="analytics",
explore=self._get_explore(metric_name),
measures=(metric_name),
filters=filters
)
# Simple metric, use direct BigQuery
return self.bigquery.query_metric(
metric_name=metric_name,
filters=filters,
**kwargs
)
def _requires_joins(self, metric_name: str) -> bool:
"""Check if metric needs data from multiple tables"""
complex_metrics = {
'customer_lifetime_value', # Needs orders + customers
'product_revenue_share', # Needs orders + products
'repeat_customer_rate' # Needs customer history
}
return metric_name in complex_metrics
Real-world example: multi-table metrics
This is where LookML shines – metrics that span multiple tables:
# This is painful to manage in YAML, elegant in LookML
explore: orders {
join: customers {
sql_on: ${orders.customer_id} = ${customers.customer_id} ;;
relationship: many_to_one
}join: products {
sql_on: ${orders.product_id} = ${products.product_id} ;;
relationship: many_to_one
}
}
view: orders {
measure: revenue_by_customer_tier {
type: sum
sql: ${net_amount} ;;
filters: (customers.tier: "enterprise") # Cross-view filter
}
measure: high_margin_product_revenue {
type: sum
sql: ${net_amount} ;;
filters: (products.margin_percentage: ">30") # Another join
}
}
Your agent can query these complex metrics without having to worry about join logic:
result = looker_agent.query_looker_metric(
model="analytics",
explore="orders",
measures=("revenue_by_customer_tier"),
dimensions=("order_month"),
filters={"order_date": "90 days"}
)
Looker handles joins, caching, and SQL generation.
Migration path: YAML → LookML
If you’re a beginner with YAML and need Looker:
Step 1: YAML Only (Week 1-2)
- Quick prototyping with minimal infrastructure
- Prove value with core metrics
Step 2: Hybrid (month 1-3)
- Introduce Looker to BI users
- Migrate complex metrics to LookML
- Have simple metrics in YAML for agents
Step 3: LookML Primary (month 3+)
- LookML is the source of truth
- Agents perform queries via the Looker API or Open SQL interface
- YAML becomes a cache/fallback layer
Investments are tailored to your needs.
Published via Towards AI
