LookML: An Alternative Semantic Layer Approach to Building a Trusted AI Analytics Agent with BigQuery

Author(s): allglen

Originally published on Towards AI.

Before we talk about where to store your registry, let’s address the elephant in the room: What about LookML?

If you’re already using Looker, you may be wondering if you need to create Completely YAML-based semantic layer – And that’s a fair question.

What is LookML?

LookML is Looker’s modeling language for defining your semantic layer. This is what makes Looker powerful: instead of writing SQL for each chart, you define your business logic once in LookML, and Looker automatically generates the SQL.

same here net_revenue The metric looks like this in LookML:

view: orders {
sql_table_name: analytics.fact_orders ;;dimension: order_id {
primary_key: yes
type: string
sql: ${TABLE}.order_id ;;
}
dimension_group: order {
type: time
timeframes: (date, week, month, year)
sql: ${TABLE}.order_date ;;
}
dimension: country {
type: string
sql: ${TABLE}.country ;;
}
dimension: net_amount {
type: number
sql: ${TABLE}.net_amount ;;
hidden: yes # Don't expose raw dimension
}
measure: net_revenue {
type: sum
sql: ${net_amount} ;;
description: "Revenue excluding tax, refunds, and cancelled orders"
value_format_name: usd
}
measure: order_count {
type: count_distinct
sql: ${order_id} ;;
description: "Total number of completed orders"
}
measure: average_order_value {
type: number
sql: ${net_revenue} / NULLIF(${order_count}, 0) ;;
description: "Average net revenue per order"
value_format_name: usd
}
}
# Explores define which views can be joined together
explore: orders {
description: "Order analytics"
join: customers {
type: left_outer
sql_on: ${orders.customer_id} = ${customers.customer_id} ;;
relationship: many_to_one
}
}

LookML gives you:

✅ Dimensions and measurement definitions (Just like our YAML registry)
✅ type security (dates, numbers, strings with verification)
✅ reusable logic (Derived measures refer to other measures)
✅ add relationships (Defines how views are joined)
✅ access control (field-level permissions)
✅ version control (LookML files reside in Git)

Integration: LookML + ADK

Here’s the good news: LookML and ADK work beautifully together.

Google to open source Looker’s semantic layer in 2024 Open SQL InterfaceWhich means you can access LookML-defined metrics programmatically from anywhere – including your ADK agents.

Architecture with LookML:

Natural language question
↓
(ADK Agent) ← Reasoning & orchestration
↓
(Looker API / Open SQL Interface) ← Query LookML metrics
↓
(LookML Models) ← Semantic layer (metrics, dimensions, joins)
↓
(BigQuery) ← Raw storage & computation

Implementation with Looker API:

from google.adk import Agent, Tool
import requests
import osclass LookerSemanticAgent:
def __init__(self):
self.looker_api_url = os.getenv("LOOKER_API_URL")
self.looker_client_id = os.getenv("LOOKER_CLIENT_ID")
self.looker_client_secret = os.getenv("LOOKER_CLIENT_SECRET")
self.access_token = self._authenticate()
def _authenticate(self):
"""Authenticate with Looker API"""
auth_response = requests.post(
f"{self.looker_api_url}/login",
data={
"client_id": self.looker_client_id,
"client_secret": self.looker_client_secret
}
)
return auth_response.json()("access_token")
@Tool
def query_looker_metric(
self,
model: str,
explore: str,
measures: list(str),
dimensions: list(str) = None,
filters: dict = None,
limit: int = 500
) -> dict:
"""
Query metrics defined in LookML through Looker API.
Args:
model: LookML model name (e.g., 'analytics')
explore: Explore name (e.g., 'orders')
measures: List of measures to query (e.g., ('net_revenue'))
dimensions: List of dimensions to group by (e.g., ('country'))
filters: Dimension filters (e.g., {'order_date': '30 days'})
limit: Maximum number of rows to return
"""# Build Looker query
query_body = {
"model": model,
"view": explore,
"fields": measures + (dimensions or ()),
"filters": filters or {},
"limit": limit
}
# Create and run query
headers = {"Authorization": f"Bearer {self.access_token}"}
# Create query
create_response = requests.post(
f"{self.looker_api_url}/queries",
json=query_body,
headers=headers
)
query_id = create_response.json()("id")
# Run query
run_response = requests.get(
f"{self.looker_api_url}/queries/{query_id}/run/json",
headers=headers
)
results = run_response.json()
# Get field definitions for context
explore_response = requests.get(
f"{self.looker_api_url}/lookml_models/{model}/explores/{explore}",
headers=headers
)
explore_metadata = explore_response.json()
return {
"results": results,
"metadata": {
"model": model,
"explore": explore,
"measures": measures,
"dimensions": dimensions,
"filters": filters,
"field_definitions": self._extract_field_metadata(
explore_metadata, 
measures + (dimensions or ())
)
}
}
def _extract_field_metadata(self, explore_metadata, fields):
"""Extract descriptions and types for requested fields"""
metadata = {}
for field in fields:
# Search in measures
for measure in explore_metadata.get("fields", {}).get("measures", ()):
if measure("name") == field:
metadata(field) = {
"description": measure.get("description"),
"type": measure.get("type"),
"sql": measure.get("sql")
}
# Search in dimensions
for dimension in explore_metadata.get("fields", {}).get("dimensions", ()):
if dimension("name") == field:
metadata(field) = {
"description": dimension.get("description"),
"type": dimension.get("type")
}
return metadata
# Create ADK agent with Looker integration
looker_agent = LookerSemanticAgent()
agent = Agent(
name="LookerAnalyticsAgent",
model="gemini-2.0-flash-exp",
instruction="""You are an analytics expert with access to Looker's semantic layer.
When users ask about metrics:
1. Identify which LookML model and explore to use
2. Map their question to specific measures and dimensions
3. Extract any filters from their question
4. Call query_looker_metric with the correct parameters
5. Present results clearly with metric definitions
Always include the metric description in your response so users understand
exactly what they're seeing.
Available models: 'analytics'
Available explores: 'orders', 'customers', 'products'
""",
tools=(looker_agent.query_looker_metric)
)

When to use LookML vs Custom YAML Registry

Here is a decision framework:

Use LookML when:

✅ You are already using Looker – Don’t rebuild what you have
✅ you have complex join logic – LookML’s Explore layer handles this beautifully
✅ You need a BI tool anyway – Get a semantic layer + visualization in one
✅ You have non-technical users – Looker’s UI makes metric searching easy
✅ You want enterprise features – Row-level security, caching, built-in alerts

Use custom YAML registry when:

✅ you don’t need looker — Why pay for software you won’t use?
✅ you want complete control – No vendor lock-in, complete flexibility
✅ you have simple aggregation — Most metrics are SUM/COUNT/AVG on a table
✅ You are making a special device – Embedded analytics, CLI tools, etc.
✅ You want minimal infrastructure – Only files in Git, no service for maintenance

Hybrid approach (best of both worlds):

Many teams use Both: :

LookML for BI users — Analysts create dashboards in Looker
YAML Registry for Agents – Subset of metrics exposed to AI agents
shared definitions — both point to similar BigQuery ideas

# metrics.yaml - Mirrors subset of LookML
metrics:
net_revenue:
description: "Revenue excluding tax, refunds, and cancelled orders"
looker_reference:
model: analytics
explore: orders
measure: net_revenue
fallback_sql: "SUM(net_amount)" # If Looker unavailable
source: analytics.fact_orders

it gives you:

BI tools for human users (Looker)
Lightweight agent access (BigQuery directly via YAML)
Single Source of Truth (LookML is official)
Flexibility (agents can fall back directly to SQL)

Using Looker’s Open SQL interface

In August 2024, Google released Open SQL Interface For Looker, which provides JDBC/SQL access to LookML metrics. This is huge for agent integration.

import jaydebeapi# Connect to Looker via JDBC
conn = jaydebeapi.connect(
"com.looker.jdbc.LookerDriver",
"jdbc:looker://your-instance.looker.com:19999",
{"user": "your-user", "password": "your-password"}
)
cursor = conn.cursor()
# Query LookML metrics using SQL
cursor.execute("""
SELECT 
orders.country,
orders.net_revenue,
orders.order_count
FROM analytics.orders
WHERE orders.order_date >= '2025-01-01'
GROUP BY orders.country
""")
results = cursor.fetchall()

The agent can now treat LookML like a database – but it’s a semantic database where column names are metric definitions.

Best of both: LookML + ADK + BigQuery

Here is the architecture I recommend for enterprises that are already using Looker:

User Question
↓
ADK Agent decides:
├─→ Complex/exploratory? → Looker API (leverages joins, caching)
├─→ Simple/real-time? → Direct BigQuery (lower latency)
└─→ New metric? → Suggest LookML addition
↓
Both paths use same BigQuery views

execution:

class HybridSemanticAgent:
def __init__(self):
self.looker = LookerSemanticAgent()
self.bigquery = SemanticLayerAgent(registry_path="./semantic-layer")@Tool
def query_metric(self, metric_name: str, filters: dict, **kwargs):
"""Route queries to optimal backend"""
# Check if metric requires joins (use Looker)
if self._requires_joins(metric_name):
return self.looker.query_looker_metric(
model="analytics",
explore=self._get_explore(metric_name),
measures=(metric_name),
filters=filters
)
# Simple metric, use direct BigQuery
return self.bigquery.query_metric(
metric_name=metric_name,
filters=filters,
**kwargs
)
def _requires_joins(self, metric_name: str) -> bool:
"""Check if metric needs data from multiple tables"""
complex_metrics = {
'customer_lifetime_value', # Needs orders + customers
'product_revenue_share', # Needs orders + products
'repeat_customer_rate' # Needs customer history
}
return metric_name in complex_metrics

Real-world example: multi-table metrics

This is where LookML shines – metrics that span multiple tables:

# This is painful to manage in YAML, elegant in LookML
explore: orders {
join: customers {
sql_on: ${orders.customer_id} = ${customers.customer_id} ;;
relationship: many_to_one
}join: products {
sql_on: ${orders.product_id} = ${products.product_id} ;;
relationship: many_to_one
}
}
view: orders {
measure: revenue_by_customer_tier {
type: sum
sql: ${net_amount} ;;
filters: (customers.tier: "enterprise") # Cross-view filter
}
measure: high_margin_product_revenue {
type: sum
sql: ${net_amount} ;;
filters: (products.margin_percentage: ">30") # Another join
}
}

Your agent can query these complex metrics without having to worry about join logic:

result = looker_agent.query_looker_metric(
model="analytics",
explore="orders",
measures=("revenue_by_customer_tier"),
dimensions=("order_month"),
filters={"order_date": "90 days"}
)

Looker handles joins, caching, and SQL generation.

Migration path: YAML → LookML

If you’re a beginner with YAML and need Looker:

Step 1: YAML Only (Week 1-2)

Quick prototyping with minimal infrastructure
Prove value with core metrics

Step 2: Hybrid (month 1-3)

Introduce Looker to BI users
Migrate complex metrics to LookML
Have simple metrics in YAML for agents

Step 3: LookML Primary (month 3+)

LookML is the source of truth
Agents perform queries via the Looker API or Open SQL interface
YAML becomes a cache/fallback layer

Investments are tailored to your needs.

Published via Towards AI