LookML: An Alternative Semantic Layer Approach to Building a Trusted AI Analytics Agent with BigQuery


Author(s): allglen

Originally published on Towards AI.

Before we talk about where to store your registry, let’s address the elephant in the room: What about LookML?

If you’re already using Looker, you may be wondering whether you need to create a completely separate YAML-based semantic layer at all. That’s a fair question.

What is LookML?

LookML is Looker’s modeling language for defining your semantic layer. This is what makes Looker powerful: instead of writing SQL for each chart, you define your business logic once in LookML, and Looker automatically generates the SQL.

Here’s what the same net_revenue metric looks like in LookML:

view: orders {
  sql_table_name: analytics.fact_orders ;;

  dimension: order_id {
    primary_key: yes
    type: string
    sql: ${TABLE}.order_id ;;
  }

  dimension_group: order {
    type: time
    timeframes: [date, week, month, year]
    sql: ${TABLE}.order_date ;;
  }

  dimension: country {
    type: string
    sql: ${TABLE}.country ;;
  }

  dimension: net_amount {
    type: number
    sql: ${TABLE}.net_amount ;;
    hidden: yes  # Don't expose the raw dimension
  }

  measure: net_revenue {
    type: sum
    sql: ${net_amount} ;;
    description: "Revenue excluding tax, refunds, and cancelled orders"
    value_format_name: usd
  }

  measure: order_count {
    type: count_distinct
    sql: ${order_id} ;;
    description: "Total number of completed orders"
  }

  measure: average_order_value {
    type: number
    sql: ${net_revenue} / NULLIF(${order_count}, 0) ;;
    description: "Average net revenue per order"
    value_format_name: usd
  }
}

# Explores define which views can be joined together
explore: orders {
  description: "Order analytics"

  join: customers {
    type: left_outer
    sql_on: ${orders.customer_id} = ${customers.customer_id} ;;
    relationship: many_to_one
  }
}

LookML gives you:

  • Dimension and measure definitions (just like our YAML registry)
  • Type safety (dates, numbers, strings with validation)
  • Reusable logic (derived measures refer to other measures)
  • Join relationships (defines how views are joined)
  • Access control (field-level permissions)
  • Version control (LookML files live in Git)

Integration: LookML + ADK

Here’s the good news: LookML and ADK work beautifully together.

Google opened up Looker’s semantic layer in 2024 with the Open SQL Interface, which means you can access LookML-defined metrics programmatically from anywhere – including your ADK agents.


Architecture with LookML:

Natural language question
        ↓
ADK Agent                         ← reasoning & orchestration
        ↓
Looker API / Open SQL Interface   ← queries LookML metrics
        ↓
LookML Models                     ← semantic layer (metrics, dimensions, joins)
        ↓
BigQuery                          ← raw storage & computation
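As a concrete sketch of the middle step, the agent’s job is essentially to translate a parsed question into Looker query parameters. The `question_to_query` helper below is hypothetical (not part of ADK or Looker); its field names assume the `orders` explore defined earlier:

```python
# Hypothetical intent-to-query mapping step the ADK agent performs.
# `parsed` is assumed to be the LLM's structured reading of the question,
# e.g. {"metric": "net_revenue", "group_by": "country", "window": "30 days"}.
def question_to_query(parsed: dict) -> dict:
    """Turn a parsed question into Looker query parameters."""
    return {
        "model": "analytics",
        "explore": "orders",
        "measures": [parsed["metric"]],
        "dimensions": [parsed["group_by"]] if parsed.get("group_by") else [],
        "filters": {"orders.order_date": parsed["window"]} if parsed.get("window") else {},
    }

q = question_to_query({"metric": "net_revenue", "group_by": "country", "window": "30 days"})
print(q["measures"])  # ['net_revenue']
```

The LLM handles the fuzzy part (extracting intent); everything after that is deterministic, which is what makes the results trustworthy.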

Implementation with Looker API:

from google.adk import Agent, Tool
import requests
import os

class LookerSemanticAgent:
    def __init__(self):
        self.looker_api_url = os.getenv("LOOKER_API_URL")
        self.looker_client_id = os.getenv("LOOKER_CLIENT_ID")
        self.looker_client_secret = os.getenv("LOOKER_CLIENT_SECRET")
        self.access_token = self._authenticate()

    def _authenticate(self):
        """Authenticate with the Looker API"""
        auth_response = requests.post(
            f"{self.looker_api_url}/login",
            data={
                "client_id": self.looker_client_id,
                "client_secret": self.looker_client_secret
            }
        )
        return auth_response.json()["access_token"]

    @Tool
    def query_looker_metric(
        self,
        model: str,
        explore: str,
        measures: list[str],
        dimensions: list[str] = None,
        filters: dict = None,
        limit: int = 500
    ) -> dict:
        """
        Query metrics defined in LookML through the Looker API.

        Args:
            model: LookML model name (e.g., 'analytics')
            explore: Explore name (e.g., 'orders')
            measures: List of measures to query (e.g., ['net_revenue'])
            dimensions: List of dimensions to group by (e.g., ['country'])
            filters: Dimension filters (e.g., {'order_date': '30 days'})
            limit: Maximum number of rows to return
        """
        # Build Looker query
        query_body = {
            "model": model,
            "view": explore,
            "fields": measures + (dimensions or []),
            "filters": filters or {},
            "limit": limit
        }

        headers = {"Authorization": f"Bearer {self.access_token}"}

        # Create query
        create_response = requests.post(
            f"{self.looker_api_url}/queries",
            json=query_body,
            headers=headers
        )
        query_id = create_response.json()["id"]

        # Run query
        run_response = requests.get(
            f"{self.looker_api_url}/queries/{query_id}/run/json",
            headers=headers
        )
        results = run_response.json()

        # Get field definitions for context
        explore_response = requests.get(
            f"{self.looker_api_url}/lookml_models/{model}/explores/{explore}",
            headers=headers
        )
        explore_metadata = explore_response.json()

        return {
            "results": results,
            "metadata": {
                "model": model,
                "explore": explore,
                "measures": measures,
                "dimensions": dimensions,
                "filters": filters,
                "field_definitions": self._extract_field_metadata(
                    explore_metadata,
                    measures + (dimensions or [])
                )
            }
        }

    def _extract_field_metadata(self, explore_metadata, fields):
        """Extract descriptions and types for the requested fields"""
        metadata = {}

        for field in fields:
            # Search in measures
            for measure in explore_metadata.get("fields", {}).get("measures", []):
                if measure["name"] == field:
                    metadata[field] = {
                        "description": measure.get("description"),
                        "type": measure.get("type"),
                        "sql": measure.get("sql")
                    }

            # Search in dimensions
            for dimension in explore_metadata.get("fields", {}).get("dimensions", []):
                if dimension["name"] == field:
                    metadata[field] = {
                        "description": dimension.get("description"),
                        "type": dimension.get("type")
                    }

        return metadata

# Create ADK agent with Looker integration
looker_agent = LookerSemanticAgent()

agent = Agent(
    name="LookerAnalyticsAgent",
    model="gemini-2.0-flash-exp",
    instruction="""You are an analytics expert with access to Looker's semantic layer.

When users ask about metrics:
1. Identify which LookML model and explore to use
2. Map their question to specific measures and dimensions
3. Extract any filters from their question
4. Call query_looker_metric with the correct parameters
5. Present results clearly with metric definitions

Always include the metric description in your response so users understand
exactly what they're seeing.

Available models: 'analytics'
Available explores: 'orders', 'customers', 'products'
""",
    tools=[looker_agent.query_looker_metric]
)

When to use LookML vs. a custom YAML registry

Here is a decision framework:

Use LookML when:

You’re already using Looker – don’t rebuild what you already have
You have complex join logic – LookML’s Explore layer handles this beautifully
You need a BI tool anyway – get a semantic layer and visualization in one
You have non-technical users – Looker’s UI makes metric discovery easy
You want enterprise features – row-level security, caching, built-in alerts

Use a custom YAML registry when:

You don’t need Looker – why pay for software you won’t use?
You want complete control – no vendor lock-in, full flexibility
You have simple aggregations – most metrics are SUM/COUNT/AVG on a single table
You’re building a specialized tool – embedded analytics, CLI tools, etc.
You want minimal infrastructure – just files in Git, no service to maintain

Hybrid approach (best of both worlds):

Many teams use both:

  • LookML for BI users – analysts build dashboards in Looker
  • YAML registry for agents – a subset of metrics exposed to AI agents
  • Shared definitions – both point to the same BigQuery views
# metrics.yaml - mirrors a subset of LookML
metrics:
  net_revenue:
    description: "Revenue excluding tax, refunds, and cancelled orders"
    looker_reference:
      model: analytics
      explore: orders
      measure: net_revenue
    fallback_sql: "SUM(net_amount)"  # If Looker is unavailable
    source: analytics.fact_orders

This gives you:

  • BI tooling for human users (Looker)
  • Lightweight agent access (BigQuery directly via YAML)
  • A single source of truth (LookML is canonical)
  • Flexibility (agents can fall back to direct SQL)
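A minimal sketch of the fallback idea, assuming the metrics.yaml schema shown above. The registry is inlined as a dict for self-containment, and `resolve_metric` is a hypothetical helper, not part of any library:

```python
# Sketch: resolve a metric from the mirrored registry, preferring the Looker
# reference and falling back to direct BigQuery SQL when Looker is unavailable.
# REGISTRY mirrors the metrics.yaml schema above, inlined as a dict here.
REGISTRY = {
    "metrics": {
        "net_revenue": {
            "description": "Revenue excluding tax, refunds, and cancelled orders",
            "looker_reference": {
                "model": "analytics",
                "explore": "orders",
                "measure": "net_revenue",
            },
            "fallback_sql": "SUM(net_amount)",
            "source": "analytics.fact_orders",
        }
    }
}

def resolve_metric(registry: dict, name: str, looker_available: bool) -> dict:
    """Return a query plan: the Looker reference if available, else raw SQL."""
    metric = registry["metrics"][name]
    if looker_available and "looker_reference" in metric:
        return {"backend": "looker", **metric["looker_reference"]}
    # Build direct BigQuery SQL from the registry entry
    return {
        "backend": "bigquery",
        "sql": f"SELECT {metric['fallback_sql']} AS {name} FROM {metric['source']}",
    }

plan = resolve_metric(REGISTRY, "net_revenue", looker_available=False)
print(plan["sql"])  # SELECT SUM(net_amount) AS net_revenue FROM analytics.fact_orders
```

Because both the Looker reference and the fallback SQL describe the same BigQuery view, either path should return the same numbers.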

Using Looker’s Open SQL interface

In August 2024, Google released the Open SQL Interface for Looker, which provides JDBC/SQL access to LookML metrics. This is huge for agent integration.

import jaydebeapi

# Connect to Looker via JDBC
conn = jaydebeapi.connect(
    "com.looker.jdbc.LookerDriver",
    "jdbc:looker://your-instance.looker.com:19999",
    {"user": "your-user", "password": "your-password"}
)

cursor = conn.cursor()

# Query LookML metrics using SQL
cursor.execute("""
    SELECT
        orders.country,
        orders.net_revenue,
        orders.order_count
    FROM analytics.orders
    WHERE orders.order_date >= '2025-01-01'
    GROUP BY orders.country
""")

results = cursor.fetchall()

The agent can now treat LookML like a database – but it’s a semantic database where column names are metric definitions.

Best of both: LookML + ADK + BigQuery

Here is the architecture I recommend for enterprises that are already using Looker:

User question
      ↓
ADK Agent decides:
  ├─→ Complex/exploratory? → Looker API (leverages joins, caching)
  ├─→ Simple/real-time?    → Direct BigQuery (lower latency)
  └─→ New metric?          → Suggest a LookML addition
      ↓
Both paths use the same BigQuery views

Implementation:

class HybridSemanticAgent:
    def __init__(self):
        self.looker = LookerSemanticAgent()
        self.bigquery = SemanticLayerAgent(registry_path="./semantic-layer")

    @Tool
    def query_metric(self, metric_name: str, filters: dict, **kwargs):
        """Route queries to the optimal backend"""

        # Metrics that require joins go through Looker
        if self._requires_joins(metric_name):
            return self.looker.query_looker_metric(
                model="analytics",
                explore=self._get_explore(metric_name),
                measures=[metric_name],
                filters=filters
            )

        # Simple metric: query BigQuery directly
        return self.bigquery.query_metric(
            metric_name=metric_name,
            filters=filters,
            **kwargs
        )

    def _requires_joins(self, metric_name: str) -> bool:
        """Check if a metric needs data from multiple tables"""
        complex_metrics = {
            'customer_lifetime_value',  # Needs orders + customers
            'product_revenue_share',    # Needs orders + products
            'repeat_customer_rate'      # Needs customer history
        }
        return metric_name in complex_metrics
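The routing check is the one piece that needs no external service, so it can be exercised standalone. This is a stripped-down sketch of the `_requires_joins` logic above, with a hypothetical `route_backend` helper that makes the decision explicit:

```python
# Standalone version of the routing check: metrics that span multiple
# tables route to Looker, everything else goes straight to BigQuery.
COMPLEX_METRICS = {
    "customer_lifetime_value",  # needs orders + customers
    "product_revenue_share",    # needs orders + products
    "repeat_customer_rate",     # needs customer history
}

def route_backend(metric_name: str) -> str:
    """Pick the backend for a metric based on whether it needs joins."""
    return "looker" if metric_name in COMPLEX_METRICS else "bigquery"

print(route_backend("customer_lifetime_value"))  # looker
print(route_backend("net_revenue"))              # bigquery
```

A hard-coded set works for a handful of metrics; at scale you would likely derive this from the registry itself (e.g., flag metrics whose LookML definition references another view).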

Real-world example: multi-table metrics

This is where LookML shines – metrics that span multiple tables:

# This is painful to manage in YAML, elegant in LookML
explore: orders {
  join: customers {
    sql_on: ${orders.customer_id} = ${customers.customer_id} ;;
    relationship: many_to_one
  }

  join: products {
    sql_on: ${orders.product_id} = ${products.product_id} ;;
    relationship: many_to_one
  }
}

view: orders {
  measure: revenue_by_customer_tier {
    type: sum
    sql: ${net_amount} ;;
    filters: [customers.tier: "enterprise"]  # Cross-view filter
  }

  measure: high_margin_product_revenue {
    type: sum
    sql: ${net_amount} ;;
    filters: [products.margin_percentage: ">30"]  # Another cross-view filter
  }
}

Your agent can query these complex metrics without having to worry about join logic:

result = looker_agent.query_looker_metric(
    model="analytics",
    explore="orders",
    measures=["revenue_by_customer_tier"],
    dimensions=["order_month"],
    filters={"order_date": "90 days"}
)

Looker handles joins, caching, and SQL generation.

Migration path: YAML → LookML

If you’re starting with YAML and later adopt Looker:

Step 1: YAML only (weeks 1-2)

  • Quick prototyping with minimal infrastructure
  • Prove value with core metrics

Step 2: Hybrid (months 1-3)

  • Introduce Looker to BI users
  • Migrate complex metrics to LookML
  • Have simple metrics in YAML for agents

Step 3: LookML primary (month 3+)

  • LookML is the source of truth
  • Agents perform queries via the Looker API or Open SQL interface
  • YAML becomes a cache/fallback layer

This path lets the investment scale with your needs.
