Thumbtack Powering Safe, Smart Home Services on Databricks with GenAI


Building the most trusted home care platform

Thumbtack’s mission is simple but ambitious: to empower people to manage their homes with confidence and ease by making every service and repair reliable and secure. We support local economies by connecting millions of homeowners across the country with more than 300,000 skilled professionals, from plumbers and electricians to wellness providers and event organizers. The opportunity is huge, but so is the complexity: our goal is to guarantee consistent, exceptional results for every customer, every time.

Unlocking GenAI Value at Thumbtack

The rapid growth of home services and rising customer expectations mean we are constantly evolving our platform. Growing data volumes, unpredictable customer and professional needs, and expanding service categories present both technical and organizational challenges. Thumbtack faced fragmented data science and engineering workflows, siloed infrastructure, and a high bar for privacy and security.

Solving these challenges required more than clever algorithms or faster infrastructure. It required a connected, trusted data and machine learning platform that puts security, privacy, and collaboration at the core. Our vision: integrate our GenAI ecosystem on top of Databricks to drive real, measurable impact.

Trusted GenAI, centralized security and productive data science

Boosting trust and security with a fine-tuned LLM

Thumbtack’s semi-automated message review pipeline is the backbone of our Digital Trust Platform. Every message between a customer and a professional is checked by both rule-based engines and machine learning models. Simple rules catch common abuse cases, but they miss many subtle policy violations. Early systems based on Convolutional Neural Networks (CNNs) struggled to recognize sarcasm, context, and implicit threats.

Fine-tuning a larger language model on Thumbtack’s own labeled data delivered a step change. In our hybrid workflow, a CNN model first filters out clearly benign messages, reducing the LLM workload by 80%. The fine-tuned LLM then focuses on the most challenging 20%, improving precision by 3.7x and recall by 1.5x. Millions of messages are processed each year, keeping conversations safe and honest while avoiding unnecessary cost.
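The two-stage routing described above can be illustrated with a minimal sketch. This is not Thumbtack's implementation: the function names, the `benign_threshold` value, and the toy scoring functions are all assumptions made for illustration. A real system would put a trained CNN and a fine-tuned LLM behind the two callables.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    message: str
    flagged: bool
    stage: str  # "prefilter" or "llm": which stage produced the decision


def review_messages(
    messages: List[str],
    benign_score: Callable[[str], float],
    llm_flags: Callable[[str], bool],
    benign_threshold: float = 0.95,  # hypothetical cut-off, not from the article
) -> List[Verdict]:
    """Clear confidently benign messages cheaply; send the rest to the LLM."""
    verdicts = []
    for msg in messages:
        # Stage 1: the lightweight model clears messages it is confident about.
        if benign_score(msg) >= benign_threshold:
            verdicts.append(Verdict(msg, flagged=False, stage="prefilter"))
            continue
        # Stage 2: only the remaining messages reach the more expensive model.
        verdicts.append(Verdict(msg, flagged=llm_flags(msg), stage="llm"))
    return verdicts


# Toy stand-ins so the sketch runs end to end; real models replace both.
def toy_benign_score(msg: str) -> float:
    return 0.2 if "pay" in msg.lower() else 0.99


def toy_llm_flags(msg: str) -> bool:
    return "outside the app" in msg.lower()


sample = ["See you at 3pm", "Pay me outside the app", "How do I pay?"]
results = review_messages(sample, toy_benign_score, toy_llm_flags)
for r in results:
    print(f"{r.stage:9s} flagged={r.flagged!s:5s} {r.message}")
```

The threshold controls the trade-off: raising it sends more traffic to the LLM (higher cost, fewer missed violations), while lowering it clears more messages at the cheap stage.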

Building on Databricks: Secure, Standardized, and Flexible

All of Thumbtack’s advanced AI and trust workflows now run through a unified ML platform built on Databricks. Key investments and security measures include:

  • Centralized LLM Workload Management: By running all GenAI workloads on Databricks, we reduce our attack surface and maintain a consistent governance model.
  • Workspace Segregation: Virtual private clouds ensure that sensitive data remains secure, with granular permissions managed through tools like Terraform. We use Unity Catalog so that serverless compute and Databricks Genie can access BigQuery with securely managed permissions.
  • Automatic Privacy Protection: Open-source and internally developed scrubbers remove personally identifiable information (PII) and confidential information from data as it flows through notebooks, models, and pipelines.
  • Comprehensive Oversight and Monitoring: Every model, notebook, and API route is tracked for data flow and PII exposure. Visualization tools confirm that risky data is not leaking into downstream systems.
  • Centralized Secrets and Artifact Management: With MLflow and secrets managers, teams securely manage credentials, version all models, and collaborate productively, with no more decentralized, brittle copy-pasting of keys or libraries.
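To make the privacy-protection step in the list above concrete, here is a minimal sketch of a regex-based scrubber. It is an illustration only: the `PII_PATTERNS` table and `scrub_pii` function are hypothetical names, the two patterns cover just simple email addresses and US-style phone numbers, and the scrubbers Thumbtack actually uses are not described in this article.

```python
import re

# Hypothetical patterns for illustration; production scrubbers cover far more
# cases (names, addresses, account numbers) and usually go beyond regexes.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}


def scrub_pii(text: str) -> str:
    """Replace each matched span with a bracketed placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(scrub_pii("Email jane@example.com or call 415-555-0100."))
```

Replacing matches with typed placeholders rather than deleting them keeps the sentence structure intact, which tends to matter when the scrubbed text is later used for model training or review.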

Best Practices in GenAI Operations

  • Hybrid AI Workload: Production services run on AWS with analytics on Google Cloud, but all GenAI workflows are centralized and standardized for reproducibility.
  • Reuse and Efficiency: MLflow and notebook tracking mean experiments or solutions can be shared, compared, and extended across engineering, SRE, and analytics – all with consistent privacy controls.
  • Proactive Privacy Safeguards: Thumbtack customizes open-source PII scrubbers to its specific needs and applies monitoring at every layer. Industry trends indicate a 300% increase in PII-related notebook and model breaches since 2022, making these security concerns business-critical.

More Security, More Trust, More Innovation

  • Market Scale: Millions of US users and over 300,000 local service businesses now interact on a platform that prioritizes security and reliability.
  • Better Message Filtering: Precision improved by up to 3.7x and recall by up to 1.5x, with costs controlled by sending only the riskiest 20% of messages to the LLM and privacy protected at every step.
  • Collaboration and Efficiency: Centralized, reproducible ML workflows eliminate manual handoffs and enable rapid cross-team innovation, allowing data scientists, SREs, and ML engineers to work in sync.
  • Confidence at Scale: With strong technical and process controls, Thumbtack fulfills its mission to be the most trusted, transparent marketplace for home services.

As Thumbtack continues its GenAI journey, every team is empowered to experiment, collaborate, and deliver secure, smart home service experiences. The strategy is based on real-world impact, demonstrating how AI, privacy and platform thinking work together to create value for both professionals and homeowners.

For more, see Thumbtack’s 2025 Data + AI Summit presentation, “Boosting Data Science and AI Productivity with Databricks Notebooks.”
