Roadmap to real integration


Do you know that feeling when you’re building something and the ground keeps slipping beneath your feet? That’s exactly what building an agentic AI stack feels like right now. GPUs evolve, frameworks update, models improve; everything is in constant flux. But here’s what I’ve learned: some things remain constant, and those are the basics you need to focus on.

I recently shared my journey of building an agentic stack for Tata Play, an OTT platform aggregator service. Let me tell you what worked, what didn’t, and what you need to know if you’re stepping into this field.

Enterprise development that got us here

Think about where we come from. We started with monolithic architectures (and hey, Prime Video still uses one for its monitoring service, so they’re not dead yet). Then came the progression: service-oriented architectures, microservices, event-driven architectures, and finally serverless with Lambda functions.

Now? We are in the AI-native era. That means adding reasoning capabilities, large language models, RAG systems, and agentic AI to our existing enterprise stacks. The biggest challenge is not technology but integration: how do you add agentic capabilities to systems that are already running, already serving customers, already generating revenue?

The Layers That Matter (And Why You Can’t Skip Any)

Let me paint you a picture of what the modern agentic stack actually looks like. Yes, it’s complicated. No, you can’t skip layers and hope for the best.

Starting at the top, you’ve got your API layers: the interface between your agents and the world. Below that is the orchestration layer, whether it’s Kubernetes, microservices, or something like LangGraph for workflows. Then come your language models (large or small, depending on your use case), followed by the memory and context layer; this is where the embeddings live and where knowledge graphs provide semantic understanding.

The action layer is where things get interesting. Your agents need tools and APIs to work in the real world. And underneath it all? Data and governance. Because without proper data management and security, you are building a house of cards.


The microservices mandate

Here’s something important: your microservices must be stateless. I cannot emphasize this enough. Store your state in Kafka, Redis, Cassandra, or MongoDB; anywhere but in the service itself. It’s not just about following best practice; it’s about building something that can scale when it needs to.
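To make that concrete, here is a minimal sketch of a stateless handler. All names are illustrative, and an in-memory dict stands in for the external store (Redis, Kafka, or whatever you run in production); only the get/set interface matters.

```python
from dataclasses import dataclass, field

@dataclass
class StateStore:
    """Stand-in for an external store such as Redis."""
    _data: dict = field(default_factory=dict)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

def handle_request(store: StateStore, session_id: str, event: str) -> list:
    """Stateless handler: any replica can serve any request, because
    session history lives in the external store, not in the process."""
    history = store.get(session_id, []) + [event]
    store.set(session_id, history)
    return history
```

Because the handler holds nothing between calls, you can run as many replicas behind a load balancer as the traffic demands.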

And speaking of scale, I want to talk about what we achieved: a system supporting one million transactions per second. Yes, you read that right. It is possible, but only if you plan for it from day one.

Your APIs need clear lifecycle management. Are they experimental? Stable? Deprecated? This matters more than you think, especially when you’re iterating rapidly.
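One lightweight way to make lifecycle stages explicit is to tag endpoints in code. This is a hypothetical decorator, not any particular framework’s API:

```python
import warnings
from enum import Enum
from functools import wraps

class Lifecycle(Enum):
    EXPERIMENTAL = "experimental"
    STABLE = "stable"
    DEPRECATED = "deprecated"

def lifecycle(stage: Lifecycle):
    """Tag an endpoint with its lifecycle stage; warn on deprecated use."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            if stage is Lifecycle.DEPRECATED:
                warnings.warn(f"{fn.__name__} is deprecated", DeprecationWarning)
            return fn(*args, **kwargs)
        wrapper.stage = stage   # machine-readable, so tooling can audit it
        return wrapper
    return decorator

@lifecycle(Lifecycle.EXPERIMENTAL)
def recommend_v2(user_id: str) -> list:
    """Hypothetical experimental endpoint."""
    return []
```

The point is that the stage is machine-readable, so a CI check or API gateway can enforce policy instead of relying on tribal knowledge.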

Database writes should be append-only. For reads, lean on the cache aggressively. And your data pipeline? It needs schema validation, ETL processes, incremental loads, and backfill capabilities. These are not nice-to-haves; they are necessities.
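Both ideas, append-only writes and cache-aside reads, fit in a short sketch. The in-memory log and cache here are illustrative stand-ins for real stores:

```python
class EventLog:
    """Append-only write path: events are added, never updated or deleted."""
    def __init__(self):
        self._events = []

    def append(self, key, value):
        self._events.append((key, value))

    def scan(self, key):
        """Full scan; a real store would index this."""
        return [v for k, v in self._events if k == key]

class CachedReader:
    """Cache-aside read path: check the cache first, fall back to the log."""
    def __init__(self, log):
        self._log = log
        self._cache = {}

    def read(self, key):
        if key not in self._cache:        # miss: one trip to the log
            self._cache[key] = self._log.scan(key)
        return self._cache[key]

    def invalidate(self, key):
        """Call after writes so stale reads don't linger."""
        self._cache.pop(key, None)
```

Append-only writes give you a replayable history for backfills; the cache absorbs the read load that history would otherwise make expensive.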


Five paths forward

Through trial and error, I’ve identified five different approaches to building your agentic stack. Let me break them down:

Path one is meant for teams with existing enterprise systems. You have microservices, they’re stateless, and you’re offloading state to Redis or Kafka. The beauty here? Token efficiency. You’re not calling models unnecessarily. Maybe you have a Lambda function running for 15 minutes, calling a large or small language model as needed. It’s faster to get to market because you’re building on what you already have.
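What token efficiency looks like in practice is roughly this: answer cheap cases with rules and escalate to the model only when you must. A hedged sketch; the canned answers and the injected `call_llm` callable are hypothetical:

```python
CANNED = {
    # Hypothetical fast-path answers that never touch a model.
    "hours": "The service is available 24/7.",
    "price": "See the plans page for current pricing.",
}

def handle(query: str, call_llm) -> str:
    """Token-efficient handler: rules first, model only when needed."""
    key = query.strip().lower()
    if key in CANNED:
        return CANNED[key]          # zero tokens spent
    return call_llm(query)          # injected, so the backend is swappable
```

Injecting the model call also makes the handler trivial to test with a fake, which matters when every real call costs money.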

Path two looks similar but with one main difference: hosting. In path one, you host the models yourself. Path two leverages the public cloud providers: Google, Azure, AWS. The trade-off? Less control for more convenience.

Path three introduces MCP (Model Context Protocol) as a separate component. It standardizes your tooling, queries, and resource access. It’s about creating stability in a world of constant change.

Path four focuses on workflows. Tools like LangGraph let you define states and transitions, calling different models or agents depending on where you are in the process. It’s powerful for complex, multi-step operations.
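The state-and-transition idea can be shown without LangGraph itself (whose actual API differs); this plain-Python sketch just captures the shape of such a workflow:

```python
def run_workflow(state: dict, nodes: dict, edges: dict, start: str) -> dict:
    """Tiny workflow engine: each node transforms the state, each edge
    inspects the updated state and picks the next node (None = done)."""
    current = start
    while current is not None:
        state = nodes[current](state)
        current = edges[current](state)
    return state

# Illustrative two-step workflow: classify, then answer or escalate.
nodes = {
    "classify": lambda s: {**s, "simple": len(s["query"]) < 20},
    "answer":   lambda s: {**s, "result": "quick answer"},
    "escalate": lambda s: {**s, "result": "handed to large model"},
}
edges = {
    "classify": lambda s: "answer" if s["simple"] else "escalate",
    "answer":   lambda s: None,
    "escalate": lambda s: None,
}
```

The value of the pattern is that routing decisions live in the edges, so you can change which model handles which step without rewriting the steps themselves.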

Path five (and this is cutting-edge stuff) introduces agent sandboxes. Think of Android apps running sandboxed on Linux. Everything is controlled: your data, your file system, your execution environment. This came to light literally last week with the Enterprise Agent Cloud announcements at KubeCon North America 2025. I am optimistic about this approach. Imagine an agent store where developers deploy agents like mobile apps. We’re not there yet, but it’s coming.

Use cases that taught me everything

Let me share what we built for our OTT aggregator platform. Instead of subscribing to multiple streaming services, users subscribe to our aggregator and access them all through one interface. We built models for metadata enrichment, recommendations, discovery, video monitoring, quality-of-experience tracking, and content publishing.

The important lesson here: we built this framework three years ago. Models have changed. Frameworks have evolved. But the application data, the interaction patterns, the user insights we captured? They are still gold. The data you collect today will outlive any specific model or framework you choose.

Our multimodal recommendation system taught us the value of flexibility. We use proxies and load balancers to route calls between locally hosted and remote models. That means we can swap models without disrupting service. That kind of architectural decision pays off over time.
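Stripped to its essentials, that routing layer looks something like the sketch below. The backend callables are hypothetical; in a real deployment this logic would sit behind a proxy or load balancer rather than in application code:

```python
import random

class ModelRouter:
    """Routes each call to a local or remote model behind one interface,
    so either backend can be replaced without touching callers."""
    def __init__(self, local, remote, remote_fraction=0.2, rng=random.random):
        self._local = local
        self._remote = remote
        self._fraction = remote_fraction
        self._rng = rng   # injectable so routing is deterministic in tests

    def __call__(self, prompt: str) -> str:
        backend = self._remote if self._rng() < self._fraction else self._local
        return backend(prompt)
```

Shifting `remote_fraction` lets you canary a new model on a slice of traffic before committing to it.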


Data: constants in a world of variables

Let me be very clear about this: data management will make or break your agentic system. You need to think about data at three levels:

  1. Session data: what remains within a single user session?
  2. Cross-session data: what persists across multiple interactions?
  3. Long-term data: what forms part of your institutional knowledge?

Every piece of data entering your system needs to be captured and organized. Whether that requires ranking, deduplication, prioritization, or aging, you need a plan. It’s not glamorous work, but it’s the foundation everything else is built on.
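Those three levels map naturally onto a tiered store. A minimal sketch, with all names illustrative; real backends might be Redis for the session tier, a document store for the user tier, and a knowledge graph for the long-term tier:

```python
class TieredMemory:
    """Three scopes of agent memory: session scratch, cross-session user
    state, and long-term institutional knowledge."""
    def __init__(self):
        self.session = {}        # dies with the session
        self.cross_session = {}  # persists across a user's sessions
        self.long_term = {}      # outlives any user or model

    def remember(self, scope: str, key: str, value) -> None:
        getattr(self, scope)[key] = value

    def end_session(self, promote: bool = True) -> None:
        """On session end, optionally promote scratch into the user scope,
        then clear it: a crude form of the aging/prioritization plan."""
        if promote:
            self.cross_session.update(self.session)
        self.session.clear()
```

The promotion step is where your ranking and aging policy lives: deciding what graduates from scratch to durable state is exactly the plan the paragraph above calls for.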

We experimented extensively with vector databases. The LightFM and DeepFM models were giving us slow query-to-embedding performance. After testing several options, we landed on Milvus for its scaling capabilities. For the knowledge graph, we went in-depth on metadata enrichment, carefully designing our node and context structures.

The build-vs-buy decision matrix

This is where things get strategic. You need to identify what won’t change and what will add unique value to your organization. Here is my outline:

Build these components:

  • Your orchestration layer (if you have microservices, keep them stateless and add a serving layer for delivery)
  • Memory architecture (Redis or Hazelcast for short-term, Neo4j for knowledge graphs)
  • Context Routing (this is your secret sauce, keep it in house)
  • Data pipelines (transformations, schema mapping, deduplication – all important and specific to your use case)
  • Governance and security rules (domain-specific and important for compliance)
  • Cost optimization and model routing (you need visibility into what your money is being spent on)
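On that last build item, cost visibility and model routing, even a crude ledger beats flying blind. A hypothetical sketch (the tiers and prices are made up for illustration):

```python
class CostAwareRouter:
    """Picks a model tier per request and records what each tier costs."""
    PRICE_PER_CALL = {"small": 0.001, "large": 0.03}   # hypothetical USD

    def __init__(self, models: dict):
        self._models = models                      # tier name -> callable
        self.spend = {tier: 0.0 for tier in models}

    def __call__(self, prompt: str, needs_reasoning: bool = False) -> str:
        tier = "large" if needs_reasoning else "small"
        self.spend[tier] += self.PRICE_PER_CALL[tier]
        return self._models[tier](prompt)
```

The `spend` dict is the visibility: exported to your metrics system, it tells you exactly where the model budget goes and which calls should be downgraded to a cheaper tier.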

Buy or adopt these:

  • Large and small language models (the open source ecosystem is rich here)
  • Edge inference capabilities (Akamai’s edge inference offering is a game-changer at scale)
  • Vector Database (Milvus has proven itself)
  • MCP and agent frameworks (LangGraph or CrewAI are solid choices)
  • DevOps and MLOps platforms (unless you have very specific needs)
  • Experiment tracking platforms (MLflow, Weights & Biases, Comet, or similar)

The edge inference revolution

Here’s something that doesn’t get enough attention: inference doesn’t have to live in your centralized infrastructure. Edge inference matters for scale. When you’re heading toward that million-TPS mark, centralizing all inference becomes your bottleneck. Akamai and Cloudflare are doing incredible things here. Take this seriously.


Your integration touchpoint strategy

It’s about future-proofing. Your stack needs multiple integration touchpoints: API-driven, modular, replaceable. You will change models. You will adopt new platforms. If you’re tied too tightly to any one component, you’re setting yourself up for pain.

Takeaways that matter

After building and rebuilding these systems, I know this for sure:

First, the founding pillars of your enterprise system still matter. Scale, reliability, security: these don’t go away because you’ve added AI. They become more critical.

Second, intelligence should be woven into the fabric of your enterprise, not bolted on. Your agentic architecture needs to reason, adapt, collaborate, and, crucially, work with your existing systems.

Third, identify your business case now. Not in three months. Not after more research. Now. Use prompts and agents to build something today. But recognize that this is just the first step.

Fourth, establish agentic rules and structures specific to your domain. It’s not about following someone else’s playbook; it’s about writing your own.

Fifth, build solid application workflows that handle memory, context, and knowledge graphs. These become your treasure trove of information, one that only you can create for your specific domain.

Sixth, fine-tune constantly. Generic language models won’t cut it. Whether you use LoRA, QLoRA, or other methods, you need models that understand your specific context.

Seventh, invest in better inference strategies. Edge-based inference is not optional if you want to scale. Think meta-scale, not MVP-scale.

Finally, own your domain. The application layer, the data, the user behavior: they’re yours. They will outlast any particular technology choice you make today.

The bottom line

Building an agentic stack feels like constructing a building during an earthquake. Everything is changing, evolving, improving. But some things remain constant: the need for solid architecture, the value of your data, and the importance of building for change rather than for stability.

Your application layer and the data it generates will stay with you long after today’s hot frameworks become obsolete. Build your stack to capture and take advantage of that value. Make it flexible enough to evolve but stable enough to be trusted.

Models will change. Frameworks will evolve. But the problems you are solving and the value you are creating? Those are yours to own. Build accordingly.


Pratap Chaudhary, SVP, Head of Product Engineering, Tata Play, made this presentation at our Agentic AI Summit in London in December 2025.
