LAI #110: Fixing context rot and rethinking how agents reason


Author(s): Towards AI Editorial Team

Originally published on Towards AI.

Good morning, AI enthusiasts,

This week, we’re looking at why agent systems get lost, confused, or silently break down when tasks run long. I uncover the real cause of “random” agent degradation: essential information gradually buried under noise as interactions, tool calls, and heaps of intermediate steps pile up, a failure mode known as context rot.

Curated articles expand on the themes of structure and clarity. You’ll find a guide to microservice architectures for ML systems, breaking training, inference, and data pipelines into modular services; a likelihood-free evaluation method (BrierLM) for models that predict continuous representations rather than tokens; a real-world case study on metro delay prediction using telemetry data; and a practical overview of context engineering as the “operating system” of agent performance. We also take a look at recursive language models, systems that decompose tasks into clean, isolated subtasks to sidestep the limits of traditional context windows.

Let’s dive in!

What is AI Weekly?

This week, in What is AI?, I explain why so many agents “randomly” fail during long tasks, and why this is usually not your prompt or your model. The real culprit is context: as conversations and tool calls pile up, essential details get buried beneath the noise, leading to drift, confusion, and hallucinations (what I call context rot). I explain what context engineering really means in practice, and the core techniques that keep systems reliable at scale, particularly retrieval, compaction, and structured memory, so that the model can see the right information at the right time. Watch the full video here!

-Louis-François Bouchard, co-founder and head of community at Towards AI

Learn AI together in the Community section!

Featured community posts from Discord

Kiskare shared APIHub, a platform that offers affordable API pricing for image-generation models. It offers predictable flat pricing per request, and you can choose between NanoBanana, NanoBanana Pro, or Imagen 4. Check it out here and see if it is useful for you. If you have any questions, ask in the thread!

AI Survey of the Week!

Most of you are building multi-agent systems with LangGraph, with a large second group on custom frameworks and a noticeable n8n slice. Determinism, tool-call reliability, and observability are deciding the stack. LangGraph’s graph/state model scratches that itch; custom builds appear where teams need domain-specific guards, cheaper runtimes, or lighter deployments than heavy frameworks.

Share an actual workflow diagram or snippet (nodes/steps + guards) and the runtime feature(s) that decided your stack, for example, checkpointing, step/budget limits, human-in-the-loop, or tracing. Let’s talk in the thread!

Opportunities for collaboration

The Learn AI Together Discord community is bursting with opportunities for collaboration. If you’re excited to get into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section too – we share new opportunities every week!

1. View Loading is building automation with AI and looking for partners. If you are interested, join them in the thread!

2. skazan_ is learning GenAI with LangChain and is looking for a study partner. If that’s your focus for the next few months, reach out in the thread!

3. Sakshamgarg08295 is starting to build an agent from scratch. If you are interested in pursuing this, contact them in the thread!

Meme of the week!

Meme shared by Bigbuckshungas

TAI Curated Section

Article of the week

Understanding Microservice Architecture for Machine Learning Applications by faizulkhan

This guide explains how to use a microservices architecture for machine learning applications. It begins by comparing monolithic and microservices approaches, outlining the benefits of the latter for ML systems. The discussion covers core services, including data ingestion, model training, and inference, as well as communication protocols such as REST and gRPC. It also clarifies the difference between stateless and stateful services. To demonstrate these concepts, it includes a practical lab that builds a simple two-service system, providing hands-on experience with the architecture.
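To make the stateless idea concrete, here is a minimal sketch (not from the article) of an inference-service handler: everything the handler needs arrives in the request, so any replica can serve it. The model is a stub, and names such as `PredictRequest` are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class PredictRequest:
    features: list[float]   # everything the handler needs travels with the request

@dataclass
class PredictResponse:
    score: float

def load_model():
    # Stub: a real service would load trained weights at startup.
    weights = [0.5, -0.25, 1.0]
    return lambda x: sum(w * f for w, f in zip(weights, x))

MODEL = load_model()  # loaded once per process, immutable across requests

def predict(req: PredictRequest) -> PredictResponse:
    # Stateless: the output depends only on the request and the immutable model,
    # so this handler can sit behind a REST or gRPC endpoint on any replica.
    return PredictResponse(score=MODEL(req.features))
```

In a real deployment this function would be exposed through a framework such as FastAPI (REST) or a gRPC servicer, with the data-ingestion and training pipelines running as separate services.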

Our must-read articles

1. Beyond perplexity: Evaluating next-vector prediction when softmax is not an option by Fabio Yanez Romero

Evaluating language models that predict continuous vectors rather than discrete tokens requires going beyond traditional metrics such as perplexity. This summary covers BrierLM, a likelihood-free alternative that assesses model quality using only generated samples. Based on the Brier score, the method evaluates both the accuracy and the uncertainty calibration of the model against the ground truth by comparing two independent samples. The approach provides a stable evaluation for sample-based systems and extends to n-grams to measure text coherence, offering a practical tool for architectures such as CALM or diffusion-based language models.
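A toy version of the sample-based idea (my illustration, not the paper’s exact estimator): for a discrete outcome, the Brier score sum_i p_i^2 - 2*p_y + 1 can be estimated without ever reading a probability, because agreement between two independent model samples estimates sum_i p_i^2 and agreement with the ground truth estimates p_y.

```python
import random

def brier_estimate(sampler, truth, n_pairs=20000, seed=0):
    """Likelihood-free Brier estimate from paired samples.

    E[1(x1 == x2)] estimates sum_i p_i^2 and E[1(x == truth)] estimates p_y,
    so each pair contributes the unbiased term 1(x1==x2) - 1(x1==y) - 1(x2==y) + 1.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        x1, x2 = sampler(rng), sampler(rng)
        total += (x1 == x2) - (x1 == truth) - (x2 == truth) + 1
    return total / n_pairs

# Toy model that samples token "a" with p=0.7, "b" with 0.2, "c" with 0.1.
sampler = lambda rng: rng.choices(["a", "b", "c"], weights=[7, 2, 1])[0]
est = brier_estimate(sampler, truth="a")
# Analytic Brier here: (0.49 + 0.04 + 0.01) - 2*0.7 + 1 = 0.14
```

Lower is better; a model that always emits the ground-truth token scores exactly zero under this estimator.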

2. Can you predict subway delays before transit officials announce them? by charlie taggart

The author explores a method for predicting subway delays before official transit announcements. Using 10 months of public MBTA train telemetry, a machine learning model was developed to identify patterns that precede service disruptions. The model, a random forest classifier, analyzed metrics such as train headway and station dwell times. It successfully predicted official alerts with an average lead time of 35 minutes. This approach proved more reliable than simple baselines, striking a better balance between accurate warnings and false alarms and offering a practical way to give passengers advance notice of potential delays.
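As a rough sketch of that modeling setup (the synthetic data, feature names, and decision rule below are my assumptions, not the article’s dataset), a random forest over headway and dwell-time features might look like this:

```python
import random
from sklearn.ensemble import RandomForestClassifier

rng = random.Random(42)

# Synthetic telemetry: [headway_seconds, dwell_seconds]; label 1 = delay alert soon.
X, y = [], []
for _ in range(1000):
    headway = rng.uniform(120, 900)
    dwell = rng.uniform(20, 120)
    X.append([headway, dwell])
    # Toy rule: long headways combined with long dwells tend to precede alerts.
    y.append(1 if headway > 600 and dwell > 60 else 0)

# Train on the first 800 points, evaluate on the held-out 200.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:800], y[:800])
accuracy = clf.score(X[800:], y[800:])
```

On real telemetry the labels would come from the timing of official alerts, and the train/test split would need to respect time order to avoid leakage.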

3. Context Engineering: The Silent Revolution Transforming AI Agents by Mahendra Medapati

The performance of AI agents often depends not on the models themselves, but on effective context engineering: the systematic design of information flows. This piece frames context engineering as the operating system for AI, managing how agents store, retrieve, and use information. It outlines key strategies, including memory management, advanced retrieval methods such as RAG, context compression, and task isolation using specialized sub-agents. It also addresses common failure modes such as context poisoning and drift, while providing practical guidance on performance optimization, particularly through KV cache management to improve speed and reduce costs.
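To illustrate the retrieval side in miniature (a deliberately naive stand-in for real RAG, with no embeddings, invented memory entries, and word overlap in place of semantic similarity): score stored notes against the query and inject only the best match into the prompt.

```python
def tokenize(text: str) -> set[str]:
    return set(text.lower().split())

def retrieve(query: str, memory: list[str], k: int = 1) -> list[str]:
    """Return the k memory entries sharing the most words with the query."""
    scored = sorted(
        memory,
        key=lambda note: len(tokenize(note) & tokenize(query)),
        reverse=True,
    )
    return scored[:k]

memory = [
    "user prefers metric units in all reports",
    "deploy target is the staging cluster",
    "weekly report is sent every friday",
]
context = retrieve("what units does the user prefer", memory)
```

A production system would swap the word-overlap score for embedding similarity, but the contract is the same: only the relevant slice of memory ever reaches the model’s context window.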

4. MIT abolished the context window by Alok Ranjan Singh

To address context decay, the degradation of LLM performance on long inputs, this article explores recursive language models (RLMs). Instead of ingesting huge contexts directly, an RLM acts as a programmatic wrapper that decomposes tasks into subtasks. It writes code to solve a problem, spawns new LLM instances to analyze each part with a clean context, and then synthesizes the results into a final answer. This method enables precise reasoning over millions of tokens, shifting the approach from direct context processing to a more structured, programmatic exploration of the data, thereby overcoming the limitations of traditional models.
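The decomposition loop can be sketched as follows. The `llm` function is a stub standing in for a fresh model call, and the chunking and synthesis strategy are my assumptions, not the article’s implementation:

```python
def llm(prompt: str) -> str:
    # Stub for a fresh LLM instance with a clean context window.
    # Here it simply "answers" by counting ERROR markers in the chunk it received.
    return str(prompt.count("ERROR"))

def rlm_answer(question: str, lines: list[str], lines_per_chunk: int = 100) -> int:
    """RLM-style wrapper: split the input, solve each piece in isolation, synthesize."""
    # 1. Decompose the huge input into pieces that each fit a clean context.
    chunks = [lines[i:i + lines_per_chunk] for i in range(0, len(lines), lines_per_chunk)]
    # 2. Spawn an isolated sub-call per chunk; no chunk ever sees the others.
    partials = [llm(question + "\n---\n" + "\n".join(chunk)) for chunk in chunks]
    # 3. Synthesize the partial results into a final answer.
    return sum(int(p) for p in partials)

log_lines = (["service ok"] * 30 + ["disk ERROR"]) * 40   # 1,240 lines, far beyond one "window"
answer = rlm_answer("Count failure markers in this chunk", log_lines)
```

Real RLMs let the model itself write the decomposition code, but the shape is the same: many small clean-context calls plus a synthesis step, rather than one overloaded prompt.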

If you’re interested in publishing with Towards AI, check out our guidelines and sign up. If your work meets our editorial policies and standards, we will publish it on our network.

Published via Towards AI
