This Week in AI: Production Feasibility – O’Reilly

by ai-intensify

In this week’s episode, host Andreas Welch, founder of the AI consulting firm Intelligence Briefing, brings together Maya Mikhailov, co-founder and CEO of Savi AI, and Doug Shannon, a generative AI and intelligent automation leader, to cover four interconnected topics that practitioners are paying attention to right now: OpenAI’s push into personal finance, the role of metacognition in AI-assisted technical work, the growing backlash against token-based productivity metrics, and the emerging role of the forward-deployed engineer. Together, these stories paint a picture of an industry that is good at generating output but is still figuring out what that output is worth.

Why does OpenAI want your bank account data?

When OpenAI announced it was analyzing users’ transaction data in partnership with financial institutions, the coverage focused on consumer benefits: a smarter way to track spending, similar to what Credit Karma or Mint offer but with a more interactive interface.

But that’s not the only thing the company is interested in, or even the main thing. Maya reframed the stakes: “What OpenAI wants to do is figure out consumer intent.” Access to users’ financial data is less about helping people manage their money and more about completing a profile that the company can then monetize. OpenAI already builds a surprisingly accurate picture of users from their chat histories. Add transaction data and it gains details that weren’t there before: what someone is saving for, what they’re worried about, where their money is actually going. That is a very valuable data asset for advertisers.

We’ve seen this pattern before, and as Andreas notes, companies have long possessed (and used) potentially sensitive data to recommend products. The Target pregnancy prediction story is over a decade old now, but it’s still taught in business schools, including Andreas’s, precisely because it shows how behavioral data can be combined to predict things people haven’t explicitly disclosed. It also highlights the fine line between recommendations that are effective and ones that feel too personalized, reminding consumers how much information companies hold about them. Companies’ profile-building capabilities haven’t changed, Maya said, but AI chat adds a new complication. A conversational interface makes disclosure feel natural, so a knowledge graph built from your chat history is very powerful. These tools are also better positioned to deliver recommendations than traditional channels. “By having this style of what’s acceptable, what’s attractive,” Maya explained, “those recommendations will be much stickier than a piece of sentence I typed into a regular search engine.”

Metacognition as a professional skill

When you hand over thinking to a system that averages a huge range of inputs to produce an answer, you need to know when that answer is good enough and when it’s not.

“We’re basically being averaged out,” Doug said. The model does a lot of work behind the scenes to arrive at an average response. The human’s job is to question the questions, push past first answers, and check whether their own judgment is still in the loop. That’s why Doug is pushing for a renewed interest in metacognition, or “thinking about thinking.” It’s OK to reduce the peripheral cognitive load of your work, Doug and Maya agreed. Offloading the reasoning that is central to the value of your job, which Doug calls cognitive dedication, is where organizations get into trouble.

Future advantage will not come from access to AI. Everyone will have some kind of access to it. It will come from knowing what to offload, what questions to ask, and what human judgment should never be abandoned. This is as much a philosophical question as it is a question of skill development. The people who will be most effective with AI tools are not those who use them the most; they are the ones who understand what to hand off and what to keep. This requires domain knowledge, judgment about when a model’s answer is plausible but wrong, and enough fluency in how these systems work to recognize when you are being given an average rather than an answer.

Tokenmaxing and wrong incentives

The tokenmaxing debate seems to be reaching its peak. Amazon ended its AI productivity leaderboard after employees started gaming it by writing inefficient code to increase their token usage. And one company reportedly burned through $500 million in Anthropic tokens in one month after failing to set limits. Maya argued that companies that encourage tokenmaxing are rewarding the wrong metric. It’s like deciding which bakery is the best by the amount of flour it uses. The right question is “Are we making a quality product?”

Andreas shared his own vibe coding experience as an example of how token consumption and technical debt compound in practice. A developer starts with a modest plan and kicks off agents that hit their quota in half an hour. They upgrade to a higher tier, paying five times more, but now sunk-cost logic applies. As Andreas explained, they now feel “they should be getting five times the value [from their subscription],” so the scope expands from a single tool to an integrated business operating system. Three weeks later, the accumulated complexity has outstripped the ability to evaluate it: repeated security audits keep turning up new issues, each pass generating recommendations that require cybersecurity expertise most vibe coders don’t have. Doug’s point about metacognition applies here: the more actively a builder works to understand what the system is actually doing, the better their judgment about whether it is working. For less engaged users, the risk is accepting the output, shipping the debt, and discovering the consequences later.

Much of the misalignment arises from the gap between what executives expect from AI and what practitioners deal with on a day-to-day basis. Maya said executives see a potential that could change the productivity curve. Engineers and analysts live with technical debt, version control problems, and regulatory hurdles that don’t disappear because you have a better code completion tool. The leaderboard problem is a symptom of that disconnect.

GitHub’s recent change from unlimited to usage-based pricing for Copilot is likely to realign these incentives faster than any internal policy change. When more CFOs start seeing actual bills, the leaderboards will come down.

Doug identified a related problem: “cognitive dedication” to LLMs. When organizations encourage employees to feed internal processes, proprietary logic, and institutional knowledge into a foundation model without governance, they are not merely footing the token bill. They are handing over the operational knowledge that sets them apart. Process documentation, workflow logic, and institutional memory about why certain decisions were made are all forms of intellectual property, and once they are encoded into a general-purpose model, the organization’s advantage from them shrinks.

Forward-deployed engineers are not enough on their own

Is the answer to these challenges to place a skilled engineer directly into a customer environment to bridge the gap between what a model produces and what an organization actually needs? This is the promise of the forward-deployed engineer (FDE) approach popularized by AI companies. Doug and Maya both raised criticisms of the approach.

Maya’s objection was structural. Enterprise AI deployment is not a matter of adding capacity on top of existing infrastructure. Organizations come with confidential data, legacy systems, and regulatory hurdles that no forward-deployed engineer can solve with technical skills alone. “You can’t just sprinkle some AI on it, and it will just work from a package of tokens,” Maya said. Engineers need to know the context behind why certain data cannot be used or why a particular model cannot be deployed in a regulated setting. FDEs who are new to an organization do not have this understanding and, as a result, may undo decisions that were made carefully and for reasons that are not clearly written down anywhere.

Doug’s concern was about communication. In his experience, FDEs arrive with strong technical inclinations and limited organizational context. They get to work quickly but struggle to communicate with the full range of stakeholders involved. That’s why business analysts exist: to understand the customer’s problem and what the process actually is before engineers try to solve it. Skip that step and you will get technically correct output that solves the wrong problem.

Both Maya and Doug emphasized that deploying AI at the enterprise level is fundamentally a context problem. Models are capable. The hard part is knowing which capability to apply, where to apply it, and under what constraints. That knowledge does not reside in the model; it resides in the people who have worked inside the organization long enough to know why things are the way they are.

The measurement problem

All the topics in this episode boil down to the same question: What are we actually measuring, and what incentives are we setting up with those measurements? Token counts and lines of code don’t always correspond to the results companies want. To figure out what goals you want to achieve and what measures you need to take to get there, you need human expertise and relevant knowledge of the business.

In next Monday’s episode of This Week in AI, Recomind founder Miguel Fierro joins host Christina Stathopoulos to discuss responsible AI, multimodal content creation, and how LLMs are changing personalization and user understanding. Miguel will also lead a live demo offering a glimpse of the next generation of recommendation experiences.

We’ll continue to publish our takeaways and share full episodes on Radar every Friday, as well as on YouTube, Spotify, Apple, or wherever you get your podcasts.
