Yann LeCun’s new venture is a contrarian bet against big language models

You were working on AI long before LLMs became a mainstream approach. But ever since ChatGPT launched, LLMs have become almost synonymous with AI.

Yes, and we’re going to change that. The public face of AI is, perhaps, mostly LLMs and chatbots of various types. But the latest of those are not pure LLMs. They include many components besides the LLM, such as perception systems and code that solves particular problems. So we are going to see LLMs acting, to some extent, as orchestrators within those systems.

Apart from LLMs, there is a lot of AI behind the scenes that runs a large part of our society. There are driver-assistance systems in cars, faster MRI imaging, algorithms that run social media – it’s all AI.

You have been vocal in arguing that LLMs can only take us so far. Do you think LLMs are overhyped these days? Can you briefly tell our readers why you believe LLMs are not enough?

In one sense they are not overhyped: they’re extremely useful to a lot of people, especially if you write text, do research, or write code. LLMs manipulate language really well. But people have this illusion or misconception that it’s just a matter of time until we can get them to human-level intelligence, and that’s absolutely false.

The really hard part is understanding the real world. This is Moravec’s paradox (a phenomenon observed in 1988 by computer scientist Hans Moravec): what is easy for us, such as perception and navigation, is hard for computers, and vice versa. LLMs are confined to a separate world of text. They can’t really reason or plan, because they have no model of the world. They cannot predict the consequences of their actions. This is why we don’t have domestic robots that are as agile as a house cat, or a truly autonomous car.

We will have AI systems with human-level intelligence, but they will not be built on LLMs, and it is not going to happen in the next year or two. This will take some time. Before we can have AI systems with human-level intelligence, major conceptual breakthroughs must be made. That is what I am working on. And this company, AMI Labs, is focusing on the next generation.

And your solution is world models and the JEPA architecture (JEPA, or “Joint Embedding Predictive Architecture”, is a learning framework, created by LeCun while he was at Meta, that trains AI models to understand the world). What is the elevator pitch?
