The Allen Institute for AI on Tuesday released Molmo 2, a suite of open video language models. The new additions, along with the training data, reflect the nonprofit’s continued commitment to open source, which is a benefit for enterprises that want to better control their use of the model.
The new models include Molmo 2-4B and Molmo 2-8B, both built on Alibaba's Qwen3 language model. The release also includes Molmo 2-O-7B, a completely open version based on AI2's Olmo language model.
Along with the models, the nonprofit released nine new datasets, including a long-form question answering dataset for multi-image and video inputs, as well as an open video pointing and tracking dataset.
Molmo 2’s capabilities
According to the company, the Olmo variant, Molmo 2-O-7B, is a fully transparent model that users can study from start to finish. AI2 said that because users have access to both the vision language model and its underlying LLM, Olmo, they can fully customize the model, providing a level of transparency.
For the Molmo 2 models, one of the new capabilities AI2 added is the ability to understand multiple images. The models support any number of images and videos of any length, the company said.
Users can ask the model questions about images or videos, and it can ground its answers in the videos, said Ranjay Krishna, director of perceptual reasoning and interaction research at AI2.
“What I mean by ‘ground’ is that it’s not just giving you the answer, it’s giving you a point in pixels or at the time when something happens,” Krishna said.
The models can also generate descriptive captions, track and count objects in a frame, and detect rare or surprising events in long video sequences. Molmo 2 is available on Hugging Face and in the AI2 Playground, the nonprofit's platform where users can experiment with its various tools and models.
Commitment to open source
The release reflects AI2's continued commitment to open source. According to Bradley Shimmin, an analyst at The Futurum Group, it highlights the importance of a vendor releasing not only models but also the data and weights associated with them.
"They should be given some attention, especially as we start to push to bring corporate data into models with an emphasis on sovereignty," Shimmin said, "where the data must comply with the laws of the country where it was generated."
He said AI2's decision to keep its models small, at 4 billion or 8 billion parameters, is important because not every enterprise can afford, or needs, a trillion-parameter model.
"It is not economically viable," he said. "Molmo is a significant family of models because you don't need frontier scale to get the value."
Enterprises are also becoming aware that it is not the size of the model that matters so much as the data it is trained on, he said.
"Many companies are demanding a level of transparency and accountability from model makers not only for the model, but for the data it is built on, to give them the freedom to innovate," Shimmin said. "This is another reason why the open source model of innovation is so important for the entire IT landscape."
While the new Molmo 2 models offer greater flexibility for fine-tuning, as well as high-quality data for those who want to replicate them, AI2 also faces challenges, namely adoption and, with it, funding.
"When an industry like ours moves money around based on estimated future value, it's easy for a company like this to get left behind or left out," Shimmin said.