
#ai2

#AI2 releases OLMo 2 32B, trained on 6T tokens with #Tulu3.1 post-training. It matches or exceeds GPT-3.5 Turbo while using just one third the compute of #Qwen2.5 32B. The complete open recipe includes data, code, weights, and training methodology.
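Since the weights are part of the open release, the checkpoint can be pulled straight from Hugging Face. A minimal sketch, assuming the repo id allenai/OLMo-2-0325-32B (not confirmed in the post) and a transformers version with native OLMo 2 support:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: Hugging Face repo id of the 32B release; OLMo 2 has native
# transformers support, so no trust_remote_code flag is needed.
model_id = "allenai/OLMo-2-0325-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Plain continuation, treating this as the base (pretrained) checkpoint;
# the Tulu-post-trained variant would go through a chat template instead.
inputs = tokenizer(
    "Fully open language models matter because", return_tensors="pt"
).to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```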

Game over for #OpenAI?

Tülu 3 405B achieves competitive or superior performance to both #DeepSeek v3 and #GPT-4o, while surpassing prior open-weight post-trained models of the same size, including #Llama 3.1 405B Instruct and Nous Hermes 3 405B, on many standard benchmarks.

#AI #AI2 #tech #technology #opensource #altman #software #data

allenai.org/blog/tulu-3-405B

allenai.org · Scaling the Tülu 3 post-training recipes to surpass the performance of DeepSeek V3 | Ai2 · Introducing Tülu 3 405B, the first application of fully open post-training recipes to the largest open-weight models.
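The Tülu 3 checkpoints are post-trained chat models, so a hedged usage sketch goes through transformers' chat template. The repo id allenai/Llama-3.1-Tulu-3-8B below is an assumption; the larger variants should expose the same interface, hardware permitting:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: Hugging Face repo id; swapping in a 70B or 405B variant
# changes memory requirements, not the calling pattern.
model_id = "allenai/Llama-3.1-Tulu-3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Summarize the Tulu 3 post-training recipe."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```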

These Mini AI Models Match OpenAI With 1,000 Times Less Data

Jason Dorrier discusses the AI industry's focus on scaling up models and contrasts this with the Allen Institute for AI's (Ai2) approach of creating efficient, smaller models like Molmo.
Molmo outperforms larger models by using high-quality data, and it is open source.


singularityhub.com/2024/10/04/

Singularity Hub · These Mini AI Models Match OpenAI With 1,000 Times Less Data · Ai2's new family of open-source AI models is competitive with state-of-the-art models like OpenAI's GPT-4o, but an order of magnitude smaller.

🧠 #AI2 unveils #opensource #Molmo #LLM family, competing with top proprietary models

🏆 72B-parameter Molmo outperforms #GPT4 in image and document comprehension tests

🎯 7B-parameter version approaches state-of-the-art performance with significantly less data

📊 Trained on 600k high-quality, annotated images vs. billions in other models

👆 New "pointing" capability allows Molmo to identify specific elements in images (see the sketch after this list)

🌐 Available for developers on #HuggingFace, promoting open-source #AI development
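A minimal sketch of the pointing behavior via transformers, assuming the Hugging Face repo id allenai/Molmo-7B-D-0924 and the processor.process / generate_from_batch remote-code interface from its model card; the <point x= y=> output format is likewise an assumption about how the model reports coordinates:

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

# Assumption: repo id and remote-code entry points as documented on the
# Molmo model card; Molmo ships its own modeling code, hence trust_remote_code.
repo_id = "allenai/Molmo-7B-D-0924"
processor = AutoProcessor.from_pretrained(
    repo_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

# Any RGB image works; this URL is just a placeholder example.
image = Image.open(
    requests.get("https://picsum.photos/id/237/536/354", stream=True).raw
)

# "Point to ..." prompts make Molmo answer with image coordinates,
# e.g. <point x="31.5" y="52.0" alt="dog">dog</point> (format assumed).
inputs = processor.process(images=[image], text="Point to the dog.")
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=100, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer,
)
# Strip the prompt tokens before decoding the answer.
answer = output[0, inputs["input_ids"].size(1):]
print(processor.tokenizer.decode(answer, skip_special_tokens=True))
```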

technologyreview.com/2024/09/2

MIT Technology Review · A tiny new open-source AI model performs as well as powerful big ones · By Melissa Heikkilä

A tiny new open-source AI model performs as well as powerful big ones
The Allen Institute for Artificial Intelligence (#Ai2) has released a family of models, called #Molmo, that it says performs as well as top proprietary models from OpenAI, Google, and Anthropic. The results suggest that training models on less, but higher-quality, data can lower computing costs.
Ai2 claims its biggest Molmo model, which has 72B parameters, outperforms GPT-4o, which is estimated to have over a trillion parameters.
technologyreview.com/2024/09/2

The Allen Institute for AI (AI2) has introduced OLMoE, a new open-source language model that promises high performance and cost efficiency.

#AI #AI2 #LLM #OLMoE #ШІ

thetransmitted.com/ai/nova-mod