
#dl


A Puzzle, Coffee, and an Armful of Books, or How I Searched for the Origins of the Term "Deep Learning". Part 2

Hi! Some time ago I started searching for the origins of the term "Deep Learning". Back then I looked only at foreign sources, and I promised to come back later with a review of the Soviet and Russian literature. Well, there's no putting it off any longer. Let's see whom Russian authors cite when it comes to the history of deep learning. Without a long introduction: fingers on Ctrl/Cmd+F, and let the digging begin!

habr.com/ru/companies/selectel


AI is a hot topic these days, but what does that word even mean? Originally it was about our quest to understand our own minds, but it has come to refer to one technology: deep learning. We often talk about AI as if it were just a human mind in a box, but the reality is quite different, in nuanced ways that AI companies play down. In this month's blog post, I explore how AI relates to human intelligence, what it reproduces, and what it doesn't.

thinkingwithnate.wordpress.com

[Image: A photo of a jacquard loom in the middle of weaving a complex pattern in red and gold. The loom itself is a massive structure of solid wood beams with many strings, pulleys, and a collection of "punch cards" barely visible in the top left corner.]
Thinking with Nate · How is AI like human intelligence?
#ai #dl #deeplearning

Self-Improving Reasoners.

Both expert human problem solvers and successful language models employ four key cognitive behaviors:

1. verification (systematic error-checking),

2. backtracking (abandoning failing approaches),

3. subgoal setting (decomposing problems into manageable steps), and

4. backward chaining (reasoning from desired outcomes to initial inputs).

Some language models naturally exhibit these reasoning behaviors and show substantial gains under RL, while others don't and quickly plateau.

The presence of reasoning behaviors, not the correctness of answers, is the critical factor: models primed with incorrect solutions that contain proper reasoning patterns achieve performance comparable to those trained on correct solutions.

It seems that the presence of cognitive behaviors enables self-improvement through RL.
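What does "exhibiting these behaviors" look like in a transcript? Here's a toy Python sketch that counts keyword cues for each of the four behaviors in a model's reasoning trace. The cue patterns are illustrative assumptions of mine; the paper's actual behavior analysis is far more robust than keyword matching.

```python
import re

# Toy keyword cues for each of the four cognitive behaviors.
# These patterns are illustrative assumptions, not the paper's classifier.
BEHAVIOR_CUES = {
    "verification": r"let me check|verif|double[- ]check",
    "backtracking": r"that doesn'?t work|try (another|a different)|wait,",
    "subgoal_setting": r"first,|step \d|break (this|it) down",
    "backward_chaining": r"working backwards?|starting from the (goal|answer)",
}

def count_behaviors(trace: str) -> dict:
    """Count how often each behavior cue appears in a reasoning trace."""
    return {
        name: len(re.findall(pattern, trace, flags=re.IGNORECASE))
        for name, pattern in BEHAVIOR_CUES.items()
    }

trace = (
    "First, break it down: I need 24 from 6, 4, and 2. "
    "Try 6*4=24... let me check: that only uses two of the numbers. "
    "Wait, that doesn't work. Working backwards from 24 instead..."
)
print(count_behaviors(trace))
# {'verification': 1, 'backtracking': 2, 'subgoal_setting': 2, 'backward_chaining': 1}
```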

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
arxiv.org/abs/2503.01307

#reinforcementlearning #RL
#AI #DL #LLM

arXiv.org · Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Abstract: Test-time inference has emerged as a powerful paradigm for enabling language models to "think" longer and more carefully about complex challenges, much like skilled human experts. While reinforcement learning (RL) can drive self-improvement in language models on verifiable tasks, some models exhibit substantial gains while others quickly plateau. For instance, we find that Qwen-2.5-3B far exceeds Llama-3.2-3B under identical RL training for the game of Countdown. This discrepancy raises a critical question: what intrinsic properties enable effective self-improvement? We introduce a framework to investigate this question by analyzing four key cognitive behaviors -- verification, backtracking, subgoal setting, and backward chaining -- that both expert human problem solvers and successful language models employ. Our study reveals that Qwen naturally exhibits these reasoning behaviors, whereas Llama initially lacks them. In systematic experimentation with controlled behavioral datasets, we find that priming Llama with examples containing these reasoning behaviors enables substantial improvements during RL, matching or exceeding Qwen's performance. Importantly, the presence of reasoning behaviors, rather than correctness of answers, proves to be the critical factor -- models primed with incorrect solutions containing proper reasoning patterns achieve comparable performance to those trained on correct solutions. Finally, leveraging continued pretraining with OpenWebMath data, filtered to amplify reasoning behaviors, enables the Llama model to match Qwen's self-improvement trajectory. Our findings establish a fundamental relationship between initial reasoning behaviors and the capacity for improvement, explaining why some language models effectively utilize additional computation while others plateau.

Read any Deep Learning papers that made you do a double take?

Share them here, and we can make a list to blow each other's minds and get closer to actually understanding what the hell is going on. Boosts appreciated! :boost_requested:

We've learned a ton about Deep Learning over the years, but in a fundamental way we still don't get it. There are tons of tricks we use without knowing why, and weird examples that work much better or much worse than you'd expect. We try to probe and visualize what's going on inside the black box, and what we find is often strange and hard to interpret.
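For a taste of what that probing looks like in practice, here's a minimal PyTorch sketch (the toy model is my own placeholder, not from any particular paper) that registers a forward hook to capture a hidden layer's activations for inspection:

```python
import torch
import torch.nn as nn

# A toy two-layer network standing in for whatever model you're probing.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 4),
)

# Capture intermediate activations with a forward hook -- one common way
# to peek inside the "black box" without modifying the model itself.
activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

model[1].register_forward_hook(save_activation("relu"))

x = torch.randn(1, 8)
model(x)

# Inspect what the hidden layer actually computed, e.g. how many units fired.
hidden = activations["relu"]
print(f"active units: {(hidden > 0).sum().item()} / {hidden.numel()}")
```

Most interpretability work layers much more on top of this (probing classifiers, feature visualization, activation patching), but a hook like this is where a lot of it starts.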

I'm in an excellent class right now exploring the "surprises" of deep learning, reading papers like this to build a better understanding. I've shared a few of them here, but now I'm looking for more to share back with the class.

Any suggestions?