fosstodon.org is one of the many independent Mastodon servers you can use to participate in the fediverse.
Fosstodon is an invite-only Mastodon instance that is open to those who are interested in technology, particularly free & open source software. If you wish to join, contact us for an invite.

Server stats: 8.7K active users

#LocalLLMs

0 posts · 0 participants · 0 posts today
Miguel Afonso Caetano

"Simon Willison has a plan for the end of the world. It’s a USB stick, onto which he has loaded a couple of his favorite open-weight LLMs—models that have been shared publicly by their creators and that can, in principle, be downloaded and run with local hardware. If human civilization should ever collapse, Willison plans to use all the knowledge encoded in their billions of parameters for help. “It’s like having a weird, condensed, faulty version of Wikipedia, so I can help reboot society with the help of my little USB stick,” he says.

But you don’t need to be planning for the end of the world to want to run an LLM on your own device. Willison, who writes a popular blog about local LLMs and software development, has plenty of compatriots: r/LocalLLaMA, a subreddit devoted to running LLMs on your own hardware, has half a million members. For people who are concerned about privacy, want to break free from the control of the big LLM companies, or just enjoy tinkering, local models offer a compelling alternative to ChatGPT and its web-based peers.

The local LLM world used to have a high barrier to entry: in the early days, it was impossible to run anything useful without investing in pricey GPUs. But researchers have had so much success in shrinking down and speeding up models that anyone with a laptop, or even a smartphone, can now get in on the action. “A couple of years ago, I’d have said personal computers are not powerful enough to run the good models. You need a $50,000 server rack to run them,” Willison says. “And I kept on being proved wrong time and time again.”"

https://www.technologyreview.com/2025/07/17/1120391/how-to-run-an-llm-on-your-laptop/

#AI #GenerativeAI #LLMs #Chatbots #LocalLLMs #Privacy #DataProtection #Decentralization
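As a rough idea of what "downloaded and run with local hardware" looks like in practice, here is a minimal sketch using the llama-cpp-python bindings; the model file name is only a placeholder for whatever quantized GGUF model you have already downloaded.

```python
# Minimal local-inference sketch using llama-cpp-python (pip install llama-cpp-python).
# The model path is a placeholder; any downloaded quantized GGUF model works.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/example-3b-instruct-q4_k_m.gguf",  # placeholder file name
    n_ctx=2048,  # context window size
)

# Ask a question entirely offline; nothing leaves the machine.
result = llm(
    "Q: How can you purify water without electricity?\nA:",
    max_tokens=200,
    stop=["Q:"],
)
print(result["choices"][0]["text"].strip())
```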
Hacker News

Local LLMs versus Offline Wikipedia

https://evanhahn.com/local-llms-versus-offline-wikipedia/

#HackerNews #LocalLLMs #OfflineWikipedia #TechDebate #AIResearch #KnowledgeAccess
mast4sc

... Many thanks for the talk! (Unfortunately I only got around to watching it on video afterwards...) It motivated me to finally install ollama; so far I had only been using #gpt4all as my #LocalLLMs, now I'm curious!

@orbiterlab @clt_news
#clt2025
Nick Kuhn

Are you interested in exploring #OpenWebUI and #LocalLLMs on Cloud Foundry? Then be sure to check out this week's episode of Cloud Foundry Weekly, in which Nicky Pike and I deep-dive into deploying the popular #GenAI frontend on your Cloud Foundry foundations.

Watch the replay:
https://www.youtube.com/watch?v=0DZb70-HwrM&t

Listen to the podcast:
https://cloudfoundryweekly.com/episodes/installing-open-webui-and-exploring-local-llms-on-cf-cloud-foundry-weekly-episode-46
Masoud Masoumi

Any app that can be easily installed on Android phones to run LLMs locally? No data tracking.

I'm currently using ChatterUI, which runs Llama models decently well and is easy to install and run.
#llm #LocalLLMs
PKPs Powerfromspace1

@carnage4life
Goldman Sachs issues a report about generative #AI: currently too expensive to justify ROI, no killer app, and looming power & chip shortages. That said, they see money to be made if a killer app is found, and also simply because bubbles take a long time to burst.

https://x.com/carnage4life/status/1809590088768405922?s=46

(ed: We are officially in the trough of disillusionment. This is when the real work begins. #OpenSource #LocalLLMs 😈🇨🇳🦾)
Peter Binkley

Picking up my Ollama learning: on my home workstation Ollama uses the GPU and Jan doesn't, but I like Jan's web interface. So why not let Jan use Ollama as its engine, as it can with OpenAI? Instructions here: https://jan.ai/docs/local-inference/ollama . It works! And using Ollama is much faster than Jan's Llama 2 Chat 7B Q4 model, presumably because of the GPU use: 12.3 tokens/second vs 4.4 for Jan without Ollama (but I need to figure out whether the two are really comparable models). #LLMs #LocalLLMs
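The same trick works from code: anything that speaks the OpenAI API can be pointed at Ollama's OpenAI-compatible endpoint, which Ollama serves on localhost:11434 by default. A minimal sketch (not from the post itself; the model name is whatever you have already pulled locally):

```python
# Sketch: talk to a local Ollama server through its OpenAI-compatible API.
# Assumes `ollama serve` is running and a model has been pulled (e.g. `ollama pull llama2`).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # required by the client, ignored by Ollama
)

response = client.chat.completions.create(
    model="llama2",  # placeholder: use any locally pulled model name
    messages=[{"role": "user", "content": "Summarize what a GPU does in one sentence."}],
)
print(response.choices[0].message.content)
```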
Po-chiang "Bob" Chao

I recently tried out PrivateGPT and found it quite easy to install and use. It reminded me of some advanced tools I played with years ago for searching files on my own computer, similar to what Google had released at the time. Nowadays operating systems also have something built in for this purpose. I wouldn't be surprised if, in the coming months, they integrate local LLMs and decompose users' files into chunks to retrieve answers. #ai #RAG #LocalLLMs
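The "decompose users' files into chunks to retrieve answers" idea is the retrieval half of retrieval-augmented generation. A toy sketch of that half, with no external dependencies; the word-overlap scoring is deliberately naive and only meant to show the shape of the pipeline:

```python
# Toy retrieval sketch: split documents into chunks, score chunks by word overlap
# with the question, and return the best ones to paste into a local LLM's prompt.
def chunk(text: str, size: int = 400) -> list[str]:
    """Split text into roughly `size`-character chunks on whitespace."""
    words, chunks, current = text.split(), [], []
    for w in words:
        current.append(w)
        if sum(len(x) + 1 for x in current) >= size:
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

def retrieve(question: str, documents: list[str], top_k: int = 3) -> list[str]:
    """Rank all chunks by naive word overlap with the question."""
    q_words = set(question.lower().split())
    all_chunks = [c for doc in documents for c in chunk(doc)]
    ranked = sorted(all_chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return ranked[:top_k]

# The retrieved chunks would then be prepended to the prompt sent to a local model.
context = retrieve("Where did I save the tax documents?",
                   ["...contents of file one...", "...contents of file two..."])
```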
spaduf

Imma be honest, I'm not willing to give up being able to talk to my books because some douchey billionaire exploited labor. #LocalLLMs
David Ruffner

Just tried out GPT4All. It's an open-source application for running LLMs locally. You can pick from a variety of open-source models. From initial testing it works pretty well. Not as good out of the box as GPT-4 or Gemini or Claude, but it has a lot of potential. I love that it runs locally on my machine and that I can look into how it works. It lets you tweak things like temperature and has a way to integrate a document store.

https://github.com/nomic-ai/gpt4all
#LocalLLMs #gpt4all #foss
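For anyone who wants the same kind of tweakability from a script rather than the desktop app, GPT4All also ships Python bindings. A minimal sketch, assuming the package is installed; the model name below is illustrative and GPT4All downloads it on first use:

```python
# Sketch using the GPT4All Python bindings (pip install gpt4all).
# The model name is illustrative; any model from the GPT4All catalog works.
from gpt4all import GPT4All

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

with model.chat_session():
    reply = model.generate(
        "Explain what 'temperature' does in an LLM sampler.",
        max_tokens=200,
        temp=0.7,  # higher = more varied output, lower = more deterministic
    )
    print(reply)
```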
happyborg

@ajsadauskas
I agree we need better and remember the early days well. Before indexes we passed URLs, in fact just IP addresses of servers we'd visit to see what was there, and that was often a directory of documents, papers etc. It filled us with awe, but let's not dial back that far!

Another improvement will be #LocalLLMs, both for privacy and personalised settings. Much of the garbage now is in service of keeping us searching rather than finding what we want.
@degoogle
happyborg

What happens when we have local AI and we ask it to play blah?

Blah isn't on your machine, so it searches the web, downloads a torrent and plays that, breaching copyright.

Who's responsible for ensuring the AI doesn't breach copyright, and how do they ensure it doesn't?

🤔
#LLMs #LocalAI #LocalLLMs
Peter Krupa

I had an unsettling experience a few days back where I was booping along, writing some code, asking ChatGPT 4.0 some questions, when I got the following message: "You've reached the current usage cap for GPT-4, please try again after 4:15 pm." I clicked on the "Learn More" link and basically got a message saying "we actually can't afford to give you unlimited access to ChatGPT 4.0 at the price you are paying for your membership ($20/mo), would you like to pay more???"

It dawned on me that OpenAI is trying to speedrun enshittification. The classic enshittification model is as follows: 1) hook users on your product to the point that it is a utility they cannot live without, 2) slowly choke off features and raise prices because they are captured, 3) profit. I say it's a speedrun because OpenAI hasn't quite accomplished (1) and (2). I am *not* hooked on its product, and it is *not* slowly choking off features and raising prices; rather, it appears set to do that right away.

While I like having a coding assistant, I do not want to depend on an outside service charging a subscription to provide me with one, so I immediately cancelled my subscription. Bye, bitch.

But then I got to thinking: people are running LLMs locally now. Why not try that? So I procured an Nvidia RTX 3060 with 12GB of VRAM (from what I understand, the entry-level hardware you need to run AI-type stuff) and plopped it into my Ubuntu machine running on a Ryzen 5 5600 and 48GB of RAM. I figured from poking around on Reddit that running an LLM locally was doable but eccentric and would take some fiddling.

Reader, it did not.

I installed Ollama (https://ollama.ai/) and had codellama running locally within minutes.

It was honestly a little shocking. It was *very* fast, and with Ollama, I was able to try out a number of different models. There are a few clear downsides. First, I don't think these "quantized" (I think??) local models are as good as ChatGPT 3.5, which makes sense because they are quite a bit smaller and running on weaker hardware. There have been a couple of moments where the model just obviously misunderstands my query.

But codellama gave me a pretty useful critique of a section of my code, which is really what I need from a coding assistant at this point. I later asked it to add some basic error handling for my "with" statement and it did a good job. I will also be doing more research on context managers to see how I can add one.

Another downside is that the console is not a great UI, so I'm hoping I can find a solution for that. The open-source, locally-run LLM scene is *heaving* with activity right now, and I've seen a number of people indicate they are working on a GUI for Ollama, so I'm sure we'll have one soon.

Anyway, this experience has taught me that an important thing to watch now is that *anyone* can run an LLM locally on a newer Mac or by spending a few hundred bucks on a GPU. While OpenAI and Google brawl over the future of AI, in the present, you can use Llama 2.0 or Mistral *now*, tuned in any number of ways, to do basically anything you want. Coding assistant? Short story generator? Fake therapist? AI girlfriend? Malware? Revenge porn??? The activity around open-source LLMs is chaotic and fascinating and I think it will be the main AI story of 2024. As more and more normies get access to this technology with guardrails removed, things are going to get spicy.

https://www.peterkrupa.lol/2024/01/28/moving-on-from-chatgpt/

#ChatGPT #CodeLlama #codingAssistant #Llama20 #LLMs #LocalLLMs #OpenAI #Python
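The post above doesn't include its code, but the workflow it describes (asking a locally served codellama to review a snippet and suggest error handling) can be sketched against Ollama's plain REST API. The endpoint and fields below match Ollama's documented /api/generate route; the prompt and the snippet being reviewed are placeholders:

```python
# Sketch: ask a local codellama (served by Ollama on its default port) to review some code.
import requests

SNIPPET = '''
with open("data.csv") as f:
    rows = f.readlines()
'''

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "codellama",
        "prompt": "Review this Python snippet and suggest basic error handling:\n" + SNIPPET,
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=300,
)
print(response.json()["response"])
```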