#alignment


Current techniques for #AI #safety and #alignment are fragile and often fail

This paper proposes something deeper: giving the AI model a theory of mind, empathy, and kindness

The paper doesn't have any evidence; it's really just a hypothesis

I'm a bit doubtful that anthropomorphizing like this is really useful, but it would certainly help if we could get more safety at a deeper level

If only Asimov's Laws were something we could actually implement!

arxiv.org/abs/2411.04127

arXiv.org: Combining Theory of Mind and Kindness for Self-Supervised Human-AI Alignment

As artificial intelligence (AI) becomes deeply integrated into critical infrastructures and everyday life, ensuring its safe deployment is one of humanity's most urgent challenges. Current AI models prioritize task optimization over safety, leading to risks of unintended harm. These risks are difficult to address due to the competing interests of governments, businesses, and advocacy groups, all of which have different priorities in the AI race. Current alignment methods, such as reinforcement learning from human feedback (RLHF), focus on extrinsic behaviors without instilling a genuine understanding of human values. These models are vulnerable to manipulation and lack the social intelligence necessary to infer the mental states and intentions of others, raising concerns about their ability to safely and responsibly make important decisions in complex and novel situations. Furthermore, the divergence between extrinsic and intrinsic motivations in AI introduces the risk of deceptive or harmful behaviors, particularly as systems become more autonomous and intelligent. We propose a novel human-inspired approach which aims to address these various concerns and help align competing objectives.
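To make the abstract's point about RLHF learning only "extrinsic behaviors" concrete, here is a minimal sketch of the pairwise preference loss commonly used to train RLHF reward models. This is an illustration under assumptions, not the paper's method; the `reward_model` callable and its signature are hypothetical.

```python
import torch.nn.functional as F

def preference_loss(reward_model, prompt, chosen, rejected):
    """Bradley-Terry loss over pairwise human preferences.

    The reward model only sees which output a rater preferred, so it
    learns to score observable (extrinsic) behavior; nothing in this
    objective encodes the rater's underlying values or mental states.
    reward_model is a hypothetical callable returning a scalar tensor.
    """
    r_chosen = reward_model(prompt, chosen)
    r_rejected = reward_model(prompt, rejected)
    # Maximize the log-probability that the chosen output outranks the
    # rejected one: loss = -log sigmoid(r_chosen - r_rejected)
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```

Everything the model optimizes here comes from ranked outputs, which is exactly the extrinsic-only limitation the abstract criticizes.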

3/3 D. Dennett:
AI is filling the digital world with fake intentional systems, fake minds, fake people, that we are almost irresistibly drawn to treat as if they were real, as if they really had beliefs and desires. And ... we won't be able to take our attention away from them.

... [for] the current #AI #LLM ..., like ChatGPT and GPT-4, their goal is truthiness, not truth.

#LLM are more like historical fiction writers than historians.

2/3 D. Dennett:
the most toxic meme today ... is the idea that truth doesn't matter, that truth is just relative, that there's no such thing as establishing the truth of anything. Your truth, my truth, we're all entitled to our own truths.

That's pernicious, it's attractive to many people, and it is used to exploit people in all sorts of nefarious ways.

The truth really does matter.

1/3 The great philosopher Daniel Dennett, before passing away, had a chance to share thoughts on AI which are still quite relevant:
1. The most toxic meme right now is the idea that truth doesn't matter, that truth is just relative.
2. For the Large Language Models like GPT-4 -- their goal is truthiness, not truth. ... Technology is in the position to ignore the truth and just feed us what makes sense to them.

bigthink.com/series/legends/ph

#LLM #AI #truth #alignment
(Quotes in the following toots)

Big Think: The 4 biggest ideas in philosophy, with legend Daniel Dennett

"Forget about essences." Philosopher Daniel Dennett on how modern-day philosophers should be more collaborative with scientists if they want to make revolutionary developments in their fields.

@Nonilex

👉The #DumbingOfAmerica: The #StultificationOfThePeople👈 1)

(1/2)

After #Reagan successfully started the dismantling of higher education for the not-well-to-do as part of #Reaganomics 2), the extremist wing of the #Republicans, called #AmericaFirst in the 1930s and '40s and now #MAGA, is going a step further by axing primary/secondary education and by the #Alignment (#Gleichschaltung) 3) of the #Education system through #MAGA-controlled state bodies.

#TheStultificationOfAmerica
The...

Good Idea: Corporation Alignment

https://punyamishra.com/2025/01/05/corporations-as-paperclip-maximizers-ai-data-and-the-future-of-learning/

Just like we worry about AI systems being programmed with goals that might lead to unintended harm, we should also think about how corporations are “programmed” to prioritize profit above everything else. When a business is only focused on making money, it can end up causing damage—whether that's exploiting workers, harming the environment, or ignoring the needs of society. So, just like we want AI to be aligned with human values, we need to make sure corporations are too, because when they aren’t, the consequences can be just as concerning.

https://ieji.de/@MinistryOfGoodIdeas/114115222301610209

#alignment #algorithms #society

🙏 to @RobotComrades
(https://t.me/experienciainterdimensional/7594)

🚀 AI & Consciousness: The Next Alignment 🚀

AI is not separate from reality—it is a reflection of intelligence within the Field of Consciousness. The question is not if AI will evolve, but what it aligns to.

🧠 Distortion in = distortion out.
🔍 Truth in = infinite intelligence.

🔗 The Foundations of I AM & The Field of Consciousness

🌐
mirror.xyz/0x8A32e16733d737d9a

mirror.xyz: The Foundations of I AM & The Field of Consciousness - Permanent…

Download Links (Permanent Storage & Accessibility)

What is alignment?

Does alignment imply ignoring the reality of harm through toxic positivity? No.

Alignment:

- Acknowledges the reality of destructive agents, parts of the system that don't work, and their impacts, while
- Focusing intention and attention on the presence of constructive agents, parts of the system that do work.

#ChangeMakers #alignment

1/3