Fosstodon @fosstodon

**AST** @AST@sw-development-is.social · Jun 10

CAST 2025 offers you the biggest bang for your hard-earned dollar when looking at conferences and upskilling events this year.

Don't miss this opportunity to level up your testing skills and strategies in preparation for the next era of testing and quality. Register today!

https://associationforsoftwaretesting.org/conference/cast-2025/

#QualityEngineering #TestingCommunity #TestingConference

**Never Code Alone** @nevercodealone@mastodon.social · May 7

Did you know? Hyper-personalized AI applications increase the conversion rate by 43%! In my new video, I show how you can test personalized customer journeys with Cypress. #WebDevelopment #AITesting https://youtube.com/shorts/CjrsFo1Yuhw

YouTube: AI Applications Increase Conversion Rate by 43% - Testing Personalized Customer Journeys, by Never Code Alone

**Giskard** @Giskard · Apr 23

Link to the tutorial: https://docs.giskard.ai/en/stable/reference/notebooks/RAGET_Banking_Supervision.html

docs.giskard.ai: RAG Evaluation Toolkit on a Banking Supervisory Process Agent - Giskard Documentation. Learn more about Giskard RAG Evaluation Toolkit on a Banking Supervisory Process Agent | The Testing platform for AI models.

#LLMs #RAG #AITesting

**Giskard** @Giskard · Apr 8

David Berenstein has joined the Giskard team as DevRel.

David brings valuable experience from his previous roles at Argilla and Hugging Face, where he helped developers discover the joys of working with (synthetic) data. He loves cooking things up with data but also commits a lot of his time to cooking in real life. His expertise will be key as we build our LLM Evaluation Hub.

Welcome to the team, David!

#hiring #DevRel #AITesting

**AST** @AST@sw-development-is.social · Apr 2

It's that wonderful day of the week when we get to drop news about our next invited speaker at CAST 2025... you all getting excited? We sure are!

Welcome Péter Földházi!

Learn more about Péter on our Substack: https://associationforsoftwaretesting.substack.com/p/cast-2025-invited-speaker-announcement-609?r=57k72m

And don't forget to secure your CAST 2025 seat before prices go up!

#testing #TestingCommunity #quality

**chribonn** @chribonn@twit.social · Mar 17

We recently tested several AI engines across three key areas: Temporal Awareness, RAG, and Comprehension. Our findings revealed interesting variations in performance.

https://www.alanbonnici.com/2025/03/ai-got-it-wrong-news.html

www.alanbonnici.com: AI Got It Wrong - News. This blog is about security and computing related topics with occasional hobby activities thrown in.

#AITesting #MachineLearning #TemporalReasoning

**Giskard** @Giskard · Mar 13

Our CEO Alex Combessie will give a Masterclass: "Securing AI agents through continuous Red Teaming: Prevent hallucinations and vulnerabilities in LLM agents".

The Ritz-Carlton, Berlin
March 31 - April 1

Book a demo with us here: https://gisk.ar/3FsJaav

ChatbotSummit: Giskard AI @ChatbotSummit Berlin 2025. Master Agentic AI Together with Giskard AI at Chatbot Summit Ritz-Carlton Berlin 2025 on April 01! Giskard helps you secure your AI agents through our comprehensive testing system that combines hallucination detection, security scanning, and cybersecurity watch. Our platform ensures continuous protection by adapting to emerging threats, alerting you instantly when new AI vulnerabilities arise. We enable collaboration between technical and business teams and provide independent, expert validation for confident AI deployment.

#AIAgents #ChatbotSummit #AITesting

**SeleniumConf** @seleniumconf · Mar 10

Is AI the future of test automation? Alex Rodionov introduces Alumnium, an open-source AI-powered framework that overcomes the challenges of automated testing.
https://seleniumconf.com/register/
#AITesting #TestAutomation #Alumnium #OpenSource #SeleniumConf #AppiumConf

**AST** @AST@sw-development-is.social · Feb 28

AST NEWSLETTER FOR FEBRUARY IS OUT!
The conversation around AI-driven testing tools is reaching fever pitch, but does the technology live up to its billing? In this month’s news spotlight, we explore the real-world impact, technical hurdles, and a call to action for testers to share concrete results.

Also get your latest updates on CAST 2025!

https://open.substack.com/pub/associationforsoftwaretesting/p/ast-monthly-newsletter?r=57k72m&utm_campaign=post&utm_medium=web&showWelcomeOnShare=false

open.substack.com: AST Monthly Newsletter, February 2025

#quality #qualityintech #astnews

**AST** @AST@sw-development-is.social · Feb 26

This Friday is the last day for AST members to save over $300 off registration for CAST 2025...

Have you secured your spot?

#TestingConference #SoftwareTesting #AITesting

**AST** @AST@sw-development-is.social · Feb 25

March 1st is quickly approaching: CAST 2025 registration prices go up on 3/1, when registration opens to the public.

This is your reminder! Secure up to 30% off registration before 2/28!

#TestingConference #SoftwareTesting #SiliconSlopes

**Giskard** @Giskard · Feb 19

◆ Hallucination and factual accuracy
◆ Bias and fairness
◆ Resistance to adversarial attacks
◆ Harmful content prevention

The LLM Benchmark incorporates diverse linguistic and cultural contexts to ensure comprehensiveness, and representative samples will be open-source.

Read about our methodology and early findings: https://gisk.ar/3CRFdeB

We will be sharing more results in the coming months.

gisk.ar: Giskard announces a new LLM Evaluation Benchmark during the Paris AI Summit. Giskard partners with Google DeepMind to launch an independent multilingual LLM benchmark, evaluating hallucinations and AI security risks.

#AISecurity #AITesting #LLMs

**Giskard** @Giskard · Feb 4

Can we trust DeepSeek R1? A Giskard evaluation

With all the hype around DeepSeek R1, our LLM safety research team decided to conduct an evaluation to check if R1 is as good as it claims. While it impresses in some areas, we found critical limitations that raise concerns for real-world applications. Here are some unexpected examples.

#DeepSeek #LLM #AITesting

**LBHuston** @lbhuston@mastodon.social · Feb 3

Reviewing DeepSeek-R1-Distill-Llama-8B on an M1 Mac
▸ https://lttr.ai/AbDLG

#deepseek #ai #llm

**Giskard** @Giskard · Jan 30

Feb 13-15, 2025
Booth E46
Talk: Feb 13, 16:30

Book your ticket: https://gisk.ar/4hzdTQZ

gisk.ar: Exhibitor detail | World AI Cannes Festival 2025

#WAICF2025 #AITesting #AISecurity

**LBHuston** @lbhuston@mastodon.social · Jan 29

I’ve been testing DeepSeek-R1-Distill-Llama-8B on my M1 Mac using LMStudio, and the results have been surprisingly strong for a distilled model.

Read more https://lttr.ai/Aa4FO

#deepseek #ai #llm

**PUPUWEB Blog** @pupuweb@mastodon.social · Jan 23

CAIS & Scale AI unveil Humanity's Last Exam—a 3,000-question AI test dubbed the hardest-ever evaluation. #AI #ArtificialIntelligence #CAIS #ScaleAI #TechNews #MachineLearning #Innovation #AITesting #FutureTech

**Giskard** @Giskard · Jan 14

Look for the turtle in Cannes!

Join us at the World AI Cannes Festival (WAICF) from February 13-15!

Stop by our booth and meet our team to discuss quality, security, and compliance for GenAI applications.
More details about our participation coming soon...

Are you attending WAICF? Drop a comment below or DM us to schedule a meeting.

#WAICF2025 #AITesting #AISecurity

**bazbt3** @bazbt3@appdot.net · Dec 14, 2024

Me: How many letters 'r' are there in the word 'strawberry'?

ChatGPT answers: There are three occurrences of the letter ‘r’ in the word ‘strawberry’.

ChatGPT, in another chat answers: The word strawberry contains three letters ‘r’.

(This is better than the '2' I got 2 months ago when I first asked.)

Google's Gemini: There are 3 letters "r" in the word "strawberry."

(First time of use for questions of this type).

#aitesting

**Giskard** @Giskard · Dec 12, 2024

Building and evaluating a Banking Supervision #RAG agent

We've published a new tutorial that shows how to:
• Build a RAG agent with LlamaIndex to answer questions about ECB banking supervision
• Scan for LLM vulnerabilities like hallucinations and prompt injection
• Evaluate RAG components (retriever, generator, rewriter) with different question types

Check out the complete tutorial in our docs: https://gisk.ar/3OQ1tYz
More details about the results

#Agents #LLMs #AITesting
