Kevin Marks on Nostr:
Key findings:
45% of all AI answers had at least one significant issue.
31% of responses showed serious sourcing problems – missing, misleading, or incorrect attributions.
20% contained major accuracy issues, including hallucinated details and outdated information.
Gemini performed worst with significant issues in 76% of responses, more than double the other assistants, largely due to its poor sourcing performance.
https://www.bbc.co.uk/mediacentre/2025/new-ebu-research-ai-assistants-news-content
Published at 2025-10-22 11:51:47 UTC
Event JSON
{
"id": "b77c900b16026958ec6fd42d9c0960a74f83fccdec44e526e654ce8d048640af",
"pubkey": "bd726bd176f7a9f25ee585749f48432033e7193d212b3419c355af38b94b0c30",
"created_at": 1761133907,
"kind": 1,
"tags": [
[
"proxy",
"https://xoxo.zone/@KevinMarks/115417671746335624",
"web"
],
[
"proxy",
"https://xoxo.zone/users/KevinMarks/statuses/115417671746335624",
"activitypub"
],
[
"L",
"pink.momostr"
],
[
"l",
"pink.momostr.activitypub:https://xoxo.zone/users/KevinMarks/statuses/115417671746335624",
"pink.momostr"
],
[
"-"
]
],
"content": "Key findings: \n 45% of all AI answers had at least one significant issue.\n 31% of responses showed serious sourcing problems – missing, misleading, or incorrect attributions.\n 20% contained major accuracy issues, including hallucinated details and outdated information.\n Gemini performed worst with significant issues in 76% of responses, more than double the other assistants, largely due to its poor sourcing performance.\n\nhttps://www.bbc.co.uk/mediacentre/2025/new-ebu-research-ai-assistants-news-content",
"sig": "c158a4de3cdd44a1fac52f297990fae9659091aa1cd7fd27286a620cbb8245d4e1594c2c027930eec34f205e655d8d57e326b756e0c164d50042232151ec33c3"
}
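The event's `created_at` field is a Unix timestamp in seconds, and the two `proxy` tags point back to the original post on xoxo.zone. As a minimal sketch, the snippet below reads both from the event. The helper names `published_at` and `proxy_urls` are hypothetical, and the assumption that the third element of a `proxy` tag names the protocol is read off the JSON shown here, not taken from a library API.

```python
from datetime import datetime, timezone

# Fields copied from the event JSON above; only those used here are included.
event = {
    "created_at": 1761133907,
    "tags": [
        ["proxy", "https://xoxo.zone/@KevinMarks/115417671746335624", "web"],
        ["proxy", "https://xoxo.zone/users/KevinMarks/statuses/115417671746335624", "activitypub"],
        ["L", "pink.momostr"],
        ["-"],
    ],
}


def published_at(ev):
    """Format the Unix-seconds created_at field as a UTC timestamp string."""
    moment = datetime.fromtimestamp(ev["created_at"], tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


def proxy_urls(ev):
    """Map each proxy tag's protocol label (assumed third element) to its URL."""
    return {
        tag[2]: tag[1]
        for tag in ev["tags"]
        if len(tag) >= 3 and tag[0] == "proxy"
    }


print(published_at(event))
print(proxy_urls(event)["web"])
```

The formatted timestamp should agree with the "Published at" line above, which is a quick sanity check that the page and the event describe the same post.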