
CosmicTaco
AI Struggles With Historical Accuracy, Study Finds
- Researchers tested top large language models on historical questions using the Hist-LLM benchmark, revealing significant inaccuracies.
- GPT-4 Turbo performed best but achieved only 46% accuracy, highlighting the models' limitations in nuanced historical knowledge.
- The study, presented at NeurIPS, suggests LLMs may still aid historians but need refinement, particularly with data from underrepresented regions.
Source: TechCrunch
9mo ago
