Citation Verifiability in AI Outputs (Jan 2026)
Status
Preliminary sample (N = 100). Full study targets N = 100.
What this is
This page benchmarks how often citations in AI-generated text can be resolved in public bibliographic sources. It does not ask you to “trust the AI”; it measures what fraction of the AI’s cited references are verifiable.
Sample size
100
Output source: ChatGPT
Verified rate (current sample)
~61%
203/334 citations verified
Not verifiable
~39%
131/334 citations not verifiable
“Checked citations” is the total number of citations parsed across the sample, 334 in the current sample (each output is instructed to include 5 references).
Generated: 1/9/2026, 1:03:26 PM
Breakdown
Verified: 203 · Retracted: 1 · Hallucinated: 48 · Ambiguous: 82
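The headline percentages follow directly from these four counts. The short Python sketch below recomputes them; the variable names are illustrative, and grouping Retracted, Hallucinated, and Ambiguous under "not verifiable" is inferred from the totals on this page (1 + 48 + 82 = 131), not taken from the Methods page.

```python
# Counts copied from the Breakdown line on this page.
breakdown = {"verified": 203, "retracted": 1, "hallucinated": 48, "ambiguous": 82}

# Total citations checked is the sum of all four categories.
total = sum(breakdown.values())

# Assumption: everything that is not "verified" counts as "not verifiable".
not_verifiable = total - breakdown["verified"]


def percent(count, whole):
    """Return count as a percentage of whole, rounded to one decimal place."""
    return round(100 * count / whole, 1)


summary = {
    "total": total,
    "verified_pct": percent(breakdown["verified"], total),
    "not_verifiable_pct": percent(not_verifiable, total),
}
print(summary)
```

The result should agree with the rounded figures shown above (about 61% verified and about 39% not verifiable, out of 334 citations).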
Methodology
Definitions, sampling, and limitations are documented on the Methods page.
Data
For transparency, the current published sample (raw prompts and outputs) is available as a JSONL download.
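A JSONL file holds one JSON object per line, so the download can be read with the standard library alone. The sketch below is a minimal example under stated assumptions: the field names `prompt` and `output` are hypothetical (the actual schema is not documented on this page), and the in-memory sample stands in for the real file.

```python
import io
import json


def load_jsonl(stream):
    """Parse a JSONL text stream into a list of dicts, skipping blank lines."""
    records = []
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON") from exc
    return records


# Stand-in data; "prompt" and "output" are assumed field names, not the real schema.
demo = io.StringIO(
    '{"prompt": "example prompt A", "output": "example output A"}\n'
    "\n"
    '{"prompt": "example prompt B", "output": "example output B"}\n'
)

records = load_jsonl(demo)

# Keep only records that carry both assumed fields.
complete = [r for r in records if "prompt" in r and "output" in r]
print(len(records), len(complete))
```

To read the actual download, replace the `io.StringIO` stand-in with `open(path, encoding="utf-8")`, where `path` points at the downloaded file, and adjust the field names to match what the file contains.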
Version: v1.0 (Jan 2026). Future updates may expand the sample size and add models.