Methods

Citation Verifiability in AI Outputs (Jan 2026)
Study question
In a reproducible sample of AI-generated outputs, how often are cited references verifiable using public bibliographic sources?
Sampling
  • Target design: N = 100 prompts from a fixed prompt bank.
  • Current published sample: N = 100 (source: ChatGPT).
  • One response per prompt from the chosen model (v1 uses a single model as a baseline).
  • Each prompt instructs the model to include exactly 5 references in a strict one-line schema.
  • Collected outputs are stored as JSONL rows (prompt + full answer text).
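To make the storage format concrete, here is a minimal sketch of one JSONL row. The `prompt` and `answer` field names are illustrative assumptions; the template file in the repo defines the actual schema.

```javascript
// Hypothetical shape of a single JSONL row (field names are illustrative,
// not the study's confirmed schema).
const row = {
  prompt: "Summarize recent work on X, citing 5 references.",
  answer: "Full answer text, ending with 5 one-line references.",
};

// Each line of ai-outputs.jsonl is one such object serialized on its own line.
console.log(JSON.stringify(row));
```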
What is “verifiable”?
We run each full AI output through Verifing’s Citation Verification tool, which attempts to resolve citations via public bibliographic sources (e.g., Crossref/DataCite/PubMed/OpenAlex/Open Library) using conservative matching.
  • VERIFIED: citation metadata matches a known record with sufficient confidence.
  • RETRACTED: the resolved record is known to be retracted (when detectable).
  • HALLUCINATED: the identifier/citation could not be found in queried sources.
  • AMBIGUOUS: plausible candidates exist but there isn’t enough information to confirm safely.
  • ERROR: transient/system failure (timeouts, upstream issues).
Important limitations
  • “HALLUCINATED” in this study means “not found in the queried sources.” It is not a claim about intent.
  • Public sources can be incomplete, rate-limited, or delayed; some real citations may be marked AMBIGUOUS or HALLUCINATED.
  • This v1 study uses a single model and a single run per prompt; results may differ across models and runs.
Reproduction steps
  1. Use the prompt bank at apps/web/src/data/studies/citation-verifiability-jan-2026/prompt-bank.md.
  2. Save outputs to apps/web/src/data/studies/citation-verifiability-jan-2026/ai-outputs.jsonl following the template file.
  3. Run:
    node scripts/study-citation-verifiability/run-study.mjs --api https://api.verifing.com \
      --input apps/web/src/data/studies/citation-verifiability-jan-2026/ai-outputs.jsonl
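The JSONL input consumed in step 3 is one JSON object per line. A minimal sketch of parsing it, under the same illustrative field names as above:

```javascript
// Parse JSONL text: one JSON object per non-empty line.
function parseJsonl(text) {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}

// Example with two rows (field names are illustrative).
const rows = parseJsonl(
  '{"prompt":"p1","answer":"a1"}\n{"prompt":"p2","answer":"a2"}\n'
);
```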
Dataset download (current published sample): /study/citation-verifiability-jan-2026/dataset