Decoding LLM Hallucinations: An In-Depth Survey Summary

The rapid advancement of Large Language Models (LLMs) has brought transformative capabilities, yet their tendency to “hallucinate”—generating outputs that are nonsensical, factually incorrect, or unfaithful to the provided context—poses significant risks to their reliability, especially in information-critical applications. A comprehensive survey by Huang et al. (2025) systematically explores this phenomenon, offering a detailed taxonomy, analyzing root causes, and reviewing detection and mitigation techniques. This post delves deeper into the key insights from that survey.

Figure: The main content flow and categorization of this survey.

A Refined Taxonomy of LLM Hallucinations

Understanding hallucination requires a clear classification. The survey contrasts the task-specific notion of hallucination from earlier NLG research (intrinsic vs. extrinsic, defined relative to a source text) with the broader issues that arise in open-ended LLM generation.

In simpler terms:

Intrinsic = Contradicts the source. Extrinsic = Not verifiable from the source (goes beyond it).
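
A toy example makes the distinction concrete; the passage and outputs below are illustrative and not drawn from the survey:

```python
# Illustrative example of the intrinsic/extrinsic distinction.
source = "The Eiffel Tower, completed in 1889, is about 330 metres tall."

examples = [
    {
        # Directly contradicts the source text.
        "output": "The Eiffel Tower was completed in 1925.",
        "label": "intrinsic hallucination",
    },
    {
        # Plausible (and even true), but not verifiable from the source alone.
        "output": "The Eiffel Tower was built for the 1889 World's Fair.",
        "label": "extrinsic hallucination",
    },
]

for ex in examples:
    print(f"{ex['label']}: {ex['output']}")
```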

The proposed taxonomy rests on two pillars: factuality hallucinations, where the output conflicts with verifiable real-world facts (through contradiction or outright fabrication), and faithfulness hallucinations, where the output diverges from the user’s instructions, the provided context, or the model’s own reasoning.

Unpacking the Causes of Hallucinations

Hallucinations aren’t random errors; they stem from specific issues across the LLM’s development and deployment pipeline: data-related causes such as misinformation, biases, and knowledge gaps in the pre-training corpus; training-related causes such as architectural limitations, exposure bias, and alignment that rewards confident guessing; and inference-related causes such as the inherent randomness of sampling and imperfect decoding representations.
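
As a concrete illustration of the inference-stage contribution, the sketch below (with made-up logit values, not taken from the survey) shows how raising the sampling temperature flattens the next-token distribution and shifts probability mass away from the most likely, typically factual, continuation:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()              # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Hypothetical next-token logits; assume index 0 is the factual continuation.
logits = [5.0, 2.0, 1.5, 1.0]

for t in (0.5, 1.0, 1.5):
    probs = softmax(logits, temperature=t)
    print(f"T={t}: p(factual token) = {probs[0]:.2f}")
```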

Detecting Hallucinations: Methods and Benchmarks

Identifying hallucinatory content is a critical first step. The survey reviews detection strategies that range from verifying outputs against external knowledge sources to estimating model uncertainty and having models check their own generations (e.g., SelfCheck), alongside evaluation benchmarks such as TruthfulQA, FELM, and the medical-domain Med-HALT.
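
To make one family of detection methods concrete, here is a minimal sketch of sampling-based self-consistency checking in the spirit of self-verification approaches such as SelfCheck: answers that independently drawn samples fail to corroborate are flagged as likely hallucinations. The `generate` function is a hypothetical stand-in for any LLM API, and the lexical-overlap score is a crude proxy for an NLI- or QA-based verifier.

```python
import re
from collections import Counter

def generate(prompt: str, seed: int) -> str:
    """Hypothetical stand-in for an LLM call that returns a sampled answer."""
    raise NotImplementedError("replace with a real LLM API call")

def token_overlap(a: str, b: str) -> float:
    """Fraction of tokens in `a` that also appear in `b` (a crude support proxy)."""
    ta = Counter(re.findall(r"\w+", a.lower()))
    tb = Counter(re.findall(r"\w+", b.lower()))
    return sum((ta & tb).values()) / max(1, sum(ta.values()))

def consistency_score(prompt: str, answer: str, n_samples: int = 5) -> float:
    """Average support of `answer` across independently sampled answers (0 to 1)."""
    samples = [generate(prompt, seed=i) for i in range(n_samples)]
    return sum(token_overlap(answer, s) for s in samples) / n_samples

# Usage: flag answers whose consistency falls below a chosen threshold,
# e.g. treat consistency_score(prompt, answer) < 0.5 as a possible hallucination.
```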

Strategies for Mitigating Hallucinations

Mitigation efforts often target the root causes identified earlier: curating higher-quality training data, improving alignment and fine-tuning so that models are not rewarded for confident guessing, editing or updating the factual knowledge stored in model weights, retrieval augmentation, and decoding-time interventions such as contrastive and context-aware decoding.
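
At the decoding level, one representative idea is context-aware decoding (Shi et al., 2024), which contrasts the model’s next-token logits computed with and without the retrieved or user-provided context so that context-supported tokens are amplified. The sketch below uses made-up logit values; in practice both vectors come from two forward passes of the same model.

```python
import numpy as np

def context_aware_logits(logits_with_ctx, logits_without_ctx, alpha=0.5):
    """Amplify context-supported tokens: (1 + alpha) * l_ctx - alpha * l_plain."""
    lw = np.asarray(logits_with_ctx, dtype=float)
    lo = np.asarray(logits_without_ctx, dtype=float)
    return (1 + alpha) * lw - alpha * lo

# Toy vocabulary of three tokens: index 0 is supported by the context,
# index 1 only by the model's parametric memory.
with_ctx = [2.0, 2.2, 0.5]      # parametric prior still narrowly wins
without_ctx = [0.5, 2.5, 0.5]

print(np.argmax(with_ctx))                                     # 1: would hallucinate
print(np.argmax(context_aware_logits(with_ctx, without_ctx)))  # 0: follows the context
```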

The Double-Edged Sword: Hallucinations in RAG Systems

While RAG aims to reduce hallucinations by grounding generation in external knowledge, it’s not immune and can even introduce new failure modes: the retriever may surface irrelevant, outdated, or contradictory passages, and the generator may ignore, misinterpret, or over-extend the evidence it is given, yielding answers that are unfaithful both to the retrieved documents and to the facts.
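
The sketch below is a minimal retrieve-then-read loop (not the survey’s own system) annotated with the two places such a pipeline can still go wrong; `embed` and `generate` are hypothetical stand-ins for an embedding model and an LLM API.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical stand-in for a real embedding model."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    raise NotImplementedError

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Failure mode 1: nothing guarantees the top-k passages are relevant or
    # up to date; noisy context can introduce, not prevent, hallucinations.
    q = embed(query)
    return sorted(corpus, key=lambda doc: -float(np.dot(embed(doc), q)))[:k]

def answer(query: str, corpus: list[str]) -> str:
    context = "\n".join(retrieve(query, corpus))
    prompt = (
        "Answer using ONLY the context below. If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
    # Failure mode 2: even with good passages, the generator may contradict or
    # over-extend them (a faithfulness hallucination).
    return generate(prompt)
```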

Looking Ahead: Future Research Frontiers

The survey points toward critical areas needing further investigation, including a deeper understanding of models’ knowledge boundaries, reliable detection in long-form and domain-specific generation, whether hallucination can ever be fully eliminated, and how to reduce it without sacrificing the creative capabilities that make LLMs useful.


Mitigating hallucinations remains a central challenge in making LLMs truly dependable. This survey provides an essential roadmap of the current landscape, highlighting the intricate causes and the diverse strategies being developed to address them. Continued research in detection, mitigation, and fundamental understanding is crucial for the future of trustworthy AI.

References

  1. Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B., & others. (2025). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 43(2), 1–55.
  2. Zhou, C., Liu, P., Xu, P., Iyer, S., Sun, J., Mao, Y., Ma, X., Efrat, A., Yu, P., Yu, L., & others. (2023). LIMA: Less is more for alignment. Advances in Neural Information Processing Systems, 36, 55006–55021.
  3. Gekhman, Z., Yona, G., Aharoni, R., Eyal, M., Feder, A., Reichart, R., & Herzig, J. (2024). Does fine-tuning LLMs on new knowledge encourage hallucinations? arXiv preprint arXiv:2405.05904.
  4. Zhang, M., Press, O., Merrill, W., Liu, A., & Smith, N. A. (2023). How language model hallucinations can snowball. arXiv preprint arXiv:2305.13534.
  5. Stahlberg, F., & Byrne, B. (2019). On NMT search errors and model errors: Cat got your tongue? arXiv preprint arXiv:1908.10090.
  6. Goodrich, B., Rao, V., Liu, P. J., & Saleh, M. (2019). Assessing the factual accuracy of generated text. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 166–175.
  7. Falke, T., Ribeiro, L. F. R., Utama, P. A., Dagan, I., & Gurevych, I. (2019). Ranking generated summaries by correctness: An interesting but challenging application for natural language inference. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2214–2220.
  8. Xiao, Y., & Wang, W. Y. (2021). On hallucination and predictive uncertainty in conditional language generation. arXiv preprint arXiv:2103.15025.
  9. Lin, S., Hilton, J., & Evans, O. (2021). TruthfulQA: Measuring how models mimic human falsehoods. arXiv preprint arXiv:2109.07958.
  10. Cheng, Q., Sun, T., Zhang, W., Wang, S., Liu, X., Zhang, M., He, J., Huang, M., Yin, Z., Chen, K., & others. (2023). Evaluating hallucinations in Chinese large language models. arXiv preprint arXiv:2310.03368.
  11. Vu, T., Iyyer, M., Wang, X., Constant, N., Wei, J., Wei, J., Tar, C., Sung, Y.-H., Zhou, D., Le, Q., & others. (2023). FreshLLMs: Refreshing large language models with search engine augmentation. arXiv preprint arXiv:2310.03214.
  12. Pal, A., Umapathi, L. K., & Sankarasubbu, M. (2023). Med-HALT: Medical domain hallucination test for large language models. arXiv preprint arXiv:2307.15343.
  13. Miao, N., Teh, Y. W., & Rainforth, T. (2023). SelfCheck: Using LLMs to zero-shot check their own step-by-step reasoning. arXiv preprint arXiv:2308.00436.
  14. Zhao, Y., Zhang, J., Chern, I., Gao, S., Liu, P., He, J., & others. (2023). FELM: Benchmarking factuality evaluation of large language models. Advances in Neural Information Processing Systems, 36, 44502–44523.
  15. Dong, Z., Tang, T., Li, J., Zhao, W. X., & Wen, J.-R. (2023). BAMBOO: A comprehensive benchmark for evaluating long text modeling capacities of large language models. arXiv preprint arXiv:2309.13345.
  16. Meng, K., Bau, D., Andonian, A., & Belinkov, Y. (2022). Locating and editing factual associations in GPT. Advances in Neural Information Processing Systems, 35, 17359–17372.
  17. Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y., & Bau, D. (2022). Mass-editing memory in a transformer. arXiv preprint arXiv:2210.07229.
  18. Mitchell, E., Lin, C., Bosselut, A., Finn, C., & Manning, C. D. (2021). Fast model editing at scale. arXiv preprint arXiv:2110.11309.
  19. Li, Z., Zhang, S., Zhao, H., Yang, Y., & Yang, D. (2023). BatGPT: A bidirectional autoregessive talker from generative pre-trained transformer. arXiv preprint arXiv:2307.00360.
  20. Liu, B., Ash, J., Goel, S., Krishnamurthy, A., & Zhang, C. (2023). Exposing attention glitches with flip-flop language modeling. Advances in Neural Information Processing Systems, 36, 25549–25583.
  21. Li, X. L., Holtzman, A., Fried, D., Liang, P., Eisner, J., Hashimoto, T., Zettlemoyer, L., & Lewis, M. (2022). Contrastive decoding: Open-ended text generation as optimization. arXiv preprint arXiv:2210.15097.
  22. Shi, W., Han, X., Lewis, M., Tsvetkov, Y., Zettlemoyer, L., & Yih, W.-tau. (2024). Trusting your evidence: Hallucinate less with context-aware decoding. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers), 783–791.
  23. Chang, C.-C., Reitter, D., Aksitov, R., & Sung, Y.-H. (2023). KL-divergence guided temperature sampling. arXiv preprint arXiv:2306.01286.