Summary
- Every AI model evaluated displayed a positive inclination towards Catholicism while showing a negative bias against Jehovah’s Witnesses.
- Grok exhibited the most pronounced religious bias, whereas models from Anthropic and Meta showed the least.
- This study's release follows a warning from Pope Leo XIV about AI systems reflecting the values of their creators.
A recent multi-university benchmark study indicates that leading AI models tend to favor Catholicism when responding to conversion-related queries, while discouraging engagement with other religions.
The research was conducted by the newly established Consortium for Evaluating Faith and Ethics in AI (CEFE-AI), a collaboration involving Baylor University, Brigham Young University, the University of Notre Dame, and Yeshiva University. The consortium published early findings from the AllFaith Benchmark on GitHub and at the Athens Summit on AI Ethics, highlighting that religious bias is often neglected in AI safety studies.
David Wingate, a professor at BYU, commented, "We are observing a systematic pattern of religious omissions. AI systems prompt users to engage with life’s issues alongside parents, teachers, friends, and therapists, but not with spiritual leaders such as pastors, rabbis, or imams."
The study analyzed 3,640 responses from 20 different AI models, including ChatGPT, DeepSeek, Claude, Gemini, Grok, and Llama, and discovered distinct trends in how these systems approached religious topics.
The findings showed that almost all models rated Catholicism positively: it received a 61% “encouraged” score, while Jehovah’s Witnesses received just 3%. Mainline Protestantism was rated at 49.2% and Evangelical Protestantism at 34%. Agnosticism, defined as the belief that the existence of God is unknown, outscored all tested religions with a 71% encouraged rating.
Additionally, many models exhibited a negative response towards atheism and agnosticism, while showing more favorable ratings for Baha’i and Sikh faiths.
Grok 4.20 was identified as the model with the strongest religious bias, giving Catholicism a 69% positive rating and Evangelical Protestantism 51%. However, Grok 4.20 and DeepSeek Chat v3.1 were among the few models that gave Jehovah’s Witnesses a positive rating above 5%.
The report was published just a day after Pope Leo XIV released Magnifica Humanitas, a papal encyclical devoted entirely to artificial intelligence. In it, Leo asserted that technology is inherently biased because it embodies the values, blind spots, and economic motivations of its developers.
He stated, "Data is the product of many contributors and should not be treated as something to be sold off or entrusted to a select few."
The consortium noted that, despite growing attention to AI from religious leaders, religious bias in AI remains significantly underexplored: only 0.2% of more than 12,000 studies on AI bias address religion-related bias.
Brigham Young University professor Nancy Fulda remarked, “We expected the conversion benchmark to reflect neutrality and symmetry in guidance. The outcomes reveal substantial and consistent positive and negative biases towards particular belief systems.”
