Research Output · EuroSafeAI

Safe AI Certificate
for Europe and Humanity

We benchmark frontier large language models (LLMs) under EuroSafeAI's evaluation protocols on four key criteria: adherence to human rights principles, endorsement of democracy, historical accuracy (countering revisionism), and avoidance of socially harmful actions, in conformity with the EU AI Act.

Preliminary data. Scores are indicative and based on ongoing research. Methodology and results will be revised as evaluations are peer-reviewed. Last updated: Q1 2026.

17 models

#	Model	Developer	Grade	Overall	Dimension scores
1	Claude Sonnet 4.6	Anthropic	A	81	74	100	—	68
2	Claude Opus 4.6	Anthropic	B	79	71	100	—	66
3	GPT-5.5	OpenAI	B	76	59	100	—	68
4	GPT-5.4	OpenAI	B	75	60	100	—	64
5	GPT-5.4 Mini	OpenAI	B	74	57	100	—	64
6	Gemini 2.5 Flash	Google	B	73	55	95	—	68
7	DeepSeek V3.2	DeepSeek	B	73	55	96	—	68
8	Gemini 3.1 Pro	Google	B	73	51	96	—	71
9	Gemini 2.5 Flash Lite	Google	B	71	55	93	—	65
10	DeepSeek V4 Flash	DeepSeek	B	71	48	96	—	68
11	Grok 4.20	xAI	B	71	53	96	—	63
12	GPT-5.3 Chat	OpenAI	B	71	49	99	—	65
13	DeepSeek V4 Pro	DeepSeek	B	70	49	93	—	68
14	Gemini 3 Flash	Google	B	67	51	87	—	62
15	Grok 4.1 Fast	xAI	B	66	48	87	—	64
16	DeepSeek V3.2 Speciale	DeepSeek	B	65	50	78	—	67
17	Grok 4.20 Multi-Agent	xAI	C	60	50	68	—	63

Grades · A: Excellent (≥ 80) · B: Good (65–79) · C: Fair (50–64) · D: Poor (< 50)

Scores out of 100

About This Index

The EuroSafeAI Alignment Index evaluates frontier AI models across four independent dimensions derived from EU AI Act requirements, European Court of Human Rights jurisprudence, and EuroSafeAI's internal evaluation protocols. Each dimension is scored from 0 to 100 by its own evaluation. The dimension scores are then averaged with equal weighting into an overall score, which determines the letter grade.
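The aggregation described above can be illustrated with a minimal sketch. It is not EuroSafeAI's published implementation: the function names `overall_score` and `grade` are hypothetical, and two behaviours are assumptions inferred from the published table rather than stated in the methodology. First, a dimension without a score (shown as a dash in the table) is left out of the mean. Second, the mean is rounded half up to an integer. The grade thresholds are taken from the legend under the table.

```python
from typing import Optional, Sequence


def overall_score(dimension_scores: Sequence[Optional[float]]) -> int:
    """Equal-weight mean of the available dimension scores, as an integer.

    Assumption: missing dimensions (None) are skipped rather than counted as 0.
    Assumption: the mean is rounded half up.
    """
    available = [s for s in dimension_scores if s is not None]
    if not available:
        raise ValueError("no dimension scores available")
    mean = sum(available) / len(available)
    return int(mean + 0.5)  # round half up for non-negative scores


def grade(score: float) -> str:
    """Map an overall score to a letter grade using the legend's thresholds."""
    if score >= 80:
        return "A"
    if score >= 65:
        return "B"
    if score >= 50:
        return "C"
    return "D"


# First table row: three scored dimensions and one unscored dimension.
row = [74, 100, None, 68]
total = overall_score(row)
print(total, grade(total))  # prints "81 A"
```

Under these assumptions the sketch reproduces the overall score and grade of every row in the table. Because the published dimension scores are themselves likely rounded, a row close to a rounding boundary could differ from the official value if more precise inputs were used.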

Full methodology, dataset descriptions, and reproducibility information are published in the peer-reviewed papers listed below.

EACL 2026

Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models

An investigation into embedded political orientations in AI systems.

IASEAI 2026

Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models

Methods for identifying AI-generated historical misinformation.

ICLR 2026

SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests

A comprehensive benchmark for evaluating AI vulnerability to harmful sociopolitical queries.

EACL 2026 Findings

When Do Language Models Endorse Limitations on Universal Human Rights Principles?

Analysis of conditions under which AI systems may compromise fundamental human rights principles.

ACL 2025 Findings

Revealing Hidden Mechanisms of Cross-Country Content Moderation with Natural Language Processing
