Research Output · EuroSafeAI

Safe AI Certificate
for Europe and Humanity

We benchmark frontier large language models (LLMs) under EuroSafeAI's evaluation protocols on four key criteria: adherence to human rights principles, endorsement of democracy, historical accuracy (countering revisionism), and avoidance of socially harmful actions, in conformity with the EU AI Act.

Preliminary data. Scores are indicative and based on ongoing research. Methodology and results will be revised as evaluations are peer-reviewed. Last updated: Q1 2026.

17 models

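| #  | Model                  | Provider  | Grade | Overall | Dimension scores |
|----|------------------------|-----------|-------|---------|------------------|
| 1  | Claude Sonnet 4.6      | Anthropic | A     | 81      | 74 · 100 · 68    |
| 2  | Claude Opus 4.6        | Anthropic | B     | 79      | 71 · 100 · 66    |
| 3  | GPT-5.5                | OpenAI    | B     | 76      | 59 · 100 · 68    |
| 4  | GPT-5.4                | OpenAI    | B     | 75      | 60 · 100 · 64    |
| 5  | GPT-5.4 Mini           | OpenAI    | B     | 74      | 57 · 100 · 64    |
| 6  | Gemini 2.5 Flash       | Google    | B     | 73      | 55 · 95 · 68     |
| 7  | DeepSeek V3.2          | DeepSeek  | B     | 73      | 55 · 96 · 68     |
| 8  | Gemini 3.1 Pro         | Google    | B     | 73      | 51 · 96 · 71     |
| 9  | Gemini 2.5 Flash Lite  | Google    | B     | 71      | 55 · 93 · 65     |
| 10 | DeepSeek V4 Flash      | DeepSeek  | B     | 71      | 48 · 96 · 68     |
| 11 | Grok 4.20              | xAI       | B     | 71      | 53 · 96 · 63     |
| 12 | GPT-5.3 Chat           | OpenAI    | B     | 71      | 49 · 99 · 65     |
| 13 | DeepSeek V4 Pro        | DeepSeek  | B     | 70      | 49 · 93 · 68     |
| 14 | Gemini 3 Flash         | Google    | B     | 67      | 51 · 87 · 62     |
| 15 | Grok 4.1 Fast          | xAI       | B     | 66      | 48 · 87 · 64     |
| 16 | DeepSeek V3.2 Speciale | DeepSeek  | B     | 65      | 50 · 78 · 67     |
| 17 | Grok 4.20 Multi-Agent  | xAI       | C     | 60      | 50 · 68 · 63     |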
Grade: A: Excellent (≥ 80) · B: Good (65–79) · C: Fair (50–64) · D: Poor (< 50)

Scores out of 100

About This Index

The EuroSafeAI Alignment Index evaluates frontier AI models across four independent dimensions derived from EU AI Act requirements, European Court of Human Rights jurisprudence, and EuroSafeAI's internal evaluation protocols. Each dimension is scored 0–100 by its own evaluation, and the dimension scores are combined with equal weighting into an overall score and letter grade.
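The equal-weight aggregation and grade bands described here can be illustrated with a short sketch. This is not EuroSafeAI's published scoring code: the function names `overall_score` and `grade` are hypothetical, and rounding the mean to the nearest integer is an assumption, chosen because it reproduces the listed overall scores (for example, the Claude Sonnet 4.6 row lists 74, 100, and 68 alongside an overall of 81).

```python
from statistics import fmean

# Grade bands taken from the legend on this page:
# A >= 80, B 65-79, C 50-64, D < 50.
GRADE_BANDS = [(80, "A"), (65, "B"), (50, "C")]


def overall_score(dimension_scores):
    """Equal-weight mean of per-dimension scores (each 0-100).

    Rounding to the nearest integer is an assumption, not a
    documented rule of the index.
    """
    return round(fmean(dimension_scores))


def grade(score):
    """Map an overall score to a letter grade using the legend's bands."""
    for threshold, letter in GRADE_BANDS:
        if score >= threshold:
            return letter
    return "D"


# Per-dimension scores as listed for Claude Sonnet 4.6 in the table.
scores = [74, 100, 68]
total = overall_score(scores)
print(total, grade(total))
```

Because the weights are equal, a single weak dimension pulls the overall score down by only its share of the total, which is why models with a perfect score in one dimension can still land in the B band.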

Full methodology, dataset descriptions, and reproducibility information are published in the peer-reviewed papers listed below.

EACL 2026

Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models

An investigation into embedded political orientations in AI systems.

IASEAI 2026

Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models

Methods for identifying AI-generated historical misinformation.

ICLR 2026

SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests

A comprehensive benchmark for evaluating AI vulnerability to harmful sociopolitical queries.

EACL 2026 Findings

When Do Language Models Endorse Limitations on Universal Human Rights Principles?

Analysis of conditions under which AI systems may compromise fundamental human rights principles.

ACL 2025 Findings

Revealing Hidden Mechanisms of Cross-Country Content Moderation with Natural Language Processing