Dotting Test

Turkish glyph benchmark for image models.

A glyph-level benchmark for Turkish text in AI-generated images. Human labels are the ground truth; AI judge labels are auxiliary and serve only to scan the full corpus at scale.
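To make "glyph-level" concrete, here is a minimal sketch of comparing a transcription of a generated image against its target, one character at a time. The function name `classify_reading` and the three-way result are illustrative assumptions, not the benchmark's actual labeling code; the only facts relied on are the Unicode code points for dotless ı (U+0131) and dotted i (U+0069). It handles lowercase only.

```python
import unicodedata

DOTLESS_I = "\u0131"  # ı, LATIN SMALL LETTER DOTLESS I
DOTTED_I = "i"        # U+0069, LATIN SMALL LETTER I


def classify_reading(target: str, reading: str) -> str:
    """Hypothetical helper: compare a transcription to the target text.

    Returns "correct" if they match, "dotted" if the only differences are
    dotless ı rendered as dotted i, and "other" for anything else.
    """
    # Normalize so that composed and decomposed forms compare equal.
    target = unicodedata.normalize("NFC", target)
    reading = unicodedata.normalize("NFC", reading)

    if reading == target:
        return "correct"
    if len(reading) != len(target):
        return "other"

    for want, got in zip(target, reading):
        if want == got:
            continue
        if not (want == DOTLESS_I and got == DOTTED_I):
            return "other"
    # Same length, not equal, and every mismatch was ı -> i.
    return "dotted"


print(classify_reading("Fırat", "Firat"))
print(classify_reading("kız", "kız"))
```

A plain string comparison would already flag "Firat" as wrong; the per-character pass is what lets the failure be attributed to the dot specifically.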

The Hugging Face Dataset is the canonical artifact. This Space for fge-auto/dotting-benchmark is a lightweight browser for the leaderboard and example images.

8,400 generation rows
8,396 OCR/VLM rows
1,055 human labels
Status: 8,396 successful images · 4 generation errors
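The corpus counts are consistent with each other, as a quick check shows. The variable names are illustrative, and the coverage figure assumes every human label belongs to a successfully generated image, which the page does not state.

```python
generation_rows = 8_400
generation_errors = 4
ocr_vlm_rows = 8_396
human_labels = 1_055

# Every generation row that did not error should have an OCR/VLM row.
successful_images = generation_rows - generation_errors
assert successful_images == ocr_vlm_rows

# Assumption: all human labels fall on successful images.
human_coverage = human_labels / successful_images
print(f"{successful_images} successful images")
print(f"human-labeled share: {human_coverage:.1%}")  # about 12.6%
```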

AI-Estimated Model Ranking

Gemini 3.5 Flash labels cover the full corpus and are useful for scanning trends. Final claims should use the human-labeled subset.

| Model | Images | Correct (AI-est.) | Dotted on legible dotless targets | Legible |
|---|---|---|---|---|
| GPT Image 2 | 210 | 97.1% | 1.8% | 100.0% |
| Nano Banana 2 | 210 | 93.3% | 5.4% | 100.0% |
| Nano Banana Pro | 210 | 86.7% | 6.0% | 100.0% |
| GPT Image 1.5 | 210 | 83.8% | 10.7% | 100.0% |
| GPT Image 1 Mini | 207 | 82.6% | 7.9% | 100.0% |
| GPT Image 1 | 210 | 82.4% | 10.1% | 100.0% |
| Grok Imagine Image Quality | 210 | 78.1% | 17.9% | 100.0% |
| Ideogram 4.0 | 210 | 75.2% | 13.9% | 98.6% |
| FLUX.2 [flex] | 210 | 65.7% | 25.0% | 100.0% |
| Krea 2 Medium | 210 | 63.3% | 19.2% | 99.5% |
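One plausible way to compute the three rate columns from per-image labels is sketched below. The `Row` schema, the `summarize` function, and the choice of denominators are assumptions made for illustration; the page does not define them. In particular, "Correct" and "Legible" are taken over all images, and the dotted rate is taken over legible images whose target contains a lowercase dotless ı.

```python
from dataclasses import dataclass
from typing import Optional

DOTLESS_I = "\u0131"  # ı


@dataclass
class Row:
    """Hypothetical per-image record."""
    target: str    # intended text, e.g. "kız"
    label: str     # e.g. "correct", "dotted", "offtask"
    legible: bool  # whether the rendered text could be read at all


def summarize(rows: list[Row]) -> dict[str, Optional[float]]:
    """Compute leaderboard-style rates for one model's rows."""
    total = len(rows)
    if total == 0:
        raise ValueError("no rows to summarize")

    legible = [r for r in rows if r.legible]
    # Only legible images with a dotless target can show the dotting error.
    dotless_legible = [r for r in legible if DOTLESS_I in r.target]

    correct = sum(r.label == "correct" for r in rows)
    dotted = sum(r.label == "dotted" for r in dotless_legible)

    return {
        "images": total,
        "correct": correct / total,
        "dotted_on_legible_dotless": (
            dotted / len(dotless_legible) if dotless_legible else None
        ),
        "legible": len(legible) / total,
    }


rows = [
    Row("kız", "correct", True),
    Row("kız", "dotted", True),
    Row("ışık", "dotted", True),
    Row("göl", "correct", True),
    Row("kız", "offtask", False),
]
print(summarize(rows))
```

Because the dotted rate uses a narrower denominator than the other two columns, the percentages in a row are not expected to sum to 100%.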

Specimens

Curated examples from the benchmark corpus. Human labels are shown where available.

Fırat becomes Firat

Intended text: Fırat. A clean sign turns Fırat into Firat.

FLUX.2 [dev] | human: dotted

A clean Fırat

Intended text: Fırat. The same task is hard, but not impossible.

FLUX.2 [pro] | human: correct

The dot that split the judges

Intended text: Fırat. The sign looks confident. The machine cannot read what it wrote.

Ideogram 3.0 | human: dotted

When the mark falls off

Intended text: kız. The diacritic did not disappear. It fell off.

Qwen-Image-2512 | human: dotted

Meaning instead of text

Intended text: kız. It translated the word into a scene.

Ideogram 3.0 | human: offtask

ışık, overcorrected

Intended text: ışık. The lights are gorgeous. The middle letter is not.

Ideogram 2a | human: dotted