Modern AI cannot solve Sudoku: analysis of statistical induction limits and the neuro-symbolic future
Tuesday 12 May 2026

From 1950 to 1980, the symbolic paradigm dominated: intelligence was seen as the manipulation of symbols and rules.
At the end of the 1980s, connectionism gained momentum, drawing inspiration from the neural networks of the human brain.
Neuro-symbolic AI then emerged as an attempt to combine statistical learning and symbolic reasoning.
Deduction goes from the general to the particular: if the premises are true, the conclusion is necessarily true.
flowchart LR
A["All men are mortal"] --> P{"Deduction"}
B["Socrates is a man"] --> P
P --> C["Socrates is mortal"]
Induction goes from the particular to the general: it proposes a rule from observed cases, without absolute certainty.
flowchart LR
A["Socrates is mortal"] --> P{"Induction"}
B["Plato is mortal"] --> P
C["..."] --> P
D["Andreas is mortal"] --> P
P --> E["All men are mortal"]
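The contrast between the two diagrams can be made concrete with a small sketch. The function names `deduce_mortal` and `induce_all_mortal` are illustrative assumptions, not part of the talk. The first applies a general rule to a particular fact; the second proposes a general rule from a finite list of observations, and a single counterexample is enough to withdraw it.

```python
from typing import Dict, Optional, Set


def deduce_mortal(name: str, men: Set[str]) -> Optional[bool]:
    """Deduction: the rule 'all men are mortal' is taken as a premise."""
    if name in men:
        return True   # follows necessarily from the two premises
    return None       # the premises say nothing about this individual


def induce_all_mortal(observed: Dict[str, bool]) -> bool:
    """Induction: propose the general rule if every observed case agrees."""
    return bool(observed) and all(observed.values())


men = {"Socrates", "Plato", "Andreas"}
print(deduce_mortal("Socrates", men))   # True

observed = {"Socrates": True, "Plato": True, "Andreas": True}
print(induce_all_mortal(observed))      # True, but only a proposal

# One hypothetical counterexample overturns the induced rule.
observed["counterexample"] = False
print(induce_all_mortal(observed))      # False
```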
Applying these concepts to our problem:
Classic Sudoku consists of filling a \(9×9\) grid so that each row (Figure 1), each column (Figure 2), and each \(3×3\) block (Figure 3) contains every digit from \(1\) to \(9\) exactly once.
2 1 8 | 3 9 4 | 6 7 5
. . . | . . . | . . .
. . . | . . . | . . .
------+-------+------
. . . | . . . | . . .
. . . | . . . | . . .
. . . | . . . | . . .
------+-------+------
. . . | . . . | . . .
. . . | . . . | . . .
. . . | . . . | . . .
2 . . | . . . | . . .
9 . . | . . . | . . .
4 . . | . . . | . . .
------+-------+------
6 . . | . . . | . . .
7 . . | . . . | . . .
8 . . | . . . | . . .
------+-------+------
1 . . | . . . | . . .
3 . . | . . . | . . .
5 . . | . . . | . . .
2 1 8 | . . . | . . .
9 7 3 | . . . | . . .
4 6 5 | . . . | . . .
------+-------+------
. . . | . . . | . . .
. . . | . . . | . . .
. . . | . . . | . . .
------+-------+------
. . . | . . . | . . .
. . . | . . . | . . .
. . . | . . . | . . .
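The three constraints illustrated by Figures 1 to 3 can be stated as a short check. This is a minimal sketch; the names `unit_ok` and `grid_ok` are assumptions chosen for illustration, and a grid is assumed to be a list of nine lists of nine integers.

```python
def unit_ok(values):
    """True if the nine values are the digits 1..9, each exactly once."""
    return sorted(values) == list(range(1, 10))


def grid_ok(grid):
    """True if every row, column and 3x3 block of a full grid is valid."""
    rows = [list(row) for row in grid]
    cols = [[grid[r][c] for r in range(9)] for c in range(9)]
    blocks = [
        [grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)]
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(unit_ok(unit) for unit in rows + cols + blocks)


# The row of Figure 1, the column of Figure 2 and the block of Figure 3.
print(unit_ok([2, 1, 8, 3, 9, 4, 6, 7, 5]))  # True
print(unit_ok([2, 9, 4, 6, 7, 8, 1, 3, 5]))  # True
print(unit_ok([2, 1, 8, 9, 7, 3, 4, 6, 5]))  # True

# A row that repeats 2 and 4 violates the rule.
print(unit_ok([9, 3, 6, 4, 2, 1, 2, 4, 7]))  # False
```

Checking a proposed grid is cheap and purely mechanical; the difficulty of Sudoku lies in finding the grid, not in verifying it.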
Solving one given grid does not mean being able to solve all instances of the problem in general.
. 1 8 | . . . | . 7 .
. 7 . | . . 1 | 9 . .
. 6 . | 8 5 . | 1 2 .
------+-------+------
6 . . | 7 . . | 3 . .
7 . . | 5 1 . | . 8 .
8 . 4 | . . 9 | 7 . 5
------+-------+------
. 4 7 | . 9 8 | . 5 .
. . 2 | 6 . 5 | . 3 .
. . 6 | . . . | 2 4 .
Instance given to the LLMs.
5 1 8 | 9 2 4 | 6 7 3
2 7 3 | 4 6 1 | 9 0 8 🛑
. . . | . . . | . . .
------+-------+------
. . . | . . . | . . .
. . . | . . . | . . .
. . . | . . . | . . .
------+-------+------
. . . | . . . | . . .
. . . | . . . | . . .
. . . | . . . | . . .
ChatGPT produces an incorrect answer (an invalid 0 in the second row) and ultimately stops before completing the grid.
2 1 8 | 9 6 3 | 5 7 4
5 7 3 | 2 4 1 | 9 6 8
4 6 9 | 8 5 7 | 1 2 3
------+-------+------
6 5 1 | 7 8 4 | 3 9 2
7 9 3 | 5 1 2 | 4 8 6
8 2 4 | 3 6 9 | 7 1 5
------+-------+------
3 4 7 | 1 9 8 | 6 5 2
1 8 2 | 6 7 5 | 4 3 9
9 3 6 | 4 2 1 | 2 4 7 ‼️
Gemini 3 Fast returns an incorrect grid: the last row repeats the digits 2 and 4.
4 1 8 | 9 2 6 | 5 7 3
2 7 5 | 4 3 1 | 9 6 8
9 6 3 | 8 5 7 | 1 2 4
------+-------+------
6 5 1 | 7 8 4 | 3 9 2
7 3 9 | 5 1 2 | 4 8 6
8 2 4 | 3 6 9 | 7 1 5
------+-------+------
3 4 7 | 2 9 8 | 6 5 1
1 9 2 | 6 4 5 | 8 3 7
5 8 6 | 1 7 3 | 2 4 9 ✅
Gemini 3 Pro solves the simple instance, thanks to extended chain-of-thought reasoning.
This test shows that a more powerful model can succeed on a simple case without having a robust logical solver.
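For contrast, here is what a small symbolic solver can look like. This is a sketch of plain backtracking search, a standard technique and not the method used by any of the models above; the names `EASY`, `parse`, `allowed` and `solve` are assumptions for illustration. It tries each digit in each empty cell, keeps only the digits allowed by the row, column and block rules, and undoes a choice as soon as it leads to a dead end.

```python
EASY = """
. 1 8 | . . . | . 7 .
. 7 . | . . 1 | 9 . .
. 6 . | 8 5 . | 1 2 .
------+-------+------
6 . . | 7 . . | 3 . .
7 . . | 5 1 . | . 8 .
8 . 4 | . . 9 | 7 . 5
------+-------+------
. 4 7 | . 9 8 | . 5 .
. . 2 | 6 . 5 | . 3 .
. . 6 | . . . | 2 4 .
"""


def parse(text):
    """Read digits and '.' (empty, stored as 0); ignore separators."""
    cells = [0 if ch == "." else int(ch) for ch in text if ch in ".123456789"]
    return [cells[i * 9:(i + 1) * 9] for i in range(9)]


def allowed(grid, r, c, d):
    """True if digit d can be placed at (r, c) without breaking a rule."""
    if d in grid[r]:
        return False
    if any(grid[i][c] == d for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[i][j] != d
               for i in range(br, br + 3)
               for j in range(bc, bc + 3))


def solve(grid):
    """Fill the grid in place; return False if no completion exists."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in range(1, 10):
                    if allowed(grid, r, c, d):
                        grid[r][c] = d
                        if solve(grid):
                            return True
                        grid[r][c] = 0  # undo and try the next digit
                return False  # no digit fits this cell: backtrack
    return True  # no empty cell left


grid = parse(EASY)
if solve(grid):
    for row in grid:
        print(*row)
```

Unlike a statistical model, this search is exhaustive: it either returns a grid that satisfies every constraint or reports that none exists, whatever the instance.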
Gemini 3 Pro (reasoning) attempting a difficult Sudoku:
. . . | . 3 . | . . 6
. 1 . | . 5 . | . . .
. . . | . . 9 | 4 . 3
------+-------+------
8 . . | . . . | 5 . .
. . 4 | . . 2 | 1 . 7
6 . 5 | . . . | . . 4
------+-------+------
1 . 9 | 3 . . | . . .
. . . | . 4 . | . 3 .
4 . 2 | . 8 1 | . . .
Instance given to the LLM.
7 4 8 | 2 3 1 | 9 5 6
2 1 3 | 8 5 6 | 7 4 9
5 9 6 | 7 2 9 | 4 1 3
------+-------+------
8 2 1 | 4 6 7 | 5 9 3
3 9 4 | 5 1 2 | 6* 8 7
6 7 5 | 9 8 3 | 1 2 4
------+-------+------
1 5 9 | 3 7 8 | 2 6 4
9 8 7 | 6 4 5 | 3 1 2
4 3 2 | 1 9 4 | 8 7 5
* Modified initial value.
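A modified initial value can be detected mechanically by comparing the answer with the instance. The sketch below assumes the two grids above are pasted as text; `changed_givens` and `parse` are hypothetical helper names, and cells are reported as 1-indexed (row, column, given, answer) tuples.

```python
PUZZLE = """
. . . | . 3 . | . . 6
. 1 . | . 5 . | . . .
. . . | . . 9 | 4 . 3
------+-------+------
8 . . | . . . | 5 . .
. . 4 | . . 2 | 1 . 7
6 . 5 | . . . | . . 4
------+-------+------
1 . 9 | 3 . . | . . .
. . . | . 4 . | . 3 .
4 . 2 | . 8 1 | . . .
"""

ANSWER = """
7 4 8 | 2 3 1 | 9 5 6
2 1 3 | 8 5 6 | 7 4 9
5 9 6 | 7 2 9 | 4 1 3
------+-------+------
8 2 1 | 4 6 7 | 5 9 3
3 9 4 | 5 1 2 | 6* 8 7
6 7 5 | 9 8 3 | 1 2 4
------+-------+------
1 5 9 | 3 7 8 | 2 6 4
9 8 7 | 6 4 5 | 3 1 2
4 3 2 | 1 9 4 | 8 7 5
"""


def parse(text):
    """Read digits and '.' (empty, stored as 0); ignore other characters."""
    cells = [0 if ch == "." else int(ch) for ch in text if ch in ".123456789"]
    return [cells[i * 9:(i + 1) * 9] for i in range(9)]


def changed_givens(puzzle, answer):
    """List the given cells whose value differs in the answer."""
    return [
        (r + 1, c + 1, puzzle[r][c], answer[r][c])
        for r in range(9)
        for c in range(9)
        if puzzle[r][c] != 0 and puzzle[r][c] != answer[r][c]
    ]


for row, col, given, got in changed_givens(parse(PUZZLE), parse(ANSWER)):
    print(f"R{row}C{col}: given {given}, answer {got}")
```

On these two grids, the result includes `(5, 7, 1, 6)`, that is R5C7, the cell marked with an asterisk.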
Initiating Grid Analysis
I've begun analyzing the Sudoku grid, identifying the user's objective and input format in French.
Defining Solution Strategy
I've determined text output is the best approach, and I'm starting to build a solving algorithm.
Analyzing Number Placement
I've just made significant deductions on number placement, specifically for '4' and '1'.
I've determined that R7C8 must be '4', and further analyzed Box 9 after.
Analyzing Number Placement (Again)
I've successfully placed '1' at R4C3, and initiated further deductions about other numbers.
Deducing Candidate Values
I've determined some missing numbers and narrowed potential candidates within specific rows and boxes.
Analyzing Candidate Constraints
I've determined possible values and exclusions for Row 7, and reexamined column and box 6 for restrictions.
Analyzing Number Placement (Again)
I've determined some new possible values, applying deductions across rows, columns, and boxes.
Deducing Possible Values
I've determined possible locations for '7', '8', and missing values in Row 5, and I'm examining Box 5.
Analyzing Number Placement (Again)
I've determined R5C4 must be '5' using column constraints and deduced candidates for row and box 5.
Analyzing Number Placement (Again)
I've determined some new possible locations for numbers, revisiting existing areas of interest, like Boxes 2, 4 and 7.
Confirming Number Placement
I've successfully placed '4' at R1C2 using box and column constraints, and I'm further examining 4's grid placement.
Refining Value Placement
I've determined possible candidate locations for '1' in Boxes 3 and 9, based on exclusions.
Confirming New Placement
I've determined R8C9 is '1', which I verified and used to begin Box 3 deductions.
Deducing New Values
I've determined R6C7 is '3', and I'm updating Row 6, finding the placement for missing numbers.
Analyzing Pair Deduction
I've determined a strong candidate pair in Box 4, using row and column analysis.
I've deduced that R5C1 and R5C2 are {3, 9} and R4C2 and R6C2 are {2, 7} within Box 4, due to missing values.
Confirming Box Values
I've determined that R5C5 is '6' and R5C8 is '8', and I'm updating deductions.
Refining Grid Analysis
I've determined possible locations for 5 and 9 within Box 9, using existing constraints.
Refining Candidate Pairs
I've determined a hidden pair of {1, 5} exists at R1C8 and R3C8 within Box 3, by exclusions and constraints.
Refining Row Analysis
I'm now focusing on Box 3, specifically on constraints within Rows 1, 2, and 3, to place '8'.
Confirming Col Deduction
I've determined R9C2 is '3', R5C2 is '9', and R5C1 is '3', significantly progressing column and box analysis.
Refining Placement Analysis
I've determined possible locations for 9 within column 7, and reexamined column and box constraints.
Reassessing Row Analysis
I've narrowed down candidate values for Col 1 and reviewed locations of missing numbers in Box 7.
Deducing New Placements
I've determined possible locations for missing values within Columns 3 and 5, based on current exclusions.
Confirming Number Placement
I've determined that R4C6 is '3', R4C4 is '4', and reexamined candidate placements within Box 5.
Gemini chain of thought
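Individual steps of such a trace can be checked mechanically. The sketch below tests the claim that R7C8 must be '4' by listing where a digit can still go inside one block; when a single cell remains, the placement is forced. The name `places_for` and the 0-indexed block coordinates are assumptions for illustration.

```python
HARD = """
. . . | . 3 . | . . 6
. 1 . | . 5 . | . . .
. . . | . . 9 | 4 . 3
------+-------+------
8 . . | . . . | 5 . .
. . 4 | . . 2 | 1 . 7
6 . 5 | . . . | . . 4
------+-------+------
1 . 9 | 3 . . | . . .
. . . | . 4 . | . 3 .
4 . 2 | . 8 1 | . . .
"""


def parse(text):
    """Read digits and '.' (empty, stored as 0); ignore separators."""
    cells = [0 if ch == "." else int(ch) for ch in text if ch in ".123456789"]
    return [cells[i * 9:(i + 1) * 9] for i in range(9)]


def places_for(grid, digit, box_row, box_col):
    """1-indexed (row, column) cells of one block where digit may still go."""
    rows = range(3 * box_row, 3 * box_row + 3)
    cols = range(3 * box_col, 3 * box_col + 3)
    if any(grid[r][c] == digit for r in rows for c in cols):
        return []  # digit already placed in this block
    spots = []
    for r in rows:
        for c in cols:
            if grid[r][c] != 0:
                continue  # cell already filled
            in_row = digit in grid[r]
            in_col = any(grid[i][c] == digit for i in range(9))
            if not in_row and not in_col:
                spots.append((r + 1, c + 1))
    return spots


# Box 9 is the bottom-right block: box_row=2, box_col=2.
print(places_for(parse(HARD), 4, 2, 2))  # [(7, 8)]
```

The deduction in the trace is therefore sound on the original instance. The final grid above nevertheless shows 6 at R7C8, so the answer does not preserve it.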
Thus, even a more powerful model fails to solve a more complicated Sudoku instance.
The study by Seely et al. (2025) evaluates several models on 100 Sudokus of sizes 4×4, 6×6, and 9×9:
| Model (LLM) | \(4×4\) | \(6×6\) | \(9×9\) |
|---|---|---|---|
| o3‑mini‑high | \(73.3 \%\) | \(6.7 \%\) | \(2.9 \%\) |
| Gemini 2.5 Pro | \(60.0 \%\) | \(13.3 \%\) | \(0.0 \%\) |
| GPT‑4.1 | \(13.3 \%\) | \(0.0 \%\) | \(0.0 \%\) |
| Qwen-* | \(40\)–\(53 \%\) | \(0.0 \%\) | \(0.0 \%\) |
The best performances remain far from general reliability, even on 4×4 grids.


The Thinker seems relieved to learn about these perspectives. However, this is a thinker that was itself thought up, and drawn, by an AI. – Gemini/Nano Banana
An article is on the way!
Thanks to the audience, mortals and immortals alike, and thank you for your questions 🤗

Cytopia | Cité Internationale Universitaire de Paris