
Solving Sudoku With Prolog: A Journey Into Declarative Programming
Table of contents
Timeline: | Oct-Nov 2015 | |
Languages Used: | Prolog | |
School: | AIT at Budapest University of Technology and Economics | |
Course: | Semantic and Declarative Technologies |
How I Learned to Stop Worrying and Love Logic Programming
During my Semantic and Declarative Technologies course at AIT-Budapest, I was introduced to Prolog, a language that’s dramatically different from imperative languages like Python or Java. Instead of specifying how to achieve a goal, you describe the goal itself and the logical rules that define it. This makes it a fascinating and powerful tool for problems that are all about constraints and logic—like Sudoku.
Prolog is built on the idea of unification and backtracking. You define a set of facts and rules, and Prolog’s engine searches for solutions that satisfy them. If it hits a dead end, it backtracks and tries another path. This is, conceptually, very similar to how a human solves Sudoku: you make a guess, see if it breaks any rules, and if it does, you backtrack and try something else.
This project was my first real dive into this different way of thinking about programming. I started with a standard Sudoku solver and then moved on to a more complex variant called “Skyscraper Sudoku” to really test my understanding.
The most fun challenges
1. The Core Logic: Defining Sudoku Rules
A Sudoku puzzle is a grid where every row, column, and 3x3 sub-grid must contain all the numbers from 1 to 9 without repetition. In Prolog, we can translate these rules directly into logical predicates.
The main predicate, sudoku/2
, ties everything together. It takes an incomplete grid (Grid0
) and produces a solved grid (Grid
).
1sudoku([], []).
2sudoku([R|Rs], Grid):-
3 zeros_gone(R, SR),
4 append([SR], GridN, Grid),
5 sudoku(Rs, GridN),
6 complete(Grid).
This might look intimidating, so let’s break it down:
- The first line,
sudoku([], []).
, is our base case: an empty grid is already solved. - The
zeros_gone/2
predicate is a helper I wrote to replace the blank entries (represented as0
) with candidate numbers that can be tested later for consistency. append/3
reconstructs the grid with the newly variable-filled row.sudoku(Rs, GridN)
recursively processes the rest of the rows.- Finally,
complete(Grid)
is called to check if the generated grid is a valid Sudoku solution.
The complete/1
predicate is where the rules of Sudoku are enforced:
1complete(Grid):-
2 fullyfilled(Grid),
3 consistent(Grid).
4
5consistent(Grid):-
6 rows_alldiff(Grid),
7 transpose(Grid, GridT),
8 rows_alldiff(GridT),
9 findall(Subgrid, subgrid(Grid, Subgrid), L),
10 ( foreach(Sub, L)
11 do grid_alldiff(Sub)
12 ).
fullyfilled/1
simply ensures there are no empty spots left.consistent/1
is the real powerhouse. It checks three conditions:rows_alldiff(Grid)
: All numbers in each row must be unique.transpose(Grid, GridT), rows_alldiff(GridT)
: It cleverly transposes the grid (rows become columns) and runs the same uniqueness check, thus validating the columns.subgrid/2
: It defines and then checks all the 3x3 sub-grids to ensure they also contain unique numbers.
This declarative approach is the beauty of Prolog. We don’t implement a step-by-step solving algorithm. Instead, we define what a correct solution looks like, and Prolog’s engine does the hard work of finding it.
2. Upping the Ante: Skyscraper Sudoku
To push myself further, I tackled Skyscraper Sudoku. This variant adds a new layer of constraints. Imagine the numbers in the grid represent the heights of buildings. The numbers outside the grid tell you how many buildings are visible from that vantage point in that row or column. A taller building hides any shorter ones behind it.

Skyscraper Sudoku
Grid numbers represent building heights. Numbers on the perimeter tell you how many buildings are visible from that vantage point.
Loading interactive scene...
This required a more advanced approach. My first solver was a “generate-and-test” model: it would generate a complete grid and then test if it was valid. For Skyscraper Sudoku, this is incredibly inefficient. The solution was to use Constraint Logic Programming over Finite Domains (CLP(FD)), a library that turns Prolog into a powerful constraint solver.
With CLP(FD), you define the domains for your variables (e.g., integers from 1 to 9) and the constraints they must satisfy before Prolog starts its search. This prunes the search space dramatically, making the problem feasible to solve.
The key predicate here is visnum/4
, which defines the “visibility” constraint:
1% visnum(V, Dir, RC, SS): The number of elements in row or column RC
2% of grid SS that are visible from the direction Dir is V.
3visnum(V, n, RC, SS):- % north => left visibility, RC is column number
4 transpose(SS, SST),
5 nth(RC, SST, Col),
6 visnum_left(V, Col).
7visnum(V, w, RC, SS):- % west => left visibility, RC is row number
8 nth(RC, SS, Row),
9 visnum_left(V, Row).
10...
The interesting part is visnum_left/2
, which calculates the visibility from one direction.
1visnum_left(V, [A,B|As]):-
2 A #< B #<=> Visible,
3 V #= V0 + Visible,
4 Visible #=> C #= B,
5 #\ Visible #=> C #= A,
6 visnum_left(V0, [C|As]).
This uses CLP(FD)’s specialized operators (#<
, #<=>
, #=
, #=>
):
A #< B #<=> Visible
:Visible
becomes1
(true) ifA
is less thanB
, and0
(false) otherwise. This determines if buildingB
is visible pastA
.V #= V0 + Visible
: The total visibilityV
is the sum of the visibility from the rest of the list (V0
) plus the currentVisible
state.- The next two lines cleverly determine the new “tallest building so far” (
C
) for the recursive call.
Using CLP(FD) felt like a superpower. It shows the real potential of logic programming for solving combinatorial problems by defining constraints and letting the solver handle the optimization.
Final Thoughts
Working with Prolog was a mind-bending and rewarding experience. It forces a different kind of problem-solving—one focused on defining logical systems rather than detailing computational steps. It’s a powerful reminder that the tools we choose can fundamentally shape how we approach a problem. For puzzles, logic, and AI, Prolog remains a uniquely elegant tool.
The complete code for both the standard and Skyscraper Sudoku solvers is embedded in this post below. Feel free to play with it and explore the world of logic programming!
More Resources
Source Code
sudoku.pl
1% Soeren Walls - October 14th, 2015
2
3:- use_module(library(lists),[append/2,transpose/2]).
4
5% sudoku(+Grid0, ?Grid): Grid is a complete refinement of the Sudoku
6% grid Grid0, where the "blank" entries of Grid0 are zeros.
7sudoku([], []).
8sudoku([R|Rs], Grid):-
9 zeros_gone(R, SR),
10 append([SR], GridN, Grid),
11 sudoku(Rs, GridN),
12 complete(Grid).
13
14% zeros_gone(+L0, -L): The list L is the list L0 with all zeros
15% replaced by positive integers that are not in L0 but are =< the
16% length of L0. All elements of L are different from one another.
17zeros_gone(L0, L):-
18 length(L0, S),
19 zeros_gone(L0, L0, L, S).
20
21% zeros_gone(+L0, +LC, -L, +S): The list L is the list L0 with all
22% zeros replaced by positive integers that are not in LC but are
23% all =< S, where S is the length of LC. The list LC is the unchanged
24% original version of list L0 at the time of the first call to this
25% predicate. All elements of L are also different from one another.
26zeros_gone([], _, [], _).
27zeros_gone([A|T], LC, L, S):-
28 ( \+ A = 0 -> L = [A|L1]
29 ; between_butnotin(1, S, N, LC),
30 L = [N|L1]
31 ),
32 zeros_gone(T, LC, L1, S),
33 alldiff_not0(L).
34
35% between_butnotin(N, M, X, L): X is an integer >= N and =< M, but
36% X is not a member of L.
37between_butnotin(N, M, X, L):-
38 ( nonmember(N, L) ->
39 ( N =< M, X = N
40 ; N < M, N0 is N+1,
41 between_butnotin(N0, M, X, L)
42 )
43 ; N < M, N0 is N+1,
44 between_butnotin(N0, M, X, L)
45 ).
46
47% complete(+Grid): Grid is fully filled in and consistent.
48complete(Grid):-
49 fullyfilled(Grid),
50 consistent(Grid).
51
52% fullyfilled(+Grid): All fields of Grid contain positive integers.
53fullyfilled([]).
54fullyfilled([R|Rs]):-
55 allpositive(R),
56 fullyfilled(Rs).
57
58% allpositive(+List): All elements in List are positive integers.
59allpositive([]).
60allpositive([A|T]):-
61 integer(A),
62 A > 0,
63 allpositive(T).
64
65% consistent(+Grid): For all areas of the Sudoku grid Grid, it holds
66% that all positive integers in the area are distinct.
67consistent([]).
68consistent(Grid):-
69 rows_alldiff(Grid),
70 transpose(Grid, GridT),
71 rows_alldiff(GridT),
72 findall(Subgrid, subgrid(Grid, Subgrid), L),
73 ( foreach(Sub, L)
74 do grid_alldiff(Sub)
75 ).
76
77% subgrid(+Grid, ?Subgrid): Subgrid is a sub-grid
78% of Grid with size K x K, where K is the square root of the
79% size of Grid, at (row,col) coordinates (I,J).
80subgrid(Grid, Subgrid):-
81 length(Grid, M),
82 K is integer(sqrt(M)),
83 sublist(Grid, SubRows, I, K),
84 I mod K =:= 0,
85 transpose(SubRows, SubCols),
86 sublist(SubCols, SubgridT, J, K),
87 J mod K =:= 0,
88 transpose(Subgrid, SubgridT).
89
90% grid_alldiff(+Grid): All the elements in the list of lists Grid
91% are different from each other, disregarding the number 0.
92grid_alldiff([]).
93grid_alldiff(Grid):-
94 flatten(Grid, L),
95 alldiff_not0(L).
96
97% flatten(+Grid, -L): L is the "flattened" list consisting
98% of all individual elements from a list of lists called Grid.
99flatten([], []).
100flatten([R|Rs], L):-
101 append(R, L2, L),
102 flatten(Rs, L2).
103
104% rows_alldiff(+L): In a list L consisting of sublists, each
105% sublist has elements which are all different from each other,
106% disregarding the number 0.
107rows_alldiff([]).
108rows_alldiff([R|Rs]):-
109 alldiff_not0(R),
110 rows_alldiff(Rs).
111
112% alldiff_not0(+L): The elements of list L are all different from
113% each other, disregarding the number 0.
114alldiff_not0([]).
115alldiff_not0([D|Ds]) :-
116 ( \+ D = 0 -> nonmember(D, Ds) % BIP == \+ member(D, Ds)
117 ; true
118 ),
119 alldiff_not0(Ds).
120
121% sublist(+Whole, ?Part, +Before, +Length): Part is a sublist of
122% Whole such that there are Before number of elements in Whole
123% before Part and the length of Part is Length.
124sublist(Whole, Part, Before, Length):-
125 append([BL,Part,_], Whole),
126 length(BL, Before),
127 length(Part, Length).
skysudoku.pl
1% Soeren Walls - November 15th, 2015
2
3:- use_module(library(clpfd)).
4:- use_module(library(lists),[append/2,transpose/2,reverse/2]).
5
6% :- type spuzzle ---> ss(size,list(clue))
7% :- type size == int.
8% :- type clue ---> g(num,row,col) ; v(vcount,dir,rowcol).
9% :- type row == int.
10% :- type col == int.
11% :- type rowcol == int.
12% :- type num == int.
13% :- type vcount == int.
14% :- type dir ---> n ; e ; s ; w.
15% :- type ssol == list(list(num)).
16% :- pred skysudoku(spuzzle::in, ssol::out).
17
18% skysudoku(SP, SS): SS is a solution of the Skyscraper Sudoku puzzle SP.
19skysudoku(ss(0,_), [[]]).
20skysudoku(ss(1,_), [[1]]):- !.
21skysudoku(ss(K,Clues), SS):-
22 M #= K*K,
23 length(SS, M),
24 length(SST, M),
25 transpose(SS, SST),
26 clues(Clues, SS),
27 grid_alldiff(SS, SST, M),
28 SGI #= M-K,
29 subgrids_alldiff(SS, K, SGI, SGI, M), % K=2 => (SS,2,2,2,4), K=3 => (SS,3,6,6,9)
30 append(SS, SSF),
31 labeling([], SSF).
32
33% clues(Clues, SS): SS is a Skyscraper Sudoku puzzle for which the
34% properties in the list Clues hold true.
35clues([], _).
36clues([g(N,R,C)|Clues], SS):-
37 given(N, R, C, SS),
38 clues(Clues, SS).
39clues([v(V,Dir,RC)|Clues], SS):-
40 visnum(V, Dir, RC, SS),
41 clues(Clues, SS).
42
43% given(+N, +R, +C, ?Grid): N appears in row R, column C of Grid.
44% The rows and columns begin at index 1.
45given(N, 1, C, [Row|_]):-
46 element(C, Row, N), !.
47given(N, R, C, [_|Rows]):-
48 R0 #= R-1,
49 given(N, R0, C, Rows).
50
51% visnum(V, Dir, RC, SS): The number of elements in row or column RC
52% of grid SS that are visible from the direction Dir is V. The rows
53% and columns begin at index 1.
54visnum(V, n, RC, SS):- % north => left visibility, RC is column number
55 transpose(SS, SST),
56 nth(RC, SST, Col),
57 visnum_left(V, Col).
58visnum(V, w, RC, SS):- % west => left visibility, RC is row number
59 nth(RC, SS, Row),
60 visnum_left(V, Row).
61visnum(V, s, RC, SS):- % south => right visibility, RC is column number
62 transpose(SS, SST),
63 nth(RC, SST, Col),
64 visnum_right(V, Col).
65visnum(V, e, RC, SS):- % east => right visibility, RC is row number
66 nth(RC, SS, Row),
67 visnum_right(V, Row).
68
69% nth(?N, +L, ?E): The N-th element of the proper list L is E.
70% The head of a list is considered its 1st element.
71nth(1, [Head|_], Head).
72nth(N, [_|Tail], E):-
73 nth(M, Tail, E),
74 N #= M+1.
75
76% visnum_left(V, L): The number of elements in list L that are visible
77% from the left is V.
78visnum_left(0, []).
79visnum_left(1, [_]).
80visnum_left(V, [A,B|As]):-
81 A #< B #<=> Visible,
82 V #= V0 + Visible,
83 Visible #=> C #= B,
84 #\ Visible #=> C #= A,
85 visnum_left(V0, [C|As]).
86
87% visnum_right(V, L): The number of elements in list L that are visible
88% from the right is V.
89visnum_right(V, L):-
90 reverse(L, LR),
91 visnum_left(V, LR).
92
93% grid_alldiff(+Grid, +GridT, M): The rows of Grid contain distinct
94% integers from 1 to M. The same holds for the rows of GridT. GridT
95% is the transpose of the grid Grid of size MxM.
96grid_alldiff([], [], _).
97grid_alldiff([Row|Rows], [Col|Cols], M):-
98 length(Row, M),
99 length(Col, M),
100 domain(Row, 1, M),
101 domain(Col, 1, M),
102 all_distinct(Row),
103 all_distinct(Col),
104 grid_alldiff(Rows, Cols, M).
105
106% subgrids_alldiff(+Grid, ?Subgrid, K, I, J, C): Subgrid is the C-th
107% sub-grid of Grid with size KxK, where K is the square root of the
108% width of Grid, at (row,col) coordinates (I,J), and all elements
109% in Subgrid are distinct integers from 1 to K^2.
110subgrids_alldiff(_, _, _, _, 0):- !.
111subgrids_alldiff(Grid, K, I, J, SGCount):-
112 M #= K*K,
113 I in 0..M,
114 J in 0..M,
115 sublist(Grid, SubRows, I, K),
116 transpose(SubRows, SubCols),
117 sublist(SubCols, SubgridT, J, K),
118 append(SubgridT, Subgrid),
119 length(Subgrid, M),
120 domain(Subgrid, 1, M),
121 all_distinct(Subgrid),
122 I #= 0 #<=> IZero,
123 IZero #=> I0 #= M-K,
124 IZero #=> J0 #= J-K,
125 #\ IZero #=> I0 #= I-K,
126 #\ IZero #=> J0 #= J,
127 SGC0 #= SGCount - 1,
128 subgrids_alldiff(Grid, K, I0, J0, SGC0).
129
130% sublist(+Whole, ?Part, +Before, +Length): Part is a sublist of
131% Whole such that there are Before number of elements in Whole
132% before Part and the length of Part is Length.
133sublist(Whole, Part, Before, Length):-
134 append([BL,Part,_], Whole),
135 length(BL, Before),
136 length(Part, Length).
It’s only logical.