« INFORMS’ first Blog Challenge Results: OR and the Holidays | Main | Some new Answer Set Programming Programs »

Rogo grid puzzle in Answer Set Programming (Clingo) and MiniZinc

Encoding/models:
ASP (Clingo): rogo.lp, rogo2.lp
MiniZinc: rogo.mzn, rogo2.mzn
(See below for some problem instances.)

Later update: I have now also implemented versions with symmetry breaking constraints in the two encodings: rogo2.lp and rogo2.mzn. See below for more into

In Operations Research, Sudoko, Rogo, and Puzzles Mike Trick presented the Rogo puzzle, a grid puzzle where the object is to find a loop of a number of steps and get as many points as possible. He writes
The loop must be a real loop: it must return where it started and can’t cross itself or double back. Steps can be either horizontal or vertical: they cannot be diagonal. The loop cannot include any of the black squares. ... Rogo was created by two faculty members (Nicola Petty and Shane Dye)  at the University of Canterbury in Christchurch, New Zealand.  Not surprisingly, Nicola and Shane teach management science there:  the problem has a very strong operations research flavor.
From Creative Heuristics Ltd (the Rogo puzzle site): Rogo is an entirely new type of puzzle. The object is to collect the biggest score possible using a given number of steps in a loop around a grid. The best possible score for a puzzle is given with it, so you can easily check that you have solved the puzzle. Rogo puzzles can also include forbidden squares, which must be avoided in your loop..

Below I have assumed that the path must be in exactly the given number of steps and the programs are written with this assumption in mind (and further reading of the problem descriptions support this interpretation). However, my first approach - probably caused by sloppy reading - was that the optimal path could possible be in lesser steps. It took several hours trying to implement a program supporting this, but was abandoned after re-reading of the problem descriptions. This seems to be a harder problem; maybe it could be used as a variant of the original problem?

I was inspired to solve these puzzle in part because of Mike's last words: Creating a solver would make a nice undergraduate project (and I suspect there are at least a few master's theses and perhaps a doctoral dissertation on algorithmic aspects of creating and solving these). One other reason was to see how to do this with Answer Set Programming - here using Clingo (a Potassco tool) and to compare it with a Constraint Programming system, MiniZinc.

Some more links about Rogo:
  • Instructions
  • Rogo blog
  • YouTube clip.
  • Nicola Petty, Shane Dye: Determining Degree Of Difficulty In Rogo, A TSP-based Paper Puzzle (PDF)
    From the Conclusions: The Rogo puzzle format has a number of aspects that can be controlled to potentially affect degree of difficulty of solving. As a pilot, this study showed that there are many aspects of puzzle-solving related to the nature of the puzzle that can be explored, and there appear to be some general effects, though there are still marked individual differences between people solving the puzzles. This research has the potential to provide interesting insights into both human behaviour, and the nature of puzzles.

    Note: I didn't noticed this until I was almost finished with this blog post (and have just glanced through it).

Example

Here is the Rogo example from Mike Trick's site (pictures borrowed from his site; click on them for larger versions).

Problem:

Rogo puzzle, problem.

One solution:

Rogo puzzle, solution

Note that there is not an unique solution to these puzzles. All three problem instances I tested have more than one solution. For example, the Mike Trick problem has 48 solutions including path symmetries. Since there are 12 steps there are (removing the path symmetry) 48 / 12 = 4 distinct different paths. These different paths are shown below as MiniZinc solutions, where the first step have been fixed to (2,2), i.e. x[1]=2 and y[1]=2<, and it has 3 points (points[1]):

points = array1d(1..12, [3, 0, 0, 2, 0, 0, 2, 0, 0, 1, 0, 0]);
x = array1d(1..12, [2, 3, 4, 4, 5, 5, 5, 4, 3, 2, 2, 2]);
y = array1d(1..12, [2, 2, 2, 3, 3, 4, 5, 5, 5, 5, 4, 3]);
----------
points = array1d(1..12, [3, 0, 0, 2, 0, 0, 2, 0, 0, 1, 0, 0]);
sum_points = 8;
x = array1d(1..12, [2, 3, 3, 4, 5, 5, 5, 4, 3, 2, 2, 2]);
y = array1d(1..12, [2, 2, 3, 3, 3, 4, 5, 5, 5, 5, 4, 3]);
----------
points = array1d(1..12, [3, 0, 0, 1, 0, 0, 2, 0, 0, 2, 0, 0]);
sum_points = 8;
x = array1d(1..12, [2, 2, 2, 2, 3, 4, 5, 5, 5, 4, 3, 3]);
y = array1d(1..12, [2, 3, 4, 5, 5, 5, 5, 4, 3, 3, 3, 2]);
----------
points = array1d(1..12, [3, 0, 0, 1, 0, 0, 2, 0, 0, 2, 0, 0]);
sum_points = 8;
x = array1d(1..12, [2, 2, 2, 2, 3, 4, 5, 5, 5, 4, 4, 3]);
y = array1d(1..12, [2, 3, 4, 5, 5, 5, 5, 4, 3, 3, 2, 2]);
----------

Answer Set Programming, Clingo

Here is the full ASP (Clingo) encoding, without the data:

% domains
rows(1..r).
cols(1..c).

% max number of steps
steps(1..max_steps).

% define adjacency between cells
adj(R,C, R1,C1) :- rows(R;R1), cols(C;C1), |R-R1| + |C-C1|==1.

% the path: unique index
0 { path(I, Row, Col) : steps(I) } 1 :- rows(Row), cols(Col).
1 { path(I, Row, Col) : rows(Row) : cols(Col) } 1 :- steps(I).

% close the circuit: ensure that the first and last cells
% in the path are connected.
:- path(1, R1, C1), path(max_steps, R2, C2), not adj(R1,C1,R2,C2).

% remove bad paths
:- path(I-1,R1,C1), path(I,R2,C2), not adj(R1,C1, R2,C2).

% no black cells in the path
:- path(I, R,C), black(R,C).

% total points, needed since
% "Optimization:" don't show the proper value.
total(Total) :- Total = #sum[got_points(R,C,Value) = Value].

% list the cells in path with points
got_points(R,C, Value) :- point(R,C,Value), path(I, R, C).

% maximize the number of points
% #maximize [ path(I,R,C) : steps(I) : point(R,C,P) = P ].

% alternative: we can add an second objective to
% start with the cell with lowest indices
#maximize [ path(I,R,C) : steps(I) : point(R,C,P) = P@2 ].
#minimize [ path(1,R,C) = R*c+C@1].

#hide.
#show path(I, Row, Col).
#show total(Total).
#show got_points(R,C,Value).

Here is the encoding for Mike Trick's problem instance:
#const max_steps = 12.
#const r = 5.
#const c = 9.

% the point cells
point(1,1,2).
point(2,2,3).
point(2,5,1).
point(2,8,2).
point(3,9,3).
point(4,3,2).
point(5,5,2).
point(5,8,1).

% black cells (to avoid)
black(3,7).
black(4,4).
The solution of Mike Trick's problem (edited), using the following command line:
clingo --heuristic=Vmtf --stat rogo_data_mike_trick.lp rogo.lp
total: 8
got_points:
  5,5,2
  4,3,2
  2,5,1
  2,2,3

path:
  1,2,2
  2,2,3
  3,2,4
  4,2,5
  5,3,5
  6,4,5
  7,5,5
  8,5,4
  9,5,3
  10,4,3
  11,4,2
  12,3,2
Statistics for this solution:
Models      : 1     
  Enumerated: 6
  Optimum   : yes
Optimization: 184 20 
Time        : 0.960
  Prepare   : 0.060
  Prepro.   : 0.020
  Solving   : 0.880
Choices     : 19826
Conflicts   : 16539
Restarts    : 1

Atoms       : 912   
Bodies      : 22839 
Tight       : Yes

  Deleted   : 10406 
Update With the following symmetry breaking added, the problem is solved in 0.58 seconds.

% symmetry breaking: the cell with the lowest coordinates
% should be in the first step
:- path(1, R, C), steps(Step), Step > 1, path(Step, R2, C2),
R*c+C > R2*c+C2.


The statistics for this variant:

Time        : 0.580
  Prepare   : 0.080
  Prepro.   : 0.030
  Solving   : 0.470
Choices     : 8727
Conflicts   : 6914
Restarts    : 2
End of update

Some notes:
One nice feature in Clingo (and lparse) is that it is possible to have many optimization objective. Here we we first maximize the number of points (#maximize [ path(I,R,C) : steps(I) : point(R,C,P) = P@2 ] , and as a second objective (with lower priority @1) we minimize the start cell to start with the cell with the lowest coordinate: #minimize [ path(1,R,C) = R*c+C@1].. Sometimes this is faster than the plain #maximize objective, sometimes not.

The size of "core" of the encoding is quite small. Here is the code with the comments and the helper predicates (for outputs) removed.

rows(1..r).
cols(1..c).
steps(1..max_steps).
adj(R,C, R1,C1) :- rows(R;R1), cols(C;C1), |R-R1| + |C-C1|==1.
0 { path(I, Row, Col) : steps(I) } 1 :- rows(Row), cols(Col).
1 { path(I, Row, Col) : rows(Row) : cols(Col) } 1 :- steps(I).
:- path(1, R1, C1), path(max_steps, R2, C2), not adj(R1,C1,R2,C2).
:- path(I-1,R1,C1), path(I,R2,C2), not adj(R1,C1, R2,C2).
:- path(I, R,C), black(R,C).
#maximize [ path(I,R,C) : steps(I) : point(R,C,P) = P ].

The corresponding "core" of the MiniZinc program (see below for the full code) is larger.

Constraint Programming, MiniZinc

Here is the full MiniZinc code (without data). Compared with the ASP approach, the decision variables are represented in another way: the paths is represented by two arrays x and y, and the points are collected in a separate array (points) so we can simply sum over it for the optimization.

include "globals.mzn";

int: W = 0; % white (empty) cells
int: B = -1; % black cells
int: max_val = max([problem[i,j] | i in 1..rows, j in 1..cols]);

% define the problem
int: rows;
int: cols;
int: max_steps; % max length of the loop
array[1..rows, 1..cols] of int: problem;

% the coordinates in the path
array[1..max_steps] of var 1..rows: x :: is_output;
array[1..max_steps] of var 1..cols: y :: is_output;

% the collected points
int: max_point = max([problem[i,j] | i in 1..rows, j in 1..cols]);
array[1..max_steps] of var 0..max_point : points :: is_output;

% objective: sum of points in the path
int: max_sum = sum([problem[i,j] | i in 1..rows, j in 1..cols where problem[i,j] > 0]);
var 0..max_sum: sum_points :: is_output;

% solve satisfy;
solve maximize sum_points;
% solve :: int_search(x ++ y, first_fail, indomain_min, complete) maximize sum_points;

% all coordinates must be unique
constraint forall(s in 1..max_steps, t in s+1..max_steps) (
x[s] != x[t] \/ y[s] != y[t]
);

% calculate the points (to maximize)
constraint forall(s in 1..max_steps) (
points[s] = problem[x[s], y[s]]
)
/\
sum_points = sum(points);

% ensure that there are no black cells
% in the path
constraint forall(s in 1..max_steps) (
problem[x[s],y[s]] != B
);

% get the path
constraint forall(s in 1..max_steps-1) (
abs(x[s] - x[s+1]) + abs(y[s] - y[s+1]) = 1
)
/\ % close the path around the corner
abs(x[max_steps] - x[1]) + abs(y[max_steps] - y[1]) = 1;


Except for more declaration of the arrays and decision variables, this code don't have more logic than the ASP encoding. However it is more verbose.

The solution for Mike Trick's problem, using LazyFD, takes 1.1 second, slightly slower than using Clingo (see below for more comparison of times):

points = array1d(1..12, [0, 0, 3, 0, 0, 2, 0, 0, 2, 0, 0, 1]);
sum_points = 8;
x = array1d(1..12, [2, 2, 2, 3, 3, 4, 5, 5, 5, 4, 3, 2]);
y = array1d(1..12, [4, 3, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5]);


Update
After some thoughts I decided to try the same symmetry breaking that was an option the ASP encoding. It is implemented in rogo2.mzn and use the following extra constraint that ensure that the cell with the lowest coordinate is in the first step. % symmetry breaking: the cell with lowest coordinates % should be in the first step. constraint forall(i in 2..max_steps) ( x[1]*cols+y[1] < x[i]*cols+y[i] );

With this model, LazyFD solves Mike Trick's problem in 0.627 seconds. Also see under "Comparison" below.

End of update

Comparison

I was curious how well the systems should do so here is a "recreational" comparison. Please don't read too much into it:
  • it is just 3 problems.
  • there is probably better heuristics for both Clingo and Gecode/fz
The following 3 problem instances was used in the test. Unfortunately, I have not found any direct links for the two latter instances (see below for links to my encodings). Instead a variant of the coding used in MiniZinc is shown, where "_" ("W" in the MiniZinc code) is white/blank, "B" is a black cell to avoid, and a number represent a point of the cell.
  • Mike Trick's example. 5 rows, 9 column, 12 steps; good: 6, best: 8.
    %1 2 3 4 5 6 7 8 9  
     2,_,_,_,_,_,_,_,_, % 1
     _,3,_,_,1,_,_,2,_, % 2
     _,_,_,_,_,_,B,_,2, % 3
     _,_,2,B,_,_,_,_,_, % 4
     _,_,_,_,2,_,_,1,_, % 5
    
  • The Paper Rogo puzzle from Creative Heuristics Ltd for 20110106. 9 rows, 7 columns, 16 steps; good: 28, best: 31 points.
     %1 2 3 4 5 6 7
      B,_,6,_,_,3,B, % 1
      2,_,3,_,_,6,_, % 2
      6,_,_,2,_,_,2, % 3
      _,3,_,_,B,B,B, % 4
      _,_,_,2,_,2,B, % 5
      _,_,_,3,_,_,_, % 6
      6,_,6,B,_,_,3, % 7
      3,_,_,_,_,_,6, % 8
      B,2,_,6,_,2,B, % 9
    
  • The Paper Rogo puzzle from Creative Heuristics Ltd for 20110107. 12 rows, 7 columns, 16 steps; good: 34 points, best: 36 points.
     %1 2 3 4 5 6 7
      4,7,_,_,_,_,3, % 1
      _,_,_,_,3,_,4, % 2
      _,_,4,_,7,_,_, % 3
      7,_,3,_,_,_,_, % 4
      B,B,B,_,3,_,_, % 5
      B,B,_,7,_,_,7, % 6
      B,B,_,_,_,4,B, % 7
      B,4,4,_,_,_,B, % 8
      B,_,_,_,_,3,B, % 9
      _,_,3,_,4,B,B, % 10
      3,_,_,_,_,B,B, % 11
      7,_,7,4,B,B,B  % 12
    
For the ASP encoding I used clingo (a combined grounder and solver) and the parameter --heuristi=Vmtf after some minimal testing with different parameters. For MiniZinc, both solvers LazyFD and Gecode/fz where used and I settled with the search heuristic solve minimize sum_points which seems to be fast enough for this experiment. Note that LazyFD tend to behave best without any explicit search heuristics . (Also: there is always a better search heuristics that the one you settle with.)

The time reported is the total time, i.e. including the grounding time/convert to FlatZinc.

Update I have added the second version of the MiniZinc model with the added symmetry breaking constraint as a separate entry: End of update

Mike Trick problem

Clingo: 0.96 seconds, 19826 choices, 16539 conflicts, 1 restart.
Clingo with symmetry breaking: 58 seconds, 8727 choices, 6914 conflicts, 2 restarts.
LazyFD: 1.1 second. (failure is not reported)
LazyFD with symmetry breaking: 0.6 seconds (failure is not reported)
Gecode/fz: 2.62s, 92113 failures2577853 failures
Gecode/fz: with symmetry breaking: 0.4 seconds, 9418 failures

20110106 problem

Clingo: 1:57.07 minutes, 1155290 choices, 1044814 conflicts, 1 restart
Clingo with symmetry breaking: 20.4 seconds, 157146 choices, 135178 conflicts, 3 restarts
LazyFD: 2:58 minutes
LazyFD with symmetry breaking: 19.9 seconds (failure is not reported)
Gecode/fz: 58.6 seconds, 1380512 failures
Gecode/fz with symmetry breaking: 7.8 seconds, failures

20110107 problem

Clingo: 3:13.72 1541808 choices, 1389396 conflicts, 1 restart
Clingo with symmetry breaking: 31.6 seconds 178301 choices, 151439 conflicts, 1 restart
LazyFD: 2:55.18 minutes
LazyFD with symmetry breaking: 44.5 seconds (failure is not reported)
Gecode/fz: 1:54.50 minutes 2577853 failures
Gecode/fz with symmetry breaking: 11.3 seconds, failures

Here we see that Gecode/fz without (symmetry breaking) is the fastest for the two larger problems (but the slowest on the first), and both Clingo and LazyFD was each placed second in the harder problems. So, I'm not really sure we can draw any real conclusions from this small test.

Update With symmetry breaking added the result is more unanimous. All three solvers benefit much from this. Gecode/fz is faster on all three problems, and the other two still one second place each. We also see how much symmetry breaking can do. End of update

Some comments/conclusions

Here are some general comments about the problem and the coding.

ASP

As mentioned above, my initial assumption of the problem was that the given number of steps was not always the best path length, and trying to code this was not easy. After a long time sitting with this approach (say 6 hours coding time?), I re-read the description and couldn't find any real support for this assumption, so I skipped it in favor for the "fixed length" approach.

To get to the final version it took several hours, say 9-10 hours total coding (in smaller sessions over several days). This includes time for coding the initial assumption/interpretation. It also includes the time trying to get the first assumption approach to work with the incremental solver iclingo which - I hoped - should first try the longest length and if that gave no solution it should try at the the lengths in decrement of 1; but I didn't get this to work.

As usual I realized that several of the predicates I first thought was needed could just be thrown away. An example is this snipped to ensure "positive" that there is a path between the cell R1,C1 and the cell R2,C2.

{ x(I-1, R1, C1) : adj(R1,C1, R2,C2) : rows(R1;R2) : cols(C1;C2) : not black(R1,C1) : not black(R2,C2) } max_steps :- x(I, R2, C2), steps(I).

Instead it could be replaced with :- x(I, R,C), black(R,C).
:- x(I-1,R1,C1), x(I,R2,C2), not adj(R1,C1, R2,C2).


Well, I hope that with time I'll recognize these things faster.

MiniZinc

It took about 30 minutes to code the MiniZinc model (after a unnecessarily long detour of debugging bad thinking and a bug in the data). The reason for this much shorter time is two-fold:
  • It's much easier to code something when the general strategy (approach) of the problem is known and has been coded in another system. All the (de)tours when writing the ASP version made much more comfortable with the given problem.
  • I'm much more comfortable coding in MiniZinc than in ASP for two reasons: 1) I have programmed in MiniZinc much longer than in ASP. 2) I am also more used with the high-level CP approach of stating the requirements/constraints with traditional loops and using arrays/matrices.
Programs and data files Here are all the files used, both program and data files for the two systems. ASP (Clingo)
MiniZinc