My Constraint Programming Blog: At last, a Nonogram solver using regular constraint in MiniZinc

Here it is: nonogram_regular.mzn, a MiniZinc solver for Nonogram problems, using regular constraint.

In Comet version 2.0 released I wrote about the rewritten automata handler for the new Comet model nonogram_automaton.co (using the built-in constraint regular and helper function Automaton, both new in Comet version 2.0). This inspired me to finish the project of a "real" Nonogram solver for MiniZinc which use the regular constraint instead of the old (very slow) model nonogram.mzn.

Since MiniZinc has a built-in regular constraint, the hardest part was to create an automaton (finite state machine) given a Nonogram pattern. To be honest, I didn't write it purely in MiniZinc; instead a Perl program was written for this conversion (see below for more information about this program). Update: Well, I do have a version fully written in MiniZinc, nonogram_create_automaton.mzn, but it is not fast enough to be really interesting (must faster than the old version, nonogram.mzn, though). End of update this time.

The conversion pattern -> automaton is the same as described in Comet: regular constraint, a much faster Nonogram with the regular constraint, some OPL models, and more. To quote verbatim:

For the Nonogram clue [3,2,1] - which represents the regular expression 
"0*1110*110*10*" - the following automata (transition matrix) is 
generated:
1 2
0 3
0 4
5 0
5 6
0 7
8 0
8 9
9 0

Note that the regular function uses 0 (zero) as the failing state, so the states 
must start with 1..

Results: P200

In fact, the only model I was really curious about was the P200 problem since it has been the one that was the challenge for the Comet Nonogram solver. See the following posts for more about this quest:

First, here is a picture of the solution of the P200 problem (nonogram_p200_aut.dzn, generated by the minizinc solver (currently the only solver that use output as formatting output):

 ##   ##     ###
####  # #    # ####
 #### # ##   #     #
####  # # #  #  #  #
 ##   # # ##  ###  #####
      #   # #   #  ##  #
     ###  # #####  #  ##
   ### ##  ##    # ##  ##
  ## #  ####   # # #    #
 ##      ## # ## #     ##
 #  #     # ### ##   ###
 #  # ##  #######  ###
 # ##     ## # #####
 ###  ## ##  #  ##
  ###   ##   #   ##
    #####    #    ##
  ##   ##    #     ##
 ####   ##   #    ##
######   ##  ### ##
#######   #### ###  ##
 #######    ####   ####
#######      #      ####
######       #     ####
 ####       ##      ##
  ##         #

P200: How does different solver do?

Here is a small benchmark for solving the P200 problem with different solvers. It was run on a Linux machine (Mandriva), Dual CPU 3.40GHz, 2Gb memory. "Runtime" is the total runtime, including converting from MiniZinc model to FlatZinc, startup, and also small time for running a wrapper program (where I can choose solver etc). Also, the solvers searched through the whole search tree for a solution (which happens to be exactly one).

All versions of the solvers was "the latest" as of 2009-09-08, i.e. the versions from respective CVS, SVN, or release of the day.

The search strategy used was first_fail where the columns (j) is labeled before rows (i):

solve :: int_search(
        [x[i,j] | j in 1..col_max, i in 1..row_max],
        first_fail,
        indomain_min,
        complete
        ) 
    satisfy;

For the MiniZinc/lazy solver, I also tested with solve satisfy since it often do very well without any specific search heuristics.

Gecode/FlatZinc: runtime 1.0s, solvetime 0.2s, failures 520 (also see below)
MiniZinc/minizinc: runtime 9s, 7697 choice points
MiniZinc/lazy: runtime 6s, choice points 2572
MiniZinc/lazy (with solve satisfy): runtime 13s, choice points 0
MiniZinc/fd: runtime 14s, choice points 11615
MiniZinc/fdmip: runtime 14s, choice points 11615
ECLiPSe/ic: runtime 109s
ECLiPSe/fd: runtime 34s

The Gecode/FlatZinc solver was by far the fastest. Here is the full statistics (for one random instance):

runtime:       0.207 (207.693000 ms)
solvetime:     0.198 (198.988000 ms)
solutions:     1
variables:     625
propagators:   50
propagations:  22940
nodes:         1041
failures:      520
peak depth:    22
peak memory:   1220 KB
mzx.pl nonogram_regular.mzn --fz --data nonogram_p200_aut.dzn  0,96s user 0,10s system 98% cpu 1,078 total

For comparison, here is the statistics for running the Comet model nonogram_automaton.co:

time:      459
#choices = 520
#fail    = 794
#propag  = 693993
comet nonogram_automaton.co  1,32s user 0,09s system 90% cpu 1,558 total

And, finally, the statistics for the Gecode (C++) program nonogram solving the P200 problem:

$ time nonogram -solutions 0 9
# ....
Initial
        propagators:  50
        branchings:   25

Summary
        runtime:      1.342 (1342.926000 ms)
        solutions:    1
        propagations: 35728
        nodes:        1409
        failures:     704
        peak depth:   25
        peak memory:  1027 KB
nonogram -solutions 0 9  1,45s user 0,06s system 99% cpu 1,517 total

It seems that the Gecode/FlatZinc version compares quite good.

For completion, here is the time for generating the automata of the P200 problem using the Perl program:

$ time perl make_nonogram_automata.pl nonogram_p200.dzn 1 > nonogram_p200_aut.dzn
0,20s user 0,02s system 96% cpu 0,233 total

Model and problem instances

Below are the problem instances as MiniZinc data file (.dzn), all the Nonogram problems listed on My Comet page, and some more. For each instance there are two variants: the "normal" version (which is the input file of the transformation) called "nonogram_name.dzn", and the generated automata version, called "nonogram_name_aut.mzn". It is the latter version ("_aut") that is used with nonogram_regular.mzn.

nonogram_regular.mzn: The actual MiniZinc model.
Bear: nonogram_bear_aut.dzn nonogram_bear.dzn
Car: nonogram_car_aut.dzn nonogram_car.dzn
Castle: nonogram_castle_aut.dzn nonogram_castle.dzn
Crocodile: nonogram_crocodile_aut.dzn nonogram_crocodile.dzn
Difficult: nonogram_difficult_aut.dzn nonogram_difficult.dzn
Dragonfly: nonogram_dragonfly_aut.dzn nonogram_dragonfly.dzn
Griddler: nonogram_griddler_aut.dzn nonogram_griddler.dzn
Hard: nonogram_hard_aut.dzn nonogram_hard.dzn
Hen: nonogram_hen_aut.dzn nonogram_hen.dzn
House: nonogram_house_aut.dzn nonogram_house.dzn
Lambda: nonogram_lambda_aut.dzn nonogram_lambda.dzn
n2: nonogram_n2_aut.dzn nonogram_n2.dzn
n3: nonogram_n3_aut.dzn nonogram_n3.dzn
n4: nonogram_n4_aut.dzn nonogram_n4.dzn
n5: nonogram_n5_aut.dzn nonogram_n5.dzn
n6: nonogram_n6_aut.dzn nonogram_n6.dzn
Nonunique: nonogram_nonunique_aut.dzn nonogram_nonunique.dzn
Optima: nonogram_optima_aut.dzn nonogram_optima.dzn
P199: nonogram_p199_aut.dzn nonogram_p199.dzn
P200: nonogram_p200_aut.dzn nonogram_p200.dzn
ps: nonogram_ps_aut.dzn nonogram_ps.dzn
Simple: nonogram_simple_aut.dzn nonogram_simple.dzn
Simple2: nonogram_simple2_aut.dzn nonogram_simple2.dzn
Soccer player: nonogram_soccer_player_aut.dzn nonogram_soccer_player.dzn
t1: nonogram_t1_aut.dzn nonogram_t1.dzn
t3: nonogram_t3_aut.dzn nonogram_t3.dzn

The program for converting to automata

As noted above, the Perl program make_nonogram_automata.pl converts a Nonogram problem instannce in "normal" pattern (clue) format to a format using automata.

The requirement of the indata file is semi-strict: the names must be the one in the example below, and each patterns must be on a separate line. The program use regular expressions to extract the information and can handle some variants in the format. The result is printed to standard output.

%% This is the problem instance of 'Hen', in "normal" pattern format.
%% Comments is keeped as they are
rows = 9;
row_rule_len = 2;
row_rules = array2d(1..rows, 1..row_rule_len,
       [0,3, 
        2,1, 
        3,2, 
        2,2, 
        0,6, 
        1,5, 
        0,6, 
        0,1, 
        0,2]);

   cols = 8;
   col_rule_len = 2;
   col_rules = array2d(1..cols, 1..col_rule_len,
       [1,2,
        3,1, 
        1,5, 
        7,1, 
        0,5, 
        0,3, 
        0,4, 
        0,3]);

To run the program:

$ perl make_nonogram_automata.pl nonogram_hen.dzn 1 > nonogram_hen_aut.dzn

The second parameter parameter 1 gives some more debugging info. The result is nonogram_hen_aut.dzn.

Minor note: For easy of debugging and further developments I decided to keep the 1-based version of arrays from the Comet model (Perl is 0-based) which made the code somewhat uglier.