« Comet version 2.0 released | Main | At last 2: A Nonogram solver using regular written in "all MiniZinc" »

At last, a Nonogram solver using regular constraint in MiniZinc

Here it is: nonogram_regular.mzn, a MiniZinc solver for Nonogram problems, using regular constraint.

In Comet version 2.0 released I wrote about the rewritten automata handler for the new Comet model nonogram_automaton.co (using the built-in constraint regular and helper function Automaton, both new in Comet version 2.0). This inspired me to finish the project of a "real" Nonogram solver for MiniZinc which use the regular constraint instead of the old (very slow) model nonogram.mzn.

Since MiniZinc has a built-in regular constraint, the hardest part was to create an automaton (finite state machine) given a Nonogram pattern. To be honest, I didn't write it purely in MiniZinc; instead a Perl program was written for this conversion (see below for more information about this program). Update: Well, I do have a version fully written in MiniZinc, nonogram_create_automaton.mzn, but it is not fast enough to be really interesting (must faster than the old version, nonogram.mzn, though). End of update this time.

The conversion pattern -> automaton is the same as described in Comet: regular constraint, a much faster Nonogram with the regular constraint, some OPL models, and more. To quote verbatim:
For the Nonogram clue [3,2,1] - which represents the regular expression 
"0*1110*110*10*" - the following automata (transition matrix) is 
generated:
1 2
0 3
0 4
5 0
5 6
0 7
8 0
8 9
9 0

Note that the regular function uses 0 (zero) as the failing state, so the states 
must start with 1..

Results: P200

In fact, the only model I was really curious about was the P200 problem since it has been the one that was the challenge for the Comet Nonogram solver. See the following posts for more about this quest: First, here is a picture of the solution of the P200 problem (nonogram_p200_aut.dzn, generated by the minizinc solver (currently the only solver that use output as formatting output):
 ##   ##     ###
####  # #    # ####
 #### # ##   #     #
####  # # #  #  #  #
 ##   # # ##  ###  #####
      #   # #   #  ##  #
     ###  # #####  #  ##
   ### ##  ##    # ##  ##
  ## #  ####   # # #    #
 ##      ## # ## #     ##
 #  #     # ### ##   ###
 #  # ##  #######  ###
 # ##     ## # #####
 ###  ## ##  #  ##
  ###   ##   #   ##
    #####    #    ##
  ##   ##    #     ##
 ####   ##   #    ##
######   ##  ### ##
#######   #### ###  ##
 #######    ####   ####
#######      #      ####
######       #     ####
 ####       ##      ##
  ##         #

P200: How does different solver do?

Here is a small benchmark for solving the P200 problem with different solvers. It was run on a Linux machine (Mandriva), Dual CPU 3.40GHz, 2Gb memory. "Runtime" is the total runtime, including converting from MiniZinc model to FlatZinc, startup, and also small time for running a wrapper program (where I can choose solver etc). Also, the solvers searched through the whole search tree for a solution (which happens to be exactly one).

All versions of the solvers was "the latest" as of 2009-09-08, i.e. the versions from respective CVS, SVN, or release of the day.

The search strategy used was first_fail where the columns (j) is labeled before rows (i):
solve :: int_search(
        [x[i,j] | j in 1..col_max, i in 1..row_max],
        first_fail,
        indomain_min,
        complete
        ) 
    satisfy;
For the MiniZinc/lazy solver, I also tested with solve satisfy since it often do very well without any specific search heuristics.
  • Gecode/FlatZinc: runtime 1.0s, solvetime 0.2s, failures 520 (also see below)
  • MiniZinc/minizinc: runtime 9s, 7697 choice points
  • MiniZinc/lazy: runtime 6s, choice points 2572
  • MiniZinc/lazy (with solve satisfy): runtime 13s, choice points 0
  • MiniZinc/fd: runtime 14s, choice points 11615
  • MiniZinc/fdmip: runtime 14s, choice points 11615
  • ECLiPSe/ic: runtime 109s
  • ECLiPSe/fd: runtime 34s
The Gecode/FlatZinc solver was by far the fastest. Here is the full statistics (for one random instance):
runtime:       0.207 (207.693000 ms)
solvetime:     0.198 (198.988000 ms)
solutions:     1
variables:     625
propagators:   50
propagations:  22940
nodes:         1041
failures:      520
peak depth:    22
peak memory:   1220 KB
mzx.pl nonogram_regular.mzn --fz --data nonogram_p200_aut.dzn  0,96s user 0,10s system 98% cpu 1,078 total
For comparison, here is the statistics for running the Comet model nonogram_automaton.co:
time:      459
#choices = 520
#fail    = 794
#propag  = 693993
comet nonogram_automaton.co  1,32s user 0,09s system 90% cpu 1,558 total
And, finally, the statistics for the Gecode (C++) program nonogram solving the P200 problem:
$ time nonogram -solutions 0 9
# ....
Initial
        propagators:  50
        branchings:   25

Summary
        runtime:      1.342 (1342.926000 ms)
        solutions:    1
        propagations: 35728
        nodes:        1409
        failures:     704
        peak depth:   25
        peak memory:  1027 KB
nonogram -solutions 0 9  1,45s user 0,06s system 99% cpu 1,517 total
It seems that the Gecode/FlatZinc version compares quite good.

For completion, here is the time for generating the automata of the P200 problem using the Perl program:
$ time perl make_nonogram_automata.pl nonogram_p200.dzn 1 > nonogram_p200_aut.dzn
0,20s user 0,02s system 96% cpu 0,233 total

Model and problem instances

Below are the problem instances as MiniZinc data file (.dzn), all the Nonogram problems listed on My Comet page, and some more. For each instance there are two variants: the "normal" version (which is the input file of the transformation) called "nonogram_name.dzn", and the generated automata version, called "nonogram_name_aut.mzn". It is the latter version ("_aut") that is used with nonogram_regular.mzn.

The program for converting to automata

As noted above, the Perl program make_nonogram_automata.pl converts a Nonogram problem instannce in "normal" pattern (clue) format to a format using automata.

The requirement of the indata file is semi-strict: the names must be the one in the example below, and each patterns must be on a separate line. The program use regular expressions to extract the information and can handle some variants in the format. The result is printed to standard output.
%% This is the problem instance of 'Hen', in "normal" pattern format.
%% Comments is keeped as they are
rows = 9;
row_rule_len = 2;
row_rules = array2d(1..rows, 1..row_rule_len,
       [0,3, 
        2,1, 
        3,2, 
        2,2, 
        0,6, 
        1,5, 
        0,6, 
        0,1, 
        0,2]);

   cols = 8;
   col_rule_len = 2;
   col_rules = array2d(1..cols, 1..col_rule_len,
       [1,2,
        3,1, 
        1,5, 
        7,1, 
        0,5, 
        0,3, 
        0,4, 
        0,3]);
To run the program:
$ perl make_nonogram_automata.pl nonogram_hen.dzn 1 > nonogram_hen_aut.dzn
The second parameter parameter 1 gives some more debugging info. The result is nonogram_hen_aut.dzn.

Minor note: For easy of debugging and further developments I decided to keep the 1-based version of arrays from the Comet model (Perl is 0-based) which made the code somewhat uglier.