### Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second

In *Comet: regular constraint, a much faster Nonogram with the regular constraint, some OPL models, and more* I reported the running time of the Nonogram P200 problem (one of the hardest standard problems) as 1:30 minutes with the Comet model nonogram_regular.co (using the `regular` constraint).

In the last paragraph of the Nonogram section, I noted: *Maybe some creative labeling or a "proper" regular constraint can fasten things up...*

Well, it was the "creative labeling" part that did it. After some help from Pascal Van Hentenryck (one of the creators of Comet), the time for generating all solutions of P200 (the solution is unique) went down to about 1 second.

The statistics for the full search of P200 are shown below. Note: the unit for "time" is milliseconds, and it reports the solving time excluding the startup of Comet itself, i.e. 0.621 seconds.

```
num_solutions: 1
time: 621
#choices = 520
#fail = 1040
#propag = 1213237

comet -j2 nonogram_regular.co  1,48s user 0,10s system 99% cpu 1,582 total
```

For comparison, the Gecode program nonogram displays one solution in about the same time, but searching the complete search tree takes about 45 seconds, using the following command line options:

`nonogram -pk speed -solutions 0 8`

Maybe there are some more options that will make the Gecode program run faster.

### What was changed?

Pascal showed me two improvements (annotated by "Pascal's improvement" in nonogram_regular.co):

The first change didn't make much of a difference. It was a change in the `regular` function. The following

```
forall(i in x_range) {
  cp.inside(x[i], 1..S);                        // Do this in case it's a var.
  cp.post(a[i+1] == d2[a[i], x[i]], onDomains); // Determine a[i+1].
}
```

was changed into

```
forall(i in x_range) {
  cp.post(x[i] >= 1);                           // hakank: Pascal's improvement
  cp.post(x[i] <= S);                           // hakank: Pascal's improvement
  cp.post(a[i+1] == d2[a[i], x[i]], onDomains); // Determine a[i+1].
}
```
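The constraint being unrolled here is the classic DFA decomposition of `regular`: each state variable `a[i+1]` is the transition from `a[i]` on symbol `x[i]`. As a rough illustration of the idea (in Python, not Comet; the clue-to-DFA construction below is my own sketch, not the automaton from the model), here is how a nonogram row clue becomes a DFA that accepts exactly the valid rows:

```python
# Sketch (Python, not Comet) of the DFA idea behind the regular constraint.
# A row clue like [2, 1] corresponds to the regular expression 0* 1{2} 0+ 1{1} 0*
# over symbols 0 = blank, 1 = filled (the Comet model uses 1..2 instead).
# The unrolled constraint a[i+1] == d2[a[i], x[i]] is exactly this DFA walk.

def clue_to_dfa(clue):
    """Return (trans, accepting) for a nonogram row clue.
    trans[s][sym] is the next state, or None if the symbol is forbidden."""
    # Mandatory pattern: each block of filled cells, with exactly one
    # separating blank between consecutive blocks.
    pattern = []
    for i, c in enumerate(clue):
        pattern += [1] * c
        if i < len(clue) - 1:
            pattern.append(0)
    n = len(pattern)  # state s means "matched pattern[:s]"
    trans = {}
    for s in range(n):
        if pattern[s] == 1:
            # Extra blanks are only allowed before a block has started.
            stay = s if (s == 0 or pattern[s - 1] == 0) else None
            trans[s] = {0: stay, 1: s + 1}
        else:  # mandatory separator blank
            trans[s] = {0: s + 1, 1: None}
    trans[n] = {0: n, 1: None}  # trailing blanks
    return trans, {n}

def accepts(clue, row):
    """Walk the DFA over the row, failing on any forbidden symbol."""
    trans, accepting = clue_to_dfa(clue)
    s = 0
    for sym in row:
        s = trans[s][sym]
        if s is None:
            return False
    return s in accepting
```

For example, `accepts([2, 1], [0, 1, 1, 0, 1])` is `True`, while `accepts([2, 1], [1, 1, 1, 0, 1])` is `False` because the first block is too long.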

The second change, however, is what made the real difference. In the `using` section, the labeling (search strategy) was originally stated as

```
forall(i in 1..rows, j in 1..cols : !board[i,j].bound()) {
  tryall(v in 1..2)
    m.post(board[i,j] == v, onDomains);
}
```

Pascal suggested that `j in 1..cols` and `i in 1..rows` should switch places, and this is what gave the whole improvement.

The labeling is now as follows, i.e. it starts with the smaller "dimension" (rows or cols).

```
if (rows * row_rule_len < cols * col_rule_len) {
  forall(i in 1..rows, j in 1..cols : !board[i,j].bound()) {
    tryall(v in 1..2) by (-v)
      m.label(board[i,j], v);
  }
} else {
  forall(j in 1..cols, i in 1..rows : !board[i,j].bound()) {
    tryall(v in 1..2) by (-v)
      m.label(board[i,j], v);
  }
}
```
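The dimension test is a simple static variable-ordering heuristic: put the "smaller" (more quickly exhausted) dimension in the outer loop. A rough Python sketch of that selection logic (the names mirror the Comet model, but this is an illustration, not Comet code):

```python
def branching_order(rows, cols, row_rule_len, col_rule_len):
    """Return the cell labeling order used by the improved model:
    row-major if the row dimension is 'smaller' (weighted by rule length),
    column-major otherwise. Indices are 1-based, as in the Comet model."""
    if rows * row_rule_len < cols * col_rule_len:
        return [(i, j) for i in range(1, rows + 1) for j in range(1, cols + 1)]
    return [(i, j) for j in range(1, cols + 1) for i in range(1, rows + 1)]
```

For instance, `branching_order(2, 2, 1, 5)` gives the row-major order `[(1, 1), (1, 2), (2, 1), (2, 2)]`, while swapping the rule lengths gives the column-major order. The point of the post is that this ordering alone, with the same constraints, cut the search from minutes to a second.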

Thanks again Pascal for your help and inspiration!

## Comments

There are no options in the Gecode nonogram example that make P200 run faster. The difference is really in the branching strategy: the standard Gecode model needs 142167 failures to find all solutions, which is 136 times as many failures as with your improved branching.

When comparing different constraint programming models in different systems, ensuring that the same search space is searched is very important, since with different search trees any number of odd effects might happen. An excellent paper (IMHO) on CP benchmarking is "On Benchmarking Constraint Logic Programming Platforms[...]" by Wallace, Schimpf, Shen, and Harvey.[1]

I'm not sure how an equivalent search strategy would look in Gecode. A quick fix gave a branching where 740 failures were needed (taking around 0.29 seconds on my machine). YMMV.

In general, it might be even more interesting to order the branching on the columns and rows individually, based on the length of each row/col and the constraint for that row/col. To measure the weight of a constraint, one could use its number of hints as well as the total slack allowed (computed, for example, from the amount of white space allowed). Whether to place empty or marked cells first might also be chosen individually, to give the smallest amount of slack.
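The slack measure suggested here can be made concrete: for a line of length n with hints c_1..c_k, the blocks need at least sum(c) cells plus k-1 separating blanks, and the slack is whatever is left over. A small Python sketch of that computation (my formulation of the idea, not code from either model):

```python
def slack(length, hints):
    """Free space left in a line after placing all hinted blocks with the
    minimal one-blank separation between consecutive blocks."""
    if not hints:
        return length  # the whole line is blank
    minimal = sum(hints) + (len(hints) - 1)
    return length - minimal
```

For example, `slack(10, [2, 1])` is 6, while `slack(4, [2, 1])` is 0: a zero-slack line is fully determined, so branching on low-slack lines first tends to prune the search early.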

[1] http://ai.uwaterloo.ca/~vanbeek/Constraints/vol09.html

Posted by: Mikael Zayenz Lagerkvist | March 6, 2009 08:03 AM

Mikael: Thanks for your comments.

I will read the paper and hopefully do better comparisons in the future. :-)

The reason I wrote about Gecode is that it is the fastest constraint programming system I know of and have installed on my computer, and its running time on the full tree search was significantly slower than the Comet model's.

On all the other problems I tested, Gecode searches the full tree in about a second, so the slow time surprised me.

Today I compared the "difficult" problem (nonogram_difficult.co).

The statistics for Comet (full tree search) are:

```
num_solutions: 1
time: 1513
#choices = 5303
#fail = 10606
#propag = 3884317

comet -j2 nonogram_regular.co  2,20s user 0,10s system 95% cpu 2,422 total
```

The statistics for Gecode:

```
runtime:      0.540 (540.000000 ms)
solutions:    1
propagations: 203826
failures:     4994
clones:       4994
commits:      17778
peak memory:  611 KB

nonogram -solutions 0 5  0,54s user 0,00s system 99% cpu 0,549 total
```

I'm running Mandriva Linux on a 3.40 GHz machine with 2 GB RAM. Comet is version 1.1-rev2 and Gecode is version 2.2.0.

About the ordering of rows/columns: funny, I was thinking along roughly the same lines yesterday, but have not implemented the idea yet.

Note: visit Mikael's home page for many interesting papers.

Posted by: hakank | March 6, 2009 09:04 PM

A remark: Gecode's Nonogram model using the same labeling principle as above is now in the SVN trunk (which will be version 3.0.0).

It's fast now. Here are the statistics for searching for all solutions of the P200 problem (on a 32-bit Linux Pentium, 3.4 GHz, with 2 GB RAM):

```
runtime:      0.640 (640.000000 ms)
solutions:    1
propagations: 35728
nodes:        1409
failures:     704
peak depth:   25
peak memory:  963 KB

examples/nonogram -solutions 0 9  0,72s user 0,06s system 98% cpu 0,792 total
```

Posted by: hakank | March 11, 2009 08:15 PM