My Constraint Programming Blog: JaCoP: a request from the developers (Knapsack and Geost) and Nonogram labeling

Here are some JaCoP related news.

A request from the developers: Knapsack and Geost

This is a request from the JaCoP developers, from the article Geost and Knapsack constraint:

Dear all, We have tested intiial but already optimized implementations of Geost constraint as well as Knapsack constraint. We need testers who will use and test the constraint in their problems. Feel free to contact me by email ( radoslaw [dot] szymanek {at} gmail (dot) com) to get source code for any of those two constraints. Geost constraint has been shown to be even 1000 times faster for problems of size 1000+ objects than other open-source implementation available. Knapsack constraint is also very efficient as it uses the sublinear algorithm presented in the literature. We have extended Knapsack constraint to be able to handle 0..n quantity variables as well knapsack capacity and profit specified as finite domain variables and not just integers. Geost constraint is a result of Master thesis done by Marc-Olivier Fleury. Knapsack constraint is based on the student project completed by Wadeck Follonier. best regards, Radoslaw Szymanek

Nonogram

In the latest version of JaCoP (2.3.2, from May 2009) there are 151 test instances for the Nonogram model. One of these is the well known (infamous) problem P200, which I have written about before, for example in Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second, where I - aided by Pascal Van Hentenryck - happened to find a labeling that was quite good for the P200 problem (and not very bad for the other instances).

The labeling used in the JaCoP distribution of Nonogram.java is (according to the source) the "Zigzag based variable ordering". I thought it would be fun to compare it with the labeling used in the Comet model (also later used in the Gecode model).

The "new" labeling

The "new" labeling described in Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second is simply to order the variables according to the length of the row rules and column rules:

if (board.length*row_rules.length <  board[0].length*col_rules.length) {
    // i first
    for (int i = 0; i < row_rules.length; i++) {
	for (int j = 0; j < col_rules.length; j++) {
            vars.add(board[i][j]);
	}
    }
} else {
    // j first
    for (int j = 0; j < col_rules.length; j++) {
        for (int i = 0; i < row_rules.length; i++) {
            vars.add(board[i][j]);
	}
    }
}

Here are the statistics of running the P200 problem with the two different labelings. P200 is the first (and hard coded) problem presented in Nonogram.java.

Original labeling:

Nodes : 6472
Decisions : 3236
Wrong Decisions : 3236
Backtracks : 3236
Max Depth : 38

...

Execution time = 3135 ms

The "New" labeling:

Nodes : 1040
Decisions : 520
Wrong Decisions : 520
Backtracks : 520
Max Depth : 22

...

Execution time = 638 ms

So, for P200 this new labeling was an improvement: nearly 5 times better time and about 6 times fewer backtracks.

Running all instances

But how good is the new labeling overall? As noted above, there are 150 test instances (plus P200). I ran all of them and compared the individual results with the two labelings. Note that the instance 83 is the "alien waving problem" instance which has a lot of solutions; I changed the code so it just finds one solution. For the rest of the instances the solving time is for finding all solutions (which happens to be unique).

All the times below are the reported time from the Nonogram model, excluding time for printing the model and the solution. It should also be noted that the time is from one single run.

Total solving time
Running all the 150 + 1 problem instances, the total solving time was:

Total original: 6102
Total new     : 5462
Total diff    : 640

The total of differences means that the newer labeling has better (total) solving time, about 0.5 second.

Individual instances
For a complete listing of the test instances see nonogram_all.txt. Here we will concentrate on a few cases.

Below is the times of the instances where the differences between the two labelings where larger than 10ms, what I call "dominating" instances:

38: orig:34 new:21 diff:13
51: orig:11 new:35 diff:-24
67: orig:35 new:10 diff:25
84: orig:1222 new:963 diff:259
91: orig:872 new:2800 diff:-1928
p200: orig:3006 new:700 diff:2306

As we can see, there are three problem that dominates the total times: 84, 91, and P200. Note that for one of these (instance #91) the new labeling is actually worse.

Here are all differences grouped depending on which labeling was the best. Last (in parenthesis) is the number of instances in each group:

original_best diffs: 
3 1 1 1 10 2 3 2 2 1 1 3 3 2 2 2 1 1 24 1 3 4 2 2 2 1 1 9 2 1 1928 1 1 1 2 1 1 1 2 3 1 2 1 2 1 1 2 1 3 (49)
new_best diffs     : 
1 3 1 2 1 3 4 3 3 4 1 3 1 1 2 1 13 1 1 2 2 1 3 1 1 25 2 1 3 259 1 3 3 1 2 2 3 3 3 1 1 3 1 1 1 3 1 2306  (48)
same               : 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 (55)

Summary

To summarise: the "new" labeling is overall quite good, but not better than the original labeling in all cases. It was better for two of the three dominating problem instances, and worse on one. Well, as always, there is no silver bullet.

The dominating instances

I don't know the name of the two dominating instances #84 and #91 so here they are, together with the infamous P200, using the output from JaCoP's Nonogram model.

Problem P200
 00   00     000
0000  0 0    0 0000
 0000 0 00   0     0
0000  0 0 0  0  0  0
 00   0 0 00  000  00000
      0   0 0   0  00  0
     000  0 00000  0  00
   000 00  00    0 00  00
  00 0  0000   0 0 0    0
 00      00 0 00 0     00
 0  0     0 000 00   000
 0  0 00  0000000  000
 0 00     00 0 00000
 000  00 00  0  00
  000   00   0   00
    00000    0    00
  00   00    0     00
 0000   00   0    00
000000   00  000 00
0000000   0000 000  00
 0000000    0000   0000
0000000      0      0000
000000       0     0000
 0000       00      00
  00         0

Problem 84
                 0
     0           00
     00         000
     000       00 0
     0 00     000 0
     00 00   000 00
      00 00 0000 00  00
      000 00000  00   00
       00 00000 000 0 0
       000 0000 0000 00
 0000000000 00 0000 00
  0  0000000 0 000 000
   00 0000000 000 000
    00 000000 00 0000
     00 0000 00 000000000
00000000 000 0 00000  0
  0000000 00  0000  00
   0000000   00   000
    0000000 00 00000
     000000   0000
       000 00000000
         000000000000
      0      000000
       0
        0

Problem 91
      0      0           
      00    00           
      00000000           
     00  00  00          
    00        00         
    00  0  0  00         
    00        00         
   00000    00000        
   00 00000000 00        
   0000 0000 0000        
   000000 0000000        
    0 0000 000 0         
    0000 00 0000         
     0 000000 0          
     00 00 0 00          
      00000000           
00    00000000           
 000    0000             
   000000000000          
     000000000000        
    000       00000000000
   000           00000000
  000                0000
000                    00
00

My Constraint Programming Blog

JaCoP: a request from the developers (Knapsack and Geost) and Nonogram labeling