January 24, 2010

Off topic: Eureqa - equation discovery with genetic programming

The last week I have tested the equation discovery system Eureqa which uses genetic programming (symbolic regression). Hopefully it may be of some interest.

Today I blogged about my (simple) experiments on my Swedish blog Eureqa: equation discovery med genetisk programmering. Google translation to English: Eureqa: equation discovery with genetic programming.

Also, see My Eureqa page.

January 14, 2010

Some new MiniZinc models, mostly Enigma problems

The last week I have solved some (more) of New Scientist's Enigma problems. Note: You may have to be a subscriber to read all the articles, however there is a grace period with some free peeks (7 it seems to be).

Enigma problems

Here are the new models of Enigma problems. All are written in MiniZinc.
  • enigma_counting_pennies.mzn: Enigma Counting pennies (from 2005)
    Here all the 24 ways of permuting a list of 4 items are needed. Instead of calculating them, I simply listed all the variants and used them later as index.
  • enigma_843.mzn: Enigma How many are whole? (#843)
    In this model there was problem using the following type of reifications:
      square(ONE) /\ ( square(TWO) \/ square(THREE) \/ square(FOUR)) % /\ ..
      % or
      (square(ONE) /\ ( bool2int(square(TWO)) + bool2int(square(THREE)) + bool2int(square(FOUR)) = 1) % /\ ...
    
    where square is defined as:
    predicate square(var int: y) =
      let {
         var 1..ceil(sqrt(int2float(ub(y)))): z
      } in
       z*z = y;
    
    In some of the later versions of MiniZinc (probably 1.0.2), the handling of reification was changed. My fix was to add a boolean matrix which handled the boolean constraints. It is not beatiful:
       % In the list ONE TWO THREE FOUR just the first and one other are perfect squares.
       square(ONE) /\ 
       [bools[1,j] | j in 1..4] = [square(ONE),square(TWO),square(THREE),square(FOUR)] /\
       sum([bool2int(bools[1,j]) | j in 1..4]) = 2 /\
       % ...
    
  • enigma_1530.mzn: Enigma Tom Daley (#1530)
    A simple alphametic puzzle.
  • enigma_1535.mzn: Enigma Back to front (#1535)
    Magic square with some more constraints.
  • enigma_1555.mzn: Enigma Not a square (#1555)
    Another alphametic puzzle on a grid.
  • enigma_1557.mzn: Enigma Reverse Division (#1557)
    Division and reversing some numbers.
  • enigma_1568.mzn: Enigma Odd puzzle (#1568)
    Long multipication alphametic puzzle.
  • enigma_1570.mzn: Enigma Set of cubes (#1570)
    Pandigital problem over at set of cubes. Here I simply listed the cubes between 0^3 and 1000^3 instead of calculating them. This may be consider cheating...
  • enigma_1575.mzn: Enigma All our days (#1575)
    Another alphametic puzzle with some more constraints.
Some of these problem may have been better solved using other means, e.g. pure (or not so pure) mathematics, but this is after all a blog about constraint programming, and modellng them is good exercise. And fun.

I have also solved - or at least modeled - some of the open problems: 1573, 1574, 1576, and 1577, but will not publish the models until they are closed. I "accidentally" published problem #1574 the other day before I realized it was still open, but so be it.

Some other models/data files

Here are two simple new problems taken from Choco's (forthcoming site ) example distribution: Also, a new data file for the Bridge and Torch model: bridge_and_torch_problem8.dzn, which is, as I understand it, the same as Choco's U2planning problem.

Birthdays 2010 puzzle

Last, here is a very simple puzzle of my own based on a coincidence of birth years of me and my brother and the current year:
This year (2010) my older brother Anders will be the same age as the year I was born in the last century, and I will be the same age as the year he was born in the last century. My brother is four years older than me. How old are we this year?
A (unnecessary complex) MiniZinc model of this problem is birthdays_2010.mzn. Of course, it can be solved much easier with the simple equation: {H+A=110,A-H=4} (link to Wolfram Alpha), or in MiniZinc:
constraint H+A = 110 /\ A-H = 4;
But this latter version is not very declarative, and I - in general - prefer the declarative way.

Update: Added link to Wolfram Alpha for the simple equation above.

January 04, 2010

Finding celebrities at a party

This problem is from Uwe Hoffmann Finding celebrities at a party (PDF):
Problem: Given a list of people at a party and for each person the list of people they know at the party, we want to find the celebrities at the party. A celebrity is a person that everybody at the party knows but that only knows other celebrities. At least one celebrity is present at the party.
(This paper contains an implementation of the problem in Scala.)

Note: The original of this problem is Richard Bird and Sharon Curtis: "Functional pearls: Finding celebrities: A lesson in functional programming" (J. Funct. Program., 16(1):13–20, 2006), but I (nor Hoffmann) have not been able to access this paper. Update: I have now got hold of this paper. Thank you SW!

The example in Hoffmann's paper is to find of who are the celebrity/celebrities in this party graph:
Adam  knows {Dan,Alice,Peter,Eva},
Dan   knows {Adam,Alice,Peter},
Eva   knows {Alice,Peter},
Alice knows {Peter},
Peter knows {Alice}
Solution: the celebrities are Peter and Alice.

MiniZinc model

The MiniZinc model of this problem is finding_celebrities.mzn.

Following are some discussion of the two constraints
  • All must knows a celebrity
  • Celebrities only knows a celebrity
But first a comment how the party graph is represented.

Party graph

Here I have chosen to represent the party as an array of sets:
  Adam  (1) knows {Dan,Eva,Alice,Peter}  {2,3,4,5}
  Dan   (2) knows {Adam,Alice,Peter}     {1,4,5}
  Eva   (3) knows {Alice,Peter}          {4,5}
  Alice (4) knows {Peter}                {5}
  Peter (5) knows {Alice}                {4}
This is coded in MiniZinc as:
party = [
          {2,3,4,5}, % 1, Adam
          {1,4,5},   % 2, Dan 
          {4,5},     % 3, Eva
          {5},       % 4, Alice
          {4}        % 5, Peter
         ];
This corresponds to the following incidence matrix, which is calculated in the model. For simplifying the calculations, we assume that a person know him/herself (this is also handled in the model).
% 1 2 3 4 5
  1,1,1,1,1, % 1
  1,1,0,1,1, % 2
  0,0,1,1,1, % 3
  0,0,0,1,1, % 4
  0,0,0,1,1  % 5
This conversion from incidence sets (party) to incidence matrix (graph) is done by the set2matrix predicate:
predicate set2matrix(array[int] of var set of int: s,
                     array[int,int] of var int: mat) =
     forall(i in index_set(s)) ( graph[i,i] = 1)
     /\
     forall(i,j in index_set(s) where i != j) (
      j in party[i] <-> graph[i,j] = 1
    )

All must know a celebrity

I started this model by consider the celebrities as a clique, and therefore using the constraint clique (which is still included in the model). However, is not enough to identify the clique since there may be other cliques but are not celebrities. In the above example there is another clique: {Dan,Adam}.

In fact, the clique constraint is not needed at all (and it actually makes the model slower). Instead we can just look for person(s) that everybody knows, i.e. where there are all 1's in the column of a celebrity in the party graph matrix (graph in the model). This is covered by the constraint:
forall(i in 1..n) (
   (i in celebrities -> sum([graph[j,i] |j in 1..n]) = n)
)  
This is a necessary condition but not sufficient for identifying celebrities.

Celebrities only know each other

We must also add the constraint that all people in the celebrity clique don't know anyone outside this clique. This is handled by the constraint the all celebrities must knows the same amount of persons, i.e. the size (cardinality) of the clique:
forall(i in 1..n) (
   i in celebrities -> sum([graph[i,j] | j in 1..n]) = num_celebrities
)
As noted above, a person is assumed to know him/herself.

Running the model

If we run the MiniZinc model finding_celebrities.mzn we found the following solution:
celebrities = 4..5;
num_celebrities = 2;
which means that the celebrities are {4,5}, i.e. {Alice, Peter}.

Another, slightly different party

We now change the party matrix slightly by assuming that Alice (4) also know Adam (1), i.e. the following incidence matrix:
 % 1 2 3 4 5
   1,1,1,1,1, % 1
   1,1,0,1,1, % 2
   0,0,1,1,1, % 3
   1,0,0,1,1, % 4
   0,0,0,1,1  % 5
]);
This makes Alice a non celebrity since she knows the non celebrity Adam, and this in turn also makes Peter a non celebrity since he knows the non celebrity Alice. In short, in this party there are no celebrities.

The model also contains a somewhat larger party graph with 10 persons.

Update about 8 hours later There is now a version which uses just set constraints, i.e. no conversion to an incidence matrix: finding_celebrities2.mzn. The constraint sections is now:
constraint
  num_celebrities >= 1

  /\ % all persons know the celebrities,
     % and the celebrities only know celebrities
  forall(i in 1..n) (
     (i in celebrities -> 
             forall(j in 1..n where j != i) (i in party[j]))
     /\
     (i in celebrities -> 
             card(party[i]) = num_celebrities-1)
  )
I have kept the same representation of the party (the array of sets of who knows who) as the earlier model which means that a person is now not assuming to know him/herself. The code reflects this change by using where j != i and num_celebrities-1.

December 29, 2009

1 year anniversary and Secret Santa problem II

Exactly one year ago, I started this blog. Little did I know what to expect. Through this blog and its accompanying pages - I have met many very interesting persons that has learned me much in my pursuit of learning constraint programming. I am very grateful for your help and inspiration! And thanks to all my (other) readers.

I hope the following year will be as rewarding as this last.

Secret Santa problem II

As an anniversary gift, here is another Secret Santa problem (compare with Merry Christmas: Secret Santas Problem) with a slightly different touch.

The problem formulation is from Maple Primes forum Secret Santa Graph Theory:
Every year my extended family does a "secret santa" gift exchange. Each person draws another person at random and then gets a gift for them. At first, none of my siblings were married, and so the draw was completely random. Then, as people got married, we added the restriction that spouses should not draw each others names. This restriction meant that we moved from using slips of paper on a hat to using a simple computer program to choose names. Then people began to complain when they would get the same person two years in a row, so the program was modified to keep some history and avoid giving anyone a name in their recent history. This year, not everyone was participating, and so after removing names, and limiting the number of exclusions to four per person, I had data something like this:

Name: Spouse, Recent Picks

Noah: Ava. Ella, Evan, Ryan, John
Ava: Noah, Evan, Mia, John, Ryan
Ryan: Mia, Ella, Ava, Lily, Evan
Mia: Ryan, Ava, Ella, Lily, Evan
Ella: John, Lily, Evan, Mia, Ava
John: Ella, Noah, Lily, Ryan, Ava
Lily: Evan, John, Mia, Ava, Ella
Evan: Lily, Mia, John, Ryan, Noah
I have interpreted the problem as follows:
  • one cannot be a Secret Santa of one's spouse nor of oneself
  • one cannot be a Secret Santa for somebody two years in a row
  • objective: maximize the "Secret Santa distance", i.e. the number of years since the last assignment of the same person
My MiniZinc model for this problem is secret_santa2.mzn.

This is a more traditional linear programming problem compared to Secret Santa I, using a distance matrix for maximizing the "Secret Santa Distance". M is a "large" number (number of persons + 1) for coding that there have been no previous Secret Santa assignment.
rounds = array2d(1..n, 1..n, [
%N  A  R  M  El J  L  Ev 
 0, M, 3, M, 1, 4, M, 2, % Noah
 M, 0, 4, 2, M, 3, M, 1, % Ava 
 M, 2, 0, M, 1, M, 3, 4, % Ryan
 M, 1, M, 0, 2, M, 3, 4, % Mia 
 M, 4, M, 3, 0, M, 1, 2, % Ella
 1, 4, 3, M, M, 0, 2, M, % John
 M, 3, M, 2, 4, 1, 0, M, % Lily
 4, M, 3, 1, M, 2, M, 0  % Evan
]);

The original problem don't say anything about single persons, i.e. those without spouses. In this model, singleness (no-spouseness) is coded as spouse = 0, and the no-spouse-Santa constraint has been adjusted to takes care of this.

The constraint part is the following, where n is the number of persons:
constraint
   all_different(santas)

   /\ % no Santa for one self or the spouse
   forall(i in 1..n) (
      santas[i] != i /\
      if spouses[i] > 0 then santas[i] != spouses[i] else true endif
   )

   /\ % the "santa distance" (
   forall(i in 1..n) ( santa_distance[i] = rounds[i,santas[i]] )

   /\ % cannot be a Secret Santa for the same person two years in a row.
   forall(i in 1..n) (
      let { var 1..n: j } in
       rounds[i,j] = 1 /\ santas[i] != j
   )

   /\
   z = sum(santa_distance)

Solution

This model gives - when using solve satisfy and the constraint z >= 67 - the following 8 solutions with a total of Secret Santa distance of 67 (sum(santa_distance)). If all Secret Santa assignments where new it would have been a total of n*(n+1) = 8*9 = 72. As we can see there is always one Santa with a previous existing assignment. (With one Single person and the data I faked, we can get all brand new Secret Santas. See the model for this.)
santa_distance: 9 9 4 9 9 9 9 9
santas        : 7 5 8 6 1 4 3 2

santa_distance: 9 9 4 9 9 9 9 9
santas        : 7 5 8 6 3 4 1 2

santa_distance: 9 9 9 4 9 9 9 9
santas        : 7 5 6 8 1 4 3 2

santa_distance: 9 9 9 4 9 9 9 9
santas        : 7 5 6 8 3 4 1 2

santa_distance: 9 9 9 9 4 9 9 9
santas        : 4 7 1 6 2 8 3 5

santa_distance: 9 9 9 9 4 9 9 9
santas        : 4 7 6 1 2 8 3 5

santa_distance: 9 9 9 9 9 9 4 9
santas        : 4 7 1 6 3 8 5 2

santa_distance: 9 9 9 9 9 9 4 9
santas        : 4 7 6 1 3 8 5 2

December 25, 2009

Merry Christmas: Secret Santas Problem

Here is a fun little problem related to the holiday. Merry Christmas, everyone! (For the Swedish readers: Sorry for the one day off greeting.)

This problem is from the Ruby Quiz#2 Secret Santas
Honoring a long standing tradition started by my wife's dad, my friends all play a Secret Santa game around Christmas time. We draw names and spend a week sneaking that person gifts and clues to our identity. On the last night of the game, we get together, have dinner, share stories, and, most importantly, try to guess who our Secret Santa was. It's a crazily fun way to enjoy each other's company during the holidays.

To choose Santas, we use to draw names out of a hat. This system was tedious, prone to many "Wait, I got myself..." problems. This year, we made a change to the rules that further complicated picking and we knew the hat draw would not stand up to the challenge. Naturally, to solve this problem, I scripted the process. Since that turned out to be more interesting than I had expected, I decided to share.

This weeks Ruby Quiz is to implement a Secret Santa selection script.

Your script will be fed a list of names on STDIN.

...

Your script should then choose a Secret Santa for every name in the list. Obviously, a person cannot be their own Secret Santa. In addition, my friends no longer allow people in the same family to be Santas for each other and your script should take this into account.
The MiniZinc model secret_santa.mzn skips the parts of input and mailing. Instead, we assume that the friends are identified with a unique number from 1..n, and the families are identified with a number 1..num_families.

We use two arrays:
  • the array x represents whom a person should be a Santa of. x[1] = 10 means that person 1 is a Secret Santa of person 10, etc.
  • the family array consists of the family identifier of each person.
Now, the three constraints can easily be stated in a constraint programming system like MiniZinc:
  • "everyone gives and received a Secret Santa gift": this is handled with a permutation of the values 1..n using all_different(x).
  • "one cannot be one own's Secret Santa". This is captured in the no_fix_point predicate, stating that there can be no i for which x[i] = i (i.e. no "fix point").
  • "no Secret Santa to a person in the same family". Here we use the family array and checks that for each person (i), the family of i (family[i]) must not be the same as the family of the person that receives the gift (family[x[i]]).
Here is the complete MiniZinc model (in a slightly compact form):
include "globals.mzn"; 
int: n = 12;
int: num_families = 4;
array[1..n] of 1..num_families: family = [1,1,1,1, 2, 3,3,3,3,3, 4,4];
array[1..n] of var 1..n: x :: is_output;

% Ensure that there are no fix points in the array.
predicate no_fix_points(array[int] of var int: x) = 
      forall(i in index_set(x)) ( x[i] != i  );

solve satisfy;

constraint
  % Everyone gives and receives a Secret Santa
  all_different(x) /\      

  % Can't be one own's Secret Santa
  no_fix_points(x) /\   

  % No Secret Santa to a person in the same family
  forall(i in index_set(x)) (   family[i] != family[x[i]]  )
; 

% output (just for the minizinc solver)
output [
   "Person " ++ show(i) ++ 
   " (family: " ++ show(family[i]) ++ ") is a Secret Santa of " ++ 
    show(x[i]) ++ 
   " (family: " ++ show(family[x[i]]) ++ ")\n"
   | i in 1..n
] ++ 
["\n"];

Solution

Here is the first solution (of many):
[10, 9, 8, 5, 12, 4, 3, 2, 1, 11, 7, 6]
This means that person 1 should be a Secret Santa of person 10, etc.

The minizinc solver gives the following, using the output code (slightly edited):
Person  1 (family: 1) is a Secret Santa of 10 (family: 3)
Person  2 (family: 1) is a Secret Santa of  9 (family: 3)
Person  3 (family: 1) is a Secret Santa of  8 (family: 3)
Person  4 (family: 1) is a Secret Santa of  5 (family: 2)
Person  5 (family: 2) is a Secret Santa of 12 (family: 4)
Person  6 (family: 3) is a Secret Santa of  4 (family: 1)
Person  7 (family: 3) is a Secret Santa of  3 (family: 1)
Person  8 (family: 3) is a Secret Santa of  2 (family: 1)
Person  9 (family: 3) is a Secret Santa of  1 (family: 1)
Person 10 (family: 3) is a Secret Santa of 11 (family: 4)
Person 11 (family: 4) is a Secret Santa of  7 (family: 3)
Person 12 (family: 4) is a Secret Santa of  6 (family: 3)

Bales of Hay

As an extra, here is another MiniZinc model: bales_of_hay.mzn which solves the following problem (from The Math Less Traveled the other day):
You have five bales of hay.

For some reason, instead of being weighed individually, they were weighed in all possible combinations of two. The weights of each of these combinations were written down and arranged in numerical order, without keeping track of which weight matched which pair of bales. The weights, in kilograms, were 80, 82, 83, 84, 85, 86, 87, 88, 90, and 91.

How much does each bale weigh? Is there a solution? Are there multiple possible solutions?
The answer? There is a unique solution (when bales are ordered on weight): 39, 41, 43, 44, 47.

December 22, 2009

Some updated SICStus Prolog models

Mats Carlsson (designer and main implementer of SICStus Prolog) was kind enough to send me alternative versions of some of my SICStus Prolog models, which now are included in the models (marked with his name). His versions are - of course - more elegant and faster than mine. Thanks, Mats.

Assignment problem

Model: assignment.pl using the global constraint global_cardinality.
assignment2(Problem) :-
        cost(Problem, Mode, CostRows),
        format('\nProblem ~d\n~w\n', [Problem,Mode]),
        transpose(CostRows, CostCols),
        length(CostRows, R),
        length(CostCols, C),
        length(Vars, R),
        domain(Vars, 1, C),
        (   for(I,1,C),
            foreach(I-Y,Ys)
        do  Y in 0..1
        ),
        global_cardinality(Vars, Ys, [cost(Cost,CostRows)]),
        (Mode==minimize -> Dir=up ; Dir=down),
        labeling([Dir], [Cost]),
        labeling([], Vars), !,
        format('assignment = ~w, cost = ~d\n', [Vars,Cost]),
        fd_statistics, nl.

Bin Packing

Model: bin_packing.pl. Here is the global constraint cumulative used. Note: The representation of the filled bins is however not the same as in my approach.
bin_packing2(Problem) :-
        problem(Problem, StuffUnordered, BinCapacity),
        portray_clause(problem(Problem, StuffUnordered, BinCapacity)),
        length(StuffUnordered, N),
        N1 is N+1,
        (   foreach(H,StuffUnordered),
            foreach(task(S,1,E,H,1),Tasks),
            foreach(NH-S,Tagged),
            foreach(S,Assignment),
            foreach(_-X,Keysorted),
            foreach(X,Vars),
            param(N1)
        do  S in 1..N1,
            E in 2..N1,
            NH is -H
        ),
        keysort(Tagged, Keysorted),
        maximum(Max, Vars),
        cumulative(Tasks, [limit(BinCapacity)]),
        labeling([minimize(Max),step], Vars),
        portray_clause(assignment(Assignment)),
        fd_statistics,
        nl.

Post Office Problem

Model: post_office_problem2.pl. This is a much better way of solving this problem in Prolog than my translation from MiniZinc.
go2 :-
        Days = 7,
        % requirements number workers per day
        Need = [17, 13, 15, 19, 14, 16, 11],
        % Total cost for the 5 day schedule.
        % Base cost per day is 100.
        % Working saturday is 100 extra
        % Working sunday is 200 extra.
        Cost = [500, 600, 800, 800, 800, 800, 700],
        % Decision variables. X[I]: Number of workers starting at day I
        length(X, Days),
        domain(X, 0, 10),
        append(X, X, [_,_,_|Ys]),
        (   foreach(Nd,Need),
            fromto(Ys, Ys1, Ys2, _)
        do  Ys1 = [Y1|Ys2],
            Ys2 = [Y2,Y3,Y4,Y5|_],
            Y1+Y2+Y3+Y4+Y5 #>= Nd
        ),
        scalar_product(Cost, X, #=, Total),
        labeling([minimize(Total),bisect], X),
        format('assignment = ~w, total cost = ~d\n', [X,Total]),
        fd_statistics.
I also corrected a weird (and wrong) way of calculating the total cost in my version (as well as in the MiniZinc model post_office_problem2.mzn).

December 19, 2009

Choco's new site

Soon the website for Choco constraint programming solver will be changed to a this site (will replace the old one). There are some broken links and missing information, but it seems to be well organized.

Some links: For anyone not familiar with Choco, here is a short description (from the new site):
choco is a java library for constraint satisfaction problems (CSP), constraint programming (CP) and explanation-based constraint solving (e-CP).
It is built on a event-based propagation mechanism with backtrackable structures.

choco can be used for:

* teaching (a user-oriented constraint solver with open-source code)
* research (state-of-the-art algorithms and techniques, user-defined constraints, domains and variables)
* real-life applications (many application now embed choco)

December 15, 2009

SweConsNet 2010 (May 21, 2010 in Uppsala, Sweden)

SweConsNet 2010 is to be held May 21 2010 in Uppsala, Sweden. This is the 9th workshop of the Network for Sweden-based researchers and practitioners of Constraint programming (SweConsNet). It is collocated with the SAIS Workshop 2010 (May 20-21).

From the SweConsNet 2010 site:
The purpose of this workshop is to learn about ongoing research in constraint programming, existing projects and products, and further development of the network.

The workshop is open to everybody interested in the theory and practice of constraint programming, whether based in Sweden or elsewhere.

...

Please forward this information to people who might be interested in this workshop but are not yet on the SweConsNet mailing list. They can subscribe to it by sending a message to Justin Pearson.

...

Location

Uppsala University, Information Technology Center (ITC), Polacksbacken, house 1 (Directions).

Programme

List of speakers (preliminary):

Organisation

SweConsNet 2010 is chaired by Magnus Ågren (SICS). Local arrangements are made by Pierre Flener and Justin Pearson (Uppsala University).

The last work shop (May this year) was very interesting and fun, and I'm really looking forward to SweConsNet 2010. For more about SweConsNet 2009, see its homepage and my own Report from SweConsNet2009, including my presentation.

December 14, 2009

The Global Constraint Catalog has been updated

A new version of Global Constraint Catalog has been released. That's great! The update date is 2009-12-10, but I didn't see it until today.

Here is the presentation of the catalog:
About the catalogue

The catalogue presents a list of 348 global constraints issued from the literature in constraint programming and from popular constraint systems. The semantic of each constraint is given together with a description in terms of graph properties and/or automata.

The catalogue is periodically updated by Nicolas Beldiceanu, Mats Carlsson and Jean-Xavier Rampon. Feel free to contact the first author for any questions about the content of the catalogue.

Download the Global Constraint Catalog in pdf format:

The news in this release:
2009-12-10 working version update: 348 constraints
Another nice thing is that it is now possible to search just the HTML pages, or PDF documents.

Sophie Demassey has done a wonderful work with the online version.

Also, I am happy for the links to this blog and my MiniZinc page, see the section on (decompositions of) global constraints.

December 12, 2009

SICStus Prolog 4.1 released and My SICStus Prolog page

SICStus Prolog version 4.1 was released yesterday (2009-12-11). Congratulations to the SICStus team! See here for a summary of the news in this version.

For the past weeks I have tested the beta version of this release. In these tests I focused mainly on the following:
  • the do-loops (for loops)
  • support for MiniZinc version 1.0
  • and somewhat SPIDER (the new IDE)
I did report some findings in the beta version, but most of the time I have been creating SICStus models and just learning the system, especially the clpfd part. These models can be downloaded at My SICStus Prolog page. See below for more about these models.

do-loops

SICStus Prolog's support for do-loops is based on Joachim Schimpf's paper Logical Loops (link to CiteSeer). These constructs where first implemented in the ECLiPSe Constraint Programming System (which I wrote about in Constraint programming in ECLiPSe Constraint Programming System).

The following constructs are implemented in SICStus version 4.1:
fromto(First,In,Out,Last)
foreach(X,List)
foreacharg(X,Struct)
foreacharg(X,Struct,Idx)
for(I,MinExpr,MaxExpr)
count(I,Min,Max)
param(Var)
IterSpec1, IterSpec2
An example: quasigroup_completion.pl is a simple model that demonstrates the foreach loop, the constraints makes it a latin square given the hints in the Problem matrix:
solve(ProblemNum) :-
        problem(ProblemNum, Problem),
        format('\nProblem ~d\n',[ProblemNum]),
        length(Problem, N),
        append(Problem, Vars),
        domain(Vars, 1, N),

        ( foreach(Row, Problem)
        do
          all_different(Row,[consistency(domain)])
        ),

        transpose(Problem, ProblemTransposed),
        ( foreach(Column, ProblemTransposed)
        do
          all_different(Column,[consistency(domain)])
        ),

        labeling([], Vars),
        pretty_print(Problem),nl,
        fd_statistics.

problem(1, [[1, _, _, _, 4],  
            [_, 5, _, _, _],
            [4, _, _, 2, _],
            [_, 4, _, _, _],
            [_, _, 5, _, 1]]).

Some other examples are from the Bin Loading model bin_packing.pl. First is an example of a constraint that requires a (reversed) ordered array.
% load the bins in order:
% first bin must be loaded, and the list must be ordered
element(1, BinLoads, BinLoads1),
BinLoads1 #> 0,
( fromto(BinLoads,[This,Next|Rest],[Next|Rest],[_])
do
  This #>= Next
),
Next is how to sum occurrences of the loaded bins, using fromto to count the number of loaded bins (Loaded is a reified binary variable):
% calculate number of loaded bins 
% (which we later will minimize)
( foreach(Load2,BinLoads),
  fromto(0,In,Out,NumLoadedBins),
  param(BinCapacity)
do
  Loaded in 0..1,
  Load2 #> 0 #<=> Loaded #= 1,
  indomain(Loaded),
  Out #= In + Loaded
),
A slighly more general version of ordering an array is the predicate my_ordered from young_tableaux.pl:
my_ordered(P,List) :-
        ( fromto(List, [This,Next | Rest], [Next|Rest],[_]),
          param(P)
        do
          call(P,This,Next)
        ).
where a comparison operator is used in the call, e.g. my_ordered(#>=,P) to sort in reversed order.

Most of my models use these do loop constructs. As you can see of from the models I'm not a Prolog guy by profession, even though the language has fascinated me for a long time. SICStus' (and also ECLiPSe's) new approach of using for do-loops has made me appreciate Prolog in a new way. Very inspiring!

Two missing construct in SICStus (compared to the ECLiPSe implementation) are for(...) * for(...) and multifor. When writing the ECLiPSe models I used these two quite often, for example in the ECLiPSe model assignment.ecl. Also, SICStus don't support the syntactic sugar of array/matrix access with [] (note, this feature is not related to do-loops). When I first realized that they where missing in SICStus, I thought that it would be a hard job of converting my ECLiPSe models to SICStus. Instead, these omissions was - to my surprise - quite a revelation.

My ECLiPSe models was mostly just a translation of the MiniZinc/Comet models I've written before, i.e. array/matrix based models where array access and loops such as forall(i in 1..n) () is used all the time. Not surprising, my first SICStus models was then just a conversion of these array/matrix access to nth1 or element constraints to simulate the ECLiPSe constructs. But it was not at all satisfactory; in fact, it was quite ugly sometimes, and also quite hard to understand. Instead I started to think about the problem again and found better ways of stating the problem, thus used much less of these constructs. Compare the above mentioned model with the SICStus version assignment.pl which in my view is more elegant than my ECLiPSe version.

Here is a simple comparison of the assignment models mentioned above.

The ECLiPSe version (using []-indices):
% exacly one assignment per row, all rows must be assigned
( for(I,1,Rows), param(X,Cols) do
      sum(X[I,1..Cols]) #= 1
),
% zero or one assignments per column
( for(J,1,Cols), param(X,Rows) do
      sum(X[1..Rows,J]) #=< 1
),
% calculate TotalCost
(for(I,1,Rows) * for(J,1,Cols), 
 fromto(0,In,Out,TotalCost),
 param(X,Cost) do 
     Out #= In + X[I,J]*Cost[I,J]
),
As we see, this is heavily influenced by the MiniZinc/Comet way of coding.

Here is the corresponding code in SICStus:
% exacly one assignment per row, all rows must be assigned
( foreach(Row, X) do
      sum(Row,#=,1)
),
% zero or one assignments per column
transpose(X, Columns),
( foreach(Column, Columns) do
      sum(Column,#=<,1)
),
% calculate TotalCost
append(Cost, CostFlattened),
scalar_product(CostFlattened,Vars,#=,TotalCost),
(Almost the same code could be written in ECLiPSe as well; this is more about showing how my approach of coding has changed.)

However, I have not been able to completely free myself from the matrix approach; sometimes it was simply too hard, and sometimes I was just lazy.

A final note: The example directory of library(clpfd) now contains many examples of do-loops. Some of the older examples has also been rewritten using these constructs.

Support for MiniZinc/FlatZinc verion 1.0

One other quite large part of the testing was of the support of MiniZinc/FlatZinc version 1.0 (the earlier SICStus version supported just version 0.9). The solver is often very fast, but I did not do any systematic comparison with other solvers using the beta version.

I am especially happy about that the FlatZinc solver now shows statistics, e.g.
% runtime:      80 ms
% solvetime:    10 ms
% solutions:    1
% constraints:  22
% backtracks:   1
% prunings:     64
This makes it easier to compare it to other solvers.

There are a number of ways of running a MiniZinc model or FlatZinc file (see the documentation). Here is how I normally run the MiniZinc models from the command line (via my Perl wrapper program):

mzn2fzn -G sicstus minesweeper_3.mzn && sicstus --goal "use_module(library(zinc)), on_exception(E,zinc:fzn_run_file('minesweeper_3.fzn',[solutions(1),statistics(true),timeout(30000)]), write(user_error,E)), halt."

Explanation:
  • statistics(true): show the statistics
  • solutions(1): just show one solution. Use solutions(all) for all solutions
  • timeout(30000): time out in milliseconds (here 30 seconds)
  • the on_exception wrapper, so errors don't give the prompt. Note: There may be some strange results if the timeout from library(timeout) is used at the same time.
Note: "set vars" and "float vars" are not supported in this version.

SPIDER

I did just briefly tested SPIDER, the Eclipse-based development environment for SICStus, and I honestly don't know how much I will use it instead of good old Emacs. However, where I a professional SICStus Prolog programmer, I would surely use it all the time since it has a lot of very good features especially crafted for SICStus.

One very useful feature is that it is quite good at detecting variables not properly declared in the do-loops (and singletons elsewhere); one has to use param for using a variable in inner loops. SPIDER is very good at detecting errors when such param values are missing (as well as other errors). For many models I have just started SPIDER to check if there where such bugs.

Another nice feature is the "balloon help", where the SICStus documentation is presented (click in the yellow area for more information).

My SICStus Prolog models

As always when starting learning a new constraint programming system, I started with my learning models, this time mostly by translating the ECLiPSe models. It first took quite a long time, since I missed some constructs I was used to in ECLiPSe (see above). But after a time it was rather easy to do it more like a SICStus approach, and now I really like writing in SICStus. Still, there are some models where this direct translations shows (and I'm not very proud of these models).

Also, I would like to thank Mats Carlsson and Magnus Ågren of the SICStus team for some great help with some of the first models.

So, I have now done almost all my ECLiPSe models in SICStus, except for those using floats or some where the SICStus' distributed examples are much better. SICStus don't support "set vars", so I used boolean lists instead, see for example steiner.pl and clique.pl. Also, there are some 30-40 models not done in written ECLiPSe, and hopefully more is to come. All these models can be seen at My SICStus Prolog page.

These models contains comments of the problem, e.g. references and also links to other of my models implementing the problem (in other constraint programming systems).

November 30, 2009

Gecode 3.2.2 released

Gecode version 3.2.2 has been released. It can be downloaded here.

From the Changelog:

Changes in Version 3.2.2 (2009-11-30)

This release adds the sequence constraint (contributed by David Rijsman, Quintiq) and has as always some small additions and fixes.

  • Kernel
    • Bug fixes
      • Added missing assignment operator for space-based allocators for STL data structures. (minor, thanks to Gustavo Gutierrez)
  • Search engines
    • Bug fixes
      • The memory reported could be sometimes too low (the previous fix for 3.2.0 did not fix it for branch and bound search). (minor)
  • Finite domain integers
    • Additions
      • Added sequence constraint. (major , contributed by David Rijsman)
    • Bug fixes
      • The global cardinality (count) constraint now accepts unsorted arrays of values. It previously propagated incorrectly if the array was not sorted. (minor, thanks to Alberto Delgado)
      • Fixed bug in the ICL_VAL propagator for global cardinality. (minor)
      • Subscription to constant views did not honor the flag to avoid processing. (minor)
  • Finite integer sets
    • Bug fixes
      • Subscription to constant views did not honor the flag to avoid processing (did not occur in practice). (minor)
  • Script commandline driver
    • Additions
      • Report if search engine has been stopped. (minor)
  • Range and value iterators
    • Other changes
      • Renamed test for subset or disjointness of range iterators to "compare". (minor)
  • Example scripts
    • Additions
      • Added car sequencing example (problem 1 in CSPLib). Uses the new sequence-constraint. (minor)
  • Gecode/FlatZinc
    • Bug fixes
      • Support search annotations with constants in the variable arrays. (minor, thanks to Håkan Kjellerstrand)
      • The set_in and set_in_reif constraints were buggy when used with Boolean variables (which are usually not generated by mzn2fzn so that the issue probably does not occur in practice). (minor)
      • The global_cardinality constraint was not completely compatible with the MiniZinc semantics. It would constrain values not mentioned in the array to have zero occurrences, while in MiniZinc they are unrestricted. (minor)
      • Element constraints in reified positions produced an error in the mzn2fzn translation. (major, thanks to Håkan Kjellerstrand)

November 08, 2009

Update on Nonogram: Jan Wolter's Survey and my own new benchmark

Survey of Paint-by-Number Puzzle Solvers

In Some new models, etc I mentioned the great Survey of Paint-by-Number Puzzle Solvers, created by Jan Wolter (also author of the Nonogram solver pbnsolve).

In this survey he included both Gecode's Nonogram solver written by Mikael Lagerkvist as well as my own Nonogram model (with Gecode/FlatZinc).

Since the last update on the former blog post the following has happeded:
  • both our solver has now the assessment: "An amazingly good solver, especially for a simple demo program", and are placed 4, 5, and 6 of 10 tested systems
  • my Gecode/FlatZinc model has been tested for "Full results"; it came 4 out of 5.
  • my Nonogram model with the Lazy FD solver is now included in the "Sample result", at 6'th place
It seems than Wolter has appreciated constraint programming as a general tool for solving these kind of combinatorial problems, much for its ease of experimentation, e.g. with labeling strategies and (for the MiniZinc models) changing solvers:

From the analysis of Lagerkvist's Gecode model:
This is especially impressive because the faster solvers are large, complex programs custom designed to solve paint-by-number puzzles. This one is a general purpose solver with just a couple hundred custom lines of code to generate the constraints, run the solver, and print the result. Considering that this is a simple application of a general purpose solving tool rather than hugely complex and specialized special purpose solving tool, this is an astonishingly good result.

Getting really first class search performance usually requires a lot of experimentation with different search strategies. This is awkward and slow to do if you have to implement each new strategies from scratch. I suspect that a tool like Gecode lets you try out lots of different strategies with relatively little coding to implement each one. That probably contributes a lot to getting to better solvers faster.
From the analysis of my MiniZinc model:
If you tried turning this into a fully useful tool rather than a technology demo, with input file parsers and such, it would get a lot bigger, but clearly the constraint programming approach has big advantages, achieving good search results with low development cost.

...

These two results [Gecode/FlatZinc and LazyFD] highlight the advantage of a solver-independent modeling language like MiniZinc. You can describe your problem once, and then try out a wide variety of different solvers and heuristics without having to code every single one from scratch. You can benefit from the work of the best and the brightest solver designers. It's hard to imagine that this isn't where the future lies for the development of solvers for applications like this.
And later in the Conclusions:
The two constraint-based systems by Lagerkvist and Kjellerstrand come quite close in performance to the dedicated solvers, although both are more in the category of demonstrations of constraint programming than fully developed solving applications. The power of the underlaying search libraries and the ease of experimentation with alternative search heuristics obviously serves them well. I think it very likely that approaches based on these kinds of methods will ultimately prove the most effective.
I think this is an important lesson: Before starting to write very special tools, first try a general tool like a constraint programming system and see how well it perform.

The Lazy FD solver and the Lion problem

Most of the problems in the Sample Results where solved by some solver within the time limit of 30 minutes. However, one problem stand out as extra hard: The Lion problem. When I tested the MiniZinc's Lazy FD solver on my machine I was very excited that it took just about 2 minutes, and mentioned this to Wolter. He also tested this but on his 64-bit machine it took 80 minutes to solve (and since it was above the time limit this is not in the result table). This is how he describes the Lazy FD solver:
But the remarkable thing is that [the Lazy FD solver] solves almost everything. Actually, it even solved the Lion puzzle that no other solver was able to solve, though, on my computer, it took 80 minutes to do it. Now, I didn't allow a lot of other solvers 80 minutes to run, but that's still pretty impressive. (Note that Kjellerstrand got much faster solving times for the Lion than I did. Lagerkvist also reported that his solver could solve it, but I wasn't able to reproduce that result even after 30 CPU hours. I don't know why.)
After some discussion, we come to the conclusion that the differences was probably due to the fact that I use a 32-bit machine (and the 32-bit version of MiniZinc) with 2 Gb memory, and Wolter use a 64-bit machine with 1 Gb memory.

One should also note that the all other solvers was compiled without optimization, including Gecode/FlatZinc; however the LazyFD does not come with source so it is running optimized. This may be an unfair advantage to the LazyFD solver.

My own benchmark of the Sample Results

The times in the Sample Results is, as mentioned above, for solvers compiled with no optimization. I have now run the same problems on my machine (Linux Mandriva, Intel Dual 3.40GHz, with 2Gb memory), but the solvers uses the standard optimization. All problems was run with a time limit of 10 minutes (compared to Wolters 30 minutes) and searched for 2 solutions, which checks for unique solutions. The last three problems (Karate, Flag, Lion) has multiple solutions, and it is considered a failure if not two where found in the time limit. I should also note that during the benchmark I am using the machine for other things, such as surfing etc.

The problems
I downloaded the problems from Wolter's Webbpbn: Puzzle Export. For copyright reasons I cannot republish these models, but it is easy to download each problem. Select ".DZN" for the MiniZinc files, and "A compiled in C++ format" for Gecode. There is no support for Comet's format, but it's quite easy to convert a .dzn file to Comet's.

The solvers + labeling strategies
Here is a description of each solver and its labeling strategy:
  • fz, "normal" (column_row)
    MiniZinc model with Gecode/FlatZinc. The usual labeling in nonogram_create_automaton2.mzn, i.e. where the columns are labeled before rows:
    solve :: int_search(
          [x[i,j] | j in 1..cols, i in 1..rows], 
          first_fail, 
          indomain_min, 
          complete) 
    satisfy;
    
  • fz, "row_column"
    MiniZinc model with Gecode/FlatZinc. Here the order of labeling is reversed, rows are labeled before columns. Model is nonogram_create_automaton2_row_column.mzn
    solve :: int_search(
          [x[i,j] | i in 1..rows, j in 1..cols], 
          first_fail, 
          indomain_min, 
          complete) 
    satisfy;
    
  • fz, "mixed"
    MiniZinc model with Gecode/FlatZinc: nonogram_create_automaton2_mixed.mzn.
    I have long been satisfied with the "normal" labeling in the MiniZinc model because P200 (the hardest problem I until now has tested) was solved so fast. However, the labeling used in the Comet Nonogram model described in Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second, and which is also used in the Gecode model, is somewhat more complicated since it base the exacl labeling by comparing the hints for the rows and the column.

    I decided to try this labeling in MiniZinc as well. However, labeling in MiniZinc is not so flexible as in Comet and Gecode. Instead we have to add a dedicated array for the labeling (called labeling):
    array[1..rows*cols] of var 1..2: labeling;
    
    and then copy the element in the grid to that array based on the relation between rows and column hints:
    constraint
          % prepare for the labeling
          if rows*row_rule_len < cols*col_rule_len then
               % label on rows first
               labeling = [x[i,j] | i in 1..rows, j in 1..cols]
          else 
               % label on columns first
               labeling = [x[i,j] | j in 1..cols, i in 1..rows]
          endif
          /\
          % .... 
    
    and last, the search is now based just on this labeling array:
    solve :: int_search(
            labeling,
            first_fail, 
            indomain_min, 
            complete)
    satisfy;
    
  • jacop, "normal"
    MiniZinc normal model with JaCoP/FlatZinc using the same model as for fz "normal".
  • lazy, satisfy
    Model: nonogram_create_automaton2_mixed.mzn. This use the MiniZinc LazyFD solver with the search strategy:
    solve satisfy;
    
    This labeling is recommended by the authors of LazyFD. See MiniZinc: the lazy clause generation solver for more information about this solver.

    Note: The solver in MiniZinc latest official version (1.0.3) don't support set vars. Instead I (and also Jan Wolter) used the latest "Release Of The Day" version (as per 2009-11-02).
  • Comet, normal
    Model: nonogram_regular.co. This is the Comet model I described in Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second. No changes has been done.
  • Gecode, normal
    This is the Nonogram model distributed with Gecode version 3.2.1. The labeling is much like the one used in the Comet model, as well as fz, "mixed". (In fact the labeling in the Gecode model was inspired by the labeling in the Comet model).


Here is the results. For each model (+ labeling strategy) two values are presented:
  • time (in seconds)
  • number of failures if applicable (the LazyFD solver always returns 0 here).
The result
Model fz
normal
fz
row_column
fz
mixed
jacop
normal
lazy
satisfy
Comet
normal
Gecode
normal
Dancer
(#1)
0.48s
(0)
0.31s
(0)
1.00s
(0)
3.64s
(0)
0.91s
(0)
0.691s
(0)
0.199s
(0)
Cat
(#6)
0.24s
(0)
0.24s
(0)
0.25s
(0)
1.20s
(0)
1.13s
(0)
0.6s
(0)
0.025s
(0)
Skid
(#21)
0.24s
(13)
0.23s
(3)
0.28s
(13)
0.78s
(13)
1.37s
(0)
0.586s
(0)
0.217s
(0)
Bucks
(#27)
0.32s
(3)
0.32s
(9)
0.37s
(3)
0.96s
(3)
2.37s
(0)
1.366s
(5)
0.026s
(2)
Edge
(#23)
0.16s
(25)
0.17s
(25)
0.18s
(25)
0.59s
(25)
0.31s
(0)
0.521s
(43)
0.175s
(25)
Smoke
(#2413)
0.27s
(5)
0.27s
(8)
0.28s
(8)
0.83s
(5)
1.44s
(0)
0.616s
(14)
0.275s
(5)
Knot
(#16)
0.42s
(0)
0.43s
(0)
0.48s
(0)
1.19s
(0)
8.15s
(0)
1.307s
(0)
0.329s
(0)
Swing
(#529)
0.95s
(0)
0.94s
(0)
0.96s
(0)
2.19s
(0)
21.94s
(0)
1.782s
(0)
0.431s
(0)
Mum
(#65)
0.64s
(20)
0.64s
(22)
0.66s
(22)
1.68s
(20)
16.34s
(0)
1.268s
(39)
0.491s
(22)
Tragic
(#1694)
340.32s
(394841)
1.02s
(255)
436.92s
(394841)
--
(198329)
45.97s
(0)
477.39s
(702525)
1.139s
(255)
Merka
(#1611)
--
(361587)
1.44s
(79)
--
(294260)
--
(136351)
80.92s
(0)
1.654s
(46)
0.645s
(13)
Petro
(#436)
2.97s
(1738)
143.09s
(106919)
3.42s
(1738)
7.27s
(1738)
9.86s
(0)
3.103s
(3183)
151.09s
(106919)
M_and_m
(#4645)
1.41s
(89)
601.27s
(122090)
1.59s
(89)
3.43s
(89)
66.98s
(0)
2.215s
(162)
2.797s
(428)
Signed
(#3541)
1.87s
(929)
23.12s
(6484)
28.23s
(6484)
5.75s
(929)
73.02s
(0)
20.369s
(12231)
1.648s
(929)
Light
(#803)
600.47s--
(400660)
--
(621547)
--
(485056)
601.53s--
(171305)
8.64s
(0)
--
(0)
--
(538711)
Forever
(#6574)
4.14s
(17143)
7.86s
(30900)
6.22s
(17143)
12.21s
(17143)
3.27s
(0)
7.5s
(27199)
8.077s
(30900)
Hot
(#2040)
--
(303306)
--
(330461)
--
(248307)
--
(119817)
165.72s
(0)
--
(0)
--
(312532)
Karate
(#6739)
95.78s
(215541)
67.27s
(130934)
133.43s
(215541)
373.02s
(215541)
19.32s
(0)
120.02s
(272706)
80.56s
(170355)
Flag
(#2556)
--
(1686545)
5.69s
(14915)
7.93s
(14915)
--
(243222)
9.29s
(0)
7.28s
(24678)
3.998s
(16531)
Lion
(#2712)
--
(542373)
--
(1124697)
--
(420216)
--
(187215)
115.56s
(0)
--
(0)
--
(869513)

Some conclusions, or rather notes

Here are some conclusions (or notes) about the benchmark.
  • The same solver Gecode/FlatZinc is here compared with three different labelings. There is no single labeling that is better than the other. I initially has some hopes that the "mixed" labeling should take the best labeling from the two simpler row/columns labelings, but this is not really the case. For example for Tragic the row_column strategy is better than "normal" and "mixed". I am, however, somewhat tempted, to use the "row_column" labeling, but the drawback is that "P200" problem (not included in Wolfter's sample problems) takes much longer with this labeling.
  • The same model and labeling but with different solvers is compared: Gecode/FlatZinc is faster than JaCoP/FlatZinc on all the problems. For the easier problems this could be explained by the extra startup time of Java for JaCoP, but that is not the complete explanation for the harder problems. Note: Both Gecode/FlatZinc and JaCoP/FlatZinc has dedicated and fast regular constraints (whereas the LazyFD, and the Comet solvers use a decomposition).
  • The LazyFD solver is the only one that solves all problems (including Lion), but is somewhat slower on the middle problems than most of the others. It emphasizes that this is a very interesting solver.
  • It is also interesting to compare the result of the Comet model and Gecode/FlatZinc "mixed", since they use the same principle of labeling. However there are some differences. First, the MiniZinc model with Gecode/FlatZinc use a dedicated regular constraint, and Comet use my own decomposition of the constraint. For the Merka problem the Comet version outperforms the Gecode/FlatZinc version, otherwise it's about the same time (and number of failures).
  • The Light problem: It is weird that this problem was solved in almost exactly 10 minutes (the timeout is 10 minutes) for Gecode/FlatZinc and JaCoP/FlatZinc. The solutions seems correct but I'm suspicious of this. Update: Christian Schulte got me on the right track. Here is was happened: The first (unique) solution was found pretty quick and was printed, but the solvers could not prove a unique solution so it timed out. JaCoP/FlatZinc actually printed "TIME-OUT" but I didn't observe that. Case closed: They both FAILED on this test. Thanks, Christian.End update
As said above, I can only agree with Jan Wolter in his comment that the ease of experimenting, for example changing labeling, and solver for the FlatZinc solvers, is a very nice feature.

Last word

No benchmark or comparison of (constraint programming) models is really complete without the reference to the article On Benchmarking Constraint Logic Programming Platforms. Response to Fernandez and Hill's "A Comparative Study of Eight Constraint Programming Languages over the Boolean and Finite Domains" by Mark Wallace, Joachim Schimpf, Kish Shen, Warwick Harvey. (Link is to ACM.)

From the Abstract:
... The article analyses some pitfalls in benchmarking, recalling previous published results from benchmarking different kinds of software, and explores some issues in comparative benchmarking of CLP systems.

November 05, 2009

Gecode version 3.2.1 released

Version 3.2.1 of Gecode is released.

From the Change log:
Changes in Version 3.2.1 (2009-11-04) This release fixes one serious bug in the element constraint for matrices; adds branchings using accumulated failure counts (also known as weighted degree); provides some optimizations (mostly for element constraints and for regular expressions with millions of nodes); adds two cute models (word-square and crossword); and a little this and that as always.

  • Kernel
    • Additions
      • Propagators and variables now maintain an accumulated failure count (AFC). (major). Details: The AFC of a propagator counts how often has the propagator failed during the entire search, and the AFC of a variable is its degree plus the sum of the AFCs of all propagators depending on the variable. While it looks straightforward, this required a major extension of the Gecode toplevel namespace. kernel to deal with global information accessed concurrently from several threads during search. The AFC is also known as "weighted degree".
  • Finite domain integers
    • Additions
      • Added INT_VAL_RANGES_MIN (and INT_VAL_RANGES_MAX) as value selection for branching: it tries the values from the smallest (largest) range first, if the variable's domain has several ranges, otherwise it splits the domain. (minor)
      • Added AFC-based branching strategies for integer and Boolean variables: INT_VAR_AFC_MIN, INT_VAR_AFC_MAX, INT_VAR_SIZE_AFC_MIN, INT_VAR_SIZE_AFC_MAX. For details, see "Modeling with Gecode". (major)
    • Other changes
      • The semantics of division and modulo has changed to match the C standard. This means that the operations round towards zero, and that the result of the modulo has the same sign as its first argument. (minor)
    • Bug fixes
      • Fixed segfault in matrix element constraint. (major)
      • Added missing dispose function for linear disequality of Boolean variables (the only problem was that with a proper dispose function more memory can be reused when the propagator becomes subsumed, so really a tiny quirk). (minor)
    • Performance improvements
      • Element constraints for integer arrays now accept shared integer arrays (IntSharedArray). By this, the very same array can be shared among several element constraints. Models require no change, as IntArgs are automatically coerced to IntSharedArrays. See "Modelling with Gecode" for more explanation. (minor)
      • Optimized element for matrices (in special cases, the propagator is up to six times as efficient as before). (minor)
  • Finite integer sets
    • Additions
      • Added AFC-based branching strategies for set variables: SET_VAR_AFC_MIN, SET_VAR_AFC_MAX, SET_VAR_SIZE_AFC_MIN, SET_VAR_SIZE_AFC_MAX. For details, see "Modeling with Gecode". (major)
    • Other changes
      • Split rel-op.cpp and rel-op-const.cpp into several compilation units, to avoid excessive memory and time usage of the gcc compiler. (minor)
  • Minimal modeling support
    • Performance improvements
      • Conversion of a regular expression to a DFA would crash on regular expressions with several million nodes (due to running out of call stack space). (minor, thanks to Håkan Kjellerstrand)
  • Example scripts
    • Additions
      • Added word square puzzle. (minor , contributed by Håkan Kjellerstrand)
      • Added crossword puzzle (thanks to Peter Van Beek for providing access to some crossword grids). (minor, thanks to Peter Van Beek)
    • Other changes
      • Sudoku and GraphColor now uses smallest size over accumulated failure count (AFC) as the default heuristic. (minor)
    • Bug fixes
      • The Nonogram example no longer crashes on empty lines. (minor, thanks to Jan Wolter)
  • Gecode/FlatZinc
    • Bug fixes
      • Fixed statistics output (number of solutions was sometimes wrong). (minor)
  • General
    • Other changes
      • Now posting of propagators and branchers take an object of class Home (rather than just a space) that can carry additional information relevant for posting (for example, groups and accumulated failure information). Models do not need to be changed in any way! (major) :Details: While models require no change, propagator rewriting does (it will work but some information might be lost). Instead of writing something like GECODE_REWRITE(*this,SomeProp::post(home,...)); where *this is the propagator to be rewritten, you should now write GECODE_REWRITE(*this,SomeProp::post(home(*this),...)); By this, the new propagator will inherit information from *this (in particular the accumulated failure count).

Some notes about the new models

I can now proudly announce that my model word_square.cpp has been included (as the model word-square.cpp), or rather a much improved version of it. Christian Schulte and Mikael Lagerkvist has done a great job of making it very effective with two different approaches. For example, it presents all 17 word squares of order 7 (i.e. a 7x7 square) given the included word list in about a minute on my machine. Some of the fixed bugs above was found when we tested this model.

The other new model, crossword.cpp, solves crossword puzzles given a problem template and a word list. Included are 72 problems taken from different sources, basically a standard benchmark repository. This is also an impressive demonstration of an effective solution of a standard (and fun) problem using a general system.

October 31, 2009

Some new models, etc

The readers that follows me on Twitter (hakankj) have probably already has seen this, but here follows a list of models, etc, not yet blogged about.

MiniZinc: Different models

The following four models are translation of my Comet models: And here are some new models:

Nonogram related

A large 40x50 Nonogram problem instance in MiniZinc: nonogram_stack_overflow.dzn to be used with nonogram_create_automaton2.mzn model. The problem was mentioned in Stack Overflow thread Solving Nonograms (Picross). It is solved in under 1 second and 0 failures.

Today my Nonogram model nonogram_create_automaton2.mzn was included in the great Survey of Paint-by-Number Puzzle Solvers (created by Jan Wolter).

My MiniZinc model is described and analyzed here. I'm not at all surprised that it's slower compared to the other solvers; it was quite expected.

Some comments:
Assessment: Slow on more complex puzzles.

...

Results: Run times are not really all that impressive, especially since it is only looking for one solution, not for two like most of the other programs reviewed here. I don't know what the differences are between this and Lagerkvist's system, but this seems much slower in all cases, even though both are ultimately being run in the Gecode library.
Update 2009-11-01
I later realized two things:

1) That the mzn2fzn translator did not use the -G gecode flag, which means that Gecode/FlatZinc uses a decomposition of the regular constraint instead of Gecode's built in, which is really the heart in this model. The model is basically two things: build an automata for a pattern and then run regular on it.
2) When Jan compiled Gecode, he set off all optmization for comparison reason. This is quite unfortunate since Gecode is crafted with knowledge of the specific optimizations.

I there have run all problems by myself and see how well it would done (at least the ballpart figure) when using an "normal" optimized Gecode and with the -G gecode flag for mzn2fzn.

Explanation of the values:
Problem: Name of the problem instance
Runtime: The value of runtime from Gecode/FlatZinc solver
Solvetime: The value of solvetime from Gecode/FlatZinc solver
Failures: Number of failures:
Total time: The Unix time for running the complete problem, including the time of mzn2fzn (which was not included in the benchmark).
A "--" means that a solution was not reached in 10 minutes.
Problem Runtime Solvetime Failures Total time
Dancer 0.002 0.000 0 1.327s
Cat 0.009 0.002 0 0.965s
Skid 0.012 0.006 13 0.660s
Bucks 0.015 0.004 3 0.866s
Edge 0.008 0.005 25 0.447s
Smoke 0.011 0.004 5 0.963s
Knot 0.026 0.006 0 1.450s
Swing 0.059 0.012 0 1.028s
Mum 0.120 0.093 20 1.811s
Tragic 6:24.273 6:23.607 394841 6:28,10
Merka -- -- -- --
Petro 2.571 2.545 1738 4.071s
M&M 0.591 0.510 89 1.961s
Signed 1.074 1.004 929 2.461s
Hot -- -- -- --
Flag -- [lazy solver: 2 solutions in 10 seconds] -- -- --
Lion -- -- -- --

It's interesting to note that the Lazy solver finds some solutions quite fast for the "Flag" problem. However, there where no other big differences compared to Gecode/FlatZinc. I also tested the problems with JaCoP's FlatZinc solver which solved the problems in about the same time as Gecode/FlatZinc with no dramatic differences.

As mentioned above, the exact values is not really comparable to the benchmark values. But it should give an indication of the result when using -G gecode and a "normal" optimized Gecode.

Unfortunately, I cannot link to the specific models due to copyright issues, but they can all be downloaded from the page Web Paint-by-Number Puzzle Export.

(End update)

Update 2009-11-02
Jan Wolter now have rerun the tests of my solver with the -G gecode option, and the time is much more like mine in the table above. The analysis is quite different with an assessment of Pretty decent, and the following under Result:
...

When comparing this to other solvers, it's important to note that nonogram_create_automaton2.mzn contains only about 100 lines of code. From Kjellerstrand's blog, it is obvious that these 100 lines are the result of a lot of thought and experimentation, and a lot of experience with MiniZinc, but the Olšák solver is something like 2,500 lines and pbnsolve approaches 7,000. If you tried turning this into a fully useful tool rather than a technology demo, with input file parsers and such, it would get a lot bigger, but clearly the CSP approach has big advantages.

(End update 2)

Also, the Gecode nonogram solver is included in the survey: called Lagerkvist. I'm not sure when it was added to the survey. It use the latest version of Gecode, so it must have been quite recent.

Some comments:
Assessment: Pretty decent.

...

Puzzles with blank lines seem to cause the program to crash with a segmentation fault.

Otherwise it performs quite well. There seems to be about 0.02 seconds of overhead in all runs, so even simple puzzles take at least that long to run. Aside from that, it generally outperforms the Wilk solver. It's not quite in the first rank, especially considering that it was only finding one solution, not checking for uniqueness, but it's still pretty darn good.

...
I mailed Jan today about the -solutions n options in the Gecode related solvers (he tested my MiniZinc model with Gecode/FlatZinc), as well as some other comments about my model. Also, I will download the problems tested and play with them.

Tailor/Essence': Zebra puzzle

Suddenly I realized that there where no Zebra problem, neither at Andrea's or here. So here it is: zebra.eprime: Zebra puzzle.

Gecode: Words square problem

See Gecode: Modeling with Element for matrices -- revisited for an earlier discussion of this problem.

Thanks to Christian Schulte my Word square model is now much faster: word_square2.cpp.

From the comment in the model:
Christian Schulte suggested 
using branch strategy INT_VAL_SPLIT_MIN 
instead of INT_VAL_MAX .
This made an improvement for size 7 from
322 failures to 42 and from 1:16 minutes
to 10 seconds (for 1 solution).
 
Now it manage to solve a size 8 in a reasonable 
time (1:33 minutes and 1018 failures):
  abastral
  bichrome
  achiotes
  shippers
  troponin
  rotenone
  amerinds
  lessness
 
But wait, there's more!

In the SVN trunk version of Gecode, there is now a version of this model: examples/word-square.cpp where Christian Schulte and Mikael Zayenz Lagerkvist has done a great job of improving it (using a slighly smaller word list, though). It solves a size 8 problem in about 14 seconds. There is also two different approaches with different strengths, etc. I have great hopes that it will improved even further...

October 25, 2009

Comet version 2.0.1 released

Comet version 2.0.1 has been released (this Wednesday). It can be downloaded here.

From the release notes:

Version 2.0.1 is a revision on 2.0 that includes several enhancements and bug fixes. The Comet Documentation is now also shipping as a separate installer for each OS so that it can be updated on a more frequent basis, independent of the release of the Comet System. The installers for the Comet System include the latest version of the documentation.

...

This release has the following new features [with bug tracking id where applicable]:
Major additions since 2.0:

* New FlexibleUnaryResource that allows to specify different durations for activity depending on which resource the activity is executed.
* New CP constraints over set variables, including: SetGlobalCardinality, allDisjoint, atleastIntersection,isExcluded, and unionOf.

Other improvements:

* Added ability to change the row of a VisualActivity in gant charts
* Added ability to use set variables in LNS[#395]
* New section on soft global constraints in CP in the manual
* Added tooltips and setColor to VisualActivities
* Added events to var<:CP>{bool} [#365]
* Class documentation now shows inherited methods [#373,#393]
* Improved error messages for set atRank method [#402]
* Better error messages when no solution is found in CP [#401]
* Improved error messages in LNS with no initial solution [#400]
* Improved error messages in DBStatement::setParam with index out of range [#385]
* Added methods excludes and requires to var{set} [#380]
* Added getValue method to var{float} [#356]
* Allow use of var{int} in by clause of a forall [#374]
* Improved error handling in atmost constraint [#362]
* Tooltips on VisualRectangle, VisualCircle, VisualLine, VisualText, and VisualTextTable. [#363]
* Improved error handling in VisualDisplayTable [#359]

Bug Fixes:

* Fixed rank and filtering algorithms of AlternativeUnaryResource[#381]
* Fixed rank on UnaryResource with optional activities[#239]
* Fixed exactIntersection constraint.[#397]
* Fixed bug in random seeds leading to correlated UniformDistribution objects [#384]
* Fixed bug in Graphical Debugger that caused the last line to not be shown [#361]
* Fixed element constraint on 3D matrices[#369]
* Fixed array out of bounds error on VisualDisplayTable [#370]
* Fixed rounding problem when using cotln with SCIP library [#398]
* Implemented getSwapDelta and getMultipleAssignDelta on DisequationSystem [#318]
* Fixed memory leaks in CP library [#293]
* Fixed bug in breakable activities [#388]
* Fixed segfault when returning an int[][] from a user plugin [#386]
* Implemented casting var{float} to an int [#355]


For more about Comet, see My Comet page.

Recent comments

Twitter tweets

Powered by
Movable Type 3.2