" /> My Constraint Programming Blog: March 2009 Archives


March 28, 2009

de Bruijn sequences in Gecode (and other systems)

First, the Gecode model of de Bruijn sequences that is referred to below: debruijn.cpp.

Update 2009-03-30: Thanks to Mikael Zayenz Lagerkvist, I have fixed/updated/added some things. These comments are last in this post.

Introduction

I have been fascinated by de Bruijn sequences (Wikipedia) for years, and have made some web-based programs for generating them. An explanation of the terms "classic" and "arbitrary" (which are not standard vocabulary):
  • "classic" de Bruijn sequence: the sequence length is base^n (n is the number of bits),
  • "arbitrary" sequence: where the sequence length is arbitrary.
Modeling de Bruijn sequences is also one of the "standard" problems I always model when learning a new constraint programming system. See below for the implementation in other systems.

Principle used

The basic principle in generating de Bruijn sequences used in this model is the following. Note: The names for the parameters base, n, m, and bin_code are perhaps unfortunate and confusing, but are kept since they are used in all my other implementations (see below).

Given:
  • a base (parameter base)
  • number of bits (n)
  • length of sequence (m)
Then:
  • make a list of distinct integers in the range 0..(base^n)-1. This array is called x in the model. These are the nodes in a de Bruijn graph. The goal of this model is to find nodes that really are "de Bruijn nodes".
  • calculate the "bit sequence" (in base base) for each integer. This is a matrix with m rows and n columns, here called binary.
  • apply the de Bruijn condition to each pair of consecutive integers, i.e. the first elements of binary[r] are the same as the last elements of binary[r-1], and also "around the corner" (the last row wraps around to the first).
  • the de Bruijn sequence is then the first element in each row, here called bin_code.
For "classic" de Bruin sequences this is really overkill, but it makes it possible to calculate sequences of arbitrary lengths.

Here is a simple run of the program with the following command line (see below for a discussion of the options):

debruijn.exe -solutions 1 -base 13 -n 4 -m 52 -print-matrix 1 -int-var smallest -int-val indomain-min

Result:
DeBruijn
base: 13
number of bits (n): 4
length of sequence (m): 52

x:{0, 1, 13, 169, 2197, 2, 26, 338, 4394, 3, 39, 507, 6591, 4, 52, 676, 8788, 5, 65, 845, 10985, 6, 78, 1014, 13182, 7, 91, 1183, 15379, 8, 104, 1352, 17576, 9, 117, 1521, 19773, 10, 130, 1690, 21970, 11, 143, 1859, 24167, 12, 156, 2029, 26389, 325, 4225, 26364}
de Bruijn sequence{0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 4, 0, 0, 0, 5, 0, 0, 0, 6, 0, 0, 0, 7, 0, 0, 0, 8, 0, 0, 0, 9, 0, 0, 0, 10, 0, 0, 0, 11, 0, 0, 0, 12, 0, 1, 12}
binary matrix:
0 0 0 0
0 0 0 1
0 0 1 0
...
0 1 12 0
1 12 0 0
12 0 0 0


Initial
        propagators:  312
        branchings:   3

Summary
        runtime:      0.020 (20.000000 ms)
        solutions:    1
        propagations: 1119
        nodes:        16
        failures:     0
        peak depth:   15
        peak memory:  100 KB
For calculating the standard "keypad problem" (i.e. base = 10, n = 4, and m = 10000), this model takes too much memory on my computer (2 Gb RAM, a Linux machine with a 3.4 GHz dual core). There are probably some things that could make this model more efficient. Update: See below for how to handle this.

However, calculating the same base and n with a length of 1000 is fast:
Initial
        propagators:  6000
        branchings:   3

Summary
        runtime:      1.660 (1660.000000 ms)
        solutions:    1
        propagations: 138390
        nodes:        253
        failures:     0
        peak depth:   252
        peak memory:  42651 KB

Findings

Here are some findings when modeling this Gecode model.

* Matrix<IntVarArray>

One of the things that took the longest time to get right was the "channeling" between the integer array x and the matrix binary. The reason was simply that I hadn't read the documentation on how to use the Matrix wrapper: the order in the call is columns, rows, not the other way around as I had assumed. Here is how to use it (a simple example):

...
IntVarArray some_array(*this, number_of_columns*number_of_rows, 0, 100);
Matrix<IntVarArray> matrix_version(some_array, number_of_columns, number_of_rows);

...


For accessing the matrix, we - of course - use the same order:
matrix_version(column, row).
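
Here is a minimal compilable sketch of the wrapper. Note that it is written against a recent Gecode release, so some details (e.g. that copy/update no longer take a share argument, and that the branching selectors are functions) differ from the 2009-era API used in debruijn.cpp:

#include <gecode/int.hh>
#include <gecode/minimodel.hh>
#include <gecode/search.hh>
using namespace Gecode;

class MatrixDemo : public Space {
protected:
  static const int cols = 3, rows = 4;
  IntVarArray x;
public:
  MatrixDemo(void) : x(*this, cols*rows, 0, 9) {
    // The Matrix wrapper takes the width (columns) first, then the height (rows).
    Matrix<IntVarArray> m(x, cols, rows);
    // Element access uses the same order: (column, row).
    rel(*this, m(0,0), IRT_EQ, 1);
    // Rows and columns can be sliced out and used as argument arrays.
    for (int r = 0; r < rows; r++)
      distinct(*this, m.row(r));
    branch(*this, x, INT_VAR_NONE(), INT_VAL_MIN());
  }
  MatrixDemo(MatrixDemo& s) : Space(s) { x.update(*this, s.x); }
  virtual Space* copy(void) { return new MatrixDemo(*this); }
};

int main(void) {
  MatrixDemo* model = new MatrixDemo;
  DFS<MatrixDemo> dfs(model);
  delete model;
  if (MatrixDemo* s = dfs.next()) delete s; // first solution found
  return 0;
}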

* Options

This model has a lot of different parameters to play with:
  • base
  • n
  • m
  • print_matrix (printing the binary matrix)
  • int-var IntVarBranch
  • int-val IntValBranch
Testing different variants via re-compilation is simply not feasible. Hence I wrote a subclass of Options for this, with initial help from Mikael Zayenz Lagerkvist. The most interesting part is handling the different branching options (see the model debruijn.cpp for how it is implemented). The default branching option is not enough, since both IntVarBranch and IntValBranch should be changeable via the command line. (A simpler version using the branching option is in donald_gerald_robert.cpp.)
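
To give an idea of what the -int-var and -int-val handling amounts to, here is a hedged sketch (it is not the actual code in debruijn.cpp) that maps the option strings to Gecode's branching selections. The names are the Gecode 3.x enum values; in later Gecode versions they became functions such as INT_VAR_SIZE_MIN().

#include <string>
#include <gecode/int.hh>
using namespace Gecode;

// Map an -int-var option string to a variable selection for branch().
IntVarBranch int_var_branching(const std::string& s) {
  if (s == "input-order")     return INT_VAR_NONE;
  if (s == "first-fail")      return INT_VAR_SIZE_MIN;
  if (s == "anti-first-fail") return INT_VAR_SIZE_MAX;
  if (s == "smallest")        return INT_VAR_MIN_MIN;
  if (s == "largest")         return INT_VAR_MAX_MAX;
  if (s == "occurrence")      return INT_VAR_DEGREE_MAX;
  if (s == "max-regret")      return INT_VAR_REGRET_MIN_MAX;
  return INT_VAR_MIN_MIN;     // default: smallest
}

// Map an -int-val option string to a value selection in the same way.
IntValBranch int_val_branching(const std::string& s) {
  if (s == "indomain-min")    return INT_VAL_MIN;
  if (s == "indomain-max")    return INT_VAL_MAX;
  if (s == "indomain-median") return INT_VAL_MED;
  if (s == "indomain-split")  return INT_VAL_SPLIT_MIN;
  return INT_VAL_MIN;         // default: indomain-min
}

Selections obtained this way can be stored in the model (like the int_var and int_val members that show up in the cloning constructor below) and then handed to branch() when posting the branching.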

Here is the result of running the program with the -help option, showing the different options (the standard options are removed):
...
       -base (unsigned int) default: 2
                base to use
        -n (unsigned int) default: 3
                number of bits to use.
        -m (unsigned int) default: 0
                length of the sequence.
        -print-matrix (unsigned int) default: 0
                1 prints the binary matrix.
        -int-var (input-order, first-fail, anti-first-fail, smallest, largest, occurrence, max-regret) default: smallest
                options for IntVarBranch
                  input-order: use VAR_NONE
                  first-fail: use VAR_SIZE_MIN
                  anti-first-fail: use VAR_SIZE_MAX
                  smallest: use VAR_MIN_MIN
                  largest: use VAR_MAX_MAX
                  occurrence: use VAR_DEGREE_MAX
                  max-regret: use VAR_REGRET_MIN_MAX
        -int-val (indomain-min, indomain-max, indomain-median, indomain-split) default: indomain-min
                options for IntValBranch
                  indomain-min: use VAL_MIN
                  indomain-max: use VAL_MAX
                  indomain-median: use VAL_MED
                  indomain-split: use VAL_SPLIT_MIN

Some notes and other findings about this:
  • If no -m (the sequence length) is stated, it defaults to base^n. This is done in the class DeBruijnOptions.
  • The names used in -int-var and -int-val are taken from the mappings that Gecode/FlatZinc uses for the MiniZinc branching options.
  • It seems that these extra options must be stated after all of the builtin options from the (standard) Options class.
  • Another finding: when adding an option that you want to use in (say) the print function, don't forget to add it to the cloning constructor. Here is the constructor:
      //
      // Constructor for cloning s
      //
      DeBruijn(bool share, DeBruijn& s) : Example(share,s),
                                          n(s.n), 
                                          m(s.m),
                                          base(s.base),
                                          print_matrix(s.print_matrix),
                                          int_var(s.int_var),
                                          int_val(s.int_val)
      {
        x.update(*this, share, s.x);
        binary.update(*this, share, s.binary);
        bin_code.update(*this, share, s.bin_code);
        // gcc.update(*this, share, s.gcc);
      }
  • In the model, there is a (commented out) extra constraint which states that the occurrences of the different values in the de Bruijn sequence should be the same, using the global cardinality constraint count. I will investigate how to make this constraint optional in the way I want. Update: See below for more about this.
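
The idea behind that constraint (and behind the -same-occurrences option added in the update below) is roughly the following. This is a hedged sketch using current Gecode signatures, not the exact code in debruijn.cpp:

#include <gecode/int.hh>
using namespace Gecode;

// gcc[v] is the number of occurrences of the value v (0..base-1) in the
// de Bruijn sequence bin_code; gcc must have one entry per value.
// Optionally, all the cardinalities are forced to be equal.
void post_gcc(Space& home, const IntVarArgs& bin_code,
              const IntVarArgs& gcc, bool same_occurrences) {
  count(home, bin_code, gcc);   // global cardinality constraint
  if (same_occurrences)
    rel(home, gcc, IRT_EQ);     // all counts equal
}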

Other implementations

Here are my other constraint programming implementations of de Bruijn sequences:

See also

Update 2009-03-30
Thanks to Mikael Zayenz Lagerkvist there is a new version of debruijn.cpp. The improvements are:
  • It now compiles cleanly with gcc -Wall. For some reason I had forgotten to include that option in my Makefile.
  • There is a new option -print_x which prints x, i.e. the array of integers in the de Bruijn graph. The default is now off (0).
  • This version always calculates the occurrences of elements in the de Bruijn sequence (the bin_code array), which is handled by a new IntVarArray variable gcc (for "global cardinality constraint").

    There is also a new option, -same-occurrences, which - when set to 1 - requires that all values in the de Bruijn sequence occur the same number of times.

    An example: here is the problem with base=13, n=4, and m=52, run with -same-occurrences 1:
    de Bruijn sequence: {0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 12, 12, 12, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1}
    gcc: {4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4}
    
    
    Initial
            propagators:  314
            branchings:   4
    
    Summary
            runtime:      0.040 (40.000000 ms)
            solutions:    1
            propagations: 2516
            nodes:        46
            failures:     0
            peak depth:   45
            peak memory:  254 KB
    
  • Exhausted memory

    I wrote above that this problem:
    base=10
    n=4 (number of bits) and
    m = 10000 (length)
    required too much memory.

    This was fixed by increasing the options -c-d ("recomputation commit distance") and -a-d ("recomputation adaption distance"). Mikael's suggestion was to use the following values:
    -c-d 256 -a-d 8

    With these values set, the problem took about 5:14 minutes on my computer (3.4 GHz dual core, Mandriva Linux and 2 Gb RAM) and required only about 400 Mb of memory (before, I had to abort since the memory was exhausted). Here is the result of running debruijn.exe -base 10 -n 4 -c-d 256 -a-d 8:
    Initial
            propagators:  60002
            branchings:   4
    
    Summary
            runtime:      5:12.300 (312300.000000 ms)
            solutions:    1
            propagations: 12793015
            nodes:        2529
            failures:     0
            peak depth:   2528
            peak memory:  346288 KB
    debruijn.exe -base 10 -n 4 -c-d 256 -a-d 8  310,60s user 1,92s system 99% cpu 5:13,53 total
    

    I also tested with debruijn.exe -base 10 -n 4 -c-d 1256 -a-d 18, i.e. by just adding a "1" before each of the -c-d and -a-d values. The result was:
    Initial
            propagators:  60002
            branchings:   4
    
    Summary
            runtime:      5:05.150 (305150.000000 ms)
            solutions:    1
            propagations: 12793015
            nodes:        2529
            failures:     0
            peak depth:   2528
            peak memory:  61039 KB
    debruijn.exe -base 10 -n 4 -c-d 1256 -a-d 18  303,65s user 1,66s system 99% cpu 5:05,62 total
    
    The interesting part is that it required just about 20% of the memory. I have to study these things more.
    The two options -c-d and -a-d are documented here.

    Later note: See more about these options in Mikael's comment below (which I didn't see when writing the update).


As usual, thanks Mikael!

March 27, 2009

Gecode version 3.0.2 released

Gecode version 3.0.2 is released. Download here.

From the changelog:

Changes in Version 3.0.2 (2009-03-26)
This is a bug fix release fixing two more embarrassing bugs. However, this time we redesigned our tests carefully such that they cover all changes and optimizations done for the transition from 2.2.0 to 3.0.*. Please update asap.


  • Finite domain integers
    • Bug fixes
      • Fixed bug in optimization of extensional constraints with DFAs (hard to reproduce, almost impossible). (major)

    • Performance improvements
      • Reoptimized element with integer values and created bizarre testcases. (minor)

  • Minimal modelling support
    • Bug fixes
      • Fixed bug in posting of Boolean expressions including reified linear expressions. Again, that escaped our testsuite (also fixed). (major, thanks to Gustavo Gutierrez)

  • Example scripts
    • Bug fixes
      • The radiotherapy example was missing in the Makefile. (minor, thanks to Håkan Kjellerstrand)

  • Gist
    • Other changes
      • A separator is printed between solutions in the TextInspector. (minor)


March 26, 2009

My first Gecode models

Gecode has long been an inspiration for many of my models: I have borrowed sample problems from its vast collection of examples, e.g. from Gecode's Minesweeper in the MiniZinc model minesweeper.mzn, from Gecode's Nonogram in the Comet model nonogram_regular.co and so on.

Since the Gecode models are very fast (one reason is that they are written in C++), they have also been used as a benchmark for my other models. For example, see the discussion about performance in Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second. (The Gecode team later incorporated one of these findings in their Nonogram model.)

Gecode (and also the MiniZinc solver Gecode/FlatZinc) has a feature I like very much: it prints out a lot of interesting statistics (note: almost all other systems I have used show some sort of statistics). Here is an example of the output from the 1000-queens model (from the Gecode distribution):


Initial
propagators: 3
branchings: 1

Summary
runtime: 0.870 (870.000000 ms)
solutions: 1
propagations: 3006
nodes: 996
failures: 2
peak depth: 993
peak memory: 77721 KB

Using the Example framework (see below) makes it easy to interactively test different strategies etc. via the command line, since there are a lot of default options to use. Here are the options for the queens model:

Options for Queens:
        -help, --help, -?
                print this help message
        -propagation (binary, mixed, distinct) default: distinct
                propagation variants
                  binary: only binary disequality constraints
                  mixed: single distinct and binary disequality constraints
                  distinct: three distinct constraints
        -icl (def, val, bnd, dom) default: def
                integer consistency level
        -solutions (unsigned int) default: 1
                number of solutions (0 = all)
        -c-d (unsigned int) default: 8
                recomputation commit distance
        -a-d (unsigned int) default: 2
                recomputation adaption distance
        -node (unsigned int) default: 0
                node cutoff (0 = none, solution mode)
        -fail (unsigned int) default: 0
                failure cutoff (0 = none, solution mode)
        -time (unsigned int) default: 0
                time (in ms) cutoff (0 = none, solution mode)
        -mode (solution, time, stat, gist) default: solution
                how to execute example
        -iterations (unsigned int) default: 500
                iterations per sample (time mode)
        -samples (unsigned int) default: 1
                how many samples (time mode)
        (unsigned int) default: 100
                which version/size for example
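
All of these options come from the framework; a model only adds its own. Roughly, the main() of such a script looks like the sketch below. Note that this is written against the current Gecode Driver module (where the Example class has become Script) and the Demo class is just a stand-in, not the actual Queens script:

#include <iostream>
#include <gecode/driver.hh>
#include <gecode/int.hh>
using namespace Gecode;

// A stand-in model: opt.size() pairwise distinct variables.
class Demo : public Script {
protected:
  IntVarArray x;
public:
  Demo(const SizeOptions& opt)
    : Script(opt), x(*this, opt.size(), 0, opt.size()-1) {
    distinct(*this, x);
    branch(*this, x, INT_VAR_SIZE_MIN(), INT_VAL_MIN());
  }
  Demo(Demo& s) : Script(s) { x.update(*this, s.x); }
  virtual Space* copy(void) { return new Demo(*this); }
  virtual void print(std::ostream& os) const { os << x << std::endl; }
};

int main(int argc, char* argv[]) {
  SizeOptions opt("Demo");
  opt.size(8);            // default for the trailing "(unsigned int)" option
  opt.solutions(1);
  opt.parse(argc, argv);  // -mode, -solutions, -c-d, -a-d, ... come for free
  Script::run<Demo, DFS, SizeOptions>(opt);
  return 0;
}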

In Gecode version 3.0.1 and Gecode/FlatZinc 1.5 released I mentioned the Gist GUI for graphically exploring (and debugging) the search tree.


My Gecode models


Well, this week I started to program some Gecode models. I have created a special page for them: My Gecode page which - right now - contains the following models:

  • coins_grid.cpp: Placing coins on a grid (Tony Hurlimann)
    Note: Constraint programming systems are not very good at this problem (at least not my models). Using integer programming is much faster; see coins_grid.co, which uses Comet's MIP solver.

  • diet.cpp: simple diet problem (operations research)

  • donald_gerald_robert.cpp: DONALD+GERALD=ROBERT, an experiment in "tweaking".
    Note: Here I experiment in different search strategies, different branchings etc.

  • least_diff.cpp: Least diff problem, an alphametic puzzle: minimize the difference ABCDE-FGHIJ where A..J are integers 0..9 and all different.

  • map.cpp: Simple coloring problem (Van Hentenryck)

  • quasigroup_completion.cpp: Quasigroup Completion

  • send_more_money_any_base.cpp: SEND+MORE=MONEY in any base

  • seseman.cpp: simple recreational mathematics puzzle. See the CGI-program Seseman convent problem

  • seseman2.cpp: A more general model of the Seseman problem (using a matrix for representing the problem)

  • survo_puzzle.cpp: Survo puzzle (with just one hardcoded example)

  • xkcd.cpp: XKCD's knapsack problem. See the xkcd strip for the problem


Some notes about the models


These are all problems I tend to start with when learning constraint programming systems: they are all simple and different in some way or another. (Compare with other implementations in My MiniZinc page, My JaCoP page, My Choco page, My Gecode/R page, and My Comet page.)

My other implementations of Quasigroup Completion and the Survo puzzle usually have some data/problem files. Well, this time I was kind of lazy and just hard coded an example or two. Sorry about that.

As you may notice, my C++ is quite rusty, so some things could probably have been coded better. And to quote myself: I tend to program more for readability than elegance, which makes my programs sometimes look quite boring....


Example framework
The models use the Example framework mentioned above, which is the framework that all Gecode examples use. Unfortunately it is not completely incorporated in the Minimodel framework and is not mentioned in the (very good) introduction Modeling with Gecode (PDF). But since all Gecode's examples use the framework, almost all the questions I have had so far could be solved by just browsing the examples, with just a glance or two at the online documentation.


See also


* This blog's Gecode category

Gecode site
* Gecode

Download
* Download

Documentation
* Documentation
* Publications
* Modeling with Gecode (PDF)
* The great collection of examples

Mailing list etc
* Community

Interfaces to other systems
* Interfaces

March 24, 2009

Gecode version 3.0.1 and Gecode/FlatZinc 1.5 released

Version 3.0.1 of Gecode is released. It contains mostly bug fixes:

This is a bug fix release fixing an embarrassing bug in reified Boolean linear constraints.
  • Finite domain integers

    • Bug fixes
      • IntSetArgs no longer inherit from PrimArgArray, which was wrong as IntSet is no primitive type and hence does not support vararg initializers. (minor)
      • Fixed bug in reified Boolean linear constraints (an optimization is currently disabled and will be active in the next release: the optimization was incorrect and was never tested). (major, thanks to Alberto Delgado)


  • Example scripts
    • Additions
    • Bug fixes
      • The examples now pass the c-d and a-d command line options correctly to Gist. (minor)
      • The Steel Mill Slab Design example had two bugs in the search heuristic and a missing redundant constraint. (minor, bugzilla entry, thanks to Chris Mears)




Gecode/FlatZinc has been updated to version 1.5. The bug fix here is very interesting (and exciting) since Gist now also works with Gecode/FlatZinc.

Gist in Gecode and Gecode/FlatZinc
Gist for Gecode has been around for some time, and was officially included in the distribution in version 3.0.0. In Modeling with Gecode (PDF) there is a section on how to program and use Gist.

Stable Marriage Problem.
As a first example of Gist with Gecode/FlatZinc (i.e. MiniZinc models), stable_marriage.pdf is a PDF file which shows the full search tree for the MiniZinc model stable_marriage.mzn. The green nodes are solutions, the blue nodes are choice points, the red squares are failures, and the big red triangles are failed subtrees.

Running the model with fz -mode stats stable_marriage.fzn gives the following result.

wife : [7, 5, 9, 8, 3, 6, 1, 4, 2]
husband: [7, 9, 5, 8, 2, 6, 1, 4, 3]
----------
wife : [6, 5, 9, 8, 3, 7, 1, 4, 2]
husband: [7, 9, 5, 8, 2, 1, 6, 4, 3]
----------
wife : [6, 4, 9, 8, 3, 7, 1, 5, 2]
husband: [7, 9, 5, 2, 8, 1, 6, 4, 3]
----------
wife : [6, 1, 4, 8, 5, 9, 3, 2, 7]
husband: [2, 8, 7, 3, 5, 1, 9, 4, 6]
----------
wife : [6, 4, 1, 8, 5, 7, 3, 2, 9]
husband: [3, 8, 7, 2, 5, 1, 6, 4, 9]
----------
wife : [6, 1, 4, 8, 5, 7, 3, 2, 9]
husband: [2, 8, 7, 3, 5, 1, 6, 4, 9]
----------
==========

runtime: 10
solutions: 6
propagators: 346
propagations: 12426
nodes: 67
failures: 28
peak depth: 8
peak memory: 132 KB


Also see My MiniZinc Page for MiniZinc models which may be used with Gecode/FlatZinc.

Comet version 1.2 released

Comet version 1.2 is released. From the release notes:

New Features
This release has the following new features [with bug tracking id where applicable]:

* New linear programming library (cotln) with interfaces to LPSolve and COINOR-CLP.
* New database library (cotdb) using ODBC.
* Addition of events to Boolean CP variables [#112].
* Implementation mouse tracking on Gtk [#33].
* Improved error messages for semantic errors [#169, 160, 141]
* Allow filtering inside of a collect statement [#153].
* Addition of FunctionFloatExpr [#151].
* Support for classes as keys in dictionaries [#143].
* Added new mark method for 2D plots [#127].
* Improved handling of arrays of arrays [#138].
* Implemented return statement in void functions [#142].
* The compiler now gives an error if a class has no constructors [#156].
* The compiler gives an error if a non-void method has no return statement [#102].
* Unified syntax of select, selectFirst, and selectCircular [#154].
* Unified syntax for arrays and matrices [#140].
* Unified syntax of queues and stacks [#139].
* Support for with atomic in CP search [#99, 121, 123].
* Support for sqrt in CP objectives [#135].
* Support for getSnapshot on float CP variables. [#131].
* Support for swapping operator :=: on int and tuple set variables in LS [#101].
* Added getLocalSolver on var{set{int}}, var{bool}, and var{float} [#119].

Bug Fixes
This release fixes the following bugs [with bug tracking id where applicable]:

* Fixed memory leaks in CP library [#148].
* Allow assignments to variables stored in a dict using the operator := [#172].
* Packaging of additional library .cob files [#159,#167].
* Fixes to implementation of built-in stack [#161].
* Fixed implementation of ofstream on Mac OS X [#150].
* Allow casting to float inside of constraint expressions [#149].
* Improved handling of float CP variables with large domains [#147].
* The operator := now works on Float objects [#155].
* Improved consistency in floating point operations [#122].
* Added evalFloat() to var{float} [#129].
* Improved handling of failed constraints outside explore constructs [#130].
* Implemented printing of arrays of ranges [#128].
* Fixed race condition in parallel loops resulting in deadlock [#124].
* Fixed bug in initial state of queues [#180].
* Fixed performance degradation for certain local search problems [#181].

Comet can be downloaded from Dynadec.

For more info about Comet, see My Comet page.

March 22, 2009

Gecode: Guido Tack 'Constraint Propagation - Models, Techniques, Implementation'

Guido Tack is a member of the Gecode team. His doctoral dissertation Constraint Propagation - Models, Techniques, Implementation (2009, as PDF) discusses constraint propagation in general, and gives many examples of how propagation works in Gecode.

Abstract:


This dissertation presents the design of a propagation-based constraint solver. The design is based on models that span several levels of abstraction, ranging from a mathematical foundation, to a high-level implementation architecture, to concrete data structures and algorithms. This principled design approach results in a well-understood, correct, modular, and efficient implementation.

The core of the developed architecture is the propagation kernel. It provides the propagation infrastructure and is thus crucial for correctness and efficiency of the solver. Based on a mathematical model as well as a careful design of the employed algorithms and data structures, the presented architecture results in an efficient and domain-independent kernel. Constraints are realized by propagators, and implementing a propagator is a challenging, error-prone, and time-consuming task. A practically useful solver must however provide a comprehensive propagator library. This dissertation introduces two techniques for automatically deriving correct and efficient propagators. Views generalize variables and are used to derive propagators from existing propagators. For constraints over set variables, propagators are derived from formal constraint specifications.

The presented techniques are the basis of Gecode, a production-quality, highly efficient, and widely deployed constraint solver. Gecode is the empirical evidence for success and relevance of the principled design approach of this dissertation.


Some comments
My interest right now is to step up from "just modeling" with decompositions to understanding more about (and hopefully also writing) "real" propagators, such as global constraints. Programming propagators is not possible in the MiniZinc language, the system I've used most during the last year. Both Comet and Gecode have support for this, as well as the Java-based systems Choco and JaCoP. Until now, I've only used Gecode's examples as an inspiration for and comparison to my own models, but I plan to start modeling in Gecode as well.

In view of this, Constraint Propagation - Models, Techniques, Implementation is a perfect book, explaining what propagation is and how it is implemented in Gecode. It is therefore great for understanding how Gecode works. It is also a modern introduction to the state-of-the-art techniques for implementing propagators, which is one of the most important parts of constraint programming.

Although quite theoretical in parts (it is a dissertation in computer science, after all), the book has a nice pedagogical pace and explains all core concepts very well, often with examples. Also, the proofs - there are a lot of them - are often explained in "normal prose" before or after the proof itself, which makes it easy to follow the reasoning. (I didn't follow all the proofs, meaning that for some proofs I either didn't understand them or just glanced through them.)

It was very interesting to read the reasons, supported with benchmarks, behind some of the design decisions that were made in Gecode.

I enjoyed the "Related work" part at the end of a chapter or section, which related the discussed concepts to former research. Also, I liked that the Bibliography shows to what page a specific work is refered in the book.

I happened to read Modeling with Gecode just before Constraint Propagation - Models, Techniques, Implementation, and that was a good thing: Modeling with Gecode shows how to model in Gecode, so the core concepts were already laid out before the dissertation. (The Documentation page shows that a document "Programming with Gecode" is under preparation, which I look forward to reading.)

Last, thanks to Mikael Zayenz Lagerkvist for recommending the book.

March 18, 2009

SweConsNet Workshop 2009

SweConsNet (the Network for Sweden-based researchers and practitioners of Constraint programming) arranges the following:


SweConsNet Workshop 2009
(www.it.uu.se/research/group/astra/SweConsNet09)

The 8th workshop of SweConsNet,
the Network for Sweden-based researchers and practitioners
of Constraint programming

Collocated with the SAIS Workshop 2009
(www.sais.se/sais2009)
on May 27th 2009 in Linköping, Sweden

Following the previous successful workshops, it is now time for the 8th
SweConsNet workshop, which will take place at Linköping University on 27
May 2009. The collocated SAIS (Swedish AI Society) workshop continues the
day after.

The purpose of this workshop is to learn about ongoing research in
constraint programming, existing projects and products, and further
development of the network.

The workshop is open to everybody interested in the theory and practice of
constraint programming, whether based in Sweden or elsewhere.

We hope for your participation, if not a presentation of your ongoing work,
recent results, or proposal for discussion. There are no paper
submissions, reviews, or proceedings, hence recent conference/journal
papers may also be presented. For better planning, we need a statement of
intent, and desirably the title and abstract of your talk. Please email
this information as soon as possible to us, the programme chairs. A
preliminary list of presentations is at the workshop website, and several
time slots remain available.

Please forward this message to people who might be interested in this
workshop but are not yet on the SweConsNet mailing list. They can
subscribe to it by sending a message to Justin.Pearson at it.uu.se.

Cheers,
Justin Pearson and Pierre Flener


This is the program (from SweConsNet Workshop 2009)


There will be a joint invited talk with the SAIS Workshop 2009:

together with the following preliminary list of talks:
  • 10:30 - 11:00 Krzysztof Kuchcinski (Lund University): Design of Processor Accelerators with Constraints
  • 11:00 - 11:30 Mikael Zayenz Lagerkvist (KTH): Propagating Berge-acyclic propagator graphs
  • 12:30 - 13:10 Tomas Nordlander (Sintef, Norway): Nurse Rostering
  • 13:10 - 13:40 Federico Pecora (Örebro University): From Space to Smart Homes: Constraint-Based Planning for Domestic Assistance
  • 13:50 - 14:20 Håkan Kjellerstrand (Sweden): Learning Constraint Programming (MiniZinc, JaCoP, Choco, Gecode/R, Comet, Gecode): Some Lessons Learned
  • 14:20 - 14:50 Wlodek Drabent (Linköping University): Negation with the Well-Founded Semantics for CLP

I will (at least) attend the SweConsNet Workshop, and am looking forward to listening to all the interesting researchers and practitioners in constraint programming, and hopefully talking to some of them.

Update 2009-05-06
The list of talks is now updated with complete titles and time slots.

March 14, 2009

Solving Pi Day Sudoku 2009 with the global cardinality constraint

Here is an update of the Pi Day Sudoku 2009 problem.

After a suggestion from Mikael Zayenz Lagerkvist (of the Gecode team) the model now uses just the global cardinality constraint and is much faster than the model which used my own decomposition of all_different_except_0 (or rather all_different_except_Pi) and the count constraint.

Both the new MiniZinc and Comet models solve the Pi Day Sudoku 2009 problem in less than a second (compared to a couple of seconds before).

The Comet model has the following statistics for finding the first (and unique) solution.

time: 9
#choices = 6
#fail = 1
#propag = 1306

For Gecode/FlatZinc:

runtime: 20
solutions: 1
propagators: 35
propagations: 255
nodes: 3
failures: 0
peak depth: 2
peak memory: 158 KB

Compared to the former model, there are now far fewer failures.


Some more about the global cardinality constraint
Global cardinality (in Sudoku) is a way of stating how many occurrences there should be of each number per row, column or region. For the Pi Day Sudoku problem the following occurrences are needed:

Pi: 3 occurrences (coded as -1 in the model)
0: 0 occurrences (0 is a dummy value for handling the indata)
1..9: exactly one occurrence each

In MiniZinc this is coded as:

array[1..n, 1..n] of var -1..9: x;
array[-1..9] of 0..3: occ = array1d(-1..9, [3, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]);
constraint
  forall(i in 1..n) ( % rows
    check([x[i,j] | j in 1..n], occ)
  )
...

In Comet the function cardinality is used, i.e. m.post(cardinality(occ, all(j in 1..n) x[i,j]));.
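
For comparison, the same occurrence specification can be posted in Gecode with its count (global cardinality) constraint. This is just a hedged sketch using current Gecode signatures, not one of the models discussed here:

#include <gecode/int.hh>
using namespace Gecode;

// Occurrences for one row/column/region of the Pi Day puzzle:
// Pi (coded as -1) occurs exactly 3 times, the digits 1..9 exactly once each.
void post_occurrences(Space& home, const IntVarArgs& row) {
  int vals[] = {-1, 1, 2, 3, 4, 5, 6, 7, 8, 9};
  IntArgs values(10, vals);
  IntVarArgs cards(10);
  cards[0] = IntVar(home, 3, 3);       // three occurrences of Pi
  for (int i = 1; i < 10; i++)
    cards[i] = IntVar(home, 1, 1);     // exactly one of each digit
  count(home, row, cards, values);     // global cardinality constraint
}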


For more information about the global cardinality constraint, see the entry in Global Constraint Catalog.


A comment
Introductions to constraint programming nowadays tend to start with the Sudoku problem (or SEND + MORE = MONEY). These models often use the alldifferent constraint, and seldom, if ever, have I seen a presentation using global cardinality. One reason may be that alldifferent is easier to understand in an introduction; another may be that it is in general a better constraint to use for these problems.

"Vanilla" Sudoko solver using global cardinality
I have compared a few "vanilla" Sudoku problems with either alldifferent or global cardinality, but haven't done any systematic comparison. Sometimes the global cardinality approach is better, sometimes it is somewhat worse.

sudoku_gcc.mzn is a "vanilla" Sudoku solver in MiniZinc using the global cardinality constraint. And the same approach in Comet: sudoku_gcc.co. Note: These models includes just a few problems.


Also see: Gecode's Sudoku solver is very fast, and includes 88 different problems.

March 13, 2009

Gecode version 3.0.0 released

Gecode version 3.0.0 is released. This is great!

From the release notes:

This release is a major consolidation release: interfaces have been cleaned up (consistent parameter passing, consistent naming, simpler Gist interface, namespaces for operator overloading); some functionality has been extended (propagators can be non-monotonic; branchings support tie-breaking and random variable and value selection); some functionality that did not meet our quality goals has been removed (complete set variables, reflection); usage has been simplified (auto-linking on Windows, more commonly used filename extensions); important aspects have been optimized (memory management, memory usage and efficiency on 64bit machines). These cleanups were in particular necessary to make Gecode easier to document (this release is the first to be accompanied by tutorial documentation explaining how to model with Gecode).

Apart from that, many small fixes and additions. Please see below for the details.

As the interfaces have changed considerably, please consult How to Change to Gecode 3.0.0..

The new interface to MiniZinc is also released: Gecode/FlatZinc 1.4. This release has the following changes:

Version 1.4
* Updated to compile with Gecode 3.0.0
* Fixed conditional output items
* Print variables as required by the upcoming FlatZinc specification

I also recommend reading the nice introduction to modeling in Gecode, Modeling with Gecode (PDF).

March 12, 2009

MiniZinc Challenge 2008 Results

The MiniZinc Challenge 2008 was held in the summer of 2008. I don't know exactly when the results were published (probably within the last week), but here they are: MiniZinc Challenge 2008 Results.

The summary states (a higher score is better):


The results of the challenge are available here; congratulations go to GeCode on a very convincing win! The summary of results is given in this table:

Contestant Total Score
eclipse_fd 787.3
eclipse_ic 938.8
g12_fd 1655.1
gecode 3418.8

Congratulations to the Gecode team!

This result confirms my impression of the MiniZinc solvers: Gecode/flatzinc is my favorite solver, since it is often very fast and also shows important statistics. See Pi Day Sudoku 2009 for some examples of the latter.

As of this writing (2009-03-12), the links to the models and data files on the result page don't work, but all the files are included in the latest ROTD (release of the day).

By the way, two of the models are from my MiniZinc collection:
* debruijn_binary.mzn: de Bruijn sequences
* quasiGroup7.mzn: quasigroup problem 7

March 11, 2009

Pi Day Sudoku 2009

Brainfreeze Puzzles has a Pi day Sudoku competition:


Pi Day is a celebration of the number π that occurs every March 14 (3.14...). Math geeks all over the world celebrate their math-geekiness on this day with pie-eating contests, recitations of the digits of pi, and occasionally fundraisers where math faculty get hit in the face with pies. At Brainfreeze Puzzles we celebrate Pi Day - how else? - with a Sudoku puzzle.

...

Rules: Fill in the grid so that each row, column, and jigsaw region contains 1-9 exactly once and π [Pi] three times.

The puzzle is also here (as a PDF file)

Models

I programmed this puzzle in both Comet (using the constraint programming module) and MiniZinc using different solvers. The models use the same principle for solving the problem: a (home-brewed) all different except 0 constraint, plus counting exactly 3 occurrences of Pi for each row/column and region.
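
The decomposition idea itself is just pairwise implications: if neither of two cells is Pi, they must differ. Here it is as a hedged, Gecode-flavored C++ sketch using MiniModel expressions (the actual models are in Comet and MiniZinc and, as noted below, are not published here):

#include <gecode/int.hh>
#include <gecode/minimodel.hh>
using namespace Gecode;

// "All different except Pi", with Pi coded as -1: a quadratic decomposition
// with one reified implication per pair of cells.
void alldifferent_except_pi(Space& home, const IntVarArgs& x) {
  for (int i = 0; i < x.size(); i++)
    for (int j = i + 1; j < x.size(); j++)
      rel(home, ((x[i] != -1) && (x[j] != -1)) >> (x[i] != x[j]));
}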

All the solvers give the same unique answer, which of course may mean that there is some systematic error on my part.


Comet


Comet solves the problem in about 9 seconds with the following statistics, using exploreall to make sure that it is a unique solution.

time: 8821
#choices = 9999
#fail = 16231
#propag = 23168886

Somewhat surprisingly, selecting j before i in the forall loop gives an improvement from 15 seconds to 9 seconds. This was the same labeling strategy that was used for improving the Nonogram model (see Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second). Here is the labeling:

// reversing i and j gives faster solution
forall(j in 1..n, i in 1..n: !x[i,j].bound()) {
  tryall(v in V : x[i,j].memberOf(v)) by(v)
    m.label(x[i,j], v);
  onFailure
    m.diff(x[i,j], v);
}

MiniZinc and Gecode/flatzinc

Interestingly, the same improvement was observed with MiniZinc and the Gecode/flatzinc solver (the interface to Gecode). Here is the labeling function, with the same swapping of i and j (I didn't know that this mattered in MiniZinc).


solve :: int_search([x[i,j] | j, i in 1..n], "input_order", "indomain", "complete") satisfy;

With the swap, the time went from 16 seconds to 10 seconds with the following statistics:

propagators: 9460
propagations: 13461717
failures: 11221
clones: 11221
commits: 31478
peak memory: 5638 KB
mzx.pl sudoku_pi.mzn --num 2 9,41s user 0,72s system 99% cpu 10,139 total

Fzntini, FlatZinc, ECLiPSe and searching for the first solution

MiniZinc's built-in solver flatzinc and fzntini (a SAT-based solver) only show one solution and no statistics.

* fzntini solves the problem in about 13 seconds
* flatzinc takes 3 seconds.

ECLiPSe's MiniZinc solver ic takes 6.5 seconds using the same labeling as Gecode/flatzinc. No statistics are shown for this solver either.

For comparison, both Comet and Gecode/flatzinc show the first solution in 2.5 seconds.

Solution

Since it is a competition I won't show my solution or the models on/via the blog. Sorry about that.

Reference

I learned about Pi Day Sudoku from the 360 blog: Pi Day Sudoku 2009.

March 08, 2009

Comet: About 15 new Comet models

The latter part of the last week was dominated by the optimization of the Nonogram solver, which I wrote about in Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second. I would like to thank Pascal Van Hentenryck, Mikael Zayenz Lagerkvist, and Christian Schulte for help, comments, and inspiration.


Here are the other Comet models written during the last week.

Puzzles, recreation mathematics, etc

Pascal Van Hentenryck's OPL book

These models are translations from Pascal Van Hentenryck's book "The OPL Optimization Programming Language".

Operations research

  • furniture_moving_cumulative.co: Furniture moving using the global constraint cumulative (a simple scheduling problem from Marriott & Stuckey: 'Programming with constraints', page 112f). Compare with furniture_moving.co, which uses Comet's Schedule constraint.

Global constraints

Many of these models have been inspired by the great collection in the Global Constraint Catalog.

March 05, 2009

Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second

In Comet: regular constraint, a much faster Nonogram with the regular constraint, some OPL models, and more I reported the running time of the Nonogram P200 problem (one of the hardest standard problems) as 1:30 minutes with the Comet model nonogram_regular.co (using the regular constraint).

In the last paragraph of the Nonogram section, I noted: Maybe some creative labeling or a "proper" regular constraint can fasten things up...

Well, it was "creative labeling" part that made it. After some help from Pascal Van Hentenryck (one of the creators of Comet), the time for generating all the solutions of P200 - which is unique - went to about 1 second.

The statistics for the full search of P200 is shown below. Note: the unit for "time" is milliseconds and reports the solving time excluding the startup of Comet itself, i.e. 0.627 seconds.

num_solutions: 1
time: 621
#choices = 520
#fail = 1040
#propag = 1213237
comet -j2 nonogram_regular.co 1,48s user 0,10s system 99% cpu 1,582 total

For comparison, the Gecode program Nonogram displays one solution in about the same time, but searching the complete search tree takes about 45 seconds, using the following command line options:
nonogram -pk speed -solutions 0 8.

Maybe there are some more options that would make the Gecode program work faster.


What was changed?


Pascal showed me two improvements (annotated by "Pascal's improvement" in nonogram_regular.co):

The first change didn't make much difference. It was a change in the regular function. The following

forall(i in x_range) {
  cp.inside(x[i], 1..S);                        // Do this in case it's a var.
  cp.post(a[i+1] == d2[a[i], x[i]], onDomains); // Determine a[i+1].
}

was changed into

forall(i in x_range) {
  cp.post(x[i] >= 1);                           // hakank: Pascal's improvement
  cp.post(x[i] <= S);                           // hakank: Pascal's improvement
  cp.post(a[i+1] == d2[a[i], x[i]], onDomains); // Determine a[i+1].
}

The second change, however, made all the difference. In the using section, the labeling (search strategy) was originally stated as

forall(i in 1..rows, j in 1..cols : !board[i,j].bound()) {
  tryall(v in 1..2) m.post(board[i,j] == v, onDomains);
}

Pascal suggested that j in 1..cols and i in 1..rows should switch places, which made the whole improvement.

The labeling is now as follows, i.e. it starts with the smallest "dimension" (rows or cols).

if (rows * row_rule_len < cols * col_rule_len ) {
    forall(i in 1..rows, j in 1..cols: !board[i,j].bound()) {
      tryall(v in 1..2) by (-v)
        m.label(board[i,j],v);
    }
  } else {
    forall(j in 1..cols, i in 1..rows: !board[i,j].bound()) {
      tryall(v in 1..2) by (-v)
        m.label(board[i,j],v);
    }
  }


Thanks again Pascal for your help and inspiration!