Main

July 03, 2013

Gecode version 4.1.0 released

Gecode version 4.1.0 was released some weeks ago today. From the Changelog:
Changes in Version 4.1.0 (2013-06-13 2013-07-03)

This release adds a really useful feature and the required infrastructure to Gist (it now can show information on the alternatives in a search tree), has some new features, and fixes quite a number of bugs (some of which are quite serious, in particular for: LDSB, restart-based search, and FlatZinc).
  • Kernel
    • Additions
      • Branchers now support a print() member function. (major)
      • Both AFC and Activity can be changed by a set() function. (minor)
  • Search engines
    • Bug fixes
      • Fixed a bug in restart-based search, which would crash if the problem is failed at the root or only had a single solution during best-solution search. (major, thanks to Chris Mears, Roberto Castañeda Lozano)
  • Finite domain integers
    • Additions
      • Integer and Boolean branchings now take an optional argument for a user-defined print function. See MPG for details. (major)
      • Added if-then-else constraint (called ite) for integer variables. (minor)
    • Other changes
      • Added classes IntMinimizeSpace and IntMaximizeSpace for cost-based optimization with an integer cost variable. While the classes MinimizeSpace and MaximizeSpace are still available, there use is deprecated and a later release might remove them. (minor)
    • Bug fixes
      • Fixed crash due to combination of LDSB and Activity. (major, thanks to Stefano Gualandi)
      • The internal representation of integer variable domains could leak memory to the space. (major)
      • Fixed binpacking crash when the b and s arrays are empty. (minor, thanks to David Rijsman)
  • Finite integer sets
    • Additions
      • Set branchings now take an optional argument for a user-defined print function. See MPG for details. (major)
    • Bug fixes
      • Fixed crash due to combination of LDSB and Activity. (major, thanks to Stefano Gualandi)
  • Floats
    • Additions
      • Float branchings now take an optional argument for a user-defined print function. See MPG for details. (major)
      • Added classes FloatMinimizeSpace and FloatMaximizeSpace for cost-based optimization with a float cost variable. (minor, thanks to Pascal Francq)
    • Bug fixes
      • Fixed FLOAT_VAR_ACTIVITY_SIZE_MIN/MAX, which used to compute size/activity instead of activity/size. (minor)
      • Random value selection for float branchings was not declared with the right types. (minor)
      • Fixed missing propagation in reified rel-constraints. (minor, thanks to Duong Khanh Chuong)
  • Minimal modeling support
    • Bug fixes
      • Boolean expressions could lead to a segfault when initialized with default constructor. (minor, thanks to Roberto Castañeda Lozano)
  • Script commandline driver
    • Additions
      • Added classes FloatMinimizeScript and FloatMaximizeScript for cost-based optimization with a float cost variable. (minor, thanks to Pascal Francq)
    • Other changes
      • Added classes IntMinimizeScript and IntMaximizeScript for cost-based optimization with an integer cost variable. While the classes MinimizeScript and MaximizeSpcript are still available, there use is deprecated and a later release might remove them. (minor)
  • Example scripts
    • Additions
      • Added LDSB-based symmetry breaking to graph coloring example. (minor, thanks to Stefano Gualandi)
  • Gist
    • Additions
      • Gist now can show information on the alternatives in a search tree, using the new menu options "Label branches" and "Label path". (major)
  • Gecode/FlatZinc
    • Additions
      • The FlatZinc interpreter now supports float variables as objective functions. (major)
      • Added {int,bool,set,float}_default_search annotations. These can be used on solve items to declare the default search strategy for the respective variable types. (minor)
      • Added int2float constraint which was missing in the FlatZinc interpreter. (minor, thanks to Tias Guns)
    • Bug fixes
      • Fixed printing of float variables to be compatible with MiniZinc output. (minor, thanks to Peter Nightingale)
      • Fixed a segmentation fault when using FlatZinc with float variables that have initializers. (minor, thanks to Peter Nightingale)
      • Fixed the random search strategies to all use the same random number generator instead of different generators all initialised with the same seed (and therefore all producing the same sequences). (minor)
  • General
    • Other changes
      • CMake build system tweaked for out-of-source-dir building. (minor, thanks to Cliff Yapp)

March 14, 2013

Gecode version 4.0.0 released

Gecode version 4.0.0 has been released. Congrats to the developers!

From the Changelog:
Changes in Version 4.0.0 (2013-03-14)

This release adds a multitude of new features, fixes many bugs, and offers a number of performance improvements. There are too many major improvements to even summarize them here all, so here are just some highlights: LDSB as automatic symmetry breaking during search; restart-based search; complete redesign and reimplementation of branching enhancing the expressiveness massively (AFC with decay, activity, user-defined variable and value selection, tie-break control, better randomization, ...); half-reification; addition of floating point constraints.

It is recommended to read the new Chapter in MPG on branching as the changes are substantial and require you to change your programs, see also the full list of changes below.

As the interfaces has changed, please consult How to Change to Gecode 4.0.0.

  • Kernel
    • Additions
      • The random seeds for variable and value branching options can now be initialized from a hardware random number generator (see MPG for details). (minor)
      • All ArgArrays now accept STL vectors and input iterators for construction. (minor)
      • The kernel now can record the activity of variables. The activity of a variable is defined as how often the domain of a variable has been pruned during search. For details, see "Modeling and Programming with Gecode". (major)
    • Other changes
      • The default constrain() function for best-solution search now does by default nothing (it used to throw an exception). (minor)
      • The entire infrastructure for variable-value branchers has been reimplemented from scratch. The new design makes a much better compromise between code size, efficiency, and expressiveness: the efficiency is about the same (for examples with no propagation and just branching one can note a slowdown of 2-4%) while code size shrinks drastically (the overall code size for integer variables shrinks by 20%) and the architecture is much more expressive (in particular, it supports tie-break limits, see MPG). (major)
      • Throw exception when the user-defined copy constructor of a class that inherits from Space does not call the Space copy constructor. (minor)
    • Performance improvements
      • Fixed a bug in the main memory allocation routine: now heap block sizes are decreased dynamically as they should be. Also changed the memory configuration parameters as explained in: Zandra Norman, Memory Management for Gecode. KTH Royal Institute of Technology, Sweden, Bachelor thesis, TRITA-ICT-EX-2012:143, 2012. (major, thanks to Zandra Norman)
  • Search engines
    • Additions
      • Added support for restart-based search, see MPG for details. (major)
    • Other changes
      • Variable selection for branching used the quotient of size divided by degree, accumulated failure count, or activity. They now use the inverse. That is, for example, it is not any longer INT_VAR_SIZE_DEGREE_MIN() but INT_VAR_DEGREE_SIZE_MAX() (that is, largest degree divided by size). (major) That looks like an annoying change but is in fact essential: the strategies using accumulated failure count and activity now could have run into division by zero issues. And just changing the implementation is not good enough because the values of these measures can now be exposed during tie-breaking.
      • The restart best solution search engine has been removed (it is subsumed by the new restart-based meta search engine RBS). (major)
    • Performance improvements
      • The search engines now do not allocate memory on the search stack for the rightmost branch of each node. This means that the search tree depth is now computed differently. It now represents the actual peak depth required at any one time, taking into account that rightmost branches reuse the stack entry of their parents. (minor)
  • Finite domain integers
    • Additions
      • Gecode now supports Lightweight Dynamic Symmetry Breaking (LDSB), see MPG for details. (major , contributed by Christopher Mears)
      • Added value selection strategies for branching INT_VAL_NEAR_MIN(), INT_VAL_NEAR_MAX(), INT_VAL_NEAR_INC(), and INT_VAL_NEAR_DEC(). They can take a previous assignment to the variables to branch on and try to choose values which are near (see MPG for details). (major, thanks to Manuel Loth)
      • Added domain constraint that constrains a variable (or an array) according to the domain of another variable. (minor)
      • Added variants of dom that copy the domain from one variable to another. (minor)
      • The nooverlap constraint now allows sharing of unassigned variables in its argument arrays. (minor)
      • Added half-reification for reified constraints (see Modeling and Programming with Gecode for details). (major)
      • Added pow and nroot constraints. (major)
      • The binpacking constraint now also accepts items of size zero. (minor, thanks to Kathrin Dannmann, Roberto Castañeda Lozano)
      • Added activity-based branching strategies for integer and Boolean variables: INT_VAR_ACTIVITY_MIN, INT_VAR_ACTIVITY_MAX, INT_VAR_ACTIVITY_SIZE__MIN, INT_VAR_ACTIVITY_SIZE_MAX. For details, see "Modeling and Programming with Gecode". (major)
    • Other changes
      • The coefficients for linear constraints are now divided by their greatest common divisor. That means that some equations can be handled now that previously threw an OutOfLimits exception, and some equations can be handled with the more efficient integer precision propagators that previously required double precision. (minor)
      • Generalized definition of no-overlap propagators for better reuse. (minor, thanks to Roberto Castañeda Lozano)
      • The interface for branching on integer and Boolean variables has changed considerably (supporting user-defined variable and value selection, tie-break limit functions, handlers to created branchers, and more). In order to change, you have to add () to all variants of INT_VAR, INT_VAL, and INT_ASSIGN. For example, INT_VAR_SIZE_MIN becomes the function call INT_VAR_SIZE_MIN() and INT_VAL_MIN_MIN becomes the function call INT_VAL_MIN_MIN(). Some of these functions expect additional arguments and can take also optional arguments (this replaces the VarBranchOptions and ValBranchOptions). Please read the new "Branching" chapter in MPG. (major)
      • Respect IntConLevel argument for reified linear constraints with a single integer variable. (minor)
    • Bug fixes
      • Fixed precede constraint with less than two values. (minor, thanks to Roberto Castañeda Lozano)
      • Fixed a bug where bounds consistent distinct reported subsumption instead of failure in certain cases. (major, thanks to Lin Yong)
      • Fixed potential rounding issues in sqr and sqrt constraints. (minor)
      • Fixed copying of tuple sets in extensional constraints and IntSets in sequence constraints (could lead to crashes when using parallel search). (major, thanks to Manuel Baclet)
      • Added missing propagation for nary min/max constraint. (minor, thanks to Jean-Noël Monette)
      • Make extensional constraints work with empty tuple sets. (minor, thanks to Peter Nightingale)
      • Fix count (global cardinality constraint) for multiple occurrences of the same value in the cover array. (minor, thanks to Peter Nightingale)
    • Performance improvements
      • Arithmetic, linear, and cumulative constraints now resort to internal operations using "long long int" rather than "double". This improves performance but also extends the range over integer coefficients that can be handled by linear constraints. (major)
  • Finite integer sets
    • Additions
      • Gecode now supports Lightweight Dynamic Symmetry Breaking (LDSB), see MPG for details. (major , contributed by Christopher Mears)
      • Added domain constraint that constrains a variable (or an array) according to the domain of another variable. (minor)
      • Added domain constraints for arrays of set variables. (minor)
      • Added half-reification for reified constraints (see Modeling and Programming with Gecode for details). (major)
      • Added activity-based branching strategies for set variables: SET_VAR_ACTIVITY_MIN, SET_VAR_ACTIVITY_MAX, SET_VAR_ACTIVITY_SIZE_MIN, SET_VAR_ACTIVITY_SIZE_MAX. For details, see "Modeling and Programming with Gecode". (major)
      • Added channeling constraint between arrays of set variables. (major , contributed by Denys Duchier)
    • Other changes
      • The interface for branching on integer and Boolean variables has changed considerably (supporting user-defined variable and value selection, tie-break limit functions, handlers to created branchers, and more). In order to change, you have to add () to all variants of SET_VAR, SET_VAL, and SET_ASSIGN. For example, SET_VAR_SIZE_MIN becomes the function call SET_VAR_SIZE_MIN() and SET_VAL_MIN_INC becomes the function call SET_VAL_MIN_INC(). Some of these functions expect additional arguments and can take also optional arguments (this replaces the VarBranchOptions and ValBranchOptions). Please read the new "Branching" chapter in MPG. (major)
    • Bug fixes
      • Fixed precede constraint with less than two values. (minor, thanks to Roberto Castañeda Lozano)
  • Floats
    • Additions
      • Added support for float variables. (major , contributed by Vincent Barichard)
  • Minimal modeling support
    • Additions
      • Added pow and nroot expressions for integer variables. (minor)
    • Other changes
      • Made implementations of MiniModel expressions private, so that the MiniModel headers do not have to include propagator headers like gecode/int/linear.hh any longer. (minor)
  • Script commandline driver
    • Additions
      • Added commandline options to control restart-based search, see MPG for details. (major)
      • Decay values can now be passed on the command line using the switch -decay. (minor)
      • Added options -file-sol and -file-stat for writing solutions and statistics to arbitrary files and streams. (minor, thanks to Andrea Pretto)
      • The command line -print-last configures whether only the last solution found is printed. (minor, thanks to Josef Eisl)
    • Other changes
      • Boolean options (BoolOption) can now be given a false or true argument and hence are in-line with all other option types. (minor)
  • Range and value iterators
    • Documentation fixes
      • Clarified for several iterators that when using the assignment operator both iterators must be allocated from the very same region. (minor)
  • Support algorithms and datastructures
    • Bug fixes
      • Fixed a concurrency problem that caused an exception to be thrown at the end of a multi-threaded search on some platforms. (minor)
      • Fixed a bug in the allocation of very large bitsets. (minor, thanks to Max Ostrowski)
  • Example scripts
    • Additions
      • New example: Colored matrix without monochromatic rectangles. (minor)
  • Gist
    • Additions
      • Added option to invoke move cursors during the automatic search. (minor)
    • Other changes
      • Updated to compile with Qt version 5.0 (still works with Qt >= 4.3 as well). (minor)
  • Gecode/FlatZinc
    • Additions
      • Added support for more search annotations (defined in gecode.mzn), and for the restart and decay command line options. (minor)
      • Added native support for lex_less_bool and lex_lesseq_bool. (minor)
      • Gecode/FlatZinc now supports float constraints and variables. (major)
      • The FlatZinc interpreter now supports comparing of nodes in Gist. (major, thanks to Gabriel Hjort Blindell)
      • Added native support for the inverse_set constraint. (minor)
    • Other changes
      • Renamed the FlatZinc executable to fzn-gecode, to better distinguish it when installed alongside other FlatZinc implementations. (minor)
      • The FlatZinc interpreter no longer performs a complete search on variables annotated as var_is_introduced, but tries to extend a solution on the model variables to a solution that includes the introduced variables. Each solution on the model variables is therefore only reported once. (major)
    • Bug fixes
      • The mzn-gecode shell script now passes arguments correctly to the FlatZinc interpreter. (minor)
      • Removed spurious debug output for among constraint. (minor, thanks to Simon Ekvy)
      • Exported registry and helper functions so that users can add constraint handlers to the FlatZinc interpreter. (minor, thanks to Jean-Noël Monette)
  • General
    • Additions
      • Added CMake build script for Gecode (CMakeLists.txt). (minor, thanks to Victor Zverovich)
      • Variable selection using AFC now supports decay. Read more in MPG. (major)
    • Other changes
      • Compiles with MSVC 2012. (minor)
Also see:

March 31, 2012

Gecode version 3.7.3 released

Gecode version 3.7.3 has been released.

From the ChangeLog:
This release fixes some small bugs in the FlatZinc interpreter and library.
  • Gecode/FlatZinc
    • Additions
      • Added mzn-gecode scripts for conveniently solving MiniZinc models using the Gecode FlatZinc interpreter. (minor)
    • Other changes
      • Removed the print command line option. Instead, for optimization problems, using -a will print all solutions, while not using -a will only print the last one. This is consistent with the G12 FlatZinc command line interface. (minor)
    • Bug fixes
      • Fixed "largest" variable selection strategy for set variables. (minor, thanks to Marco Correia)
      • Fixed the parser for set literals. (minor, thanks to Thibaut Feydy)
      • Integer variables with empty domains result in unsatisfiable models instead of an error message. (minor)
      • Support 0-length array declarations. (minor)

February 27, 2012

Gecode version 3.7.2 released

Gecode version 3.7.2 has been released. It can be downloaded here.

From the Changelog:
Changes in Version 3.7.2 (2012-02-27)

This release fixes several small bugs.

  • Kernel
    • Additions
      • Added Archive operators for floats and doubles. (minor)
  • Finite domain integers
    • Other changes
      • Throw exception of type OutOfLimits instead of Exception when numerical arguments to sequence constraint are out of range. (minor)
    • Bug fixes
      • Added missing pruning to cumulative edge finding propagator. (minor, thanks to Joseph Scott)
      • Fixed sorted constraint to accept zero-length arrays. (minor, thanks to Kish Shen)
      • Added some missing propagation when posting a channel constraint between an array of Boolean variables and an integer variable. (minor, thanks to Kish Shen)
    • Performance improvements
      • Posting a reified dom constraint on IntVars with an assigned control variable does not create propagators any more, but updates the domain immediately. (minor)
  • Finite integer sets
    • Bug fixes
      • The element constraint with SOT_UNION and IntSetArgs reported subsumption too early, resulting in incorrect propagation. (major, thanks to Denys Duchier)
  • Minimal modeling support
    • Bug fixes
      • The BoolExpr default constructor did not properly initialize its members, causing crashes. (minor)
  • Script commandline driver
    • Bug fixes
      • Fixed rounding for printing the runtime (for example, 1:60:56.157 could be printed...). (minor, thanks to Serge Le Huitouze)
      • Fixed time output for times with zero minutes but nonzero hours. (minor, thanks to Jan Kelbel)
  • Gecode/FlatZinc
    • Bug fixes
      • Export RTTI symbols for the FlatZinc AST so that it can be used by client code. (minor)
      • Do not crash when encountering undefined identifier as constraint argument. (minor, thanks to Nicholas Tung)
  • General
    • Additions
      • Gecode now compiles on NetBSD. (minor, thanks to Adam Russell)
      • Added a macro GECODE_VERSION_NUMBER that is defined as x*1000000+y*100+z for Gecode version x.y.z. (minor, thanks to Denys Duchier)

October 10, 2011

Gecode version 3.7.1 released

Gecode version 3,7,1 released. Download.

From the Changelog:
Changes in Version 3.7.1 (2011-10-10)

This release fixes several bugs, upgrades to MiniZinc version 1.4, and features some minor improvements.

  • Search engines
    • Bug fixes
      • Fixed a bug that crashed the single-thread branch-and-bound search engine when initialized with a failed space. (minor)
  • Finite domain integers
    • Additions
      • Added efficient propagators for n-ary Boolean xor and equivalence (as they are now primitive in MiniZinc). (minor)
      • Domain consistency for simple counting constraints can be switched off. (minor, thanks to Kish Shen)
    • Other changes
      • The semantics of n-ary Boolean implication has been changed (to the more convential reading): rel(home, BOT_IMP, x, y) where x is an array of Boolean variable now assumes implication to be right associative. See MPG for explanation. (minor)
    • Bug fixes
      • Fixed bugs in the computation of the required precision (int or double) for linear propagation, and in division operations of scale views. These could cause an incorrect treatment of overflow in linear constraints. (major)
    • Performance improvements
      • Domain consistent distinct runs 10-20% faster. (minor)
  • Finite integer sets
    • Bug fixes
      • Do not use SharedArray<IntSet> in the set element constraints, because it does not properly udpate the IntSet during copying. This could cause memory corruption. (major)
  • Support algorithms and datastructures
    • Bug fixes
      • Compile again if threads are disabled. (minor)
  • Gist
    • Bug fixes
      • Fixed a crash that occurred when double-clicking an unexplored node while move inspectors were active. (minor
  • Gecode/FlatZinc
    • Other changes
      • The FlatZinc interpreter is now compatible with the G12 MiniZinc distribution 1.4. This adds support for var and par identifiers that begin with underscores, the array_bool_xor primitive, as well as the command line option -r for specifying a random seed. (minor)
    • Bug fixes
      • Fixed linear inequations over integer variables that are channeled from Boolean variables. (major, thanks to Håkan Kjellerstrand)

October 04, 2011

Crossword construction in MiniZinc using table constraints - a small benchmark on "72 Gecode problems"

Almost all models, wordlists, and files mentioned in this blog post is available at my MiniZinc Crossword page.

Abstract

The method presented here for construction (solving, generating) crosswords is based of "just a bunch" of table constraints representing the words together with a rather simple unicity constraint. After an introduction of the problem and a description of the general approach used, a benchmark of several (72) crossword problem instances is reported between two FlatZinc solvers (Gecode/fz and Chuffed) and the crossword solver distributed with Gecode (written in C++) using element constraints for the intersections and alldifferent constraint for the unicity requirement. We will find that this MiniZinc model and the FlatZinc solvers are competitive and in some case even faster than the Gecode (C++) model.

Introduction

Some weeks ago I started to read Ivan Bratko's "Prolog Programming for Artificial Intelligence" (the very new 4th edition, ISBN: 9780321417466). On page 27f there was a very simple crossword problem:
The problem is to fill this crossword with words:
    
    L1   L2    L3   L4    L5   XXX
    L6   XXX   L7   XXX   L8   XXX
    L9   L10   L11  L12   L13  L14
    L15  XXX   XXX  XXX   L16  XXX

Where the L* are letters to be identified.
and also a couple of words to use:
dog, run, top
five, four, lost, mess, unit
baker, forum, green, super
prolog, vanish, wonder, yellow
One common approach of solving/generating crosswords is to identify the intersections of the words and then using a wordlist for matching these intersections (e.g. the very simple crossword.mzn). This is usually done with element constraints for the intersections and alldifferent constraint to ensure unicity of the selected words.

Instead of this approach I got the idea (not very revolutionary, but still) to try using only "a bunch of" table constraint representing the words (called "segments" in the model) which would then handle the intersections via the naming of the variables. This was implemented in the MiniZinc model crossword_bratko.mzn. The table constraints where manually created by just using the numbers of the "free" letters in the problem (L1, L2, etc). The array of decision variables (L) is in the domain is in the domain 1..numletters (1..26, for "a".."z"). The dummy variable L[0] was later added to handle the fill-outs (and is hard-coded to a single value).

Here is the kernel of the Bratko problem which consists of the two row (across) words, and the three column (down) words:
array[0..N] of var 1..num_letters: L;
constraint
   % across
   table([L[1],L[2],L[3],L[4],L[5]], words5)
   /\
   table([L[9],L[10],L[11],L[12],L[13],L[14]], words6)


   /\ % down
   table([L[1],L[6],L[9],L[15]], words4)
   /\
   table([L[3],L[7],L[11]], words3)
   /\
   table([L[5],L[8],L[13],L[16]], words4);
The second argument of table is a matrix where all words of the same size are collected, e.g. words5 contains all words of length 5. In earlier crossword models I have tend to collect all words in a single - and often huge - matrix with a lot of "0" as fillers to make it a proper matrix.

As mentioned above the intersections are not explicitly identified in the model; instead they are just a consequence that the same letter identifier "happens" to be in a across word and a down word. E.g. L[3] represents the third letter of the first across word, and the first letter of the first down word. The index (3) is just a counter of the "free" letters. In this problem there are 14 free letters, represented by L[1]..L[14].

Also note that in this problem we don't have to care about 1-letter words. In the general model presented below, we will see that 1-letter words must be handled in a special way.

The constraint that all words should be distinct is implemented which the constraint shown below. The matrix segments is the segments (i.e. the words) in the crossword grid where all letters are identified as a unique integer (the index in L), and "0" (zero) is used as a filler so the segments has the same length and can be represented as a matrix. It has two parts: the segments across (rows) and the segments down (columns), and is the same structure as the table constraints.
segments = array2d(1..num_segments, 1..max_length, 
[
   % across
   1, 2, 3, 4, 5, 0,
   9,10,11,12,13,14,

   % down
   1, 6, 9,15, 0, 0,
   3, 7,11, 0, 0, 0,
   5, 8,13,16, 0, 0
]);

% the segments/words should be all distinct
constraint
   forall(I,J in 1..num_segments where I < J) (
      not(forall(K in 1..max_length) (
          L[segments[I,K]] = L[segments[J,K]]
      ))
   )
;
(The structure of the table constraints and the segments are much the same, and I have tried to represent this in a single matrix, but stumbled on the problem how to represent an appropriate wordlist structure. This may very well be fixed in a later version.)

This model has a single solution (given the wordlist show above):
L = [6, 15, 18, 21, 13, 9, 21, 5, 22, 1, 14, 9, 19, 8, 5, 19]

f o r u m *
i * u * e *
v a n i s h
e * * * s *

Generalization of the model and Gecode's 72 crossword problem

Since Bratko's problem was so small, I then wondered how well MiniZinc would do on a variety of larger crossword problems using the same basic approach, for example on the 72 crossword instances in Gecode's crossword.cpp (it is a large file but most of it are definitions of these 72 problem instances). This model uses element constraints by projecting words to their individual letters. In constraint, the MiniZinc model use table constraints for encoding the entire words. One of the objectives of the benchmark is to compare these two approaches.

The 72 problem instances are of different sizes which ranges from some small problems with grids of 5x5 cells to larger 23x23 grids. See the comments in the Gecode model for more details. The problems has been collected from different sources, e.g. Herald Tribune Crosswords and in early studies of backtracking, e.g. M.L Ginsberg Dynamic Backtracking and M.L. Ginsberg et al Search Lessons Learned from Crossword Puzzles. Some of them has also been used in later studies as well, e.g. the paper by Anbulagan and Adi Botea on phase transitions in crossword problems: Crossword Puzzles as a Constraint Problem (PDF).

The problems are represented as a grid in text form. For example the Bratko problem mentioned above would be represented as the following grid , where "_" represents a letter to be identified, and "*" is a blocked cell.
_ _ _ _ _ *
_ * _ * _ *
_ _ _ _ _ _
_ * * * _ *
A larger example is problem the 15x15 problem #10 (from Gecode's crossword.cpp) which is also identified as "15.01, 15 x 15".
_ _ _ _ * _ _ _ _ _ * _ _ _ _
_ _ _ _ * _ _ _ _ _ * _ _ _ _
_ _ _ _ _ _ _ _ _ _ * _ _ _ _
_ _ _ _ _ _ _ * * _ _ _ _ _ _
* * * _ _ _ * _ _ _ _ _ _ * *
_ _ _ _ _ * _ _ _ * _ _ _ _ _
_ _ _ * _ _ _ _ _ _ * _ _ _ _
_ _ _ * _ _ _ _ _ _ _ * _ _ _
_ _ _ _ * _ _ _ _ _ _ * _ _ _
_ _ _ _ _ * _ _ _ * _ _ _ _ _
* * _ _ _ _ _ _ * _ _ _ * * *
_ _ _ _ _ _ * * _ _ _ _ _ _ _
_ _ _ _ * _ _ _ _ _ _ _ _ _ _
_ _ _ _ * _ _ _ _ _ * _ _ _ _
_ _ _ _ * _ _ _ _ _ * _ _ _ _
However, since MiniZinc don't have the facility of parsing this kind of text representation, I wrote some Perl programs to convert a problem instance to a MiniZinc model. The MiniZinc model for problem #10 is crossword3_10.mzn, which contains the problem specific table constraints and segments definitions, much in the same way as in the Bratko model.

Here is one solution (using Gecode's SCOWL wordlist, see below for more about the wordlists):
roar*breed*area
urge*lager*weal
laughingly*also
elegant**aerate
***and*gadget**
acmes*err*osier
bra*ornate*tore
aah*magneto*non
snap*madras*add
heron*gay*idles
**imaged*pea***
vesper**surmise
echo*ambassador
trim*beach*gear
suss*snaky*ears


Note: By construction this approach requires that all 1-letter segments (words) must be distinct. This means that 1-letter segments must be handled with care since these segments are the same when seeing them across and down. More specific this means that the unicity constraint has a special check for 1-letter words:
forall(I,J in 1..num_segments where I < J) (
  if sum(K in 1..max_length) (
     bool2int(segments[I,K] > 0 ) ) = 1
     /\
     segments[I,1] = segments[J,1]
  then
     true 
  else 
    not(forall(K in 1..max_length) (
      L[segments[I,K]] = L[segments[J,K]]
    ))
  endif
)

Structure of the MiniZinc files

The structure of the model are
  • the general model crossword3.mzn: which includes the wordlist and unicity constraint
  • a wordlist, which is included in the crossword3.mzn
  • problem instance, e.g. crossword3_10.mzn: includes the specific data instance and table constraints. This instance file includes crossword3.mzn
The instances can be run as follows:
# Using G12/MiniZinc default solver
$ minizinc -v --no-optimize -s crossword3_0.mzn

# Using Gecode/fz
$ minizinc -v --no-optimize -G gecode -s crossword3_0.mzn -f "fz -mode stat -n 1 "

For larger problems, the parameter --no-optimize might be a good choice, otherwise mzn2fzn (the converter to FlatZinc) can take very long time.

More about the plain Gecode model

The (plain) Gecode model crossword.cpp is described in detail chapter 19 in Modeling and Programming with Gecode (PDF), which also include some benchmark results.

Benchmarking

One of the objectives with the benchmark was to see how well the MiniZinc model (together with a FlatZinc solver) would compete with Gecode's crossword model (Gecode's crossword.cpp, available in the Gecode distribution). This model will be named "plain Gecode model" to be able to discern it when also mentioning the Gecode/fz FlatZinc solver.

After doing some preliminary tests of several FlatZinc solvers using different labelings, I selected these two solvers to continue with the full benchmarks of all 72 instances:
  • Gecode/fz: SVN version (based on the version 3.7.0) per 2011-09-21.
  • Chuffed: Version compiled 2011-04-17 (no explicit version). Chuffed is a lazy clause solver written by Geoffrey Chu. As of writing it is not yet public available but has been mentioned several times in connection with MiniZinc Challenge where it has shown very impressive results; see MiniZinc challenge 2011 Results and MiniZinc challenge 2011 Results. For more information see this description.
After systematically testing all labelings on some problem instances (#20, #30, and with some wordlists #40), some labelings where selected for the full tests. Note: I looked for a problem instance that could be used as a "proxy" for the complete problem set in order to get the best labelings, but found no instance that was representative enough for this.

Benchmark

The benchmarks consists of testing all 72 problem instances with the following wordlists:
  • Gecode's SCOWL wordlist (66400 English words), available from MiniZinc Crossword page
  • Swedish wordlist based on an earlier version of Den stora svenska ordlistan ("The Big Swedish Wordlist"), (388493 Swedish words), available from MiniZinc Crossword page. Note that crossword3.mzn must be changed to handle the national characters "å", "ä", and "ö" (this is commented out in the current model).
  • 73871 English words from /usr/share/dict/words (some words where filtered out, e.g. the one not matching ^[A-Za-z]$), available from MiniZinc Crossword page
  • 80.txt (242606 English words)
Notes:
  • The time reported is for all 72 problem instances, with a timeout of 10 minutes (600 seconds) per problem instance.
  • The plain Gecode model was run with default parameters (except for the timeout of 10 minutes and statistics)
  • The solver stated as chuffed4 below is Chuffed using the single parameter --toggle-vsids (and the timeout)
  • The solver Gecode/fz was run with the parameter for timeout and statistics.
The reported runs are not all the tests that where made. Instead just the top results are shown.

Which time to compare: Total time, runtime, or solvetime?

The time for running the plain Gecode executable was measured via the Unix time command. This includes everything: parsing of the problem (compiled in the model), setting up the constraints, and then solving the model.

However, the way MiniZinc works is different: the MiniZinc model (.mzn) is first parsed and then translated (flattened) into a FlatZinc format (.fzn) which is the file used by the solvers. This flattening process can take some time: As the number of words in the wordlist increases, the time for flattening increases as a consequence. The time is about a couple of seconds for smaller wordlists to about 30 seconds for the largest Swedish wordlist. Also, it takes some seconds to generate the "fancy" output where the resulting grid is presented (see the output section in the MiniZinc model). This output is used to check that the solution is correct and don't include any duplicate words (this is done via a separate Perl program).

Example: The time for running problem #33 (a grid of size 21x21) using the plain Gecode crossword model with SCOWL wordlist is 44.9 seconds. The total time for the Gecode/fz solver using size_afc_min/indomain_min to solve the same problem takes 59.6 seconds. However, Gecode/fz reports two times in its output: runtime: 49.1s and solvetime: 47.6s. This means that - for this problem instance - there is an overhead of about 5 seconds to generate the FlatZinc file and then output the nice output grid. In comparison, Chuffed using first_fail/indomain_max took 13.22s total time, with a reported 3.41s runtime and 1.56s as solvetime.

In the benchmark below all the three different times for the FlatZinc solvers are reported: total time, runtime, and solvetime (summed for all 72 instances). One way of comparing the performance of the FlatZinc solvers with the plain Gecode model is to use the runtime which would exclude the overhead for flattening to FlatZinc and generating the output. On the other hand, comparing runtime values is really not fair to plain Gecode, since the flattening process might do some relevant optimization of the model. As we will see, some of the solvers has in fact a total time that is better than the plain Gecode model.

The benchmarks below are grouped on the different wordlists, and the ordering is on the runtime. We will see that there is no single FlatZinc solver + labeling that has the best runtimes (or total time) for all wordslists.

Wordlist Gecode SCOWL

This is the wordlist used in the "plain" Gecode model containing 66400 English words.

The total time for plain Gecode crossword (with default parameters) on all 72 problem instances is 57:16.25 minutes. The timeout/failures are for problem: #15, #30, #39, #40, #45, #49.

Problem #40 is not possible to solve with this wordlist since the problem requires two words of length 23. However the wordlist contain no word of this length.

It is interesting that Chuffed's total time (36 minutes 37 seconds) is significant less than plain Gecode's total time (and runtime is of course even faster). We also see that the overhead of flattening to FlatZinc format and then handling the output is about 6 minutes.
Solvervar select
val select
#solvedTotal timeRuntimeSolvetime
chuffed4first_fail
indomain_max
69 2197.65s (36 minutes and 37 seconds)1744.44s (29 minutes and 4 seconds)1681.32s (28 minutes and 1 second)
chuffed4most_constrained
indomain_max
69 2198.98s (36 minutes and 38 seconds)1746.56s (29 minutes and 6 seconds)1683.45s (28 minutes and 3 seconds)
chuffed4max_regret
indomain_max
67 3368.83s (56 minutes and 8 seconds)2917.64s (48 minutes and 37 seconds)2854.8s (47 minutes and 34 seconds)
fzsize_afc_min
indomain_min
66 4386.15s (1 hour, 13 minutes, and 6 seconds)3931.794s (1 hour, 5 minutes, and 31 seconds)3883.713s (1 hour, 4 minutes, and 43 seconds)
fzsize_afc_min
indomain_split
66 4564.53s (1 hour, 16 minutes, and 4 seconds)4114.111s (1 hour, 8 minutes, and 34 seconds)4066.457s (1 hour, 7 minutes, and 46 seconds)

Wordlist 80.txt

The 80.txt wordlist contains 242606 English words.

Total time for plain Gecode: 21:48.28 minutes with 70 instances solved.

Here we see that the best total time for the MiniZinc solver solves all 72 problems in about the same time as the plain Gecode model. The runtime reported is just about 3 minutes which is kind of weird. (The reported solvetime of 17 seconds is even weirder.)
Solvervar select
val select
#solvedTotal timeRuntimeSolvetime
chuffed4first_fail
indomain_min
72 1324.96s (22 minutes and 4 seconds)157.42s (2 minutes and 37 seconds)17.71s (17 seconds)
fzsize_afc_min
indomain_max
72 1375.62s (22 minutes and 55 seconds)214.765s (3 minutes and 34 seconds)103.848s (1 minute and 43 seconds)
fzsize_afc_min
indomain_median
72 1370.48s (22 minutes and 50 seconds)214.772s (3 minutes and 34 seconds)103.495s (1 minute and 43 seconds)
chuffed4max_regret
indomain_split
72 1410.68s (23 minutes and 30 seconds)266.42s (4 minutes and 26 seconds)126.43s (2 minutes and 6 seconds)

/usr/share/dict/words

This wordlists contains 73871 English words from /usr/share/dict/words. Some words where filtered out from the original file: the one not matching ^[A-Za-z]$.

Total time for plain Gecode: 33:43.19 minutes, 69 instances solved.

This is another benchmark where Chuffed's is the best FlatZinc solver. The total time (34 minutes and 1 second) is almost the same as plain Gecode, with the runtime (25 minutes and 56 seconds) is significantly faster.
Solvervar select
val select
#solvedTotal timeRuntimeSolvetime
chuffed4first_fail
indomain_max
69 2041.68s (34 minutes and 1 second)1556.97s (25 minutes and 56 seconds)1484.82s (24 minutes and 44 seconds)
chuffed4most_constrained
indomain_max
69 2048.91s (34 minutes and 8 seconds)1563.16s (26 minutes and 3 seconds)1491.13s (24 minutes and 51 seconds)
chuffed4first_fail
indomain_split
69 2196.82s (36 minutes and 36 seconds)1712.9s (28 minutes and 32 seconds)1639.85s (27 minutes and 19 seconds)
chuffed4first_fail
indomain_min
68 2522.89s (42 minutes and 2 seconds)2045.56s (34 minutes and 5 seconds)1974.13s (32 minutes and 54 seconds)
fzsize_afc_min
indomain_min
68 2563.63s (42 minutes and 43 seconds)2085.042s (34 minutes and 45 seconds)2029.719s (33 minutes and 49 seconds)

Swedish wordlist

This wordlist contains 388493 Swedish words.

There is no plain Gecode runtime for this, instead it is just a comparison of the FlatZinc solvers. I wanted to include this in the benchmark for two reasons: to see how/if MiniZinc could handle this large wordlist, and also because I'm quite curious about Swedish solutions for the problems (much because I am a Swede).

The best solver this time is Gecode/fz (using size_afc_min/indomain_min) with a runtime of 3 minutes and a solvetime of 42 seconds. Though the total time is much larger: almost 39 minutes. This means that there is a massive overhead of flattening to FlatZinc and presenting the output.
Solvervar select
val select
#solvedTotal timeRuntimeSolvetime
fzsize_afc_min
indomain_min
72 2330.14s (38 minutes and 50 seconds)181.685s (3 minutes and 1 second)42.129s (42 seconds)
fzsize_afc_min
indomain_split
72 2349.9s (39 minutes and 9 seconds)197.006s (3 minutes and 17 seconds)57.683s (57 seconds)
fzsize_afc_min
indomain_max
72 2393.97s (39 minutes and 53 seconds)255.495s (4 minutes and 15 seconds)116.09s (1 minute and 56 seconds)
chuffed4most_constrained
indomain_split
72 2415.61s (40 minutes and 15 seconds)258.14s (4 minutes and 18 seconds)89.89s (1 minute and 29 seconds)
chuffed4first_fail
indomain_split
72 2413.29s (40 minutes and 13 seconds)258.3s (4 minutes and 18 seconds)89.9s (1 minute and 29 seconds)

Summary and conclusions

The objective of the benchmark was to see how well the MiniZinc model would compete with the plain Gecode model. For all wordlists there is at least one FlatZinc solver with a total time that is near plain Gecode's total times, and for one wordlist (SCOWL) there is a solver that is much faster. Comparing the reported runtimes all measurable wordlists has a FlatZinc solver that is faster. For one wordlist (Swedish words) there where no run of the plain Gecode model.

As mentioned above, there is no single FlatZinc solver/labeling that is the best for all wordlist. Comparing just the FlatZinc solvers we see that Chuffed solver (with some labeling) was the best for SCOWL, 80.txt, and /usr/share/dict/words; whereas Gecode/fz was the best for the Swedish wordlist.

In the preliminary tests the best variable selection for Gecode/fz is size_afc_min, though the best value selection is not that clear. For Chuffed there is single variable/value selection combination that dominates, though both first_fail and most_constrained often gave quite good results.

As noted several times before in the CP field, these kind of benchmarks might be misleading and of limited value. The comparison between the plain Gecode model and the MiniZinc model (+ FlatZinc solvers) might be even worse since it compares at the same time:
  • two different CP systems: compiled C++ code vs MiniZinc eco system
  • two different approaches: element constraint on intersections vs. table constraint on words.
  • different time measurements
Still, it is probably not unfair to draw the conclusion that the MiniZinc model and the two tested FlatZinc solvers at least did give the plain Gecode model a match in generating/solving the selected problem instances.

Also, there might be some parameter or labeling that is much better than these tested. This includes - of course - parameters of the plain Gecode model.

Further research

It would be interesting to study how well the table approach would do as a plain Gecode model. It would also be interesting to do a MiniZinc model using a element/alldifferent approach.

System

Here is the system and version used in the benchmark:
  • Linux Ubuntu 11.04, 64-bit 8 cores (i7 930, 2.80GHz) with 12Gb RAM
  • Gecode: SVN version (based on the version 3.7.0) per 2011-09-21.
  • MiniZinc version: Version 1.4
  • Chuffed version: Version compiled 2011-04-17 (no explicit version).

Thanks

Thanks to Christian Schulte and Guido Tack for some explanations and intuitions. Also, thanks to Geoffrey Chu and Peter Stuckey for the opportunity to test Chuffed.

July 15, 2011

Gecode version 3.6 released

Gecode version 3.6 released. (Download.)

From the Changelog:
This release adds new constraints (value precedence constraints for integer and set variables, no-overlap constraints for rectangles, constraints for Hamiltonian paths), improves and cleans up a number of existing constraints (scheduling, channeling, relation, bin-packing, lexicographic relations), and adds new functionality (support for externalization of choices for distributed search, support for incremental propagation).



Some models might have to be changed as the graph and scheduling modules have been incorporated into the integer module (removing the respective include directives is sufficient).

On top, there are many small fixes, in particular for FlatZinc.
  • Kernel
    • Additions
      • Moved RangeList class, which is a list of ranges implemented as a FreeList, from the set module into the kernel. Also added corresponding Iter::Ranges::RangeList iterator. (minor)
    • Other changes
      • Choices can now be written into an externalized form (called an Archive), and reconstructed from it. This is necessary for serializing paths in a distributed search engine. (major)
  • Search engines
    • Bug fixes
      • Fixed memory leak when passing a failed space to a search engine with cloning option set to false. (minor)
  • Finite domain integers
    • Additions
      • The cumulative constraints now support an IntVar as the capacity argument. (minor)
      • Added value precedence constraint. (major , contributed by Christopher Mears)
      • Added no-overlap constraint that enforces that rectangles do not overlap (also known as diffn). See "Modeling and Programming with Gecode" for details. (major)
      • Added constraints for Hamiltonian paths (called path). See "Modeling and Programming with Gecode" for details. (major)
      • Generalized lexicographic constraint to arrays of different sizes. (minor, thanks to Kish Shen)
      • Added a CachedView that can cache the domain between propagator invocations and provides an efficient test whether a view has changed since the previous invocation as well as an iterator over the removed domain values. This makes it easier to implement incremental propagation algorithms that need exact delta information. (major)
    • Other changes
      • The cumulatives constraint now does not post the s+p=e constraints, harmonizing its semantics with the cumulative and unary constraints. (minor)
      • Changed semantics of rel(home, x, IRT_NQ), enforces that not all variables in x are equal. See "Modeling and Programming with Gecode" for details. (major)
    • Bug fixes
      • Fixed element and sequence propagators, which were only correct by accident (incorrect use of GECODE_ME_CHECK instead of GECODE_ES_CHECK). (minor)
    • Performance improvements
      • Optimized channeling propagator between an array of Boolean variables and an integer variables. (minor)
      • The disequality constraint between variable arrays has an efficient propagator now. (minor)
      • The ordering constraints rel(home, x, IRT_LE) (also for IRT_LQ, IRT_GR, IRT_GQ) now have an optimal implementation (single incremental propagator). (major)
      • Increased performance of bin-packing propagator by 40 to 300 percent by using staging. (major)
      • The channel constraints between two integer arrays are now more memory efficient when offsets are used. (minor)
  • Finite integer sets
    • Additions
      • Added value precedence constraint. (major , contributed by Christopher Mears)
      • Added a CachedView that can cache the domain between propagator invocations and provides an efficient test whether a view has changed since the previous invocation as well as an iterator over the removed domain values. This makes it easier to implement incremental propagation algorithms that need exact delta information. (major)
      • Added channel aliases for set union of an array of integer variables, and renamed channel to channelSorted. (minor, thanks to Marco Correia)
    • Bug fixes
      • Fixed sequence, partition, and union propagators, which were only correct by accident (incorrect use of GECODE_ME_CHECK instead of GECODE_ES_CHECK). (minor)
      • The constructors for set variable arrays and argument arrays threw incorrect VariableEmptyDomain exceptions. (minor)
    • Performance improvements
      • Use new cached views for a more efficient implementation of the channel constraint between IntVarArgs and SetVarArgs. (minor)
    • Documentation fixes
      • Fixed documentation for set channeling constraint. (minor, thanks to Marco Correia)
  • Scheduling constraints
    • Other changes
      • The scheduling module has been removed and its constraints have been added to the integer module. (major
    • Bug fixes
      • Fixed scheduling code for mandatory flexible tasks, which was only correct by accident (incorrect use of GECODE_ME_CHECK instead of GECODE_ES_CHECK). (minor)
  • Graph constraints
    • Additions
      • Added circuit constraints with offsets. (minor)
    • Other changes
      • The graph module has been removed and its constraints have been added to the integer module. (major)
  • Script commandline driver
    • Bug fixes
      • Fixed a small memory leak in the driver (stop objects were not deleted). (minor, thanks to Jan Kelbel)
  • Example scripts
    • Additions
      • Added Schur's Lemma puzzle. (minor)
  • Gist
    • Other changes
      • Zoom-to-fit can now be selected during search. (minor)
      • Compiles under MSVC 2005 SP1 again. (minor, thanks to Alin Gherman)
    • Bug fixes
      • Changed keyboard shortcuts in Gist so that they work on all platforms: "Inspect" is now Ctrl+number of inspector, for "Inspect before fixpoint" press Alt in addition (on Mac OS, use the Command key instead of Ctrl). (minor)
  • Gecode/FlatZinc
    • Additions
      • Added native support for the precedence constraint. (major)
      • Added native support for the no-overlap constraint (called diffn in MiniZinc/FlatZinc). (major)
      • Support indomain_middle and indomain_interval search annotation by replacing them with indomain_median and indomain_split, respectively. (minor, thanks to Håkan Kjellerstrand)
      • Added native support for link_set_to_booleans, global_cardinality_low_up_closed, and decreasing_bool. (minor)
    • Other changes
      • Adapted the MiniZinc declarations and the command line options for Gecode/FlatZinc to MiniZinc 1.3. The fz binary now works with the minizinc driver script. (minor)
    • Bug fixes
      • Re-enabled the global cardinality constraint in the FlatZinc interpreter. (minor, thanks to Håkan Kjellerstrand)
      • Fixed the MiniZinc definition for the circuit constraints to work with arbitrarily indexed arrays. (minor, thanks to Håkan Kjellerstrand)
  • General
    • Additions
      • Added configure option --enable-small-codesize that results in slightly less efficient but more compact code being generated for non-debug builds. (minor, thanks to Ruben Zilibowitz)
    • Bug fixes
      • Fixed Makefile, now installation works when FlatZinc library is disabled. (minor, thanks to Martin Mann)
Also, don't forget to read Modeling and Programming with Gecode.

February 01, 2011

Gecode version 3.5 released

Gecode version 3.5 was released some hours ago. Download.

From the Changelog:
Changes in Version 3.5.0 (2011-02-01)
This release fixes serious bugs in parallel search, FlatZinc, fixes some DLL issues on Windows, adds support for FreeBSD, and adds STL-style iterators for arrays.
  • Kernel
    • Additions
      • Added STL compatible iteration support for arrays (variable arrays, argument arrays, view arrays, and shared arrays). (major , contributed by Gregory Crosswhite)
  • Search engines
    • Bug fixes
      • Fixed a serious bug in parallel search (took over a year to isolate the bug). (major, thanks to Denys Duchier, Chris Mears)
  • Minimal modeling support
    • Bug fixes
      • Do not inline construction of linear, Boolean, and set expressions to avoid cross-DLL allocation/deallocation issues on Windows. (minor, thanks to Alexander Kleff)
  • Gecode/FlatZinc
    • Other changes
      • Fixed the definitions of global_cardinality to work with MiniZinc 1.2 and newer, and added corresponding definitions of global_cardinality_closed and global_cardinality_low_up_closed. (major)
    • Bug fixes
      • Fixed incorrect posting of linear constraints with variable arrays of size one. (major, thanks to Roberto Castañeda Lozano)
  • General
    • Additions
      • Gecode now compiles on FreeBSD. (minor, thanks to Peter Penchev)
      • Embed resource information into DLLs and EXEs on Windows. (major)
    • Other changes
      • Embed manifest into Gecode DLLs on Windows. (minor)

Related

Some days ago the Gecode team released (I quote) the source code of the Gecode FlatZinc parser, stripped of all Gecode-specific code. You can use it as a starting point for your own FlatZinc interpreter. For more info, see Gecode: Gecode/FlatZinc.

November 08, 2010

Comparison of some Nonogram solvers: Google CP Solver vs Gecode and two MiniZinc solvers

After writing Google CP Solver: Regular constraint, Nonogram solver, nurse rostering, etc yesterday, I thought it would be interesting to benchmark the new Nonogram solver written in Google CP Solver with some other solvers. And the benchmark is of course from Jan Wolter's great Survey of Paint-by-Number Puzzle Solvers, though I compare only with Gecode, and two MiniZinc solvers: Gecode/fz (Gecode's FlatZinc solver), and MiniZinc's LazyFD since I know them best and can test them my self.

In the table below, I have kept Wolter's original value is in parentheses to get a feeling for the differences in our machines and to check with my missing problems with Gecode.

System notes:
  • The timeout is 30 minutes as in Wolter's test.
  • My machine is a Linux Ubuntu 10.4 LTS with 12Gb RAM, 8 core (Intel i7/930, 2.80GHz), though all programs ran on a single processor.
  • Also, I was "working" (Spotify, web surfing, running other tests in parallel, etc) on the machine as the test ran.
  • Due to copyrights issues, I cannot distribute the examples. They can be downloaded from Wolter's Sample Paint-by-Number Puzzles Download.
And here are some comments about the different solvers.

Google CP Solver

Google CP Solver revision 259, with revision 259 of the Python solver nonogram_table2.py, which includes a lot of nice improvements, thanks by Laurent Perron.

Gecode

Gecode and Gecode/fz is version 3.4.2 (latest svn revision: 11517)
Please note that I have just tested the problem instances in the existing files for Gecode. Hence a lot of instances where not tested on my machine (they are marked N/A).

Also, I think that Wolter's time for Petro should be 1.4s (not 1.4m).

MiniZinc

MiniZinc version: 64bit Linux, snapshot version per 2010-11-05.

Note: In contrast to Wolter's result, the times for Gecode/fz and LazyFD includes generating the FlatZinc file, which may be considerable for large problem instances. Hence some of my results are larger for these solvers.

Some comments about Google CP Solver / Python solver

Wolter has done a great job analyzing the three other solvers (Gecode, Gecode/fz, and LazyFD), so I just comment on Google CP Solver solver.

It looks like Google CP Solver/Python solver is quite similar to Gecode/fz solver, and in some case (e.g. 9-Dom, Bucks, Tragic, Petro, etc) it has exactly the same number of failures. There are some exceptions, though:
  • Solved Merka and Dicap quite fast. Gecode/fz timed out for both these problems
  • Also solved Flag fast where Gecode/fz timed out. Here it is also faster than LazyFD and Gecode (note: Wolter's time).
  • Slower than Gecode/fz on Karate, Signed
  • Slightly slower than Gecode/fz on Tragic, with the same number of failures

Comparison - table

Here is the full table. The number in parenthesis is Wolter's times. The second row is my own timing. I also added the number of failures where applicable; LazyFD always returned 0 choice points so I skipped that. A + indicates time out (30 minutes).

The links for the puzzle is to Wolter's puzzle pages.
Puzzle Size Notes Gecode MiniZinc
Gecode/fz
MiniZinc
LazyFD
Google
CP Solver
#1: Dancer* 5 x 10 Line (0.01s)
0.01s/0 failures
(0,01s)
0.08s/0 failures
(0.04s)
0.1s
0.02s
0 failures
#6: Cat* 20 x 20 Line (0.01s)
0.1s/0 failures
(0.02s)
0.1s/0 failures
(0.44s)
0.95s
0.12s
0 failures
#21: Skid* 14 x 25 Line, Blanks (0.01s)
0.01s/0 failures
(0,02s)
0.09s/13 failures
(0.64s)
0.9s
0.03s
0 failures
#27: Bucks* 27 x 23 Blanks (0.01s)
0.2s/2 failures
(0.02s)
0.1s/3 failures
(1.1s)
1.6s
0.05s
3 failures
#23: Edge* 10 x 11   (0.01s)
0.01s/15 failures
(0.01s)
0.09s/25 failures
(0.08s)
0.16s
0.03s
25 failures
#2413: Smoke 20 x 20   (0.01s)
N/A
(0.02s)
0.11s/5 failures
(0.60s)
1.18s
0.05s
8 failures
#16: Knot* 34 x 34 Line (0.01s)
0.01s/0 failures
(0.02s)
0.14s/0 failures
(5.5s)
5.6s
0.12s
0 failures
#529: Swing* 45 x 45 Line (0.02s)
0.02s/0 failures
(0.04s)
0.24s/0 failures
(13.2s)
22.3s
0.3s
0 failures
#65: Mum* 34 x 40   (0.02s)
0.02s/17
(0.04s)
0.18s/20 failures
(11.0s)
12.4s
0.15s
22 failures
#7604: DiCap 52 x 63   (0.02s)
N/A
(0.06s)
0.29s/0 failures
(+)
1:48m
0.45s
0 failures
#1694: Tragic 45 x 50   (0.14s)
N/A
(3.6m)
2:14m/394841 failures
(32.1s)
34.57s
2:33m
394841 failures
#1611: Merka* 55 x 60 Blanks (0.03s)
N/A
(+)
+
(1.1m)
1:10m
0.47s
27 failures
#436: Petro* 40 x 35   (1.4m[s?])
0.05s/48 failures
(1.6s)
1.15s/1738 failures
(6.2s)
9.3s
1.19s
1738 failures
#4645: M&M 50 x 70 Blanks (0.93s)
N/A
(0.28s)
0.41s/89 failures
(48.1s)
1:14m
0.48s
82 failures
#3541: Signed 60 x 50   (0.57s)
N/A
(0.56s)
0.61s/929 failures
(1.0m)
1:02m
11.7s
6484 failures
#803: Light* 50 x 45 Blanks (+)
N/A
(+)
+
(4.7s)
18.32s
+
#6574: Forever* 25 x 25   (4.7s)
N/A
(2.3s)
1.5s/17143 failures
(1,7s)
1.99s
1.6s
17143 failures
#2040: Hot 55 x 60   (+)
N/A
(+)
+
(2.6m)
1:05m
+
#6739: Karate 40 x 40 Multiple (56.0s)
N/A
(57.8s)
38.0s/215541 failures
(13.6s)
22.52s
56.1s
215541 failures
#8098: 9-Dom* 19 x 19   (12.6m)
N/A
(3.8s)
2.59s/45226 failures
(1.8s)
3.11s
3.9s
45226 failures
#2556: Flag 65 x 45 Multiple, Blanks (3.4s)
N/A
(+)
+
(4.2s)
16.18s
1.6s
14859 failures
#2712: Lion 47 x 47 Multiple (+)
N/
(+)
+
(+)
8:54m
+
#10088: Marley 52 x 63 Multiple (+)
N/A
(+)
+
(2.9m)
+
+
#9892: Nature 50 x 40 Multiple (+)
N/A
(+)
+
(2.4m)
25.29s
+

October 09, 2010

Gecode version 3.4.2 released

Gecode version 3.4.2 has been released. Download.

From the Changelog:
Changes in Version 3.4.2 (2010-10-09)

This release removes LDS from Gecode as it is patented in the US.

  • Search engines
    • Removals
      • Removed limited discrepancy search (LDS) as it is patented in the US. (minor)
  • Gecode/FlatZinc
    • Additions
      • Added support for bin packing constraint. (minor)
Also, I have removed LDS from all my Gecode models.

Update: Mike Trick discuss the removal of LDS further in More patent madness!. There are also some very interesting comments to his post.

July 26, 2010

About 100 new Gecode models

Quite a few new Gecode models has been written to experiment with the new syntax sugar in Gecode version 3.4. Many of the models are Gecode versions of the models presented in Some 40 new ECLiPSe models, where almost all of these where first done in MiniZinc and then SICStus Prolog before implementing in ECLiPSe. So, now it's time for Gecode versions. It is probably not very surprising that it is much easier to translate from the MiniZinc version to Gecode than from the two Prolog-based versions. I have implemented all but 2 of these "about 40" ECLiPSe models.

As I wrote in ..., the new syntactic sugar for rel and expr (substituting the post in older Gecode versions) is very nice and it now much easier to write many constraints. As you may see in the examples, I also tend to (over)use the new append operator for IntVarArgs; it is very handy sometimes, e.g. when you don't know - or don't care to calculate - the length of an array with "dynamically" created IntVar's.

New Gecode models

Here are all the new models, often with comments about the problem or how I implemented them. Almost all of them has also been implemented in some - or all - of these systems:

Optimizing shopping basket

This is a rather straightforward translation of the MiniZinc model shopping_basket6.mzn where described in Optimizing shopping baskets: The development of a MiniZinc model and was presented in the StackOverflow question How to use constraint programming for optimizing shopping baskets?. The object is to minimize the cost of a shopping basket where items are from different shops, and there is a delivery cost ("punishment") if the total from a shop is below a certain limit.

Some comments:

1) Calculating the delivery_cost "punishment"
For calculating the delivery cost for a shop, first a total for the shop is summed. Here we loop over all shops and all items, and the shop's item cost is added if the item is bought from this shop (or else the cost for the item is 0). The implication (>>) is used. Here is my first attempt, i.e. to explicit state the two different cases:
rel(*this, (x[item] == shop) >> (this_shop_cost[item] == costs_m(shop,item)));
rel(*this, (x[item] != shop) >> (this_shop_cost[item] == 0));
However, there is a better way to state this constraint, better both in terms of modeling and propagation:
rel(*this, 
   this_shop_cost[item] == 
   expr(*this, x[item] == shop)*costs_m(shop,item));
Then, given the total cost for this shop, we see if there is a delivery cost. Again, the implication is used. Note the combined condition that cost must be larger than 0 in order for checking the delivery cost. Here is the first (very explicit) try:
// the cost is larger than 0 but below the limit
rel(*this, 
   (shop_costs[shop] > 0 && shop_costs[shop] <= delivery_costs_limits[shop]) 
   >>
   (punish[shop] == delivery_costs[shop]));
// no punishment: either 0 or larger than the limit
rel(*this, 
   (shop_costs[shop] == 0 ||  shop_costs[shop] > delivery_costs_limits[shop]) 
   >>
  (punish[shop] == 0));
Again, a better solution was found (or crafted):
rel(*this, 
punish[shop]==
  expr(*this,
       (shop_costs[shop] > 0 && 
        shop_costs[shop] <= delivery_costs_limits[shop])
      )*delivery_costs[shop]);
Probably this is slightly too influenced by MiniZinc's high-level syntax:
punish[j] = 
    delivery_costs[j]*bool2int(this_cost > 0 /\ 
            this_cost < delivery_costs_limits[j])
Using the first two attempts cited above, the model first had 30031 failures (and about 11 seconds). With the two improvements the failures decreased to 5012 failures (and about 2 seconds).

The solution is - of course - the same as for the MiniZinc model.

There are two solutions with a minimal total of 42013. To show these two, use the following command line options
  shopping_basket6 -search dfs 42013
The result is below. Note that the the first shops is named 0 in the x array, whereas in the MiniZinc model it is named 1.
total: 42013
x: {12, 19, 16, 17, 17, 12, 16, 16, 19, 12, 16, 16, 12, 12, 17, 16, 12, 12, 16, 7, 12, 35, 9, 16, 12, 12, 16, 19, 12}
item_costs: {345, 2921, 3074, 1233, 2554, 878, 1715, 3202, 2335, 639, 1644, 1606, 730, 1096, 1132, 1176, 1654, 1908, 2277, 298, 503, 927, 2182, 449, 397, 1246, 2145, 799, 558}
shop_costs: {0, 0, 0, 0, 0, 0, 0, 298, 0, 2182, 0, 0, 9954, 0, 0, 0, 17288, 4919, 0, 6055, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 927, 0}
punish: {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 390, 0}

total: 42013
x: {12, 19, 16, 17, 17, 12, 16, 16, 19, 12, 16, 12, 12, 12, 17, 16, 12, 12, 16, 7, 12, 35, 9, 16, 12, 12, 16, 19, 12}
item_costs: {345, 2921, 3074, 1233, 2554, 878, 1715, 3202, 2335, 639, 1644, 1606, 730, 1096, 1132, 1176, 1654, 1908, 2277, 298, 503, 927, 2182, 449, 397, 1246, 2145, 799, 558}
shop_costs: {0, 0, 0, 0, 0, 0, 0, 298, 0, 2182, 0, 0, 11560, 0, 0, 0, 15682, 4919, 0, 6055, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 927, 0}
punish: {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 390, 0}
When developing this Gecode model, I wanted to see all the intermediate results, i.e. the item costs, which shop had delivery costs (punish), etc, and then I kept them in the final model.

Costas Array

Model: costas_array.cpp

Generating Costas Arrays (MathWorld, see also Wikipedia's entry), has been a quite popular example in constraint programming. So here is my take.

This model is a port of my MiniZinc model costas_array.mzn (which in its turn is based on Barry O'Sullivan's MiniZinc model CostasArray.mzn). Two comments:
  • Note that the matrix, here defined by
    Matrix<IntVarArray> diffs(differences, n, n);
    
    is accessed diffs(column, row). I tend to forget this.
  • Upper triangular matrix
    The Costas array in this model is calculated by using a helper matrix where the n'th row represents the nth difference of the Costas array. This means that we just have to use the upper triangular matrix (the lower triangular is set to a constant value). For this we use the handy slice method which extracts just the part of the row we need:
    for(int i = 0; i < n-1; i++) {
       distinct(*this, diffs.slice(i+1,n,i,i+1), opt.icl());
    }
    
Symmetry breaking
This model use one symmetry breaking option (-symmetry min) which requires that the first element in the costas array must be the minimum.
if (opt.symmetry() == SYMMETRY_MIN) {
  min(*this, costas, costas[0], opt.icl());
}
For an array of 6, with this symmetry breaking there are 19 solutions (without: 119 solutions).

Divisible by 9 through 1 problem

divisible_by_9_through_1.cpp

The problem is from Solving Combinatory Problems with LINQ:
Find a number consisting of 9 digits in which each of the digits from 1 to 9 appears only once. This number must also satisfy these divisibility requirements:

1. The number should be divisible by 9.
2. If the rightmost digit is removed, the remaining number should be divisible by 8.
3. If the rightmost digit of the new number is removed, the remaining number should be divisible by 7.
4. And so on, until there's only one digit (which will necessarily be divisible by 1).
The Gecode code for this is (basically):
distinct(*this, digits, opt.icl());
for(int i = 0; i < n; i++) {
  int m = base-i-1;
  IntVarArray digits_slice(*this, digits.slice(0, 1, m));
  to_num(*this, digits_slice, t[i], base, opt.icl());
  rel(*this, t[i] % m == 0);
}
branch(*this, digits, INT_VAR_SIZE_MIN, INT_VAL_SPLIT_MAX); 
slice for arrays
As I understand it, slice for arrays is new in Gecode 3.4.0 (there have been a slice for matrices in earlier versions) and is very handy.

The above code could have been written somewhat more compact, by doing the slice part in the call to to_num, but it don't looks as nice.
to_num(*this, 
       IntVarArray(*this, digits.slice(0, 1, m)), 
       t[i], 
       base, 
       opt.icl());
A related note: The constraint to_num which ensures that the array x contains the digitss (and in correct order) of the number z in base base can now be written as follows. sum(coeffs, x) == z is a multiplication of the array x with the array coeffs. In this case, however, it takes slightly more propagations than using the linear variant.
void to_num(Space & space, IntVarArray x, IntVar z, int base = 10, IntConLevel icl = ICL_BND) {
  int len = x.size();
  IntArgs coeffs(len);
  for(int r = 0; r < len; r++) {
    coeffs[r] = pow(base, len-r-1);
  }
  // linear(space, coeffs, x, IRT_EQ, z, icl); // older version
  rel(space, sum(coeffs, x) == z, icl); // alternative version
}
Solutions
Due to integer overflow, this model cannot handle base > 10.

Answer: 381654729 t: {381654729, 38165472, 3816547, 381654, 38165, 3816, 381, 38, 3} digits: {3, 8, 1, 6, 5, 4, 7, 2, 9} (in base 10) The array t is the progression of the truncated number (in base 10), and the array digits contains the digits used.

For base=8 there are 3 solutions. (Note that the answer is in base 10.)
Answer: 1538257
t: {1538257, 192282, 24035, 3004, 375, 46, 5}
digits: {5, 6, 7, 4, 3, 2, 1} (in base 8)

Answer: 1391089
t: {1391089, 173886, 21735, 2716, 339, 42, 5}
digits: {5, 2, 3, 4, 7, 6, 1} (in base 8)

Answer: 874615
t: {874615, 109326, 13665, 1708, 213, 26, 3}
digits: {3, 2, 5, 4, 1, 6, 7} (in base 8)
Here is all solutions from base 2..10 where the number is in base 10. 381654729 ([3,8,1,6,5,4,7,2,9])
BaseSolutions, base 10 ([digits in base])
21 ([1])
3-
457 ([3,2,1]), 27 ([1,2,3])
5-
67465 ([5,4,3,2,1]), 2285 ([1,4,3,2,5])
7-
81538257 ([5,6,7,4,3,2,1]), 1391089 ([5,2,3,4,7,6,1]), 874615 ([3,2,5,4,1,6,7])
9
10


Compare with other implementations in other systems:

Broken weights

Gecode model: broken_weights.cpp

This problem is from Math Less Traveled:
Here's a fantastic problem I recently heard. Apparently it was first posed by Claude Gaspard Bachet de Méziriac in a book of arithmetic problems published in 1612, and can also be found in Heinrich Dorrie's 100 Great Problems of Elementary Mathematics.
"""
A merchant had a forty pound measuring weight that broke into four pieces as the result of a fall. When the pieces were subsequently weighed, it was found that the weight of each piece was a whole number of pounds and that the four pieces could be used to weigh every integral weight between 1 and 40 pounds. What were the weights of the pieces?
"""
Note that since this was a 17th-century merchant, he of course used a balance scale to weigh things. So, for example, he could use a 1-pound weight and a 4-pound weight to weigh a 3-pound object, by placing the 3-pound object and 1-pound weight on one side of the scale, and the 4-pound weight on the other side.
It is a rather simple problem, much like the coins problem (coins3.cpp) where the object is to find a combination of coins (of a certain nomination) to be able to give back changes for all values 1 to 99.

There is one significant difference between these problems, however: in this weights problem, the weights can be on either side of the balance scale, so instead of a simple 0..n representation (how many, is any, of each coin/weights to use), we use {-1,0,+1} where -1 means that a weight is on the left side of the scale and +1 means it is on the right size, and 0 means it is unused.

The main Gecode model is shown below. weights is the weights to use (of size 4), and orig_weight is the total weight (the original weight that was broken, here 40). x (and it's matrix alias x_m) is the representation of which weights to use ({-1,0,1 }) for each combination (from 1..40). Also, not shown here, we minimize the element last in the sorted weights array.
// a matrix version of the different combinations of weights to use
Matrix<IntVarArgs> x_m(x, num_weights, orig_weight);

// symmetry breaking: order the weights
rel(*this, weights, IRT_LQ, opt.icl());

// sum of the weights is w
linear(*this, weights, IRT_EQ, orig_weight, opt.icl());

// all different weights from 1..40 must be combined
for(int w = 1; w <= orig_weight; w++) {
  IntVarArray tmp(*this, num_weights, -orig_weight, orig_weight);
  for(int c = 0; c < num_weights; c++) {
    rel(*this, weights[c]*x_m(c,w-1) == tmp[c], opt.icl());
  }
  rel(*this, sum(tmp) == w, opt.icl());
}

branch(*this, weights, INT_VAR_DEGREE_MAX, INT_VAL_RANGE_MIN);
As noted above, this is a simple problem (and model). Some comments:
  • It took me some time to realize that the tmp array must have a range of -orig_weight..orig_weight.
  • branching: We just need to branch on the weights. One of the first tries, I branched just on x which was a bad idea (or I happened to pick the wrong variable/value selection).
  • Note the use of the "alias" sum for summing the tmp array.
Options
Well, the above solved the specific problem with original weight (orig_weight) of 40 and 4 pieces (num_weights). However, the model then developed in a more general and can now handle any (or should I write "any") combination of original weight and number of pieces.

Gecode don't have direct support for this kind of off-standard command lines options (there are a couple standard options, though). There is the much used size option, which is often used for selecting a specific problem, or a size of some kind. But here we want to have two different options, so a dedicated option class was written:
class BrokenWeightsOptions : public Options {
private:
  Driver::UnsignedIntOption _num_weights; // number of weights
  Driver::UnsignedIntOption _orig_weight; // original weight
public:
  // Initialize options
  BrokenWeightsOptions(const char* n)
    : Options(n),
      _num_weights("-num_weights",
                   "number of weights",
                   4),
      _orig_weight("-orig_weight",
                   "original weight",
                   40) {
    add(_num_weights);
    add(_orig_weight);
  }
  // Parse options from arguments argv (number is argc)
  void parse(int& argc, char* argv[]) {
    Options::parse(argc,argv);
  }

  unsigned int num_weights(void) const { return _num_weights.value(); }
  unsigned int orig_weight(void) const { return _orig_weight.value(); }
};
Now we could test different problem, e.g. if there could be 3 number of pieces instead of 4: broken_weights -num_weights 3 -orig_weight 40 Answer: no, it could not.

For the problem of original weight of 50 and 5 broken weights, the model came up with this solution (remember, we minimize the largest elements of the weights array):
weights: {1, 3, 9, 18, 19}
Another feature is the -search option, which by default is bab (branch & bound). Using -dfs (depth first search), all solutions is given instead of the minimal solution (or rather the progression of minimal solutions).

Compare with the following implementations:

Strimko / Chain Sudoku

In Strimko - Latin squares puzzle with "streams" (August 2009), I blogged about Strimko (a.k.a. Chain Sudoku) - then a rather new grid puzzle - based on Latin squares. For more about Strimko, see the references in that blog post.

The Gecode model for solving Strimko problems (including 6 instances) is strimko2.cpp. Some of the constraints are shown below:
  • the Latin Square constraint is modeled rather simple with the use of Matrix alias.
  • the new feature of array concatenation of IntVarArgs. stream << x[i*n+j]; means: for each position in stream that belongs to "this" stream st, push the corresponding element in the decision variable array (matrix) x to the temporary array stream. All the values in stream must then be distinct.
Matrix<IntVarArray> x_m(x, n, n);
// Latin Square
for (int i = 0; i < n; i++) {
  distinct(*this, x_m.row(i), opt.icl());
  distinct(*this, x_m.col(i), opt.icl());
}

// streams
for(int st = 1; st <= n; st++) {
  IntVarArgs stream;
  for(int i = 0; i < n; i++) {
    for(int j = 0; j < n; j++) {
      if (streams[i*n+j] == st) {
        stream << x[i*n+j];
      }
    }
  }
  distinct(*this, stream, opt.icl());
}
The problem also state some fixed hints in the grid, but the code for handling this is not shown. Compare with the following models:

Futoshiki puzzle

Futoshiki puzzles are another Latin square based puzzles, with the additional twist that there are less-than (and greater-than) hints. Here is an example from Wikipedia:



The Gecode model futoshiki.cpp is very much like Strimko model (modulo the different type of hints), so I don't show any code here. The program includes includes 3 examples, one is the Wikipedia example shown above.

Compare with the following models:

Latin square card puzzle

This problem is from Mario Livio's book about group theory "The Equation that couldn't be solved", page 22:
... Incidentally, you may get a kick out of solving this eighteenth century card puzzle: Arrange all the jacks, queens, kings, and aces from a deck of cards in a square so that no suit or value would appear twice in any row, column, or the two main diagonals.
The Gecode model is latin_square_card_puzzle.cpp contains just a bunch of distinct (all different) constraints for distinct rows, columns, and the two diagonals.

The representation (for n=4) is rather simple: the value is the digit (here 0..3), and the 10's is the suits (0's, 10's, 20's, and 30's) and is access by mod respectively division with 10. So the complete list of valid values are
// values: i mod 10 
0, 1, 2, 3,  // suite 0: 0 div 10
10,11,12,13, // suite 1: 1 div 10
20,21,22,23, // suite 2: 2 div 10
30,31,32,33  // suite 3: 3 div 10
The initial domain of the grid (x) is 0..n*10, but it is reduced by these valid values via an IntSet, and the function dom.
x(*this, n*n, 0, n*10)
// ...
// create all the valid values
IntArgs cards_a;
for(int i = 0; i < n; i++) {
   for(int j = 0; j < n; j++) {
     cards_a << i+m*j;
   }
}
IntSet cards(cards_a); // convert to a IntSet
dom(*this, x, cards);  // restrict the domain
Compare with the MiniZinc model latin_square_card_puzzle.mzn (where the construction of all the alldifferent's is much easier).

Note: There is a solution of n = 4 and n = 5, but not for n = 6. The n = 6 problem is the same as Euler's 36 officer's problem, which thus is not solvable. Also see MathWorld

Latin squares

Gecode model: latin_squares.cpp.

Just a simple decomposition of Latin squares constraint using Gecode's Matrix wrapper of a n*n IntVarArray
void latin_square(Space& space, Matrix m, 
                  IntConLevel icl = ICL_DOM) {
  int n = m.width();
  for(int i = 0; i < n; i++) {
    distinct(space, m.row(i), icl);
    distinct(space, m.col(i), icl);
  }
}
It is called by the following:
Matrix m(x, n, n);
latin_square(*this, m, opt.icl());

Tourist site competition

Gecode model: tourist_site_competition.cpp

This problem is from from Pierre Flener's presentation Constraint Technology - A Programming Paradigm on the Rise (PDF), pages 5f: problem statement, pages 12f: model, pages 21ff: walktrough of a solution. Here with 7 tourist sites and 7 judges:
Every tourist site is visited by r = 3 judges. Every judge visits c = 3 tourist sites. Every pair of sites is visited by lambda = 1 common judge.
There are 151200 solutions to this problem. With the additional symmetry breaking constraint that Ali (judge 0 in the model) should visit Birka, Falun and Lund (sites 0,1,2) there are 4320 solutions.

This problem was also presented as "The Airline-of-the-Year Problem" in Flener's presentation Constraint Programming - Programming Paradigm on the Rise, page 4f. The problem is stated as follows for 7 airlines and 7 judges:
Constant jury: Every airline is tested by 3 judges. Constant load: Every judge tests 3 airlines. Equity: Every airline pair is tested by 1 common judge.
The model is a quite straight forward translation of the MiniZinc model tourist_site_competition.mzn.

One note: The code for showing which sites a judge visits, and which judges visits a site is done with the following two channel constraints:
// declarations
BoolVarArray x; 
SetVarArray judges_where; // where are the judges
SetVarArray sites_who;    // who visits this sites
Matrix x_m(x, num_judges, num_sites);
// ..
// Where are the judges?
for(int j = 0; j < num_judges; j++) {
   channel(*this, x_m.col(j), judges_where[j]);
}
// Which judges visits the sites?
for(int s = 0; s < num_sites; s++) {
  channel(*this, x_m.row(s), sites_who[s]);
}

Scheduling speakers

scheduling_speakers.cpp. This problem is from from Rina Dechter's book "Constraint Processing", page 72 where the object is to schedule 6 speakers to 6 slots.

It is a very simple model, and the only problem was how to convert the list of available slots to set variable. Here is my solution, and maybe there is a better way,
3,4,5,6,
3,4,
2,3,4,5,
2,3,4,
3,4,
1,2,3,4,5,6
After first trying a more general but complicated variant (which is shown in the model), I settled with this simpler one, which works fine for this small example:
IntSetArgs available(n);
available[0] = IntSet( IntArgs() << 3 << 4 << 5 << 6 );
available[1] = IntSet( IntArgs() << 3 << 4 );
available[2] = IntSet( IntArgs() << 2 << 3 << 4 << 5 );
available[3] = IntSet( IntArgs() << 2 << 3 << 4 );
available[4] = IntSet( IntArgs() << 3 << 4 );
available[5] = IntSet( IntArgs() << 1 << 2 << 3 << 4 << 5 << 6 );
Then the constraints can be written
const static int n = 6;
// ...
x(*this, n, 1, n);
// ...
distinct(*this, x, opt.icl());
for(int i = 0; i < n; i++) {
   // rel(*this, SetVar(*this, available[i],available[i]), SRT_SUP, x[i]);
   rel(*this, singleton(x[i]) <= available[i]);
}
Where the for loop corresponds to this high-level code forall(i in 1..n) (x[i] in available[i]).

PERT

Gecode model: pert.cpp.

This is a simple PERT model in Gecode, from Pascal van Hentenryck Scheduling and Packing In the Constraint Language cc(FD), page 7f.

Compare with the following models:

Lichtenstein coloring problem

Gecode model: lichtenstein_coloring.cpp.

This problem is from bit-player: The chromatic number of Liechtenstein
It seems that Liechtenstein is divided into 11 communes, which emphatically do not satisfy the connectivity requirement of the four color map theorem. Just four of the communes consist of a single connected area (Ruggell, Schellenberg and Mauren in the north, and Triesen in the south). ... In the map above, each commune is assigned its own color, and so we have an 11-coloring. It’s easy to see we could make do with fewer colors, but how many fewer? I have found a five-clique within the map; that is, there are five communes that all share a segment of border with one another. It follows that a four-coloring is impossible. Is there a five-coloring? What is the chromatic number of Liechtenstein?
Also, see Mikael Johansson's post at Michi's blog On the chromatic number of Lichtenstein where I got the connections between communes/enclaves of Lichtenstein.

Apart from quite a lot of preparations, in terms of initializing arrays etc, this model was surprisingly easy to convert from the MiniZinc model (lichtenstein_coloring.mzn).

An example: the symmetry breaking constraint to ensure that we must take the colors in order - i.e. start from color 0, then 1, and so on - we can now (in Gecode 3.4) use the following; it was slightly less writable/readable in earlier versions.
for(int c = 1; c < num_colors; c++) {
  rel(*this, (color_used[c]==1) >> (color_used[c-1]== 1));
}
With this and the other symmetry breaking constraint that we assign a particular commune (Balzers) with the color 0, there are 1872 solutions. They are shown when using -search dfs.

Also, compare with the following models:

Averbach seating problem

Gecode model: averbach_1.4.cpp.

This is Problem 1.4 from Averbach & Chein "Problem Solving Through Recreational Mathematics":
Messr Baker, Dyer, Farmer, Glover, and Hosier are seated around a circular table, playing poker. Each gentleman is the namesake of the profession of one of the others. The dyer is seated two places to the left of Mr Hosier. The baker sits two places to Mr Baker's right. The farmer is seated to the left of Mr Farmer. Mr Dyer is on the glover's right. What is the name of the dyer?
The only tricky part here was the modulo constraints (and these constraints are about all there is). The constraint The dyer is seated two places to the left of Mr Hosier can be translated into the high-level expression dyer = (Hosier - 2) mod 5 which is nearly exactly how to express it in Gecode:
rel(*this, baker == (Baker + 2) % 5);
This could also be written by explicit use the mod constraint:
IntVar c5(*this, 5, 5);
mod(*this, expr(*this, Baker + 2),  c5, baker);
As you probably have understood now, I prefer the first, shorter version.

Fill-a-pix

Gecode model: fill_a_pix.cpp

From Conceptis Puzzles: Fill-a-Pix rules:
Fill-a-Pix is a Minesweeper-like puzzle based on a grid with a pixilated picture hidden inside. Using logic alone, the solver determines which squares are painted and which should remain empty until the hidden picture is completely exposed.
Given a grid of hints with the number of painted squares around a specific square, the object is to figure out (pun intended) the painted squares. X refers to an unknown (and is represented with -1 in the model).
X,X,X,X,X,X,X,X,0,X,
X,8,8,X,2,X,0,X,X,X,
5,X,8,X,X,X,X,X,X,X,
X,X,X,X,X,2,X,X,X,2,
1,X,X,X,4,5,6,X,X,X,
X,0,X,X,X,7,9,X,X,6,
X,X,X,6,X,X,9,X,X,6,
X,X,6,6,8,7,8,7,X,5,
X,4,X,6,6,6,X,6,X,4,
X,X,X,X,X,X,3,X,X,X
This problem has just one combined constraint, and again the new syntactic sugar makes it rather easy to state: for each square which is not X, we require that there are this number of neighbours which are painted (has value 1). n is the dimension of the grid and prob is the hints of the problem instance (an array of integers).
for(int i = 0; i < n; i++) {      
  for(int j = 0; j < n; j++) {
    // get the data from the instance
    int p = prob[c++]; 
    if (p > X) {
      IntVarArgs tmp;
      // must be inside the grid
      for(int a = -1; a <= 1; a++) {
        for(int b = -1; b <= 1; b++) {
          if (i+a >= 0 && i+a < n &&
              j+b >= 0 && j+b < n) {
            tmp << x[(i+a)*n+(j+b)];
          }
        }
      }
      rel(*this, p == sum(tmp));
    }
  }
}
This model was inspired/derived from my MiniZinc model (fill_a_pix.mzn which in turn was derived from my Minesweeper in MiniZinc (minesweeper.mzn).

Compare with the following models: Also, see with the Minesweeper example from the Gecode distribution: minesweeper.cpp.

Minesweeper 2

Gecode model: minesweeper2.cpp

Since the Fill-a-pix problem (see above) is quite like the Minesweeper problem, I also implemented a Minesweeper model based on the Fill-a-pix model.

The first 10 problem instances was borrowed from the Minesweeper example from the Gecode distribution (minesweeper.cpp). I added the 7 instances that was included with the MiniZinc model (see my MiniZinc page for a list of these instances); some of these are not square matrices.

Here is the main constraint for the Minesweeper model. It is very much the same as for Fill-a-pix, with some exceptions:
  • It supports non-square problems
  • Two extra constraints are added:
    • If a cell has a hint, then it cannot be a bomb
    • If a cell contains a bomb then it cannot be numbered.
for(int i = 0; i < rows; i++) {      
  cout << "  ";
  for(int j = 0; j < cols; j++) {
    int p = prob[c++]; // get data
    IntVar pp(*this, p, p);
    rel(*this, (pp > XX) >> (x[i*cols+j] == 0));
    rel(*this, (x[i*cols+j] == 1) >> (pp == XX));
    if (p > X) {
      IntVarArgs tmp;
      for(int a = -1; a <= 1; a++) {
        for(int b = -1; b <= 1; b++) {
          if (i+a >= 0 && i+a < rows &&
              j+b >= 0 && j+b < cols) {
            tmp << x[(i+a)*cols+(j+b)];
          }
        }
      }
      rel(*this, p == sum(tmp));
    }
  }
}

Clock triplets

Gecode model: clock_triplets.cpp. Problem formulation from the F1 Compiler example:
Dean Clark's Problem (Clock Triplets Problem)
The problem was originally posed by Dean Clark and then presented to a larger audience by Martin Gardner. The problem was discussed in Dr. Dobbs's Journal, May 2004 in an article by Timothy Rolfe. According to the article, in his August 1986 column for Isaac Asimov's Science Fiction Magazine, Martin Gardner presented this problem:
"""
Now for a curious little combinatorial puzzle involving the twelve numbers on the face of a clock. Can you rearrange the numbers (keeping them in a circle) so no triplet of adjacent numbers has a sum higher than 21? This is the smallest value that the highest sum of a triplet can have.
"""

Mr Greenguest fancy dress problem

Gecode model: fancy.cpp.

Problem formulation and LPL code:
Mr. Greenfan wants to give a dress party where the male guests must wear green dresses. The following rules are given: 1 If someone wears a green tie he has to wear a green shirt. 2 A guest may only wear green socks and a green shirt if he wears a green tie or a green hat. 3 A guest wearing a green shirt or a green hat or who does not wear green socks must wear a green tie. 4 A guest who is not dressed according to rules 1-3 must pay a $11 entrance fee. Mr Greenguest wants to participate but owns only a green shirt (otherwise he would have to pay one for $9). He could buy a green tie for $10, a green hat (used) for $2 and green socks for $12. What is the cheapest solution for Mr Greenguest to participate?

Jobs puzzle

Gecode model: jobs_puzzle.cpp.

This is a standard problem in Automatic Reasoning. Problem formulation from Larry Wos' problem page:
Jobs Puzzle There are four people: Roberta, Thelma, Steve, and Pete. Among them, they hold eight different jobs. Each holds exactly two jobs. The jobs are chef, guard, nurse, clerk, police officer (gender not implied), teacher, actor, and boxer. The job of nurse is held by a male. The husband of the chef is the clerk. Roberta is not a boxer. Pete has no education past the ninth grade. Roberta, the chef, and the police officer went golfing together. Question: Who holds which jobs?

Heterosquare

Gecode model: heterosquare.cpp. Problem from Comet's old Wiki:
A heterosquare of order n is a n*n square whose elements are distinct integers from 1 to n^2 such that the sums of the rows, columns and diagonals are all different. Here is an example of heterosquare of order 3 19 1 2 3 6 8 9 4 21 7 6 5 18 16 17 12 15 (Sums)
The constraints are shown below. One can note the expression sum(m.row(i)) == row_sums[i] (i.e. sum over a row in the matrix); I wasn't sure if it would work. I have also noted that I've been lazy and use the append array operator (<<) quite much.
Matrix m(x, n, n);
// all the entries in the matrix should be different
distinct(*this, x);

// and all sums should be different
IntVarArgs all_sums;
all_sums << row_sums; // append to the array all_sums
all_sums << col_sums;
all_sums << diag1;
all_sums << diag2;    
distinct(*this, all_sums);

// rows/column sums and diagonals
IntVarArgs v_diag1;
IntVarArgs v_diag2;
for(int i = 0; i < n; i++) {
  rel(*this, sum(m.row(i)) == row_sums[i]);
  rel(*this, sum(m.col(i)) == col_sums[i]);
  v_diag1 << m(i,i);
  v_diag2 << m(n-i-1,i);
}

rel(*this, sum(v_diag1) == diag1);
rel(*this, sum(v_diag2) == diag2);
Some trivia: Without any symmetry breaking, there are 24960 solution of order 3, i.e. a 3 by 3 matrix. The first one is:
            | 14 (diag2)
  1   2   3  =   6
  4   5   8  =  17
  6   9   7  =  22
---------------
 11  16  18 | 13 (diag1)
Compare with the following implementations:

Some explorations of ISBN13

Gecode model: isbn.cpp.

This model is just for exploring ISBN13.

Put the ISBN (with 12 or 13 digits) to check at the command line: ./isbn 978052112549, and the program will print the check digit (9). If we put a shorter ISBN13, e.g with 11 digits (97805211254), then the program will print all the ISBN13 that matches, i.e. will calculate the two last digits in the number:
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 7, 1} checkdigit: 1
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 6, 2} checkdigit: 2
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 5, 3} checkdigit: 3
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 4, 4} checkdigit: 4
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 3, 5} checkdigit: 5
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 2, 6} checkdigit: 6
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 1, 7} checkdigit: 7
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 0, 8} checkdigit: 8
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 9, 9} checkdigit: 9
Note: The digit check for ISBN uses two constants 3 and 1 (which I call mult0 and mult1 in the model). If we let these two "free" (see below how to do that) then there are 90 different numbers that satisfies the first 12 digits. Some of them are (chosen to have not the same check digit):
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 9, 1} checkdigit: 1
mult0: 1
mult1: 0
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 9, 1} checkdigit: 1
mult0: 1
mult1: 5
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 9, 1} checkdigit: 1
mult0: 3
mult1: 3
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 9, 5} checkdigit: 5
mult0: 9
mult1: 6
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 9, 6} checkdigit: 6
mult0: 0
mult1: 1
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 9, 8} checkdigit: 8
mult0: 8
mult1: 5
ISBN13: {9, 7, 8, 0, 5, 2, 1, 1, 2, 5, 4, 9, 9} checkdigit: 9
mult0: 1
mult1: 3
The mult's can be stated on command line:
  ./isbn ISBN mult0 mult1
Example: isbn 978052112549 3 1 for the default multiplicators. The value 11 (or any value larger than 9) is to let one mult free, e.g. isbn 978052112549 11 11 for letting both free, where we get the 90 solutions mentioned above.

Compare with the following models:

Covering (OPL)

Gecode model: covering_opl.cpp.

This problem is from the OPL example cover.mod, where the object is to find a minimum cover of workers for a set of tasks. See the Comet model covering_opl.co for a fairly OPL-like model.

I thought it would be easy to model this problem in Gecode. This Comet constraint is about the only constraint in the model (except for the cost calculation), and it ensures that there is at least one qualified worker for each task:
forall(j in Tasks)
  m.post(sum( c in Qualified[j] ) 
       Hire[c] >= 1);
However, it took quite a time to get this correct. Here is the different stages of the model.

My first approach was to translate this to the following Gecode snippet. Qualified is an IntSetArgs array of the workers qualified for each task, Hire is an IntVarArray of values 0..1, indicating that a workers is hired. The code loop through all the tasks and each workers (w) to sum the qualified and hired workers. We have two boolean arrays here: is_qualified[w] is true (i.e. 1) if this worker is qualified for the task, and is_qualified_and_hired[w] is set if this worker is qualified and hired.
for(int t = 0; t < num_tasks; t++) {
  SetVar s(*this, Qualified[t], Qualified[t]);
  BoolVarArray is_qualified(*this, nbWorkers, 0, 1);
  BoolVarArray is_qualified_and_hired(*this, nbWorkers, 0, 1);
  for(int w = 0; w < nbWorkers; w++) {
    // is this worker qualified for this task?
    rel(*this, s, SRT_SUP, IntVar(*this, w, w), is_qualified[w]);
    // if this worker is qualified _and_ hired 
    // then it's a count
    rel(*this, (is_qualified[w] && Hire[w] == 1) == 
       (is_qualified_and_hired[w] == 1));
  }
  // there must be some worker hired for this task
  rel(*this, sum(is_qualified_and_hired) >= 1);
}
We have seen the set operator SRT_SUP before, in the Scheduling speakers, and like that problem, some trickery was needed to convert the set of qualified workers to this model (and that is more a C++ issue than a Gecode thing).

After some thought, I changed to this more compact version using the set constraint in an expr:
    for(int t = 0; t < num_tasks; t++) {
      SetVar s(*this, Qualified[t], Qualified[t]);
      BoolVarArray is_qualified_and_hired(*this, nbWorkers, 0, 1);
      for(int w = 0; w < nbWorkers; w++) {
        // if this worker qualified _and_ hired then it's a count 
        rel(*this, (expr(*this, s >= IntSet(w,w)) && Hire[w] == 1) == 
            (is_qualified_and_hired[w] == 1));
      }
      // must be some worker hired for this task
      rel(*this, sum(is_qualified_and_hired) >= 1);
    }
I would rather like to loop over just the qualified workers in the Qualified[t] set, but I didn't know how to do that yet.

Ah. Some minutes later I figured it out (by reading the documentation about IntSet in more detail). This is definitely more like it:
for(int t = 0; t < num_tasks; t++) {
  IntSet tt(_QualifiedSet[t]);
  BoolVarArgs bb;
  for(int w = 0; w < nbWorkers; w++) {
    // is this worker qualified?
    if (tt.in(w)) {
      // then: is he/she hired?
      bb << expr(*this, Hire[w] == 1);
    }
  }
  // at least one worker must hired for this task
  rel(*this, sum(bb) >= 1);
}

Pi Day Sudoku 2009

Gecode model: sudoku_pi.cpp.

Problem from Brain Freeze Puzzles/a>, and 360 Pi Day Sudoku 2009:
Each row, column, and region contains the digits 1-9 exactly once plus three Pi symbols.
I wrote about this problem in Pi Day Sudoku 2009 - the models (MiniZinc and Comet), and the two models in MiniZinc (sudoku_pi.mzn) and Comet (sudoku_pi.co).

The Gecode model was surprisingly easy to do, maybe except for changing from 1-based to 0-based indexing for the hints and using an array based representation instead of a matrix proper. The constraints is just a lot of global cardinality constraint (count in Gecode); we cannot use the alldifferent constraints (distinct in Gecode) since there are (exactly) three occurrences of Π (this symbol should be a fancy Pi) in each row/column/region.

Set partition

Gecode model: set_partition.cpp.

Set partition problem. From Koalog:
Given the set S = {1, 2, ..., n}, it consists in finding two sets A and B such that: * A union B = S * |A| = |B| * sum(A) = sum(B) * sum_squares(A) = sum_squares(B)
This model uses set variables and set operations for union, cardinality etc. The sums and squared sums are calculated by using the set variable method weights.

In the Gecode distribution there is also a model of this problem. However it use integer variables and a lot of redundant/efficiency constraints.

Mr Smith problem

Gecode model: mr_smith.cpp.

Logic problem from an IF Prolog example:
The Smith family and their three children want to pay a visit but they do not all have the time to do so. Following are few hints who will go and who will not: o If Mr Smith comes, his wife will come too. o At least one of their two sons Matt and John will come. o Either Mrs Smith or Tim will come, but not both. o Either Tim and John will come, or neither will come. o If Matt comes, then John and his father will also come.
Here the new boolean operators in Gecode shines. In earlier version this would be much harder to write (and read):
BoolVar 
   Mr_Smith(L[0]), 
   Mrs_Smith(L[1]), 
   Matt(L[2]), 
   John(L[3]), 
   Tim(L[4]);

// If Mr Smith comes, his wife will come too.
rel(*this, Mr_Smith >> Mrs_Smith);

// At least one of their two sons Matt and John will come.
rel(*this, Matt || John);

// Either Mrs Smith or Tim will come, but not both.
rel(*this, Mrs_Smith + Tim == 1);

// Either Tim and John will come, or neither will come.
rel(*this, Tim == John);

// If Matt comes, then John and his father will also come.
rel(*this, Matt >> (John && Mr_Smith));

Talisman square

Gecode model: talisman_square.cpp.

From MathWorld Talisman Square:
An n x n array of the integers from 1 to n^2 such that the difference between any one integer and its neighbor (horizontally, vertically, or diagonally, without wrapping around) is greater than or equal to some value k is called a (n,k)-talisman square.
Since this problem used two parameters (n and k) I wrote a special option class (using the standard SizeOption just gives the possibility of one optional parameter).
class TalismanSquareOptions : public Options {
private:
  Driver::UnsignedIntOption _n_option;
  Driver::UnsignedIntOption _k_option;
public:
  TalismanSquareOptions(const char* n) 
    : Options(n),
      _n_option("-n",
                "n value",
                5),
      _k_option("-k",
                "k value",
                4)
  {
    add(_n_option);
    add(_k_option);
  }
  void parse(int& argc, char* argv[]) {
    Options::parse(argc,argv);    
  }
  unsigned int k(void) const  { return _k_option.value();  }
  unsigned int n(void) const  { return _n_option.value();  }
};
And then, the program can be called with the following parameters
./talisman_square -k 3 -n 10

Curious numbers

Gecode model: cur_num.cpp.

Problem: Curious Numbers (#114) from Dudeney "Amusements in Mathematics".

Curious set of integers

Gecode model: curious_set_of_integers.cpp.

From Martin Gardner:
The integers 1,3,8, and 120 form a set with a remarkable property: the product of any two integers is one less than a perfect square. Find a fifth number that can be added to the set without destroying this property.

Bin packing problems

Gecode model: bin_packing.cpp.

This is a rather simple, or at least straightforward, model of bin packing. It includes the same 9 problem as the SICStus Prolog model (bin_packing.pl).

There is an extra option (-symmetry sort) which sorts the entries (the stuff array) before trying to pack, and it is - to no surprise - quite faster.

Exodus (Dell Logic Puzzles)

Gecode model: exodus.cpp.

One new addition of rel-expressions is element which is used in this model. Instead of the following with temporary IntVar:s and element constraints:
IntVar Age_Yemen(*this, dom_ages);
element(*this, Age, Yemen, Age_Yemen);
IntVar Age_Ethiopia(*this, dom_ages);
element(*this, Age, Ethiopia, Age_Ethiopia);
We can now simply write:
rel(*this, element(Age,Yemen) < element(Age,Ethiopia));
However, the following is still not possible to write since Yemen and Ethiopia is IntVar:
rel(*this, Age[Yemen] < Age[Ethiopia]);
Maybe this will come is later version?

Note: As I noted this use of element expression rather late, not all models use it. Also, sometimes it is better to use the temporary version and element constraint, for example where the domain of the (temporary) variable should be tweaked in some way.

Scheduling problem I

Gecode model: schedule1.cpp.

Scheduling problem from SICStus Prolog.

This model use the new scheduling constraint cumultive which is a simpler version of the existing constraint cumulatives. (See Scheduling with the cumulatives constraint in Gecode for more about cumulatives.)

Set covering problem II

Gecode model: set_covering2.cpp.

Set covering problem from Taha: "Operations Research - An Introduction". Example 9.1-2: Minimize the number of security telephones in street corners on a campus.

Set covering problem III

Gecode model: set_covering3.cpp.

Set covering problem from Murty: "Optimization Models for Decision Making", page 302f.

Set partition/set covering problem

Gecode model: set_covering4.cpp.

Problem from Lundgren, Rönnqvist, Värbrand "Optimeringslära", page 408: minimize the cost of the alternatives which covers all the objects. This model has also an option for solving it either as set partition problem (parameter 0) or as a set covering problem (parameter 1).

Set covering (Skiena)

Gecode model: set_covering_skiena.cpp.

From Steven Skiena: The Stony Brook Algorithm Repository. Set cover.

Combinatorial auction

Gecode model: combinatorial_auction.cpp.

This example is from the Wikipedia article Combinatorial_auction. It is here implemented as a set partition problem.

Just forgotten problem (Enigma 1517)

Gecode model: just_forgotten.cpp.

Huey, Dewey and Louie problem

Gecode model: huey_dewey_louie.cpp.

K4P2 Graceful Graph

Gecode model: K4P2GracefulGraph2.cpp.

3 jugs problem

Gecode model: 3_jugs2.cpp.

The famous 3 jugs problem with a constraint programming bent (or MIP bent if you like it).

Knapsack investments problem

Gecode model: knapsack_investments.cpp.

Decide which projects should be investment in, with some extra requirements.

Decomposition of global constraint all_differ_from_at_least_k_pos

Gecode model: all_differ_from_at_least_k_pos.cpp.

Decomposition of global constraint all_equal

Gecode model: all_equal.cpp.

Decomposition of global constraint all_min_dist

Gecode model: all_min_dist.cpp.

Decomposition of global constraint alldifferent_cst

Gecode model: alldifferent_cst.cpp.

Decomposition of global constraint alldifferent_modulo

Gecode model: alldifferent_modulo.cpp.

Decomposition of global constraint alldifferent_on_intersection

Gecode model: alldifferent_on_intersection_modulo.cpp.

Decomposition of global constraint among

Gecode model: among.cpp.

Decomposition of global constraint among_seq

Gecode model: among_seq.cpp.

Decomposition of global constraint bin_packing

Gecode model: bin_packing2.cpp.

Decomposition of global constraint distribute

Gecode model: distribute.cpp.

Decomposition of global constraint inverse

Gecode model: inverse.cpp.

Averbach seating problem 1.2

Gecode model: averbach_1.2.cpp.

This is another seating problem from Averbach & Chein, problem 1.2.

Averbach seating problem 1.3

Gecode model: averbach_1.3.cpp.

A logical puzzle (men, wifes and ponies) from Averbach & Chein, problem 1.3.

Hamming distance

Gecode model: hamming_distance.cpp.

Hanging weights puzzle

Gecode model: hanging_weights.cpp.

Problem from Using LINQ to solve puzzles.

Magic hexagon puzzle (CSPLib problem 23)

Gecode model: magic_hexagon.cpp.

This is problem 23 from CSPLib. Also, see the model (and comments) in ESSENCE.

Bales of Hay problem

Gecode model: bales_of_hay.cpp.

Problem from The Math Less Traveled: The haybaler.

Warehouse location problem

Gecode model: warehouse.cpp.

From OPL model warehouse.mod. Note: There is a warehouses model in the Gecode distribution that solves the same problem, and that is also used as case study example in the excellent "Modeling and Programming with Gecode" (which I read some month ago).

However, my Gecode model was translated rather directly from the Comet model warehouse.co and without looking at the distributed model.

Marathon puzzle

Gecode model: marathon2.cpp.

The problem is from an example of Xpress, but I cannot find the page.
Dominique, Ignace, Naren, Olivier, Philippe, and Pascal have arrived as the first six at the Paris marathon. Reconstruct their arrival order from the following information: a) Olivier has not arrived last b) Dominique, Pascal and Ignace have arrived before Naren and Olivier c) Dominique who was third last year has improved this year. d) Philippe is among the first four. e) Ignace has arrived neither in second nor third position. f) Pascal has beaten Naren by three positions. g) Neither Ignace nor Dominique are on the fourth position.

Olympic puzzle

Gecode model: olympic.cpp.

This is a Prolog benchmark problem.

Seesaw problem

Gecode model: stuckey_seesaw.cpp.

From Marriott & Stuckey "Programming with Constraints", page 257: Balancing on a seesaw.

Fractions problem

Gecode model: olympic.cpp.

This is another Prolog benchmark problem.

Killer Sudoku

Gecode model: killer_sudoku.cpp.

Killer Sudoku is yet another grid puzzle. This model solve the example shown at the Wikipedia page.

Kakuro

Gecode model: kakuro.cpp.

Kakuro is yet another grid puzzle. This model solve the example shown at the Wikipedia page.

KenKen

Gecode model: kenken2.cpp.

KenKen is yet another grid puzzle. This model solve the example shown at the Wikipedia page.

A Round of Golf (Dell Logic Puzzles)

Gecode model: a_round_of_golf.cpp.

Abbot's Puzzle

Gecode model: abbots_puzzle.cpp.

Abbot's puzzle ("Amusements in Mathematics, Dudeney", number 110)

Added Corner

Gecode model: added_corner.cpp.

Arch Friends (Dell Logic Puzzles)

Gecode model: arch_friends.cpp.

Babysitting (Dell Logic Puzzles)

Gecode model: babysittning.cpp.

Breaking News (Dell Logic Puzzles)

Gecode model: breaking_news.cpp.

Four Islands (Dell Logic Puzzles)

Gecode model: four_islands.cpp.

Lecture Series (Dell Logic Puzzles)

Gecode model: lecture_series.cpp.

Tunapalooza (Dell Logic Puzzles)

Gecode model: tunapalooza.cpp.

Five brigands problem

Gecode model: five_brigands.cpp.

The Five Brigands problem from Dudeney: "Amusements in Mathematics" (number 133)

Circling Squares

Gecode model: circling_squares.cpp.

From Dudeney: "Amusements in Mathematics".

Mama's Age

Gecode model: circling_squares.cpp.

From Dudeney: "Amusements in Mathematics", number 40.

Mrs Timpkin's Age

Gecode model: timpkin.cpp.

From Dudeney: "Amusements in Mathematics", number 43.

Torn Number

Gecode model: torn_number.cpp.

From Dudeney: "Amusements in Mathematics", number 113.

Twin letters problem

Gecode model: twin_letters.cpp.

Monks and doors problem

Gecode model: monks_and_doors.cpp.

Contracting Costs

Gecode model: contracting_costs.cpp.

From "Mathematical Puzzles of Sam Loyd, Volume 2", number 20.

General Store

Gecode model: general_store.cpp.

From "Mathematical Puzzles of Sam Loyd, Volume 2", number 30.

Crypta (cryptarithmetic puzzle)

Gecode model: curious_set_of_integers.cpp.

Prolog benchmark problem.

All interval

Gecode model: all_interval.cpp.

Note: This is a fairly true port from the MiniZinc model all_interval.mzn, and it's not very fast. A faster implementation is in the Gecode distribution: all-interval.

Devil's Word

Gecode model: devils_word.cpp.

Translate each character in a word to ASCII value and then try to sum its values (either positive or negative) to a total. Compare with my CGI program Devil's Word

Dinner problem

Gecode model: dinner.cpp.

A very simple dinner problem.

Music Men

Gecode model: music_men.cpp.

Yet another logical puzzle.

Number of days problem

Gecode model: number_of_days.cpp.

Problem from Nathan Brixius' blog post Solving a Knapsack problem with Solver Foundation and LINQ.

Pigeon hole problem

Gecode model: pigeon_hole.cpp.

Pandigital numbers

Gecode model: pandigital_numbers.cpp.

Seven-11 problem

Gecode model: seven11.cpp.

Ski assignment problem

Gecode model: ski_assignment.cpp.

Smuggler's knapsack problem

Gecode model: smuggler_knapsack.cpp.

Smuggler's knapsack problem from Marriott & Stuckey "Programming with constraints", page 101f, 115f

Square root of WONDERFUL

Gecode model: square_root_of_wonderful.cpp.

From Martin Gardner.

Place number puzzle

Gecode model: place_number_puzzle.cpp.

Post office problem

Gecode model: post_office_problem2.cpp.

Problem from Wayne Winston's "Operations Research: Applications and Algorithms": minimizing the number of workers at a post office.

Pythagoras

Gecode model: pythagoras.cpp.

Generating solutions of A^2 + B^2 = C^2 for A,B,C 0..Int::Limits::max.

Rabbits problem

Gecode model: rabbits.cpp.

A very simple linear problem from Pascal Van Hentenryck: "The OPL Optimization Programming Language", page 9.

Remainders problem

Gecode model: remainders.cpp.

From Kordemsky.

Safe cracking problem

Gecode model: safe_cracking.cpp.

Miss Manners' seating problem

Gecode model: miss_manners.cpp.

Problem formulation from CP-standards:
The "Miss Manners" problem is a notorious benchmark for rule engines. The problem is to find an acceptable seating arrangement for guests at a dinner party. It should match people with the same hobbies (at least one), and to seat everyone next to a member of the opposite sex.
I'm somewhat reluctant to publish this model: Although the solution is shown after just a second, it takes "forever" to prove that it is the best. Well, maybe it just a question of correct labeling, but I have found none efficient.

Changes of my Gecode models for version 3.4.0

See Gecode version 3.4 released for a description of things that was changed in Gecode version 3.4.0.

One major change was replacement of tt, imp, eqv, ~ for logical implication, equivalence, and reification to a neater syntax, as well as post has been replaced with rel and expr.

For example, in alldifferent_except_0.cpp (decomposition of the global constraint all different except 0), this code segment
post(space,
     tt(imp(x[i] != c && x[j] != c, // =>
     x[i] != x[j])),
     icl
     );
is now written
rel(space,
   (x[i] != c && x[j] != c) >> (x[i] != x[j]),
   icl
);
This is a more direct syntax which I like very much. Since I'm a lazy person, I love this kind of syntactic sugar.

Another example is from who_killed_agatha2.cpp. This older code segment:
 post(*this, 
      tt(
        imp(hates_m2(i, agatha) == 1, 
            hates_m2(i, butler) == 1)), 
        opt.icl()
     ); 
is now written
 rel(*this, 
        (hates_m2(i, agatha) == 1) >> 
        (hates_m2(i, butler) == 1), 
        opt.icl()); 
Here are all the changed models, which are available at my Gecode page. Many of them required just a simple change of post() to rel(); then there is no special comment below.

A comment: I will very soon also blog about a couple (about 100) new Gecode models that also - as much as possible - use the new syntax.

Gecode version 3.4 released

Version 3.4 of Gecode has been released. It can be downloaded here.

This version has quite a few changes. My favorites are
  • the extended "syntactic sugar" for rel and expr
  • "Modeling and Programming with Gecode" which has been extended with many sections and chapters, for example quite a few case studies.
Very soon after this blog post, I will also publish two posts related to this new Gecode version: Here are the Changes:
Changes in Version 3.4.0 (2010-07-26)

This release includes: considerably improved support for posting expressions and relations (also including set and full arithmetic expressions); other improvements for modeling (array initialization and element addition to arrays); state-of-the-art unary and cumulative scheduling propagators (including optional and flexible tasks); major cleanups of the variable and view infrastructure (now also documented in MPG); cleanups of the examples; several other fixes and performance improvements.

This release is the first to be accompanied by a complete version of "Modeling and Programming in Gecode" which has been extended by many new case studies and parts on programming search engines and variables.

  • Kernel
    • Additions
      • Added LocalObject and LocalHandle classes that can be used for space-allocated objects that are shared within a space, for example among several propagators or propagators and branchers. (minor)
    • Other changes
      • The support for dynamically resizing variable arrays has been removed (it was buggy and inefficient). Instead, all argument arrays now support adding elements using operator<<. In addition, all arrays now support concatenation using operator+ and slicing, and variable arrays, view arrays, and variable argument arrays include a test whether all variables are assigned. (major)
    • Bug fixes
      • Posting a propagator in a failed space could make the space non-failed again. (minor)
  • Finite domain integers
    • Additions
      • You can now construct IntArgs from an STL vector, an IntSharedArray, or using simple comprehensions. (minor)
    • Other changes
      • The argument arrays now have constructors that create new variables. (minor)
      • IntArgs with simple sequences of values can now be created using the IntArgs::create static member function. (minor)
    • Performance improvements
      • Optimized element propagator, expect a speed up of around 35-50% in most cases. (minor)
  • Finite integer sets
    • Other changes
      • The argument arrays now have constructors that create new variables. (minor)
    • Bug fixes
      • Fixed the include and exclude tell operations of set variables so that they work with empty ranges. (minor)
  • Scheduling constraints
    • Additions
      • Added scheduling constraints for tasks with flexible duration (for both unary and cumulative resources), and made all scheduling propagators deal correctly with zero length tasks. (major)
      • Added propagators for cumulative scheduling. (major)
  • Minimal modeling support
    • Additions
      • The minimodel library now provides convenient post functions for set constraints. (major)
    • Other changes
      • The Matrix class now supports const operations and has an output operator. (minor)
      • Linear expressions can now contain non-linear parts, such as multiplications or divisions, or set expressions such as cardinality. (major)
      • The minimodel post functions have been split into two functions, rel and expr. While rel posts a constraint, expr returns a new variable constrained to the given expression. This change makes it possible to get rid of the reification operator (~) as well as the tt and ff functions, which were previously needed to distinguish between relations and expressions. Boolean equivalence and implication can now be expressed using operators (==,<<,>>). (major)
      • Array slices can now be empty. (minor)
    • Bug fixes
      • Fixed a bug in minimodel, which could crash when using zero coefficients. (minor, thanks to Håkan Kjellerstrand)
    • Performance improvements
      • Posting linear expressions performs more aggressive optimizations for assigned variables. (minor)
      • Arithmetic modeling functions now try to avoid creating new variables and posting propagators for common cases. (minor)
  • Script commandline driver
    • Other changes
      • The driver now catches SIGINT (i.e., pressing Ctrl-C) and stops the search properly, printing statistics up to the point where it stopped. (minor)
      • Running a script in time mode stops all iterations and samples immediately if a single run reaches a limit (eases benchmarks with timeouts). (minor)
  • Example scripts
    • Additions
      • New custom branching for the BACP example using a custom value selection. (minor)
    • Removals
      • Removed stress tests, the real examples are much more stressful, actually! (minor)
    • Other changes
      • The Nonogram example now uses AFC as the default branching and includes some more instances. (minor)
      • Take advantage of the better modeling support for the Balanced incomplete block design (BIBD), Golomb ruler, Kakuro, Black Hole, and Warehouse examples (nothing but dusting off examples that have been around for ages). (minor)
  • Gist
    • Other changes
      • If an inspector throws an exception, an error message is printed indicating which inspector caused the problem. Previously, Gist would crash with a Qt error that was difficult to trace. (minor)
    • Bug fixes
      • Fixed bug in Gist where signals were sent across threads, which makes Qt crash in certain situations on some platforms. (minor)
      • Fixed bug in interactive search where every move in the tree required recomputation. (minor, thanks to David Zaremby)
  • Gecode/FlatZinc
    • Bug fixes
      • Boolean relations were incorrect on assigned arguments. (minor)
      • Fixed garbage collection of variables that are not printed. The bug lead to variables being mixed up in the output. (minor)
  • General
    • Removals
      • Variables do not have init functions any longer as they are not needed, see MPG for discussion. (minor)
    • Other changes
      • Completely cleaned up variables and views, drastically saving code. (major)
      • The configure script now checks for qmake-qt4 and moc-qt4, which are used on some Linux systems to distinguish between Qt3 and Qt4. (minor)
      • The build system now supports Visual C++ 2010. (minor)

July 06, 2010

Lightweight Dynamic Symmetry Breaking (LDSB) for Gecode

Christopher Mears just announced on the Gecode mailing list that he has released (a somewhat experimental version of) Lightweight Dynamic Symmetry Breaking (LDSB) library for Gecode (version 3.3.1).

In the announce mail, Mears writes:
The main strength of LDSB is that it allows symmetry breaking to 
be added very simply to a model.  Here is an example:
// ...
IntVarArray xs;
ExampleModel() : xs(*this, 4, 0, 3)
{
    distinct(*this, xs);

    // Symmetry breaking starts here...
    Symmetries s(1);
    s[0] = values_interchange(*this, xs);
    // ... and ends here.

    branch(*this, xs, INT_VAR_NONE, INT_VAL_MIN, s);
}
// ...
There are also a lot of examples how to use LDSB in the package.

The reference for LDSB is C. Mears, M. Garcia de la Banda, B. Demoen, M. Wallace. Lightweight Dynamic Symmetry Breaking. Faculty of Information Technology, Monash University, No. 2010/254, 2010. However, I have not found the paper online.

LDSB was first written for ECLiPSe (and is included in recent releases); see the ECLiPSe documentation for more details about the ECLiPSe LDSB package.

April 08, 2010

Gecode version 3.3.1 released

Gecode version 3.3.1 has been released. From the Changelog:
This release adds many new features to Gist, fixes two major bugs in extensional constraints, and has some more cleanups to comply with the first release of the "Modeling and Programming with Gecode" document. And, as always some small fixes and cleanups.
  • Kernel
    • Other changes
      • The (unused and unusable) CopiedHandle have been removed. (minor)
    • Bug fixes
      • Fixed bug in VarArray::resize function that occurred when shrinking variable arrays. (minor)
  • Finite domain integers
    • Bug fixes
      • Fixed extensional constraint with finite automata for very unlikely (but apparantely possible) border case. (major)
      • The extensional constraints with tuple sets could cause crashes when used with parallel search. (major)
  • Finite integer sets
    • Removals
      • Removed Set::IntSetPropagator and Set::IntSetRePropagator because they are subsumed by the MixBinaryPropagator patterns. (minor)
    • Bug fixes
      • Fixed channeling between set and integer variables which did not propagate enough. (minor)
  • Scheduling constraints
    • Other changes
      • Tasks in unary scheduling constraints may now have processing times of 0. (minor)
    • Bug fixes
      • The unary scheduling propagator with optional tasks missed some propagation sometimes. (minor)
  • Script commandline driver
    • Additions
      • You can now register Gist inspectors in the driver options. (minor)
  • Example scripts
    • Additions
      • Added Gist inspectors for the Knights and Queens examples. (minor)
  • Gist
    • Additions
      • In addition to inspectors, you can now also register comparators, which can be used to compare two nodes in the tree. In combination with the option to compare before computing a fixpoint of the second node, this lets you see what exactly was modified by a branching. (major)
      • Gist can now stop exploration after all alternatives of a certain branching are exhausted. This feature can be turned on by posting a special branching using the Gist::stopBranch post function. Gist will then stop whenever that special branching becomes active. (major)
      • Added inspection of nodes before fixpoint computation. (minor)
      • Nodes can now be bookmarked. (minor)
      • Added inspectors that react whenever a node is selected. (minor)
    • Bug fixes
      • Missing export declarations prevented embedding Gist as a widget. There is now example code for embedding Gist in the directory gecode/gist/standalone-example. (minor)
      • Fixed a bug where sometimes clicking on a node would select a different node. (minor)
    • Performance improvements
      • Scrolling and zooming have been reimplemented. The new implementation is more efficient and works around problems that occurred with large trees on some platforms. Zooming is now more intuitive, keeping the current center centered. You can now also zoom by pressing shift while using the mouse wheel. (major)
  • Gecode/FlatZinc
    • Other changes
      • Comply with MiniZinc 1.1. String literals are not allowed any longer except in annotations, the solver outputs UNKNOWN and UNSATISFIABLE instead of just ==========, and the global constraints all_equal, decreasing_int, and decreasing_bool are supported. (minor)
    • Performance improvements
      • Variables that do not have output annotations are now garbage collected during copying. (minor)
      • When using sums of Boolean variables using bool2int in MiniZinc, the FlatZinc interpreter now posts the more efficient propagators that work directly on the Boolean variables. (minor)
  • General
    • Documentation fixes
      • Removed many small documentation quirks. (minor)
Don't miss the new documentation Modeling and Programming with Gecode. Compared to the earlier version ("Modeling with Gecode"), many chapters has been added on programming propagators and branchers. I recommend reading these chapters even if you are not going to write such a thing, since it is a very good description how Gecode works at lower levels. Also there are now some very good case studies.

March 16, 2010

Gecode version 3.3.0 released

Gecode version 3.3.0 has been released. From the Changelog:

Changes in Version 3.3.0 (2010-03-15)

This release provides some fixes, some performance improvements for domain propagators, and quite some clean ups how propagators and advisors report their status to the kernel. Many of these clean ups are essential to make it easier to program propagators and branchers with Gecode.

  • Kernel
    • Additions
      • Advisors now can force its propagator to be rescheduled, including recomputation of its cost used for scheduling (normally, a propagator is only rescheduled if its modification event delta changes). An advisor can signal forceful rescheduling by returning ES_NOFIX_FORCE or returning the return value of ES_NOFIX_FORCE_DISPOSE. Read the forthcoming "Modeling and Programming with Gecode" for details. (minor)
    • Other changes
      • The failure macros for posting GECODE_ES_FAIL and GECODE_ME_FAIL now only accept a single argument and assume that "home" actually refers to the home space. Read the forthcoming "Modeling and Programming with Gecode" for details. (minor)
      • The functions ES_FIX_PARTIAL, ES_NOFIX_PARTIAL, ES_FIX_DISPOSE, and ES_NOFIX_DISPOSE are now member of Space. Read the forthcoming "Modeling and Programming with Gecode" for details. (minor)
      • The function ES_SUBSUMED now is a member of Space and accepts a propagator as its single argument. The variant with a size as the second argument is available as ES_SUBSUMED_DISPOSED but use is highly discouraged. Read the forthcoming "Modeling and Programming with Gecode" for details. (minor)
      • The functions ES_SUBSUMED_FIX and ES_SUBSUMED_NOFIX for advisors have been renamed to ES_FIX_DISPOSE and ES_NOFIX_DISPOSE. Read the forthcoming "Modeling and Programming with Gecode" for details. (minor)
    • Bug fixes
      • ViewValBrancher with random value selection did not produce a random sequence of values. (minor)
  • Finite domain integers
    • Additions
      • Integer sets (IntSet) now have a in member function for testing whether an integer is included. (minor)
    • Other changes
      • Patterns for reified propagators have been moved to the Gecode::Int namespace. (minor)
    • Performance improvements
      • Considerably improved performance and memory consumption of the DFA-based extensional constraint (regular). (major)
  • Finite integer sets
    • Other changes
      • Patterns for set propagators have been moved to the Gecode::Set namespace. (minor)
  • Minimal modeling support
    • Additions
      • Linear expressions can freely mix integer and Boolean variables and support construction from variable arrays via a sum function. (minor)
    • Removals
      • Removed special cases for posting linear and Boolean expressions consisting of a single variable only (was highly ambigious). (minor)
  • Support algorithms and datastructures
    • Performance improvements
      • Changed to single, really efficient bitset implementation used allover the system. (major)
  • Gist
    • Bug fixes
      • Avoid inter-thread call to QWidget::update, which apparently causes a slight memory leak (and warning messages on stderr) on Mac OS. (minor)
  • Gecode/FlatZinc
    • Other changes
      • The FlatZinc interpreter can now be extended by plugins that implement custom search strategies. The plugins are implemented as dynamically loaded libraries using the Qt plugin mechanism. An example can be found in the directory gecode/flatzinc/exampleplugin. (minor)
      • The index of the variable used for optimization is now available in the FlatZincSpace class. (minor)
      • Added command line option -print, which controls whether all solutions are printed or only the last one that is found, and -search, to choose between branch-and-bound and restart optimization. (minor)
      • The FlatZinc library can now parse a FlatZinc model into any subclass of FlatZincSpace, so that custom extensions can be built. Annotations on the solve item can now be accessed from the returned FlatZincSpace, so that additional search strategies can be implemented. (minor)
    • Bug fixes
      • The FlatZinc interpreter ignored the -c-d and -a-d command line switches when used with Gist. (minor)
I'm really looking forward to the forthcoming document Modeling and Programming with Gecode mentioned above. Also, I'm doing some experiments with the new search annotations using AFC (Accumulated Failure Count, a.k.a. weighted degree of a variable) for Gecode/fz (the FlatZinc solver):
  • afc_min
  • size_afc_min
  • afc_max
  • size_afc_max

November 05, 2009

Gecode version 3.2.1 released

Version 3.2.1 of Gecode is released.

From the Change log:
Changes in Version 3.2.1 (2009-11-04) This release fixes one serious bug in the element constraint for matrices; adds branchings using accumulated failure counts (also known as weighted degree); provides some optimizations (mostly for element constraints and for regular expressions with millions of nodes); adds two cute models (word-square and crossword); and a little this and that as always.

  • Kernel
    • Additions
      • Propagators and variables now maintain an accumulated failure count (AFC). (major). Details: The AFC of a propagator counts how often has the propagator failed during the entire search, and the AFC of a variable is its degree plus the sum of the AFCs of all propagators depending on the variable. While it looks straightforward, this required a major extension of the Gecode toplevel namespace. kernel to deal with global information accessed concurrently from several threads during search. The AFC is also known as "weighted degree".
  • Finite domain integers
    • Additions
      • Added INT_VAL_RANGES_MIN (and INT_VAL_RANGES_MAX) as value selection for branching: it tries the values from the smallest (largest) range first, if the variable's domain has several ranges, otherwise it splits the domain. (minor)
      • Added AFC-based branching strategies for integer and Boolean variables: INT_VAR_AFC_MIN, INT_VAR_AFC_MAX, INT_VAR_SIZE_AFC_MIN, INT_VAR_SIZE_AFC_MAX. For details, see "Modeling with Gecode". (major)
    • Other changes
      • The semantics of division and modulo has changed to match the C standard. This means that the operations round towards zero, and that the result of the modulo has the same sign as its first argument. (minor)
    • Bug fixes
      • Fixed segfault in matrix element constraint. (major)
      • Added missing dispose function for linear disequality of Boolean variables (the only problem was that with a proper dispose function more memory can be reused when the propagator becomes subsumed, so really a tiny quirk). (minor)
    • Performance improvements
      • Element constraints for integer arrays now accept shared integer arrays (IntSharedArray). By this, the very same array can be shared among several element constraints. Models require no change, as IntArgs are automatically coerced to IntSharedArrays. See "Modelling with Gecode" for more explanation. (minor)
      • Optimized element for matrices (in special cases, the propagator is up to six times as efficient as before). (minor)
  • Finite integer sets
    • Additions
      • Added AFC-based branching strategies for set variables: SET_VAR_AFC_MIN, SET_VAR_AFC_MAX, SET_VAR_SIZE_AFC_MIN, SET_VAR_SIZE_AFC_MAX. For details, see "Modeling with Gecode". (major)
    • Other changes
      • Split rel-op.cpp and rel-op-const.cpp into several compilation units, to avoid excessive memory and time usage of the gcc compiler. (minor)
  • Minimal modeling support
    • Performance improvements
      • Conversion of a regular expression to a DFA would crash on regular expressions with several million nodes (due to running out of call stack space). (minor, thanks to Håkan Kjellerstrand)
  • Example scripts
    • Additions
      • Added word square puzzle. (minor , contributed by Håkan Kjellerstrand)
      • Added crossword puzzle (thanks to Peter Van Beek for providing access to some crossword grids). (minor, thanks to Peter Van Beek)
    • Other changes
      • Sudoku and GraphColor now uses smallest size over accumulated failure count (AFC) as the default heuristic. (minor)
    • Bug fixes
      • The Nonogram example no longer crashes on empty lines. (minor, thanks to Jan Wolter)
  • Gecode/FlatZinc
    • Bug fixes
      • Fixed statistics output (number of solutions was sometimes wrong). (minor)
  • General
    • Other changes
      • Now posting of propagators and branchers take an object of class Home (rather than just a space) that can carry additional information relevant for posting (for example, groups and accumulated failure information). Models do not need to be changed in any way! (major) :Details: While models require no change, propagator rewriting does (it will work but some information might be lost). Instead of writing something like GECODE_REWRITE(*this,SomeProp::post(home,...)); where *this is the propagator to be rewritten, you should now write GECODE_REWRITE(*this,SomeProp::post(home(*this),...)); By this, the new propagator will inherit information from *this (in particular the accumulated failure count).

Some notes about the new models

I can now proudly announce that my model word_square.cpp has been included (as the model word-square.cpp), or rather a much improved version of it. Christian Schulte and Mikael Lagerkvist has done a great job of making it very effective with two different approaches. For example, it presents all 17 word squares of order 7 (i.e. a 7x7 square) given the included word list in about a minute on my machine. Some of the fixed bugs above was found when we tested this model.

The other new model, crossword.cpp, solves crossword puzzles given a problem template and a word list. Included are 72 problems taken from different sources, basically a standard benchmark repository. This is also an impressive demonstration of an effective solution of a standard (and fun) problem using a general system.

October 31, 2009

Some new models, etc

The readers that follows me on Twitter (hakankj) have probably already has seen this, but here follows a list of models, etc, not yet blogged about.

MiniZinc: Different models

The following four models are translation of my Comet models: And here are some new models:

Nonogram related

A large 40x50 Nonogram problem instance in MiniZinc: nonogram_stack_overflow.dzn to be used with nonogram_create_automaton2.mzn model. The problem was mentioned in Stack Overflow thread Solving Nonograms (Picross). It is solved in under 1 second and 0 failures.

Today my Nonogram model nonogram_create_automaton2.mzn was included in the great Survey of Paint-by-Number Puzzle Solvers (created by Jan Wolter).

My MiniZinc model is described and analyzed here. I'm not at all surprised that it's slower compared to the other solvers; it was quite expected.

Some comments:
Assessment: Slow on more complex puzzles.

...

Results: Run times are not really all that impressive, especially since it is only looking for one solution, not for two like most of the other programs reviewed here. I don't know what the differences are between this and Lagerkvist's system, but this seems much slower in all cases, even though both are ultimately being run in the Gecode library.
Update 2009-11-01
I later realized two things:

1) That the mzn2fzn translator did not use the -G gecode flag, which means that Gecode/FlatZinc uses a decomposition of the regular constraint instead of Gecode's built in, which is really the heart in this model. The model is basically two things: build an automata for a pattern and then run regular on it.
2) When Jan compiled Gecode, he set off all optmization for comparison reason. This is quite unfortunate since Gecode is crafted with knowledge of the specific optimizations.

I there have run all problems by myself and see how well it would done (at least the ballpart figure) when using an "normal" optimized Gecode and with the -G gecode flag for mzn2fzn.

Explanation of the values:
Problem: Name of the problem instance
Runtime: The value of runtime from Gecode/FlatZinc solver
Solvetime: The value of solvetime from Gecode/FlatZinc solver
Failures: Number of failures:
Total time: The Unix time for running the complete problem, including the time of mzn2fzn (which was not included in the benchmark).
A "--" means that a solution was not reached in 10 minutes.
Problem Runtime Solvetime Failures Total time
Dancer 0.002 0.000 0 1.327s
Cat 0.009 0.002 0 0.965s
Skid 0.012 0.006 13 0.660s
Bucks 0.015 0.004 3 0.866s
Edge 0.008 0.005 25 0.447s
Smoke 0.011 0.004 5 0.963s
Knot 0.026 0.006 0 1.450s
Swing 0.059 0.012 0 1.028s
Mum 0.120 0.093 20 1.811s
Tragic 6:24.273 6:23.607 394841 6:28,10
Merka -- -- -- --
Petro 2.571 2.545 1738 4.071s
M&M 0.591 0.510 89 1.961s
Signed 1.074 1.004 929 2.461s
Hot -- -- -- --
Flag -- [lazy solver: 2 solutions in 10 seconds] -- -- --
Lion -- -- -- --

It's interesting to note that the Lazy solver finds some solutions quite fast for the "Flag" problem. However, there where no other big differences compared to Gecode/FlatZinc. I also tested the problems with JaCoP's FlatZinc solver which solved the problems in about the same time as Gecode/FlatZinc with no dramatic differences.

As mentioned above, the exact values is not really comparable to the benchmark values. But it should give an indication of the result when using -G gecode and a "normal" optimized Gecode.

Unfortunately, I cannot link to the specific models due to copyright issues, but they can all be downloaded from the page Web Paint-by-Number Puzzle Export.

(End update)

Update 2009-11-02
Jan Wolter now have rerun the tests of my solver with the -G gecode option, and the time is much more like mine in the table above. The analysis is quite different with an assessment of Pretty decent, and the following under Result:
...

When comparing this to other solvers, it's important to note that nonogram_create_automaton2.mzn contains only about 100 lines of code. From Kjellerstrand's blog, it is obvious that these 100 lines are the result of a lot of thought and experimentation, and a lot of experience with MiniZinc, but the Olšák solver is something like 2,500 lines and pbnsolve approaches 7,000. If you tried turning this into a fully useful tool rather than a technology demo, with input file parsers and such, it would get a lot bigger, but clearly the CSP approach has big advantages.

(End update 2)

Also, the Gecode nonogram solver is included in the survey: called Lagerkvist. I'm not sure when it was added to the survey. It use the latest version of Gecode, so it must have been quite recent.

Some comments:
Assessment: Pretty decent.

...

Puzzles with blank lines seem to cause the program to crash with a segmentation fault.

Otherwise it performs quite well. There seems to be about 0.02 seconds of overhead in all runs, so even simple puzzles take at least that long to run. Aside from that, it generally outperforms the Wilk solver. It's not quite in the first rank, especially considering that it was only finding one solution, not checking for uniqueness, but it's still pretty darn good.

...
I mailed Jan today about the -solutions n options in the Gecode related solvers (he tested my MiniZinc model with Gecode/FlatZinc), as well as some other comments about my model. Also, I will download the problems tested and play with them.

Tailor/Essence': Zebra puzzle

Suddenly I realized that there where no Zebra problem, neither at Andrea's or here. So here it is: zebra.eprime: Zebra puzzle.

Gecode: Words square problem

See Gecode: Modeling with Element for matrices -- revisited for an earlier discussion of this problem.

Thanks to Christian Schulte my Word square model is now much faster: word_square2.cpp.

From the comment in the model:
Christian Schulte suggested 
using branch strategy INT_VAL_SPLIT_MIN 
instead of INT_VAL_MAX .
This made an improvement for size 7 from
322 failures to 42 and from 1:16 minutes
to 10 seconds (for 1 solution).
 
Now it manage to solve a size 8 in a reasonable 
time (1:33 minutes and 1018 failures):
  abastral
  bichrome
  achiotes
  shippers
  troponin
  rotenone
  amerinds
  lessness
 
But wait, there's more!

In the SVN trunk version of Gecode, there is now a version of this model: examples/word-square.cpp where Christian Schulte and Mikael Zayenz Lagerkvist has done a great job of improving it (using a slighly smaller word list, though). It solves a size 8 problem in about 14 seconds. There is also two different approaches with different strengths, etc. I have great hopes that it will improved even further...

October 14, 2009

Gecode News Archive and RSS feed

Short notice: Gecode now has a News Archive. There is also a RSS feed.

The latest news snippet is the following:


2009-10-14 Layout

The Gecode web site layout has been improved. We now provide a news section on the front page, a dedicated news page, and an RSS feed for all news updates.


October 13, 2009

Gecode: Modeling with Element for matrices -- revisited

In Learning constraint programming - part II: Modeling with the Element constraint I complained about the lack of support for Element for matrices in Gecode. It was also my main complaint of Gecode in SweConsNet 2009 talk.

I'm happy to say that Gecode version 3.2 now contains support for this. Thanks to the Gecode team, and especially Christian! See the API for details.

The models mentioned in the former blog post has now been updated to use the new version of Element. They have the same name as the original model, just with a "2" added. The old code is kept (but commented) in the new models for comparison.

Example: Word square

As an example, here is the old variant for handling element in a matrix of words:
void element_offset(Space& space, 
                    IntArgs words,
                    IntVar e, 
                    IntVar word_len, 
                    IntVar c,
                    IntVar res,
                    IntConLevel icl = ICL_BND) {

  element(space, words, 
                plus(space, 
                     mult(space, 
                          e, 
                          word_len, icl), 
                     c, icl), 
                res, icl);

  
} // element_offset

      // ... and was used as such: 
       element_offset(*this, words, E[i], word_len_v, C[j], tmp, opt.ic());

The new version of Element makes this much easier:
        // ...
        element(*this, words_m, C[j], E[i], tmp, opt.icl());

Performance

For the Word square model the improvement in performance is significant. Now it is very easy to generate a word square of larger orders: The new model gives a solution of order 5 in less than 1 second, whereas the old model took much longer, over a couple of minutes. Order 6 with the new model takes about 2 seconds (I haven't waited for the old model to give a solution).

Here is an example of an order 7 word square; took about a minute with the new model.
laissez
almonry
impling
solideo
snidest
ernesse
zygotes
It is probably possible to tweak this further.

Warning

Last, please note the following warning from Modeling with Gecode about using this feature:
Tip 6.2 (Element for matrix can compromise propagation). Whenever it is possible one should use an array rather than a matrix for posting element constraints, as an element constraint for a matrix will provide rather weak propagation for the row and column variables.

...

October 07, 2009

Gecode version 3.2 released

Gecode version 3.2 is released. It can be downloaded here.

From the Changelog:
Changes in Version 3.2.0 (2009-10-05)

This release has some important bug fixes (in particular for global cardinality aka count), the documentation has been improved (worked around some issues with generation by doxygen), integrates the FlatZinc interpreter into the Gecode source tree, provides propagators for disjunctive scheduling (experimental), and lots of small changes and fixes. For more consistent names, branchings are branchers now and branching descriptions are choices (this you might have to adapt to).

  • Kernel
    • Other changes
      • A branching is now a brancher and a branching description is now a choice. (major).
        Details: Classes and member functions have been renamed accordingly. The change is necessary due to proper explanation in the forthcoming "Programming with Gecode".
  • Search engines
    • Other changes
      • Optimized thread creation by thread pools, now the creation and deletion of arbitrarily many parallel search engines also works for platforms using pthreads (Linux and MacOS). (minor)
      • Path for search provides top and empty methods. (minor, thanks to Vincent Barichard)
    • Bug fixes
      • The memory reported could be sometimes too low (that could only happen when an advisor allocates memory which they do only now). (minor)
      • Compiles again if no threads available. (minor)
  • Finite domain integers
    • Additions
      • Added regret_min and regret_max for IntVar and BoolVar (they were only available for IntView). (minor, thanks to Kish Shen)
      • Added branch and assign for single integer and Boolean variable. (minor)
      • Added element constraints for Matrix arrays. (minor, thanks to Håkan Kjellerstrand)
    • Other changes
      • The element constraint now computes more accurate variable bounds when being posted (to avoid arithmetic overflow in naive models). (minor)
      • The element constraint now throws an exception if used with an empty array. (minor)
      • Moved cumulatives to the new scheduling library. (minor)
      • Moved circuit to the new graph library. (minor)
    • Bug fixes
      • Slightly improved strength of the division propagator. (minor, thanks to Jan Kelbel)
      • Fixed serious bug in the bounds propagator for global cardinality. (major)
    • Performance improvements
      • Extensional propagators using DFAs or REGs (aka regular) use a more compact state representation but create their state more eagerly. That can improve performance considerably (twice as fast) at a slight increase in memory. (minor, thanks to George Katsirelos, Nina Narodytska)
      • Optimized n-ary disjunction and conjunction and the clause constraint. (minor)
      • Linear constraints over Boolean variables with unit coefficients (aka Boolean cardinality constraints) have been reimplemented. Less memory (minus 30%) and more speed. For example, BIBD runs 10% faster now. (minor)
  • Finite integer sets
    • Additions
      • Added branch and assign for single set variable. (minor)
      • Added element constraints for Matrix arrays. (minor, thanks to Håkan Kjellerstrand)
    • Other changes
      • The element constraint with an integer index variable now throws an exception if used with an empty array. (minor)
  • Scheduling constraints
    • Additions
      • Added propagators for disjunctive scheduling (unary resource scheduling). This is still experimental as the propagators are not yet optimized and branching and modelling support is not yet available. (major)
      • Added a new module for scheduling. To use scheduling constraints, you have to include <gecode/scheduling.hh> and link against the scheduling library. (minor)
  • Graph constraints
    • Additions
      • Cost-based variants for circuit added. (minor)
      • Added a new module for graph constraints. To use graph constraints, you have to include <gecode/graph.hh> and link against the graph library. (minor)
  • Minimal modeling support
    • Additions
      • Added element constraints for Matrix interface to arrays. (minor, thanks to Håkan Kjellerstrand)
  • Script commandline driver
    • Additions
      • Added new standard option for options called symmetry. (minor)
    • Other changes
      • The driver takes copies of all string values passed top it. (minor, thanks to Luca Di Gaspero)
  • Support algorithms and datastructures
    • Other changes
      • No longer depend on availability of timersub. (minor, thanks to Alexandre Fayolle)
  • Example scripts
    • Additions
      • Added branching following Warnsdorff's heuristic for Knights. (minor)
    • Other changes
    • Bug fixes
      • Fixed wrong symmetry breaking for TSP. (minor, thanks to Geoffrey Chu)
      • Fixed zero cost edges in TSP examples. (minor)
  • Gist
    • Other changes
      • The Gist console now has a toolbar that provides buttons to clear the text as well as to configure the console window to stay on top of Gist. Furthermore, after adding output, the console now automatically scrolls to the bottom. (minor)
    • Bug fixes
      • Gist now places clones also on the leftmost branch during search. (minor)
  • Gecode/FlatZinc
    • Additions
      • The Gecode interpreter for the FlatZinc language is now part of the main Gecode source tree. (major)
  • General
    • Other changes
      • Compiles with MSVC 2005 again. (minor, thanks to David Rijsman)
    • Bug fixes
      • The configure script checked for Qt 4.2, although Gist requires at least Qt 4.3. (minor)
    • Documentation fixes
      • Cleaned up the generated reference documentation, and introduced a number of workarounds for issues in doxygen. In particular, the documentation for linear constraints over Boolean variables and for the thread abstractions is now generated properly. (major)
      • Mention that also grep is needed for building Gecode. (minor, thanks to Vivian De Smedt)
Also, don't forget to read the new version of Modeling with Gecode.

Note: I have yet to change my models at My Gecode page to version 3.2.

Update some hours later: The following three models was the only that has to be changed, all with the simple addition of #include <gecode/scheduling.hh>:

September 23, 2009

MiniZinc Challenge 2009 Results

The result of MiniZinc Challenge 2009 is presented in MiniZinc Challenge 2009 Results:
There were two entrants this year:

* Gecode
* SICStus

In addition, the challenge organisers entered the following three FlatZinc implementations:

* ECLiPSe
* G12/FD
* G12/LazyFD

As per the challenge rules, these entries are not eligible for prizes, but do modify the scoring results.

Summary of Results

fd_search

sicstus 1651.8
eclipse_ic 322.1
gecode 4008.8
g12_fd 2040.6
g12_lazyfd 1376.6

free_search

sicstus 1841.0
gecode 4535.5
g12_fd 1112.4
g12_lazyfd 2511.1


Congratulations to the Gecode team!

June 03, 2009

Scheduling with the cumulatives constraint in Gecode

One of the few problems of my learning problems that has not been modeled in Gecode before was the Furniture moving problem (from Marriott & Stuckey "Programming with Constraints", page 112f). The reason has been that Gecode don't have the common cumulative constraint, but instead the generalized and extended cumulatives constraint (note the last "s"). This cumulatives constraint was defined and explained in the paper A new multi-resource cumulatives constraint with negative heights (ps.Z) by Nicolas Beldiceanu and Mats Carlsson.

Furniture moving

The last couple of days I have read that paper and experimented with the cumulatives constraint in modeling the Furniture moving problem. The Gecode program is here: furniture_moving.cpp.

Some notes
The cumulatives paper shows 8 examples/variants of usuage (A..H, which are explained below) and the Furniture moving problem is modeled after the G variant, a covering model of producers/consumers. The main approach is that we have the four tasks (consumers, the furnitures to move) and three movers (producers). The height of the consumers is coded as negative, and the movers as height 1 (positive), i.e.
int _durations[] = {   30,    10,  15,   15                };
int _height[] =    {   -2,    -1,  -2,   -2,      1,  1,  1};
int _limit[] =     {0}; // just one machine
Just for fun, I also let the durations of the movers "free" to see how long they have to work according to this model. The result was that the movers 2 and 3 have to work full time (60 minutes), but the first mover has to work just 10 minutes.

8 Cumulatives examples

In the paper cited above, Beldiceanu and Carlsson showed 8 different variants (A to H) of how to model with the cumulatives constraint. I have tried to implement all these examples in Gecode as faithful as possible.

The Gecode code is cumulatives_examples.cpp and is fairly general with a short explanations (cited from the paper) for each variants. For selecting a specific problem instance it takes a command line parameter where 0 is example A to parameter 7 for example H.

Below are some comments for each example, and as said before there are more comments in the Gecode code. Please note that Gecode is 0-based so we start numbering tasks and machines from 0 and onward.

Example A

This is the "plain old" cumulative with one machine.

Example B

cumulative with two machines.

Example C

Note that the atmost parameter of cumulatives is set to false since we have an at least constraint. Also a dummy task (0) are used here.

Example D

Same as the above, except for the "dummy" task. Also, I have added an another cumulatives constraint with atmost=true so all tasks are not "stacked" above each other.

Example E

Producer/consumer model. This one was quite tricky to understand first, but after reading the description some times I realized it's beauty: all producers start at time 0 and and produce whatever they produce, and all consumers ends at the "last date" (here time 6) and consumes whatever they consumes.

Example F

Generalization of example E with two machines.

Example G

A "covering problem", also elegant stated with cumulatives. Here there are two tasks (consumers, task 0 and 1) which have negative height, and 4 persons (producers) with the positive height 1. The atmost parameter must be false.

It is this variant that the above mentioned Furniture moving problem was modeled after.

Example H

A generalization of Example G with two machines. Here I have hard coded which machine a task/person should belong since otherwise only task 6 would be at machine 1. Also, I added an atmost limit for each machine for esthetic reasons.

Another scheduling example: Building a house

This example is a standard example from OPL: how to build a house with some dependencies of different tasks, e.g. that masonry must be finished before carpentry, painting must be finished before moving in etc.

This problem was done in Gecode with cumulatives (as variant A, see above) and was very straightforward to implements, both the cumulatives and the dependencies graph. There code is here: building_a_house.cpp.

See also: Some scheduling models in Comet

Comet has a great support for scheduling with a special Schedule framework, much like OPL's. Here are my Comet models that use this Schedule framework. The two first models are standard OPL examples. Also, the Comet model furniture_moving_cumulative.co is a variant of the Furniture moving problem, using a cumulative constraint implemented as a decomposition.

May 21, 2009

Gecode version 3.1 released

Gecode version 3.1 is released. The changes are:
Changes in Version 3.1.0 (2009-05-20)
This release introduces parallel search, features improved memory management (can double efficiency on MacOS X), and provides a reusable command line driver upon popular request. And, of course, some this and that.
  • Kernel
    • Performance improvements
      • Cache memory blocks from deleted spaces. This hardly increases peak memory consumption. It improves performance on Windows and Linux only by up to 5%, but on MacOS by 50% in some cases (this improvement is absolutely essential for parallel execution). (major)
  • Search engines
    • Additions
      • Added parallel search engines for DFS, BAB, and Restart (but not LDS). Please make sure to read the section "Parallel search" in "Modeling with Gecode". (major) Details
    • Other changes
      • The stop function of stop objects now also takes a second argument of type Search::Options. This is in particular useful for decisions that involve the number of threads used for search. (minor)
  • Finite domain integers
    • Additions
      • Added a wait propagator: executes a function when a variable (or variables) become(s) assigned. (minor)
    • Other changes
      • The INT_VAL_MED value selection now consistently selects the greatest element in the domain not greater than the median. (minor)
  • Finite integer sets
    • Additions
      • Added a wait propagator: executes a function when a variable (or variables) become(s) assigned. (minor)
    • Other changes
      • The set constraint sequentialUnion has been renamed to sequence. (minor)
  • Script commandline driver
    • Additions
      • Added a new module "driver" as a commandline driver for scripts. This is due to popular request: most people have been using the support functionality for examples anyway. This function is now wrapped into a proper module (and Example is now called Script to be more general). See "Modeling with Gecode" for documentation. (major)
    • Other changes
      • If Gist is not available, -mode gist is the same as -mode solution. Invocation with -help also prints information about how Gecode has been configured. (minor)
  • Support algorithms and datastructures
    • Additions
      • Added a tiny portable thread package specifically tailored to the needs for parallel search. Unfortunately, other portable thread packages have just too many issues. (minor)
      • Support::quicksort and Support::insertion support using the less than operator for the sort order by leaving out the comparator object. (minor)
  • Example scripts
    • Additions
      • Added new example for equidistant frequency permutation arrays (EFPA). (minor)
    • Other changes
      • The parameters of the Hamming Codes example are now configurable through command line options (instead of hard-coded). (minor)
  • Gist
    • Other changes
      • Small user interface changes: disable search from hidden nodes, add depth information to status bar, and add statistics for subtrees (available from the node context menu and the Node menu). (minor)
      • Easily add multiple inspectors to Gist. Inspectors are not exclusive any longer, you can select any combination of them to respond to clicks or solutions simultaneously. (minor)
    • Bug fixes
      • Fixed a dead lock that could occur when closing the Gist main window while search is still running. (minor)
      • The inspectors are finalized before Gist exits. This fixes a bug where (at least on Mac OS) some memory was not freed in the correct order. (minor)
      • Gist now correctly centers the current node after search has finished. (minor)
Also, version 1.6 of Gecode/FlatZinc is released to support for MiniZinc version 1.0. See MiniZinc version 1.0 released! for more about MiniZinc version 1.0.

Update: Some comments and findings
I have now converted all my Gecode models at My Gecode page to version 3.1.The only thing I has to care about was that the Example class is incorporated properly in the distribution and is now called Script. Also see the updated introduction document Modeling with Gecode (PDF) for more information.

Here are the changes I did, examplified with the model debruijn.cpp.

Remove
   #include "examples/support.hh"
and add the following (#include <minimodel.hh> is shown here for completion).
   #include <gecode/driver.hh>
   #include <gecode/int.hh>
   #include <gecode/minimodel.hh>

   using namespace Gecode;
The base class Example should also be replaced with Script. This means that the following old code:
   class DeBruijn : public Example {
was replaced with:
   class DeBruijn : public Script {
as where all the other occurrences of Example

Accordingly the optimization variants should also be replaced:
   MinimizeExample -> MinimizeScript
   MaximizeExample -> MaximizeScript
The Options has been incorporated in the Driver class so I had to change
   UnsignedIntOption _base_option;
   StringOption _int_var_option; 
to
   Driver::UnsignedIntOption _base_option;
   Driver::StringOption _int_var_option; 
And now I will check out the other new features...

May 09, 2009

Learning Constraint Programming IV: Logical constraints: Who killed Agatha? revisited

Here is a comparison of how different constraint programming systems implements the logical constraints in the problem Who Killed Agatha?, also known as The Dreadsbury Mansion Murder Mystery.. In Learning constraint programming - part II: Modeling with the Element constraint the problem was presented and showed how the systems implements the Element constraints.

Problem formulation from The h1 Tool Suite
Someone in Dreadsbury Mansion killed Aunt Agatha. Agatha, the butler, and Charles live in Dreadsbury Mansion, and are the only ones to live there. A killer always hates, and is no richer than his victim. Charles hates noone that Agatha hates. Agatha hates everybody except the butler. The butler hates everyone not richer than Aunt Agatha. The butler hates everyone whom Agatha hates. Noone hates everyone. Who killed Agatha?
Originally from F. J. Pelletier: Seventy-five problems for testing automatic theorem provers., Journal of Automated Reasoning, 2: 191-216, 1986.

Here we see compare how different constraint programming systems implement the three emphasized conditions in the problem formulation above:
  • the concept of richer
  • Charles hates noone that Agatha hates
  • No one hates everyone
All models defines the concepts of hates and richer as two matrices. The matrix declarations are omitted in the code snippets below.

Models

Here are the different models for the Who killed Agatha? problem. JaCoP and Choco has two version for how to implement the Element constraint, see the link above. Also, there is no Gecode/R version of this problem.

Defining the concept of richer

First we define the concept of richer, which consists of two parts:
  • No one is richer than him-/herself
  • if i is richer than j then j is not richer than i
This is an antisymmetric relation which is explored somewhat more (with an alternative predicate of the concept) in the MiniZinc model antisymmetric.mzn.

The logical concept used here is equivalence (if and only of).

MiniZinc

% define the concept of richer: no one is richer than him-/herself
forall(i in r) (
   richer[i,i] = 0
)

/\ % if i is richer than j then j is not richer than 
forall(i, j in r where i != j) (
   richer[i,j] = 1 <-> richer[j,i] = 0
)

Comet

Note that Comet don't have support for equivalence directly, instead we have to use two implications. Update: equivalence in Comet is written as == (I tested <=> which didn't work). Thanks to Pascal Van Hentenryck for pointing this out.
// no one is richer than him-/herself
forall(i in r)
  m.post(richer[i,i] == 0);

// if i is richer than j then j is not richer than i
forall(i in r, j in r : i != j) {
  /* earlier version: two implications
  m.post(richer[i,j] == 1 => richer[j,i] == 0);
  m.post(richer[j,i] == 0 => richer[i,j] == 1);
  */
  // equivalence
  m.post((richer[i,j] == 1) == (richer[j,i] == 0));
}

JaCoP

//  no one is richer than him-/herself
for(int i = 0; i < n; i++) {
    store.impose(new XeqC(richer[i][i], 0));
}

//  if i is richer than j then j is not richer than i
for(int i = 0; i < n; i++) {
    for(int j = 0; j < n; j++) {
        if (i != j) {
             store.impose(
                          new Eq(
                                 new XeqC(richer[i][j], 1),
                                 new XeqC(richer[j][i], 0)
                                 )
                          );

        }
    }
}

Choco

//   a) no one is richer than him-/herself
for(int i = 0; i < n; i++) {
    m.addConstraint(eq(richer[i][i], 0));
}

//   b) if i is richer than j then j is not richer than i
for(int i = 0; i < n; i++) {
    for(int j = 0; j < n; j++) {
        if (i != j) {
             m.addConstraint(
                             ifOnlyIf(
                                    eq(richer[i][j], 1),
                                    eq(richer[j][i], 0)
                                      )
                             );
        }
    }
}

Gecode

// no one is richer than him-/herself
for(int i = 0; i < n; i++) {
  rel(*this, richer_m(i,i), IRT_EQ, 0, opt.icl());
}

// if i is richer than j then j is not richer than i
for(int i = 0; i < n; i++) {
  for(int j = 0; j < n; j++) {
    if (i != j) {
      post(*this, tt(
              eqv(
                 richer_m(j,i) == 1, // <=>
                 richer_m(i,j) == 0)), 
                 opt.icl());
    }
  }
}

Charles hates noone that Agatha hates

Here is the definitions of the condition Charles hates noone that Agatha hates, which simply mean the implication:
  if Agatha hates X then Charles don't hate X

MiniZinc

When starting to modeling these kind of problems, I tend to follow the order of the conditions, which here means that the Charles part is before the Agatha part. When remodeling in another system the order tends to be fixed (cf the Comet version).
forall(i in r) (
   hates[charles, i] = 0 <- hates[agatha, i] = 1
)

Comet

forall(i in r)
  m.post(hates[agatha, i] == 1 => hates[charles, i] == 0);

JaCoP

for(int i = 0; i < n; i++) {
    store.impose(
                 new IfThen(
                            new XeqC(hates[agatha][i], 1),
                            new XeqC(hates[charles][i], 0)
                            )
                 );
}

Choco

I tend to copy/paste the models for Choco and JaCoP and just change the functions that are different. A consequence of this is that some special feature in one of these two systems is not used.
for(int i = 0; i < n; i++) {
    m.addConstraint(
                    implies(
                            eq(hates[agatha][i], 1),
                            eq(hates[charles][i], 0)
                            )
                    );
}

Gecode

for(int i = 0; i < n; i++) {
   post(*this, tt(
                  imp(
                        hates_m(i,agatha) == 1, 
                        // =>
                        hates_m(i,charles) == 0)), 
                        opt.icl());
}

No one hates everyone

This is the last condition to compare: No one hates everyone. It is implemented by a sum of the number of persons that each person hates, and this sum must be 2 or less. Note that it is possible to hate oneself.

MiniZinc

forall(i in r) (
  sum(j in r) (hates[i,j]) <= 2  
)

Comet

  forall(i in r) 
    m.post(sum(j in r) (hates[i,j]) <= 2);

JaCoP

Note: We could save the XlteqC constraint by restrict the domain of a_sum to 0..2 instead of 0..n (=3) but this explicit use of XlteqC is more clear.
for(int i = 0; i < n; i++) {
    FDV a[] = new FDV[n];
    for (int j = 0; j < n; j++) {
        a[j] = new FDV(store, "a"+j, 0, 1);
        a[j] = hates[i][j];
    }
    FDV a_sum = new FDV(store, "a_sum"+i, 0, n);
    store.impose(new Sum(a, a_sum));
    store.impose(new XlteqC(a_sum, 2));
}

Choco

Note: sum is an operator, which makes the condition somewhat easier to state than in JaCoP.
for(int i = 0; i < n; i++) {
    IntegerVariable a[] = makeIntVarArray("a", n, 0, 1);
    for (int j = 0; j < n; j++) {
        a[j] = hates[i][j];
    }
    m.addConstraint(leq(sum(a), 2));
}

Gecode

In Gecode this condition is quite easy to state by using linear. In order to use this there is a Matrix "view" of the hates matrix hates_m.
Matrix hates_m(hates, n, n);
// ...
for(int i = 0; i < n; i++) {
  linear(*this, hates_m.row(i), IRT_LQ, 2, opt.icl());
}

End comment

The mandatory end comment: There are probably better ways of modeling the problem than shown above, either by changing some details or by model the problem completely different. Maybe this will be done sometime...

May 08, 2009

Learning Constraint Programming III: decomposition of a global constraint: alldifferent_except_0

The series Learning Constraint Programming is a preparation for My talk at SweConsNet Workshop 2009: "Learning Constraint Programming (MiniZinc, JaCoP, Choco, Gecode/R, Comet, Gecode): Some Lessons Learned". Confusingly, the entries is not numbered in any logical order. Sorry about that.

Here are the previous entries:

Introduction

The global constraint alldifferent_except_0 (or the more general variant alldifferent_except_c) is one of my favorite global constraints. It is very handy to use when 0 (or any constant c) is coded as an unknown or irrelevant value. Then we can constraint the rest to be all distinct.

The great Global Constraint Catalog entry alldifferent_except_0) explains this constraint as:
Purpose
Enforce all variables of the collection VARIABLES to take distinct values, except those variables that are assigned to 0.

Example
​(<5,​0,​1,​9,​0,​3>)​
The alldifferent_except_0 constraint holds since all the values (that are different from 0) 5, 1, 9 and 3 are distinct.

Models

I have modeled a decomposition of alldifferent_except_0 in the following models, where the constraint is just tested perhaps combined with some other constraint, e.g. sorted or that there must be at least some zeros:

- MiniZinc alldifferent_except_0.mzn
- Comet: alldifferent_except_0.co
- Gecode/R: all_different_except_0.rb
- Choco: AllDifferentExcept0_test.java
- JaCoP: AllDifferentExcept0_test.java
- Gecode: alldifferent_except_0.cpp

Some models using alldifferent_except_0

And here is some real use of the constraint:

- Nonogram (Comet): nonogram.co. (A faster model using the regular constraint, nonogram_regular.co, is described here and here).
- I wrote about alldifferent_except_0 in Pi Day Sudoku 2009. However, as faster way of solving the problem was found and is described in Solving Pi Day Sudoku 2009 with the global cardinality constraint). Note: the competition is still on, so there is no link to any model.
- Sudoku generate (Comet): sudoku_generate.co
- all paths graph (MiniZinc): all_paths_graph.mzn
- Cube sum (MiniZinc): cube_sum.mzn
- Message sending (MiniZinc): message_sending.mzn

As the first two entries indicates, there may be faster solutions than using (a decomposition) of alldifferent_except_0, but even as a decomposition is a great conceptual tool when modeling a problem.

Implementations

In the implementations below we also see how to define a function (predicate) in the constraint programming systems.

For the Gecode/R model there are different approaches:
- "standard" ("direct") approach where we loop over all different pairs of elements and ensures that if both values are different from 0 then they should be different
- using count
- using global cardinality ("simulated" in Gecode/R, see below)

Also, in some models we use the slighly more general version alldifferent_except_c where c is any constant (e.g. "Pi" in the Pi Day Sudoku puzzle mentioned above.

MiniZinc:

Model: alldifferent_except_0.mzn.
predicate all_different_except_0(array[int] of var int: x) =
   let {
      int: n = length(x)
   }
   in
   forall(i,j in 1..n where i != j) (
        (x[i] > 0 /\ x[j] > 0) 
        -> 
        x[i] != x[j]
   )
;

// usage:
constraint all_different_except_0(x);

Comet:

Model: alldifferent_except_0.co.
function void alldifferent_except_0(Solver m, var{int}[] x) {
  int n = x.getSize();
  forall(i in 1..n, j in i+1..n) {
    m.post(
           x[i] > 0 && x[j] > 0 
           => 
           x[i] != x[j]
           );
  }
}

// usage
exploreall {
  // ...
  alldifferent_except_0(m, x);
}

Gecode/R

Model:all_different_except_0.rb.

When modeling the constraint in Gecode/R, I experimented with different approaches. The reification variant all_different_except_0_reif is actually quite fast.
# The simplest and the fastest implementation 
# using count for 1..max (poor man's global cardinality)
def all_different_except_0
  (1..self[0].domain.max).each{|i|
    self.count(i).must <= 1
  }
end

# global cardinality version using an extra array with the counts
def global_cardinality(xgcc)
  (self[0].domain.max+1).times{|i|
    xgcc[i].must == self.count(i)
  }
end

#
# The standard approach using reification.
#
def all_different_except_0_reif(x)
  n = x.length
  b1_is_an bool_var_matrix(n,n)
  b2_is_an bool_var_matrix(n,n)
  b3_is_an bool_var_matrix(n,n)
  n.times{|i|
    n.times{|j|
      if i != j then 
        x[i].must_not.equal(0, :reify => b1[i,j]) 
        x[i].must_not.equal(0, :reify => b2[i,j]) 
        x[i].must_not.equal(x[j], :reify => b3[i,j])
        (b1[i,j] & b2[i,j]).must.imply(b3[i,j])
      else 
        b1[i,j].must.true
        b2[i,j].must.true
        b3[i,j].must.true
      end
    }
  }
end

    # ...
    # usage:
    x.all_different_except_0
    # all_different_except_0_gcc(x)
    # all_different_except_0_reif(x)

Choco

Model: AllDifferentExcept0_test.java.

Note that here alldifferent_except_0 is derived from the more general version alldifferent_except_c.
//
// decomposition of alldifferent except 0
//
public void allDifferentExcept0(CPModel m, IntegerVariable[] v) {
    allDifferentExceptC(m, v, 0);
}

//
// slightly more general: alldifferent except c
//
public void allDifferentExceptC(CPModel m, IntegerVariable[] v, int c) {
    int len = v.length;
    for(int i = 0; i < v.length; i++) {
        for(int j = i+1; j < v.length; j++) {
            m.addConstraint(ifThenElse(
                                       and(
                                           gt(v[i], c), 
                                           gt(v[j], c)
                                           ), 
                                       neq(v[i], v[j]),
                                       TRUE
                                       )
                            );
        }
    }    
}

// ...
// usage: 

    m.addConstraint(allDifferent(queens));

JaCoP

Model: AllDifferentExcept0_test.java

This is exactly the same approach as the Choco version.
//
// decomposition of alldifferent except 0
//
public void allDifferentExcept0(FDstore m, FDV[] v) {
    allDifferentExceptC(m, v, 0);
} // end allDifferentExcept0


//
// slightly more general: alldifferent except c
//
public void allDifferentExceptC(FDstore m, FDV[] v, int c) {
    int len = v.length;
    for(int i = 0; i < v.length; i++) {
        for(int j = i+1; j < v.length; j++) {
            m.impose(new IfThen(
                                       new And(
                                           new XneqC(v[i], c), 
                                           new XneqC(v[j], c)
                                           ), 
                                       new XneqY(v[i], v[j])
                                       )
                            );
        }
    }        
} // end allDifferentExceptC


        // ...
        // usage:
        allDifferentExcept0(store, x);

Gecode

alldifferent_except_0.cpp

The Gecode version is very succint since it use overloaded boolean operators. Very nice.
void alldifferent_except_0(Space& space, IntVarArray x, IntConLevel icl = ICL_BND) {
  for(int i = 0; i < x.size(); i++) {
    for(int j = i+1; j < x.size(); j++) {
      post(space,
           tt(
           imp(x[i] != 0 && x[j] != 0, 
           // =>
           x[i] != x[j])),
           icl
           );
    }
  }
} // alldifferent_except_0


// ...
// usage:
    alldifferent_except_0(*this, x, opt.icl());

May 05, 2009

Learning constraint programming - part II: Modeling with the Element constraint

As indicated last in Learning constraint programming (languages) - part I here are some findings when implementing Crossword, Word square, and Who killed Agatha?. See links below for the implementations.

Spoiled
The first constraint programming system i learned after constraint logic programming in Prolog was MiniZinc. When implemented the problems below I realized that I have been quite spoiled by using MiniZinc. The way MiniZinc (and also Comet) supports the Element constraint, i.e. access of variable arrays/matrices, is very straightforward in these systems and it doesn't matter whether the array to access is an array of integers or of non-variable variable integers. In the Java (Choco, JaCoP) and C++ systems (Gecode), however, this is another matter. Due to different circumstances I have not implemented these models in Gecode/R.

Element in MiniZinc and Comet
Accessing arrays and matrices in MiniZinc and Comet is simply done by using the [] construct, no matter what the type of the array or the index are (I assume integers and variable integers here). For the other systems we must explicitly use the Element constraint (called nth in Choco).

Crossword

This is a standard constraint programming problem, used as a running example in Apt's great book Principles of Constraint Programming. Here is a formulation of the problem (stated differently than in the book):
Place the words listed below in the following crossword. The '#' means a blocked cell, and the numbers indicate the overlappings of the words.

      1   2   3   4   5
    +---+---+---+---+---+
  1 | 1 |   | 2 |   | 3 |
    +---+---+---+---+---+
  2 | # | # |   | # |   |
    +---+---+---+---+---+
  3 | # | 4 |   | 5 |   |
    +---+---+---+---+---+
  4 | 6 | # | 7 |   |   |
    +---+---+---+---+---+
  5 | 8 |   |   |   |   |
    +---+---+---+---+---+
  6 |   | # | # |   | # |
    +---+---+---+---+---+
                         
We can use the following words
* AFT * ALE * EEL * HEEL * HIKE * HOSES * KEEL * KNOT * LASER * LEE * LINE * SAILS * SHEET * STEER * TIE

Models


MiniZinc: crossword.mzn
Choco: CrossWord.java
JaCoP: CrossWord.java
Gecode/R: (Not implemented)
Comet: crossword.co
Gecode: crossword.cpp

Note: I have seen more general models for solving crossword problems in Choco, JaCoP, Gecode/R, and Gecode with constructions other that the simple Elements used here. Since I wanted to compare the same way of solving the problem using the same Element-construct this may be an unfair comparison between the systems. Well, this is at least a finding how to implement this problem by Element...

Explanation of variables
The matrix A is the individual chars of the words (Comet variant):
int A[1..num_words,1..word_len] = 
  [
   [h, o, s, e, s], //  HOSES
   [l, a, s, e, r], //  LASER
   [s, a, i, l, s], //  SAILS
   [s, h, e, e, t], //  SHEET
   [s, t, e, e, r], //  STEER
   [h, e, e, l, 0], //  HEEL
   [h, i, k, e, 0], //  HIKE
   [k, e, e, l, 0], //  KEEL
   [k, n, o, t, 0], //  KNOT
   [l, i, n, e, 0], //  LINE
   [a, f, t, 0, 0], //  AFT
   [a, l, e, 0, 0], //  ALE
   [e, e, l, 0, 0], //  EEL
   [l, e, e, 0, 0], //  LEE
   [t, i, e, 0, 0]  //  TIE
];
overlapping is the matrix of the overlapping cells)
This is the Comet version:
[
 [1, 3, 2, 1],   //  s
 [1, 5, 3, 1],   //  s 
  
 [4, 2, 2, 3],   //  i
 [4, 3, 5, 1],   //  k
 [4, 4, 3, 3],   //  e
  
 [7, 1, 2, 4],   //  l
 [7, 2, 5, 2],   //  e
 [7, 3, 3, 4],   //  e
  
 [8, 1, 6, 2],   //  l
 [8, 3, 2, 5],   //  s
 [8, 4, 5, 3],   //  e
 [8, 5, 3, 5]    //  r
 ];
E is the variable array of which the word to use for the different overlappings. This is in fact the only variable (array) that is needed in the problem, apart from the utility/convenience variables.

The main constraint for the crossword example in each system is stated thus:

MiniZinc:
forall(i in 1..num_overlapping) (
   A[E[overlapping[i,1]], overlapping[i,2]] =  A[E[overlapping[i,3]], overlapping[i,4]]
)
Comet:
  forall(i in 1..num_overlapping) {
    cp.post(A[E[overlapping[i,1]], overlapping[i,2]] ==  A[E[overlapping[i,3]], overlapping[i,4]], onDomains);
  }
Choco:
Note that Choco has a special Element which support two dimensional arrays (matrix), which we use.
for(int I = 0; I < num_overlapping; I++) {
  IntegerVariable tmp = makeIntVar("tmp" + I, 1, 26);
  M.addConstraint(nth(E[overlapping[I][0]], W[overlapping[I][1]], AX, tmp));
  M.addConstraint(nth(E[overlapping[I][2]], W[overlapping[I][3]], AX, tmp));
}
JaCoP:
Here we had used some trickery by using a transposed version of the word matrix since JaCoP has no special Element constraint for two dimensional arrays.
for (int I = 0; I < num_overlapping; I++) {
   FDV tmp = new FDV(store, "TMP" + I, 0, num_words*word_len);
   store.impose(new Element(E[overlapping[I][0]], words_t[overlapping[I][1]], tmp));
   store.impose(new Element(E[overlapping[I][2]], words_t[overlapping[I][3]], tmp));
}
Gecode:
This is more complicated compared to the two Java systems since in Gecode we use an array (of length rows*cols) to simulate the matrix. (There is a Matrix "view" in Gecode but the indices must be of type integer, not IntVar, so it can not be used.) Also, the constraints plus and mult takes IntVar as argument.

The first overlapped crossing is "expanded" like this (Gecode is 0-based):
   A[E[overlapping[i,0]], overlapping[i,1]] // MiniZinc/Comet
   =>
   a1 = A[ E[I*4+0] * word_len + overlapping[I*4+1]] // Gecode
Here is the complete code. The comments hopefully explains what is going on.

First we define an utility function for accessing the element according to the above principle.
/**
 *
 * Special version of element for an array version of a "matrix" words,
 * E is an integer variable array, C is an array of IntVars for
 * the offset j in the words "matrix".
 *
 * The call 
 *    element_offset(*this, words, E[i], word_len_v, C[j], res, opt.icl());
 *
 * corresponds to:
 *    res = words[E[i], j] --> words[E[i]*word_len+J]
 *
 */
void element_offset(Space& space,
                   IntArgs words,
                   IntVar e,
                   IntVar word_len,
                   IntVar c,
                   IntVar res,
                   IntConLevel icl = ICL_DOM) {

      element(space, words, 
              plus(space, 
                   mult(space, 
                        e, 
                        word_len, icl), 
                   c, icl), 
              res, icl);
}

The function is then used as follows:
for(int I = 0; I < num_overlapping; I++) {
   IntVar e1(*this, 0, num_overlapping*4);
   IntVar e2(*this, 0, num_overlapping*4);

   IntVarArray o(*this, 4, 0, num_overlapping*4);
   for(int J = 0; J < 4; J++) {
     post(*this, o[J] == overlapping[I*4+J], opt.icl());
   }

   element(*this, E, o[0], e1, opt.icl());      // e1 = E[I*4+0]
   element(*this, E, o[2], e2, opt.icl());      // e2 = E[I*4+2]

   IntVar a1(*this, 0, num_words*word_len);
      
   element_offset(*this, A, e1, word_len_v, o[1], a1, opt.icl());
   element_offset(*this, A, e2, word_len_v, o[3], a1, opt.icl());
}
(The same element_offset function is also used in the word_square problem below.) It took quite a time to get the function and temporary variables (and their domains) right. With training (and the element_offset as a skeleton) similiar problems should be easier to implement.

Note: this is not a bashing of Gecode. Gecode is a great system and it happens that for this specific problem, Gecode does not has the appropriate support. I should also mention that it was a long time since I programmed in C++ and am little rusty.

As mentioned earlier, I have been very spoiled by the MiniZinc (and Comet) constructs. Also: I'm a very 'lazy' person (in the Perl sense of the word lazy) and likes the agile programming languages - Perl, Ruby, Python etc - much for their high level constructs.

Word square problem

The word problem is a cousin to the crossword problem, and is described in Wikipedia's Word_square:
A word square is a special case of acrostic. It consists of a set of words, all having the same number of letters as the total number of words (the "order" of the square); when the words are written out in a square grid horizontally, the same set of words can be read vertically.
An example of order 7 found by the Comet model where we see that the first row word (aalborg) is also the first column word.

aalborg
abaiser
lantaca
bittman
osamine
recanes
granese

Models

Here are the models for solving the Word square problem:
MiniZinc: word_square.mzn
Choco: WordSquare.java
JaCoP: WordSquare.java
Gecode/R: (Not implemented it Gecode/R)
Comet: word_square.co
Gecode: word_square.cpp

It is somewhat easier than the crossword problem. As before, E is the array of the index of the words to use, and words is an matrix of the words. Also, these models is an experiment of how to read a file, the word list /usr/dict/words (standard on Unix/Linux systems).

MiniZinc
forall(I, J in 1..word_len) (
  words[E[I], J] = words[E[J],I]
)
Comet
  forall(i in 1..word_len) {
    forall(j in 1..word_len) {    
      cp.post(words[E[i], j] == words[E[j],i], onDomains);
    }
  }
JaCoP
// 
// The overlappings (crossings).
// Note that we use a transposed word matrix for the Element.
//
for(int i = 0; i < word_length ; i++) {
    for(int j = 0; j < word_length ; j++) {
        // Comet: words[E[i], j] ==  words[E[j],i]
        FDV tmp = new FDV(store, "tmp" + i + " " + j, 0, dict_size);
        store.impose(new Element(E[i], words[j], tmp));
        store.impose(new Element(E[j], words[i], tmp));
    }
}
Choco
// Constants for the nth constraint below
IntegerVariable [] C = new IntegerVariable[dict_size];
for (int I = 0; I < word_length; I++) {
    C[I] = makeIntVar("C"+I, I,I);
}

// The overlappings (crossings)
for(int I = 0; I < word_length ; I++) {
    for(int J = 0; J < word_length ; J++) {
        // Comet: words[E[i], j] ==  words[E[j],i]
        IntegerVariable tmp = makeIntVar("tmp" + I + " " + J, 0, dict_size);
        M.addConstraint(nth(E[I], C[J], words, tmp));
        M.addConstraint(nth(E[J], C[I], words, tmp));
    }
}
Gecode
Note that this model use the same function element_offset that was used in the Crossword problem. It took some time to realize that it also could be used here.
// convenience variables for the element constraints below
// since element, plus, and mult wants IntVars.
IntVar word_len_v(*this, word_len, word_len);
IntVarArray C(*this, word_len, 0, word_len-1);
for(int i = 0; i < word_len; i++) {
  rel(*this, C[i], IRT_EQ, i, opt.icl());
}

for(int i = 0; i < word_len; i++) {
  for(int j = 0; j < word_len; j++) {
    // words[E[i], j] ==  words[E[j],i]

    IntVar tmp(*this, 0, num_words);

    // tmp == words[E[i], j] --> words[E[i]*word_len+j]
    element_offset(*this, words, E[i], word_len_v, C[j], tmp, opt.icl());

    // tmp == words[E[j], i]  --> words[E[j]*word_len+i]
    element_offset(*this, words, E[j], word_len_v, C[i], tmp, opt.icl());
  }
}

Who killed Agatha?

This is a standard benchmark for theorem proving, also known as The Dreadsbury Mansion Murder Mystery.

Problem formulation from The h1 Tool Suite
Someone in Dreadsbury Mansion killed Aunt Agatha. Agatha, the butler, and Charles live in Dreadsbury Mansion, and are the only ones to live there. A killer always hates, and is no richer than his victim. Charles hates noone that Agatha hates. Agatha hates everybody except the butler. The butler hates everyone not richer than Aunt Agatha. The butler hates everyone whom Agatha hates. Noone hates everyone. Who killed Agatha?
Originally from F. J. Pelletier: Seventy-five problems for testing automatic theorem provers., Journal of Automated Reasoning, 2: 191-216, 1986.

Models


MiniZinc: who_killed_agatha.mzn
JaCoP : WhoKilledAgatha.java
JaCoP : WhoKilledAgatha_element.java
Choco: WhoKilledAgatha.java
Choco: WhoKilledAgatha_element.java
Comet: who_killed_agatha.co
Gecode: who_killed_agatha.cpp

In Some new Gecode models I wrote about the findings in implemented this problem in Gecode and compared to Comet/MiniZinc.

The models use two 3x3 matrices for representing the two relations hates and richer with 0..1 as domain (i.e. boolean). The Element constraints is used for implementing the condition A killer always hates, and is no richer than his victim. where the_killer is an integer variable; the_victim is in some models replaced by agatha (the integer 0). The interesting thing here is that at least one of the indices are integer variables, which caused the difficulties in the two problems above.

These models also use a lot of boolean constructs. A comparison of how these are implemented in the different CP systems may be described in a future blog post.

MiniZinc
hates[the_killer, the_victim] = 1 /\
richer[the_killer, the_victim] = 0
Comet:
m.post(hates[the_killer, the_victim] == 1);
m.post(richer[the_killer, the_victim] == 0);
Note: In the models below I have simplified the problem by using agatha (defined as the integer 0) instead of the integer variable the_victim. This is not a problem since we know that Agatha is the victim, and is the reason why the Element is easier to use than for Crossword and Word square.

JaCoP variant 1 (no Element):
JaCoP don't have a direct support for the case when the index i (in matrix[i][j]) is an integer variable so the first variant of modeling the condition A killer always hates, and is no richer than his victim. does not use Element at all. In WhoKilledAgatha.java we simply loop over all integers (0..2), check if "this" i equals the_killer and then we can use two integers for accessing the matrices. Also, note the IfThen construct.
for(int i = 0; i < n; i++) {
    store.impose(
                 new IfThen(
                            new XeqC(the_killer, i),
                            new XeqC(hates[i][agatha], 1)
                            )
                 );
    store.impose(
                 new IfThen(
                            new XeqC(the_killer, i),
                            new XeqC(richer[i][agatha], 0)
                            )
                 );
}
This was the first variant I implemented, but then I recall the "trickery" used in Crossword and Word square where the matrices where transposed and Element could be used. The problem with this approach is that all constraints must be rewritten in a way that may be confusing. Come to think of it, maybe the names of the matrices should have been changed to is_hated_by and poorer.

JaCoP variant 2 (transposed matrices, Element)
This method of transposition and using Element is implemented in WhoKilledAgatha_element.java. The constraint is now much simpler:
int shift = -1;
for(int i = 0; i < n; i++) {
    store.impose(new Element(the_killer, hates[agatha], one, shift));
    store.impose(new Element(the_killer, richer[agatha], zero, shift));
}
Note: the Element in JaCoP defaults to start index as 1, but has support for shifting it to 0, by using -1 as the shift parameter. Choco variant 1 (no Element)
I implemented exact the same principle that was used in the two JaCoP model in the two Choco models. The first - no Element - is WhoKilledAgatha.java:

for(int i = 0; i < n; i++) {
    m.addConstraint(implies(
                            eq(the_killer,i),
                            eq(hates[i][agatha], 1)
                            )
                    );
    m.addConstraint(implies(
                            eq(the_killer, i),
                            eq(richer[i][agatha], 0)
                            )
                    );
}
Choco variant 2 (transposed matrices, nth)
Note: one and zero are integer variables since nth cannot handle plain integers as the last argument.
for(int i = 0; i < n; i++) {
   m.addConstraint(nth(the_killer, hates[agatha], one));
   m.addConstraint(nth(the_killer, richer[agatha], zero)
}

Comments

Here we have seen - not surprisingly - that using the Element constraint is quite different depending of which CP system we use and it can be easy or not so easy. It was my explicit intension to see how to solve the same problem as similiar as possible. We should also note that most (if not all) problems can be modeled in many ways, some not using Element at all.

One last comment: The two Java models of Who killed Agatha? took quite a long time to implement. The main reason for that was not the the handling of Element but was due to a bug of confusing the two matrices in one of the conditions. Sigh.

April 13, 2009

Regular expressions in Gecode

Using regular expressions for constraint programming is not new. The following examples from the Gecode distribution uses regular expressions in some way (the links are to the Gecode site): I have now started to learn more about Gecode's support for regular expression. The following operations are supported (from Gecode::REG Class Reference:
  • =: Assign to regular expression r.
  • +: Return expression for: this expression followed by r.
  • +=: This expression is followed by r.
  • |: Return expression for: this expression or r.
  • |=: This expression or r.
  • *: Return expression for: this expression arbitrarily often (Kleene star).
  • +: Return expression for: this expression at least once.
  • (unsigned int n, unsigned int m): Return expression for: this expression at least n and at most m times.
  • (unsigned int n): Return expression for: this expression at least n times.
See below for some examples.

Global contiguity

For a simple use of regular expressions to implement a global constraint is global contiguity which is explained is Global constraint catalog as:
Enforce all variables of the VARIABLES collection to be assigned to 0 or 1. In addition, all variables assigned to value 1 appear contiguously.
The model global_contiguity.cpp implements this by the simple regular expression: 0*1*0*, i.e.
REG r0(0), r1(1);
extensional(*this, x, *r0 + *r1 + *r0);
Note that the operators are before the REG element.

(This model also implements global contiguity by a variant of the slide
constraint.)

Generating names via regular expressions

In April 2004 Geoffrey K. Pullum (at Language Log) wondered about the many spellings of the swedish author Henning Mankell in The mysteries of... what's his name?.

I then wrote an Icon program for generating all the different versions of the names, and presented the program and solutions in the (swedish) blog posts: It was a fun problem and I have now implemented a version in Gecode: all_regexp.cpp.

There are two names (regular expressions) which are generated.

Henning Mankell
This uses the regular expression [hm][ea](nk|n|nn)(ing|ell|all), which in Gecode is stated as
extensional(*this, X, 
            (h|m) +                            // [hm]
            (e|a) +                            // [ea]
            ((n+k) | n | (n+n) ) +             // (nk|n|nn)
            ( (i+n+g) | (e+l+l) | (a+l+l))     // (ing|ell|all)
            ); 
In the original both the first and last name was generated; here we just generate the first name (or the last name).

Here are all 36 solutions:
Total: 36 solutions:
hanall
hanell
haning
henall
henell
hening
manall
manell
maning
menall
menell
mening
hankall
hankell
hanking
hannall
hannell
hanning
henkall
henkell
henking
hennall
hennell
henning
mankall
mankell
manking
mannall
mannell
manning
menkall
menkell
menking
mennall
mennell
menning
kjellerstrand
This is my last name and I have seen a lot of misspellings of this name. The regular expression for some of the variants is k(je|ä)ll(er|ar)?(st|b)r?an?d which in Gecode is implemented as:
extensional(*this, X, 
            k +                  // k
            (j+e|auml) +         // (je|ä)
            l+l +                // ll
            ((e+r)|(a+r))(0,1) + // (er|ar)?
            ((s + t)|b) +        // (st|b)
            r(0,1) +             // r?
            a+                   // a
            n(0,1) +             // n?
            d                    // d
            );
Different sizes, collecting all solutions
The model actually loops through many models with different sizes (length of the array) and collects all the solutions in a vector that is printed at the end of the program. Here is a the main function:
int
main(int argc, char* argv[]) {
  SizeOptions opt("AllRegexp");
  opt.solutions(0);
  opt.model(AllRegexp::MODEL_MANKELL);
  opt.model(AllRegexp::MODEL_MANKELL, "mankell");
  opt.model(AllRegexp::MODEL_KJELLERSTRAND, "kjellerstrand");
  opt.parse(argc,argv);
  // default for the mankell model
  int min = 6;
  int max = 7;
  if (opt.model() == AllRegexp::MODEL_KJELLERSTRAND) {
    min = 7;
    max = 13;
  }

  // loop through all the different sizes
  for(int s = min; s <= max ; s++) {
    opt.size(s);
    Example::run(opt);    
  }

  std::cout << "Total: " << all_solutions.size() << " solutions: " << std::endl;
  for(unsigned int i = 0; i < all_solutions.size(); i++) {
    std::cout << all_solutions[i] << std::endl;
  }

  return 0;

April 12, 2009

12 more Gecode models. e.g. circuit, clique, and Hidato puzzle

Here are 12 more Gecode models and some comments. Note that more information and references to the problem is always presented is the model. These and other Geode models are collected at My Gecode page.
  • alldifferent_except_0.cpp: Global constraint of alldifferent_except_0 (decomposition), also the slightly more general alldifferent_except_c

    This is one of my first test models how to create (a decomposition of) a global constraint as well as how to call a function. Note the use of imp (implication):
    void alldifferent_except_c(Space& space, IntVarArray x, int c, IntConLevel icl = ICL_BND) {
      for(int i = 0; i < x.size(); i++) {
        for(int j = i+1; j < x.size(); j++) {
          post(space,
               tt(imp(x[i] != c && x[j] != c, // =>
                      x[i] != x[j])),
               icl
               );
        }
      } 
    }
    
    Just for fun, the model also has the constraint that there should be at least one 0.
  • discrete_tomography.cpp: Discrete tomography

    For more about discrete tomography, see for example Christoph Dürr's Discrete Tomography and the comments in the ECLiPSe code tomo.ecl.txt.

    The coding of the model was quite straightforward. The main lesson learned here is to be able to state (from the command line) which problem to solve. I borrowed the principle used in Gecode's nonogram example.

    First in the model is the global variables for accessing the specific problem:
    namespace {
      extern const int* problems[];         // List of problems
      extern const unsigned int n_examples; // Number of specifications
    }
    
    And last in the file is the specifications of the problems:
    namespace {
      // ...
      // Specification for problem 1
      const int p1[] =
        { 11, 12,
          // Rows sums
          0,0,8,2,6,4,5,3,7,0,0,
          // Column sums
          0,0,7,1,6,3,4,5,2,7,0,0
        };
      
      // ...
    
    To access the problem the following is used (slightly compressed)
    class DiscreteTomography : public Example {
    protected:
      const int* prob;  // Problem to be used
      IntVarArray x;     // Fields of matrix
      int rows(void) const { return prob[0]; } // Return rows of matrix
      int cols(void) const { return prob[1]; } // Return columns of matrix
    public:
      DiscreteTomography(const SizeOptions& opt) 
        : prob(problems[opt.size()]),
          x(*this, rows()*cols(), 0, 1)
      {
        // ...
    
  • labeled_dice.cpp: Labeled dice problem

    Jim Orlin wrote about this puzzle in Colored letters, labeled dice: a logic puzzle and I presented a Comet solution (labeled_dice.co) in Comet: regular constraint, a much faster Nonogram with the regular constraint, some OPL models, and more.

    The Gecode model use one element constraint and a couple of count constraints. It was simpler than I first thought.
  • building_blocks.cpp: Building Blocks (Dell Logic Puzzle)

    This problem almost the same as labeled_dice (see above) and should perhaps be incorporated into that model.
  • young_tableaux.cpp: Young tableaux and partitions

    Young tableaux are described in the Wikipedia article Young tableau and MathWorld article YoungTableau, and is one of my standard comparison models.

    The tricky part here was to calculate the partition (from the Young tableau), especially the following (MiniZinc/Comet) where p is the partition element, and x is the Young tableau matrix. p[i] == sum(j in 1..n) (x[i,j] <= n) The corresponding Gecode was:
    for(int i = 0; i < n; i++) {
      // p[i] == sum(j in 1..n) (x[i,j] <= n)
      IntVar t(*this, 0, n+1);
      count(*this, m.row(i), n+1, IRT_EQ, t, opt.icl());
      post(*this, p[i] == n-t, opt.icl());
    }
    
    Not very advanced, but it took some time.
  • set_covering_deployment.cpp: Set covering deployment

    Set covering deployment is presented in the MathWorld Set Covering Deployment and uses a small example of placing Roman armies.

    Well, the hardest part was the second requirement: "There should always be an backup army near every city". See the model how I implemented that.
  • subset_sum.cpp: Subset sum problem

    This problem is from Katta G. Murty Optimization Models for Decision Making (PDF) page 340:
    Example 7.8.1

    A bank van had several bags of coins, each containing either 16, 17, 23, 24, 39, or 40 coins. While the van was parked on the street, thieves stole some bags. A total of 100 coins were lost. It is required to find how many bags were stolen.
    Comments The subset sum constraint is simply this constraint where x is the values to choose from and tot is the sum:
     linear(space, values, x, IRT_EQ, tot,  icl);
    
    The model is slightly more general, as the total (to sum) can be set from the command line.
  • sonet.cpp: SONET problem

    This is a standard problem in constraint programming. My solution is based on the ESSENCE' (Andrea Rendl's Tailor) model sonet_problem.eprime.

    The biggest problem was to translate the following requirement if there is a demand between 2 nodes, then there has to exist a ring, on which they are both installed which use an exists construct in the ESSENCE' model (as well as my MiniZinc model sonet_problem.mzn. The correspondent Gecode code is:
    for(int client1 = 0; client1 < n; client1++) {
      for(int client2 = client1 + 1; client2 < n; client2++) {
        if(demand[client1*n+client2] == 1) {
          // ESSENCE' code:
          //   exists ring : int(1..r) .
          //      rings[ring,client1] + rings[ring, client2] >= 2)
    
          // here we use reification: 
          BoolVarArray bs(*this, r, 0, 1);
          for(int ring = 0; ring < r; ring++) {
            // does this ring satisfy the condition?
            // note: the "~" is needed
            bs[ring] = post(*this, ~(rings_m(client1, ring) + rings_m(client2,ring) >= 2), opt.icl());
          }
          // at least one success is needed
          linear(*this, bs, IRT_GQ, 1, opt.icl());
        }
      }
    }
    
    I really like that Gecode supports this higher level of stating logical operations and reifications.
  • hidato.cpp: Hidato puzzle

    Hidato is yet another grid puzzle. For more about Hidato see the following sources (and games to play): The first source explains the problem as:
    Puzzles start semi-filled with numbered tiles. The first and last numbers are circled. Connect the numbers together to win. Consecutive number must touch horizontally, vertically, or diagonally.


    Conceptually the problem can be modeled as follows (which was done in the MiniZinc model hidato.mzn), e.g. one forall loop and four exists loops. It works but is inefficient:
      % ...
      forall(k in 1..r*c-1) (
         exists(i in 1..r, j in 1..c) (
            k = x[i, j] % fix this k /\
            exists(a, b in  {-1, 0, 1}
              where 
              i+a >= 1 /\ j+b >=  1 /\
              i+a <= r /\ j+b <= c
              /\ not(a = 0 /\ b = 0) 
            ) 
            (
              % find the next k
              k + 1 = x[i+a, j+b]  
            )
         )
      )
    
    I started to follow this logic in Gecode with a lot of boolean arrays and reifications, but after some thoughts here is what was used instead.
        for(int k = 1; k < n*n; k++) {
          IntVar i(*this, 0, n-1);
          IntVar j(*this, 0, n-1);
          IntVar a(*this, -1, 1);
          IntVar b(*this, -1, 1);
    
          // 1) First: fix this k, i.e.
          //        k == board[i*n+j]
          IntVar inj(*this, 0,n*n-1);
          post(*this, inj == i*n+j, opt.icl());
          element(*this, board, inj, k, opt.icl());
    
          // 2) Then, find the position of the next value, i.e.
          //     k + 1 == board[(i+a)*n+(j+b)]
          IntVar ai(*this, 0, n-1); // this takes care of the borders of the matrix
          IntVar jb(*this, 0, n-1); // ibid.
          post(*this, a+i == ai,opt.icl());
          post(*this, j+b == jb,opt.icl());
          IntVar ai_plus_jb(*this, 0,n*n-1);
          post(*this, ai_plus_jb == ai*n+jb, opt.icl());
          element(*this, board, ai_plus_jb, k+1, opt.icl());
        }
    
    Note that the borders of the matrix is handled by the domains of the IntVar's ai and jb.
  • circuit_orbit.cpp: Global constraint circuit (decomposition) using permutation orbits

    circuit is a global constraint I have been fascinated by for a time, and I first implemented a decomposition in MiniZinc (circuit_test.mzn). Gecode has a builtin version of circuit, but I thought it could be fun to implement this decomposition as well.

    It uses some observation I noted about permutation orbits when playing around with another problem.

    Also, it sets the parameter c_d (recomputation commit distance) to size*2 in order to save memory. For example, with the default value of requires 600Mb memory for the problem size=500, but with c-d 1000 requires only 12Mb (for more about this, see the discussion and comments in de Bruijn sequences in Gecode (and other systems)).
  • nadel.cpp: Nadel's construction problem ("soft" version)

    This a construction problem attributed by Rina Dechter "Constraint Processing" (page 5) to B.A. Nadel's article "Constraint satisfaction algorithms" (1989). I not sure about the solution, since I have not been able to find Nadel's article.

    I haven't found any solution that satisfies all the constraints presented in the problem (which may indicate that my model is wrong).

    However this can be used as a study of reifications: this "soft" version minimized the broken constraints, and generates 28 different solutions with 1 broken constraint. The broken constraints are either
    • steep_slopes constraints, or
    • near_dump constraints.
  • clique.cpp: Global constraint clique (decomposition), maximize the cardinality of cliques in a graph

    This decomposition of the global constraint clique for a graph g uses the (obvious) observation that if two different nodes is in the same clique (s) then there must be a connection between the two nodes in the boolean (channeled) representation of s. This actually seems to be enough for use as a constraint (which is an example of the declarative nature of constraint programming). In MiniZinc parlance this is formulated in the model clique.mzn as:
    link_set_to_booleans(x, x_bool) /\
    forall(i,j in 1..n where i != j) (
       (x_bool[i] /\ x_bool[j]) -> ( g[i,j] > 0)
    ) /\
    c = card(s)
    
    The Gecode variant is slightly more complicated, mainly because the imp function don't accept IntArgs as a second argument.
    // x_bool is a boolean representation of the clique set x
    BoolVarArray x_bool(*this, n, 0, 1);
    channel(*this, x_bool, x); // channel the clique to a boolean array
    for(int i = 0; i < n; i++) {
      for(int j = 0; j < n; j++) {
        if (i != j) {
          BoolVar b(*this, 0, 1);
          if (g[i*n+j] > 0) {
            post(*this, b == 1, opt.icl());
          } else {
            post(*this, b == 0, opt.icl());
          }
          post(*this, tt(imp(
                            x_bool[i] == 1 && x_bool[j] == 1, // ->
                            b)),
              opt.icl());
         }
      }
    }
    
    The model contains the small example from Global constraint catalog: clique, a graph of 10 nodes, a bigger random graph of 100 nodes, and a graph randomizer. These examples is choosen by re-commenting the source code.

    The included random graph of size 100 finds a maximum clique (size 6) in about 2 seconds. and generates the four size 6 cliques in about the same time.

April 05, 2009

Some new Gecode models

Here are some new Gecode models created since the last week, consisting mostly of two kind of standard problems:
  • simple operations reasearch problems
  • logical problems
When modeling these problems some findings where discovered which also are presented.

  • assignment.cpp
    Simple assignment problem from Winston's "Operations Research, page 393f.

  • organize_day.cpp
    Organizing a day, simple operations research problem (ECLiPSe)

    Findings:
    The function no_overlap was quite easy to write, using reification and booleans.
    void no_overlap(Space& space, IntVar s1, int d1, IntVar s2, int d2, IntConLevel icl = ICL_BND) {
      post(space, tt((s1 + d1 <= s2) || (s2 + d2 <= s1)), icl);
    }
    
    Note: tt is needed so the logic expressions are interpreted as reifications.
  • coins3.cpp
    Minimum number of coins to pay exactly any amount between 1 and 99 cents. From the ECLiPSe book.

    Findings: One thing I almost always want when modeling optimization problems, is to investigate both the optimimum (maximum/minimum) and also see all solutions for that optimum.

    In this model different search methods can be stated via the command line.
    • BAB (Branch and Bound) and Restart for minimizing the number of coins
    • DFS (Depth First Search) for searching the complete search tree given a value from command line (opt.size()).

    When using the DFS (e.g. coins3.exe -search dfs 8) the value from the command line constrains the model with this condition:
    if (num_coins_val) {
       post(*this, num_coins == num_coins_val, opt.icl());
    }
    
  • set_covering.cpp
    Set covering problem: placing of firestations (Winston "Operations Research", page 486)
  • zebra.cpp
    The famous zebra problem.

    Findings:
    For the constraint x is next to y the usual approach is to use: abs(x,y) == 1 .

    However, in Gecode we cannot state the abs condition directly, so here is what I did instead (inspired by the Gecode examples all-interval.cpp):
     void next_to(Space & space, IntVar x, IntVar y, IntConLevel icl = ICL_DOM) {
        post(space, abs(space, minus(space, x, y, icl), icl) == 1, icl); 
    }
    
    Usage: next_to(*this, smoke[Chesterfields], animal[Fox], opt.icl());
  • who_killed_agatha.cpp
    Who killed agatha? (The Dreadsbury Mansion Murder Mystery) is a standard benchmark in theorem proving, and also a good test of how to express logical relations in a constraint programming language.

    Findings:
    In Gecode it is not as easy to state this problem as in MiniZinc who_killed_agatha.mzn or in Comet who_killed_agatha.co; these models can be studied for comparison.

    Gecode has, however, support for implications (imp) and logical equivalence (eqv), which it makes it easier to state these kind of logical relations (than without them).

    One example: The constraint Charles hates noone that Agatha hates. is translated into the following Gecode code (the comment shows the MiniZinc version). Note: hates_m is a Matrix wapper for the IntVarArray hates; the access for rows and columns is hates_m(col, row).
    for(int i = 0; i < n; i++) {
      // MiniZinc: hates[charles, i] = 0 <- hates[agatha, i] = 1
      post(*this, tt(imp(hates_m(i,agatha) == 1, hates_m(i,charles) == 0)), opt.icl());
    }

    The biggest problem was to formulate the constraint A killer always hates, and is no richer than his victim.. In MiniZinc and Comet the first part can be stated directly simply as
    hates[the_killer, the_victim] = 1
    
    Since hates is an IntVarArray this is to be reinpreted as hates[the_killer * n + the_victim] = 1 (n is the column size).

    However, since both the_killer and the_victim are IntVars we cannot use the element constraint directly. Well, after some experimentation the following (somewhat hairy) utilily function was created:
    void element_m(Space & space, IntVarArray m, IntVar x, IntVar y, int rows, int cols, int val, IntConLevel icl = ICL_DOM) {
      IntVar x_n(space, 0, (rows*cols)-1);
      IntVar ix(space, 0, (rows*cols)-1); 
      IntVar e1(space, 0, rows-1);
      post(space, x_n == x*rows, icl); // x_n = x*rows
      post(space, ix==x_n+y, icl);     // ix = x*row + y
      element(space, m, ix, e1, icl);  // e1 = m[x*row +y]
      post(space, e1 == val, icl);     // val = m[x*row +y]
    } // end element_m
    
    The function is then called as: element_m(*this, hates, the_killer, the_victim, n, n, 1, opt.icl());.
    I'm not very proud of this, but it works. The constraint could probably be stated in a more succint way.
  • stable_marriage.cpp
    Stable marriage is also a standard problem in constraint programming, from Pascal Van Henteryck's great OPL book ("The OPL Optimization Programming Language").

    Findings:
    Formulating this problem in OPL, Comet (which is very influenced by OPL), and MiniZinc is quite easy. See stable_marriage.mzn (MiniZinc), stable_marriage.co (Comet).

    In Gecode the real problem was to translate the following OPL constraint to Gecode:
    forall(m in Men, w in Women)
      rankMen[m,w] < rankMen[m, wife[m]] => rankWomen[w,husband[w]] < rankWomen[w,m];
    
    Here rankMen and rankWomen are IntArgs and wife and husband are IntVarArrays (m is an integer loop variable).

    As for the Agatha model, a utility function (element_m2) was created, and the corresponding Gecode code became:
    for(int m = 0; m < n; m++) {
       IntVar rankMen_res = element_m2(*this, rankMen, m, wife, n, opt.icl());
       for(int w = 0; w < n; w++) {
         IntVar rankWomen_res = element_m2(*this, rankWomen, w, husband, n, opt.icl());
         BoolVar b1 = post(*this, ~(rankMen[m*n+w] < rankMen_res));
         BoolVar b2 = post(*this, ~(rankWomen_res < rankWomen[w*n+m]));
         post(*this, tt(imp(b1, b2)), opt.icl());
      }
    }
    
    Note here the usage of the two BoolVars b1 and b2 in the declarations, and also the use of reifications (~) in the post expressions.
Some thought: Maybe one of the cause of the problems I have in formulating Gecode models is that I'm translating the logic from OPL/Comet/MiniZinc, instead of stepping back and think about the problem in a new way.

March 28, 2009

de Bruijn sequences in Gecode (and other systems)

First, the Gecode model of de Bruijn sequences that is refered below: debruijn.cpp.

Update 2009-03-30: Thanks to Mikael Zayenz Lagerkvist, I have fixed/updated/added some things. These comments are last in this post.

Introduction

I have been fascinated by de Bruijn sequences (Wikipedia) for years, and made some web based programs: Explanation of the terms "classic" and "arbitrary" (which is not standard vocabulary):
  • "classic" de Bruijn sequence: the sequence length is base^n (n is the number of bits),
  • "arbitrary" sequence: where the sequence length is arbitrary.
Modeling de Bruijn sequences is also one of the "standard" problems I always model when learning a new constraint programming system. See below for the implementation in other systems.

Principle used

The basic principle in generating de Bruijn sequences used in this model is the following. Note: The names for the parameters base, n, m, and bin_code are perhaps unfortunate and confusing, but are kept since they are used in all my other implementations (see below).

Given:
  • a base (parameter base)
  • number of bits (n)
  • length of sequence (m)
Then:
  • make a list of distinct integers in the range 0..(base^n)-1. This array is called x in the model. These are the nodes in a de Bruijn graph. The goal of this model is to find nodes that really are "de Bruijn nodes".
  • calculate the "bit sequence" (in base base) for each integer. This is a matrix with m rows and n columns, here called binary.
  • apply the de Bruijn condition for each consecutive integers, i.e. the first elements in binary[r] is the same as the last elements in binary[r-1], and also "around the corner".
  • the de Bruijn sequence is then the first element in each row, here called bin_code.
For "classic" de Bruin sequences this is really overkill, but it makes it possible to calculate sequences of arbitrary lengths.

Here is a simple run of the program with the following command line (see below for a discussion of the options):

debruijn.exe -solutions 1 -base 13 -n 4 -m 52 -print-matrix 1 -int-var smallest -int-val indomain-min

Result:
DeBruijn
base: 13
number of bits (n): 4
length of sequence (m): 52

x:{0, 1, 13, 169, 2197, 2, 26, 338, 4394, 3, 39, 507, 6591, 4, 52, 676, 8788, 5, 65, 845, 10985, 6, 78, 1014, 13182, 7, 91, 1183, 15379, 8, 104, 1352, 17576, 9, 117, 1521, 19773, 10, 130, 1690, 21970, 11, 143, 1859, 24167, 12, 156, 2029, 26389, 325, 4225, 26364}
de Bruijn sequence{0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 4, 0, 0, 0, 5, 0, 0, 0, 6, 0, 0, 0, 7, 0, 0, 0, 8, 0, 0, 0, 9, 0, 0, 0, 10, 0, 0, 0, 11, 0, 0, 0, 12, 0, 1, 12}
binary matrix:
0 0 0 0
0 0 0 1
0 0 1 0
...
0 1 12 0
1 12 0 0
12 0 0 0


Initial
        propagators:  312
        branchings:   3

Summary
        runtime:      0.020 (20.000000 ms)
        solutions:    1
        propagations: 1119
        nodes:        16
        failures:     0
        peak depth:   15
        peak memory:  100 KB
For calculating the standard "keypad problem" (i.e. base = 10, n = 4, and m = 10000), this model takes too much memory for my computer (2 Gb RAM, a Linux 3.4MHz dual core). There are probably some things to make this model more efficient. Update: See below how to handle this.

However, calculating the same base and n with a length of 1000 is fast:
Initial
        propagators:  6000
        branchings:   3

Summary
        runtime:      1.660 (1660.000000 ms)
        solutions:    1
        propagations: 138390
        nodes:        253
        failures:     0
        peak depth:   252
        peak memory:  42651 KB

Findings

Here are some findings when modeling this Gecode model.

* Matrix<IntVarArray>

One of things that took the longest time to do what the "channeling" from the integer array x and the matrix binary. The reason was simply that I haven't read the documentation how to use the Matrix wrapper. The order in the call is columns, rows, not the other way around which I assumed. Here is how to use it (a simple example):

...
IntVarArray some_array(*this, number_of_columns*number_of_rows, 0, 100);
Matrix matrix_version(some_array, number_of_columns, number_of_rows);

...


For accessing the matrix, we - of course - use the same order:
matrix_version(column, row);.

* Options

This model has a lot of different parameters to play with:
  • base
  • n
  • m
  • print_matrix (printing the binary matrix)
  • int-var IntVarBranch
  • int-val IntValBranch
Testing different variants via re-compilation is simply not feasible. Hence I wrote a subclass of Option for this, with initial help from Mikael Zayenz Lagerkvist. The most interesting part is handling different branching options (see the model debruijn.cpp how it is implemented). The default option branching is not enough since both IntVarBranch and IntValBranch should be possible to change via the command line. (A simpler version using the branching option is in donald_gerald_robert.cpp.)

Here is the result of running the program with the -help option showing the different options (the standard options is removed):
...
       -base (unsigned int) default: 2
                base to use
        -n (unsigned int) default: 3
                number of bits to use.
        -m (unsigned int) default: 0
                length of the sequence.
        -print-matrix (unsigned int) default: 0
                1 prints the binary matrix.
        -int-var (input-order, first-fail, anti-first-fail, smallest, largest, occurrence, max-regret) default: smallest
                options for IntVarBranch
                  input-order: use VAR_NONE
                  first-fail: use VAR_SIZE_MIN
                  anti-first-fail: use VAR_SIZE_MAX
                  smallest: use VAR_MIN_MIN
                  largest: use VAR_MAX_MAX
                  occurrence: use VAR_DEGREE_MAX
                  max-regret: use VAR_REGRET_MIN_MAX
        -int-val (indomain-min, indomain-max, indomain-median, indomain-split) default: indomain-min
                options for IntValBranch
                  indomain-min: use VAL_MIN
                  indomain-max: use VAL_MAX
                  indomain-median: use VAL_MED
                  indomain-split: use VAL_SPLIT_MIN

Some notes and other findings about this:
  • If no -m (the sequence length) is stated, it is defaulted to base^n. This is done in the class DeBruijnOptions.
  • The names used in -int-var and -int-val is taken from the mappings that Gecode/FlatZinc use for the MiniZinc branching options.
  • It seems that these extra options must be stated after all of the builtin options from the (standard) Options class.
  • This is another finding: When adding an option and you want to use it (say) in the print function, don't forget to add it to the clone constructor. Here is the constructor:
      //
      // Constructor for cloning s
      //
      DeBruijn(bool share, DeBruijn& s) : Example(share,s),
                                          n(s.n), 
                                          m(s.m),
                                          base(s.base),
                                          print_matrix(s.print_matrix),
                                          int_var(s.int_var),
                                          int_val(s.int_val)
      {
        x.update(*this, share, s.x);
        binary.update(*this, share, s.binary);
        bin_code.update(*this, share, s.bin_code);
        // gcc.update(*this, share, s.gcc);
      }
  • In the model, there are a (commented) extra constraint which states that the occurrences of the different values in the de Bruijn sequence should be the same, using the global cardinality constraint count. I will investigate how to make this constraint optional in the way I want. Update:: See below for more about this.

Other implementations

Here are my other constraint programming implementations of de Bruijn sequences:

See also

Update 2009-03-30
Thanks to Mikael Zayenz Lagerkvist there is a new version of debruijn.cpp. The improvements are:
  • It now compiles clean with gcc -Wall. For some reason I have forgotten to include that option in my Makefile.
  • There is a a new option -print_x which print x, i.e. the array of integers in the de Bruijn graph. The default it is now off (0).
  • This version always calculates the occurrences of elements in the de Bruijn sequence (the bin_code array), which is handled by a new IntVarArray variable gcc (for "global cardinality constraint").

    There is also a new option, -same-occurrences, which - when set to 1 - requires that all elements in the de Bruijn sequence should be the same.

    An example: Here is a problem where base=13, n=4 and m=52 and -same-occurrence 1
    de Bruijn sequence: {0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 12, 12, 12, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1}
    gcc: {4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4}
    
    
    Initial
            propagators:  314
            branchings:   4
    
    Summary
            runtime:      0.040 (40.000000 ms)
            solutions:    1
            propagations: 2516
            nodes:        46
            failures:     0
            peak depth:   45
            peak memory:  254 KB
    
  • Exhaused memory

    I wrote above that this problem:
    base=10
    n=4 (number of bits) and
    m = 10000 (length)
    required too much memory.

    This was fixed by incrementing the options -c-d ("recomputation commit distance") and -a-d ("recomputation adaption distance"). Mikael's suggestion was to use the following values:
    -c-d 256 -a-d 8

    With these values set, the problem took about 5:14 minutes on my computer (3.4MHz dual core, Mandriva Linux and 2Gb RAM) and required only about 400 Mb (before I had to abort since the memory was exhausted). Here is a result of running debruijn.exe -base 10 -n 4 -c-d 256 -a-d 8
    Initial
            propagators:  60002
            branchings:   4
    
    Summary
            runtime:      5:12.300 (312300.000000 ms)
            solutions:    1
            propagations: 12793015
            nodes:        2529
            failures:     0
            peak depth:   2528
            peak memory:  346288 KB
    debruijn.exe -base 10 -n 4 -c-d 256 -a-d 8  310,60s user 1,92s system 99% cpu 5:13,53 total
    

    I also tested with debruijn.exe -base 10 -n 4 -c-d 1256 -a-d 18, i.e. by just adding a "1" before each of the -c-d and -a-d values. The result was:
    Initial
            propagators:  60002
            branchings:   4
    
    Summary
            runtime:      5:05.150 (305150.000000 ms)
            solutions:    1
            propagations: 12793015
            nodes:        2529
            failures:     0
            peak depth:   2528
            peak memory:  61039 KB
    debruijn.exe -base 10 -n 4 -c-d 1256 -a-d 18  303,65s user 1,66s system 99% cpu 5:05,62 total
    
    The interesting part is that it required just about 20% of the memory. I have to study these things more.
    The two options -c-d, and -a-d 8are documented here.

    Later note: See more about these options in Mikael's comment below (which I didn't see when writing the update).


As usual, thanks Mikael!

March 27, 2009

Gecode version 3.0.2 released

Gecode version 3.0.2 is released. Download here.

From the changelog:

Changes in Version 3.0.2 (2009-03-26)
This is a bug fix release fixing two more embarrassing bugs. However, this time we redesigned our tests carefully such that they cover all changes and optimizations done for the transition from 2.2.0 to 3.0.*. Please update asap.


  • Finite domain integers
    • Bug fixes
      • Fixed bug in optimization of extensional constraints with DFAs (hard to reproduce, almost impossible). (major)

    • Performance improvements
      • Reoptimized element with integer values and created bizarre testcases. (minor)

  • Minimal modelling support
    • Bug fixes
      • Fixed bug in posting of Boolean expressions including reified linear expressions. Again, that escaped our testsuite (also fixed). (major, thanks to Gustavo Guiterrez)

  • Example scripts
    • Bug fixes
      • The radiotherapy example was missing in the Makefile. (minor, thanks to Håkan Kjellerstrand)

  • Gist
    • Other changes
      • A separator is printed between solutions in the TextInspector. (minor)


March 26, 2009

My first Gecode models

Gecode has long been an inspiration for many of my models: I have borrowed sample problems from its vast collection of examples, e.g. from Gecode's Minesweeper in the MiniZinc model minesweeper.mzn, from Gecode's Nonogram in the Comet model nonogram_regular.co and so on.

Since the Gecode models are very fast (one reason is that it use C++), they have also been used as a benchmark for my other models. For example, see the discussion about performance in Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second. (The Gecode team later incorporated one of these findings in their Nonogram model.)

Gecode (and also the MiniZinc solver Gecode/FlatZinc) has a feature I like very much: it prints out much interesting statistics (note: almost all other systems I have used shows some sort of statistics). Here is an example of the output from the 1000-queens model (from the Gecode distribution):


Initial
propagators: 3
branchings: 1

Summary
runtime: 0.870 (870.000000 ms)
solutions: 1
propagations: 3006
nodes: 996
failures: 2
peak depth: 993
peak memory: 77721 KB

Using the Example framework (see below) makes it easy to interactively testing different strategies etc via the command line since there is a lot of default options to use. Here are the options for the queens model:

Options for Queens:
-help, --help, -?
print this help message
-propagation (binary, mixed, distinct) default: distinct
propagation variants
binary: only binary disequality constraints
mixed: single distinct and binary disequality constraints
distinct: three distinct constraints
-icl (def, val, bnd, dom) default: def
integer consistency level
-solutions (unsigned int) default: 1
number of solutions (0 = all)
-c-d (unsigned int) default: 8
recomputation commit distance
-a-d (unsigned int) default: 2
recomputation adaption distance
-node (unsigned int) default: 0
node cutoff (0 = none, solution mode)
-fail (unsigned int) default: 0
failure cutoff (0 = none, solution mode)
-time (unsigned int) default: 0
time (in ms) cutoff (0 = none, solution mode)
-mode (solution, time, stat, gist) default: solution
how to execute example
-iterations (unsigned int) default: 500
iterations per sample (time mode)
-samples (unsigned int) default: 1
how many samples (time mode)
(unsigned int) default: 100
which version/size for example

In Gecode version 3.0.1 and Gecode/FlatZinc 1.5 released I mentioned the Gist GUI for graphically exploring (and debugging) the search tree.


My Gecode models


Well, this week I started to program some Gecode models. I have created a special page for them: My Gecode page which - right now - contains the following models:

  • coins_grid.cpp: Placing coins on a grid (Tony Hurlimann)
    Note: Constraint programming systems are not very good on this problem (at least not my models). Using integer programming is much faster, see Comet's MIP solver goins_grid.co).

  • diet.cpp: simple diet problem (operations research)

  • donald_gerald_robert.cpp: DONALD+GERALD=ROBERT, an experiment in "tweaking".
    Note: Here I experiment in different search strategies, different branchings etc.

  • least_diff.cpp: Least diff problem, an alphametic puzzle: minimize the difference ABCDE-FGHIJ where A..J are integers 0..9 and all different.

  • map.cpp: Simple coloring problem (Van Hentenryck)

  • quasigroup_completion.cpp: Quasigroup Completion

  • send_more_money_any_base.cpp: SEND+MORE=MONEY in any base

  • seseman.cpp: simple recreational mathematics puzzle. See the CGI-program Seseman convent problem

  • seseman2.cpp: A more general model of the Seseman problem (using a matrix for representing the problem)

  • survo_puzzle.cpp: Survo puzzle (with just one hardcoded example)

  • xkcd.cpp: XKCD's knapsack problem. See the xkcd strip for the problem


Some notes about the models


These are all problems I tend to start with when learning constraint programming systems: they are all simple and different in some way or another. (Compare with other implementations in My MiniZinc page, My JaCoP page, My Choco page, My Gecode/R page, and My Comet page.)

My other implementations of Quasigroup Completion and Survo puzzle usually have some some data/problems files. Well, this time I was kind of lazy and just hard coded one example or two. Sorry about that.

As you may note my C++ is quite rusty, so some things could probably have been coded better. And to quote myself: I tend to program more for readabilty than elegance, which makes my programs sometimes look quite boring....


Example framework
The models use the Example framework mentioned above, which is the framework that all Geocde examples use. Unfortunately it is not completely incorporated in the Minimodel framework and is not mentioned in the (very good) introduction Modeling with Gecode (PDF). But since all Gecode's examples use the framework, almost all the questions I have had so far could be solve by just browsing the examples, with just a glance or two at the online documentation.


See also


* This blog's Gecode category

Gecode site
* Gecode

Download
* Download

Dokumentation
* Documentation
* Publications
* Modeling with Gecode (PDF)
* The great collection of examples

Mailing list etc
* Community

Interfaces to other systems
* Interfaces

March 24, 2009

Gecode version 3.0.1 and Gecode/FlatZinc 1.5 released

Version 3.0.1 of Gecode is relased. It contains mostly bug fixes:

This is a bug fix release fixing an embarassing bug in reified Boolean linear constraints.
  • Finite domain integers

    • Bug fixes
      • IntSetArgs no longer inherit from PrimArgArray, which was wrong as IntSet is no primitive type and hence does not support vararg initializers. (minor)
      • Fixed bug in reified Boolean linear constraints (an optimization is currently disabled and will be active in the next release: the optimization was incorrect and was never tested). (major, thanks to Alberto Delgado)


  • Example scripts
    • Additions
    • Bug fixes
      • The examples now pass the c-d and a-d command line options correctly to Gist. (minor)
      • The Steel Mill Slab Design example had two bugs in the search heuristic and a missing redundant constraint. (minor, bugzilla entry, thanks to Chris Mears)




Gecode/FlatZinc has been updated to version 1.5. The bug fix here is very interesting (and exiting) since Gist now also works with Gecode/FlatZinc.

Gist in Gecode and Gecode/FlatZinc
Gist for Gecode has been around for time, and was officially in the distribution in version 3.0.0. In Modeling with Gecode (PDF) there is a section how to program and use Gist.

Stable Marriage Problem.
As a first example of Gist with Gecode/FlatZinc (i.e. MiniZinc models): stable_marriage.pdf is a PDF file which shows the full search tree for the MiniZinc model stable_marriage.mzn. The green nodes are solutions, the blue nodes are choice points, the red squares are failures, and the big red triangles are a sub tree of failures.

Running the model with fz -mode stats stable_marriage.fzn gives the following result.

wife : [7, 5, 9, 8, 3, 6, 1, 4, 2]
husband: [7, 9, 5, 8, 2, 6, 1, 4, 3]
----------
wife : [6, 5, 9, 8, 3, 7, 1, 4, 2]
husband: [7, 9, 5, 8, 2, 1, 6, 4, 3]
----------
wife : [6, 4, 9, 8, 3, 7, 1, 5, 2]
husband: [7, 9, 5, 2, 8, 1, 6, 4, 3]
----------
wife : [6, 1, 4, 8, 5, 9, 3, 2, 7]
husband: [2, 8, 7, 3, 5, 1, 9, 4, 6]
----------
wife : [6, 4, 1, 8, 5, 7, 3, 2, 9]
husband: [3, 8, 7, 2, 5, 1, 6, 4, 9]
----------
wife : [6, 1, 4, 8, 5, 7, 3, 2, 9]
husband: [2, 8, 7, 3, 5, 1, 6, 4, 9]
----------
==========

runtime: 10
solutions: 6
propagators: 346
propagations: 12426
nodes: 67
failures: 28
peak depth: 8
peak memory: 132 KB


Also see My MiniZinc Page for MiniZinc models which may be used with Gecode/FlatZinc.

March 22, 2009

Gecode: Guido Tack 'Constraint Propagation - Models, Techniques, Implementation'

Guido Tack is one of the Gecode team. His doctoral dissertation Constraint Propagation - Models, Techniques, Implementation (2009, as PDF) discuss constraint propagation in general, and give many examples of how propagation works in Gecode.

Abstract:


This dissertation presents the design of a propagation-based constraint solver. The design is based on models that span several levels of abstraction, ranging from a mathematical foundation, to a high-level implementation architecture, to concrete data structures and algorithms. This principled design approach results in a well-understood, correct, modular, and efficient implementation.

The core of the developed architecture is the propagation kernel. It provides the propagation infrastructure and is thus crucial for correctness and efficiency of the solver. Based on a mathematical model as well as a careful design of the employed algorithms and data structures, the presented architecture results in an efficient and domain-independent kernel. Constraints are realized by propagators, and implementing a propagator is a challenging, error-prone, and time-consuming task. A practically useful solver must however provide a comprehensive propagator library. This dissertation introduces two techniques for automatically deriving correct and efficient propagators. Views generalize variables and are used to derive propagators from existing propagators. For constraints over set variables, propagators are derived from formal constraint specifications.

The presented techniques are the basis of Gecode, a production-quality, highly efficient, and widely deployed constraint solver. Gecode is the empirical evidence for success and relevance of the principled design approach of this dissertation.


Some comments
My interest right now is to step up from "just modeling" with decompositions to understand more about (and hopefully also write) "real" propagators, such as global constraints. Programming propagators is not possible in the MiniZinc language, the system I've used most the last year. Both Comet and Gecode has support for this, as well as the Java based systems Choco and JaCoP. Until now, I've only used Gecode's examples as a inspiration of and comparison to my own models, but I plan to start modeling in Gecode as well.

In view of this, Constraint Propagation - Models, Techniques, Implementation is a perfect book, explaning what propagations are and how they are implemented in Gecode. It is therefore great for understanding how Gecode works. It is also a modern introduction of the state-of-art techniques and implementation of propagators, which is one of the most important part in constraint programming.

Although quite theoretical in part (it is a dissertation in computer science after all), the book has a nice pedagogic pace and explains all core concepts very well, often with examples. Also, the proofs - there are lot of them - are often explained in "normal prose" before or after the proof, which makes it easy to follow the reasoning. (I didn't follow all the proofs, meaning that for some proofs I either didn't understand them or just glanced them through.)

It was very interesting to read the reasons, supported with benchmarks, behind some of the design decisions that was made in Gecode.

I enjoyed the "Related work" part at the end of a chapter or section, which related the discussed concepts to former research. Also, I liked that the Bibliography shows to what page a specific work is refered in the book.

I happened to read Modeling with Gecode just before Constraint Propagation - Models, Techniques, Implementation, and that was a good thing: Modeling with Gecode shows how to model in Gecode so the core concepts are laid out for the dissertation. (The Documentation page shows that a document "Programming with Gecode" is under preparation, which I look forward to read.)

Last, thanks to Mikael Zayenz Lagerkvist for recommending the book.

March 14, 2009

Solving Pi Day Sudoku 2009 with the global cardinality constraint

Here is an update of the Pi Day Sudoku 2009 problem.

After a suggestion from Mikael Zayenz Lagerkvist (of the Gecode team) the model now uses just the global cardinality constraint and is much faster than the model which used my own decomposition of all_different_except_0 (or rather all_different_except_Pi) and the count constraint.

Both of the new MiniZinc and Comet models solves the Pi Day Sudoku 2009 problem in less than a second (compared to couple of seconds before).

The Comet model has the following statistics for finding the first (and unique) solution.

time: 9
#choices = 6
#fail = 1
#propag = 1306

For Gecode/FlatZinc:

runtime: 20
solutions: 1
propagators: 35
propagations: 255
nodes: 3
failures: 0
peak depth: 2
peak memory: 158 KB

Compared to the former model, there are now much less failures.


Some more about the global cardinality constraint
Global cardinality (in Sudoku) is a way of stating how many occurrences there should be of each number per row, column or region. For the Pi Day Sudoku problem the following occurrences are needed:

Pi: 3 occurrences (coded as -1 in the model)
0: 0 occurrences (0 is a dummy value for handling the indata)
1..9: exactly one occurrence each

I MiniZinc the this is coded as

array[1..n, 1..n] of var -1..9: x;
array[-1..9] of 0..3: occ = array1d(-1..9, [3, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]);
contraint
forall(i in 1..n) ( % rows
check([x[i,j] | j in 1..n], occ)
)
...

In Comet the function cardinality is used, i,e, m.post(cardinality(occ, all(j in 1..n) x[i,j]));.


For more information about the global cardinality constraint, see the entry in Global Constraint Catalog.


A comment
Introductions to constraint programming nowadays tend to start with the Sudoko problem (or SEND + MORE = MONEY). These models often use the alldifferent constraint, and seldom, if ever, have I seen any presentation using global cardinality. One reason may be that alldifferent is easier to understand in a introduction, another reason may be that it is general a better constraint to use for these problems.

"Vanilla" Sudoko solver using global cardinality
I have compared some few "vanilla" Sudoku problems with either alldifferent or global cardinality but haven't done any systematic comparison. Sometimes the global cardinality approach is better, sometimes it is somewhat worse.

sudoku_gcc.mzn is a "vanilla" Sudoku solver in MiniZinc using the global cardinality constraint. And the same approach in Comet: sudoku_gcc.co. Note: These models includes just a few problems.


Also see: Gecode's Sudoku solver is very fast, and includes 88 different problems.

March 13, 2009

Gecode version 3.0.0 released

Gecode version 3.0.0 is released. This is great!

From the release notes:

This release is a major consolidation release: interfaces have been cleaned up (consistent parameter passing, consistent naming, simpler Gist interface, namespaces for operator overloading); some functionality has been extended (propagators can be non-monotonic; branchings support tie-breaking and random variable and value selection); some functionality that did not meet our quality goals has been removed (complete set variables, reflection); usage has been simplified (auto-linking on Windows, more commonly used filename extensions); important aspects have been optimized (memory management, memory usage and efficiency on 64bit machines). These cleanups were in particular necessary to make Gecode easier to document (this release is the first to be accompanied by tutorial documentation explaining how to model with Gecode).

Apart from that, many small fixes and additions. Please see below for the details.

As the interfaces have changed considerably, please consult How to Change to Gecode 3.0.0..

The new interface to MiniZinc is also released: Gecode/FlatZInc 1.4. This release has the following changes:

Version 1.4
* Updated to compile with Gecode 3.0.0
* Fixed conditional output items
* Print variables as required by the upcoming FlatZinc specification

I also recommend reading the nice introduction to modeling in Gecode Modeling with Gecode (PDF).

March 12, 2009

MiniZinc Challenge 2008 Results

The MiniZinc Challenge 2008 was held in the summer 2008. I don't know exactly when the result was published (probably in the last week), but here it is; MiniZinc Challenge 2008 Results.

The summary states (higher result better):


The results of the challenge are available here; congratulations go to GeCode on a very convincing win! The summary of results is given in this table:

Contestant Total Score
eclipse_fd 787.3
eclipse_ic 938.8
g12_fd 1655.1
gecode 3418.8

Congratulations to the Gecode team!

This result confirms my impression about the MiniZinc solvers: Gecode/flatzinc is my favorite solver, since it is often very fast, and also that it shows important statistics. See Pi Day Sudoku 2009 for some examples of the latter.

As of writing (2009-03-12) the links to the models and data files in the result page don't work but all the files is included in the last ROTD (release of the day).

By the way, two of the models are from my MiniZinc collection:
* debruijn_binary.mzn: de Bruijn sequences
* quasiGroup7.mzn: quasigroup problem 7

March 11, 2009

Pi Day Sudoku 2009

Brainfreeze Puzzles has a Pi day Sudoku competition:


Pi Day is a celebration of the number π that occurs every March 14 (3.14...). Math geeks all over the world celebrate their math-geekiness on this day with pie-eating contests, recitations of the digits of pi, and occasionally fundraisers where math faculty get hit in the face with pies. At Brainfreeze Puzzles we celebrate Pi Day - how else? - with a Sudoku puzzle.

...

Rules: Fill in the grid so that each row, column, and jigsaw region contains 1-9 exactly once and π [Pi] three times.

The puzzle is also here (as a PDF file)

Models

I programmed this puzzle in both Comet (using the constraint programming module) and MiniZinc using different solvers. The models use the same principle of solving the problem: a (homebrewn) all different except 0 constraint and counting exactly 3 occurrences of Pi for each row/column and region.

All the solvers give the same unique answer, which of course may mean that there is some systematic error on my part.


Comet


Comet solves the problem in about 9 seconds with the following statistics, using exploreall to make sure that it is a unique solution.

time: 8821
#choices = 9999
#fail = 16231
#propag = 23168886

Somewhat surprisingly selecting j before i in the forall loop, gives an improvment from 15 seconds to 9 seconds. This was the same labeling strategy that was used for improving Nonogram (see Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second). Here is the labeling:

// reversing i and j gives faster solution
forall(j in 1..n, i in 1..n: !x[i,j].bound()) {
tryall(v in V : x[i,j].memberOf(v)) by(v)
m.label(x[i,j], v);
onFailure
m.diff(x[i,j], v);
}

MiniZinc and Gecode/flatzinc

Interestingly, the same improvement was observed with MiniZinc and the Gecode/flatzinc solver (the interface to Gecode). Here is the labeling function, with the same swapping of i and j (I didn't know that this mattered in MiniZinc).


solve :: int_search([x[i,j] | j, i in 1..n], "input_order", "indomain", "complete") satisfy;

With the swap, the time went from 16 seconds to 10 seconds with the following statistics:

propagators: 9460
propagations: 13461717
failures: 11221
clones: 11221
commits: 31478
peak memory: 5638 KB
mzx.pl sudoku_pi.mzn --num 2 9,41s user 0,72s system 99% cpu 10,139 total

Fzntini, FlatZinc, ECLiPSe and searching for the first solution

MiniZinc's builtin solver flatzinc and fzntini (a SAT solver) only shows one solution and no statistics.

* fzntini solve problem in about 13 seconds
* flatzinc takes 3 seconds.

The ECLiPSe's MiniZinc solver ic takes 6.5 seconds using the same labeling as Gecode/flatzinc. No statistics is shown for this solver either.

For comparison both Comet and Gecode/flatzinc shows the first solutions in 2.5 seconds.

Solution

Since it is a competition I won't show my solution or the models on/via the blog. Sorry about that.

Reference

I learned about Pi Days Sudoku from the 360 blog: Pi Day Sudoku 2009.

March 05, 2009

Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second

In Comet: regular constraint, a much faster Nonogram with the regular constraint, some OPL models, and more I reported the running time of the Nonogram P200 problem (one of the hardest standard problems) as 1:30 minutes with the Comet model nonogram_regular.co (using the regular constraint).

In the last paragraph of the Nonogram section, I noted: Maybe some creative labeling or a "proper" regular constraint can fasten things up...

Well, it was "creative labeling" part that made it. After some help from Pascal Van Hentenryck (one of the creators of Comet), the time for generating all the solutions of P200 - which is unique - went to about 1 second.

The statistics for the full search of P200 is shown below. Note: the unit for "time" is milliseconds and reports the solving time excluding the startup of Comet itself, i.e. 0.627 seconds.

num_solutions: 1
time: 621
#choices = 520
#fail = 1040
#propag = 1213237
comet -j2 nonogram_regular.co 1,48s user 0,10s system 99% cpu 1,582 total

For comparison the Gecode program Nonogram displays one solution in about the same time, but searching the complete search tree takes about 45 seconds, using the following command line option:
nonogram -pk speed -solutions 0 8.

Maybe there are some more options that will make the Gecode program works faster.


What was changed?


Pascal showed me two improvements (annotated by "Pascal's improvement" in nonogram_regular.co):

The first change didn't do much difference. It was a change in the regular function. The following

forall(i in x_range) {
cp.inside(x[i], 1..S); // Do this in case it's a var.
cp.post(a[i+1] == d2[a[i], x[i]], onDomains); // Determine a[i+1].
}

was changed into

forall(i in x_range) {
cp.post(x[i] >= 1); // hakank: Pascal's improvement
cp.post(x[i] <= S); // hakank: Pascal's improvement
cp.post(a[i+1] == d2[a[i], x[i]], onDomains); // Determine a[i+1].
}

The second change, however, made the whole thing. In the using section, the labeling (search strategy) was originally stated as

forall(i in 1..rows, j in 1..cols : !board[i,j].bound()) {
tryall(v in 1..2) m.post(board[i,j] == v, onDomains);
}

Pascal's suggested that j in 1..cols and i in 1..rows should switch place, which made the whole improvement.

The labeling is now as follows, i.e. it starts with the smallest "dimension" (rows or cols).

if (rows * row_rule_len < cols * col_rule_len ) {
    forall(i in 1..rows, j in 1..cols: !board[i,j].bound()) {
      tryall(v in 1..2) by (-v)
        m.label(board[i,j],v);
    }
  } else {
    forall(j in 1..cols, i in 1..rows: !board[i,j].bound()) {
      tryall(v in 1..2) by (-v)
        m.label(board[i,j],v);
    }
  }


Thanks again Pascal for your help and inspiration!

February 28, 2009

Comet: regular constraint, a much faster Nonogram with the regular constraint, some OPL models, and more

Since the last time some more Comet model has been written.

Much faster Nonogram model using the regular constraint

In More Comet models, e.g. Nonogram, Steiner triplets, and different set covering problems I presented nonogram.co, a solver for Nonogram puzzles, and also noted that is was quite slow: Well, it is nice to have some more optimization to do (or more probably, a complete remodelling)....

Inspired by the announcement this week of the ECLiPSe example of Nonogram solver using the regular constraint (see nono_regular.ecl.txt and regular.ecl.txt) - and also my earlier Gecode/R model nonogram.rb which used "regular expressions" - I created a Nonogram model in Comet using the regular constraint: nonogram_regular.co.

Let us first look at the regular constraint.


Regular constraint
The regular constraint (see the Comet model regular.co for my implementation) is a global constraint using a DFA (deterministic finite automata) which accepts (or not accepts) an input sequence given a "transition matrix". The constraint was presented by Gilles Pesant in "A Regular Language Membership Constraint for Finite Sequences of Variables" (2004).

My implementation of the regular constraint is heavily borrowed from MiniZinc's builtin regular predicate (from lib/zinc/globals.mzn); the translation to Comet was very straightforward (omitting just some asserts).

An exemple of the usage of the constraint: We want to match the regular expression "123*21" (i.e. first "1", then "2", then zero or more "3", then "2", and last a "1"). Note: The length of the sequence is 10, so there must be 6 "3":s since "123" and "21" are "anchored" at the beginning and the end.

int len = 10;
int n_states = 5; // number of states
int input_max = 3; // the states are 1,2, and 3
int initial_state = 1; // we start with the 1 state
set{int} accepting_states = {n_states}; // This is the last state
// The transition matrix
int transition_fn[1..n_states, 1..input_max] =
[[2, 0, 0], // transitions from state 1: "1" -> state 2
[0, 3, 0], // transitions from state 2: "2" -> state 3
[0, 4, 3], // transitions from state 3: "2" -> state 4 | "3" -> state 3 (loop)
[5, 0, 0], // transitions from state 4: "1" -> state 5
[0, 0, 0]]; // transitions from state 5: END state
...
exploreall {
regular(reg_input, n_states, input_max, transition_fn, initial_state, accepting_states);
} using {
// ....
}

The unique sequence resulting from this automata is thus 1 2 3 3 3 3 3 3 2 1.

For using regular in the Nonogram problem, the automata for each row/column clue, must be built, preferably by a function. In regular.co this is done with the function make_transition_matrix, which by far was the hardest part of this problem (and surely could be written in a smoother way).

For the Nonogram clue [3,2,1] - which represents the regular expression "0*1110*110*10*" - the following automata (transition matrix) is generated:
1 2
0 3
0 4
5 0
5 6
0 7
8 0
8 9
9 0


Note that the regular function uses 0 (zero) as the failing state, so the states must start with 1. This is taken care in the model as the resulting value is subtracted by 1.

As usual in my models, the regular constraint is just a "convenience function", i.e. not using special written propagation methods etc.

The regular constraint has - of course - more applications than for Nonogram solving. I plan to look more into this.


The Nonogram solver: nonogram_regular.co
After the regular constraint and the automata generator was written, it was quite easy to change the old Nonogram solver to use these new tools. The result is in nonogram_regular.co. I was quite curious how fast this version should be compared to the former slow model. In short: it is much faster.

As comparison the Lambda instace took about 12 seconds with the old model; with the new model it takes 0.5 seconds, which is a nice improvement. I never finished a run for Nonunique problem with the older model; the new model takes 0.5 seconds (including the startup of the Comet program). Etc.

The P200 problem nonogram_p200.co now takes 2:05 minutes, which can be compared with 57 seconds for the Gecode/R model. Thus, the Comet model is still slow compared to Gecode version and the ECLiPSe version, which solves the P200 problem in just a second or two. Maybe some creative labeling or a "proper" regular constraint can fasten things up...

Update some hour later: One way to gain about 30 seconds to 1:30 minutes on the P200 problem was to explicitly state the consistency level to onDomain, e.g. cp.post(a[m] == q0, onDomains), and to use another labeling strategy:
forall(i in 1..rows, j in 1..cols : !board[i,j].bound()) {
// label(board[i,j]); // the former strategy
tryall(v in 1..2) m.post(board[i,j] == v, onDomains);
}


Some more Nonogram problems have been coded:

An aside about regular expressions
I have been very interested in regular expressions (especially the more expressive Perl type) for a long time. In 1997 I wrote a simple Perl program MakeRegex which returns a regular expression given a list of words. It was later Appletized in MakeRegexApplet. Now there are better programs/modules for this.

OPL Models

One of my projects is to translate the OPL models from Pascal Van Hentenryck "The OPL Optimization Programming Language" into Comet. Even if OPL and Comet are different languages the reading of the book has been very awarding. Thanks again, Pascal!

Some OPL models already have been published, but now I've been more systematic and started from the beginning. More models to come.

Finding: arrays in a tuple
In Comet: New models, e.g. Einstein puzzle, KenKen, Kakuro, Killer Sudoku, Stigler's Diet problem, translations of OPL models I wrote

One big difference here: In Comet it is not allowed to have an array in a tuple, so the use data must be a separate matrix.

I've found a way of using arrays in a tuple, by first initialize the values in a matrix and then - by using the all function - "slice" the values into the tuple array. This has been done in the models fixed_charge2.co and production3.co.

Example from fixed_charge2.co:

tuple productType {
int profit;
set{Machines} machines;
int[] use;
}
int use[Products, Resources] = [[3,4],[2,3], [6,4]];
productType Product[Products] =
[productType(6, {shirtM}, all(i in Resources) use[shirt,i]),
productType(4, {shortM}, all(i in Resources) use[shorts,i]),
productType(7, {pantM}, all(i in Resources) use[pants,i])];

Combining different models

The alphametic problem SEND + MOST = MONEY has the additional requirement to maximize the value of MONEY. The older model send_most_money.co does that and nothing more.

One natural extension of the problem is the following:
* first find the maximum value of MONEY
* then find all the solutions with this value.

The model send_most_money2.co has two functions:
* smm which is just the constraint for the alphametic problem
* send_most_money which has a parameter money.

send_most_money is first called with 0, indicating that is should maximize the value of MONEY, and then returns that value. The next call to send_most_money is with the calculated MONEY value, which indicates that all solutions should be generated.

The answer is

check all solutions for MONEY = 10876
x[9,7,8,2,1,0,4,6] MONEY: 10876
x[9,7,8,4,1,0,2,6] MONEY: 10876


Jim Orlin's Logic Puzzle

In Colored letters, labeled dice: a logic puzzle Jim Orlin stated a logic puzzle:
My daughter Jenn bough a puzzle book, and showed me a cute puzzle. There are 13 words as follows: BUOY, CAVE, CELT, FLUB, FORK, HEMP, JUDY, JUNK, LIMN, QUIP, SWAG, VISA, WISH.

There are 24 different letters that appear in the 13 words. The question is: can one assign the 24 letters to 4 different cubes so that the four letters of each word appears on different cubes. (There is one letter from each word on each cube.) It might be fun for you to try it. I’ll give a small hint at the end of this post. The puzzle was created by Humphrey Dudley.

...

If anyone wants to write an Excel spreadsheet and solve it via integer programming, please let me know. I’d be happy to post the Excel spreadsheet if you send it to me, or link to it if you post it and send me the URL.

This was a fun puzzle, so I modeled the problem in labeled_dice.co, and mailed Jim the link.

Some days later he wrote Update on Logic Puzzle where the contributed models were presented. There where another Comet model WordPuzzleSchaus.co by Pierre Schaus, one of Comet's developers. Pierres model use an different and more elegant approach than mine.

Jim also linked to some other logical puzzles from the great collection brownbuffalo.sourceforge.net (which have solutions in ECLiPSe/Prolog). One of these puzzles was Building Blocks, in the same vein as his labeled dice puzzle. Hence I had to make a Comet model of this problem: building_blocks.co.

He concludes the post with the following:

Incidentally, the ease for solving it using Constraint Programming makes me think that Constraint Programming should be considered a fundamental tool in the OR toolkit.


Other Comet models this week

Here are the other Comet models created/published this week. Some of them where very easy to do since they are translations of my MiniZinc models.

January 05, 2009

Tom Schrijvers: "Monadic Constraint Programming" (in Haskell)

Tom Schrijvers presents in the blog post Monadic Constraint Programming a draft version of the paper Monadic Constraint Programming written by him, Peter Stuckey, and Philip Wadler:


A constraint programming system combines two essential components: a constraint solver and a search engine. The constraint solver reasons about satisfiability of conjunctions of constraints, and the search engine controls the search for solutions by iteratively exploring a disjunctive search tree defined by the constraint program. In this paper we give a monadic definition of constraint programming where the solver is defined as a monad threaded through the monadic search tree. We are then able to define search and search strategies as first class objects that can themselves be built or extended by composable search transformers. Search transformers give a powerful and unifying approach to viewing search in constraint programming, and the resulting constraint programming system is first class and extremely flexible.


Prototype code in Haskell can be downloaded here.