My Constraint Programming Blog: March 2011 Archives

« February 2011 | Main | April 2011 »

March 31, 2011

Google or-tools has released support for LP/IP

Today Google or-tools released support for linear programming (LP) and integer programming (IP). Compiled versions of GLPK and CBC can be downloaded here.

Some small examples in

C++, integer_solver_example.cpp, linear_solver_example.cpp
Python, integer_solver_example.py, linear_solver_example.py
Java, IntegerSolverExample.cpp, LinearSolverExample.cpp

Coins grid LP version

Just to test this I converted one of my CP models coins_grid.py to a IP model: coins_grid_mip.py. This was quite straightforward, except that some constrains are not exactly like the CP interface. The model solves the problem in no time.

Here is the full model (edited)

from linear_solver import pywraplp
def main(unused_argv):
    # using GLPK
  solver = pywraplp.Solver('CoinsGridGLPK',
                          pywraplp.Solver.GLPK_LINEAR_PROGRAMMING)

  # Using CLP
  # solver = pywraplp.Solver('CoinsGridCLP',
  #                          pywraplp.Solver.CLP_LINEAR_PROGRAMMING)

  n =  31  # the grid size
  c =  14  # number of coins per row/column

  x = {}
  for i in range(n):
    for j in range(n):
      x[(i,j)] = solver.IntVar(0, 1, 'x[%i,%i]' % (i, j))

  # sum rows/columns == c
  for i in range(n):
    solver.Add(solver.Sum(
        [x[(i, j)] for j in range(n)]) == c)      # sum rows
    solver.Add(solver.Sum(
        [x[(j, i)] for j in range(n)]) == c) # sum cols

  # quadratic horizonal distance var
  objective_var = solver.Sum(
        [x[(i, j)] * (i - j) * (i - j)
          for i in range(n) for j in range(n)])

  objective = solver.Minimize(objective_var)

  solver.Solve()

  for i in range(n):
    for j in range(n):
      # int representation
      print int(x[(i, j)].solution_value()),
    print
  print

if __name__ == '__main__':
    main("coin grids")

Note: the solver outputs floats (0.0 and 1.0) but I strip this to 0 and 1 in the output.

Installation

On my Linux Ubuntu box I have to recompile the source code (from SVN version) to get this to work. Before that some things had to be changed.

Add the following two directories to LD_LIBRARY_PATH so the solver can find the .so files

(some path)/glpk-4.45/lib
(some path)/cbc-2.6.2/lib/coin

some_path is the directory where the GLPK and CBC packages are installed. Also, one have to set the proper path to GLPK and CBC in the Makefile.

After that everything worked fine.

For testing the Java models, type the following in the or-tools directory (where the Makefile is):

make run_LinearSolverExample
make run_IntegerSolverExample

Posted by hakank at 07:05 PM | Permalink

March 29, 2011

A first look at Google or-tools' Java interface

The Java interface for Google or-tools was released some week ago, and I have now ported some of my Python models to Java (about 35 models). See my Google CP Solver page / Java models for a list of them. Some of them has also been committed to the SVN repository of the or-tools project.

In A first look at Google CP Solver/Python (Google or-tools) I described the Python interface for Google or-tools. Since much of what is written there is applicable with the Java interface as well, I here concentrate on the differences between these two versions.

The porting was surprisingly easy to do for most models, since all the work of looking up the proper API call to use was already done for the Python models. See below for some of the differences.

I found very few unexpected things in the Java version. The Python interface (being slightly older) had some bugs when I first tested it, but they where fixed very fast by Laurent Perron and his colleagues.

Example: SEND + MORE = MONEY

One difference between the Java interface and the Python interface is that the Python interface has some more syntactic sugar, e.g. it recognizes the arithmetic operators (+, -, /, *), and (dis)equality: ==, !=. The Java interface has none of this which caused me some pain to translate.

For example: To get the result of two numbers (a + b == c) in Python:

  solver.Add(a + b == c);

In the Java interface one has to do use two methods (addSum and makeEquality):

  solver.addConstraint(solver.makeEquality(solver.makeSum(a, b), c));

Here is a table of corresponding operations:

Python       Java
----------------------------
a + b        makeSum(a, b)
a * b        makeProd(a, b)
a - b        makeDifference(a, b)
- a          makeOpposite(a)
a / b        makeDiv(a, b)
a == b       makeEquality(a, b)
a != b       makeDifference(a, b)

Example: the usual core of the SEND + MORE == MONEY problem can be written in a Python model (and similar in other constraint solver systems):

# define variables etc
# ...
#
solver.Add(      s*1000 + e*100 + n*10 + d +
                 m*1000 + o*100 + r*10 + e ==
           m*10000 + o*1000 + n*100 + e*10 + y

However, using this approach is not really doable with the Java interface, since it would require a lot of makeSum and makeProd, etc. A better way (and this is probably a general advice for the Python interface as well) is to instead use a scalar product: makeScalProd or makeScalProdEquality. Here is one way, where we also see how the decision variables is declared.

Solver solver = new Solver("SendMoreMoney");

int base = 10;
IntVar s = solver.makeIntVar(0, base - 1, "s");
IntVar e = solver.makeIntVar(0, base - 1, "e");
IntVar n = solver.makeIntVar(0, base - 1, "n");
IntVar d = solver.makeIntVar(0, base - 1, "d");
IntVar m = solver.makeIntVar(0, base - 1, "m");
IntVar o = solver.makeIntVar(0, base - 1, "o");
IntVar r = solver.makeIntVar(0, base - 1, "r");
IntVar y = solver.makeIntVar(0, base - 1, "y");

IntVar[] x = {s,e,n,d,m,o,r,y};

IntVar[] eq = {s,e,n,d,  m,o,r,e, m,o,n,e,y};
int[] coeffs = {1000, 100, 10, 1,            //    S E N D +
                1000, 100, 10, 1,            //    M O R E
                -10000, -1000, -100, -10, -1 // == M O N E Y
};
solver.addConstraint(
      solver.makeScalProdEquality(eq, coeffs, 0));
// s > 0
solver.addConstraint(
      solver.makeGreater(s, 0));
// m > 0
solver.addConstraint(
      solver.makeGreater(m, 0));
// all different
solver.addConstraint(
      solver.makeAllDifferent(x, true));

This approach, using scalar products, is implemented in SendMoreMoney.java. In SendMoreMoney2.java I experimented with some different approaches of coding this problem, some quite hairy.

Search, labeling, output

There is not really much difference how to do search and labeling in the Java version compared to the Python version. Here is the search section of SendMoreMoney.java:

DecisionBuilder db = solver.makePhase(x,
                                       solver.INT_VAR_DEFAULT,
                                       solver.INT_VALUE_DEFAULT);
solver.newSearch(db);
while (solver.nextSolution()) {
   for(int i = 0; i < 8; i++) {
     System.out.print(x[i].toString() + " ");
   }
   System.out.println();
 }
solver.endSearch();

// Statistics
System.out.println();
System.out.println("Solutions: " + solver.solutions());
System.out.println("Failures: " + solver.failures());
System.out.println("Branches: " + solver.branches());
System.out.println("Wall time: " + solver.wall_time() + "ms");

One thing that is different, is how to define the variables for labeling. In Python it is quite easy to concatenate two arrays of decision variables like this:

solver.makePhase(x + y, ...);

But this don't work as easy in Java (being Java). I tend to call such a labeling array for all and fill in all the IntVar from x and y; there are other way of doing this of course. On the other hand, the Python interface cannot handle labeling on matrices (multidimensional tuples), and in these models I also had to rely on a flattened "view" of this matrix (often called x_flat or something like that).

Python also has niftier shortcuts for creating temporary arrays using array comprehensions. For example in the Python version of set_covering.py one can write the following:

# ensure that all cities are covered
for i in range(num_cities):
   b = [x[j] for j in range(num_cities) if distance[i][j] <= min_distance]
   solver.Add(solver.SumGreaterOrEqual(b, 1))

This I have translated to Java code using an ArrayList instead (SetCovering.java):

// ensure that all cities are covered
for(int i = 0; i < num_cities; i++) {
  ArrayList b = new ArrayList();
  for(int j = 0; j < num_cities; j++) {
    if (distance[i][j] <= min_distance) {
      b.add(x[j]);
    }
  }
  solver.addConstraint(
      solver.makeSumGreaterOrEqual(b.toArray(new IntVar[1]), 1));
}

Decompositions, and somewhat under the hood

As far as I know, there is not (yet) support for modulo operations with decision variables (IntVar's). Since modulo is a favorite operator of mine, I implemented a version in DivisibleBy9Through1.java which corresponds quite closely to the Python version divisible_by_9_through_1.py. Here is the Python version:

def my_mod(solver, x, y, r):

    if not isinstance(y, int):
        solver.Add(y != 0)

    lbx = x.Min()
    ubx = x.Max()
    ubx_neg = -ubx
    lbx_neg = -lbx
    min_x = min(lbx, ubx_neg)
    max_x = max(ubx, lbx_neg)

    d = solver.IntVar(min_x, max_x, 'd')    

    if not isinstance(r, int):
        solver.Add(r >= 0)
        solver.Add(x*r >= 0)

    if not isinstance(r, int) and not isinstance(r, int):
        solver.Add(-abs(y) < r)
        solver.Add(r < abs(y))
        
    solver.Add(min_x <= d)
    solver.Add(d <= max_x)
    solver.Add(x == y*d+r)

The Java version is not very different, however I simplified it somewhat and requires that all arguments are IntVar's:

public static void my_mod(Solver solver, IntVar x, IntVar y, IntVar r) {
        
  long lbx = x.min();
  long ubx = x.max();
  long ubx_neg = -ubx;
  long lbx_neg = -lbx;
  int min_x = (int)Math.min(lbx, ubx_neg);
  int max_x = (int)Math.max(ubx, lbx_neg);

  IntVar d = solver.makeIntVar(min_x, max_x, "d");
    
  // r >= 0
  solver.addConstraint(
        solver.makeGreaterOrEqual(r,0));

  // x*r >= 0
  solver.addConstraint(
      solver.makeGreaterOrEqual(
          solver.makeProd(x,r).Var(), 0));

  // -abs(y) < r
  solver.addConstraint(
      solver.makeLess(
          solver.makeOpposite(
              solver.makeAbs(y).Var()).Var(), r));

  // r < abs(y)
  solver.addConstraint(
      solver.makeLess(r,
          solver.makeAbs(y).Var().Var()));

  // min_x <= d, i.e. d > min_x
  solver.addConstraint(
        solver.makeGreater(d, min_x));


  // d <= max_x
  solver.addConstraint(
        solver.makeLessOrEqual(d, max_x));

  // x == y*d+r
  solver.addConstraint(
        solver.makeEquality(x,
            solver.makeSum(
               solver.makeProd(y,d).Var(),r).Var()));
}

Note: This implementation is based on the ECLiPSe version mentioned in A Modulo propagator for ECLiPSE. Here is the ECLiPSe Prolog source code: modulo_propagator.ecl.

Another example is the decomposition of the global constraint circuit (from Circuit model) based on some observations about the "orbit" of x.

public static void circuit(Solver solver, IntVar[] x) {

  int n = x.length;
  IntVar[] z = solver.makeIntVarArray(n, 0, n - 1, "z");

  solver.addConstraint(
      solver.makeAllDifferent(x, true));
  solver.addConstraint(
      solver.makeAllDifferent(z, true));

  // put the orbit of x[0] in z[0..n-1]
  solver.addConstraint(
        solver.makeEquality(z[0], x[0]));
  for(int i = 1; i < n-1; i++) {
    solver.addConstraint(
        solver.makeEquality(z[i], 
            solver.makeElement(x, z[i-1]).Var()));
  }

  // z may not be 0 for i < n-1
  for(int i = 1; i < n - 1; i++) {
    solver.addConstraint(
          solver.makeNonEquality(z[i], 0));
  }
        
  // when i = n-1 it must be 0
  solver.addConstraint(
        solver.makeEquality(z[n - 1], 0));

}

Some benchmark: n-queens

Below is some statistics for the the n-queens problem, implemented in NQueens2 model. The constraint section is:

IntVar[] q = solver.makeIntVarArray(n, 0, n-1, "q");

solver.addConstraint(solver.makeAllDifferent(q, true));

IntVar[] q1 = new IntVar[n];
IntVar[] q2 = new IntVar[n];
for(int i = 0; i < n; i++) {
  q1[i] = solver.makeSum(q[i], i).Var();
  q2[i] = solver.makeSum(q[i], -i).Var();
}
solver.addConstraint(
    solver.makeAllDifferent(q1, true));
solver.addConstraint(
    solver.makeAllDifferent(q2, true));

Here is the timings (for one solution):

     N    Failures       Branches      Wall time Required
                                                 memory
  -------------------------------------------------------
     8      3             9              2ms
    10      5            16              3ms
    50     15            70              4ms
   100     12           120             10ms
   200      1           198             22ms
   500      2           496            126ms
   501      6           503            128ms
  1000      0           996            504ms 
  2000      0          1995           2164ms
  3000      2          3001           5119ms
  4000      0          3999           9493ms
  5000      1          4994          15803ms
  6000      1          5996          24511ms
  7000      0          6997          34354ms   1.1Gb
  8000      0          7996          46985ms   1.5Gb
  9000      0          8996          62905ms   1.8Gb
 10000      9          9999          81312ms   2.3Gb

I ran these tests on a Linux Ubuntu 10.4 Lynx, with 12Gb RAM, 8 cores, (Intel Core i7 930 @ 2.80GHz). Note: the "Wall time" is the time reported by the solver, and the "Required memory" is manually via the "top" command for the running process.

I wondered if the Python version nqueens3.py that use the same constraints and labeling as the Java version should have any different result. Well, it has exactly the same failures and branches as the Java version (no surprise since it use the same solver engine). Also the wall times and memory usage are almost identical.

See also these blog posts, all about the Python interface:

Some kind of conclusion

As mentioned above, it was quite easy to port the Python models to Java since they share a lot of common API (with some difference in details). If there is no requirement about the language to use, I personally would probably prefer the Python interface since it's more high level and is less verbose than the Java interface (I like brevity).

Maybe (repeat maybe) I will also have a take on Google or-tools' C++ interface.

My Google or-tools Java models

Here are my Java models. All of them where ported from existing Python models. They are also available at Google CP Solver page / Java models (the Python models is available from the same page). For implementation of these problems in other constraint programming systems, see Common constraint programming problems.

AllDifferentExcept0.java: A decomposition of the global constraint alldifferent_except_0
AllInterval.java: All interval problem (CSPLib #7)
Circuit.java: A decomposition of the global constraint circuit
CoinsGrid.java: Tony Hurlimann's coin grid problem
CoveringOpl.java: Set covering (OPL)
Crossword.java: Simple crossword problem
DeBruijn.java: "Arbitrary" de Bruijn sequences
Diet.java: Simple diet optimization problem
DivisibleBy9Through1.java: Divisible by 9 through 1 for a "truncated" number. Includes a modulo decomposition.
LeastDiff.java: Least difference problem (minimize the difference of ABCDE-FGHIJ, where all digits A..J are different)
MagicSquare.java: Magic Square
Map.java: Simple map coloring problem
Map2.java: Simple map coloring problem, alternative version
Minesweeper.java: Minesweeper solver (uses the same data file as for the Python model minesweeper.py, see above)
NQueens.java: n-queens problem
NQueens2.java: n-queens problem, faster version
QuasigroupCompletion.java: Quasigroup completion (uses the same data file as for the Python model quasigroup_completion.py, see above)
SendMoreMoney.java: Solve SEND+MORE=MONEY
SendMoreMoney2.java: Solve SEND+MORE=MONEY (5 different encodings).
SendMostMoney.java: Solve SEND+MOST=MONEY, and show all solutions with maximal value of MONEY.
Seseman.java: Seseman convent problem.
SetCovering.java: Set covering, placing of firestations (Winston)
SetCovering2.java: Set covering, number of security telephons on a campus (Taha)
SetCovering3.java: Set covering, senators making a committe (Murty)
SetCovering4.java: Set covering (and set partiton), work scheduling (Lundgren, Roennqvist, Vaerbrand "Optimeringslaera")
SetCoveringDeployment.java: Set covering deployment
StableMarriage.java: Stable Marriage problem (from Van Hentenryck's OPL book and more problem instances)
Strimko2.java: Strimko puzzle (Latin squares with "streams").
Sudoku.java: Sudoku
SurvoPuzzle.java: Survo puzzle (uses the same data file as for the Python model survo_puzzle.py, see above)
ToNum.java: Converts inter to/from an array of digits
WhoKilledAgatha.java: Who killed agatha? (The Dreadsbury Mansion Murder Mystery, an automated reasoning problem)
Xkcd.java: Xkcd knapsack problem.
YoungTableaux.java: Young tableaux and partitions

Posted by hakank at 10:18 PM | Permalink

March 21, 2011

Version 1.3.2 of G12 MiniZinc released

Version 1.3.2 of MiniZinc has been released.

From the NEWS:

G12 MiniZinc Distribution 1.3.2
-------------------------------

Bugs fixed in this release:

* We have fixed a series of problems in mzn2fzn and solns2out with flattening and printing of expressions that contain anonymous variables. (This also resolves bug #67.)

* A bug that caused mzn2fzn to abort if an annotation contained a string literal argument has been fixed. [Bug #212]

* A problem with mzn2fzn that caused it to sometimes not print error messages on Windows XP, despite errors being present, has been fixed. [Bugs #97 and #215]

Other changes in this release:

* The distribution now includes a language syntax definition for the Zinc family of languages for use with GtkSourceView. The definition is in the directory tools/gtksourceview.

And here is the release notes for version 1.3.1 (released some weeks ago):

G12 MiniZinc Distribution 1.3.1
-------------------------------
Bugs fixed in this release:

* The CP-Viz support now correctly renders solutions for optimisation problems.

Posted by hakank at 07:08 AM | Permalink

March 06, 2011

A matching problem, a related seating problem, and some other seating problems

The INFORMS blog challenge the last month (February) was O.R. and Love. The entry by Capgemini O.R. Blog Team was Cupids with Computers where a bunch of OR'ers (including me) was matched according to some - not exactly stated - compatibility scores. It is both interesting and fun to read. Also, see their companion blog post Two Deadly Sins that discuss some of the problems with matching.

I was curious how well a Constraint Programming system like MiniZinc would do on this matching problem so I wrote the model or_matching.mzn.

And it did quite well: it (Gecode/fz) solves the problem in 0.1 seconds (with 191 failures). Some comments about the model:

Since few CP system have good support for floats I converted the scores to integer (x 10), i.e. 3.7 is represented as 37 etc. However, this can lead to some errors since I assume that the scores are rounded.
it also identify the left out person which is just for fun.

The result is the following matchings (slightly edited):

Total score: 519

P1 P2 Score
--------------
 1  9: 92
 5 14: 69 (tie)
10 18: 69 (tie)
 8 16: 59
 2 19: 55
 3 12: 52
 6 11: 44
13 15: 42
 4  7: 37

Left out: 17

One reason that I modeled the problem was to see if it was possible to identify the OR'ers, or at least myself, and with some assumptions I have a some guesses. The assumptions is that the matchings in the blog post are presented from best match (score) and then ordered in score order.

My model includes a symmetry breaking which place the person with the lower index in the left position of the pair which means that I can just identify a pair. Also the second and third pair has a tie so it is even harder to guess these.


Ranking                                     My guess (pair)

--------------------------------------------------------------------

- Anna Nagurney and Laura McLay:              1,9

- Ian Frommer and Paul Rubin:                 5,14 or 10,18 (the score is a tie)

- Michael Trick and Tallys Yunes:             5,14 or 10,18 (ibid.)

- Larry D'Agostino and Robert Randall:        8,16

- David Smith and Thiago Serra:               2,19

- Francisco Marco-Serrano and Niclola Petty:  3,12

- John Angelis and Michale Watson:            6,11

- John Poppelaars and Patricia Randall:       13,15

- Hakan Kjellerstrand and Phillip Mah:        4,7



- Shiva Subramanian (the leftout):            17

So, based on this, I guess that I am either #4 or #7.

Let us further investigate this. In Two Deadly Sins there is some smaller score matrices where I happen to be named:

 Hakan John Patricia Phillip Shiva
  -     3.7   2.2     2.2     1.3   Hakan
  -      -    4.4     2.2     1.8   John
  -      -    -       4.2     2.7   Patricia
  -      -    -       -       2.2   Philip
  -      -    -       -       -     Shiva

Now assume that this is the same score and names as in the main matrix. My (Hakan's) scores for John, Patricia, Phillip, and Shiva, are 3.7, 2.2, 2.2, and 1.3 respectively, and these value can be found for #4 but not for #7. There is just one score 1.3 and it is for #4 and #17 so we might draw the conclusion that #17 is Shiva.

However, since Hakan is #4 and is connected to #7, this would mean that Philip should be #7 if my matchings are correct. But it can't be since there is no 4.4 score for Hakan and Philip. Hmmm.

OK. If we skip my ranking and instead look at the score matrices we can see that there is one 1.8 score, between John and Shiva (in the small score matrix), and #17 and #7. So this means that #7 is John. And if #7 is John it makes more sense since there is a score of 3.7 between Hakan (#4) and John (#7) in the big score matrix. John and Patricia has a score of 4.4, and the only 4.4 for #7 is #13 so Patricia is #13. And the last one to find is Philip which is #11 given by the score 4.2 with Patricia.

So now we have found the following identifies from the small scores:


Hakan: #4

John: #7

Patricia: #13

Philip: #11

Shiva: #17

And let us compare these findings with my matchings.

Hakan: #4. This seems to be correct. But the matching to #7 is not Philip (he is #11).
John: #7. Now, there are two Johns but neither of them has been attributed to #7 in my matching, so this is plain #fail.
Patricia: #13. My matching matched Patricia with John Poppelaars and is could be correct. This means that the "other John" is #15 (according to my matching)
Philip: #11. Well, this is the same #fail as for Hakan.
Shiva: #17. This seems to be correct.

Well, my method of identifying the persons with the matching seems to be quite wrong, but it was fun trying.

Back to the model and some more details. The kernel of the model is really just the following code, including some symmetry breaking constraints.


% decision variables:

% the matchings

array[1..m, 1..2] of var 1..n: x; 

% the scores for the found matching

array[1..m] of var 0..max_score: s; 



constraint

  alldifferent([x[i,j] | i in 1..m, j in 1..2]) :: domain

;



constraint

  forall(i in 1..m) (

     s[i] = scores[x[i,1], x[i,2]] 



     % symmetry breaking

     /\ x[i,1] <= x[i,2] 

  )

;



% another symmetry breaking

constraint decreasing(s);

I optimized this model for Gecode/fz using these two things found by some experimentation:

branching is on the score array s instead of the seating array x
alldifference has the :: domain annotation. Without this, it takes much longer.

These two makes Gecode solve the problem quite fast: Gecode/fz: 0.16s, 191 failures

However, for other solvers the branching on the score array s is not so good so here are the statistics for branching on the array x instead, and using instead the variable/value heuristics: first_fail/indomain_median which seems to give the best overall result for most solvers.

Gecode/fz: 4.5s, 98469 failures
G12/LazyFD: 2.16s
fzn2smt: 6.0s
fzntini: 16:02 minutes (I'm not sure why this is so hard for fzntini)
G12/fd: 10.127s, 189346 choice points
G12/minizinc: 10.465s, 189346 choice points
G12/fdmip: 10.226s, 189346 choice points
JaCoP/fz: 10.358s, 192957 wrong decisions
SICStus/fz: 18.419s, 362122 backtracks
ECLiPSe/ic: 48.313s
ECLiPSe/fd: 19.557s

Update 2011-03-08
Julien Fischer and Sebastian Brand (of G12 team) was kind to point out that there is a better way of getting the identification of the leftout:

constraint
   if n mod 2 != 0 then
     % This version was suggested by 
     % Sebastian Brand and Julien Fischer.
     leftout > 0 /\
     forall(j in 1..m, k in 1..2) (
        leftout != x[j,k]
     )
   else 
     leftout = 0
   endif;

Below is shown the statistics using this version instead. Julien also pointed out that G12/fd and G12/minizinc is really the same solver and - for this problem - G12/fdmip doesn't make use of the MIP component so it's the same as G12/fd. So I skipped these two latter in this test. The branching is (still) first_fail/indomain_median

    * Gecode/fz: 3.3s, 98469 failures
    * G12/LazyFD: 1.5s
    * fzn2smt: (I stopped after 10 minutes.)
    * fzntini: 5:52 minutes
    * G12/fd: 6.4s, 107178 choice points
    * JaCoP/fz: 5.4s, 192975 wrong decisions
    * SICStus/fz: 7.5s, 362122 backtracks
    * ECLiPSe/ic: 35.5s
    * ECLiPSe/fd: 11.3s

This change boosted the times for most solvers, quite much for some. It's strange that fzn2smt was much slower, and I don't understand why. For fzn2smt I also tested different things, e.g. to removed the :: domain annotation on alldifferent but it made no difference.

Testing indomain_random
I was also curious how some solvers would do with the heuristic indomain_random, so I ran each solver three times and took the best result. Some solvers was "stable" and got exactly the same statistics (Gecode, G12/fd, SICStus, ECLiPSe/ic), but some of them got quite varying result and all three results are shown (JaCoP and ECLiPSe/fd).

    * Gecode/fz: 1.7, 27422 failures (stable)
    * G12/fd: 19.0s, 1401656 choice points (stable)
    * JaCoP/fz: 2.3s, 22178 wrong decisions
        8.5s 164599 wrong decisions
        4.1s 50575 wrong decisions
        2.3s 22178 wrong decisions
    * SICStus/fz: 3.3s, 116386 backtracks (stable)
    * ECLiPSe/ic: 34.5s (stable)
    * ECLiPSe/fd: 5.6s
       7.41s
       5.6s
      26.7s

Almost all of the solver got at least one result better than before, except G12/fd which is surprising.
End of Update 2011-03-08

2011-03-12 Yet another update
After some interesting discussions with Sebastian Brand (thank you, Sebastian!) I have included some improvements.
1) Since the calculation of the leftout don't contribute much to the constraint store, it has been removed as a decision variable and just calculated in the output section:


output [

  let { 

     int: leftout = max(0..n diff { fix(x[j,k]) | j in 1..m, k in 1..2 }) 

    }

  in

  % ...

2) The symmetry breaking constraint to sort the matchings in score order

  decreasing(s)

has been replaced with another symmetry breaking constraint which seems to be faster. Unfortunately, there is no sorting anymore but since it is faster for all solvers I prefer it.

  forall(i in 2..m)( x[i-1,1] < x[i,1] )

3) The branching has been changed to first_fail/indomain_split

With these three changes there are now improvements of most solvers; for ECLiPSe's solvers a radical improvement (compare with the listing above):

- G12/lazy: 0.89s
- Gecode: 0.10s and 649 failures
Note: without the ::domain annotation on alldifferent it's slighly worse: 0.16s and 1898
That difference has been much larger in earlier versions.
- JaCoP: 1.4s: 3253 wrong search decisions
- SICStus: 0.29s 3291 backtracks
- Choco: 1.8s 5928 backtracks
- ECLiPSe/ic: 1.6s
- ECLiPSe/fd: 0.94s
- smt2fzn is still very slow (I contacted the fzn2smt developers about this)
- tini: 4:44min (slightly better, but still very slow)

4) Sebastian also suggested that I should test a table constraint. That is, instead of


  s[i] = scores[x[i,1], x[i,2]]

we use this:


   table([x[i,1], x[i,2], s[i]], scores_table)

This also requires another representation of the score matrix:


scores_table = array2d(1..n*n, 1..3,

  [

    1, 1, 00,

    1, 2, 30,

    1, 3, 10,

    1, 4, 33,

    1, 5, 49,

    % ....

Here is the result of using this table approach (together with the previous changes):

- G12/fd: 2.0s, 708 choice points (slower)
- G12/lazy: 2.442s (slower)
- Gecode: 0.11s 686 failures (about the same)
- JaCoP: 1.2s 785 wrong search decisions (slightly better)
- SICStus: 0.43 746 backtracks (better in backtracks, but slightly slower)
- Choco: 1.9s: 2299 backtracks (better in backtracks, but slightly slower)
- ECLiPSe/ic: 2.8s (slower)
- ECLiPSe/fd: 1.4s (slower)
- fzn2smt: 11.1s (and SMT is back in the game again!)
Note: Using the decreasing/1 constraint instead, it takes 7.0s.
- tini: stopped after 10 minutes, so it's slower than the previous version.

To sum: Most solvers where a tiny bit slower with this table approach, and some have slightly less choice points/failures/etc. Hmmm, maybe this (slight) increment of time for some solvers is caused by the slightly larger file to process by mzn2fzn?

The big winner here is SMT, but for this problem it's still slower than most other solvers.

I've put the improvements in the same model as before or_matching.mzn (the older model is now in or_matching_orig.mzn).

Also, thanks to Krzysztof Kuchcinski for an interesting discussion about this.

End of update 2011-03-12

A related seating problem

This matching problem inspired me also to model a related problem using the score matrix from the matching model:

When all these persons meet, what is the best way to place them around a (topological circular) table if the objective is to get the highest possible compatibility score for the adjacent seated persons.

Another MiniZinc model (or_seating.mzn) gives an optimal value of 996 score points, in about 1.5 seconds (with 17629 failures) using Gecode.

There are some possible way of approaching this problem, and I took the TSP way using a global constraint circuit. This model use my decomposition of a circuit constraint which, given the score matrix, gives an optimal circuit. However, what we really want is the path (the seating), so we have do extract the path. After a while I realized that the temporary array z used in the circuit_me decomposition is exactly this path, so it's just to use it in the parameter of the predicate. A consequence of this is that all seatings ends with person #1, which shouldn't really be a problem since the path is a permutation.





predicate circuit_me(array[int] of var int: x, array[int] of var int: z) =

  let { 

        int: lbx = min(index_set(x)),

        int: ubx = max(index_set(x))

  }

  in

   all_different(x) :: domain /\

   all_different(z) :: domain  /\



   % put the orbit of x[1] in in z[1..n]

   z[lbx] = x[lbx] /\ 

   forall(i in lbx+1..ubx) (

      z[i] = x[z[i-1]]

   )

   /\ % may not be 1 for i < n

   forall(i in lbx..ubx-1) (

      z[i] != lbx

   )

   /\  % when i = n it must be 1

   z[n] = lbx

;

Here are the two optimal solutions found by this model (with the implicit symmetry breaking that #1 is always the last position). However, the second solution is just the first reversed.


9 10 18 8 16 13 15 12 3 17 19 2 11 6 14 5 7 4 1

4 7 5 14 6 11 2 19 17 3 12 15 13 16 8 18 10 9 1

For the systems that support the circuit/1 constraint (right now only Gecode and JaCoP support this), there is a variant: one can use circuit to get the circuit and then extract the path (seating) with this "extractor" predicate where the solution always start with #1.


predicate circuit_path(array[int] of var int: x, 

                       array[int] of var int: p) =

   % we always starts the path at 1

   p[1] = 1

   /\

   forall(i in 2..n) (

     p[i] = x[p[i-1]] 

   );

Then we got these two optimal solutions, which is just permutations of the two above.


1 9 10 18 8 16 13 15 12 3 17 19 2 11 6 14 5 7 4

1 4 7 5 14 6 11 2 19 17 3 12 15 13 16 8 18 10 9

Connection to the matching problem:
Comparing these seating placements with the matching above, it seems to be correct, e.g. #9 and #1 sit together, as do #18 and #6, #2 and #19, etc.

Some other seating problems

For some reason I am fascinated by seating problems and I'm not really alone. John Toczek's PuzzlOr problem Best Host this February was a seating problem (the object is to minimize seating conflicts given that some persons would like to sit together) that I solved using a rather simple MiniZinc model. However, I will not link to the model since the contest (to win a “O.R. The Science of Better” T-shirt) is still open. And let us not forget Karen Petrie's Wedding planning problem: How a computer scientist organised her wedding.

Here are two (classical) combinatorial seating problems, the formulation taken from Fei Dai, Mohit Dhawan, Changfu Wu and Jie Wu: On Solutions to a Seating Problem, page 1:

Seating, row
MiniZinc model seating_row.mzn

n couples are seated in one row. Each person, except two seated at the two ends, has two neighbors. It is required that each neighbor of person k should either have the same gender or be the spouse of k.

Seating, table:
MiniZinc model seating_table.mzn

A classical [...] seating problem is the following: n couples are seated around a table in such a way that two neighbors of each person k are the different gender of k and they are not the spouse of k.

Both these problems is about constraints on couples and gender. There are different way of representing who is in a couple with "anonymous persons", i.e. using just integers to represent the couples and individuals. Here are two variants (I included both of these in the seating_row model):

1) Using math:


predicate is_couple1(int: n, var int: a, var int: b) =

  % exists(i in 1..n) (

  exists(i in max(lb(a), lb(b))..min(ub(a), ub(b))) (

     (2*(i-1)+1 = a /\ 2*(i-1)+2 = b) 

     \/

     (2*(i-1)+1 = b /\ 2*(i-1)+2 = a) 

  );

2) Using set representation, i.e. an array of the pairs:


% define the couple

array[1..n] of set of int: couples = [{2*(i-1)+1, 2*(i-1)+2} | i in 1..n];



predicate is_couple(int: n, var int: a, var int: b, array[int] of set of int: c) =

  exists(j in 1..n) (

     a in c[j] /\ b in c[j]

  );

The second approach of using sets, is in my opinion the nicest, and it's also easier to use if one has a concrete list of couples.

The gender is simply represented by modulo 2.

The paper "On Solutions to a Seating Problem" (cited above) studies "reasonable patterns":

A reasonable pattern is an assignment of n couples without indices that meet the above requirement. For example, when n = 3 all reasonable patterns are as follows:

wwwhhh whhwwh whhhww hhwwwh
hhhwww hwwhhw hwwwhh wwhhhw

However, I am more interested in the concrete seating placements so it's not quite the same problems. For n=3 there is a total of 60 patterns. Here are all 8 different patterns generated for 3 pairs (n=3). The first number indicates the occurrences of each pattern.

12 gender2:hhhwww
 6 gender2:hhwwwh
 6 gender2:hwwhhw
 6 gender2:hwwwhh
 6 gender2:whhhww
 6 gender2:whhwwh
 6 gender2:wwhhhw
12 gender2:wwwhhh

I tried to generate exactly one of each pattern but had no luck. A model with the following simple symmetry breaking generates 20 solutions, less than 60 but more than exactly 8 patterns.


constraint (x[1] = 1 \/ x[1] = 2);

Statistics for Seating, row problem
Here are the number of solutions (using no symmetry breaking) for some different number of couples (n). The solver statistics below is for Gecode/fz where just the last solutions is printed (it took about 2.5 minutes to print the solutions for n=6, without printing it took just 15.7 seconds).


n  #solutions

-------------

1          2  (0.001s, 0 failures)

2          8  (0.001s, 5 failures)

3         60  (0.01s, 73 failures)

4        816  (0.02s, 1181 failures)

5      17520  (0.45s, 29872 failures)

6     550080  (15.7s, 1051417 failures)

7   23839200  (12:36min, 49685576 failures)

The sequence of these numbers seems to correspond to the OEIS Integer Sequence A096121 which is described as A rook must land on each square exactly once, but may start and end anywhere and may intersect its own path. Inspired by Leroy Quet in a Jul 05 2004 posting to the Seqfan mailing list.

Well, I'm not sure if there is actually a deep connection between these two problems or just a freaky coincidence...

Statistics for Seating, table problem
For the seating table problem with 3 couples (n=3) there are 12 solutions (two different patterns). Here are the number of solutions for different n.


n  #solutions

-------------

1         0  (0s, 0 failures)

2         0  (0s, 4 failures)

3        12 (0s, 0 failures)

4        96 (0.01s, 36 failures)

5      3120 (0.04s, 340 failures)

6    115200 (0.9s,  4436 failures)

7   5836320 (43.4s, 146938 failures)

8 382072320 (47:28minutes, 6593012)

The related OEIS sequence is A059375 (Total menage numbers 2*n!*A000179(n)) (the sequence starts with 1 which is the value for n=0, but this is weird in our current setting). This problem is also known as the Menage problem (Wikipedia) so the correspondence between these numbers is not surprising at all. Also see Wolfram MathWorld: Married Couples Problem.

Posted by hakank at 11:23 AM | Permalink

My Constraint Programming Blog