Main

March 06, 2011

A matching problem, a related seating problem, and some other seating problems

The INFORMS blog challenge the last month (February) was O.R. and Love. The entry by Capgemini O.R. Blog Team was Cupids with Computers where a bunch of OR'ers (including me) was matched according to some - not exactly stated - compatibility scores. It is both interesting and fun to read. Also, see their companion blog post Two Deadly Sins that discuss some of the problems with matching.

I was curious how well a Constraint Programming system like MiniZinc would do on this matching problem so I wrote the model or_matching.mzn.

And it did quite well: it (Gecode/fz) solves the problem in 0.1 seconds (with 191 failures). Some comments about the model:
  • Since few CP system have good support for floats I converted the scores to integer (x 10), i.e. 3.7 is represented as 37 etc. However, this can lead to some errors since I assume that the scores are rounded.
  • it also identify the left out person which is just for fun.
The result is the following matchings (slightly edited):
Total score: 519

P1 P2 Score
--------------
 1  9: 92
 5 14: 69 (tie)
10 18: 69 (tie)
 8 16: 59
 2 19: 55
 3 12: 52
 6 11: 44
13 15: 42
 4  7: 37

Left out: 17
One reason that I modeled the problem was to see if it was possible to identify the OR'ers, or at least myself, and with some assumptions I have a some guesses. The assumptions is that the matchings in the blog post are presented from best match (score) and then ordered in score order.

My model includes a symmetry breaking which place the person with the lower index in the left position of the pair which means that I can just identify a pair. Also the second and third pair has a tie so it is even harder to guess these.

Ranking My guess (pair)
--------------------------------------------------------------------
- Anna Nagurney and Laura McLay: 1,9
- Ian Frommer and Paul Rubin: 5,14 or 10,18 (the score is a tie)
- Michael Trick and Tallys Yunes: 5,14 or 10,18 (ibid.)
- Larry D'Agostino and Robert Randall: 8,16
- David Smith and Thiago Serra: 2,19
- Francisco Marco-Serrano and Niclola Petty: 3,12
- John Angelis and Michale Watson: 6,11
- John Poppelaars and Patricia Randall: 13,15
- Hakan Kjellerstrand and Phillip Mah: 4,7

- Shiva Subramanian (the leftout): 17

So, based on this, I guess that I am either #4 or #7.

Let us further investigate this. In Two Deadly Sins there is some smaller score matrices where I happen to be named:
 Hakan John Patricia Phillip Shiva
  -     3.7   2.2     2.2     1.3   Hakan
  -      -    4.4     2.2     1.8   John
  -      -    -       4.2     2.7   Patricia
  -      -    -       -       2.2   Philip
  -      -    -       -       -     Shiva
Now assume that this is the same score and names as in the main matrix. My (Hakan's) scores for John, Patricia, Phillip, and Shiva, are 3.7, 2.2, 2.2, and 1.3 respectively, and these value can be found for #4 but not for #7. There is just one score 1.3 and it is for #4 and #17 so we might draw the conclusion that #17 is Shiva.

However, since Hakan is #4 and is connected to #7, this would mean that Philip should be #7 if my matchings are correct. But it can't be since there is no 4.4 score for Hakan and Philip. Hmmm.

OK. If we skip my ranking and instead look at the score matrices we can see that there is one 1.8 score, between John and Shiva (in the small score matrix), and #17 and #7. So this means that #7 is John. And if #7 is John it makes more sense since there is a score of 3.7 between Hakan (#4) and John (#7) in the big score matrix. John and Patricia has a score of 4.4, and the only 4.4 for #7 is #13 so Patricia is #13. And the last one to find is Philip which is #11 given by the score 4.2 with Patricia.

So now we have found the following identifies from the small scores:
Hakan: #4
John: #7
Patricia: #13
Philip: #11
Shiva: #17
And let us compare these findings with my matchings.
  • Hakan: #4. This seems to be correct. But the matching to #7 is not Philip (he is #11).
  • John: #7. Now, there are two Johns but neither of them has been attributed to #7 in my matching, so this is plain #fail.
  • Patricia: #13. My matching matched Patricia with John Poppelaars and is could be correct. This means that the "other John" is #15 (according to my matching)
  • Philip: #11. Well, this is the same #fail as for Hakan.
  • Shiva: #17. This seems to be correct.
Well, my method of identifying the persons with the matching seems to be quite wrong, but it was fun trying.

Back to the model and some more details. The kernel of the model is really just the following code, including some symmetry breaking constraints.

% decision variables:
% the matchings
array[1..m, 1..2] of var 1..n: x;
% the scores for the found matching
array[1..m] of var 0..max_score: s;

constraint
alldifferent([x[i,j] | i in 1..m, j in 1..2]) :: domain
;

constraint
forall(i in 1..m) (
s[i] = scores[x[i,1], x[i,2]]

% symmetry breaking
/\ x[i,1] <= x[i,2]
)
;

% another symmetry breaking
constraint decreasing(s);

I optimized this model for Gecode/fz using these two things found by some experimentation:
  • branching is on the score array s instead of the seating array x
  • alldifference has the :: domain annotation. Without this, it takes much longer.
These two makes Gecode solve the problem quite fast: Gecode/fz: 0.16s, 191 failures

However, for other solvers the branching on the score array s is not so good so here are the statistics for branching on the array x instead, and using instead the variable/value heuristics: first_fail/indomain_median which seems to give the best overall result for most solvers.
  • Gecode/fz: 4.5s, 98469 failures
  • G12/LazyFD: 2.16s
  • fzn2smt: 6.0s
  • fzntini: 16:02 minutes (I'm not sure why this is so hard for fzntini)
  • G12/fd: 10.127s, 189346 choice points
  • G12/minizinc: 10.465s, 189346 choice points
  • G12/fdmip: 10.226s, 189346 choice points
  • JaCoP/fz: 10.358s, 192957 wrong decisions
  • SICStus/fz: 18.419s, 362122 backtracks
  • ECLiPSe/ic: 48.313s
  • ECLiPSe/fd: 19.557s
Update 2011-03-08
Julien Fischer and Sebastian Brand (of G12 team) was kind to point out that there is a better way of getting the identification of the leftout:
constraint
   if n mod 2 != 0 then
     % This version was suggested by 
     % Sebastian Brand and Julien Fischer.
     leftout > 0 /\
     forall(j in 1..m, k in 1..2) (
        leftout != x[j,k]
     )
   else 
     leftout = 0
   endif;
Below is shown the statistics using this version instead. Julien also pointed out that G12/fd and G12/minizinc is really the same solver and - for this problem - G12/fdmip doesn't make use of the MIP component so it's the same as G12/fd. So I skipped these two latter in this test. The branching is (still) first_fail/indomain_median
    * Gecode/fz: 3.3s, 98469 failures
    * G12/LazyFD: 1.5s
    * fzn2smt: (I stopped after 10 minutes.)
    * fzntini: 5:52 minutes
    * G12/fd: 6.4s, 107178 choice points
    * JaCoP/fz: 5.4s, 192975 wrong decisions
    * SICStus/fz: 7.5s, 362122 backtracks
    * ECLiPSe/ic: 35.5s
    * ECLiPSe/fd: 11.3s    
This change boosted the times for most solvers, quite much for some. It's strange that fzn2smt was much slower, and I don't understand why. For fzn2smt I also tested different things, e.g. to removed the :: domain annotation on alldifferent but it made no difference.

Testing indomain_random
I was also curious how some solvers would do with the heuristic indomain_random, so I ran each solver three times and took the best result. Some solvers was "stable" and got exactly the same statistics (Gecode, G12/fd, SICStus, ECLiPSe/ic), but some of them got quite varying result and all three results are shown (JaCoP and ECLiPSe/fd).
    * Gecode/fz: 1.7, 27422 failures (stable)
    * G12/fd: 19.0s, 1401656 choice points (stable)
    * JaCoP/fz: 2.3s, 22178 wrong decisions
        8.5s 164599 wrong decisions
        4.1s 50575 wrong decisions
        2.3s 22178 wrong decisions
    * SICStus/fz: 3.3s, 116386 backtracks (stable)
    * ECLiPSe/ic: 34.5s (stable)
    * ECLiPSe/fd: 5.6s
       7.41s
       5.6s
      26.7s
Almost all of the solver got at least one result better than before, except G12/fd which is surprising.
End of Update 2011-03-08


2011-03-12 Yet another update
After some interesting discussions with Sebastian Brand (thank you, Sebastian!) I have included some improvements.
1) Since the calculation of the leftout don't contribute much to the constraint store, it has been removed as a decision variable and just calculated in the output section:

output [
let {
int: leftout = max(0..n diff { fix(x[j,k]) | j in 1..m, k in 1..2 })
}
in
% ...

2) The symmetry breaking constraint to sort the matchings in score order
  decreasing(s)
has been replaced with another symmetry breaking constraint which seems to be faster. Unfortunately, there is no sorting anymore but since it is faster for all solvers I prefer it.
  forall(i in 2..m)( x[i-1,1] < x[i,1] )
3) The branching has been changed to first_fail/indomain_split

With these three changes there are now improvements of most solvers; for ECLiPSe's solvers a radical improvement (compare with the listing above):

- G12/lazy: 0.89s
- Gecode: 0.10s and 649 failures
Note: without the ::domain annotation on alldifferent it's slighly worse: 0.16s and 1898
That difference has been much larger in earlier versions.
- JaCoP: 1.4s: 3253 wrong search decisions
- SICStus: 0.29s 3291 backtracks
- Choco: 1.8s 5928 backtracks
- ECLiPSe/ic: 1.6s
- ECLiPSe/fd: 0.94s
- smt2fzn is still very slow (I contacted the fzn2smt developers about this)
- tini: 4:44min (slightly better, but still very slow)


4) Sebastian also suggested that I should test a table constraint. That is, instead of
s[i] = scores[x[i,1], x[i,2]]

we use this:
table([x[i,1], x[i,2], s[i]], scores_table)

This also requires another representation of the score matrix:

scores_table = array2d(1..n*n, 1..3,
[
1, 1, 00,
1, 2, 30,
1, 3, 10,
1, 4, 33,
1, 5, 49,
% ....

Here is the result of using this table approach (together with the previous changes):

- G12/fd: 2.0s, 708 choice points (slower)
- G12/lazy: 2.442s (slower)
- Gecode: 0.11s 686 failures (about the same)
- JaCoP: 1.2s 785 wrong search decisions (slightly better)
- SICStus: 0.43 746 backtracks (better in backtracks, but slightly slower)
- Choco: 1.9s: 2299 backtracks (better in backtracks, but slightly slower)
- ECLiPSe/ic: 2.8s (slower)
- ECLiPSe/fd: 1.4s (slower)
- fzn2smt: 11.1s (and SMT is back in the game again!)
Note: Using the decreasing/1 constraint instead, it takes 7.0s.
- tini: stopped after 10 minutes, so it's slower than the previous version.

To sum: Most solvers where a tiny bit slower with this table approach, and some have slightly less choice points/failures/etc. Hmmm, maybe this (slight) increment of time for some solvers is caused by the slightly larger file to process by mzn2fzn?

The big winner here is SMT, but for this problem it's still slower than most other solvers.

I've put the improvements in the same model as before or_matching.mzn (the older model is now in or_matching_orig.mzn).

Also, thanks to Krzysztof Kuchcinski for an interesting discussion about this.

End of update 2011-03-12

A related seating problem

This matching problem inspired me also to model a related problem using the score matrix from the matching model:
When all these persons meet, what is the best way to place them around a (topological circular) table if the objective is to get the highest possible compatibility score for the adjacent seated persons.
Another MiniZinc model (or_seating.mzn) gives an optimal value of 996 score points, in about 1.5 seconds (with 17629 failures) using Gecode.

There are some possible way of approaching this problem, and I took the TSP way using a global constraint circuit. This model use my decomposition of a circuit constraint which, given the score matrix, gives an optimal circuit. However, what we really want is the path (the seating), so we have do extract the path. After a while I realized that the temporary array z used in the circuit_me decomposition is exactly this path, so it's just to use it in the parameter of the predicate. A consequence of this is that all seatings ends with person #1, which shouldn't really be a problem since the path is a permutation.

predicate circuit_me(array[int] of var int: x, array[int] of var int: z) =
let {
int: lbx = min(index_set(x)),
int: ubx = max(index_set(x))
}
in
all_different(x) :: domain /\
all_different(z) :: domain /\

% put the orbit of x[1] in in z[1..n]
z[lbx] = x[lbx] /\
forall(i in lbx+1..ubx) (
z[i] = x[z[i-1]]
)
/\ % may not be 1 for i < n
forall(i in lbx..ubx-1) (
z[i] != lbx
)
/\ % when i = n it must be 1
z[n] = lbx
;

Here are the two optimal solutions found by this model (with the implicit symmetry breaking that #1 is always the last position). However, the second solution is just the first reversed.

9 10 18 8 16 13 15 12 3 17 19 2 11 6 14 5 7 4 1
4 7 5 14 6 11 2 19 17 3 12 15 13 16 8 18 10 9 1

For the systems that support the circuit/1 constraint (right now only Gecode and JaCoP support this), there is a variant: one can use circuit to get the circuit and then extract the path (seating) with this "extractor" predicate where the solution always start with #1.

predicate circuit_path(array[int] of var int: x,
array[int] of var int: p) =
% we always starts the path at 1
p[1] = 1
/\
forall(i in 2..n) (
p[i] = x[p[i-1]]
);

Then we got these two optimal solutions, which is just permutations of the two above. 1 9 10 18 8 16 13 15 12 3 17 19 2 11 6 14 5 7 4
1 4 7 5 14 6 11 2 19 17 3 12 15 13 16 8 18 10 9

Connection to the matching problem:
Comparing these seating placements with the matching above, it seems to be correct, e.g. #9 and #1 sit together, as do #18 and #6, #2 and #19, etc.

Some other seating problems

For some reason I am fascinated by seating problems and I'm not really alone. John Toczek's PuzzlOr problem Best Host this February was a seating problem (the object is to minimize seating conflicts given that some persons would like to sit together) that I solved using a rather simple MiniZinc model. However, I will not link to the model since the contest (to win a “O.R. The Science of Better” T-shirt) is still open. And let us not forget Karen Petrie's Wedding planning problem: How a computer scientist organised her wedding.

Here are two (classical) combinatorial seating problems, the formulation taken from Fei Dai, Mohit Dhawan, Changfu Wu and Jie Wu: On Solutions to a Seating Problem, page 1:

Seating, row
MiniZinc model seating_row.mzn
n couples are seated in one row. Each person, except two seated at the two ends, has two neighbors. It is required that each neighbor of person k should either have the same gender or be the spouse of k.
Seating, table:
MiniZinc model seating_table.mzn
A classical [...] seating problem is the following: n couples are seated around a table in such a way that two neighbors of each person k are the different gender of k and they are not the spouse of k.
Both these problems is about constraints on couples and gender. There are different way of representing who is in a couple with "anonymous persons", i.e. using just integers to represent the couples and individuals. Here are two variants (I included both of these in the seating_row model):

1) Using math:
predicate is_couple1(int: n, var int: a, var int: b) =
% exists(i in 1..n) (
exists(i in max(lb(a), lb(b))..min(ub(a), ub(b))) (
(2*(i-1)+1 = a /\ 2*(i-1)+2 = b)
\/
(2*(i-1)+1 = b /\ 2*(i-1)+2 = a)
);

2) Using set representation, i.e. an array of the pairs:
% define the couple
array[1..n] of set of int: couples = [{2*(i-1)+1, 2*(i-1)+2} | i in 1..n];

predicate is_couple(int: n, var int: a, var int: b, array[int] of set of int: c) =
exists(j in 1..n) (
a in c[j] /\ b in c[j]
);

The second approach of using sets, is in my opinion the nicest, and it's also easier to use if one has a concrete list of couples.

The gender is simply represented by modulo 2.

The paper "On Solutions to a Seating Problem" (cited above) studies "reasonable patterns":
A reasonable pattern is an assignment of n couples without indices that meet the above requirement. For example, when n = 3 all reasonable patterns are as follows:

wwwhhh whhwwh whhhww hhwwwh
hhhwww hwwhhw hwwwhh wwhhhw
However, I am more interested in the concrete seating placements so it's not quite the same problems. For n=3 there is a total of 60 patterns. Here are all 8 different patterns generated for 3 pairs (n=3). The first number indicates the occurrences of each pattern.
12 gender2:hhhwww
 6 gender2:hhwwwh
 6 gender2:hwwhhw
 6 gender2:hwwwhh
 6 gender2:whhhww
 6 gender2:whhwwh
 6 gender2:wwhhhw
12 gender2:wwwhhh
I tried to generate exactly one of each pattern but had no luck. A model with the following simple symmetry breaking generates 20 solutions, less than 60 but more than exactly 8 patterns.

constraint (x[1] = 1 \/ x[1] = 2);

Statistics for Seating, row problem
Here are the number of solutions (using no symmetry breaking) for some different number of couples (n). The solver statistics below is for Gecode/fz where just the last solutions is printed (it took about 2.5 minutes to print the solutions for n=6, without printing it took just 15.7 seconds).

n #solutions
-------------
1 2 (0.001s, 0 failures)
2 8 (0.001s, 5 failures)
3 60 (0.01s, 73 failures)
4 816 (0.02s, 1181 failures)
5 17520 (0.45s, 29872 failures)
6 550080 (15.7s, 1051417 failures)
7 23839200 (12:36min, 49685576 failures)

The sequence of these numbers seems to correspond to the OEIS Integer Sequence A096121 which is described as A rook must land on each square exactly once, but may start and end anywhere and may intersect its own path. Inspired by Leroy Quet in a Jul 05 2004 posting to the Seqfan mailing list.

Well, I'm not sure if there is actually a deep connection between these two problems or just a freaky coincidence...

Statistics for Seating, table problem
For the seating table problem with 3 couples (n=3) there are 12 solutions (two different patterns). Here are the number of solutions for different n.

n #solutions
-------------
1 0 (0s, 0 failures)
2 0 (0s, 4 failures)
3 12 (0s, 0 failures)
4 96 (0.01s, 36 failures)
5 3120 (0.04s, 340 failures)
6 115200 (0.9s, 4436 failures)
7 5836320 (43.4s, 146938 failures)
8 382072320 (47:28minutes, 6593012)

The related OEIS sequence is A059375 (Total menage numbers 2*n!*A000179(n)) (the sequence starts with 1 which is the value for n=0, but this is weird in our current setting). This problem is also known as the Menage problem (Wikipedia) so the correspondence between these numbers is not surprising at all. Also see Wolfram MathWorld: Married Couples Problem.

January 05, 2011

INFORMS’ first Blog Challenge Results: OR and the Holidays

All 14 entries in INFORMS' December Blog Challenge: O.R. and the Holidays are now available at INFORMS’ first Blog Challenge Results.
INFORMS’ first Blog Challenge was a hit! We received 14 entries on the topic of O.R. and the Holidays. As you might suspect, the travelling salesman problem was noted, with Santa needing to make his marathon delivery to all of the good little girls and boys of O.R. However, the O.R. suggestions for holiday success were not limited to Santa, with the elves receiving tips on inventory optimization, making the most of the family vacation with revenue management, maximizing baking production, a mixing model for picking the teams for the office bowling party, and finding the “perfect” parking spot at the mall. Rounding out the bunch was an interview with “Dr. O.R. Field” about her New Year’s Resolutions. Thanks to everyone who contributed, and a Happy New Year to all.
My own entry was Christmas Company Competition Problem: Mixing teams.

The topic for January 2011 INFORMS Blog Challenge: O.R. and Politics.

June 24, 2009

My old swedish blog posts about constraint programming translated (via Google)

Before starting this blog (i.e. My Constraint Programming blog) late December 2008, I blogged about constraint programming in my Swedish blog hakank.blogg). Translations of those Swedish blog post has not been collected before, and now it is time.

So, here are links to most of the blog posts from the category Constraint Programming, translated via Google's Language Tools. Most of the translation is intelligible, but if you have some questions about some particular, please feel to mail me: hakank@bonetmail.com for more information.


November 4, 2005
Swedish: Sesemans matematiska klosterproblem samt lite Constraint Logic Programming
Translated: Sesemans kloster mathematical problems and little Constraint Logic Programming ("kloster" means "convent").


April 18, 2006
Swedish: Choco: Constraint Programming i Java
Translated: Choco: Constraint Programming in Java


The two post above was written when I had just a cursory interest in constraint programming. From about February 2008 and onward, it became my main interest.


April 5, 2008
Swedish: Constraint Programming: Minizinc, Gecode/flatzinc och ECLiPSe/minizinc,
Translated: Constraint Programming: Minizinc, Gecode / flatzinc and ECLIPSE / minizinc.


April 14, 2008
Swedish: MiniZinc-sida samt fler MiniZinc-exempel
Translated: MiniZinc-page and multi-MiniZinc example.


April 21, 2008
Swedish: Applikationer med constraint programming, lite om operations research samt nya MiniZinc-modeller
Translated: Applications of constraint programming, a little of operations research and new MiniZinc models


April 26, 2008
Swedish: Ett litet april-pyssel
Translated: A small-craft in April (a better translation would be "A small April puzzle")


April 27, 2008
Swedish: Mitt OpenOffice Calc-/Excel-skov
Translated: My Open Office Calc-/Excel-skov (better translation: My Open Office Calc-/Excel period).


May 26, 2008
Swedish: Några fler MiniZinc-modeller, t.ex. Smullyans Knights and Knaves-problem
Translated: Some more MiniZinc models, eg Smullyans Knights and Knaves-problem Smullyans Knights and Knaves problem


June 2, 2008
Swedish: MiniZinc/FlatZinc version 0.8 släppt
Translated: MiniZinc / FlatZinc version 0.8 released


June 2, 2008
Swedish: Nya MiniZinc-modeller, flera global constraints, däribland clique
Translated: New MiniZinc models, several global constraints, including Clique


June 5, 2008
Swedish: Mats Anderssons tävling kring fotbolls-EM 2008 - ett MiniZinc-genererat tips
Translated: Mats Andersson racing around championship in 2008 - a MiniZinc-generated tips (it is about a competition how to predict Soccer World Cup 2008 using MiniZinc)


June 24, 2008
Swedish: Tre matematiska / logiska pyssel med constraint programming-lösningar: n-puzzle, SETS, M12 (i MiniZinc)
Translated: Three mathematical / logical craft with constraint programming solutions: n-puzzle, SETS, M12 (in MiniZinc) ("craft" should be translated to "puzzles")


June 24, 2008
Swedish: MiniZinc-nyheter
Translated: MiniZinc news


June 29, 2008
Swedish: Gruppteoretisk lösning av M12 puzzle i GAP
Translated: Group Theoretical solution of the M12 puzzle in GAP (well, this is not really a constraint programming solution, but it is another way of solving the M12 puzzle blogged about in June 24)


June 30, 2008
Swedish: Gruppteoretisk lösning av M12 puzzle i GAP - take 2
Translated: Group Theoretical solution of the M12 puzzle in GAP - take 2


July 4, 2008
Swedish: Martin Chlond's Integer Programming Puzzles i MiniZinc
Translated: Martin's Chlond Integer Programming Puzzles in MiniZinc


July 7, 2008
Swedish: Fler MiniZinc modeller kring recreational mathematics
Translated: More MiniZinc models on recreational mathematics


July 20, 2008
Swedish: Fler constraint programming-modeller i MiniZinc, t.ex. Minesweeper och Game of Life
Translated: More constraint programming models in MiniZinc, eg Minesweeper och Game of Life Minesweeper and the Game of Life


August 17, 2008
Swedish: JaCoP - Java Constraint Programming solver
Translated: JaCoP - Java Constraint Programming solver


September 14, 2008
Swedish: Constraint programming i Java: Choco version 2 släppt - samt jämförelse med JaCoP
Translated: Constraint programming in Java: Choco version 2 released - and comparison with JaCoP


September 28, 2008
Swedish: Constraint programming: Fler MiniZinc-modeller, t.ex. Martin Gardner och nonogram
Translated: Constraint programming: More MiniZinc models, eg Martin Gardner och nonogram Martin Gardner and nonogram


December 27, 2008
Swedish: Constraint programming-nyheter samt nya MiniZinc-modeller
Translated: Constraint programming, news and new MiniZinc models


December 29, 2008
Swedish: My Constraint Programming Blog
Translated: My Constraint Programming Blog


After that, entries about constraint programming at my swedish blog posts where just summaries of the stuff written here at My Constraint Programming blog.

March 08, 2009

Comet: About 15 new Comet models

The latter part of the last week was dominated by the optimization of the Nonogram solver, which I wrote about in Comet: Nonogram improved: solving problem P200 from 1:30 minutes to about 1 second. I would like to thank Pascal Van Hentenryck, Mikael Zayenz Lagerkvist, and Christian Schulte for help, comments, and inspiration.


Here are the other Comet models written the last week.

Puzzles, recreation mathematics, etc

Pascal Van Hentenryck's OPL book

These models are translations from Pascal Van Hentenryck "The OPL Optimization Programming Language"

Operations research

  • furniture_moving_cumulative.co: Furniture moving using global constraint cumulative (simple scheduling problem from Marriott & Stuckey: 'Programming with constraints', page 112f). Compare with furniture_moving.co which uses the Comet's Schedule constraint.

Global constraints

Many of these models has been inspired by the great collections in Global Constraint Catalog.

February 28, 2009

Comet: regular constraint, a much faster Nonogram with the regular constraint, some OPL models, and more

Since the last time some more Comet model has been written.

Much faster Nonogram model using the regular constraint

In More Comet models, e.g. Nonogram, Steiner triplets, and different set covering problems I presented nonogram.co, a solver for Nonogram puzzles, and also noted that is was quite slow: Well, it is nice to have some more optimization to do (or more probably, a complete remodelling)....

Inspired by the announcement this week of the ECLiPSe example of Nonogram solver using the regular constraint (see nono_regular.ecl.txt and regular.ecl.txt) - and also my earlier Gecode/R model nonogram.rb which used "regular expressions" - I created a Nonogram model in Comet using the regular constraint: nonogram_regular.co.

Let us first look at the regular constraint.


Regular constraint
The regular constraint (see the Comet model regular.co for my implementation) is a global constraint using a DFA (deterministic finite automata) which accepts (or not accepts) an input sequence given a "transition matrix". The constraint was presented by Gilles Pesant in "A Regular Language Membership Constraint for Finite Sequences of Variables" (2004).

My implementation of the regular constraint is heavily borrowed from MiniZinc's builtin regular predicate (from lib/zinc/globals.mzn); the translation to Comet was very straightforward (omitting just some asserts).

An exemple of the usage of the constraint: We want to match the regular expression "123*21" (i.e. first "1", then "2", then zero or more "3", then "2", and last a "1"). Note: The length of the sequence is 10, so there must be 6 "3":s since "123" and "21" are "anchored" at the beginning and the end.

int len = 10;
int n_states = 5; // number of states
int input_max = 3; // the states are 1,2, and 3
int initial_state = 1; // we start with the 1 state
set{int} accepting_states = {n_states}; // This is the last state
// The transition matrix
int transition_fn[1..n_states, 1..input_max] =
[[2, 0, 0], // transitions from state 1: "1" -> state 2
[0, 3, 0], // transitions from state 2: "2" -> state 3
[0, 4, 3], // transitions from state 3: "2" -> state 4 | "3" -> state 3 (loop)
[5, 0, 0], // transitions from state 4: "1" -> state 5
[0, 0, 0]]; // transitions from state 5: END state
...
exploreall {
regular(reg_input, n_states, input_max, transition_fn, initial_state, accepting_states);
} using {
// ....
}

The unique sequence resulting from this automata is thus 1 2 3 3 3 3 3 3 2 1.

For using regular in the Nonogram problem, the automata for each row/column clue, must be built, preferably by a function. In regular.co this is done with the function make_transition_matrix, which by far was the hardest part of this problem (and surely could be written in a smoother way).

For the Nonogram clue [3,2,1] - which represents the regular expression "0*1110*110*10*" - the following automata (transition matrix) is generated:
1 2
0 3
0 4
5 0
5 6
0 7
8 0
8 9
9 0


Note that the regular function uses 0 (zero) as the failing state, so the states must start with 1. This is taken care in the model as the resulting value is subtracted by 1.

As usual in my models, the regular constraint is just a "convenience function", i.e. not using special written propagation methods etc.

The regular constraint has - of course - more applications than for Nonogram solving. I plan to look more into this.


The Nonogram solver: nonogram_regular.co
After the regular constraint and the automata generator was written, it was quite easy to change the old Nonogram solver to use these new tools. The result is in nonogram_regular.co. I was quite curious how fast this version should be compared to the former slow model. In short: it is much faster.

As comparison the Lambda instace took about 12 seconds with the old model; with the new model it takes 0.5 seconds, which is a nice improvement. I never finished a run for Nonunique problem with the older model; the new model takes 0.5 seconds (including the startup of the Comet program). Etc.

The P200 problem nonogram_p200.co now takes 2:05 minutes, which can be compared with 57 seconds for the Gecode/R model. Thus, the Comet model is still slow compared to Gecode version and the ECLiPSe version, which solves the P200 problem in just a second or two. Maybe some creative labeling or a "proper" regular constraint can fasten things up...

Update some hour later: One way to gain about 30 seconds to 1:30 minutes on the P200 problem was to explicitly state the consistency level to onDomain, e.g. cp.post(a[m] == q0, onDomains), and to use another labeling strategy:
forall(i in 1..rows, j in 1..cols : !board[i,j].bound()) {
// label(board[i,j]); // the former strategy
tryall(v in 1..2) m.post(board[i,j] == v, onDomains);
}


Some more Nonogram problems have been coded:

An aside about regular expressions
I have been very interested in regular expressions (especially the more expressive Perl type) for a long time. In 1997 I wrote a simple Perl program MakeRegex which returns a regular expression given a list of words. It was later Appletized in MakeRegexApplet. Now there are better programs/modules for this.

OPL Models

One of my projects is to translate the OPL models from Pascal Van Hentenryck "The OPL Optimization Programming Language" into Comet. Even if OPL and Comet are different languages the reading of the book has been very awarding. Thanks again, Pascal!

Some OPL models already have been published, but now I've been more systematic and started from the beginning. More models to come.

Finding: arrays in a tuple
In Comet: New models, e.g. Einstein puzzle, KenKen, Kakuro, Killer Sudoku, Stigler's Diet problem, translations of OPL models I wrote

One big difference here: In Comet it is not allowed to have an array in a tuple, so the use data must be a separate matrix.

I've found a way of using arrays in a tuple, by first initialize the values in a matrix and then - by using the all function - "slice" the values into the tuple array. This has been done in the models fixed_charge2.co and production3.co.

Example from fixed_charge2.co:

tuple productType {
int profit;
set{Machines} machines;
int[] use;
}
int use[Products, Resources] = [[3,4],[2,3], [6,4]];
productType Product[Products] =
[productType(6, {shirtM}, all(i in Resources) use[shirt,i]),
productType(4, {shortM}, all(i in Resources) use[shorts,i]),
productType(7, {pantM}, all(i in Resources) use[pants,i])];

Combining different models

The alphametic problem SEND + MOST = MONEY has the additional requirement to maximize the value of MONEY. The older model send_most_money.co does that and nothing more.

One natural extension of the problem is the following:
* first find the maximum value of MONEY
* then find all the solutions with this value.

The model send_most_money2.co has two functions:
* smm which is just the constraint for the alphametic problem
* send_most_money which has a parameter money.

send_most_money is first called with 0, indicating that is should maximize the value of MONEY, and then returns that value. The next call to send_most_money is with the calculated MONEY value, which indicates that all solutions should be generated.

The answer is

check all solutions for MONEY = 10876
x[9,7,8,2,1,0,4,6] MONEY: 10876
x[9,7,8,4,1,0,2,6] MONEY: 10876


Jim Orlin's Logic Puzzle

In Colored letters, labeled dice: a logic puzzle Jim Orlin stated a logic puzzle:
My daughter Jenn bough a puzzle book, and showed me a cute puzzle. There are 13 words as follows: BUOY, CAVE, CELT, FLUB, FORK, HEMP, JUDY, JUNK, LIMN, QUIP, SWAG, VISA, WISH.

There are 24 different letters that appear in the 13 words. The question is: can one assign the 24 letters to 4 different cubes so that the four letters of each word appears on different cubes. (There is one letter from each word on each cube.) It might be fun for you to try it. I’ll give a small hint at the end of this post. The puzzle was created by Humphrey Dudley.

...

If anyone wants to write an Excel spreadsheet and solve it via integer programming, please let me know. I’d be happy to post the Excel spreadsheet if you send it to me, or link to it if you post it and send me the URL.

This was a fun puzzle, so I modeled the problem in labeled_dice.co, and mailed Jim the link.

Some days later he wrote Update on Logic Puzzle where the contributed models were presented. There where another Comet model WordPuzzleSchaus.co by Pierre Schaus, one of Comet's developers. Pierres model use an different and more elegant approach than mine.

Jim also linked to some other logical puzzles from the great collection brownbuffalo.sourceforge.net (which have solutions in ECLiPSe/Prolog). One of these puzzles was Building Blocks, in the same vein as his labeled dice puzzle. Hence I had to make a Comet model of this problem: building_blocks.co.

He concludes the post with the following:

Incidentally, the ease for solving it using Constraint Programming makes me think that Constraint Programming should be considered a fundamental tool in the OR toolkit.


Other Comet models this week

Here are the other Comet models created/published this week. Some of them where very easy to do since they are translations of my MiniZinc models.

February 21, 2009

Comet: New models, e.g. Einstein puzzle, KenKen, Kakuro, Killer Sudoku, Stigler's Diet problem, translations of OPL models

Here are some of the new Comet models written the last week.

Findings this week

Representing KenKen, Kakuro, and Killer Sudoku

One important thing in modelling a problem is how to representing the data in the problem. For grid puzzles such as KenKen, Kakuro, and Killer Sudoku, where one of the objects is to operate (add, multiply, etc) on an array of unknown length, I found one representation which work quite well (at least on the simple problem I tested).

Let's take the KenKen problem from Wikipedia.

The first version kenken.co was quite large, and stated explicitly all the operations (addition, subtraction, multiplication, or division). This was slightly generalized in the same model, though.

After that I relaxed the requirement to state exactly which operation to do things actually got simpler to model. In the Comet model kenken2.co we represent an cell with a tuple (struct) as two integers

tuple cell {
int r;
int c;
}

The cage of numbers to add (or multiply etc) is represented by another tuple, a set of cells, and the result (integer):

tuple P {
set{cell} c;
int res;
}

Now, it is only to state the problem using this structure:

int num_p = 15; // number of cages/segments
P problem[1..num_p] = 
  [
   P({cell(1,1), cell(2,1)}, 11),
   P({cell(1,2), cell(1,3)}, 2),
   P({cell(1,4), cell(2,4)}, 20),
   P({cell(1,5), cell(1,6), cell(2,6), cell(3,6)}, 6),
   P({cell(2,2), cell(2,3)}, 3),
   P({cell(2,5), cell(3,5)}, 3),
   P({cell(3,1), cell(3,2), cell(4,1), cell(4,2)}, 240),
   P({cell(3,3), cell(3,4)}, 6),  
   P({cell(4,3), cell(5,3)}, 6),
   P({cell(4,4), cell(5,4), cell(5,5)}, 7),
   P({cell(4,5), cell(4,6)}, 30),  
   P({cell(5,1), cell(5,2)}, 6),
   P({cell(5,6), cell(6,6)}, 9),
   P({cell(6,1), cell(6,2), cell(6,3)}, 8),
   P({cell(6,4), cell(6,5)}, 2)
   ];

This representation makes it possible to reading a problem from a file etc (which has not been done, though).

The single function that calculates the values in the segment given the result is as follows. Note: Here I assume that division and subtraction will only be done for two sized sets.

function void calc(Solver m, set{cell} cc, var{int}[,] x, int res) {
if (cc.getSize() == 2) {
var{int} a(m, x.getRange(0));
var{int} b(m, x.getRange(0));
m.post(a == x[cc.atRank(0).r,cc.atRank(0).c]);
m.post(b == x[cc.atRank(1).r,cc.atRank(1).c]);
m.post((a + b == res) || (a * b == res) ||
(a * res == b) || (b * res == a) ||
(a - b == res) || (b - a == res));
} else {
m.post(sum(i in cc) x[i.r, i.c] == res || prod(i in cc) x[i.r, i.c] == res);
}
}


In the exploreall we state the following constraints:

// all rows and columns must be unique
forall(i in R)
m.post(alldifferent(all(j in R) x[i,j]));
forall(j in R)
m.post(alldifferent(all(i in R) x[i,j]));
// the specific problem
forall(i in 1..num_p) {
calc(m, problem[i].c, x, problem[i].res);
}

Now, the other two grid puzzles in Comet, Killer Sudoku, and Kakuro, is represented is the same way, using just slight modifications in the calc function and added or altered constraints.

For more about these three grid puzzles see Wikipedia's entries:
* KenKen
* Killer Sudoku
* Kakuro


Different solvers in one model: Sudoku generator


Speaking of grid puzzles, here is a (proof-of-concept) generator of Sudoko puzzles: sudoku_generate.co. It uses a simple and brute force approach:

* Generate a problem matrix with the proper number of unknowns
* Try to solve the problem. If there are exactly one solution, then all is fine, else generate another problem.

It also use two limits to ensure that one generation/solution is not
to costy (which may miss some good problems):
* timeout
* number of failures

The thing to note here is the way two different functions works together: first one constraint solver (model) to generate a Sudoku matrix, then another constraint solver (model) to really solve the Sudoko problem.


User defined constraints


Earlier I wrote how to define function in Comet, e.g. in Some Comet constraint programming (CP) models .

This week I wanted to state my own functions in the same way as the built in functions, e.g. as m.post(myConstraint(...));.

The following function implements the (global) constraint increasing, which then can be stated as m.post(increasing(x));. Please note that this is just a "modelling" constraint, and it will use the underlying solver propagation etc.

class increasing extends UserConstraint {
var{int}[] _x;
int _n;
increasing(var{int}[] x) : UserConstraint() {
_x = x;
_n = _x.getSize();
}
Outcome post(Consistency cl) {
Solver cp = _x[1].getSolver();
forall(i in 2.._n) {
cp.post(_x[i-1] <= _x[i], cl);
}
return Suspend;
}
Outcome propagate() {
return Suspend;
}
}

The Comet model constraint_test.co implements the following constraints:

  • alldifferent_except_0
  • increasing
  • decreasing
  • toNum
  • correspondence

Translating OPL models

I've started to read Pascal Van Hentenryck's great book about OPL, "The OPL Optimization Programming Language", (ISBN: 9780262720304, unfortunately out of print), and it is quite clear the Comet "is not very unlike OPL". Which is not a big surprise since Van Hentenryck is the designer of both languages.

There are differences between Comet and OPL, sometimes quite big, but often they are easy to circumvent. Maybe later findings will discuss more about this.

The following Comet models are translations of OPL models, some of these are from the examples from ILOG's OPL Models.

  • building_a_house.co: Scheduling building a house (OPL). Comet use a different way of stating scheduling constraints than OPL.
  • covering_opl.co: Set covering problem. Also note the labeling (in the uses section), which solves the problem faster than with the default label, or labelFF.
  • einstein_opl.co: Einstein puzzle, translation of the OPL model from Daniel Selman's Einstein's Puzzle - Or, 'The Right Tool for the Job'. The presentation in the Comet model is different than the OPL model (which is shown commented) but it is quite close.
  • fixed_charge.co: Fixed-charge problem. One big difference here: In Comet it is not allowed to have an array in a tuple, so the use data must be a separate matrix.
  • number_square.co: From page 32 of Van Hentenryck's OPL book: Finding a eight digit square which remains a square when 1 is concatenated in from of it. This simple problem translates very well to Comet.
  • p_median_opl.co: P-median problem. This also translates very well to Comet.
  • production2.co: Production planning problem. Same difference as in the fixed charge model, where arrays in a tuple is not allowed.
  • warehouse.co: Warehouse location problem.


Models


Here are the new Comet models. As usual more information about the problem/model is in the header of the model.

February 15, 2009

More Comet models, e.g. Nonogram, Steiner triplets, and different set covering problems

During the week some more Comet models has been published. Many of them are translations of my MiniZinc models.

Findings this week

Here are some findings for the previous week.

tryall again: Nonogram

One of the models I really wanted to make was Nonogram, since it would be interesting to compare with the MiniZinc model nonogram.mzn and the Gecode/R model nonogram.rb. The Comet model is nonogram.co.

The first version was very influenced by the MiniZinc model, and used tryall where the MiniZinc model used exists (see the previous post About 20 more constraint programming models in Comet for some more comments ablit this). But this was way too slow.

After some thoughts, I rewrote the tryall:s and it was getting faster. For a simple Nonogram problem like The Hen (see picture below), it originally took about 30 seconds. And after the changes, it now takes a fraction of a second (with alot of fewer propagations).

Nonogram The hen
. X X X . . . .
X X . X . . . .
. X X X . . X X
. . X X . . X X
. . X X X X X X
X . X X X X X .
X X X X X X . .
. . . . X . . .
. . . X X . . .

Still, compared to the Gecode/R (and - of course - Gecode) models, which uses "regular expression" representation, the Comet model is slow, but is faster the MiniZinc version. For example the Lambda picture (see below) takes about 12 seconds with the Comet model, compared to 0.5 seconds with the Gecode/R model. Well, it is nice to have some more optimization to do (or more probably, a complete remodelling)...

Nonogram: Lambda problem
. X X . . . . . . .
X . X X . . . . . .
X . . X . . . . . .
. . . X X . . . . .
. . . . X . . . . .
. . . X X X . . . .
. . . X X X . . . .
. . X X . X X . . .
. . X X . . X . . .
. X X . . . X X . X
. X X . . . . X X X
X X . . . . . X X .

To summarize, the main finding here was to replace two tryall statements with two decision variables, and change the rest accordingly. The model also include two symmetry breaking constraints ("alldifferent except 0" and ordering of a temporary array), but these make no big difference.

binary representation of set variables

As far as I know, Comet don't (yet) have a variable set variables (i.e. as decision variables). Instead, another representation must be used, and probably the most natural representation is an array of binary decision values representing each value of the set.

The following models use this representation:
* set_partition.co, partitions a range of integers 1..n in two equal sized sets where the sums and sums of squares of the two sets are equal.
* steiner_triplets.co: the well known problem of Steiner triples.
* partition_function.co, (partitions values according to a function of some kind)

Models

These models is available by My Comet page. As usual, the problems are presented in the header of the file.

February 09, 2009

About 20 more constraint programming models in Comet

Since Some Comet constraint programming (CP) models, I've continued to translate some more of my (approx. 600) MiniZinc models to Comet. There are a few new models: game_theory_taha.co and fill_in_the_squares.co.

Findings this week

tryall

One finding this week is that tryall almost behaves like MiniZinc's exists, a very useful modelling tool. See for example the following models for the usage:
* Hidato puzzle
* Knights path
* SONET problem
* Word golf.

tryall behaves slightly different depending where it is placed: In the main explore/exploreall section, it may not generate all the solutions; but when placed in the using section, it generates all the solutions (and makes the model a bit slower).

Search strategies (labelling)

I've started to experiment with different search strategies (labelling), but for most of the models, the two builtin label and labelFF (first fail) are reasonable fast. This is an area I will look into more.

Other things

* strings/dictionaries: It is very nice to be able to use "real" strings in modelling. See for example Word golf, and Set covering deployment. The latter model also - just for fun - has a hash table for counting the number of occurrences of the selected countries.

* more than one models in the same program. MiniZinc only allows one model per file. As an experiment, game_theory_taha.co (two player zero sum game) solves first for one player, and then another model for the second player (which is not necessary since the solutions are a dual problem). Also, in Word golf, there is a loop of models solving for different ladder lengths.

The models

As usual, the description of the problem is included in the model, and a reference to my other model(s) of the same problem.

bus_schedule.co: Bus scheduling (Taha)
calculs_d_enfer.co: Calculs d'enfer puzzle (from the NCL manual)
crew.co: Crew scheduling (Gecode)
discrete_tomography.co: Discrete Tomography (one color model)
fill_in_the_squares.co: Fill in the squares, recreational mathematics. (from ZDC system). Origin unknown.
furniture_moving.co: Optimizing furniture moving (Marriott and Stuckey)
game_theory_taha.co: 2 player zero sum game (Taha). This uses the MIP (mixed integer programming) module.
heterosquare_cp.co: Heterosquare problem
hidato.co: Hidato puzzle
knights_path.co: Knight path
least_square.co: Least square (From Lundgren, Rönnqvist, Värbrand: "Optimeringslära"), using the MIP module.
max_flow_winston1.co: Maximum flow problem (Winston)
pandigital_numbers.co: Pandigital numbers (any base << about 12)
photo_problem.co: Photo problem
place_number_puzzle.co: Eight number placement puzzle
quasigroup_completion.co: Quasigroup completion problem.
Problem files:


set_covering.co: Set covering, placing of firestations (Winston)
set_covering_deployment.co: Set covering deployment
sonet_problem.co: SONET problem
survo_puzzle.co: Survo puzzle
traffic_lights.co:Traffic lights problem (CSPLib problem 16)
who_killed_agatha.co: Who killed agatha? (The Dreadsbury Mansion Murder Mystery, a automated reasoning problem)
word_golf.co: Word golf (word chain, word ladder), word_golf_len3.co: Data file with 2044 3-letter words.
young_tableaux.co: Young tableaux and partitions