Main

March 06, 2011

A matching problem, a related seating problem, and some other seating problems

The INFORMS blog challenge last month (February) was O.R. and Love. The entry by the Capgemini O.R. Blog Team was Cupids with Computers, where a bunch of OR'ers (including me) were matched according to some - not exactly stated - compatibility scores. It is both interesting and fun to read. Also, see their companion blog post Two Deadly Sins, which discusses some of the problems with matching.

I was curious how well a Constraint Programming system like MiniZinc would do on this matching problem so I wrote the model or_matching.mzn.

And it did quite well: it (Gecode/fz) solves the problem in 0.1 seconds (with 191 failures). Some comments about the model:
  • Since few CP systems have good support for floats, I converted the scores to integers (x 10), i.e. 3.7 is represented as 37, etc. However, this can lead to some errors, since I assume that the scores are rounded.
  • It also identifies the left-out person, which is just for fun.
The result is the following matchings (slightly edited):
Total score: 519

P1 P2 Score
--------------
 1  9: 92
 5 14: 69 (tie)
10 18: 69 (tie)
 8 16: 59
 2 19: 55
 3 12: 52
 6 11: 44
13 15: 42
 4  7: 37

Left out: 17
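For small instances, the optimal matching can also be brute-forced outside a CP system. Here is a minimal Python sketch of that idea, using a hypothetical 5-person score matrix (scaled x10, like the model's scores); it is not the MiniZinc model itself:

```python
# Hypothetical 5-person score matrix (scaled x10 as in the model,
# so 3.7 -> 37); these are NOT the blog challenge's actual scores.
scores = {
    (1, 2): 37, (1, 3): 22, (1, 4): 22, (1, 5): 13,
    (2, 3): 44, (2, 4): 22, (2, 5): 18,
    (3, 4): 42, (3, 5): 27,
    (4, 5): 22,
}

def best_matching(people):
    """Brute-force maximum-score matching; with an odd number of
    people, exactly one person is left out."""
    if len(people) < 2:
        return 0, [], list(people)
    p0, rest0 = people[0], people[1:]
    best = (-1, [], [])
    # pair p0 with every other person
    for p1 in rest0:
        rest = [p for p in rest0 if p != p1]
        s, pairs, left = best_matching(rest)
        s += scores[(p0, p1)]
        if s > best[0]:
            best = (s, [(p0, p1)] + pairs, left)
    # or leave p0 out (only relevant for an odd-sized group)
    if len(people) % 2 == 1:
        s, pairs, left = best_matching(rest0)
        if s > best[0]:
            best = (s, pairs, [p0] + left)
    return best

total, pairs, leftout = best_matching([1, 2, 3, 4, 5])
```

This exhaustive recursion is obviously only workable for small n; the CP model scales much better thanks to propagation and symmetry breaking.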
One reason that I modeled the problem was to see if it was possible to identify the OR'ers, or at least myself, and with some assumptions I have some guesses. The assumption is that the matchings in the blog post are presented from the best match downward, i.e. ordered by score.

My model includes a symmetry-breaking constraint which places the person with the lower index in the left position of the pair, which means that I can only identify a pair, not the individuals in it. Also, the second and third pairs are a tie, so these are even harder to guess.

Ranking My guess (pair)
--------------------------------------------------------------------
- Anna Nagurney and Laura McLay: 1,9
- Ian Frommer and Paul Rubin: 5,14 or 10,18 (the score is a tie)
- Michael Trick and Tallys Yunes: 5,14 or 10,18 (ibid.)
- Larry D'Agostino and Robert Randall: 8,16
- David Smith and Thiago Serra: 2,19
- Francisco Marco-Serrano and Nicola Petty: 3,12
- John Angelis and Michael Watson: 6,11
- John Poppelaars and Patricia Randall: 13,15
- Hakan Kjellerstrand and Phillip Mah: 4,7

- Shiva Subramanian (the leftout): 17

So, based on this, I guess that I am either #4 or #7.

Let us investigate this further. In Two Deadly Sins there are some smaller score matrices where I happen to be named:
          Hakan  John  Patricia  Philip  Shiva
Hakan       -     3.7     2.2      2.2    1.3
John        -      -      4.4      2.2    1.8
Patricia    -      -       -       4.2    2.7
Philip      -      -       -        -     2.2
Shiva       -      -       -        -      -
Now assume that these are the same scores and names as in the main matrix. My (Hakan's) scores for John, Patricia, Philip, and Shiva are 3.7, 2.2, 2.2, and 1.3 respectively, and these values can be found for #4 but not for #7. There is just one score of 1.3, and it is between #4 and #17, so we might draw the conclusion that #17 is Shiva.

However, since Hakan is #4 and is matched with #7, this would mean that Philip should be #7 if my matchings are correct. But that can't be right: the score between #4 and #7 is 3.7, while the Hakan-Philip score is only 2.2. Hmmm.

OK. If we skip my ranking and instead look at the score matrices, we can see that there is one 1.8 score: between John and Shiva in the small score matrix, and between #7 and #17 in the big one. So #7 is John. This also makes more sense, since there is a score of 3.7 between Hakan (#4) and John (#7) in the big score matrix. John and Patricia have a score of 4.4, and the only 4.4 for #7 is with #13, so Patricia is #13. The last one to find is Philip, who is #11, given the 4.2 score with Patricia.

So now we have found the following identities from the small score matrix:
Hakan: #4
John: #7
Patricia: #13
Philip: #11
Shiva: #17
And let us compare these findings with my matchings.
  • Hakan: #4. This seems to be correct. But the match to #7 is not Philip (he is #11).
  • John: #7. Now, there are two Johns, but neither of them was attributed to #7 in my matching, so this is a plain #fail.
  • Patricia: #13. My matching paired Patricia with John Poppelaars, and this could be correct. This means that the "other John" is #15 (according to my matching).
  • Philip: #11. Well, this is the same #fail as for Hakan.
  • Shiva: #17. This seems to be correct.
Well, my method of identifying the persons with the matching seems to be quite wrong, but it was fun trying.

Back to the model and some more details. The kernel of the model is really just the following code, including some symmetry breaking constraints.

% decision variables:
% the matchings
array[1..m, 1..2] of var 1..n: x;
% the scores for the found matching
array[1..m] of var 0..max_score: s;

constraint
alldifferent([x[i,j] | i in 1..m, j in 1..2]) :: domain
;

constraint
forall(i in 1..m) (
s[i] = scores[x[i,1], x[i,2]]

% symmetry breaking
/\ x[i,1] <= x[i,2]
)
;

% another symmetry breaking
constraint decreasing(s);

I optimized this model for Gecode/fz using these two changes, found by some experimentation:
  • branching is on the score array s instead of the matching array x
  • alldifferent has the :: domain annotation. Without this, it takes much longer.
These two make Gecode solve the problem quite fast: Gecode/fz: 0.16s, 191 failures

However, for other solvers branching on the score array s is not as good, so here are the statistics for branching on the array x instead, using the variable/value heuristics first_fail/indomain_median, which seem to give the best overall result for most solvers.
  • Gecode/fz: 4.5s, 98469 failures
  • G12/LazyFD: 2.16s
  • fzn2smt: 6.0s
  • fzntini: 16:02 minutes (I'm not sure why this is so hard for fzntini)
  • G12/fd: 10.127s, 189346 choice points
  • G12/minizinc: 10.465s, 189346 choice points
  • G12/fdmip: 10.226s, 189346 choice points
  • JaCoP/fz: 10.358s, 192957 wrong decisions
  • SICStus/fz: 18.419s, 362122 backtracks
  • ECLiPSe/ic: 48.313s
  • ECLiPSe/fd: 19.557s
Update 2011-03-08
Julien Fischer and Sebastian Brand (of the G12 team) were kind enough to point out that there is a better way of identifying the leftout:
constraint
   if n mod 2 != 0 then
     % This version was suggested by 
     % Sebastian Brand and Julien Fischer.
     leftout > 0 /\
     forall(j in 1..m, k in 1..2) (
        leftout != x[j,k]
     )
   else 
     leftout = 0
   endif;
Below are the statistics using this version instead. Julien also pointed out that G12/fd and G12/minizinc are really the same solver, and that - for this problem - G12/fdmip doesn't make use of the MIP component, so it's the same as G12/fd. I have therefore skipped these two in this test. The branching is (still) first_fail/indomain_median.
    * Gecode/fz: 3.3s, 98469 failures
    * G12/LazyFD: 1.5s
    * fzn2smt: (I stopped after 10 minutes.)
    * fzntini: 5:52 minutes
    * G12/fd: 6.4s, 107178 choice points
    * JaCoP/fz: 5.4s, 192975 wrong decisions
    * SICStus/fz: 7.5s, 362122 backtracks
    * ECLiPSe/ic: 35.5s
    * ECLiPSe/fd: 11.3s    
This change improved the times for most solvers, quite a lot for some. It's strange that fzn2smt was much slower, and I don't understand why. For fzn2smt I also tested different things, e.g. removing the :: domain annotation on alldifferent, but it made no difference.

Testing indomain_random
I was also curious how some solvers would do with the heuristic indomain_random, so I ran each solver three times and took the best result. Some solvers were "stable" and got exactly the same statistics every time (Gecode, G12/fd, SICStus, ECLiPSe/ic), but some got quite varying results, and for these all three results are shown (JaCoP and ECLiPSe/fd).
    * Gecode/fz: 1.7, 27422 failures (stable)
    * G12/fd: 19.0s, 1401656 choice points (stable)
    * JaCoP/fz: 2.3s, 22178 wrong decisions
        8.5s 164599 wrong decisions
        4.1s 50575 wrong decisions
        2.3s 22178 wrong decisions
    * SICStus/fz: 3.3s, 116386 backtracks (stable)
    * ECLiPSe/ic: 34.5s (stable)
    * ECLiPSe/fd: 5.6s
       7.41s
       5.6s
      26.7s
Almost all of the solvers got at least one result better than before, except G12/fd, which is surprising.
End of Update 2011-03-08


2011-03-12 Yet another update
After some interesting discussions with Sebastian Brand (thank you, Sebastian!) I have included some improvements.
1) Since the calculation of the leftout doesn't contribute much to the constraint store, it has been removed as a decision variable and is instead just calculated in the output section:

output [
let {
int: leftout = max(0..n diff { fix(x[j,k]) | j in 1..m, k in 1..2 })
}
in
% ...
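The set-difference idea translates directly to other languages; a Python sketch with hypothetical toy data:

```python
# leftout = max(0..n diff {used persons}); 0 when everyone is matched.
# Hypothetical toy instance: n = 5 persons, two pairs.
n = 5
pairs = [(1, 2), (3, 4)]
used = {p for pair in pairs for p in pair}
leftout = max(set(range(n + 1)) - used)
```

Including 0 in the range means the expression is well-defined even when everyone is matched (the difference is then {0}, so leftout = 0).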

2) The symmetry breaking constraint to sort the matchings in score order
  decreasing(s)
has been replaced with another symmetry-breaking constraint which seems to be faster. Unfortunately, the matchings are no longer sorted by score, but since it is faster for all solvers I prefer it.
  forall(i in 2..m)( x[i-1,1] < x[i,1] )
3) The branching has been changed to first_fail/indomain_split

With these three changes there are now improvements for most solvers; for ECLiPSe's solvers a radical improvement (compare with the listing above):

- G12/lazy: 0.89s
- Gecode: 0.10s and 649 failures
Note: without the :: domain annotation on alldifferent it's slightly worse: 0.16s and 1898 failures.
That difference has been much larger in earlier versions.
- JaCoP: 1.4s: 3253 wrong search decisions
- SICStus: 0.29s 3291 backtracks
- Choco: 1.8s 5928 backtracks
- ECLiPSe/ic: 1.6s
- ECLiPSe/fd: 0.94s
- fzn2smt is still very slow (I contacted the fzn2smt developers about this)
- tini: 4:44min (slightly better, but still very slow)


4) Sebastian also suggested that I should test a table constraint. That is, instead of
s[i] = scores[x[i,1], x[i,2]]

we use this:
table([x[i,1], x[i,2], s[i]], scores_table)

This also requires another representation of the score matrix:

scores_table = array2d(1..n*n, 1..3,
[
1, 1, 00,
1, 2, 30,
1, 3, 10,
1, 4, 33,
1, 5, 49,
% ....
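The conversion from matrix form to table form is mechanical. A Python sketch of the flattening, with a hypothetical 3x3 matrix (not the challenge data):

```python
# Flatten an n x n score matrix into (person1, person2, score)
# rows for a table constraint. Hypothetical 3x3 matrix.
n = 3
scores = [
    [0, 30, 10],
    [30, 0, 49],
    [10, 49, 0],
]

scores_table = [(i + 1, j + 1, scores[i][j])
                for i in range(n) for j in range(n)]
```

This produces the n*n rows of three columns that the table constraint expects.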

Here is the result of using this table approach (together with the previous changes):

- G12/fd: 2.0s, 708 choice points (slower)
- G12/lazy: 2.442s (slower)
- Gecode: 0.11s 686 failures (about the same)
- JaCoP: 1.2s 785 wrong search decisions (slightly better)
- SICStus: 0.43s, 746 backtracks (better in backtracks, but slightly slower)
- Choco: 1.9s: 2299 backtracks (better in backtracks, but slightly slower)
- ECLiPSe/ic: 2.8s (slower)
- ECLiPSe/fd: 1.4s (slower)
- fzn2smt: 11.1s (and SMT is back in the game again!)
Note: Using the decreasing/1 constraint instead, it takes 7.0s.
- tini: stopped after 10 minutes, so it's slower than the previous version.

To sum up: most solvers were a tiny bit slower with this table approach, and some had slightly fewer choice points/failures/etc. Maybe this (slight) increase in time for some solvers is caused by the slightly larger file that mzn2fzn has to process?

The big winner here is SMT, but for this problem it's still slower than most other solvers.

I've put the improvements in the same model as before or_matching.mzn (the older model is now in or_matching_orig.mzn).

Also, thanks to Krzysztof Kuchcinski for an interesting discussion about this.

End of update 2011-03-12

A related seating problem

This matching problem inspired me also to model a related problem using the score matrix from the matching model:
When all these persons meet, what is the best way to place them around a (topologically circular) table if the objective is to get the highest possible compatibility score over all pairs of adjacently seated persons?
Another MiniZinc model (or_seating.mzn) gives an optimal value of 996 score points, in about 1.5 seconds (with 17629 failures) using Gecode.

There are some possible ways of approaching this problem, and I took the TSP way, using a global circuit constraint. This model uses my decomposition of a circuit constraint which, given the score matrix, gives an optimal circuit. However, what we really want is the path (the seating), so we have to extract the path. After a while I realized that the temporary array z used in the circuit_me decomposition is exactly this path, so it's enough to expose it as a parameter of the predicate. A consequence of this is that all seatings end with person #1, which shouldn't really be a problem since the path is a permutation.

predicate circuit_me(array[int] of var int: x, array[int] of var int: z) =
let {
int: lbx = min(index_set(x)),
int: ubx = max(index_set(x))
}
in
all_different(x) :: domain /\
all_different(z) :: domain /\

% put the orbit of x[1] in z[lbx..ubx]
z[lbx] = x[lbx] /\
forall(i in lbx+1..ubx) (
z[i] = x[z[i-1]]
)
/\ % may not be 1 for i < n
forall(i in lbx..ubx-1) (
z[i] != lbx
)
/\ % the last element must be lbx
z[ubx] = lbx
;

Here are the two optimal solutions found by this model (with the implicit symmetry breaking that #1 is always in the last position). Note that the second solution is just the first reversed.

9 10 18 8 16 13 15 12 3 17 19 2 11 6 14 5 7 4 1
4 7 5 14 6 11 2 19 17 3 12 15 13 16 8 18 10 9 1

For the systems that support the circuit/1 constraint (right now only Gecode and JaCoP do), there is a variant: one can use circuit to get the circuit and then extract the path (seating) with this "extractor" predicate, where the solution always starts with #1.

predicate circuit_path(array[int] of var int: x,
array[int] of var int: p) =
% we always starts the path at 1
p[1] = 1
/\
forall(i in 2..n) (
p[i] = x[p[i-1]]
);
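The path extraction is just repeatedly following successors from person #1. A Python sketch of the same idea, on a hypothetical successor array:

```python
def circuit_path(succ):
    """Follow the successor array of a circuit (1-based nodes),
    starting at node 1, until we are back at 1; the visited
    order is the seating path."""
    path = [1]
    while succ[path[-1] - 1] != 1:
        path.append(succ[path[-1] - 1])
    return path

# hypothetical circuit on 5 nodes: 1 -> 3 -> 5 -> 2 -> 4 -> 1
assert circuit_path([3, 4, 5, 1, 2]) == [1, 3, 5, 2, 4]
```

In the CP model this walk is expressed declaratively (p[i] = x[p[i-1]]), but the effect is the same.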

Then we get these two optimal solutions, which are just rotations of the two above:

1 9 10 18 8 16 13 15 12 3 17 19 2 11 6 14 5 7 4
1 4 7 5 14 6 11 2 19 17 3 12 15 13 16 8 18 10 9

Connection to the matching problem:
Comparing these seating placements with the matching above, they seem consistent: e.g. #9 and #1 sit together, as do #10 and #18, #2 and #19, etc.

Some other seating problems

For some reason I am fascinated by seating problems, and I'm not alone in this. John Toczek's PuzzlOr problem Best Host this February was a seating problem (the objective is to minimize seating conflicts, given that some persons would like to sit together) that I solved using a rather simple MiniZinc model. However, I will not link to the model, since the contest (to win an "O.R. The Science of Better" T-shirt) is still open. And let us not forget Karen Petrie's wedding planning problem: How a computer scientist organised her wedding.

Here are two (classical) combinatorial seating problems, the formulation taken from Fei Dai, Mohit Dhawan, Changfu Wu and Jie Wu: On Solutions to a Seating Problem, page 1:

Seating, row
MiniZinc model seating_row.mzn
n couples are seated in one row. Each person, except two seated at the two ends, has two neighbors. It is required that each neighbor of person k should either have the same gender or be the spouse of k.
Seating, table:
MiniZinc model seating_table.mzn
A classical [...] seating problem is the following: n couples are seated around a table in such a way that two neighbors of each person k are the different gender of k and they are not the spouse of k.
Both these problems are about constraints on couples and gender. There are different ways of representing who is in a couple when using "anonymous persons", i.e. using just integers to represent the couples and individuals. Here are two variants (I included both in the seating_row model):

1) Using math:
predicate is_couple1(int: n, var int: a, var int: b) =
% exists(i in 1..n) (
exists(i in max(lb(a), lb(b))..min(ub(a), ub(b))) (
(2*(i-1)+1 = a /\ 2*(i-1)+2 = b)
\/
(2*(i-1)+1 = b /\ 2*(i-1)+2 = a)
);

2) Using set representation, i.e. an array of the pairs:
% define the couple
array[1..n] of set of int: couples = [{2*(i-1)+1, 2*(i-1)+2} | i in 1..n];

predicate is_couple(int: n, var int: a, var int: b, array[int] of set of int: c) =
exists(j in 1..n) (
a in c[j] /\ b in c[j]
);

The second approach, using sets, is in my opinion the nicest, and it's also easier to use if one has a concrete list of couples.

The gender is simply represented by modulo 2.
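To make the encodings concrete, here is a small Python sketch of the couple representation and the modulo-2 gender (hypothetical n = 3, i.e. persons 1..6):

```python
# couple i consists of persons 2*(i-1)+1 and 2*(i-1)+2 (1-based);
# gender is the person number modulo 2.
n = 3
couples = [{2 * (i - 1) + 1, 2 * (i - 1) + 2} for i in range(1, n + 1)]

def is_couple(a, b):
    """Set-based variant: a and b belong to the same couple."""
    return any(a in c and b in c for c in couples)

def same_gender(a, b):
    return a % 2 == b % 2
```

The arithmetic variant (is_couple1 in the model) tests the same thing with 2*(i-1)+1 / 2*(i-1)+2 equations instead of set membership.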

The paper "On Solutions to a Seating Problem" (cited above) studies "reasonable patterns":
A reasonable pattern is an assignment of n couples without indices that meet the above requirement. For example, when n = 3 all reasonable patterns are as follows:

wwwhhh whhwwh whhhww hhwwwh
hhhwww hwwhhw hwwwhh wwhhhw
However, I am more interested in the concrete seating placements, so it's not quite the same problem. For n=3 there is a total of 60 solutions. Here are all 8 different patterns among the solutions for 3 couples (n=3); the first number indicates the number of occurrences of each pattern.
12 gender2:hhhwww
 6 gender2:hhwwwh
 6 gender2:hwwhhw
 6 gender2:hwwwhh
 6 gender2:whhhww
 6 gender2:whhwwh
 6 gender2:wwhhhw
12 gender2:wwwhhh
I tried to generate exactly one solution of each pattern but had no luck. A model with the following simple symmetry breaking generates 20 solutions: fewer than 60, but more than exactly 8 patterns.

constraint (x[1] = 1 \/ x[1] = 2);

Statistics for Seating, row problem
Here are the numbers of solutions (using no symmetry breaking) for some different numbers of couples (n). The solver statistics below are for Gecode/fz where just the last solution is printed (it took about 2.5 minutes to print all solutions for n=6; without printing, it took just 15.7 seconds).

n #solutions
-------------
1 2 (0.001s, 0 failures)
2 8 (0.001s, 5 failures)
3 60 (0.01s, 73 failures)
4 816 (0.02s, 1181 failures)
5 17520 (0.45s, 29872 failures)
6 550080 (15.7s, 1051417 failures)
7 23839200 (12:36min, 49685576 failures)
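The counts for small n can be double-checked by brute force. A Python sketch, using the same encoding as the models (persons 2i-1 and 2i form couple i; gender is the person number mod 2):

```python
from itertools import permutations

def count_row_seatings(n):
    """Count row seatings of n couples where every adjacent pair
    is either same-gender or married. Persons 2i-1 and 2i form
    couple i; gender is the person number mod 2."""
    def ok(a, b):
        return a % 2 == b % 2 or (a - 1) // 2 == (b - 1) // 2
    return sum(
        all(ok(p[i], p[i + 1]) for i in range(2 * n - 1))
        for p in permutations(range(1, 2 * n + 1))
    )
```

This enumerates all (2n)! permutations, so it is only feasible for very small n, but it reproduces the first few counts in the table.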

This sequence of numbers seems to correspond to the OEIS integer sequence A096121, which is described as: A rook must land on each square exactly once, but may start and end anywhere and may intersect its own path. Inspired by Leroy Quet in a Jul 05 2004 posting to the Seqfan mailing list.

Well, I'm not sure if there is actually a deep connection between these two problems or if it is just a freaky coincidence...

Statistics for Seating, table problem
For the seating table problem with 3 couples (n=3) there are 12 solutions (two different patterns). Here are the numbers of solutions for different n.

n #solutions
-------------
1 0 (0s, 0 failures)
2 0 (0s, 4 failures)
3 12 (0s, 0 failures)
4 96 (0.01s, 36 failures)
5 3120 (0.04s, 340 failures)
6 115200 (0.9s, 4436 failures)
7 5836320 (43.4s, 146938 failures)
8 382072320 (47:28 minutes, 6593012 failures)
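The same brute-force check works for the circular table variant; a Python sketch (again using persons 2i-1 and 2i as couple i, gender mod 2):

```python
from itertools import permutations

def count_table_seatings(n):
    """Count circular seatings of n couples where both neighbors
    of each person have the opposite gender and are not the spouse
    (the menage condition). Persons 2i-1 and 2i form couple i."""
    def ok(a, b):
        return a % 2 != b % 2 and (a - 1) // 2 != (b - 1) // 2
    m = 2 * n
    return sum(
        all(ok(p[i], p[(i + 1) % m]) for i in range(m))
        for p in permutations(range(1, m + 1))
    )
```

The counts agree with 2*n!*A000179(n): 0, 0, 12, 96, ... for n = 1, 2, 3, 4.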

The related OEIS sequence is A059375 (total menage numbers, 2*n!*A000179(n)); that sequence starts with 1, which is the value for n=0, but n=0 makes little sense in our current setting. This problem is also known as the Menage problem (Wikipedia), so the correspondence between these numbers is not surprising at all. Also see Wolfram MathWorld: Married Couples Problem.

February 28, 2009

Comet: regular constraint, a much faster Nonogram with the regular constraint, some OPL models, and more

Since last time, some more Comet models have been written.

Much faster Nonogram model using the regular constraint

In More Comet models, e.g. Nonogram, Steiner triplets, and different set covering problems I presented nonogram.co, a solver for Nonogram puzzles, and also noted that it was quite slow: Well, it is nice to have some more optimization to do (or more probably, a complete remodelling)....

Inspired by this week's announcement of the ECLiPSe example of a Nonogram solver using the regular constraint (see nono_regular.ecl.txt and regular.ecl.txt) - and also by my earlier Gecode/R model nonogram.rb, which used "regular expressions" - I created a Nonogram model in Comet using the regular constraint: nonogram_regular.co.

Let us first look at the regular constraint.


Regular constraint
The regular constraint (see the Comet model regular.co for my implementation) is a global constraint using a DFA (deterministic finite automaton) which accepts (or rejects) an input sequence, given a "transition matrix". The constraint was presented by Gilles Pesant in "A Regular Language Membership Constraint for Finite Sequences of Variables" (2004).

My implementation of the regular constraint is heavily borrowed from MiniZinc's builtin regular predicate (from lib/zinc/globals.mzn); the translation to Comet was very straightforward (omitting just some asserts).

An example of the usage of the constraint: we want to match the regular expression "123*21" (i.e. first "1", then "2", then zero or more "3", then "2", and last a "1"). Note: the length of the sequence is 10, so there must be six "3":s, since "123" and "21" are "anchored" at the beginning and the end.

int len = 10;
int n_states = 5; // number of states
int input_max = 3; // the inputs are 1, 2, and 3
int initial_state = 1; // we start with the 1 state
set{int} accepting_states = {n_states}; // This is the last state
// The transition matrix
int transition_fn[1..n_states, 1..input_max] =
[[2, 0, 0], // transitions from state 1: "1" -> state 2
[0, 3, 0], // transitions from state 2: "2" -> state 3
[0, 4, 3], // transitions from state 3: "2" -> state 4 | "3" -> state 3 (loop)
[5, 0, 0], // transitions from state 4: "1" -> state 5
[0, 0, 0]]; // transitions from state 5: END state
...
exploreall {
regular(reg_input, n_states, input_max, transition_fn, initial_state, accepting_states);
} using {
// ....
}

The unique sequence of length 10 accepted by this automaton is thus 1 2 3 3 3 3 3 3 2 1.
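The semantics of the transition matrix can be illustrated with a small DFA simulator in Python (a sketch, not the Comet implementation):

```python
def regular_accepts(seq, transition, initial, accepting):
    """Run a DFA given as a 1-based transition matrix where
    state 0 means failure, mirroring the regular constraint's
    conventions."""
    state = initial
    for v in seq:
        state = transition[state - 1][v - 1]
        if state == 0:
            return False
    return state in accepting

# the transition matrix for "123*21" from the post
transition_fn = [
    [2, 0, 0],  # state 1: "1" -> state 2
    [0, 3, 0],  # state 2: "2" -> state 3
    [0, 4, 3],  # state 3: "2" -> state 4 | "3" -> state 3 (loop)
    [5, 0, 0],  # state 4: "1" -> state 5
    [0, 0, 0],  # state 5: END state
]
```

In the constraint version, the solver propagates over the same recurrence instead of simply executing it.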

To use regular in the Nonogram problem, the automaton for each row/column clue must be built, preferably by a function. In regular.co this is done with the function make_transition_matrix, which was by far the hardest part of this problem (and surely could be written in a smoother way).

For the Nonogram clue [3,2,1] - which represents the regular expression "0*1110*110*10*" - the following automaton (transition matrix) is generated:
1 2
0 3
0 4
5 0
5 6
0 7
8 0
8 9
9 0


Note that the regular function uses 0 (zero) as the failing state, so the states must start with 1. This is taken care of in the model: the resulting value is subtracted by 1.
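To illustrate what such a generator has to do, here is a Python sketch I wrote for this clue encoding (it is not the Comet code in regular.co, but it reproduces the matrix above for the clue [3,2,1]; column 1 is input "0", column 2 is input "1", and 0 is the failure state):

```python
def make_transition_matrix(clue):
    """Build a DFA transition matrix for a nonogram clue, i.e.
    for the pattern 0* 1^c1 0+ 1^c2 0+ ... 1^ck 0*.
    States are 1-based; 0 is the failure state."""
    t = []
    state = 1
    for k, c in enumerate(clue):
        # gap state: loop on 0, start the block on 1
        t.append([state, state + 1])
        state += 1
        # c-1 states inside the block: only 1 is allowed
        for _ in range(c - 1):
            t.append([0, state + 1])
            state += 1
        # after a block (except the last): a 0 is mandatory
        if k < len(clue) - 1:
            t.append([state + 1, 0])
            state += 1
    # final state: loop on 0; this is the accepting state
    t.append([state, 0])
    return t
```

The accepting state is simply the last state of the returned matrix.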

As usual in my models, the regular constraint is just a "convenience function", i.e. it does not use specially written propagation methods etc.

The regular constraint has - of course - more applications than for Nonogram solving. I plan to look more into this.


The Nonogram solver: nonogram_regular.co
After the regular constraint and the automaton generator were written, it was quite easy to change the old Nonogram solver to use these new tools. The result is in nonogram_regular.co. I was quite curious how fast this version would be compared to the former slow model. In short: it is much faster.

As a comparison, the Lambda instance took about 12 seconds with the old model; with the new model it takes 0.5 seconds, which is a nice improvement. I never finished a run of the Nonunique problem with the older model; the new model takes 0.5 seconds (including the startup of the Comet program). Etc.

The P200 problem nonogram_p200.co now takes 2:05 minutes, which can be compared with 57 seconds for the Gecode/R model. Thus, the Comet model is still slow compared to the Gecode version and the ECLiPSe version, which solve the P200 problem in just a second or two. Maybe some creative labeling or a "proper" regular constraint could speed things up...

Update some hours later: one way to gain between about 30 seconds and 1:30 minutes on the P200 problem was to explicitly state the consistency level onDomains, e.g. cp.post(a[m] == q0, onDomains), and to use another labeling strategy:
forall(i in 1..rows, j in 1..cols : !board[i,j].bound()) {
// label(board[i,j]); // the former strategy
tryall(v in 1..2) m.post(board[i,j] == v, onDomains);
}


Some more Nonogram problems have been coded.

An aside about regular expressions
I have been very interested in regular expressions (especially the more expressive Perl type) for a long time. In 1997 I wrote a simple Perl program MakeRegex which returns a regular expression given a list of words. It was later Appletized in MakeRegexApplet. Now there are better programs/modules for this.

OPL Models

One of my projects is to translate the OPL models from Pascal Van Hentenryck's "The OPL Optimization Programming Language" into Comet. Even if OPL and Comet are different languages, reading the book has been very rewarding. Thanks again, Pascal!

Some OPL models have already been published, but now I've been more systematic and started from the beginning. More models to come.

Finding: arrays in a tuple
In Comet: New models, e.g. Einstein puzzle, KenKen, Kakuro, Killer Sudoku, Stigler's Diet problem, translations of OPL models I wrote:

One big difference here: in Comet it is not allowed to have an array in a tuple, so the use data must be a separate matrix.

I've since found a way of using arrays in a tuple: first initialize the values in a matrix and then - using the all function - "slice" the values into the tuple array. This has been done in the models fixed_charge2.co and production3.co.

Example from fixed_charge2.co:

tuple productType {
int profit;
set{Machines} machines;
int[] use;
}
int use[Products, Resources] = [[3,4],[2,3], [6,4]];
productType Product[Products] =
[productType(6, {shirtM}, all(i in Resources) use[shirt,i]),
productType(4, {shortM}, all(i in Resources) use[shorts,i]),
productType(7, {pantM}, all(i in Resources) use[pants,i])];

Combining different models

The alphametic problem SEND + MOST = MONEY has the additional requirement to maximize the value of MONEY. The older model send_most_money.co does that and nothing more.

One natural extension of the problem is the following:
* first find the maximum value of MONEY
* then find all the solutions with this value.

The model send_most_money2.co has two functions:
* smm, which just states the constraints of the alphametic problem
* send_most_money, which takes a parameter money.

send_most_money is first called with 0, indicating that it should maximize the value of MONEY, and then returns that value. The next call to send_most_money is with the calculated MONEY value, which indicates that all solutions with this value should be generated.
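The "maximize first, then enumerate all optima" idea can be illustrated with a brute-force Python sketch (not the Comet model):

```python
from itertools import permutations

def send_most_money():
    """SEND + MOST = MONEY by brute force: collect all solutions,
    then keep only those attaining the maximum MONEY."""
    solutions = []
    # assign distinct digits to S, E, N, D, M, O, T, Y
    for s, e, n, d, m, o, t, y in permutations(range(10), 8):
        if s == 0 or m == 0:  # no leading zeros
            continue
        send = 1000 * s + 100 * e + 10 * n + d
        most = 1000 * m + 100 * o + 10 * s + t
        money = 10000 * m + 1000 * o + 100 * n + 10 * e + y
        if send + most == money:
            solutions.append((money, (s, e, n, d, m, o, t, y)))
    best = max(money for money, _ in solutions)
    return best, sorted(sol for money, sol in solutions if money == best)

best, sols = send_most_money()
```

A CP solver does this far more cleverly, of course; the sketch just shows the two phases of the search.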

The answer is

check all solutions for MONEY = 10876
x[9,7,8,2,1,0,4,6] MONEY: 10876
x[9,7,8,4,1,0,2,6] MONEY: 10876


Jim Orlin's Logic Puzzle

In Colored letters, labeled dice: a logic puzzle Jim Orlin stated a logic puzzle:
My daughter Jenn bought a puzzle book, and showed me a cute puzzle. There are 13 words as follows: BUOY, CAVE, CELT, FLUB, FORK, HEMP, JUDY, JUNK, LIMN, QUIP, SWAG, VISA, WISH.

There are 24 different letters that appear in the 13 words. The question is: can one assign the 24 letters to 4 different cubes so that the four letters of each word appears on different cubes. (There is one letter from each word on each cube.) It might be fun for you to try it. I’ll give a small hint at the end of this post. The puzzle was created by Humphrey Dudley.

...

If anyone wants to write an Excel spreadsheet and solve it via integer programming, please let me know. I’d be happy to post the Excel spreadsheet if you send it to me, or link to it if you post it and send me the URL.

This was a fun puzzle, so I modeled the problem in labeled_dice.co, and mailed Jim the link.

Some days later he wrote Update on Logic Puzzle, where the contributed models were presented. There was another Comet model, WordPuzzleSchaus.co, by Pierre Schaus, one of Comet's developers. Pierre's model uses a different and more elegant approach than mine.

Jim also linked to some other logic puzzles from the great collection brownbuffalo.sourceforge.net (which has solutions in ECLiPSe/Prolog). One of these puzzles was Building Blocks, in the same vein as his labeled dice puzzle. Hence I had to make a Comet model of this problem too: building_blocks.co.

He concludes the post with the following:

Incidentally, the ease for solving it using Constraint Programming makes me think that Constraint Programming should be considered a fundamental tool in the OR toolkit.


Other Comet models this week

Here are the other Comet models created/published this week. Some of them were very easy to do since they are translations of my MiniZinc models.

February 15, 2009

More Comet models, e.g. Nonogram, Steiner triplets, and different set covering problems

During the week some more Comet models have been published. Many of them are translations of my MiniZinc models.

Findings this week

Here are some findings for the previous week.

tryall again: Nonogram

One of the models I really wanted to make was Nonogram, since it would be interesting to compare with the MiniZinc model nonogram.mzn and the Gecode/R model nonogram.rb. The Comet model is nonogram.co.

The first version was very influenced by the MiniZinc model, and used tryall where the MiniZinc model used exists (see the previous post About 20 more constraint programming models in Comet for some more comments about this). But this was way too slow.

After some thought, I rewrote the tryall:s and the model got faster. For a simple Nonogram problem like The Hen (see picture below), it originally took about 30 seconds. After the changes, it now takes a fraction of a second (with a lot fewer propagations).

Nonogram The hen
. X X X . . . .
X X . X . . . .
. X X X . . X X
. . X X . . X X
. . X X X X X X
X . X X X X X .
X X X X X X . .
. . . . X . . .
. . . X X . . .

Still, compared to the Gecode/R (and - of course - Gecode) models, which use a "regular expression" representation, the Comet model is slow, but it is faster than the MiniZinc version. For example, the Lambda picture (see below) takes about 12 seconds with the Comet model, compared to 0.5 seconds with the Gecode/R model. Well, it is nice to have some more optimization to do (or more probably, a complete remodelling)...

Nonogram: Lambda problem
. X X . . . . . . .
X . X X . . . . . .
X . . X . . . . . .
. . . X X . . . . .
. . . . X . . . . .
. . . X X X . . . .
. . . X X X . . . .
. . X X . X X . . .
. . X X . . X . . .
. X X . . . X X . X
. X X . . . . X X X
X X . . . . . X X .

To summarize, the main finding here was to replace two tryall statements with two decision variables, and to change the rest accordingly. The model also includes two symmetry-breaking constraints ("alldifferent except 0" and an ordering of a temporary array), but these make no big difference.

binary representation of set variables

As far as I know, Comet doesn't (yet) have set variables as decision variables. Instead, another representation must be used, and probably the most natural one is an array of binary decision variables representing each possible value of the set.

The following models use this representation:
* set_partition.co: partitions a range of integers 1..n into two equal-sized sets where the sums and the sums of squares of the two sets are equal.
* steiner_triplets.co: the well-known problem of Steiner triples.
* partition_function.co: partitions values according to a function of some kind.

Models

These models are available via My Comet page. As usual, the problems are presented in the header of each file.