« Some new Google CP Solver/Python models | Main | Comparison of some Nonogram solvers: Google CP Solver vs Gecode and two MiniZinc solvers »

Google CP Solver: Regular constraint, Nonogram solver, nurse rostering, etc

Models mentioned: After some sweat and confusions, I have now implemented a Nonogram solver in Google CP Solver using a (decomposition of) regular constraint. The model is here: nonogram_regular.py. Also, a test model for the regular constraint is in regular.py

The hardest part by far was to translate the regular constraint from Comet to Google CP solver. The Comet version (nonogram_regular.co) was described in a sequence of blog posts: As we will see below, other problems where implemented by the regular constraint.

Update: Laurent Perron suggested an improvement of the regular constraint, which makes some of the models much faster. I have commented this below.

Translation of regular

As mentioned above, the hardest part was the translation of the regular constraint from Comet (from nonogram_regular.co) to Google CP solver (the Comet version was a translated from the MiniZinc version). I started with the simpler example regular.py) which basically consists of two parts also included in the nonogram model:

make_transition_matrix

The function translates an array of numbers to an finite state automaton which representing and regular expression of 0's and 1's. E.g. the array [3,2,1] represents the regular expression 0*1{3}0+1{2}0+1{1}0* and is translated into the automaton (the valid states) which must start in state 1 and end in state 9 (0 is a failing state):
1 2
0 3
0 4
5 0
5 6
0 7
8 0
8 9
9 0
The hardest part here was that the Comet constraint is 1-based, but Python is 0-based. There are many things that can - and will - go wrong when translating from a 1-based system to a 0-base system. And it don't help that state 0 is used as a fail state. regular constraint

Then, given the automaton (the valid states), the regular constraint transforms the states into an array of decision variables. The domains are 1..2 (representing the values of 0 and 1).

Update Laurent Perron suggested an improvement using the table constraint (called solver.AllowedAssignments in Google CP Solver). I have changed to this version here.

This was his first improvement (last in the method):
#
# Global constraint regular
#
# This is a translation of MiniZinc's regular constraint (defined in
# lib/zinc/globals.mzn), via the Comet code refered above.
# All comments are from the MiniZinc code.
# '''
# The sequence of values in array 'x' (which must all be in the range 1..S)
# is accepted by the DFA of 'Q' states with input 1..S and transition
# function 'd' (which maps (1..Q, 1..S) -> 0..Q)) and initial state 'q0'
# (which must be in 1..Q) and accepting states 'F' (which all must be in
# 1..Q).  We reserve state 0 to be an always failing state.
# '''
#
# x : IntVar array
# Q : number of states
# S : input_max
# d : transition matrix
# q0: initial state
# F : accepting states
def regular(x, Q, S, d, q0, F):

  solver = x[0].solver()

  assert Q > 0, 'regular: "Q" must be greater than zero'
  assert S > 0, 'regular: "S" must be greater than zero'

  # d2 is the same as d, except we add one extra transition for
  # each possible input;  each extra transition is from state zero
  # to state zero.  This allows us to continue even if we hit a
  # non-accepted input.

  d2 = []
  for i in range(Q + 1):
    for j in range(S):
      if i == 0:
        d2.append((0, j, 0))
      else:
        d2.append((i, j, d[i - 1][j]))

  # If x has index set m..n, then a[m-1] holds the initial state
  # (q0), and a[i+1] holds the state we're in after processing
  # x[i].  If a[n] is in F, then we succeed (ie. accept the
  # string).
  x_range = range(0, len(x))
  m = 0
  n = len(x)

  a = [solver.IntVar(0, Q + 1, 'a[%i]' % i) for i in range(m, n + 1)]

    # Check that the final state is in F
  solver.Add(solver.MemberCt(a[-1], F))
    # First state is q0
  solver.Add(a[m] == q0)
  for i in x_range:
    solver.Add(x[i] >= 1)
    solver.Add(x[i] <= S)
    # Determine a[i+1]: a[i+1] == d2[a[i], x[i]]
    # Laurent Perron suggested using this: 
    solver.Add(solver.AllowedAssignments((a[i], x[i] - 1, a[i + 1]), d2))
Update 2: Some hour later, Laurent Perron added C++ support for simplifying this by adding solver.TransitionConstraint, so the definition of regular is now much neater:
def regular(x, Q, S, d, q0, F):

  solver = x[0].solver()

  assert Q > 0, 'regular: "Q" must be greater than zero'
  assert S > 0, 'regular: "S" must be greater than zero'

  # d2 is the same as d, except we add one extra transition for
  # each possible input;  each extra transition is from state zero
  # to state zero.  This allows us to continue even if we hit a
  # non-accepted input.

  d2 = []
  for i in range(Q + 1):
    for j in range(1, S + 1):
      if i == 0:
        d2.append((0, j, 0))
      else:
        d2.append((i, j, d[i - 1][j - 1]))

  solver.Add(solver.TransitionConstraint(x, d2, q0, F))
This improvement is in the model: regular_table2.py and the Nonogram model: nonogram_table2.py

Nonogram

After some more problems was fixed it was really a simple matter of translating the rest of the Nonogram model (nonogram_regular.py).

Labeling: I has the same labeling, or as approximate as possible, as in the Comet model:
# This labeling was inspired by a suggestion from
# Pascal Van Hentenryck about my Comet nonogram model.
board_label = []
if rows * row_rule_len < cols * col_rule_len:
    for i in range(rows):
        for j in range(cols):
            board_label.append(board[i,j])
else:
    for j in range(cols):        
        for i in range(rows):
            board_label.append(board[i,j])
Update: Here are Laurent Perron's versions with the improved regular: nonogram_table.py and nonogram_table2.py

Testing and problem instances for Nonogram

I have compared two problem instances, p200 (nonogram_p200.py) and p199 (nonogram_p199.py), with some of my other Nonogram solver: the Comet model nonogram_regular.co (with my decomposition of regular), the MiniZinc model nonogram_create_automaton2.mzn with Gecode/fz, using Gecode's built in regular.

Update: I have added Laurent Perron's version using table (solver.AllowedAssignments) as an separate entry for comparison:

Problem Google CP Solver Google CP Solver with table Comet Gecode/fz
P200 1.8s/11615 failures 0.2s/520 failures 0.4s/794 failures 0.2s/520 failures
P199 17.607s/104239 failures 0.3s/772 failures 0.6s/1175 failures 0.1s/772 failures

The result for P200 is not too bad for Google CP Solver, but I am a little surprised by the large times/failures for the P199, almost 18 seconds. However, one could note that using a plain ordering of the variables for this problem instance, i.e.
board_label = [board[i,j] for i in range(rows) for j in range(cols)]
then P199 is solved in 0.4 seconds with 2005 failures. However, with this ordering, P200 takes very long time.

Update: It's interesting to see that the improved table version has exactly the same number of failures as the version using Gecode/fz.
Update 2:The second improvement of Laurent Perron didn't change the number of failures, but the time was slightly improved with a couple of ms's: 169ms vs 209ms for P200, and 264ms vs 296ms for P199.

Thanks for the improvements, Laurent!

Here are some other problem instances:

Nurse rostering

Model: nurse_rostering.py

A standard application of regular constraint is in different types of scheduling, e.g. nurse rostering. The model nurse_rostering.py is a small example based on the model in the MiniZinc Tutorial (PDF), page 35f (the graph of the DFA is shown at page 37).

The example is for 7 nurses and 14 days using three states: day shift, night shift, and day off. There are two kind of constraints:

* Rules using the regular constraint:
  • - one day off every 4 days
  • - no 3 nights in a row.
These are translated as the following transition function:
transition_fn =  [
   # d,n,o
    [2,3,1], # state 1
    [4,4,1], # state 2
    [4,5,1], # state 3 
    [6,6,1], # state 4
    [6,0,1], # state 5
    [0,0,1]  # state 6
    ]
Note: This transition function can be used for any number of nurses and days. * Other rules
Here I went wild and required some extra rules for this specific number of nurses and days:
  • Each nurse must work between 7 and 10 days during the period (14 days)
  • On weekends there must be 2 nurses on day shift, 1 on night shift, and 4 has day off
  • On working days: 3 nurses on day shift, 2 on night shift, and 2 has day off
Below is one solution, including statistics for each nurse and day, respectively:
Nurse0:  d d d o d d d o d d d o d o  day_stat: [('d', 10), ('o', 4)] total: 10 workdays
Nurse1:  d d d o d d d o d d d o d o  day_stat: [('d', 10), ('o', 4)] total: 10 workdays
Nurse2:  d d o d d n o d d d o d n o  day_stat: [('d', 8), ('o', 4), ('n', 2)] total: 10 workdays
Nurse3:  n n o d n o n d n o d d o d  day_stat: [('d', 5), ('o', 4), ('n', 5)] total: 10 workdays
Nurse4:  n o d n n o o d n n o n o o  day_stat: [('d', 2), ('o', 6), ('n', 6)] total: 8 workdays
Nurse5:  o n n d o o o n o o n n o d  day_stat: [('d', 2), ('o', 7), ('n', 5)] total: 7 workdays
Nurse6:  o o n n o o o n o n n d o n  day_stat: [('d', 1), ('o', 7), ('n', 6)] total: 7 workdays

Statistics per day:
        d n o
Day 0:  3 2 2
Day 1:  3 2 2
Day 2:  3 2 2
Day 3:  3 2 2
Day 4:  3 2 2
Day 5:  2 1 4
Day 6:  2 1 4
Day 7:  3 2 2
Day 8:  3 2 2
Day 9:  3 2 2
Day10:  3 2 2
Day11:  3 2 2
Day12:  2 1 4
Day13:  2 1 4

Solving 3 jugs problem using regular

Model: 3_jugs_regular.py
Model: 3_jugs_regular2.py

This is a model solving the classic 3 jugs problem (a.k.a. water jugs problem), based on Taha's network flow model (Introduction to Operations Research, page 245f) but using regular instead. Compare, for example, with the Comet model 3_jugs.co using a network flow approach.

The different possible states in the problem is represented as nodes, and we start at state 1 (node 1). The name of the nodes represent the amount of water in each water jug; e.g. 8,0,0 means that 8 liter/gallon is in the first jug, and 0 in both jug 2 and 3. The goal is then to go from state 8,0,0 to 4,4,0. Here are all the nodes:
nodes = [
    '8,0,0', # 1 start
    '5,0,3', # 2
    '5,3,0', # 3 
    '2,3,3', # 4 
    '2,5,1', # 5
    '7,0,1', # 6
    '7,1,0', # 7
    '4,1,3', # 8
    '3,5,0', # 9
    '3,2,3', # 10
    '6,2,0', # 11
    '6,0,2', # 12
    '1,5,2', # 13
    '1,4,3', # 14
    '4,4,0'  # 15 goal
    ]
The valid transitions between these states are as follows.
states = [
    [2,9],  # state 1
    [3],    # state 2
    [4, 9], # state 3
    [5],    # state 4
    [6,9],  # state 5
    [7],    # state 6
    [8,9],  # state 7
    [15],   # state 8
    [10],   # state 9
    [11],   # state 10
    [12],   # state 11
    [13],   # state 12
    [14],   # state 13
    [15]    # state 14
    ]
And from this adjacency lists we can build the DFA:
transition_fn =  [
   # 1  2  3  4  5  6  7  8  9  0  1  2  3  4  5 
    [0, 2, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 0, 0, 0],  # 1
    [0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # 2
    [0, 0, 0, 4, 0, 0, 0, 0, 9, 0, 0, 0, 0, 0, 0],  # 3
    [0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # 4
    [0, 0, 0, 0, 0, 6, 0, 0, 9, 0, 0, 0, 0, 0, 0],  # 5
    [0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0],  # 6
    [0, 0, 0, 0, 0, 0, 0, 8, 9, 0, 0, 0, 0, 0, 0],  # 7
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 15], # 8
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 0, 0, 0, 0, 0], # 9
    [0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 11, 0, 0, 0, 0], # 10
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 12, 0, 0, 0], # 11
    [0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 13, 0, 0], # 12
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 14, 0], # 13
    [0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 15], # 14
                                                    # 15
    ]
As you may see this is very much like the representation in the network flow model, but the connections is a state number instead of just 0/1 (connected/not connected).

Running this program gives the following solution where x contains the chosen states, and then the nodes of the states are shown.
x: [1, 9, 10, 11, 12, 13, 14, 15]
8,0,0 -> 3,5,0
3,5,0 -> 3,2,3
3,2,3 -> 6,2,0
6,2,0 -> 6,0,2
6,0,2 -> 1,5,2
1,5,2 -> 1,4,3
1,4,3 -> 4,4,0

num_solutions: 1
failures: 1
branches: 2
wall_time: 0 ms

Found a solution of length 8: [1, 9, 10, 11, 12, 13, 14, 15]
Note: This model don't have any explicit Minimization objective to minimize the number of states (which is implied in the problem). Instead it calls the main function with the length of the array x, ranging from length 1 to 15, and thus find the first, minimal, solution of length 8.

3 jugs problem, alternative solution

Model: 3_jugs_regular2.py

However, I was not satisfied with this calling of main repeatedly. Instead an alternative solution was created which use an extra (last) state which may be looped into (15->15->15...):
# ....
[15],   # state 14
[15]    # state 15: loop ("fix-point")
With this state we can fix the length of the array of decision variables (x) to 15, the number of states, and add an objective variable cost which is then minimized. This variable counts the number of elements in x that is not in the final state (state 15):
cost = solver.IntVar(0, input_max, 'cost')
# ...
sum_b = [solver.IsLessCstVar(x[i], input_max-1) for i in range(input_max)]
solver.Add(cost == solver.Sum(sum_b))

objective = solver.Minimize(cost, 1)

Global contiguity

Model: contiguity_regular.py

The definition of the global constraint global_contiguity from Global Constraint Catalog is:
Enforce all variables of the VARIABLES collection to be assigned to 0 or 1. In addition, all variables assigned to value 1 appear contiguously.

Example:
(<0, 1, 1, 0>)

The global_contiguity constraint holds since the sequence 0 1 1 0 contains no more than one group of contiguous 1.
The regular expression for this is ^0*1*0*$ (or maybe ^0*1+0*$ if there is an assumption that there must be at least one 1). The transition function is thus:
transition_fn =  [
    [1,2], # state 1: 0*
    [3,2], # state 2: 1* 
    [3,0], # state 3: 0* 
    ]
For an array of length 4 there are 11 solutions:
[0, 0, 0, 0]
[0, 0, 0, 1]
[0, 0, 1, 0]
[0, 0, 1, 1]
[0, 1, 0, 0]
[0, 1, 1, 0]
[0, 1, 1, 1]
[1, 0, 0, 0]
[1, 1, 0, 0]
[1, 1, 1, 0]
[1, 1, 1, 1]

num_solutions: 11
failures: 0
branches: 20
wall_time: 0 ms
For an array of length 100 there are 5051 solutions, still with 0 failures (and 10100 branches). It takes about 4 seconds to generate and print all solutions; without printing it takes 0.2 seconds.

Regular expressions

Model: regexp.py

The history behind this is that my last name (Kjellerstrand) is quite often misspelled, in ways according to this regular expression:
k(je|ä)ll(er|ar)?(st|b)r?an?d
Note: In Swedish the two constructs "kje(ll)" and "kä(ll)" sounds alike.

This model generates all the words that can be construed by this regular expression. As with the other models, we create a DFA. Note that the states are groups of characters (maybe it should be just single characters):
transition_fn =  [
   # 1 2 3 4 5 6 7 8 9 0 1 2     # 
    [0,2,3,0,0,0,0,0,0,0,0,0],   #  1 k 
    [0,0,0,4,0,0,0,0,0,0,0,0],   #  2 je
    [0,0,0,4,0,0,0,0,0,0,0,0],   #  3 ä
    [0,0,0,0,5,6,7,8,0,0,0,0],   #  4 ll
    [0,0,0,0,0,0,7,8,0,0,0,0],   #  5 er
    [0,0,0,0,0,0,7,8,0,0,0,0],   #  6 ar
    [0,0,0,0,0,0,0,0,9,10,0,0],  #  7 st 
    [0,0,0,0,0,0,0,0,9,10,0,0],  #  8 b
    [0,0,0,0,0,0,0,0,0,10,0,0],  #  9 r
    [0,0,0,0,0,0,0,0,0,0,11,12], # 10 a
    [0,0,0,0,0,0,0,0,0,0,0,12],  # 11 n
                                 # 12 d 
    ]

s = ['k','je','ä','ll','er','ar','st','b','r','a','n','d']
Running the program generates all the 48 variants of my last name. Though, I have to confess that I have never been called "Kjellbad", "Källbad" or "Kjellbrad". Yet.
kjellstad
kjellbad
källstad
källbad
kjellerstad
kjellerbad
kjellarstad
kjellarbad
kjellstrad
kjellstand
kjellbrad
kjellband
källerstad
källerbad
källarstad
källarbad
källstrad
källstand
källbrad
källband
kjellerstrad
kjellerstand
kjellerbrad
kjellerband
kjellarstrad
kjellarstand
kjellarbrad
kjellarband
kjellstrand
kjellbrand
källerstrad
källerstand
källerbrad
källerband
källarstrad
källarstand
källarbrad
källarband
källstrand
källbrand
kjellerstrand
kjellerbrand
kjellarstrand
kjellarbrand
källerstrand
källerbrand
källarstrand
källarbrand
Compare with the Gecode model all_regexp.cpp, and was described in Regular expressions in Gecode. Here I also describe my fascination with regular expressions.