Get the cartesian product of a series of lists?

Each Answer to this Q is separated by one/two green lines.

How can I get the Cartesian product (every possible combination of values) from a group of lists?

Input:

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

Desired output:

[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), (2, 'a', 5) ...]

itertools.product

Available from Python 2.6.

import itertools

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]
for element in itertools.product(*somelists):
    print(element)

Which is the same as,

for element in itertools.product([1, 2, 3], ['a', 'b'], [4, 5]):
    print(element)

import itertools
>>> for i in itertools.product([1,2,3],['a','b'],[4,5]):
...         print i
...
(1, 'a', 4)
(1, 'a', 5)
(1, 'b', 4)
(1, 'b', 5)
(2, 'a', 4)
(2, 'a', 5)
(2, 'b', 4)
(2, 'b', 5)
(3, 'a', 4)
(3, 'a', 5)
(3, 'b', 4)
(3, 'b', 5)
>>>

For Python 2.5 and older:

>>> [(a, b, c) for a in [1,2,3] for b in ['a','b'] for c in [4,5]]
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), 
 (2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5), 
 (3, 'b', 4), (3, 'b', 5)]

Here’s a recursive version of product() (just an illustration):

def product(*args):
    if not args:
        return iter(((),)) # yield tuple()
    return (items + (item,) 
            for items in product(*args[:-1]) for item in args[-1])

Example:

>>> list(product([1,2,3], ['a','b'], [4,5])) 
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), 
 (2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5), 
 (3, 'b', 4), (3, 'b', 5)]
>>> list(product([1,2,3]))
[(1,), (2,), (3,)]
>>> list(product([]))
[]
>>> list(product())
[()]

I would use list comprehension :

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

cart_prod = [(a,b,c) for a in somelists[0] for b in somelists[1] for c in somelists[2]]

with itertools.product:

import itertools
result = list(itertools.product(*somelists))

Here is a recursive generator, which doesn’t store any temporary lists

def product(ar_list):
    if not ar_list:
        yield ()
    else:
        for a in ar_list[0]:
            for prod in product(ar_list[1:]):
                yield (a,)+prod

print list(product([[1,2],[3,4],[5,6]]))

Output:

[(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)]

In Python 2.6 and above you can use ‘itertools.product`. In older versions of Python you can use the following (almost — see documentation) equivalent code from the documentation, at least as a starting point:

def product(*args, **kwds):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    pools = map(tuple, args) * kwds.get('repeat', 1)
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)

The result of both is an iterator, so if you really need a list for furthert processing, use list(result).

Although there are many answers already, I would like to share some of my thoughts:

Iterative approach

def cartesian_iterative(pools):
  result = [[]]
  for pool in pools:
    result = [x+[y] for x in result for y in pool]
  return result

Recursive Approach

def cartesian_recursive(pools):
  if len(pools) > 2:
    pools[0] = product(pools[0], pools[1])
    del pools[1]
    return cartesian_recursive(pools)
  else:
    pools[0] = product(pools[0], pools[1])
    del pools[1]
    return pools
def product(x, y):
  return [xx + [yy] if isinstance(xx, list) else [xx] + [yy] for xx in x for yy in y]

Lambda Approach

def cartesian_reduct(pools):
  return reduce(lambda x,y: product(x,y) , pools)

Recursive Approach:

def rec_cart(start, array, partial, results):
  if len(partial) == len(array):
    results.append(partial)
    return 

  for element in array[start]:
    rec_cart(start+1, array, partial+[element], results)

rec_res = []
some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]  
rec_cart(0, some_lists, [], rec_res)
print(rec_res)

Iterative Approach:

def itr_cart(array):
  results = [[]]
  for i in range(len(array)):
    temp = []
    for res in results:
      for element in array[i]:
        temp.append(res+[element])
    results = temp

  return results

some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]  
itr_res = itr_cart(some_lists)
print(itr_res)

A minor modification to the above recursive generator solution in variadic flavor:

def product_args(*args):
    if args:
        for a in args[0]:
            for prod in product_args(*args[1:]) if args[1:] else ((),):
                yield (a,) + prod

And of course a wrapper which makes it work exactly the same as that solution:

def product2(ar_list):
    """
    >>> list(product(()))
    [()]
    >>> list(product2(()))
    []
    """
    return product_args(*ar_list)

with one trade-off: it checks if recursion should break upon each outer loop, and one gain: no yield upon empty call, e.g.product(()), which I suppose would be semantically more correct (see the doctest).

Regarding list comprehension: the mathematical definition applies to an arbitrary number of arguments, while list comprehension could only deal with a known number of them.

Just to add a bit to what has already been said: if you use sympy, you can use symbols rather than strings which makes them mathematically useful.

import itertools
import sympy

x, y = sympy.symbols('x y')

somelist = [[x,y], [1,2,3], [4,5]]
somelist2 = [[1,2], [1,2,3], [4,5]]

for element in itertools.product(*somelist):
  print element

About sympy.

I believe this works:

def cartesian_product(L):  
   if L:
       return {(a,) + b for a in L[0] 
                        for b in cartesian_product(L[1:])}
   else:
       return {()}

This can be done a

[(x, y) for x in range(10) for y in range(10)]

another variable? No problem:

[(x, y, z) for x in range(10) for y in range(10) for z in range(10)]

The following code is a 95 % copy from Using numpy to build an array of all combinations of two arrays, all credits go there! This is said to be much faster since it is only in numpy.

import numpy as np

def cartesian(arrays, dtype=None, out=None):
    arrays = [np.asarray(x) for x in arrays]
    if dtype is None:
        dtype = arrays[0].dtype
    n = np.prod([x.size for x in arrays])
    if out is None:
        out = np.zeros([n, len(arrays)], dtype=dtype)

    m = int(n / arrays[0].size) 
    out[:,0] = np.repeat(arrays[0], m)
    if arrays[1:]:
        cartesian(arrays[1:], out=out[0:m, 1:])
        for j in range(1, arrays[0].size):
            out[j*m:(j+1)*m, 1:] = out[0:m, 1:]
    return out

You need to define the dtype as a parameter if you do not want to take the dtype from the first entry for all entries. Take dtype=”object” if you have letters and numbers as items. Test:

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

[tuple(x) for x in cartesian(somelists, 'object')]

Out:

[(1, 'a', 4),
 (1, 'a', 5),
 (1, 'b', 4),
 (1, 'b', 5),
 (2, 'a', 4),
 (2, 'a', 5),
 (2, 'b', 4),
 (2, 'b', 5),
 (3, 'a', 4),
 (3, 'a', 5),
 (3, 'b', 4),
 (3, 'b', 5)]

List comprehension is simple and clean:

import itertools

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]
lst = [i for i in itertools.product(*somelists)]

You can use itertools.product in the standard library to get the cartesian product. Other cool, related utilities in itertools include permutations, combinations, and combinations_with_replacement. Here is a link to a python codepen for the snippet below:

from itertools import product

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

result = list(product(*somelists))
print(result)

With early rejection:

def my_product(pools: List[List[Any]], rules: Dict[Any, List[Any]], forbidden: List[Any]) -> Iterator[Tuple[Any]]:
    """
    Compute the cartesian product except it rejects some combinations based on provided rules
    
    :param pools: the values to calculate the Cartesian product on 
    :param rules: a dict specifying which values each value is incompatible with
    :param forbidden: values that are never authorized in the combinations
    :return: the cartesian product
    """
    if not pools:
        return

    included = set()

    # if an element has an entry of 0, it's acceptable, if greater than 0, it's rejected, cannot be negative
    incompatibles = defaultdict(int)
    for value in forbidden:
        incompatibles[value] += 1
    selections = [-1] * len(pools)
    pool_idx = 0

    def current_value():
        return pools[pool_idx][selections[pool_idx]]

    while True:
        # Discard incompatibilities from value from previous iteration on same pool
        if selections[pool_idx] >= 0:
            for value in rules[current_value()]:
                incompatibles[value] -= 1
            included.discard(current_value())

        # Try to get to next value of same pool
        if selections[pool_idx] != len(pools[pool_idx]) - 1:
            selections[pool_idx] += 1
        # Get to previous pool if current is exhausted
        elif pool_idx != 0:
            selections[pool_idx] = - 1
            pool_idx -= 1
            continue
        # Done if first pool is exhausted
        else:
            break

        # Add incompatibilities of newly added value
        for value in rules[current_value()]:
            incompatibles[value] += 1
        included.add(current_value())

        # Skip value if incompatible
        if incompatibles[current_value()] or \
                any(intersection in included for intersection in rules[current_value()]):
            continue

        # Submit combination if we're at last pool
        if pools[pool_idx] == pools[-1]:
            yield tuple(pool[selection] for pool, selection in zip(pools, selections))
        # Else get to next pool
        else:
            pool_idx += 1

I had a case where I had to fetch the first result of a very big Cartesian product. And it would take ages despite I only wanted one item. The problem was that it had to iterate through many unwanted results before finding a correct one because of the order of the results. So if I had 10 lists of 50 elements and the first element of the two first lists were incompatible, it had to iterate through the Cartesian product of the last 8 lists despite that they would all get rejected.

This implementation enables to test a result before it includes one item from each list. So when I check that an element is incompatible with the already included elements from the previous lists, I immediately go to the next element of the current list rather than iterating through all products of the following lists.


The answers/resolutions are collected from stackoverflow, are licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0 .

Leave a Reply

Your email address will not be published.