What does the * operator mean in Python, such as in code like zip(*x) or f(**k)?

  1. How is it handled internally in the interpreter?
  2. Does it affect performance at all? Is it fast or slow?
  3. When is it useful and when is it not?
  4. Should it be used in a function declaration or in a call?

The single star * unpacks the sequence/collection into positional arguments, so you can do this:

def sum(a, b):
    return a + b

values = (1, 2)

s = sum(*values)

This will unpack the tuple so that it actually executes as:

s = sum(1, 2)

The double star ** does the same, only using a dictionary and thus named arguments:

values = { 'a': 1, 'b': 2 }
s = sum(**values)

You can also combine:

def sum(a, b, c, d):
    return a + b + c + d

values1 = (1, 2)
values2 = { 'c': 10, 'd': 15 }
s = sum(*values1, **values2)

will execute as:

s = sum(1, 2, c=10, d=15)

Also see section 4.7.4 – Unpacking Argument Lists of the Python documentation.


Additionally you can define functions to take *x and **y arguments, this allows a function to accept any number of positional and/or named arguments that aren’t specifically named in the declaration.

Example:

def sum(*values):
    s = 0
    for v in values:
        s = s + v
    return s

s = sum(1, 2, 3, 4, 5)

or with **:

def get_a(**values):
    return values['a']

s = get_a(a=1, b=2)      # returns 1

this can allow you to specify a large number of optional parameters without having to declare them.

And again, you can combine:

def sum(*values, **options):
    s = 0
    for i in values:
        s = s + i
    if "neg" in options:
        if options["neg"]:
            s = -s
    return s

s = sum(1, 2, 3, 4, 5)            # returns 15
s = sum(1, 2, 3, 4, 5, neg=True)  # returns -15
s = sum(1, 2, 3, 4, 5, neg=False) # returns 15

One small point: these are not operators. Operators are used in expressions to create new values from existing values (1+2 becomes 3, for example. The * and ** here are part of the syntax of function declarations and calls.

I find this particularly useful for when you want to ‘store’ a function call.

For example, suppose I have some unit tests for a function ‘add’:

def add(a, b): return a + b
tests = { (1,4):5, (0, 0):0, (-1, 3):3 }
for test, result in tests.items():
    print 'test: adding', test, '==', result, '---', add(*test) == result

There is no other way to call add, other than manually doing something like add(test[0], test[1]), which is ugly. Also, if there are a variable number of variables, the code could get pretty ugly with all the if-statements you would need.

Another place this is useful is for defining Factory objects (objects that create objects for you).
Suppose you have some class Factory, that makes Car objects and returns them.
You could make it so that myFactory.make_car('red', 'bmw', '335ix') creates Car('red', 'bmw', '335ix'), then returns it.

def make_car(*args):
    return Car(*args)

This is also useful when you want to call a superclass’ constructor.

In a function call the single star turns a list into seperate arguments (e.g. zip(*x) is the same as zip(x1,x2,x3) if x=[x1,x2,x3]) and the double star turns a dictionary into seperate keyword arguments (e.g. f(**k) is the same as f(x=my_x, y=my_y) if k = {'x':my_x, 'y':my_y}.

In a function definition it’s the other way around: the single star turns an arbitrary number of arguments into a list, and the double start turns an arbitrary number of keyword arguments into a dictionary. E.g. def foo(*x) means “foo takes an arbitrary number of arguments and they will be accessible through the list x (i.e. if the user calls foo(1,2,3), x will be [1,2,3])” and def bar(**k) means “bar takes an arbitrary number of keyword arguments and they will be accessible through the dictionary k (i.e. if the user calls bar(x=42, y=23), k will be {'x': 42, 'y': 23})”.

It is called the extended call syntax. From the documentation:

If the syntax *expression appears in the function call, expression must evaluate to a sequence. Elements from this sequence are treated as if they were additional positional arguments; if there are positional arguments x1,…, xN, and expression evaluates to a sequence y1, …, yM, this is equivalent to a call with M+N positional arguments x1, …, xN, y1, …, yM.

and:

If the syntax **expression appears in the function call, expression must evaluate to a mapping, the contents of which are treated as additional keyword arguments. In the case of a keyword appearing in both expression and as an explicit keyword argument, a TypeError exception is raised.