I am trying to use parameter substitution with SQLite within Python for an IN clause. Here is a complete running example that demonstrates:

import sqlite3

c = sqlite3.connect(":memory:")
c.execute('CREATE TABLE distro (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)')

for name in 'Ubuntu Fedora Puppy DSL SuSE'.split():
  c.execute('INSERT INTO distro (name) VALUES (?)', [ name ] )

desired_ids = ["1", "2", "5", "47"]
result_set = c.execute('SELECT * FROM distro WHERE id IN (%s)' % (", ".join(desired_ids)), ())
for result in result_set:
  print result

It prints out:

(1, u’Ubuntu’)
(2, u’Fedora’)
(5, u’SuSE’)

As the docs state that “[y]ou shouldn’t assemble your query using Python’s string operations because doing so is insecure; it makes your program vulnerable to an SQL injection attack,” I am hoping to use parameter substitution.

When I try:

result_set = c.execute('SELECT * FROM distro WHERE id IN (?)', [ (", ".join(desired_ids)) ])

I get an empty result set, and when I try:

result_set = c.execute('SELECT * FROM distro WHERE id IN (?)', [ desired_ids ] )

I get:

InterfaceError: Error binding parameter 0 – probably unsupported type.

While I hope that any answer to this simplified problem will work, I would like to point out that the actual query I want to perform is in a doubly-nested subquery. To wit:

UPDATE dir_x_user SET user_revision = user_attempted_revision 
WHERE user_id IN 
    (SELECT user_id FROM 
        (SELECT user_id, MAX(revision) FROM users WHERE obfuscated_name IN 
            ("Argl883", "Manf496", "Mook657") GROUP BY user_id
        ) 
    )

You do need the right number of ?s, but that doesn’t pose a sql injection risk:

>>> result_set = c.execute('SELECT * FROM distro WHERE id IN (%s)' %
                           ','.join('?'*len(desired_ids)), desired_ids)
>>> print result_set.fetchall()
[(1, u'Ubuntu'), (2, u'Fedora'), (5, u'SuSE')]

According to http://www.sqlite.org/limits.html (item 9), SQLite can’t (by default) handle more than 999 parameters to a query, so the solutions here (generating the required list of placeholders) will fail if you have thousands of items that you’re looking IN. If that’s the case, you’re going to need to break up the list then loop over the parts of it and join up the results yourself.

If you don’t need thousands of items in your IN clause, then Alex’s solution is the way to do it (and appears to be how Django does it).

Update: this works:

import sqlite3

c = sqlite3.connect(":memory:")
c.execute('CREATE TABLE distro (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)')

for name in 'Ubuntu Fedora Puppy DSL SuSE'.split():
  c.execute('INSERT INTO distro (name) VALUES (?)', ( name,) )

desired_ids = ["1", "2", "5", "47"]
result_set = c.execute('SELECT * FROM distro WHERE id IN (%s)' % ("?," * len(desired_ids))[:-1], desired_ids)
for result in result_set:
  print result

The issue was that you need to have one ? for each element in the input list.

The statement ("?," * len(desired_ids))[:-1] makes a repeating string of “?,”, then cuts off the last comma. so that there is one question mark for each element in desired_ids.

I always end up doing something like this:

query = 'SELECT * FROM distro WHERE id IN (%s)' % ','.join('?' for i in desired_ids)
c.execute(query, desired_ids)

There’s no injection risk because you’re not putting strings from desired_ids into the query directly.

I needed to use some other named parameters so I’ve developed two helper functions that may be worth sharing.

def prepare_list_query(name, values):
    """Prepare SQLite query with named parameters."""
    list_query = ", ".join(":{}_{}".format(name, i) for i in range(len(values)))
    return list_query


def prepare_list_dict(name, values):
    """Prepare SQLite dict with named parameters."""
    list_dict = {"{}_{}".format(name, i): value for i, value in enumerate(values)}
    return list_dict

# Usage:

desired_ids = ["1", "2", "5", "47"]
desired_types = ["active", "inactive"]

sql = "SELECT * FROM distro WHERE id IN ({}) AND type IN ({})".format(
    prepare_list_query("desired_id", desired_ids),
    prepare_list_query("desired_type", desired_types),
)
sql_dict = {"some": "other parameters you might need"}
sql_dict.update(prepare_list_dict("desired_id", desired_ids))
sql_dict.update(prepare_list_dict("desired_type", desired_types))

# # This results in:
# sql = "SELECT * FROM distro WHERE id IN (:desired_id_0, :desired_id_1, :desired_id_2, :desired_id_3) AND type IN (:desired_type_0, :desired_type_1)

# # and
# sql_dict = {
#     "some": "other parameters you might need",
#     "desired_id_0": "1",
#     "desired_id_1": "2",
#     "desired_id_2": "5",
#     "desired_id_3": "47",
#     "desired_type_0": "active",
#     "desired_type_1": "inactive",
# }

I wrote a very thin layer called notanorm to address this issue.

https://pypi.org/project/notanorm/

…and then your code looks like this:

import notanorm

c = notanorm.SqliteDb(":memory:")
c.query('CREATE TABLE distro (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)')

for name in 'Ubuntu Fedora Puppy DSL SuSE'.split():
  c.insert('distro', name=name)

desired_ids = ["1", "2", "5", "47"]
result_set = c.select('distro', id=desired_ids)

for result in result_set:
  print(result)

Which is both easy to read, and also allows you to switch databases some day.

{'id': 1, 'name': 'Ubuntu'}
{'id': 2, 'name': 'Fedora'}
{'id': 5, 'name': 'SuSE'}

Also consider looking at the heavier framework: https://www.sqlalchemy.org/

In case sqlite has problem with the length of sql request the indefinite number of question marks can be some kind of way to beak things.