Splitting a string by list of indices


I want to split a string by a list of indices, where the split segments begin with one index and end before the next one.

Example:

s="long string that I want to split up"
indices = [0,5,12,17]
parts = [s[index:] for index in indices]
for part in parts:
    print(part)

This prints:

long string that I want to split up
string that I want to split up
that I want to split up
I want to split up

I’m trying to get:

long
string
that
I want to split up

Pair each index with the one that follows it, using None as the end of the last slice:

s = "long string that I want to split up"
indices = [0, 5, 12, 17]
parts = [s[i:j] for i, j in zip(indices, indices[1:] + [None])]

returns

['long ', 'string ', 'that ', 'I want to split up']

which you can print using:

print('\n'.join(parts))
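
Note that these slices keep the trailing space of each segment, whereas the output asked for in the question has none. A minimal Python 3 sketch of the same zip pairing with an optional strip step might look like this. The function name split_at and the strip flag are illustrative and not part of the original answer.

```python
def split_at(s, indices, strip=False):
    """Split s at the given start indices (hypothetical helper)."""
    # Pair each start with the next start; None lets the last slice run to the end.
    bounds = zip(indices, list(indices[1:]) + [None])
    parts = [s[i:j] for i, j in bounds]
    # Optionally drop surrounding whitespace from each segment.
    return [p.strip() for p in parts] if strip else parts


s = "long string that I want to split up"
indices = [0, 5, 12, 17]
print(split_at(s, indices))
print("\n".join(split_at(s, indices, strip=True)))
```

With strip=True the segments match the four lines the question was trying to get.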

Another possibility (without copying indices) would be:

s="long string that I want to split up"
indices = [0,5,12,17]
indices.append(None)
parts = [s[indices[i]:indices[i+1]] for i in range(len(indices)-1)]
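
The append call above modifies indices in place, so the caller's list ends up with a trailing None. If that is a concern, one possible variant computes the end bound on the fly instead. This is a Python 3 sketch, and split_no_copy is a hypothetical name.

```python
def split_no_copy(s, indices):
    """Slice s at indices without copying or modifying the index list."""
    n = len(indices)
    return [
        # The last segment has no following index, so its end bound is None.
        s[indices[k]:(indices[k + 1] if k + 1 < n else None)]
        for k in range(n)
    ]


s = "long string that I want to split up"
indices = [0, 5, 12, 17]
parts = split_no_copy(s, indices)
print(parts)
print(indices)  # the index list is left unchanged
```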

Here is a short solution that relies heavily on the itertools module. The tee function is used to iterate pairwise over the indices. See the Recipes section of the itertools documentation for more help.

>>> from itertools import tee, izip_longest
>>> s="long string that I want to split up"
>>> indices = [0,5,12,17]
>>> start, end = tee(indices)
>>> next(end)
0
>>> [s[i:j] for i,j in izip_longest(start, end)]
['long ', 'string ', 'that ', 'I want to split up']

Edit: The itertools version above does not copy the indices list, so it should be faster.
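
On Python 3, izip_longest is named zip_longest, so the session above needs a different import. The same tee-based pairing can be sketched as follows. This is an illustrative adaptation rather than the original answer's code, and the function name split_pairwise is made up for the example.

```python
from itertools import tee, zip_longest


def split_pairwise(s, indices):
    """Slice s between consecutive indices using two offset iterators."""
    start, end = tee(indices)
    # Advance the second iterator by one; the default avoids StopIteration on empty input.
    next(end, None)
    # zip_longest pads the shorter iterator with None, so the last slice is s[i:None].
    return [s[i:j] for i, j in zip_longest(start, end)]


s = "long string that I want to split up"
indices = [0, 5, 12, 17]
print(split_pairwise(s, indices))
```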

You can write a generator if you don’t want to make any modifications to the list of indices. Note that here the list holds only the interior split points, without the leading 0:

>>> def split_by_idx(S, list_of_indices):
...     left, right = 0, list_of_indices[0]
...     yield S[left:right]
...     left = right
...     for right in list_of_indices[1:]:
...         yield S[left:right]
...         left = right
...     yield S[left:]
...
>>> s="long string that I want to split up"
>>> indices = [5,12,17]
>>> list(split_by_idx(s, indices))
['long ', 'string ', 'that ', 'I want to split up']
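
The generator above raises IndexError when list_of_indices is empty, because it reads list_of_indices[0] unconditionally. A slightly simpler sketch that avoids that case, again assuming the list holds only interior split points, could look like this. The name split_by_points is hypothetical.

```python
def split_by_points(s, split_points):
    """Yield consecutive pieces of s, cutting at each interior split point."""
    left = 0
    for right in split_points:
        yield s[left:right]
        left = right
    # Whatever remains after the last split point (the whole string if there were none).
    yield s[left:]


s = "long string that I want to split up"
print(list(split_by_points(s, [5, 12, 17])))
print(list(split_by_points(s, [])))
```

Folding the first slice into the loop removes the special case and leaves the index list untouched.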


The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.