What’s the best way to split a string into fixed length chunks and work with them in Python?


I am reading in a line from a text file using:

   file = urllib2.urlopen("http://192.168.100.17/test.txt").read().splitlines()

and outputting it to an LCD display, which is 16 characters wide, in a telnetlib.write command. If a line is longer than 16 characters, I want to break it into 16-character sections and push each section out after a certain delay (e.g. 10 seconds). Once a line is complete, the code should move on to the next line of the input file and continue.

I’ve tried searching for various solutions and reading up on itertools etc., but my understanding of Python just isn’t sufficient to get anything to work without doing it in a very long-winded way, using a tangled mess of if/then/else statements that’s probably going to tie me in knots!

What’s the best way for me to do what I want?

One solution would be to use this function:

def chunkstring(string, length):
    return (string[0+i:length+i] for i in range(0, len(string), length))

This function returns a generator, built with a generator expression. Each element is a slice of the string that starts at a multiple of the chunk length (0 + i) and ends one chunk length later (length + i).

You can iterate over the generator like a list, tuple or string (for i in chunkstring(s, n):), or convert it into a list (for instance) with list(generator). Generators are more memory efficient than lists because they generate their elements as they are needed, not all at once; however, they lack certain features such as indexing.
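A small sketch of that trade-off, reusing the chunkstring definition from this answer. The variable names are only illustrative. It shows that a generator is used up after one pass and cannot be indexed, while a list built from it can:

```python
def chunkstring(string, length):
    # Same generator-based helper as in the answer above.
    return (string[0+i:length+i] for i in range(0, len(string), length))

gen = chunkstring("abcdefghij", 4)
first_pass = list(gen)    # consumes the generator
second_pass = list(gen)   # nothing is left to yield

try:
    chunkstring("abcdefghij", 4)[0]
    indexable = True
except TypeError:
    # Generator objects do not support indexing.
    indexable = False

# Converting to a list costs memory but gives indexing and reuse.
chunks = list(chunkstring("abcdefghij", 4))
last_chunk = chunks[-1]
```

If the chunks are only needed once, in order, the generator is enough; convert to a list only when you need to index or loop more than once.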

The generator also yields any shorter chunk left over at the end:

>>> list(chunkstring("abcdefghijklmnopqrstuvwxyz", 5))
['abcde', 'fghij', 'klmno', 'pqrst', 'uvwxy', 'z']

Example usage:

text = """This is the first line.
           This is the second line.
           The line below is true.
           The line above is false.
           A short line.
           A very very very very very very very very very long line.
           A self-referential line.
           The last line.
        """

lines = (i.strip() for i in text.splitlines())

for line in lines:
    for chunk in chunkstring(line, 16):
        print(chunk)
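To tie this back to the question, the loop above could be extended with a delay between chunks. This is only a sketch under assumptions: show_lines, send and sleep are hypothetical names, and send stands in for whatever actually writes to the display (for instance the write method of an already-connected telnetlib object).

```python
import time

def chunkstring(string, length):
    # Generator of fixed-length slices; the last one may be shorter.
    return (string[i:i + length] for i in range(0, len(string), length))

def show_lines(lines, send, width=16, delay=10, sleep=time.sleep):
    """Push each line out in width-sized chunks, pausing after each chunk.

    `send` is a hypothetical callable standing in for the display write.
    `sleep` is a parameter so the loop can be tried without really waiting.
    """
    for line in lines:
        # Empty lines produce no chunks, so they are skipped silently.
        for chunk in chunkstring(line.strip(), width):
            send(chunk)
            sleep(delay)

# Dry run: collect the chunks in a list instead of sending them anywhere.
sent = []
show_lines(["short", "A very very very very long line."],
           sent.append, sleep=lambda seconds: None)
```

With a real connection, sent.append would be replaced by the write call. On Python 3 that call expects bytes, so each chunk would need encoding first.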

My favorite way to solve this problem is with the re module.

import re

def chunkstring(string, length):
  return re.findall('.{%d}' % length, string)

One caveat here is that this pattern only matches chunks of exactly length characters, so any shorter remainder at the end is skipped. Note also that . does not match newline characters by default.
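If the remainder should be kept, one possible variant (not part of the original answer) is a bounded quantifier. The name chunkstring_keep_remainder is made up for this sketch:

```python
import re

def chunkstring_keep_remainder(string, length):
    # `{1,n}` is greedy: it takes full-length chunks first, then whatever
    # is left over. re.DOTALL lets `.` match newline characters as well.
    return re.findall('.{1,%d}' % length, string, re.DOTALL)

alphabet = "abcdefghijklmnopqrstuvwxyz"
exact_only = re.findall('.{5}', alphabet)               # drops the final 'z'
with_rest = chunkstring_keep_remainder(alphabet, 5)     # keeps it
```

Here exact_only has five chunks, while with_rest has six, the last being the single leftover character.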

However, if you’re parsing fixed-width data, this is a great way to do it.

For example, if I want to parse a block of text that I know is made up of 32-character records (like a header section), I find this very readable and see no need to generalize it into a separate function (as in chunkstring):

for header in re.findall('.{32}', header_data):
  ProcessHeader(header)

I know it’s an old question, but I’d like to add how to chop up a string into columns of variable length:

def chunkstring(string, lengths):
    return (string[pos:pos+length].strip()
            for idx,length in enumerate(lengths)
            for pos in [sum(map(int, lengths[:idx]))])

# `line` is one fixed-width record, e.g. a line read from the input file.
column_lengths = [10,19,13,11,7,7,15]
fields = list(chunkstring(line, column_lengths))
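The version above re-sums the preceding widths for every column. An alternative sketch, assuming Python 3 (for itertools.accumulate), computes the running offsets once. The name chunk_columns and the sample record are made up for illustration:

```python
from itertools import accumulate

def chunk_columns(string, lengths):
    # Running totals give the end offset of every column.
    ends = list(accumulate(lengths))
    # Each column starts where the previous one ended.
    starts = [0] + ends[:-1]
    return [string[start:end].strip() for start, end in zip(starts, ends)]

# Made-up record with column widths 4, 6 and 3.
widths = [4, 6, 3]
record = "ab".ljust(4) + "cd".ljust(6) + "ef".ljust(3)
fields = chunk_columns(record, widths)
```

Like the original, this strips the padding from each field; drop the strip() call if the whitespace is significant.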

I think this way is easier to read:

string = "when an unknown printer took a galley of type and scrambled it to make a type specimen book."
length = 20
list_of_strings = []
for i in range(0, len(string), length):
    list_of_strings.append(string[i:length+i])
print(list_of_strings)


The answers are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.