I have a Python editor where the user enters a script, which is then wrapped in a main method behind the scenes, with every line indented. The problem is that if the user has a multi-line string, the indentation applied to the whole script also affects the string, inserting a tab at the start of its lines. A problem script can be as simple as:

"""foo
bar
foo2"""

So when in the main method it would look like:

def main():
    """foo
    bar
    foo2"""

and the string would now have an extra tab at the beginning of every line after the first.
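To make the problem concrete, here is a minimal sketch of the kind of wrapping described above. It assumes the editor indents with four spaces via textwrap.indent and runs the result with exec; the names user_script and wrapped are hypothetical, since the question does not show the real wrapper.

```python
import textwrap

# Hypothetical user input: a script containing one multi-line string.
user_script = 'text = """foo\nbar\nfoo2"""\n'

# Naive wrapping (an assumption): indent every line under "def main():".
wrapped = "def main():\n" + textwrap.indent(user_script, "    ") + "    return text\n"

namespace = {}
exec(wrapped, namespace)
result = namespace["main"]()

print(repr(result))  # 'foo\n    bar\n    foo2'
```

The continuation lines of the string pick up the four spaces that were only meant for the code.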

textwrap.dedent from the standard library removes the whitespace common to the start of every line, which automatically undoes this kind of wacky indentation.
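A minimal sketch of that call, assuming the string has already picked up four spaces on every line, including the first (the variable names are only for illustration):

```python
import textwrap

# The string as it might look once every line carries the extra indent.
indented = "    foo\n    bar\n    foo2"

cleaned = textwrap.dedent(indented)
print(repr(cleaned))  # 'foo\nbar\nfoo2'
```

dedent only removes whitespace shared by all non-blank lines, so a first line with no indentation leaves the rest untouched.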

From what I see, a better answer here might be inspect.cleandoc, which does much of what textwrap.dedent does but also fixes the problems that textwrap.dedent has with the leading line.

The below example shows the differences:

>>> import textwrap
>>> import inspect
>>> x = """foo bar
...     baz
...     foobar
...     foobaz
...     """
>>> inspect.cleandoc(x)
'foo bar\nbaz\nfoobar\nfoobaz'
>>> textwrap.dedent(x)
'foo bar\n    baz\n    foobar\n    foobaz\n'
>>> y = """
...     foo
...     bar
... """
>>> inspect.cleandoc(y)
'foo\nbar'
>>> textwrap.dedent(y)
'\nfoo\nbar\n'
>>> z = """\tfoo
... bar\tbaz
... """
>>> inspect.cleandoc(z)
'foo\nbar     baz'
>>> textwrap.dedent(z)
'\tfoo\nbar\tbaz\n'

Note that inspect.cleandoc also expands internal tabs to spaces.
This may be inappropriate for one’s use case, but works fine for me.
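If expanding tabs is a problem, one possible workaround is to leave the first line alone and apply textwrap.dedent only to the remaining lines. This is a sketch rather than a drop-in replacement for inspect.cleandoc: dedent_rest is a hypothetical helper name, and unlike cleandoc it keeps the trailing newline and does not strip the first line.

```python
import textwrap

def dedent_rest(s):
    """Dedent every line after the first; tabs are left untouched."""
    first, newline, rest = s.partition("\n")
    return first + newline + textwrap.dedent(rest)

x = "foo bar\n    baz\n    foobar\n    foobaz\n    "
z = "\tfoo\nbar\tbaz\n"

print(repr(dedent_rest(x)))  # 'foo bar\nbaz\nfoobar\nfoobaz\n'
print(repr(dedent_rest(z)))  # '\tfoo\nbar\tbaz\n'
```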

What follows the first line of a multiline string is part of the string, and not treated as indentation by the parser. You may freely write:

def main():
    """foo
bar
foo2"""
    pass

and it will do the right thing.

On the other hand, that’s not readable, and Python knows it. So if the lines after the first line of a docstring are indented, that common indentation is stripped off when you use help() to view the docstring. Thus, help(main) and help(main2) below produce the same help info.

def main2():
    """foo
    bar
    foo2"""
    pass
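One way to check this without reading help() output is inspect.getdoc, which returns the docstring after the inspect.cleandoc cleanup. A small sketch, using the same two functions:

```python
import inspect

def main():
    """foo
bar
foo2"""

def main2():
    """foo
    bar
    foo2"""

# getdoc() returns the docstring with common indentation removed.
doc1 = inspect.getdoc(main)
doc2 = inspect.getdoc(main2)

print(repr(doc1))
print(repr(doc2))
print(doc1 == doc2)
```

Comparing the raw __doc__ attributes instead may give different results depending on the Python version, so the sketch sticks to getdoc.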

Showing the difference between textwrap.dedent and inspect.cleandoc with a little more clarity:

Behavior with the leading part not indented

import textwrap
import inspect

string1="""String
with
no indentation
       """
string2="""String
        with
        indentation
       """
print('string1 plain=' + repr(string1))
print('string1 inspect.cleandoc=' + repr(inspect.cleandoc(string1)))
print('string1 textwrap.dedent=' + repr(textwrap.dedent(string1)))
print('string2 plain=' + repr(string2))
print('string2 inspect.cleandoc=' + repr(inspect.cleandoc(string2)))
print('string2 textwrap.dedent=' + repr(textwrap.dedent(string2)))

Output

string1 plain='String\nwith\nno indentation\n       '
string1 inspect.cleandoc='String\nwith\nno indentation\n       '
string1 textwrap.dedent='String\nwith\nno indentation\n'
string2 plain='String\n        with\n        indentation\n       '
string2 inspect.cleandoc='String\nwith\nindentation'
string2 textwrap.dedent='String\n        with\n        indentation\n'

Behavior with the leading part indented

string1="""
String
with
no indentation
       """
string2="""
        String
        with
        indentation
       """

print('string1 plain=' + repr(string1))
print('string1 inspect.cleandoc=' + repr(inspect.cleandoc(string1)))
print('string1 textwrap.dedent=' + repr(textwrap.dedent(string1)))
print('string2 plain=' + repr(string2))
print('string2 inspect.cleandoc=' + repr(inspect.cleandoc(string2)))
print('string2 textwrap.dedent=' + repr(textwrap.dedent(string2)))

Output

string1 plain='\nString\nwith\nno indentation\n       '
string1 inspect.cleandoc='String\nwith\nno indentation\n       '
string1 textwrap.dedent='\nString\nwith\nno indentation\n'
string2 plain='\n        String\n        with\n        indentation\n       '
string2 inspect.cleandoc='String\nwith\nindentation'
string2 textwrap.dedent='\nString\nwith\nindentation\n'

I wanted to preserve exactly what is between the triple-quote lines, removing only the common leading indent. I found that textwrap.dedent and inspect.cleandoc didn’t do it quite right, so I wrote this one. It uses os.path.commonprefix.

import re
from os.path import commonprefix

def ql(s, eol=True):
    lines = s.splitlines()
    l0 = None
    if lines:
        l0 = lines.pop(0) or None
    common = commonprefix(lines)
    indent = re.match(r'\s*', common)[0]
    n = len(indent)
    lines2 = [l[n:] for l in lines]
    if not eol and lines2 and not lines2[-1]:
        lines2.pop()
    if l0 is not None:
        lines2.insert(0, l0)
    s2 = "\n".join(lines2)
    return s2

This can quote any string with any indent. I wanted it to include the trailing newline by default, but with an option to remove it so that it can quote any string neatly.

Example:

print(ql(r"""
     Hello
    |\---/|
    | o_o |
     \_^_/
    """))

print(ql(r"""
         World
        |\---/|
        | o_o |
         \_^_/
    """))

In the second string only 4 spaces are removed, because the final """ is indented less than the quoted text, so the text keeps 4 spaces of indentation:

 Hello
|\---/|
| o_o |
 \_^_/

     World
    |\---/|
    | o_o |
     \_^_/

I thought this was going to be simpler, otherwise I wouldn’t have bothered with it!

The only way I see is to strip the first n tabs from each line, starting with the second, where n is the known indentation of the main method.

If that indentation is not known beforehand, you can add a trailing newline before inserting the script and then strip from every line the number of tabs found on the last line.

A third solution is to parse the script, find where a multi-line string begins, and skip adding your indentation to every line after that until the string is closed.

I think there must be a better solution, though.
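The third idea (do not indent lines that lie inside a multi-line string) could be sketched with the standard tokenize module. This is only an illustration under assumptions: indent_script and the exec-based wrapper are hypothetical names, the prefix is four spaces, and on Python 3.12 and later multi-line f-strings are, as far as I know, tokenized differently and would need extra handling.

```python
import io
import tokenize

def indent_script(source, prefix="    "):
    """Indent source, leaving continuation lines of multi-line strings alone."""
    inside_string = set()  # 1-based numbers of lines that continue a string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING and tok.end[0] > tok.start[0]:
            inside_string.update(range(tok.start[0] + 1, tok.end[0] + 1))

    out = []
    for lineno, line in enumerate(io.StringIO(source), start=1):
        if lineno in inside_string or not line.strip():
            out.append(line)           # string continuation or blank line
        else:
            out.append(prefix + line)  # ordinary code line
    return "".join(out)

# Hypothetical user input and wrapper, for illustration only.
user_script = 'text = """foo\nbar\nfoo2"""\n'
wrapped = "def main():\n" + indent_script(user_script) + "    return text\n"

namespace = {}
exec(wrapped, namespace)
result = namespace["main"]()
print(repr(result))  # 'foo\nbar\nfoo2'
```

Because the tokenizer decides where strings start and end, triple quotes inside comments or other strings do not confuse it.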

I had a similar issue: I wanted my triple quoted string to be indented, but I didn’t want the string to have all those spaces at the beginning of each line. I used re to correct my issue:

        print(re.sub('\n *','\n', f"""Content-Type: multipart/mixed; boundary="===============9004758485092194316=="
            MIME-Version: 1.0
            Subject: Get the reader's attention here!
            To: [email protected]

            --===============9004758485092194316==
            Content-Type: text/html; charset="us-ascii"
            MIME-Version: 1.0
            Content-Transfer-Encoding: 7bit

            Very important message goes here - you can even use <b>HTML</b>.
            --===============9004758485092194316==--
        """))

Above, I was able to keep my code indented, but the string was left trimmed essentially. All spaces at the beginning of each line were deleted. This was important since any spaces in front of the SMTP or MIME specific lines would break the email message.

The tradeoff I made was that I left the Content-Type on the first line because the regex I was using didn’t remove the initial \n (which broke email). If it bothered me enough, I guess I could have added an lstrip like this:

print(re.sub('\n *','\n', f"""
    Content-Type: ...
""").lstrip())

After reading this 10-year-old page, I decided to stick with re.sub since I didn’t truly understand all the nuances of textwrap and inspect.

There is a much simpler way:

    foo = """first line\
             \nsecond line"""
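One thing to check with this approach: the backslash removes the line break, but the spaces at the start of the next physical line still end up in the string, as trailing whitespace on the first line. The sketch below prints the resulting value and, as a possible alternative that is not part of the answer above, uses adjacent string literals inside parentheses.

```python
# Backslash-newline inside the literal removes the line break, but the
# leading spaces of the next physical line still become part of the string.
backslash = """first line\
    \nsecond line"""

# Adjacent string literals are joined at compile time; whitespace between
# the literals is not part of the result.
adjacent = (
    "first line\n"
    "second line"
)

print(repr(backslash))
print(repr(adjacent))
```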

So if I understand correctly, you take whatever the user inputs, indent it and add it to the rest of your program (and then run the whole program).

After you put the user input into your program, you could run a regex that undoes the forced indentation. Something like: within triple quotes, replace every newline followed by four spaces (or a tab) with just a newline.
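A rough sketch of that regex idea follows. The helper name undo_forced_indent and the sample wrapper are hypothetical, and the pattern is deliberately naive: it can be fooled by triple quotes inside comments or other strings, and it also removes four spaces that the user typed on purpose at the start of a line in the string.

```python
import re

# Naive pattern for a triple-quoted string; it does not understand
# comments, escaped quotes, or triple quotes nested in other strings.
TRIPLE_QUOTED = re.compile(r'("""|\'\'\')(.*?)\1', re.DOTALL)

def undo_forced_indent(wrapped):
    """Strip one indent level from continuation lines of triple-quoted strings."""
    def fix(match):
        quote, body = match.group(1), match.group(2)
        # A newline followed by four spaces (or a tab) becomes just a newline.
        return quote + re.sub(r"\n(?: {4}|\t)", "\n", body) + quote
    return TRIPLE_QUOTED.sub(fix, wrapped)

# Hypothetical program text after the forced indentation was applied.
wrapped = 'def main():\n    text = """foo\n    bar\n    foo2"""\n    return text\n'
fixed = undo_forced_indent(wrapped)

namespace = {}
exec(fixed, namespace)
result = namespace["main"]()
print(repr(result))  # 'foo\nbar\nfoo2'
```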