I’m trying to remove the substring _x from the end of some of my df column names.

Sample df code:

import pandas as pd

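d = {'W_x': ['abcde', 'abcde', 'abcde'],
     'First_x': [0, 0, 0],
     'Last_x': [1, 2, 3],
     'Slice': ['abFC=0.01', '12fdak*4%FC=-0.035faf,dd43', 'FC=0.5fasff']}
df = pd.DataFrame(data=d)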



     W_x  First_x Last_x                 Slice
0  abcde      0     1                   abFC=0.01
1  abcde      0     2  12fdak*4%FC=-0.035faf,dd43
2  abcde      0     3                 FC=0.5fasff

Desired output:

       W  First  Last                       Slice
0  abcde      0     1                   abFC=0.01
1  abcde      0     2  12fdak*4%FC=-0.035faf,dd43
2  abcde      0     3                 FC=0.5fasff

python < 3.9, pandas < 1.4

Use str.strip/rstrip:

# df.columns = df.columns.str.strip('_x')
# Or, 
df.columns = df.columns.str.rstrip('_x')  # strip suffix at the right end only.

# Index(['W', 'First', 'Last', 'Slice'], dtype="object")

To avoid the issue highlighted in the comments:

Beware of strip() if any column name starts or ends with either _ or
x beyond the suffix.

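You could use str.replace with a regex anchored to the end of the name (pass regex=True, since newer pandas versions treat the pattern as a literal string by default):

df.columns = df.columns.str.replace(r'_x$', '', regex=True)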

# Index(['W', 'First', 'Last', 'Slice'], dtype="object")
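
To see the difference concretely, here is a minimal sketch. The column names `index_x` and `box` are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical column names chosen to expose the pitfall.
cols = pd.Index(['W_x', 'index_x', 'box', 'Slice'])

# rstrip treats '_x' as a set of characters and strips any trailing '_' or 'x'.
stripped = cols.str.rstrip('_x')

# An anchored regex removes only a literal '_x' at the very end of a name.
replaced = cols.str.replace(r'_x$', '', regex=True)

print(list(stripped))
print(list(replaced))
```

With rstrip, `index_x` loses its final `x` as well and `box` is shortened even though it has no suffix; the anchored pattern leaves both intact apart from the real suffix.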

Update: python >= 3.9, pandas >= 1.4

From version 1.4, you can use str.removeprefix/str.removesuffix.

>>> s = pd.Series(["str_foo", "str_bar", "no_prefix"])
>>> s
0    str_foo
1    str_bar
2    no_prefix
dtype: object
>>> s.str.removeprefix("str_")
0    foo
1    bar
2    no_prefix
dtype: object

>>> s = pd.Series(["foo_str", "bar_str", "no_suffix"])
>>> s
0    foo_str
1    bar_str
2    no_suffix
dtype: object
>>> s.str.removesuffix("_str")
0    foo
1    bar
2    no_suffix
dtype: object

These methods shipped with pandas 1.4, so any later release has them.
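
Applied to the question's column names, that could look like the following sketch. It assumes pandas 1.4 or later and rebuilds a one-row stand-in for the sample frame.

```python
import pandas as pd

# One-row stand-in for the question's frame (requires pandas >= 1.4).
df = pd.DataFrame({'W_x': ['abcde'], 'First_x': [0], 'Last_x': [1], 'Slice': ['abFC=0.01']})

# removesuffix drops '_x' only when a name ends with exactly that string.
df.columns = df.columns.str.removesuffix('_x')

print(list(df.columns))
```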

df.columns = [col[:-2] if col[-2:] == '_x' else col for col in df.columns]


df.columns = [col.replace('_x', '') for col in df.columns]

I’d suggest using the rename function:

df.rename(columns = lambda x: x.strip('_x'))

Output is as desired

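Of course, you can also take care of FabienP’s comment and modify it according to Quang Hoang’s solution. Python’s built-in str.replace does not understand regular expressions, so use re.sub for the anchored pattern:

import re

df.rename(columns = lambda x: re.sub('_x$', '', x))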

gives the desired output.

Another solution is simply:

df.rename(columns = lambda x: x[:-2] if x.endswith('_x') else x)

I usually use @cs95’s way, but wrap it in a DataFrame method for convenience:

import pandas as pd

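def drop_prefix(self, prefix):
    # str.lstrip(prefix) would strip any leading characters found in `prefix`,
    # so remove the prefix only where a name actually starts with it.
    self.columns = [c[len(prefix):] if c.startswith(prefix) else c for c in self.columns]
    return self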

pd.core.frame.DataFrame.drop_prefix = drop_prefix

Then you can use it just like its inverse, add_prefix, which pandas already implements:
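
A usage sketch, assuming a hypothetical prefix `raw_` (not from the original post) and an exact-match variant of the helper:

```python
import pandas as pd

def drop_prefix(self, prefix):
    # Remove `prefix` only where a column name actually starts with it.
    self.columns = [c[len(prefix):] if c.startswith(prefix) else c for c in self.columns]
    return self

# Attach the helper to DataFrame, as in the answer above.
pd.DataFrame.drop_prefix = drop_prefix

# add_prefix puts 'raw_' on every column; drop_prefix takes it off again.
df = pd.DataFrame({'a': [1], 'b': [2]}).add_prefix('raw_')
before = list(df.columns)
after = list(df.drop_prefix('raw_').columns)

print(before, after)
```

Note that this helper changes the frame in place and also returns it, whereas add_prefix returns a new frame.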


Python=3.8 and Pandas=1.3:

Use df.columns = df.columns.str.replace('_x','') to get rid of the suffix.

This removes the exact substring '_x', unlike str.strip/str.rstrip, which remove any of the characters in their argument from the ends of a name, regardless of whether the complete substring is present or the order in which those characters occur. Note that str.replace removes '_x' wherever it occurs in a name, not only at the end, so it is safe only when '_x' never appears elsewhere.
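
A short sketch of that behaviour, using a made-up column name `pos_x_max` that contains '_x' in the middle:

```python
import pandas as pd

# Hypothetical names: '_x' appears inside the second one.
cols = pd.Index(['W_x', 'pos_x_max', 'Slice'])

# A literal replace removes every '_x', wherever it occurs.
everywhere = cols.str.replace('_x', '', regex=False)

# Anchoring with '$' limits the removal to the end of the name.
suffix_only = cols.str.replace(r'_x$', '', regex=True)

print(list(everywhere))
print(list(suffix_only))
```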

With Python 3.9+ you can use the string methods removesuffix() and removeprefix() as follows:

# rename returns a new DataFrame, so assign it back to df (not to df.columns)
df = df.rename(columns = lambda x: x.removesuffix('_x')) # or any other suffix
df = df.rename(columns = lambda x: x.removeprefix('prefix_i_want_to_remove'))

Or you can directly map onto columns as:

df.columns = df.columns.map(lambda x: x.removesuffix('_x')) # or any other suffix
df.columns = df.columns.map(lambda x: x.removeprefix('prefix_i_want_to_remove'))

I had a similar need: stripping a prefix from the column headers. In my case the prefixes followed the pattern 'p1-', 'p2-', 'p3-' and so on, so I use the following snippet to remove all of them:
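
One way to handle that pattern is an anchored regular expression. This is a sketch with made-up column names, and not necessarily the snippet the author used.

```python
import pandas as pd

# Hypothetical frame whose headers carry 'p<number>-' prefixes.
df = pd.DataFrame({'p1-height': [1], 'p2-width': [2], 'p13-depth': [3], 'plain': [4]})

# The pattern matches 'p', one or more digits, and a hyphen, at the start only.
df.columns = df.columns.str.replace(r'^p\d+-', '', regex=True)

print(list(df.columns))
```

Names without such a prefix, like `plain` here, pass through unchanged.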