How to add a suffix (or prefix) to each column name?


I want to add an _x suffix to each column name, like so:

featuresA = myPandasDataFrame.columns.values + '_x'

How do I do this? Additionally, if I wanted to add x_ as a prefix, how would the solution change?

The following is the nicest way to add a suffix, in my opinion.

df = df.add_suffix('_some_suffix')

Because it is a method that is called on a DataFrame and returns a DataFrame, you can use it in a chain of calls.
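A minimal sketch of such a chain, assuming a made-up frame and using dropna and reset_index only as stand-ins for whatever steps come before the rename:

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"A": [9, None, 2], "B": [12, 7, 5]})

# add_suffix returns a new DataFrame, so it can sit at the end of a chain.
result = (
    df.dropna()
      .reset_index(drop=True)
      .add_suffix("_x")
)

print(list(result.columns))
```

The original df keeps its column names here; only result carries the suffixed labels.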

You can use a list comprehension:

df.columns = [str(col) + '_x' for col in df.columns]

There are also built-in methods like .add_suffix() and .add_prefix() as mentioned in another answer.
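The str(col) call in the list comprehension matters when the labels are not strings, because col + '_x' raises a TypeError for an integer label. A small sketch, assuming a hypothetical frame with integer column labels:

```python
import pandas as pd

# Hypothetical frame whose column labels are the integers 0 and 1.
df = pd.DataFrame([[1, 2], [3, 4]], columns=[0, 1])

# str(col) converts each label to text before the suffix is appended.
df.columns = [str(col) + "_x" for col in df.columns]

print(list(df.columns))
```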

Elegant In-place Concatenation

If you’re trying to modify df in-place, then the cheapest (and simplest) option is in-place addition directly on df.columns (i.e., using Index.__iadd__).

df = pd.DataFrame({"A": [9, 4, 2, 1], "B": [12, 7, 5, 4]})
df

   A   B
0  9  12
1  4   7
2  2   5
3  1   4

df.columns += '_some_suffix'
df

   A_some_suffix  B_some_suffix
0              9             12
1              4              7
2              2              5
3              1              4

To add a prefix, you would similarly use

df.columns = "some_prefix_" + df.columns
df

   some_prefix_A  some_prefix_B
0              9             12
1              4              7
2              2              5
3              1              4

Another cheap option is a list comprehension with f-string formatting (available on Python 3.6+).

df.columns = [f'{c}_some_suffix' for c in df]
df

   A_some_suffix  B_some_suffix
0              9             12
1              4              7
2              2              5
3              1              4

And for prefix, similarly,

df.columns = [f'some_prefix_{c}' for c in df]

Method Chaining

It is also possible to add *fixes while method chaining. To add a suffix, use DataFrame.add_suffix:

df.add_suffix('_some_suffix')

   A_some_suffix  B_some_suffix
0              9             12
1              4              7
2              2              5
3              1              4

This returns a copy of the data. In other words, df is not modified.

Prefixes are added the same way, with DataFrame.add_prefix:

df.add_prefix('some_prefix_')

   some_prefix_A  some_prefix_B
0              9             12
1              4              7
2              2              5
3              1              4

This also does not modify df.
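A short check of that behaviour, reusing the sample frame from above (the variable names renamed and unchanged are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"A": [9, 4, 2, 1], "B": [12, 7, 5, 4]})

# add_prefix hands back a new frame; df keeps its original labels.
renamed = df.add_prefix("some_prefix_")
unchanged = list(df.columns)

print(unchanged)
print(list(renamed.columns))

# To keep the new names, assign the result back to df.
df = renamed
```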


Critique of add_*fix

These are good methods if you’re trying to perform method chaining:

df.some_method1().some_method2().add_*fix(...)

However, add_prefix (and add_suffix) creates a copy of the entire dataframe, just to modify the headers. If you believe this is wasteful, but still want to chain, you can call pipe:

def add_suffix(df):
    df.columns += '_some_suffix'
    return df

df.some_method1().some_method2().pipe(add_suffix)
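For a version that can actually be run, here is a sketch where query and sort_values stand in for some_method1 and some_method2, and the helper takes the suffix as a parameter (that parameter is my addition, not part of the snippet above):

```python
import pandas as pd

def add_suffix(df, suffix="_some_suffix"):
    # Relabels the frame it receives and returns it so the chain can continue.
    df.columns = df.columns + suffix
    return df

df = pd.DataFrame({"A": [9, 4, 2, 1], "B": [12, 7, 5, 4]})

result = (
    df.query("A > 1")      # stand-in for some_method1
      .sort_values("B")    # stand-in for some_method2
      .pipe(add_suffix)    # relabels the intermediate frame
)

print(list(result.columns))
```

Because query and sort_values already return new frames, the helper only touches the intermediate result and df itself keeps its labels. A different suffix can be passed through pipe as a keyword, e.g. .pipe(add_suffix, suffix='_x').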

I know four ways to add a suffix (or prefix) to your column names:

1- df.columns = [str(col) + '_some_suffix' for col in df.columns]

or

2- df.rename(columns=lambda col: col + '_some_suffix')

or

3- df.columns += '_some_suffix', which is much easier.

or, the nicest:

4- df.add_suffix('_some_suffix')
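As a quick sanity check that these options produce the same labels, here is a sketch on a small made-up frame with string column names; it only computes the new labels and leaves df untouched:

```python
import pandas as pd

# Hypothetical frame with string column labels.
df = pd.DataFrame({"A": [9, 4], "B": [12, 7]})
suffix = "_some_suffix"

by_comprehension = [str(col) + suffix for col in df.columns]
by_rename = df.rename(columns=lambda col: col + suffix).columns
by_addition = df.columns + suffix
by_add_suffix = df.add_suffix(suffix).columns

# Normalise everything to plain lists so the results can be compared.
results = [list(r) for r in (by_comprehension, by_rename, by_addition, by_add_suffix)]
print(all(r == results[0] for r in results))
```

Note that rename and add_suffix return a new DataFrame, so their result has to be assigned back if you want df itself to change.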

I haven’t seen this solution proposed above so adding this to the list:

df.columns += '_x'

And you can easily adapt for the prefix scenario.

Using DataFrame.rename

df = pd.DataFrame({'A': range(3), 'B': range(4, 7)})
print(df)
   A  B
0  0  4
1  1  5
2  2  6

Using rename with axis=1 and string formatting:

df.rename('col_{}'.format, axis=1)
# or df.rename(columns="col_{}".format)

   col_A  col_B
0      0      4
1      1      5
2      2      6

To actually overwrite your column names, we can assign the returned values to our df:

df = df.rename('col_{}'.format, axis=1)

or use inplace=True:

df.rename('col_{}'.format, axis=1, inplace=True)
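Applied to the _x suffix and x_ prefix from the question, the same idea could look like this (a sketch; only the position of {} in the format string changes):

```python
import pandas as pd

df = pd.DataFrame({"A": range(3), "B": range(4, 7)})

# '{}_x'.format puts the old label before the suffix,
# 'x_{}'.format puts it after the prefix.
suffixed = df.rename(columns="{}_x".format)
prefixed = df.rename(columns="x_{}".format)

print(list(suffixed.columns))
print(list(prefixed.columns))
```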

I figured that this is something I would use quite often, for example:

df = pd.DataFrame({'silverfish': range(3), 'silverspoon': range(4, 7),
                   'goldfish': range(10, 13), 'goldilocks': range(17, 20)})

My way of dynamically naming new columns with a prefix:

color_list = ['gold', 'silver']

for i in color_list:
    df[f'color_{i}'] = df.filter(like=i).sum(axis=1)

OUTPUT (df.to_dict()):

{'silverfish': {0: 0, 1: 1, 2: 2},
 'silverspoon': {0: 4, 1: 5, 2: 6},
 'goldfish': {0: 10, 1: 11, 2: 12},
 'goldilocks': {0: 17, 1: 18, 2: 19},
 'color_gold': {0: 27, 1: 29, 2: 31},
 'color_silver': {0: 4, 1: 6, 2: 8}}


The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
