How to apply custom function to pandas data frame for each row [duplicate]


I want to apply a custom function and create a derived column called population2050 that is based on two columns already present in my data frame.

import math
import pandas as pd
import sqlite3
conn = sqlite3.connect('factbook.db')
query = "select * from facts where area_land =0;"
facts = pd.read_sql_query(query,conn)

def final_pop(initial_pop,growth_rate):
    final = initial_pop*math.e**(growth_rate*35)
    return final

facts['pop2050'] = facts['population','population_growth'].apply(final_pop,axis=1)

When I run the above code, I get an error. Am I not using the ‘apply’ function correctly?

You were almost there:

facts['pop2050'] = facts.apply(lambda row: final_pop(row['population'],row['population_growth']),axis=1)

Using lambda allows you to keep the specific (interesting) parameters listed in your function, rather than bundling them in a ‘row’.
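A minimal, self-contained sketch of this approach. The small DataFrame stands in for the result of the factbook.db query and its values are made up; final_pop is written with the import and return statement it needs in order to produce a value.

```python
import math

import pandas as pd

# Hypothetical sample data standing in for the `facts` table.
facts = pd.DataFrame({
    "population": [1000.0, 2000.0],
    "population_growth": [0.01, 0.0],
})


def final_pop(initial_pop, growth_rate):
    # Same formula as in the question, with an explicit return.
    return initial_pop * math.e ** (growth_rate * 35)


# axis=1 hands each row to the lambda, which unpacks the two columns
# and forwards them as separate arguments.
facts["pop2050"] = facts.apply(
    lambda row: final_pop(row["population"], row["population_growth"]),
    axis=1,
)
print(facts)
```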

With axis=1, apply passes the entire row to your function. Adjust it like this, assuming your two columns are called initial_pop and growth_rate:

def final_pop(row):
    return row.initial_pop*math.e**(row.growth_rate*35)
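To use this row-based version, the function can be passed straight to apply with axis=1. A short sketch, assuming the column names initial_pop and growth_rate from this answer; the DataFrame name df and the sample values are made up for illustration.

```python
import math

import pandas as pd

# Made-up data; the column names follow the assumption in this answer.
df = pd.DataFrame({
    "initial_pop": [1000.0, 500.0],
    "growth_rate": [0.0, 0.02],
})


def final_pop(row):
    # `row` is a Series holding one row; its columns are reachable as attributes.
    return row.initial_pop * math.e ** (row.growth_rate * 35)


# No lambda needed: apply calls final_pop once per row.
df["pop2050"] = df.apply(final_pop, axis=1)
print(df)
```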

Your function,

def function(x):
    # your operation
    return x

call your function as,
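One plausible form of that call, assuming the function takes a single value and is applied to one column with Series.apply. The DataFrame df, the column name "column", and the doubling operation are hypothetical placeholders, not taken from the answer.

```python
import pandas as pd

# Hypothetical data and column name, for illustration only.
df = pd.DataFrame({"column": [1, 2, 3]})


def function(x):
    # your operation (doubling here, purely as a placeholder)
    return x * 2


# Series.apply calls function once per element of the column.
df["column"] = df["column"].apply(function)
print(df)
```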


You can achieve the same result without DataFrame.apply(). pandas Series (that is, DataFrame columns) can be passed directly to NumPy functions and used with built-in Python operators, both of which are applied element-wise. In your case, it is as simple as the following:

import numpy as np

facts['pop2050'] = facts['population'] * np.exp(35 * facts['population_growth'])

This multiplies each element in the column population_growth by 35, applies NumPy's exp() function to that new column (35 * population_growth), and then multiplies the result element-wise by population.
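As a quick check that the vectorized expression matches the row-by-row approach, here is a small sketch on made-up data. The names vectorized and row_wise are introduced only for this comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the `facts` table.
facts = pd.DataFrame({
    "population": [1000.0, 2000.0, 0.0],
    "population_growth": [0.01, 0.0, 0.05],
})

# Vectorized: operates on whole columns at once.
vectorized = facts["population"] * np.exp(35 * facts["population_growth"])

# Row by row with apply, for comparison.
row_wise = facts.apply(
    lambda row: row["population"] * np.exp(35 * row["population_growth"]),
    axis=1,
)

# Both approaches should agree up to floating-point tolerance.
print(np.allclose(vectorized, row_wise))
```

On large frames the vectorized form is usually much faster, since apply with axis=1 calls a Python function once per row.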

The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.