[Solved] Append an empty row in dataframe using pandas

I am trying to append an empty row at the end of dataframe but unable to do so, even trying to understand how pandas work with append function and still not getting it.

Here’s the code:

import pandas as pd

excel_names = ["ARMANI+EMPORIO+AR0143-book.xlsx"]
excels = [pd.ExcelFile(name) for name in excel_names]
frames = [x.parse(x.sheet_names[0], header=None,index_col=None).dropna(how='all') for x in excels]
for f in frames:
    f.append(0, float('NaN'))
    f.append(2, float('NaN'))

There are two columns and random number of row.

with “print f” in for loop i Get this:

                             0                 1
0                   Brand Name    Emporio Armani
2                 Model number            AR0143
4                  Part Number            AR0143
6                   Item Shape       Rectangular
8   Dial Window Material Type           Mineral
10               Display Type          Analogue
12                 Clasp Type            Buckle
14               Case Material   Stainless steel
16              Case Diameter    31 millimetres
18               Band Material           Leather
20                 Band Length  Women's Standard
22                 Band Colour             Black
24                 Dial Colour             Black
26            Special Features       second-hand
28                    Movement            Quartz

Solution #1:

Add a new pandas.Series using pandas.DataFrame.append().

If you wish to specify the name (AKA the “index”) of the new row, use:

df.append(pandas.Series(name='NameOfNewRow'))

If you don’t wish to name the new row, use:

df.append(pandas.Series(), ignore_index=True)

where df is your pandas.DataFrame.

Respondent: Mansoor Akram

Solution #2:

You can add it by appending a Series to the dataframe as follows. I am assuming by blank you mean you want to add a row containing only “Nan”.
You can first create a Series object with Nan. Make sure you specify the columns while defining ‘Series’ object in the -Index parameter.
The you can append it to the DF. Hope it helps!

from numpy import nan as Nan
import pandas as pd

>>> df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
...                     'B': ['B0', 'B1', 'B2', 'B3'],
...                     'C': ['C0', 'C1', 'C2', 'C3'],
...                     'D': ['D0', 'D1', 'D2', 'D3']},
...                     index=[0, 1, 2, 3])

>>> s2 = pd.Series([Nan,Nan,Nan,Nan], index=['A', 'B', 'C', 'D'])
>>> result = df1.append(s2)
>>> result
     A    B    C    D
0   A0   B0   C0   D0
1   A1   B1   C1   D1
2   A2   B2   C2   D2
3   A3   B3   C3   D3
4  NaN  NaN  NaN  NaN
Respondent: srcerer

Solution #3:

You can add a new series, and name it at the same time. The name will be the index of the new row, and all the values will automatically be NaN.

df.append(pd.Series(name='Afterthought'))
Respondent: silent_dev

Solution #4:

Assuming df is your dataframe,

df_prime = pd.concat([df, pd.DataFrame([[np.nan] * df.shape[1]], columns=df.columns)], ignore_index=True)

where df_prime equals df with an additional last row of NaN’s.

Note that pd.concat is slow so if you need this functionality in a loop, it’s best to avoid using it.
In that case, assuming your index is incremental, you can use

df.loc[df.iloc[-1].name + 1,:] = np.nan
Respondent: pocketdora

Solution #5:

The code below worked for me.

df.append(pd.Series([np.nan]), ignore_index = True)
Respondent: Dave Reikher

Solution #6:

Append “empty” row to data frame and fill selected cells:

Generate empty data frame (no rows just columns a and b):

import pandas as pd    
col_names =  ["a","b"]
df  = pd.DataFrame(columns = col_names)

Append empty row at the end of the data frame:

df = df.append(pd.Series(), ignore_index = True)

Now fill the empty cell at the end (len(df)-1) of the data frame in column a:

df.loc[[len(df)-1],'a'] = 123

Result:

     a    b
0  123  NaN

And of course one can iterate over the rows and fill cells:

col_names =  ["a","b"]
df  = pd.DataFrame(columns = col_names)
for x in range(0,5):
    df = df.append(pd.Series(), ignore_index = True)
    df.loc[[len(df)-1],'a'] = 123

Result:

     a    b
0  123  NaN
1  123  NaN
2  123  NaN
3  123  NaN
4  123  NaN
Respondent: kamal tanwar

Solution #7:

Assuming your df.index is sorted you can use:

df.loc[df.index.max() + 1] = None

It handles well different indexes and column types.

[EDIT] it works with pd.DatetimeIndex if there is a constant frequency, otherwise we must specify the new index exactly e.g:

df.loc[df.index.max() + pd.Timedelta(milliseconds=1)] = None

long example:

df = pd.DataFrame([[pd.Timestamp(12432423), 23, 'text_field']], 
                    columns=["timestamp", "speed", "text"],
                    index=pd.DatetimeIndex(start='2111-11-11',freq='ms', periods=1))
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1 entries, 2111-11-11 to 2111-11-11
Freq: L
Data columns (total 3 columns):
timestamp 1 non-null datetime64[ns]
speed 1 non-null int64
text 1 non-null object
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 32.0+ bytes

df.loc[df.index.max() + 1] = None
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2 entries, 2111-11-11 00:00:00 to 2111-11-11 00:00:00.001000
Data columns (total 3 columns):
timestamp 1 non-null datetime64[ns]
speed 1 non-null float64
text 1 non-null object
dtypes: datetime64[ns](1), float64(1), object(1)
memory usage: 64.0+ bytes

df.head()

                            timestamp                   speed      text
2111-11-11 00:00:00.000 1970-01-01 00:00:00.012432423   23.0    text_field
2111-11-11 00:00:00.001 NaT NaN NaN
Respondent: Peter

Solution #8:

You can also use:

your_dataframe.insert(loc=0, value=np.nan, column="")

where loc is your empty row index.

Respondent: Daniel R

The answers/resolutions are collected from stackoverflow, are licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0 .

Leave a Reply

Your email address will not be published.