[Solved] Can you use rbind.fill without having it fill in NA’s?

I am trying to combine two data frames with different numbers of columns and different column headers. However, after I combine them using rbind.fill(), the resulting data frame has the empty cells filled with NA.

This is very inconvenient since one of the columns has data that is also represented as “NA” (for North America), so when I export it to a CSV file, the spreadsheet can’t tell them apart.

Is there a way for me to:

  1. Use the rbind.fill function without having it populate the empty cells with NA

or

  2. Change the column to replace the NA values*

*I’ve scoured the blogs, and have tried the two most popular solutions:

df$col[is.na(df$col)] <- 0  # it does not work
df$col = ifelse(is.na(df$col), "X", df$col)  # it changes all the characters to numbers, and ruins the column

Let me know if you have any advice! I (unfortunately) cannot share the df, but will be willing to answer any questions!

Enquirer: David


Solution #1:

NA is not the same as "NA" to R, but might be interpreted as such by your favourite spreadsheet program. NA is a special value in R just like NaN (not a number). If I understand correctly, one of your solutions is to replace the “NA” values in the column representing North America with something else, in which case you should just be able to do…

df$col[ df$col == "NA" ] <- "NorthAmerica"

This assumes that your “NA” values are actually character strings. is.na() returns FALSE for the string “NA” (it is TRUE only for real missing values), which is why df$col[ is.na(df$col) ] <- 0 leaves those entries unchanged.

An example of the difference between NA and “NA”:

x <- c( 1, 2, 3 , "NA" , 4 , 5 , NA )

> x[ !is.na(x) ]
[1] "1"  "2"  "3"  "NA" "4"  "5"

> x[ x == "NA" & !is.na(x) ]
[1] "NA"
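
Since the two only collide once the data leaves R, one option (not part of the original answer, offered as a sketch) is to tell the export step how to write real missing values. write.csv() passes its na argument through to write.table(); the data frame and the temporary file below are made up for illustration.

```r
# Hypothetical data: "NA" is the string for North America, NA is a real missing value
df <- data.frame(region = c("NA", "SA", NA), x = 1:3, stringsAsFactors = FALSE)

# Write real missing values as empty cells; the "NA" string is written as-is
out_file <- tempfile(fileext = ".csv")
write.csv(df, out_file, na = "", row.names = FALSE)
csv_lines <- readLines(out_file)
print(csv_lines)

# Reading it back: treat only empty cells as missing, so "NA" stays a string
back <- read.csv(out_file, na.strings = "", stringsAsFactors = FALSE)
print(is.na(back$region))
```

Whether a spreadsheet then treats the quoted “NA” as text depends on the program, so it is worth checking on a small file first.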

Method to resolve this

I think you want to leave “NA” and any NAs that come from the first df as they are, but change the NAs that rbind.fill() creates in the second df’s columns to something like “NotAvailable”. You can accomplish this like so…

library(plyr)  # provides rbind.fill()
df1 <- data.frame( col = rep( "NA" , 6 ) , x = 1:6 , z = rep( 1 , 6 ) )
df2 <- data.frame( col = rep( "SA" , 2 ) , x = 1:2 , y = 5:6 )
df <- rbind.fill( df1 , df2 )
temp <- df [ (colnames(df) %in% colnames(df2)) ]
temp[ is.na( temp ) ] <- "NotAvailable"
res <- cbind( temp , df[ !( colnames(df) %in% colnames(df2) ) ] )

#df has real NA values in column z and column y. We just want to get rid of y's
df

#     col x  z  y
#   1  NA 1  1 NA
#   2  NA 2  1 NA
#   3  NA 3  1 NA
#   4  NA 4  1 NA
#   5  NA 5  1 NA
#   6  NA 6  1 NA
#   7  SA 1 NA  5
#   8  SA 2 NA  6

#res has "NA" strings in col representing "North America" and NA values in z, whilst those in y have been removed
#More generally, any NA in df1 will be left 'as-is', whilst NA from df2 formed using rbind.fill will be converted to character string "NotAvailable"
res

#     col x            y  z
#   1  NA 1 NotAvailable  1
#   2  NA 2 NotAvailable  1
#   3  NA 3 NotAvailable  1
#   4  NA 4 NotAvailable  1
#   5  NA 5 NotAvailable  1
#   6  NA 6 NotAvailable  1
#   7  SA 1            5 NA
#   8  SA 2            6 NA
Respondent: Simon O’Hanlon

Solution #2:

If you have a dataframe that contains NA’s and you want to replace them all you can do something like:

df[is.na(df)] <- -999

This will take care of all NA’s in one shot.

If you only want to act on a single column you can do something like

df$col[which(is.na(df$col))] <- -999
Respondent: kith
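
The “characters turn into numbers” symptom described in the question is what typically happens when the column is a factor, although the question does not show the data, so this is an assumption. A minimal sketch of converting to character before replacing, using a made-up column:

```r
# Hypothetical column stored as a factor: "NA" is a level, NA is a real missing value
df <- data.frame(col = c("NA", "SA", NA), stringsAsFactors = TRUE)

# Convert to character first so the replacement is stored as text, not as level codes
df$col <- as.character(df$col)

# Only the real missing value is replaced; the "NA" string is left alone
df$col[is.na(df$col)] <- "X"
print(df$col)
```

If the column needs to be a factor again afterwards, it can be converted back with factor() once the replacement is done.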

The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
