How to Fix: Invalid factor level, NA generated in R

In this article, we will look at how to fix the warning “invalid factor level, NA generated” in R, with examples.

R produces this warning when you try to assign a value to a factor variable and that value is not one of the factor’s defined levels. The complete warning message is given below:

Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C") :
  invalid factor level, NA generated 
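
The same behavior can be reproduced on a bare factor vector, without a data frame. The snippet below is a minimal sketch; the variable name `f` and the values "A", "B", and "C" are chosen only for illustration.

```r
# Minimal sketch: assign a value that is not an existing level
f <- factor(c("A", "B"))
levels(f)            # the defined levels: "A" "B"

# "C" is not a level, so R warns and stores NA instead
f[3] <- "C"

is.na(f[3])          # TRUE
levels(f)            # still "A" "B"
```

The assignment does not add "C" to the levels; the factor keeps its original levels and the new element is stored as NA.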

When the warning might occur

Let’s create a data frame.


# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha',
                                      'Beta', 'Beta',
                                      'Charlie', 'Charlie',
                                      'Charlie')),
                        points=c(96, 91, 86, 89, 93, 87, 91))
# Display the data frame
print(dataframe)
# Display the structure of the data frame
str(dataframe)


In this example, the team variable has only three distinct values (levels): “Alpha”, “Beta”, and “Charlie”. Now, we will try to insert an additional row at the end of the data frame with the team name “Gamma”.



# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha',
                                      'Beta', 'Beta',
                                      'Charlie', 'Charlie',
                                      'Charlie')),
                        points=c(96, 91, 86, 89, 93, 87, 91))
# Add a new row to the end of the data frame
# (list() keeps points numeric; c('Gamma', 99) would coerce 99 to "99")
dataframe[nrow(dataframe) + 1,] <- list('Gamma', 99)



R produces the warning because “Gamma” is not an existing level of the team column. Note that this is only a warning, not an error: the new row is still appended to the end of the data frame, but its team cell holds NA instead of “Gamma”.


# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha',
                                      'Beta', 'Beta',
                                      'Charlie', 'Charlie',
                                      'Charlie')),
                        points=c(96, 91, 86, 89, 93, 87, 91))
# Add a new row to the end of the data frame
# (list() keeps points numeric; c('Gamma', 99) would coerce 99 to "99")
dataframe[nrow(dataframe) + 1,] <- list('Gamma', 99)
# Display the data frame
print(dataframe)
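
Besides printing the data frame, the NA can be checked programmatically. The following is a small sketch on a shortened, hypothetical data frame named `df` (three rows instead of seven), using only base R functions.

```r
# Sketch: confirm that the appended row holds NA in the factor column
df <- data.frame(team = factor(c("Alpha", "Beta", "Charlie")),
                 points = c(96, 86, 93))

# Appending an unknown level triggers the warning and stores NA
df[nrow(df) + 1, ] <- list("Gamma", 99)

is.na(df$team)       # FALSE FALSE FALSE  TRUE
levels(df$team)      # "Alpha" "Beta" "Charlie" ("Gamma" was not added)
df$points            # 96 86 93 99
```

Only the factor column is affected; the numeric value 99 is stored as expected.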


How the warning can be avoided:

We can avoid this warning by first converting the factor variable to a character variable, then adding the new row, and finally converting the variable back to a factor.



# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha',
                                      'Beta', 'Beta',
                                      'Charlie', 'Charlie',
                                      'Charlie')),
                        points=c(96, 91, 86, 89, 93, 87, 91))
# Convert team variable to character
dataframe$team <- as.character(dataframe$team)
# Insert a new row at the end of the data frame
# (list() keeps points numeric; c('Gamma', 99) would coerce 99 to "99")
dataframe[nrow(dataframe) + 1,] <- list('Gamma', 99)
# Transform team variable back to factor
dataframe$team <- as.factor(dataframe$team)
# Display the data frame
print(dataframe)


After running this code, both the warning and the NA value are gone: the new row holds “Gamma” in the team column. Now let’s display the structure of the modified data frame:


# Create a data frame
dataframe <- data.frame(team=factor(c('Alpha', 'Alpha',
                                      'Beta', 'Beta',
                                      'Charlie', 'Charlie',
                                      'Charlie')),
                        points=c(96, 91, 86, 89, 93, 87, 91))
# Convert team variable to character
dataframe$team <- as.character(dataframe$team)
# Insert a new row at the end of the data frame
# (list() keeps points numeric; c('Gamma', 99) would coerce 99 to "99")
dataframe[nrow(dataframe) + 1,] <- list('Gamma', 99)
# Transform team variable back to factor
dataframe$team <- as.factor(dataframe$team)
# Display the structure of the data frame
str(dataframe)
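
As a possible alternative to the round trip through character, the new level can be registered on the factor before the row is inserted. This is a sketch on a shortened, hypothetical data frame named `df`, not part of the approach described above; it relies on the base R `levels<-` replacement function.

```r
# Sketch: add the new level first, then insert the row
df <- data.frame(team = factor(c("Alpha", "Beta", "Charlie")),
                 points = c(96, 86, 93))

# Register "Gamma" as a valid level of the team factor
levels(df$team) <- c(levels(df$team), "Gamma")

# The assignment now succeeds without a warning
df[nrow(df) + 1, ] <- list("Gamma", 99)

as.character(df$team)  # "Alpha" "Beta" "Charlie" "Gamma"
anyNA(df$team)         # FALSE
```

This keeps the column a factor throughout, which may be preferable when the order of the existing levels matters.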
