How to Fix: Invalid factor level, NA generated in R
In this article, we look at how to fix the R warning “invalid factor level, NA generated”, with examples.
R produces this warning when you try to assign a value to a factor variable that is not already one of the factor’s defined levels. The complete warning message looks like this:
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C") :
  invalid factor level, NA generated
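The same behaviour can be reproduced on a bare factor vector, which may make the cause easier to see. The following is a minimal sketch; the `sizes` factor and its values are hypothetical and used only for illustration.

```r
# Hypothetical factor, used only for illustration
sizes <- factor(c("small", "large", "small"))

# The defined levels are the distinct values, sorted alphabetically
levels(sizes)

# "medium" is not a defined level, so R warns and stores NA instead
sizes[4] <- "medium"

# The new element is NA and the set of levels is unchanged
is.na(sizes[4])
levels(sizes)
```

The assignment still runs to completion, which is why the problem can go unnoticed until the NA shows up later in an analysis.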
When the warning might occur
Let’s create a data frame.
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha', 'Beta', 'Beta',
                                        'Charlie', 'Charlie', 'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Display the data frame
dataframe

# Display the structure of the data frame
str(dataframe)
In this example, the team variable has only three distinct values: “Alpha”, “Beta”, and “Charlie”. Now we will try to add a row to the end of the data frame with the team name “Gamma”.
Example:
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha', 'Beta', 'Beta',
                                        'Charlie', 'Charlie', 'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Add a new row to the end of the data frame
dataframe[nrow(dataframe) + 1, ] <- c('Gamma', 99)
R produces the warning because “Gamma” is not one of the existing levels of the team column. Note that this is only a warning, not an error: R still appends a new row to the end of the data frame, but the team cell of that row holds NA instead of “Gamma”.
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha', 'Beta', 'Beta',
                                        'Charlie', 'Charlie', 'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Add a new row to the end of the data frame
dataframe[nrow(dataframe) + 1, ] <- c('Gamma', 99)

# Display the data frame
dataframe
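To confirm what actually happened to the new row, it can help to inspect the columns directly rather than relying on the printed table. The sketch below repeats the example and then checks the team and points columns. One side effect worth noticing is that `c('Gamma', 99)` is a character vector, so the points column is converted to character as well.

```r
# Recreate the data frame and append the row exactly as above
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha', 'Beta', 'Beta',
                                        'Charlie', 'Charlie', 'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))
dataframe[nrow(dataframe) + 1, ] <- c('Gamma', 99)  # triggers the warning

# The team value in the new row is NA
is.na(dataframe$team[nrow(dataframe)])

# "Gamma" was never added to the levels
levels(dataframe$team)

# c('Gamma', 99) mixes types, so 99 was passed in as the string "99"
class(dataframe$points)
```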
How the warning can be avoided:
We can avoid this warning by first converting the factor variable to a character variable, adding the new row, and then converting the variable back to a factor.
Example:
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha', 'Beta', 'Beta',
                                        'Charlie', 'Charlie', 'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Convert the team variable to character
dataframe$team <- as.character(dataframe$team)

# Add a new row to the end of the data frame
dataframe[nrow(dataframe) + 1, ] <- c('Gamma', 99)

# Convert the team variable back to factor
dataframe$team <- as.factor(dataframe$team)

# Display the data frame
dataframe
With this approach, the warning no longer appears and the new row holds “Gamma” instead of NA. Now let’s display the structure of the modified data frame:
R
# Create a data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha', 'Beta', 'Beta',
                                        'Charlie', 'Charlie', 'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Convert the team variable to character
dataframe$team <- as.character(dataframe$team)

# Add a new row to the end of the data frame
dataframe[nrow(dataframe) + 1, ] <- c('Gamma', 99)

# Convert the team variable back to factor
dataframe$team <- as.factor(dataframe$team)

# Display the structure of the data frame
str(dataframe)
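As an alternative to the character round trip described in this article, one could register the new level on the factor before adding the row. This is a different technique from the one shown above and is offered only as a sketch. It also passes the new row as a `list()` rather than `c()`, on the assumption that the points column should stay numeric.

```r
# Recreate the original data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha', 'Beta', 'Beta',
                                        'Charlie', 'Charlie', 'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Add "Gamma" to the defined levels before assigning it
levels(dataframe$team) <- c(levels(dataframe$team), 'Gamma')

# list() keeps 'Gamma' and 99 as separate types, so points stays numeric
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)

# Display the structure of the data frame
str(dataframe)
```

Because the level exists before the assignment, no warning is raised and the team column remains a factor throughout.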