Before we try to locate NAs, we need…
##### Load Libraries #####
library(tidyverse)

##### Creating Sample Data #####
Name <- c("Abbey", "Brian", "Connie", "Dan", "Ethan")
GPA <- c(3.0, 2.8, 2.1, 4.0, NA)
Grade <- c(3, 4, NA, NA, 6)
State <- c("AL", NA, NA, NA, "CA")
data <- data.frame(Name, GPA, Grade, State)
Case 1: How many NAs in a dataset?
sum(is.na(data))

> sum(is.na(data))
[1] 6
Now we know that there are 6 NAs in the dataset. But this doesn’t tell us which columns they are in.
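To see why this one-liner works, it may help to look at what `is.na()` returns. The sketch below repeats the sample data so it runs on its own; the names `na_mask`, `na_total` and `na_positions` are illustrative and not part of the original post.

```r
# Sample data repeated from above so this snippet is self-contained
data <- data.frame(
  Name  = c("Abbey", "Brian", "Connie", "Dan", "Ethan"),
  GPA   = c(3.0, 2.8, 2.1, 4.0, NA),
  Grade = c(3, 4, NA, NA, 6),
  State = c("AL", NA, NA, NA, "CA")
)

# is.na() on a data frame returns a logical matrix of the same shape:
# TRUE where a cell is NA, FALSE otherwise
na_mask <- is.na(data)

# sum() counts TRUE as 1 and FALSE as 0, so this is the number of NA cells
na_total <- sum(na_mask)

# which(..., arr.ind = TRUE) lists the row and column index of every NA
na_positions <- which(na_mask, arr.ind = TRUE)
```

If you need the exact cells rather than just a count, `na_positions` is probably the more useful of the two results.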
Case 2: How many NAs in each column?
colSums(is.na(data))

> colSums(is.na(data))
 Name   GPA Grade State
    0     1     2     3
This code returns the number of NAs in each column. In this example we only have 4 columns, so the result is not cluttered at all. But what if we have 90?
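One possible way to keep the text output readable with many columns is to drop the columns that have no NAs and sort the rest. This is a minimal sketch on the same sample data; `na_counts`, `na_only` and `na_share` are hypothetical names introduced here for illustration.

```r
# Sample data repeated from above so this snippet is self-contained
data <- data.frame(
  Name  = c("Abbey", "Brian", "Connie", "Dan", "Ethan"),
  GPA   = c(3.0, 2.8, 2.1, 4.0, NA),
  Grade = c(3, 4, NA, NA, 6),
  State = c("AL", NA, NA, NA, "CA")
)

# Named numeric vector: one NA count per column
na_counts <- colSums(is.na(data))

# Keep only columns that contain at least one NA, largest count first
na_only <- sort(na_counts[na_counts > 0], decreasing = TRUE)

# colMeans() on the same logical matrix gives the share of NAs per column
na_share <- colMeans(is.na(data))
```

With 90 columns, printing `na_only` instead of `na_counts` would show only the columns that need attention.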
Case 3: How many NAs in each column? Part 2
Let’s add a few more lines of code and create a ggplot chart.
First we will build the data frame that the chart is drawn from, which takes a little more prep work.
na_chart <- data.frame(colSums(is.na(data)))

> na_chart
      colSums.is.na.data..
Name                     0
GPA                      1
Grade                    2
State                    3
As you can see, the auto-generated column name is not very neat to use in ggplot. Let’s rename the column.
na_chart <- rename(na_chart, "value" = "colSums.is.na.data..")
Then copy the row names (the index) into a column of their own.
na_chart$feature <- rownames(na_chart)
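As an aside, the three preparation steps above could probably be collapsed into one by naming both columns when the data frame is created. This is a base-R sketch of that alternative, not the approach used in this post; it assumes the same sample data, and `na_counts` is an illustrative name.

```r
# Sample data repeated from above so this snippet is self-contained
data <- data.frame(
  Name  = c("Abbey", "Brian", "Connie", "Dan", "Ethan"),
  GPA   = c(3.0, 2.8, 2.1, 4.0, NA),
  Grade = c(3, 4, NA, NA, 6),
  State = c("AL", NA, NA, NA, "CA")
)

# Named vector of NA counts, one entry per column
na_counts <- colSums(is.na(data))

# Build the chart data directly with the two columns the plot needs:
# the vector's names become `feature`, its values become `value`
na_chart <- data.frame(
  feature = names(na_counts),
  value   = unname(na_counts)
)
```

The result has the same `value` and `feature` columns as the step-by-step version, just without row names.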
Now we are ready to plot the chart.
ggplot(na_chart, aes(y = value, x = feature)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = value), vjust = -1) +
  ggtitle("No. of NA in Each Feature") +
  theme(
    plot.background = element_rect(fill = "#F7F6ED"),
    legend.key = element_rect(fill = "#F7F6ED"),
    legend.background = element_rect(fill = "#F7F6ED"),
    panel.background = element_rect(fill = "#F7F6ED"),
    panel.border = element_rect(colour = "black", fill = NA, linetype = "dashed"),
    panel.grid.minor = element_line(colour = "#7F7F7F", linetype = "dotted"),
    panel.grid.major = element_line(colour = "#7F7F7F", linetype = "dotted")
  )
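For a dataset with many more columns, a sorted horizontal bar chart may be easier to read than the vertical one above. The variation below is only a sketch under that assumption: it rebuilds `na_chart` so it runs on its own, uses `geom_col()` (equivalent to `geom_bar(stat = "identity")`), and stores the plot in `p`, a name introduced here for illustration.

```r
library(ggplot2)

# Chart data rebuilt here so this snippet is self-contained
na_chart <- data.frame(
  feature = c("Name", "GPA", "Grade", "State"),
  value   = c(0, 1, 2, 3)
)

# reorder() sorts the bars by NA count; coord_flip() turns them sideways
# so that long lists of column names stay legible
p <- ggplot(na_chart, aes(x = reorder(feature, value), y = value)) +
  geom_col() +
  geom_text(aes(label = value), hjust = -0.5) +
  coord_flip() +
  labs(title = "No. of NA in Each Feature", x = "feature", y = "value")

print(p)
```

The `theme()` settings from the original chart can be appended to `p` unchanged if you want the same look.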