##### Creating Sample Data #####
Name <- as.character(c("Abb{ey", "B[ri]an", "Co.n}nie", "Dan", "Et#h!an", "Fr!ida"))
GPA <- c(3.0, 2.8, 2.1, 4.0, NA, 3.0)
Grade <- c(3, 4, NA, NA, 6, 9)
State <- c("AL", NA, NA, NA, "CA", "VA")
Credit_Score <- sample(550:750, 6)
ID <- sample(100:200, 6)
data <- data.frame(Name, GPA, Grade, State, Credit_Score, ID, stringsAsFactors = FALSE)

##### Load Library #####
library(stringr)
We can use str_replace() or str_replace_all() from stringr to remove unwanted symbols. Which one fits depends on whether each string contains a single symbol or several.
str_replace
##### str_replace() #####
data$Name <- str_replace(data$Name, "[:punct:]", "")
data
> data
     Name GPA Grade State Credit_Score  ID
1   Abbey 3.0     3    AL          617 144
2  Bri]an 2.8     4  <NA>          604 142
3 Con}nie 2.1    NA  <NA>          739 164
4     Dan 4.0    NA  <NA>          608 140
5  Eth!an  NA     6    CA          582 137
6   Frida 3.0     9    VA          664 175
str_replace() replaces only the first match in each string, so names with more than one symbol (Bri]an, Con}nie, Eth!an) are only partly cleaned. To remove every symbol, however many there are, we need str_replace_all().
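The difference is easiest to see on a short vector. This is a minimal sketch, assuming stringr is installed; the vector x is a made-up example and not part of the data frame above.

```r
library(stringr)

# Hypothetical input, not part of the sample data
x <- c("Et#h!an", "Dan")

# Replaces only the first punctuation character in each string
first_only <- str_replace(x, "[:punct:]", "")

# Replaces every punctuation character in each string
all_matches <- str_replace_all(x, "[:punct:]", "")

print(first_only)   # "Eth!an" "Dan"
print(all_matches)  # "Ethan"  "Dan"
```

Strings without a match, such as "Dan", are returned unchanged by both functions.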
str_replace_all
Let’s recreate the data frame so that we start again from the original names.
data <- data.frame(Name, GPA, Grade, State, Credit_Score, ID, stringsAsFactors = FALSE)
Then we will use str_replace_all().
##### str_replace_all() #####
data$Name <- str_replace_all(data$Name, "[:punct:]", "")
data
> data
    Name GPA Grade State Credit_Score  ID
1  Abbey 3.0     3    AL          617 144
2  Brian 2.8     4  <NA>          604 142
3 Connie 2.1    NA  <NA>          739 164
4    Dan 4.0    NA  <NA>          608 140
5  Ethan  NA     6    CA          582 137
6  Frida 3.0     9    VA          664 175
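One caveat is worth checking on your own data. stringr's regex engine (ICU, via stringi) treats [:punct:] as Unicode punctuation, and to my understanding that leaves out symbol characters such as $ or +. The sketch below widens the pattern to cover both Unicode categories. The vector messy is a made-up example, and this is one possible approach rather than the only one.

```r
library(stringr)

# Hypothetical names containing symbols as well as punctuation
messy <- c("Gr$ace", "He+n.ry")

# \p{P} matches Unicode punctuation, \p{S} matches Unicode symbols
cleaned <- str_replace_all(messy, "[\\p{P}\\p{S}]", "")

print(cleaned)  # "Grace" "Henry"
```

For the sample names above this makes no difference, since every unwanted character there is already matched by [:punct:].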