Suppose the data arrives in a wide, presentation-style layout like the following:
    ##### Creating Sample Data #####
    library(tidyverse)

    Name <- as.character(c("Abbey", "Brian", "Connie", "Dan", "Ethan"))
    State <- c("AL", "DC", "WA", "NY", "CA")
    Winter <- c(1.0,2.0,3.0,4.0,4.0)
    Spring <- c(4.0,4.0,3.0,2.0,1.0)
    Summer <- c(3.0,3.0,3.0,3.0,3.0)

    data_wide <- data.frame(Name, State, Winter, Spring, Summer)

    data_wide
    > data_wide
        Name State Winter Spring Summer
    1  Abbey    AL      1      4      3
    2  Brian    DC      2      4      3
    3 Connie    WA      3      3      3
    4    Dan    NY      4      2      3
    5  Ethan    CA      4      1      3
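The Winter, Spring, and Summer columns hold each student’s GPA for that quarter. This layout looks nice in Excel, but it is not tidy by R standards: the quarter is itself a variable, yet its values are spread across the column names. Let’s gather the quarters into a single ‘Quarter’ column and put the values in a new ‘GPA’ column.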
    ##### Gather #####
    data_long <- data_wide %>%
      gather(Quarter, GPA, -c(Name,State))
    data_long
    > data_long
         Name State Quarter GPA
    1   Abbey    AL  Winter   1
    2   Brian    DC  Winter   2
    3  Connie    WA  Winter   3
    4     Dan    NY  Winter   4
    5   Ethan    CA  Winter   4
    6   Abbey    AL  Spring   4
    7   Brian    DC  Spring   4
    8  Connie    WA  Spring   3
    9     Dan    NY  Spring   2
    10  Ethan    CA  Spring   1
    11  Abbey    AL  Summer   3
    12  Brian    DC  Summer   3
    13 Connie    WA  Summer   3
    14    Dan    NY  Summer   3
    15  Ethan    CA  Summer   3
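In recent versions of tidyr (1.0.0 and later), gather() is marked as superseded in favor of pivot_longer(). As a rough sketch of the same reshaping with the newer function, assuming tidyr 1.0.0 or later is installed, something like the following should work. The name data_long_pl is only an illustrative choice and is not used elsewhere in this post.

```r
library(tidyr)

# Rebuild the same sample data so this snippet runs on its own
data_wide <- data.frame(
  Name   = c("Abbey", "Brian", "Connie", "Dan", "Ethan"),
  State  = c("AL", "DC", "WA", "NY", "CA"),
  Winter = c(1.0, 2.0, 3.0, 4.0, 4.0),
  Spring = c(4.0, 4.0, 3.0, 2.0, 1.0),
  Summer = c(3.0, 3.0, 3.0, 3.0, 3.0)
)

# Move the three quarter columns into Quarter/GPA pairs
# (data_long_pl is a hypothetical name chosen for this sketch)
data_long_pl <- pivot_longer(
  data_wide,
  cols      = c(Winter, Spring, Summer),
  names_to  = "Quarter",
  values_to = "GPA"
)

data_long_pl
```

Two differences from the gather() result are worth knowing: pivot_longer() returns a tibble rather than a plain data frame, and it keeps each student's three rows together (Abbey's Winter, Spring, Summer, then Brian's, and so on) instead of stacking all Winter rows first. The set of rows is the same.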