Suppose we have the following vector
    ##### Vector #####
    sales <- c(3, 5, 8, 7, 6)
We can see that the difference between the first and second components is 2. Is there a function that calculates these differences automatically? Yes, it's diff().
    ##### Diff #####
    diff(sales)
    > ##### Diff #####
    > diff(sales)
    [1]  2  3 -1 -1
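As a small aside, diff() in base R also accepts `lag` and `differences` arguments. The sketch below reuses the same `sales` vector; the names `lag2` and `second_order` are just illustrative and not part of the original post.

```r
# Same vector as in the post
sales <- c(3, 5, 8, 7, 6)

# lag = 2: each element minus the element two positions before it
lag2 <- diff(sales, lag = 2)

# differences = 2: apply diff() twice (the difference of the differences)
second_order <- diff(sales, differences = 2)

print(lag2)
print(second_order)
```

With `lag = 2` the result has two fewer elements than the input, and the same holds for `differences = 2`.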
Now let’s try to create a data frame with a date column.
    ##### DF #####
    sales_data <- data.frame(Date = as.Date(c('01/01/2017', '01/02/2017', '01/03/2017', '01/04/2017', '01/05/2017'), format = '%m/%d/%Y'),
                             Sales = sales)
Now let’s try diff() again.
    ##### Diff - 2 #####
    diff(sales_data$Sales)
    > diff(sales_data$Sales)
    [1]  2  3 -1 -1
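diff() is not limited to numeric columns. As a rough sketch, assuming the same dates as above, it can also be applied to a Date vector to check that the observations are evenly spaced. The names `dates`, `gaps` and `gap_days` are only for illustration.

```r
# Same dates as in the data frame above
dates <- as.Date(c("01/01/2017", "01/02/2017", "01/03/2017",
                   "01/04/2017", "01/05/2017"),
                 format = "%m/%d/%Y")

# diff() on Date values returns a difftime object
gaps <- diff(dates)

# Convert the gaps to plain numbers, measured in days
gap_days <- as.numeric(gaps, units = "days")

print(gap_days)
```

Here every gap is one day, so the Sales differences are all day-over-day changes.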
Now, what if we want to store the result in a new column? That can be a bit tricky: diff() returns one fewer value than its input, so the result cannot be assigned to the data frame directly, and dplyr does not provide a dedicated function for it.
However, this post (link) has an elegant way to do it. Thanks to josliber for figuring it out.
    ##### josliber's way #####
    sales_data$diff <- ave(sales_data$Sales, FUN = function(x) c(0, diff(x)))
    sales_data
    > sales_data
            Date Sales diff
    1 2017-01-01     3    0
    2 2017-01-02     5    2
    3 2017-01-03     8    3
    4 2017-01-04     7   -1
    5 2017-01-05     6   -1
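Two variations on this may be worth sketching. First, since diff() returns n - 1 values, the result can simply be padded by hand; padding with NA rather than 0 makes it explicit that the first row has no previous value. Second, ave() becomes more useful once there is a grouping variable, because the difference then restarts within each group. The `Store` column and the `diff_na` and `diff_by_store` names below are hypothetical and only there for illustration.

```r
# Rebuild the data frame from the post so the example is self-contained
sales <- c(3, 5, 8, 7, 6)
sales_data <- data.frame(
  Date = as.Date(c("01/01/2017", "01/02/2017", "01/03/2017",
                   "01/04/2017", "01/05/2017"),
                 format = "%m/%d/%Y"),
  Sales = sales
)

# Pad the front with NA so the result has as many elements as there are rows
sales_data$diff_na <- c(NA, diff(sales_data$Sales))

# Hypothetical grouping column (not in the original post)
sales_data$Store <- c("A", "A", "A", "B", "B")

# ave() applies the function within each Store, so the first row
# of every group gets NA instead of a difference across groups
sales_data$diff_by_store <- ave(
  sales_data$Sales,
  sales_data$Store,
  FUN = function(x) c(NA, diff(x))
)

print(sales_data)
```

Without a grouping variable, ave() applies the function to the whole vector, which is why josliber's version above behaves like the manual padding.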