dplyr, lubridate: how to aggregate a data frame by week?
Published: 2020-12-14 00:50:31 Category: Encyclopedia Source: compiled from the web
Consider the following example:
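    library(tidyverse)
    library(lubridate)

    time <- seq(from = ymd("2014-02-24"), to = ymd("2014-03-20"), by = "days")
    set.seed(123)
    values <- sample(seq(from = 20, to = 50, by = 5), size = length(time), replace = TRUE)

    # data_frame() is deprecated in recent tibble releases; tibble() is the replacement
    df2 <- data_frame(time, values)
    df2 <- df2 %>% mutate(day_of_week = wday(time, label = TRUE))

    Source: local data frame [25 x 3]

             time values day_of_week
           <date>  <dbl>      <fctr>
    1  2014-02-24     30         Mon
    2  2014-02-25     45        Tues
    3  2014-02-26     30         Wed
    4  2014-02-27     50       Thurs
    5  2014-02-28     50         Fri
    6  2014-03-01     20         Sat
    7  2014-03-02     35         Sun
    8  2014-03-03     50         Mon
    9  2014-03-04     35        Tues
    10 2014-03-05     35         Wed

I want to aggregate this data frame by week.

That is, suppose I define a week as starting on Monday morning and ending on Sunday evening; call this the Monday-to-Monday cycle. (Importantly, I want to be able to choose other conventions, such as Friday-to-Friday.)

Then I simply want to compute the mean of values for each week. In the example above, that means computing the mean of values between Monday, February 24 and Sunday, March 2, and so on.

How can I do that?

Thanks!

Edit: thanks to everyone who suggested ideas. It is a bit unusual, but I think my later solution may be more appropriate. Thanks again!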
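In the tidyverse,

    df2 %>% group_by(week = week(time)) %>% summarise(value = mean(values))

    ## # A tibble: 5 × 2
    ##    week    value
    ##   <dbl>    <dbl>
    ## 1     8 37.50000
    ## 2     9 38.57143
    ## 3    10 38.57143
    ## 4    11 36.42857
    ## 5    12 45.00000

Note that week() counts complete seven-day periods from January 1, so these groups do not begin on a fixed weekday. Or use isoweek instead, which follows ISO 8601 weeks (Monday start):

    df2 %>% group_by(week = isoweek(time)) %>% summarise(value = mean(values))

    ## # A tibble: 4 × 2
    ##    week    value
    ##   <int>    <dbl>
    ## 1     9 37.14286
    ## 2    10 40.71429
    ## 3    11 35.00000
    ## 4    12 42.50000

Or cut.Date, which labels each group with the date of its first day:

    df2 %>% group_by(week = cut(time, "week")) %>% summarise(value = mean(values))

    ## # A tibble: 4 × 2
    ##         week    value
    ##       <fctr>    <dbl>
    ## 1 2014-02-24 37.14286
    ## 2 2014-03-03 40.71429
    ## 3 2014-03-10 35.00000
    ## 4 2014-03-17 42.50000

If you like, you can tell it to start weeks on Sunday:

    df2 %>% group_by(week = cut(time, "week", start.on.monday = FALSE)) %>% summarise(value = mean(values))

    ## # A tibble: 4 × 2
    ##         week    value
    ##       <fctr>    <dbl>
    ## 1 2014-02-23 37.50000
    ## 2 2014-03-02 40.00000
    ## 3 2014-03-09 33.57143
    ## 4 2014-03-16 44.00000

To move the start to another weekday, shift the dates before cutting. For example, adding one day to the dates:

    df2 %>% group_by(week = cut(time + 1, "week")) %>% summarise(value = mean(values))

    ## # A tibble: 4 × 2
    ##         week    value
    ##       <fctr>    <dbl>
    ## 1 2014-02-24 37.50000
    ## 2 2014-03-03 40.00000
    ## 3 2014-03-10 33.57143
    ## 4 2014-03-17 44.00000

The means are identical to the Sunday-start result above, because adding a day moves the week boundary one day earlier (each Sunday is treated as a Monday). To start weeks on Tuesday, subtract one day instead, i.e. cut(time - 1, "week"). Either way the labels will be off, since they show the Monday of the shifted dates rather than the actual first day of each week.

If you use cut, consider the meaning of its include.lowest and right arguments, documented in ?cut.

(Editor: 李大同)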
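The question also asks for other conventions such as Friday-to-Friday weeks. One possible sketch, not taken from the answer above, is to group on lubridate's floor_date() with its week_start argument. This assumes lubridate 1.7.0 or later, where week_start is available; the toy data frame and the names toy and weekly are hypothetical and chosen so that the result is easy to check by hand.

```r
library(dplyr)
library(lubridate)

# Deterministic toy data (hypothetical, not the question's random sample):
# 14 consecutive days starting on Friday 2014-02-28, with values 1..14.
toy <- data.frame(
  time   = seq(from = ymd("2014-02-28"), by = "days", length.out = 14),
  values = 1:14
)

# week_start uses 1 = Monday ... 7 = Sunday, so 5 means each week runs
# from Friday through the following Thursday. floor_date() maps every
# date to the Friday that opens its week, which serves as the group key.
weekly <- toy %>%
  group_by(week = floor_date(time, unit = "week", week_start = 5)) %>%
  summarise(value = mean(values))

weekly
```

Here the two groups are keyed by 2014-02-28 and 2014-03-07, with means 4 and 11. Unlike the date-shifting trick with cut, the key is the real first day of each week, and setting week_start = 1 gives Monday-start weeks.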