Splitting a column at the first matching instance using regular expressions and tidyr in R
Published: 2020-12-14 06:05:00 | Category: Encyclopedia | Source: compiled from the web
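I am trying to split a column in an R data frame. The values in that column contain several spaces, but I only want to split at the first space. Example data frame:

    df <- data.frame(game = c(1, 2, 3, 4, 5, 6),
                     date = c("Monday Apr 3", "Tuesday Apr 4", "Wednesday Apr 5",
                              "Thursday Apr 6", "Friday Apr 7", "Saturday Apr 8"))

I am trying to use tidyr to split the "date" column of df at the first space, so that the day is in its own column:

      game       day  date
    1    1    Monday Apr 3
    2    2   Tuesday Apr 4
    3    3 Wednesday Apr 5
    4    4  Thursday Apr 6
    5    5    Friday Apr 7
    6    6  Saturday Apr 8

That is the problem. Below is what I have tried and what went wrong.

According to the tidyr documentation, the default value of 'sep' is "a regular expression that matches any sequence of non-alphanumeric values." So if I do this:

    df %>% separate(date, c("day", "date"))

it does split on spaces, but it splits on both of them (for example, the space after "Monday" and the space after "Apr" in "Monday Apr 3"), and the third piece is discarded. The result is:

      game       day date
    1    1    Monday  Apr
    2    2   Tuesday  Apr
    3    3 Wednesday  Apr
    4    4  Thursday  Apr
    5    5    Friday  Apr
    6    6  Saturday  Apr
    Warning message:
    Too many values at 6 locations: 1, 2, 3, 4, 5, 6

I can add a regular expression that selects only the first space (I checked that this regex works in Sublime Text):

    df %>% separate(date, c("day", "date"), sep = '^[^\\s]*\\K\\s')

But that gives me:

      game            day date
    1    1   Monday Apr 3 <NA>
    2    2  Tuesday Apr 4 <NA>
    3    3 Wednesday Apr 5 <NA>
    4    4 Thursday Apr 6 <NA>
    5    5   Friday Apr 7 <NA>
    6    6 Saturday Apr 8 <NA>
    Warning message:
    Too few values at 6 locations: 1, 2, 3, 4, 5, 6

What is going wrong? How can I make this work? Or what am I misunderstanding?

Solution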
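You need to set the 'extra' argument to "merge". The default separator is then still used, but the string is split at most once per requested column, and everything left over is merged into the last column:

    library(tidyr)

    df %>% separate(date, c("day", "date"), extra = "merge")
    #   game       day  date
    # 1    1    Monday Apr 3
    # 2    2   Tuesday Apr 4
    # 3    3 Wednesday Apr 5
    # 4    4  Thursday Apr 6
    # 5    5    Friday Apr 7
    # 6    6  Saturday Apr 8

(Editor: 李大同)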
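For comparison, the same "split at the first space only" idea can be sketched in base R, without tidyr. This is not part of the original answer. It is a minimal illustration that assumes every value contains at least one space, and the names first_space and result are made up for the example.

```r
# Base R sketch: split each string at its first space only.
df <- data.frame(
  game = c(1, 2, 3, 4, 5, 6),
  date = c("Monday Apr 3", "Tuesday Apr 4", "Wednesday Apr 5",
           "Thursday Apr 6", "Friday Apr 7", "Saturday Apr 8"),
  stringsAsFactors = FALSE
)

# regexpr() reports the position of the first match in each string
# (gregexpr() would report all of them).
first_space <- regexpr(" ", df$date, fixed = TRUE)

# Assumption: every value has a space; regexpr() returns -1 when there is none.
stopifnot(all(first_space > 0))

# Everything before the first space becomes "day"; the rest stays in "date".
result <- data.frame(
  game = df$game,
  day  = substr(df$date, 1, first_space - 1),
  date = substr(df$date, first_space + 1, nchar(df$date)),
  stringsAsFactors = FALSE
)

print(result)
```

Because the split position comes from the first match only, later spaces in the string (such as the one after "Apr") are left untouched, which is the same effect that extra = "merge" achieves in separate().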