
How do I build a regular expression from several search strings to subset a data frame?

Published: 2020-12-14 05:56:18 | Category: Encyclopedia | Source: collected from the web
I am trying to subset a data frame by searching for strings. My df looks like this:

dput(df)
structure(list(Cause = structure(c(2L,1L),.Label = c("jasper not able to read the property table after the release","More than 7000  messages loaded which stuck up"),class = "factor"),Resolution = structure(1:2,.Label = c("jobs and reports are processed","Updated the property table which resolved the issue."),class = "factor")),.Names = c("Cause","Resolution"),class = "data.frame",row.names = c(NA,-2L))

I would like to do this:

df1<-subset(df,grepl("*MQ*|*queue*|*Queue*",df$Cause))

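That is, search the Cause column for MQ, queue, or Queue, and subset the data frame df to the matching records. It does not seem to work: it also captures records in which none of the strings MQ, queue, or Queue appear.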

Is this how you would do it, or is there another approach I could follow?

Solution

The regular expression below seems to work. I added a row to the data.frame to make the example more interesting.

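I think the problem comes from the *s in your regular expression. In a regex, * is a quantifier ("zero or more of the preceding item"), not a wildcard as in a file glob, and grepl already matches a pattern anywhere in the string, so no surrounding wildcards are needed. I also added parentheses to group the alternatives around |, but I don't think they are required.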
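To illustrate the point about *, here is a minimal sketch using the two Cause values from the question. It only checks the "MQ*" fragment in isolation; the variable name cause is introduced here for illustration and is not part of the original post.

```r
# The two Cause values from the question (name `cause` is illustrative)
cause <- c("jasper not able to read the property table after the release",
           "More than 7000  messages loaded which stuck up")

# "MQ*" means "M" followed by zero or more "Q", so a bare "M" is enough:
# the second string matches because of the "M" in "More"
with_star <- grepl("MQ*", cause)

# Without the quantifier, the literal substring "MQ" must be present,
# so neither string matches
without_star <- grepl("MQ", cause)

print(with_star)
print(without_star)
```

This only explains one source of false matches; how a * with nothing in front of it is treated can depend on the regex engine, so it is safest to drop those as well.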

df <- data.frame(Cause=c("jasper not able to read the property table after the release","More than 7000  messages loaded which stuck up","blabla Queue blabla"),Resolution = c("jobs and reports are processed","Updated the property table which resolved the issue.","hop"))

> head(df)
                                                         Cause                                           Resolution
1 jasper not able to read the property table after the release                       jobs and reports are processed
2               More than 7000  messages loaded which stuck up Updated the property table which resolved the issue.
3                                          blabla Queue blabla                                                  hop

> subset(df,grepl("(MQ)|(queue)|(Queue)",df$Cause))
                Cause Resolution
3 blabla Queue blabla        hop

Is this what you wanted?
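If the search strings live in a vector rather than being typed into the pattern by hand, the same alternation can be assembled with paste. The following is a rough sketch under that assumption; the names terms, pattern, and df1 are illustrative, and it assumes none of the search strings contain regex metacharacters (those would need escaping first).

```r
# Hypothetical vector of search strings (assumed to contain no regex metacharacters)
terms <- c("MQ", "queue", "Queue")

# Join the terms with "|" to form a single alternation pattern
pattern <- paste(terms, collapse = "|")

# Same three-row data frame as in the answer above
df <- data.frame(
  Cause = c("jasper not able to read the property table after the release",
            "More than 7000  messages loaded which stuck up",
            "blabla Queue blabla"),
  Resolution = c("jobs and reports are processed",
                 "Updated the property table which resolved the issue.",
                 "hop"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose Cause matches any of the terms
df1 <- df[grepl(pattern, df$Cause), ]
print(df1)
```

Passing ignore.case = TRUE to grepl would let a single "queue" term cover both capitalizations, at the cost of also matching a lowercase "mq"; whether that is acceptable depends on the data.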
