Regular expressions – extracting all words between two specific words in a character vector

Published: 2020-12-14 06:23:42 | Category: Encyclopedia | Source: compiled from the web

Is there a more efficient way to do this? And how could I do it without stringr?

txt <- "I want to extract the words between this and that,this goes with that,this is a long way from that"

library(stringr)
w_start <- "this"
w_end <- "that"
pattern <- paste0(w_start,"(.*?)",w_end)
wordsbetween <- unlist(str_extract_all(txt,pattern))
# Strip the start/end words, then trim leading and trailing whitespace
gsub("^\\s+|\\s+$","",str_sub(wordsbetween,nchar(w_start)+1,-nchar(w_end)-1))
[1] "and"                "goes with"          "is a long way from"
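One way to shorten the stringr version is to let the capture group do the work, so that no `str_sub()` or `gsub()` step is needed afterwards. The following is only a sketch under that assumption: it reuses the `txt`, `w_start` and `w_end` values from the question, the name `words_between` is introduced here for illustration, and the `\\s*` on either side of the group is an addition that keeps surrounding whitespace out of the captured text.

```r
library(stringr)

txt <- "I want to extract the words between this and that,this goes with that,this is a long way from that"
w_start <- "this"
w_end <- "that"

# Lazy capture group between the two words; \\s* absorbs the padding spaces
pattern <- paste0(w_start, "\\s*(.*?)\\s*", w_end)

# str_match_all() returns one matrix per input string:
# column 1 is the full match, column 2 is the first capture group
m <- str_match_all(txt, pattern)[[1]]
words_between <- m[, 2]
print(words_between)
```

Note that `w_start` and `w_end` are pasted into the pattern as regular expressions, so words containing regex metacharacters would need escaping first.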

This is the approach I would use, with qdap:

library(qdap)
genXtract(txt,"this","that")

## > genXtract(txt,"this","that")
##         this  :  that1         this  :  that2         this  :  that3 
##                " and "          " goes with " " is a long way from "

Without any additional packages:

regmatches(txt,gregexpr("(?<=this).*?(?=that)",txt,perl=TRUE))

## > regmatches(txt,gregexpr("(?<=this).*?(?=that)",txt,perl=TRUE))
## [[1]]
## [1] " and "                " goes with "          " is a long way from "
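The base R result above still carries the surrounding spaces. A minimal sketch of how the same lookaround pattern could be wrapped into a reusable helper that also trims them is shown here; the function name `extract_between` and its argument names are hypothetical, and it assumes the two words contain no regex metacharacters.

```r
# Hypothetical helper: returns, for each element of x, the trimmed text
# found between `start` and `end` (both are treated as regex fragments)
extract_between <- function(x, start, end) {
  # Lookbehind/lookahead keep the delimiter words out of the match
  pattern <- paste0("(?<=", start, ").*?(?=", end, ")")
  matches <- regmatches(x, gregexpr(pattern, x, perl = TRUE))
  # trimws() removes leading and trailing whitespace from each match
  lapply(matches, trimws)
}

txt <- "I want to extract the words between this and that,this goes with that,this is a long way from that"
result <- extract_between(txt, "this", "that")
print(result[[1]])
```

Because `regmatches()` returns one list element per input string, the helper also works on a character vector with several elements.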

(Editor: 李大同)
