ruby – Finding the indexes of consecutive substrings

Published: 2020-12-17 03:27:22 | Category: Encyclopedia | Source: compiled from the web

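Given a search string and a result string (guaranteed to contain all the letters of the search string, in order, case-insensitively), how can I most efficiently get an array of ranges representing the indices in the result string that correspond to the letters of the search string?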

Desired output:

substrings( "word","Microsoft Office Word 2007" )
#=> [ 17..20 ]

substrings( "word","Network Setup Wizard" )
#=> [ 3..5,19..19 ]
#=> [ 3..4,18..19 ]   # Alternative, acceptable, less-desirable output

substrings( "word","Watch Network Daemon" )
#=> [ 0..0,10..11,14..14 ]

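This is for an autocomplete search box. Here is a screenshot from a tool similar to Quicksilver that highlights the letters I am looking for. Note that, unlike my ideal output above, this screenshot does not prefer longer single matches.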

Benchmark results

Benchmarking the current working answers shows that @tokland's regex-based answer is basically as fast as the StringScanner-based solutions I put forward, with less code:

               user     system      total        real
phrogz1    0.889000   0.062000   0.951000 (  0.944000)
phrogz2    0.920000   0.047000   0.967000 (  0.977000)
tokland    1.030000   0.000000   1.030000 (  1.035000)

Here is the benchmark:

a=["Microsoft Office Word 2007","Network Setup Wizard","Watch Network Daemon"]
b=["FooBar","Foo Bar","For the Love of Big Cars"]
test = { a=>%w[ w wo wor word ],b=>%w[ f fo foo foobar fb fbr ] }
require 'benchmark'
Benchmark.bmbm do |x|
  %w[ phrogz1 phrogz2 tokland ].each{ |method|
    x.report(method){ test.each{ |words,terms|
      words.each{ |master| terms.each{ |term|
        2000.times{ send(method,term,master) }
      } }
    } }
  }
end

Solution

To have something to start with, how about this?

>> s = "word"
>> re = /#{s.chars.map{|c| "(#{c})" }.join(".*?")}/i # /(w).*?(o).*?(r).*?(d)/i
>> match = "Watch Network Daemon".match(re)
=> #<MatchData "Watch Network D" 1:"W" 2:"o" 3:"r" 4:"D">
>> 1.upto(s.length).map { |idx| match.begin(idx) }
=> [0,10,11,14]

Now you just need to build the ranges (if you really need them; I guess the individual indices would do as well).
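One way the remaining step might look is sketched below. It reuses the regex idea from above and groups adjacent indices into ranges with `Enumerable#slice_when` (Ruby 2.2+). The method name `substrings` comes from the question's desired API; the body is only an illustration, not the code that was benchmarked, and the `Regexp.escape` call is an added precaution for search terms containing regex metacharacters.

```ruby
# Illustrative sketch: returns an array of ranges of character indices
# in `master` that match the letters of `term`, in order, ignoring case.
def substrings(term, master)
  # Build e.g. /(w).*?(o).*?(r).*?(d)/i, one capture group per letter.
  pattern = term.chars.map { |c| "(#{Regexp.escape(c)})" }.join(".*?")
  match = master.match(/#{pattern}/i)
  return [] unless match

  # Character index at which each captured letter starts.
  indices = (1..term.length).map { |i| match.begin(i) }

  # Start a new group wherever two neighbouring indices are not adjacent,
  # then turn each group into a range.
  indices.slice_when { |a, b| b != a + 1 }.map { |run| run.first..run.last }
end

p substrings("word", "Microsoft Office Word 2007")  #=> [17..20]
p substrings("word", "Network Setup Wizard")        #=> [3..5, 19..19]
p substrings("word", "Watch Network Daemon")        #=> [0..0, 10..11, 14..14]
```

Because the lazy `.*?` picks the earliest possible position for each following letter, this reproduces the desired outputs for the three examples, but it is not guaranteed to find the longest single run in every input.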