perl - Using Parse::RecDescent
Published: 2020-12-15 23:21:52 | Category: Big Data | Source: compiled from the web
I have the following input:

    @Book{press,
        author = "Press,W. and Teutolsky,S. and Vetterling,W. and Flannery B.",
        title = "Numerical {R}ecipes in {C}: The {A}rt of {S}cientific {C}omputing",
        year = 2007,
        publisher = "Cambridge University Press"
    }

I have to write a grammar for the RecDescent parser generator that produces this output:

    <book>
        <keyword>press</keyword>
        <author>Press,W.+Teutolsky,S.+Vetterling,W.+Flannery B.</author>
        <title>Numerical {R}ecipes in {C}: The {A}rt of {S}cientific {C}omputing</title>
        <year>2007</year>
        <publisher>Cambridge University Press</publisher>
    </book>

Additional and duplicate fields should be reported as errors (a proper message with the line number, and no further parsing). I tried starting with something like this:

    use Parse::RecDescent;

    open(my $in, "<", "parsing.txt") or die "Can't open parsing.txt: $!";

    my $text;
    while (<$in>) {
        $text .= $_;
    }

    print $text;

    my $grammar = q {
        beginning: "\@Book{" keyword fields "}"
            { print "<book>\n", $item[2], $item[3], "</book>"; }

        keyword: /[a-zA-Z]+/ ","
            { return "    <keyword>".$item[1]."</keyword>\n"; }

        fields: one "," two "," tree "," four
            { return $item[1].$item[3].$item[5].$item[7]; }

        one: "author" "=" "\"" /[a-zA-Z\s.,{}:]+/ "\""
            { $item[4] =~ s/\sand\s/+/g; return "    <author>", $item[4], "</author>\n"; }

        two: "title" "=" "\"" /[a-zA-Z\s.,{}:]+/ "\""
            { $item[4] =~ s/\sand\s/+/g; return "    <title>", "</title>\n"; }

        three: "year" "=" /[0-2][0-9][0-9][0-9]/
            { return "    <year>", "</year>\n"; }

        four: "publisher" "=" "\"" /[a-zA-Z\s.,{}:]+/ "\""
            { $item[4] =~ s/\sand\s/+/g; return "    <publisher>", "</publisher>\n"; }
    };

    my $parser = new Parse::RecDescent($grammar) or die ("Bad grammar!");
    defined $parser->beginning($text) or die ("Bad text!");

But I don't even know whether this is the right approach. Please help.

One more small question. The fields in the input may come in any order, but each field may appear only once. Do I have to write subrules for every permutation of (author, title, year, publisher)? I came up with this:

    fields: field "," field "," field
    field: one | two | three | four

but it obviously does not prevent duplicate fields.

Solution
First, you have a typo: "tree" instead of "three".
I ran your program, but with the following lines added:

    use strict;
    use warnings;       # you should always have strict and warnings on
    $::RD_HINT  = 1;    # Parse::RecDescent hints
    $::RD_TRACE = 1;    # Parse::RecDescent trace

and got this debug output:

     1|beginning |>>Matched terminal<< (return value:   |
      |          |[@Book{])                             |
     1|beginning |                                      |"press,\n author = "Press,
      |          |                                      |W. and Teutolsky,S. and
      |          |                                      |Vetterling,W. and Flannery
      |          |                                      |B.",\n title = "Numerical
      |          |                                      |{R}ecipes in {C}: The {A}rt
      |          |                                      |of {S}cientific
      |          |                                      |{C}omputing",\n year =
      |          |                                      |2007,\n publisher =
      |          |                                      |"Cambridge University
      |          |                                      |Press"\n}\n"
     1|beginning |Trying subrule: [keyword]             |
     2| keyword  |Trying rule: [keyword]                |
     2| keyword  |Trying production: [/[a-zA-Z]+/ ',']  |
     2| keyword  |Trying terminal: [/[a-zA-Z]+/]        |
     2| keyword  |>>Matched terminal<< (return value:   |
      |          |[press])                              |
     2| keyword  |                                      |",W. and
      |          |                                      |Teutolsky,\n publisher =
      |          |                                      |"Cambridge University
      |          |                                      |Press"\n}\n"
     2| keyword  |Trying terminal: [',']                |
     2| keyword  |>>Matched terminal<< (return value:   |
      |          |[,])                                  |
     2| keyword  |                                      |"\n author = "Press,\n publisher =
      |          |                                      |"Cambridge University
      |          |                                      |Press"\n}\n"
     2| keyword  |Trying action                         |
     1|beginning |>>Matched subrule: [keyword]<< (return|
      |          |value: [ <keyword>press</keyword> ]   |
     1|beginning |                                      |"press,\n publisher =
      |          |                                      |"Cambridge University
      |          |                                      |Press"\n}\n"
     1|beginning |Trying subrule: [fields]              |
     2|  fields  |Trying rule: [fields]                 |
     2|  fields  |Trying production: [one ',' two ','   |
      |          |three ',' four]                       |
     2|  fields  |Trying subrule: [one]                 |
     3|   one    |Trying rule: [one]                    |
     3|   one    |Trying production: ['author' '=' '"'  |
      |          |/[a-zA-Z\s.,{}:]+/ '"']               |
     3|   one    |Trying terminal: ['author']           |
     3|   one    |<<Didn't match terminal>>             |
     3|   one    |<<Didn't match rule>>                 |
     2|  fields  |<<Didn't match subrule: [one]>>       |
     2|  fields  |<<Didn't match rule>>                 |
     1|beginning |<<Didn't match subrule: [fields]>>    |
     1|beginning |<<Didn't match rule>>                 |
    Bad text! at parser.pl line 32, <$in> line 6.

This shows that the parse gets stuck right after the first rule: once "keyword" has matched, the text "press," is put back on the input stream, so the "author" terminal can never match. This happens because you use return instead of $return =, which is what the Parse::RecDescent manual says you should use. Also, once you assign to the $return variable you can no longer return a list, so you have to concatenate the strings yourself.
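To see the $return convention in isolation, here is a minimal sketch. The grammar and its rule names (pair, key, value) are made up for illustration and are not part of the program above; the only point is that the action assigns one concatenated string to $return.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical three-rule grammar, for illustration only.
my $grammar = q{
    pair:  key "=" value
           { $return = "<" . $item[1] . ">" . $item[3] . "</" . $item[1] . ">"; }
    key:   /[a-z]+/
    value: /[0-9]+/
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!";

# The start rule returns whatever the action stored in $return,
# or undef if the text did not match.
my $xml = $parser->pair("year = 2007");
print defined $xml ? "$xml\n" : "no match\n";
```

Rules without an action, such as key and value here, simply yield the text matched by their last item, so only the rule that builds the output needs to touch $return.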
Here is the final result:

    use strict;
    use warnings;
    use Parse::RecDescent;

    open(my $in, "<", "parsing.txt") or die "Can't open parsing.txt: $!";

    my $text;
    while (<$in>) {
        $text .= $_;
    }

    print $text;

    my $grammar = q {
        beginning: "\@Book{" keyword fields /\s*}\s*/
            { print "<book>\n", $item[2], $item[3], "</book>"; }

        keyword: /[a-zA-Z]+/ ","
            { $return = "    <keyword>$item[1]</keyword>\n"; }

        fields: one /,\s*/ two /,\s*/ three /,\s*/ four
            { $return = $item[1].$item[3].$item[5].$item[7]; }

        one: "author" "=" "\"" /[a-zA-Z\s.,{}:]+/ "\""
            { $item[4] =~ s/\sand\s/+/g; $return = "    <author>$item[4]</author>\n"; }

        two: "title" "=" "\"" /[a-zA-Z\s.,{}:]+/ "\""
            { $item[4] =~ s/\sand\s/+/g; $return = "    <title>$item[4]</title>\n"; }

        three: "year" "=" /[0-2][0-9][0-9][0-9]/
            { $return = "    <year>$item[3]</year>\n"; }

        four: "publisher" "=" "\"" /[a-zA-Z\s.,{}:]+/ "\""
            { $item[4] =~ s/\sand\s/+/g; $return = "    <publisher>$item[4]</publisher>\n"; }
    };

    my $parser = new Parse::RecDescent($grammar) or die ("Bad grammar!");
    defined $parser->beginning($text) or die ("Bad text!");

(Editor: 李大同)
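The answer above does not cover the follow-up question about fields in any order with no duplicates. One possible approach, sketched here under stated assumptions, is to parse every field with a single generic rule and check names afterwards in plain Perl, which avoids writing out permutations. The rule names (entry, field, name, value) and the check_fields helper are hypothetical. The sketch records $thisline, which the Parse::RecDescent documentation describes as the current line number inside an action; verify the numbers it reports against your own input. Unlike the requirement in the question, this version validates after the parse has finished instead of aborting mid-parse.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical grammar: one generic "field" rule instead of one rule per field.
my $grammar = q{
    entry:   "\@Book{" keyword "," field(s /,/) "}"
             { $return = { keyword => $item[2], fields => $item[4] }; }
    keyword: /[a-zA-Z]+/
    field:   name "=" value
             { $return = { name => $item[1], value => $item[3], line => $thisline }; }
    name:    /[a-zA-Z]+/
    value:   /[0-2][0-9][0-9][0-9]/
           | '"' /[^"]+/ '"'
             { $return = $item[2]; }
};

# Hypothetical helper: returns one message per unknown or repeated field.
sub check_fields {
    my ($fields) = @_;
    my %allowed = map { $_ => 1 } qw(author title year publisher);
    my (%seen, @errors);
    for my $f (@$fields) {
        if (!$allowed{ $f->{name} }) {
            push @errors, "line $f->{line}: unknown field '$f->{name}'";
        }
        elsif ($seen{ $f->{name} }++) {
            push @errors, "line $f->{line}: duplicate field '$f->{name}'";
        }
    }
    return @errors;
}

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!";

# Sample input with "year" given twice.
my $input = <<'END_INPUT';
@Book{press,
    year = 2007,
    title = "Numerical {R}ecipes in {C}",
    year = 2008
}
END_INPUT

my $entry = $parser->entry($input);
die "Bad text!" unless defined $entry;
print "$_\n" for check_fields($entry->{fields});
```

Because field(s /,/) accepts the fields in whatever order they appear, the ordering question goes away, and the %seen hash is what rejects repeats. If the parse really must stop at the first bad field, the same check could be moved into the field action, at the cost of keeping state inside the grammar.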