python – pyparsing: a OneOrMore nested inside another OneOrMore
Published: 2020-12-20 13:30:06 Category: Python Source: compiled from the web
I am trying to use pyparsing for the first time.
My parser does not do what I hoped it would. Could someone take a look and see what is wrong? I am trying to nest one OneOrMore inside another OneOrMore. I expected that to work, but it does not.

Here is the complete code:

    import pyparsing

    status = """
    sale number : 11/7
    NAME ID PAWN PRICE TIME %C STATE START/STOP
    cross-cu-1 1055 1 106284K 07:49:36.19 25.05% run 1d01h
    cross-cu-2 918 1 104708K 07:38:19.08 24.02% run 1d01h
    sale number : 11/8
    NAME ID PAWN PRICE TIME %C STATE START/STOP
    cross-cu-3 1055 1 106284K 07:49:36.19 25.05% run 1d01h
    cross-cu-4 918 1 104708K 07:38:19.08 24.02% run 1d01h
    """

    # Basic tokens
    integer = pyparsing.Word(pyparsing.nums).setParseAction(lambda toks: int(toks[0]))
    decimal = pyparsing.Word(pyparsing.nums + ".").setParseAction(lambda toks: float(toks[0]))
    wordSuppress = pyparsing.Suppress(pyparsing.Word(pyparsing.alphas))
    endOfLine = pyparsing.LineEnd().suppress()
    colon = pyparsing.Suppress(":")

    # "sale number : 11/7" line and the column-header line
    saleNumber = pyparsing.Regex(r"\d{2}/\d{1}").setResultsName("saleNumber")
    lineSuppress = pyparsing.Regex("NAME.*STOP") + endOfLine
    saleRow = wordSuppress + wordSuppress + colon + saleNumber + endOfLine

    # Columns of one data row
    name = pyparsing.Regex(r"cross-cu-\d").setResultsName("name")
    id = integer.setResultsName("id")
    pawn = integer.setResultsName("pawn")
    price = integer.setResultsName("price") + "K"
    time = pyparsing.Regex(r"\d{2}:\d{2}:\d{2}.\d{2}").setResultsName("time")
    c = decimal.setResultsName("c") + "%"
    state = pyparsing.Word(pyparsing.alphas).setResultsName("state")
    startStop = pyparsing.Word(pyparsing.alphanums).setResultsName("startStop")
    row = name + id + pawn + price + time + c + state + startStop + endOfLine

    # Outer OneOrMore over sales, inner OneOrMore over the rows of each sale
    table = pyparsing.OneOrMore(
        pyparsing.Group(
            saleRow
            + lineSuppress.suppress()
            + (pyparsing.OneOrMore(pyparsing.Group(row) | pyparsing.SkipTo(row).suppress()))
        )
        | pyparsing.SkipTo(saleRow).suppress()
    )

    resultDic = [x.asDict() for x in table.parseString(status)]
    print(resultDic)

It only returns

    [{'saleNumber': '11/7'}]

whereas I was hoping for something like:

    [{ {'saleNumber': '11/7'}, { elements in cross-cu-1 line, elements in cross-cu-2 line } },
     { {'saleNumber': '11/8'}, { elements in cross-cu-3 line, elements in cross-cu-4 line } }]

Any help is appreciated!

Solution
In this case pyparsing may be overkill. Why not simply read the input line by line and parse each line yourself?
The code looks like this:

Edit: I have updated the code to follow your example more closely.

    from collections import defaultdict

    status = """
    sale number : 11/7
    NAME ID PAWN PRICE TIME %C STATE START/STOP
    cross-cu-1 1055 1 106284K 07:49:36.19 25.05% run 1d01h
    cross-cu-2 918 1 104708K 07:38:19.08 24.02% run 1d01h
    sale number : 11/8
    NAME ID PAWN PRICE TIME %C STATE START/STOP
    cross-cu-3 1055 1 106284K 07:49:36.19 25.05% run 1d01h
    cross-cu-4 918 1 104708K 07:38:19.08 24.02% run 1d01h
    """

    sale_number = ''
    sales = defaultdict(list)  # maps each sale number to its list of rows

    for line in status.split('\n'):
        line = line.strip()
        if line.startswith("NAME"):
            # column-header line, nothing to keep
            continue
        elif line.startswith("sale number"):
            # remember which sale the following rows belong to
            sale_number = line.split(':')[1].strip()
        elif not line:
            # blank line (already stripped above)
            continue
        else:
            # you can also use a regular expression here
            sales[sale_number].append(line.split())

    for sale in sales:
        print(sale, sales[sale])
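The comment in the answer mentions that a regular expression could be used in place of `line.split()`. Below is a minimal sketch of that idea, not part of the original answer. The names `SALE_RE`, `ROW_RE` and `parse_status` are made up for illustration, and the patterns assume every row looks like the sample shown in the question. The named groups reuse the result names from the question (`name`, `id`, `pawn`, and so on), so each row becomes a dict with numeric fields already converted.

```python
import re
from collections import defaultdict

status = """
sale number : 11/7
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-1 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-2 918 1 104708K 07:38:19.08 24.02% run 1d01h
sale number : 11/8
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-3 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-4 918 1 104708K 07:38:19.08 24.02% run 1d01h
"""

# Hypothetical patterns, written against the sample data above only.
SALE_RE = re.compile(r"sale number\s*:\s*(?P<saleNumber>\d+/\d+)")
ROW_RE = re.compile(
    r"(?P<name>cross-cu-\d+)\s+"
    r"(?P<id>\d+)\s+"
    r"(?P<pawn>\d+)\s+"
    r"(?P<price>\d+)K\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{2})\s+"
    r"(?P<c>\d+(?:\.\d+)?)%\s+"
    r"(?P<state>[A-Za-z]+)\s+"
    r"(?P<startStop>\w+)$"
)


def parse_status(text):
    """Group data rows under the most recent 'sale number' line."""
    sales = defaultdict(list)
    sale_number = None
    for line in text.splitlines():
        line = line.strip()
        sale = SALE_RE.match(line)
        if sale:
            sale_number = sale.group("saleNumber")
            continue
        row = ROW_RE.match(line)
        if row and sale_number is not None:
            fields = row.groupdict()
            # Convert the numeric columns; everything else stays a string.
            for key in ("id", "pawn", "price"):
                fields[key] = int(fields[key])
            fields["c"] = float(fields["c"])
            sales[sale_number].append(fields)
        # Header lines, blank lines and anything unrecognised are skipped.
    return dict(sales)


result = parse_status(status)
for sale_number, rows in result.items():
    print(sale_number, rows)
```

Lines that match neither pattern are silently ignored here, which is convenient for the header and blank lines but would also hide malformed rows; raising an error for unmatched non-empty lines may be the safer choice for real input.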