
Python parser ply does not handle spaces

Reposted. Author: 行者123. Updated: 2023-12-02 15:57:08

I am using ply to parse data, and I am trying to use a space as part of a lexeme. Here is a simplified example:

from ply.lex import lex
from ply.yacc import yacc

tokens = ('NUM', 'SPACE')

t_NUM = r'\d+'
t_SPACE = r' '

def t_error(t):
    print(f'Illegal character {t.value[0]!r}')
    t.lexer.skip(1)

lexer = lex()

def p_two(p):
    '''
    two : NUM SPACE NUM
    '''
    p[0] = ('two', p[1], p[2], p[3])

def p_error(p):
    if p:
        print(f"Syntax error at '{p.value}'")
    else:
        print("Syntax error at EOF")

parser = yacc()

ast = parser.parse('1 2')
print(ast)

When I run it, I get this error:

ERROR: Regular expression for rule 't_SPACE' matches empty string
Traceback (most recent call last):
  File "c:\demo\simple_space.py", line 19, in <module>
    lexer = lex()
  File "C:\demo\3rdparty\ply\ply\lex.py", line 752, in lex
    raise SyntaxError("Can't build lexer")
SyntaxError: Can't build lexer

Is it possible to specify a space as part of a lexeme? Some additional tokens I may need:

  • t_COMMENT = r'\#.*' for comments
  • t_DIVIDE = r': +' as a separator

Best answer

This is explained in the "Specification of tokens" section of the Ply manual:

Internally, lex.py uses the re module to do its pattern matching. Patterns are compiled using the re.VERBOSE flag which can be used to help readability. However, be aware that unescaped whitespace is ignored and comments are allowed in this mode. If your pattern involves whitespace, make sure you use \s. If you need to match the # character, use [#].
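The behaviour described in the quoted passage can be checked with the standard re module alone. The following is a minimal sketch, assuming (as the manual states) that the token patterns are compiled with re.VERBOSE; the variable names are illustrative and nothing here calls into PLY itself.

```python
import re

# Assumption: token patterns are compiled with re.VERBOSE, as the manual says.
space_verbose = re.compile(r' ', re.VERBOSE)

# In verbose mode the unescaped space is dropped, leaving an empty pattern,
# which therefore matches the empty string.
matches_empty = space_verbose.match('') is not None

# The same happens inside longer patterns: r': +' is read as r':+'.
divide_verbose = re.compile(r': +', re.VERBOSE)
colon_run = divide_verbose.fullmatch('::') is not None    # two colons match
colon_space = divide_verbose.fullmatch(': ') is not None  # colon + space does not

# An unescaped '#' starts a comment, so the rest of the pattern is ignored.
hash_verbose = re.compile(r'a#b', re.VERBOSE)
hash_match = hash_verbose.fullmatch('a') is not None

print(matches_empty, colon_run, colon_space, hash_match)
```

This is consistent with the error in the question: with the space ignored, the t_SPACE pattern is effectively empty. It also suggests that r': +' from the question would match a run of colons rather than a colon followed by spaces.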

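So a literal space character must be written as [ ] or as a backslash followed by a space (\ ). (\s, as suggested in the manual, matches any whitespace, not just the space character.)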
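As a rough illustration, the sketch below compares the spellings of a space that survive re.VERBOSE and runs a small stand-in tokenizer on the input from the question. TOKEN_RE and tokenize are hypothetical names built on the re module only. They are not PLY's implementation, just an approximation of a lexer that compiles its rules in verbose mode.

```python
import re

# Two spellings of a literal space that survive re.VERBOSE.
bracket_space = re.compile(r'[ ]', re.VERBOSE)
escaped_space = re.compile(r'\ ', re.VERBOSE)

# \s is broader: it also accepts tabs, newlines and other whitespace.
any_space = re.compile(r'\s', re.VERBOSE)

# Hypothetical stand-in for the lexer: one alternation with named groups,
# compiled in verbose mode. The spaces around '|' are ignored; the one
# inside [ ] is kept.
TOKEN_RE = re.compile(r'(?P<NUM>\d+) | (?P<SPACE>[ ])', re.VERBOSE)

def tokenize(text):
    """Return (token_type, value) pairs; raise on an unmatched character."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f'Illegal character {text[pos]!r}')
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize('1 2'))
```

In the question's lexer, the corresponding change would be t_SPACE = r'[ ]'. By the same reasoning, the separator rule would presumably become t_DIVIDE = r':[ ]+'.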

This question, "Python parser ply does not handle spaces", is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/71369399/
