python - What is the pythonic way to implement a parse tree for a custom format?

Reposted. Author: 行者123. Updated: 2023-11-28 22:02:44

I have a project that uses a non-standard file format, for example:

var foo = 5
load 'filename.txt'
var bar = 6
list baz = [1, 2, 3, 4]

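I would like to parse it into a data structure, the way BeautifulSoup does for HTML. However, BeautifulSoup does not support this format. What is the pythonic way to build a parse tree so that I can modify the values and write the file back out? In the end I would like to do something like this: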

data = parse_file('file.txt')
data.foo = data.foo * 2
data.write_file('file_new.txt')

Best answer

Here is a solution using pyparsing. It works for your case. Note that I am not an expert, so the code may look ugly by your standards. Cheers.

from pyparsing import (Combine, Group, Suppress, Word, ZeroOrMore, alphanums,
                       alphas, delimitedList, nums, printables, restOfLine)

# statement-type tags stored next to each parsed line
VAR, LOAD, LIST = 0, 1, 2


class ConfigFile(dict):
    """
    Configuration file data
    """

    def __init__(self, filename):
        """
        Parses config file.
        """
        equal = Suppress("=")
        lbrack = Suppress("[")
        rbrack = Suppress("]")
        delim = Suppress("'")

        string = Word(printables, excludeChars="'")
        identifier = Word(alphas, alphanums + '_')

        integer = Word(nums).setParseAction(lambda t: int(t[0]))
        real = Combine(Word(nums) + '.' + Word(nums)).setParseAction(
            lambda t: float(t[0]))
        value = real | integer

        var_kwd = Suppress("var")
        load_kwd = Suppress("load")
        list_kwd = Suppress("list")

        def tag(kind):
            # parse action: append the statement type to the grouped tokens
            return lambda tok: tok[0].insert(len(tok[0]), kind)

        var_stm = Group(var_kwd + identifier + equal + value +
                        restOfLine.suppress()).setParseAction(tag(VAR))

        load_stm = Group(load_kwd + delim + string + delim +
                         restOfLine.suppress()).setParseAction(tag(LOAD))

        list_stm = Group(list_kwd + identifier + equal + lbrack +
                         Group(delimitedList(value, ",")) +
                         rbrack + restOfLine.suppress()).setParseAction(tag(LIST))

        cnf_file = ZeroOrMore(var_stm | load_stm | list_stm)

        lines = cnf_file.parseFile(filename)
        self._lines = []
        for line in lines:
            # remember (type, name) so Write() can keep the original order
            self._lines.append((line[-1], line[0]))
            if line[-1] == LIST:
                # store a plain list so later type checks against list work
                dict.__setitem__(self, line[0], line[1].asList())
            elif line[-1] == VAR:
                dict.__setitem__(self, line[0], line[1])
        self.__initialized = True
        # after initialisation, setting attributes is the same as setting an item

    def __getattr__(self, key):
        # only called when normal lookup fails; unknown names yield None
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return None

    def __setattr__(self, key, value):
        """Maps attributes to values. Only if we are initialised"""

        # this test allows attributes to be set in the __init__ method
        if '_ConfigFile__initialized' not in self.__dict__:
            return dict.__setattr__(self, key, value)

        # any normal attributes are handled normally
        elif key in self.__dict__:
            dict.__setattr__(self, key, value)

        # takes care of including new 'load' statements
        elif key == 'load':
            if not isinstance(value, str):
                raise ValueError("Invalid data type")
            self._lines.append((LOAD, value))

        # this is called when setting new attributes after __init__
        else:
            if not isinstance(value, (int, float, list)):
                raise ValueError("Invalid data type")

            if key in self:
                if type(self[key]) is not type(value):
                    raise ValueError("Cannot modify data type.")
            elif not isinstance(value, list):
                self._lines.append((VAR, key))
            else:
                self._lines.append((LIST, key))
            dict.__setitem__(self, key, value)

    def Write(self, filename):
        """
        Write config file.
        """
        with open(filename, 'w') as fid:
            for kind, name in self._lines:
                if kind == VAR:
                    fid.write("var %s = %s\n" % (name, self[name]))
                elif kind == LOAD:
                    # written with the 'load' keyword so the output parses again
                    fid.write("load '%s'\n" % name)
                else:
                    fid.write("list %s = %s\n" % (name, self[name]))


if __name__ == "__main__":

    text = """var foo = 5
load 'filename.txt'
var bar = 6
list baz = [1, 2, 3, 4]"""

    with open("test.txt", 'w') as f:
        f.write(text)
    config = ConfigFile("test.txt")
    # Modify existing items
    config.foo = config.foo * 2
    # Add new items
    config.foo2 = [4, 5, 6, 7]
    config.foo3 = 12.3456
    config.load = 'filenameX.txt'
    config.load = 'filenameXX.txt'
    config.Write("test_new.txt")
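
For comparison, the same round trip can be sketched without pyparsing. This is only a minimal sketch and not part of the original answer: it assumes every statement fits on one line and that values are Python-style literals, and it swaps the pyparsing grammar for two regular expressions plus ast.literal_eval. The names ASSIGN, LOAD, parse_text and dump_text are made up for illustration.

```python
import ast
import re

# Hypothetical patterns for the three statement kinds in the sample format.
ASSIGN = re.compile(r"(var|list)\s+(\w+)\s*=\s*(.+)$")
LOAD = re.compile(r"load\s+'([^']*)'$")


def parse_text(text):
    """Return (statements, values) for text in the sample format."""
    statements = []  # (keyword, name or filename), in file order
    values = {}      # name -> parsed Python value
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        assign = ASSIGN.match(line)
        load = LOAD.match(line)
        if assign:
            keyword, name, literal = assign.groups()
            # literal_eval only accepts literals, so no code is executed
            values[name] = ast.literal_eval(literal)
            statements.append((keyword, name))
        elif load:
            statements.append(("load", load.group(1)))
        else:
            raise ValueError(f"unrecognised line: {line!r}")
    return statements, values


def dump_text(statements, values):
    """Write the statements back out in their original order."""
    out = []
    for keyword, arg in statements:
        if keyword == "load":
            out.append(f"load '{arg}'")
        else:
            out.append(f"{keyword} {arg} = {values[arg]}")
    return "\n".join(out) + "\n"


SAMPLE = """var foo = 5
load 'filename.txt'
var bar = 6
list baz = [1, 2, 3, 4]
"""

statements, values = parse_text(SAMPLE)
values["foo"] *= 2
result = dump_text(statements, values)
```

Keeping the ordered statement list separate from the value dict is what lets the file be rewritten in its original order, which is the same job _lines does in the class above.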

Edit

I modified the class to use the __getattr__ and __setattr__ methods to mimic the "member access" syntax for parsed items, as the poster requested. Enjoy!

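P.S.

Overloading the __setattr__ method has to be done with care, to avoid interference between setting "normal" attributes (class members) and setting parsed items (which are accessed like attributes). The code has now been fixed to avoid these problems. For details, see the following reference: http://code.activestate.com/recipes/389916/. I found this quite interesting!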
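
The interference described above is easier to see in a much smaller class. The following is a minimal sketch of the general pattern, assuming a flag that is set at the end of __init__. It is not the code from the linked recipe, and the names AttrDict, _log and _ready are made up for illustration.

```python
class AttrDict(dict):
    """Dict whose items can also be read and written as attributes (sketch)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Assigned before the flag exists, so both become real attributes.
        self._log = []
        self._ready = True  # from here on, unknown names go to the dict

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        if "_ready" not in self.__dict__ or name in self.__dict__:
            # Still inside __init__, or an existing real attribute.
            super().__setattr__(name, value)
        else:
            # Anything else is stored as a dict item.
            self[name] = value


d = AttrDict(foo=5)
d.foo = d.foo * 2     # stored as the item d["foo"]
d._log = ["changed"]  # existing real attribute stays a real attribute
```

One caveat with this approach: because __getattr__ runs only after normal lookup fails, an item whose key matches a dict method name (for example "keys") cannot be read with attribute syntax and has to be accessed as d["keys"].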

This post is about "python - What is the pythonic way to implement a parse tree for a custom format?". A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/10836407/
