
python - How do I strip characters from substrings in a list generated by an XML xpath search?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 13:49:44

This question is a follow-up to an earlier one. If you need more background, the original question is here:

Populating Python list using data obtained from lxml xpath command.

I have incorporated @ihor-kaharlichenko's excellent suggestion (from the original question) into the modified code, here:

from lxml import etree as ET
from datetime import datetime

xmlDoc = ET.parse('http://192.168.1.198/Bench_read_scalar.xml')

response = xmlDoc.getroot()
tags = (
    'address',
    'status',
    'flow',
    'dp',
    'inPressure',
    'actVal',
    'temp',
    'valveOnPercent',
)

dmtVal = []

for dmt in response.iter('dmt'):
    val = [str(dmt.xpath('./%s/text()' % tag)) for tag in tags]
    val.insert(0, str(datetime.now()))  # Add timestamp at beginning of each record
    dmtVal.append(val)

for item in dmtVal:
    str(item).strip('[')
    str(item).strip(']')
    str(item).strip('"')

The last block is where I am having trouble. The data I get for dmtVal looks like this:

[['2012-08-16 12:38:45.152222', "['0x46']", "['0x32']", "['1.234']", "['5.678']", "['9.123']", "['4.567']", "['0x98']", "['0x97']"], ['2012-08-16 12:38:45.152519', "['0x47']", "['0x33']", "['8.901']", "['2.345']", "['6.789']", "['0.123']", "['0x96']", "['0x95']"]]

However, I really want the data to look like this:

[['2012-08-16 12:38:45.152222', '0x46', '0x32', '1.234', '5.678', '9.123', '4.567', '0x98', '0x97'], ['2012-08-16 12:38:45.152519', '0x47', '0x33', '8.901', '2.345', '6.789', '0.123', '0x96', '0x95']]

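I figured this would be a fairly simple string-stripping job. I first tried the stripping inside the original iteration (where dmtVal is initially populated), but that did not work, so I moved the strip operations outside the loop, as listed above, and it still does not work. I suspect I am making some kind of newbie mistake, but I cannot find it. Any suggestions are welcome!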


Thanks to everyone for the prompt and helpful responses. Here is the corrected code:

from lxml import etree as ET
from datetime import datetime

xmlDoc = ET.parse('http://192.168.1.198/Bench_read_scalar.xml')

print('...Starting to parse XML nodes')

response = xmlDoc.getroot()

tags = (
    'address',
    'status',
    'flow',
    'dp',
    'inPressure',
    'actVal',
    'temp',
    'valveOnPercent',
)

dmtVal = []

for dmt in response.iter('dmt'):
    val = [' '.join(dmt.xpath('./%s/text()' % tag)) for tag in tags]
    val.insert(0, str(datetime.now()))  # Add timestamp at beginning of each record
    dmtVal.append(val)

This produces:

...Starting to parse XML nodes
[['2012-08-16 14:41:10.442776', '0x46', '0x32', '1.234', '5.678', '9.123', '4.567', '0x98', '0x97'], ['2012-08-16 14:41:10.443052', '0x47', '0x33', '8.901', '2.345', '6.789', '0.123', '0x96', '0x95']]
...Done
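
For readers without access to the device at 192.168.1.198, the corrected loop can be tried offline. The sketch below rests on assumptions: the XML layout (a `response` root with `dmt` children, one child element per tag) is guessed from the code above, the sample string `SAMPLE_XML` is hypothetical, and lxml's `xpath()` is swapped for the standard library's `xml.etree.ElementTree` and its `findtext()` method so that nothing needs to be installed.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical sample; the real document's layout is an assumption.
SAMPLE_XML = """
<response>
  <dmt><address>0x46</address><status>0x32</status><flow>1.234</flow></dmt>
  <dmt><address>0x47</address><status>0x33</status><flow>8.901</flow></dmt>
</response>
"""

tags = ('address', 'status', 'flow')

root = ET.fromstring(SAMPLE_XML)
dmt_val = []
for dmt in root.iter('dmt'):
    # findtext() returns the text of the first matching child as a plain
    # string (or the default), so no list repr ever ends up in the record.
    val = [dmt.findtext(tag, default='') for tag in tags]
    val.insert(0, str(datetime.now()))  # timestamp first, as in the question
    dmt_val.append(val)

# Print everything except the timestamps, which change on every run.
print([record[1:] for record in dmt_val])
```

Because each value is already a string when it enters the record, no stripping step is needed afterwards.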

Thanks, everyone!

Best answer

Taking your current data as grps:

Solution 1 - ast.literal_eval

import ast
grps = [['2012-08-16 12:38:45.152222', "['0x46']", "['0x32']", "['1.234']", "['5.678']", "['9.123']", "['4.567']", "['0x98']", "['0x97']"], ['2012-08-16 12:38:45.152519', "['0x47']", "['0x33']", "['8.901']", "['2.345']", "['6.789']", "['0.123']", "['0x96']", "['0x95']"]]
desired_output = [[grp[0]] + [ast.literal_eval(item)[0] for item in grp[1:]] for grp in grps]

print(desired_output)

Output

[['2012-08-16 12:38:45.152222', '0x46', '0x32', '1.234', '5.678', '9.123', '4.567', '0x98', '0x97'], ['2012-08-16 12:38:45.152519', '0x47', '0x33', '8.901', '2.345', '6.789', '0.123', '0x96', '0x95']]

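Explanation

ast.literal_eval is a safe way to perform an eval. It only evaluates literal data types (strings, numbers, tuples, lists, dicts, booleans, and None). In your case, it evaluates the string "['1.0']" to a list of length 1, ['1.0']. You may want to take a look at list comprehensions and make sure you understand them.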
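
A small sketch of what `ast.literal_eval` accepts and what it refuses; the sample strings are made up for illustration.

```python
import ast

# A string holding the repr of a one-element list becomes a real list.
parsed = ast.literal_eval("['1.0']")
first = parsed[0]

# Anything that is not a plain literal (here, a function call) is rejected
# with ValueError instead of being executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(parsed, first, rejected)
```

This refusal to run arbitrary expressions is what makes it a safer choice than the built-in `eval` for data that arrives as text.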

Another way to write it:

desired_output = []
for grp in grps:  # loop through each group
    new_grp = [grp[0]]  # start the new group with the first element (the timestamp)
    for item in grp[1:]:  # loop over every item from index 1 to the end
        evaluated_item = ast.literal_eval(item)  # parse the string into a one-item list
        new_grp.append(evaluated_item[0])  # append that single item to new_grp
    desired_output.append(new_grp)  # append new_grp to the desired_output list

Solution 2 - Regular expressions

import re
stripper = re.compile(r"[\[\]']")
grps = [['2012-08-16 12:38:45.152222', "['0x46']", "['0x32']", "['1.234']", "['5.678']", "['9.123']", "['4.567']", "['0x98']", "['0x97']"], ['2012-08-16 12:38:45.152519', "['0x47']", "['0x33']", "['8.901']", "['2.345']", "['6.789']", "['0.123']", "['0x96']", "['0x95']"]]
desired_output = [[grp[0]] + [ stripper.sub('', item) for item in grp[1:]] for grp in grps]

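The problem with your solution is that strings are immutable: strip() returns a new string, and your loop throws that result away. The items you iterate over in a for loop are not passed by reference either, so even rebinding the loop variable would not affect the original data.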
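
To see why the original loop has no effect, note that `str.strip()` returns a new string and leaves its operand alone, and that rebinding the loop variable does not write back into the list. A minimal sketch, with made-up sample values:

```python
row = ["['0x46']", "['0x32']"]

for item in row:
    item.strip("[]'")         # result is computed, then thrown away
    item = item.strip("[]'")  # rebinds the local name only

unchanged = list(row)         # row still holds the bracketed strings

# Writing back by index (or building a new list) is what changes the data.
for i, item in enumerate(row):
    row[i] = item.strip("[]'")

print(unchanged, row)
```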

Solution 3 - Fixing your original code

To fix your solution, you need:

for i, grp in enumerate(dmtVal):  # loop over the inner lists
    for j, item in enumerate(grp):
        dmtVal[i][j] = item.strip(']')
        dmtVal[i][j] = dmtVal[i][j].lstrip('[')
        dmtVal[i][j] = dmtVal[i][j].strip("'")

Instead of assigning the value back to dmtVal[i][j] after every strip, you can work on the local name item and assign it back to dmtVal[i][j] once at the end:

for i, grp in enumerate(dmtVal):  # loop over the inner lists
    for j, item in enumerate(grp):
        # could instead be
        item = item.strip(']')
        item = item.lstrip('[')
        item = item.strip("'")
        dmtVal[i][j] = item

Or, a better solution (IMHO):

for i, grp in enumerate(dmtVal):  # loop over the inner lists
    for j, item in enumerate(grp):
        dmtVal[i][j] = item.replace('[', '').replace(']', '').replace("'", '')
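
Since `strip()` accepts a set of characters to remove from both ends, the separate calls can also be collapsed into one, and a nested list comprehension avoids the index bookkeeping. This is a sketch of an alternative, not part of the original answer; `dmtVal` below is a shortened copy of the question's data.

```python
dmtVal = [
    ['2012-08-16 12:38:45.152222', "['0x46']", "['1.234']"],
    ['2012-08-16 12:38:45.152519', "['0x47']", "['8.901']"],
]

# strip("[]'") removes any leading/trailing '[', ']' or "'" characters;
# the timestamp has none at its ends, so it passes through untouched.
cleaned = [[item.strip("[]'") for item in grp] for grp in dmtVal]

print(cleaned)
```

Unlike the `replace()` chain, this only touches the ends of each string, so a bracket or quote in the middle of a value would be preserved.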

Regarding "python - How do I strip characters from substrings in a list generated by an XML xpath search?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/11994198/
