python - Keep quoted chunks intact when splitting by delimiter

Reposted. Author: 行者123. Last updated: 2023-11-30 22:04:11

Given a sample string s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"', I want to split it into the following chunks:

# To Do: something like {l = s.split(',')}
l = ['Hi', 'my name is Humpty-Dumpty', '"Alice, Through the Looking Glass"']

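I don't know in advance where the delimiters will appear or how many there will be.

This is my initial attempt. It is long, and it is not exact: it removes every delimiter, whereas I want the delimiters inside quotes to be preserved: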

s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
ss = []
inner_string = ""
delimiter = ','

for item in s.split(delimiter):
    if not inner_string:
        if '"' not in item:  # regular string, not interesting
            ss.append(item)
        else:
            inner_string += item  # start of a quoted string

    elif inner_string:
        inner_string += item  # note: the delimiter is not re-inserted here

        if '"' in item:  # end of the quoted string
            ss.append(inner_string)
            inner_string = ""
        else:  # middle of the quoted string
            pass

print(ss)
# prints ['Hi', ' my name is Humpty-Dumpty', ' from "Alice Through the Looking Glass"'] which is OK-ish

Best answer

You can use re.split to split on a regular expression:

>>> import re
>>> [x for x in re.split(r'([^",]*(?:"[^"]*"[^",]*)*)', s) if x not in (',','')]

With s equal to:

'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'

this outputs:

['Hi', ' my name is Humpty-Dumpty', ' from "Alice, Through the Looking Glass"']
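
For reuse, the one-liner can be wrapped in a small function. The following is only a sketch built on the answer's pattern; the names split_keep_quotes and strip are hypothetical and do not appear in the original answer. The optional strip flag removes the leading spaces that the raw split leaves behind.

```python
import re

# Pattern from the answer: runs of characters that are neither " nor ,
# optionally interleaved with complete "quoted blocks".
_FIELD = re.compile(r'([^",]*(?:"[^"]*"[^",]*)*)')


def split_keep_quotes(text, strip=False):
    """Split text on commas that lie outside double quotes.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # re.split returns the captured fields plus the separators and some
    # empty strings; drop the bare commas and the empty strings.
    parts = [p for p in _FIELD.split(text) if p not in (',', '')]
    return [p.strip() for p in parts] if strip else parts


s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
print(split_keep_quotes(s))
print(split_keep_quotes(s, strip=True))
```

Note that, because empty strings are filtered out, genuinely empty fields (as in 'a,,b') are dropped as well. If empty fields matter, this filter would need to be adjusted.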

Explanation of the regular expression:

(
  [^",]*          zero or more chars other than " or ,
  (?:             non-capturing group
    "[^"]*"       quoted block
    [^",]*        followed by zero or more chars other than " or ,
  )*              zero or more times
)
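
The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace in the pattern and allows # comments. This is a minimal sketch of that layout, not a different technique; the name FIELD is hypothetical.

```python
import re

# FIELD is a hypothetical name. The pattern is the one from the answer,
# spread over several lines so each piece carries its own comment.
FIELD = re.compile(r'''
    (               # capturing group, so re.split also returns the matches
      [^",]*        # zero or more chars other than " or ,
      (?:           # non-capturing group
        "[^"]*"     # quoted block
        [^",]*      # followed by zero or more chars other than " or ,
      )*            # zero or more times
    )
''', re.VERBOSE)

s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
# Same filtering as in the answer: drop bare commas and empty strings.
parts = [x for x in FIELD.split(s) if x not in (',', '')]
print(parts)
```

The verbose form is longer, but it may be easier to maintain if the delimiter or quote character ever needs to change.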

Regarding python - keep quoted chunks intact when splitting by delimiter, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53391766/
