
python - Split on spaces, except between certain characters

Reposted. Author: 太空宇宙. Updated: 2023-11-03 12:17:44

I am parsing a file that contains lines like the following:

type("book") title("golden apples") pages(10-35 70 200-234) comments("good read")

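I want to split each line into separate fields.

In my example there are four fields: type, title, pages, and comments.

The result I want after splitting is: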

['type("book")', 'title("golden apples")', 'pages(10-35 70 200-234)', 'comments("good read")']

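Obviously a simple string split will not work, because it splits at every space. I want to split on spaces but keep anything between parentheses or quotes intact.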
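To make the problem concrete, here is a small sketch of what a plain `str.split()` does to the sample line. The variable names `line` and `naive` are only illustrative.

```python
# The sample line from the question.
line = 'type("book") title("golden apples") pages(10-35 70 200-234) comments("good read")'

# str.split() with no arguments splits on every run of whitespace,
# so the spaces inside the parentheses break the fields apart.
naive = line.split()

print(len(naive))  # 8 pieces instead of the 4 fields wanted
print(naive)
# ['type("book")', 'title("golden', 'apples")', 'pages(10-35', '70', '200-234)', 'comments("good', 'read")']
```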

How can I split the line this way?

Best answer

This regular expression should work for you: \s+(?=[^()]*(?:\(|$))

import re
result = re.split(r"\s+(?=[^()]*(?:\(|$))", subject)  # subject is the line to split
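A minimal runnable sketch of the call above, applied to the sample line from the question. The helper name `split_fields` and the `strip()` step are illustrative additions, not part of the original answer. Without the strip, a trailing newline (common when reading lines from a file) would match the `$` branch and leave an empty string at the end of the result.

```python
import re

# The pattern from the answer, compiled once for reuse.
FIELD_SEPARATOR = re.compile(r"\s+(?=[^()]*(?:\(|$))")

def split_fields(line):
    """Split a line on whitespace that is not inside parentheses.

    strip() is an added precaution: trailing whitespace would otherwise
    be followed directly by the end of the string, which the lookahead
    accepts, producing an empty final field.
    """
    return FIELD_SEPARATOR.split(line.strip())

subject = 'type("book") title("golden apples") pages(10-35 70 200-234) comments("good read")'
result = split_fields(subject)
print(result)
# ['type("book")', 'title("golden apples")', 'pages(10-35 70 200-234)', 'comments("good read")']
```

When reading from a file, something like `[split_fields(raw) for raw in f]` would apply the same split to every line.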

Explanation

r"""
\s          # Match a single whitespace character (space, tab, or line break)
+           # One or more times, as many as possible, giving back as needed (greedy)
(?=         # Positive lookahead: assert that the following can be matched from this position
  [^()]     #   Match a single character that is neither "(" nor ")"
  *         #   Zero or more times, as many as possible, giving back as needed (greedy)
  (?:       #   Non-capturing group containing two alternatives
            #     First alternative (the second is tried only if this one fails)
    \(      #     Match the character "(" literally
    |       #     Or the second alternative (the group fails if this one fails too)
    $       #     Assert position at the end of the string (or just before a trailing line break)
  )
)
"""

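The commented pattern above can also be used directly as a verbose regular expression. The sketch below assumes it is compiled with `re.VERBOSE`, so that whitespace and `#` comments inside the pattern are ignored. The names `SEPARATOR` and `sample` are illustrative. It also shows one limitation: the lookahead only inspects the next parenthesis, so nested parentheses are not handled.

```python
import re

# Same pattern as the one-line version, written with re.VERBOSE so the
# comments and layout are ignored by the regex engine.
SEPARATOR = re.compile(
    r"""
    \s+               # one or more whitespace characters
    (?=               # lookahead: consumes nothing, only checks what follows
        [^()]*        #   any run of characters that are not parentheses
        (?: \( | $ )  #   then an opening parenthesis or the end of the string
    )
    """,
    re.VERBOSE,
)

sample = 'type("book") title("golden apples") pages(10-35 70 200-234) comments("good read")'
print(SEPARATOR.split(sample))
# ['type("book")', 'title("golden apples")', 'pages(10-35 70 200-234)', 'comments("good read")']

# Limitation: a space followed by a nested "(" is treated as a separator.
print(SEPARATOR.split('a(b (c) d) e(f)'))
# ['a(b', '(c) d)', 'e(f)']
```

For the same reason, a quoted string that sits outside any parentheses would still be split at its spaces. The pattern relies on every field having the flat name(...) shape shown in the question.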
Regarding "python - Split on spaces, except between certain characters", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/9644784/
