
python - How to use a regular expression when some fields are blank instead of holding a value

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:26:57

I have been trying to get the following result:

[('08/03/2019', '', '58', '71', '162', '', '1', '71.68', '69.03', '441381.64', '2829.37', '14', '1', '226', '2', '224', '62', '271')]

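The pattern works when no values are missing among the numbers. A complete raw line looks like this, followed by the result it produces: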

'08/03/2019     175   58   71  162|    5     1| 71.68 69.03|  441381.64    2829.37|   14     1|  226    2  224   62|   271|'

[('08/03/2019', '175', '58', '71', '162', '5', '1', '71.68', '69.03', '441381.64', '2829.37', '14', '1', '226', '2', '224', '62', '271')]

The pattern I used:

re.compile(r"([0-9]{2}\/[0-9]{2}\/[0-9]{4})\s{5}(\d+)\s{3}(\d+)\s{3}(\d+)\s{2}(\d+)[|]\s{4}(\d+)\s{5}(\d+)[|]\s{1}(\d+[.]\d+)\s{1}(\d+[.]\d+)[|]\s{2}(\d+[.]\d+)\s{4}(\d+[.]\d+)[|]\s{3}(\d+)\s{5}(\d+)[|]\s{2}(\d+)\s{4}(\d+)\s{2}(\d+)\s{3}(\d+)[|]\s{3}(\d+)")

The problem appears when the raw data contains blanks. For example, when 175 and 5 are missing, the compiled pattern can no longer match the line:

'08/03/2019        58   71  162|         1| 71.68 69.03|  441381.64    2829.37|   14     1|  226    2  224   62|   271|'
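The failure can be reproduced with a minimal sketch, assuming the two sample lines are copied with their spacing intact. The variable names are illustrative only:

```python
import re

# The original pattern from the question: every field is mandatory.
strict = re.compile(r"([0-9]{2}\/[0-9]{2}\/[0-9]{4})\s{5}(\d+)\s{3}(\d+)\s{3}(\d+)\s{2}(\d+)[|]\s{4}(\d+)\s{5}(\d+)[|]\s{1}(\d+[.]\d+)\s{1}(\d+[.]\d+)[|]\s{2}(\d+[.]\d+)\s{4}(\d+[.]\d+)[|]\s{3}(\d+)\s{5}(\d+)[|]\s{2}(\d+)\s{4}(\d+)\s{2}(\d+)\s{3}(\d+)[|]\s{3}(\d+)")

full_line = "08/03/2019     175   58   71  162|    5     1| 71.68 69.03|  441381.64    2829.37|   14     1|  226    2  224   62|   271|"
blank_line = "08/03/2019        58   71  162|         1| 71.68 69.03|  441381.64    2829.37|   14     1|  226    2  224   62|   271|"

full_result = strict.findall(full_line)    # one tuple of 18 fields
blank_result = strict.findall(blank_line)  # no match: digits are required after the first 5 spaces

print(len(full_result), len(blank_result))
```

The second call finds nothing because `\s{5}(\d+)` demands at least one digit immediately after exactly five spaces, and the blank line has more spaces there instead.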

Splitting with (\s+)\s+ does not help, because the spacing differs from field to field (the gaps are 5, 3, 3, 2, 4, 5, 1, 1, 2, 4, 3, 5, 2, 4, 2, 3, 3 spaces).
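To illustrate why splitting loses information, here is a small sketch. The helper `split_fields` is a hypothetical name, and splitting on runs of whitespace and pipes is just one plausible way to split, not necessarily what was tried:

```python
import re

full_line = "08/03/2019     175   58   71  162|    5     1| 71.68 69.03|  441381.64    2829.37|   14     1|  226    2  224   62|   271|"
blank_line = "08/03/2019        58   71  162|         1| 71.68 69.03|  441381.64    2829.37|   14     1|  226    2  224   62|   271|"

def split_fields(line):
    # Split on runs of whitespace and/or pipes, dropping empty pieces.
    return [piece for piece in re.split(r"[\s|]+", line) if piece]

full_fields = split_fields(full_line)
blank_fields = split_fields(blank_line)

# The blank line yields fewer fields, and every later value shifts left,
# so position 1 holds 58 instead of an empty placeholder for 175.
print(len(full_fields), len(blank_fields))
print(full_fields[1], blank_fields[1])
```

Once the missing values vanish from the list, there is no way to tell from the split result alone which columns they belonged to.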

Best answer

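The expression you designed looks great. You may just want to add a ? after each capture group whose value might be missing, which would likely solve the problem you are facing.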

Demo

For example, we would add two ? quantifiers:

import re

# (\d+)? makes the two capture groups that may be blank optional.
expression = r"([0-9]{2}\/[0-9]{2}\/[0-9]{4})\s{5}(\d+)?\s{3}(\d+)\s{3}(\d+)\s{2}(\d+)[|]\s{4}(\d+)?\s{5}(\d+)[|]\s{1}(\d+[.]\d+)\s{1}(\d+[.]\d+)[|]\s{2}(\d+[.]\d+)\s{4}(\d+[.]\d+)[|]\s{3}(\d+)\s{5}(\d+)[|]\s{2}(\d+)\s{4}(\d+)\s{2}(\d+)\s{3}(\d+)[|]\s{3}(\d+)"

# Sample lines with the spacing from the question; the first lacks 175 and 5.
string = """
08/03/2019        58   71  162|         1| 71.68 69.03|  441381.64    2829.37|   14     1|  226    2  224   62|   271|

08/03/2019     175   58   71  162|    5     1| 71.68 69.03|  441381.64    2829.37|   14     1|  226    2  224   62|   271|

"""

print(re.findall(expression, string))

Output

[('08/03/2019', '', '58', '71', '162', '', '1', '71.68', '69.03', '441381.64', '2829.37', '14', '1', '226', '2', '224', '62', '271'), ('08/03/2019', '175', '58', '71', '162', '5', '1', '71.68', '69.03', '441381.64', '2829.37', '14', '1', '226', '2', '224', '62', '271')]
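The empty strings in this output come from `re.findall`, which reports an optional group that did not participate as `''`. If a missing value should be distinguishable from text, `match.groups()` returns `None` instead. A minimal sketch, using a shortened pattern (an assumption made for brevity) that covers only the date and the first four integer fields:

```python
import re

# Shortened, illustrative pattern: date plus the first four integer fields.
pattern = re.compile(r"(\d{2}/\d{2}/\d{4})\s{5}(\d+)?\s{3}(\d+)\s{3}(\d+)\s{2}(\d+)[|]")

line = "08/03/2019        58   71  162|"

# findall fills non-participating groups with empty strings.
as_findall = pattern.findall(line)

# groups() on each match object uses None for them instead.
as_groups = [m.groups() for m in pattern.finditer(line)]

print(as_findall)
print(as_groups)
```

Which form is preferable depends on the downstream code: `None` is easier to test for before converting the fields to numbers.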

If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com. If you'd like, you can also see there how it matches against some sample inputs.


Regarding "python - How to use a regular expression when some fields are blank instead of holding a value", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57751948/
