
python - Extract numerical values with at most 6 digits, and an optional 2 decimal places, from a string

Reposted. Author: 行者123. Updated: 2023-12-01 00:25:08

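I have a task where I need to extract values from text that represents numbers. However, I am only interested in values with at most 6 digits; the decimal part is optional.

For example, from the text below: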

Total compensation for Mr. XYZ was $5,123,456 and other salary which was $650,000 in fiscal 2018, was determined to be approximately 8.78 times the median annual compensation for all of the firm's other employees, which was approximately $74,000. Some other salaries are 56000.

I need to extract

["650,000", "2018", "8.78", "74,000", "56000"]

The regex I am using:

((\d{1,3})(?:,[0-9]{3}){0,1}|(\d{1,6}))(\.\d{1,2})?

It correctly identifies 650,000 and 74,000, but not the other values.

I found this currency regex for 7-digit numbers and built a 6-digit version around it, but without success. How can I correct my regex?
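The behavior described in the question can be reproduced with a short sketch. It assumes the decimal group is meant to be `(\.\d{1,2})?` with no space inside, and the names `text`, `question_pattern` and `found` are only illustrative. The output suggests two causes: the first alternative `\d{1,3}` always succeeds on any digit run, so `\d{1,6}` is never tried, and nothing stops a match from starting or ending in the middle of a longer number.

```python
import re

text = (
    "Total compensation for Mr. XYZ was $5,123,456 and other salary which was "
    "$650,000 in fiscal 2018, was determined to be approximately 8.78 times the "
    "median annual compensation for all of the firm's other employees, which was "
    "approximately $74,000. Some other salaries are 56000."
)

# Assumption: the question's regex with the stray space in the decimal group removed.
question_pattern = re.compile(r"((\d{1,3})(?:,[0-9]{3}){0,1}|(\d{1,6}))(\.\d{1,2})?")

# group(0) is the whole match; findall() would return tuples of the capture groups instead.
found = [m.group(0) for m in question_pattern.finditer(text)]
print(found)
# ['5,123', '456', '650,000', '201', '8', '8.78', '74,000', '560', '00']
```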

Best answer

Try this: (?<![\d,.])(?:\d,?){0,5}\d(?:\.\d+)?(?!,?\d)

A detailed explanation follows:
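As a minimal sketch of how this pattern might be used from Python, the snippet below runs it over the sample text from the question with `re.findall`. The variable names and the conversion to `float` are illustrative assumptions, not part of the answer. Because the pattern contains only non-capturing groups, `findall` returns the full matched strings.

```python
import re

text = (
    "Total compensation for Mr. XYZ was $5,123,456 and other salary which was "
    "$650,000 in fiscal 2018, was determined to be approximately 8.78 times the "
    "median annual compensation for all of the firm's other employees, which was "
    "approximately $74,000. Some other salaries are 56000."
)

pattern = re.compile(r"(?<![\d,.])(?:\d,?){0,5}\d(?:\.\d+)?(?!,?\d)")

matches = pattern.findall(text)
print(matches)
# ['650,000', '2018', '8.78', '74,000', '56000']

# Optional step (an assumption about what "numerical value" means here):
# drop the thousands separators and convert to float.
values = [float(m.replace(",", "")) for m in matches]
print(values)
# [650000.0, 2018.0, 8.78, 74000.0, 56000.0]
```

The 7-digit amount 5,123,456 is skipped entirely rather than being split into smaller pieces, which is what the lookbehind and lookahead are for.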

(?x) # verbose mode: whitespace and comments in the pattern are ignored

# Do not start in the middle of a number: no digit, comma or dot directly before the match
(?<![\d,.])

# Up to 5 leading digits, each optionally followed by a comma. For simplicity the commas are not required to sit at thousands positions, so 5,4,3,2 is also accepted; be aware of that
(?:\d,?){0,5}

# The last digit of the integer part
\d

# Optional dot and decimal part
(?:\.\d+)?

# Do not stop in the middle of a longer number: no digit may follow. A grammatical comma may follow, but a comma followed by a digit may not
(?!,?\d)
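In Python, the commented form can be compiled directly by passing `re.VERBOSE`, which plays the same role as the inline `(?x)` flag. The sketch below checks that the commented and one-line forms agree on a few strings; the names `COMPACT`, `VERBOSE` and `samples` are illustrative.

```python
import re

COMPACT = re.compile(r"(?<![\d,.])(?:\d,?){0,5}\d(?:\.\d+)?(?!,?\d)")

VERBOSE = re.compile(
    r"""
    (?<![\d,.])      # not preceded by a digit, comma or dot
    (?:\d,?){0,5}    # up to 5 digits, each optionally followed by a comma
    \d               # last digit of the integer part
    (?:\.\d+)?       # optional decimal part
    (?!,?\d)         # not followed by a digit, nor by a comma plus a digit
    """,
    re.VERBOSE,
)

samples = ["$650,000 in 2018", "8.78 times", "$5,123,456", "1234567", "5,4,3,2"]
for s in samples:
    # Both spellings of the pattern should behave identically.
    assert COMPACT.findall(s) == VERBOSE.findall(s)

print(VERBOSE.findall("1234567"))   # [] : 7 digits are rejected as a whole
print(VERBOSE.findall("5,4,3,2"))   # ['5,4,3,2'] : the caveat about loose commas
```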

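There may be room for improvement, but I think this is what you want. Some cases may not be handled; if you find one, please let me know. You can test it here: https://regex101.com/r/Wxi5Sj/2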
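One possible tightening, which is not part of the original answer, is sketched below. It assumes two extra requirements: a comma is only accepted as a single thousands separator (as in 650,000), and the decimal part has at most 2 digits, as the title asks. Numbers with a longer decimal part are rejected as a whole rather than truncated. The names `LOOSE` and `STRICT` are hypothetical.

```python
import re

# The pattern from the answer, for comparison.
LOOSE = re.compile(r"(?<![\d,.])(?:\d,?){0,5}\d(?:\.\d+)?(?!,?\d)")

# Assumed stricter variant: either "d{1,3},ddd" or a plain run of 1 to 6 digits,
# then at most 2 decimals, and no continuation of the number afterwards.
STRICT = re.compile(
    r"""
    (?<![\d,.])                  # not preceded by a digit, comma or dot
    (?:\d{1,3},\d{3}|\d{1,6})    # one thousands separator, or plain digits
    (?:\.\d{1,2})?               # optional decimal part, at most 2 digits
    (?!,?\d)                     # no further digit, with or without a comma
    (?!\.\d)                     # no unconsumed decimal part
    """,
    re.VERBOSE,
)

sample = "$650,000 in fiscal 2018, about 8.78 times, $74,000. Also 56000."
print(STRICT.findall(sample))
# ['650,000', '2018', '8.78', '74,000', '56000']

print(LOOSE.findall("5,4,3,2"), STRICT.findall("5,4,3,2"))
# ['5,4,3,2'] []

print(LOOSE.findall("12.345"), STRICT.findall("12.345"))
# ['12.345'] []
```

Whether rejecting 12.345 outright is the right behavior depends on the data; the loose pattern may be preferable when long decimal parts should still be captured.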

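Regarding "python - Extract numerical values with at most 6 digits, and an optional 2 decimal places, from a string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58637991/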
