I need to find the dollar amounts in a paragraph, together with a few (3 or 4) words surrounding each amount.
in-process research and development of $184.3 million and charges $120 of
million for the impairment of long-lived assets. See Notes 2, 16 and 21 to the
Consolidated Financial Statements. Income from continuing operations for the
fiscal year ended September 30, 2001 also includes a net gain on sale of
businesses and investments of $276.6 million and a net gain on the sale of
common shares of a subsidiary of $64.1 million.
What I would like to get is something like the following: [amount, amount + number word, the 3-4 words before the amount].
[$184.3, $184.3 million, research and development of $184.3 million],
[$120, $120 of million, charges $120 of million for the impairment of long-lived assets],
[$276.6, $276.6 million, investments of $276.6 million],
[$64.1, $64.1 million, a subsidiary of $64.1 million.]
This is what I have tried; it only finds the dollar amounts.
[\$]{1}\d+\.?\d{0,2}
Thanks!
So, let's give your pattern a name:
amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"
The number word (the word right after the amount) can then be defined in terms of it:
digit_word_patt = amount_patt + r" (\w+)"
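As a minimal sketch of how these two patterns behave, the snippet below applies them to the last sentence of the sample paragraph (condensed onto one line, which is an assumption made for the illustration). The names sample, amounts and amount_words are hypothetical.

```python
import re

# Patterns exactly as defined in the answer.
amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"
digit_word_patt = amount_patt + r" (\w+)"

# Last sentence of the sample paragraph, joined onto a single line.
sample = ("businesses and investments of $276.6 million and a net gain on the "
          "sale of common shares of a subsidiary of $64.1 million.")

# findall returns whole matches here because amount_patt has no capturing group.
amounts = re.findall(amount_patt, sample)

# digit_word_patt has one capturing group, so findall would return only the
# word; finditer gives access to both the whole match and the captured word.
amount_words = [(m.group(0), m.group(1))
                for m in re.finditer(digit_word_patt, sample)]

print(amounts)
print(amount_words)
```

Since digit_word_patt uses a literal space, it will not match when the amount and the following word are separated by a line break; replacing the space with \s+ is one way to relax that.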
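Now, for the surrounding 3-4 words, do the following. Write the quantifier as {3,4} with no space, because Python's re module treats {3, 4} as literal text. The whitespace also has to come before each trailing word, otherwise the pattern cannot continue past the space that follows the amount. Using \s+ rather than a literal space lets the context run across line breaks:
words_patt = r"(?:\S+\s+){3,4}" + amount_patt + r"(?:\s+\S+){3,4}"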
That's it! Now just use these patterns with the functions of the re module to extract the strings.
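One possible way to put the pieces together is sketched below. It is not the only approach: it assumes up to four words of leading context and a single optional trailing word, so that amounts near the start or end of the text are still found. The names record_patt and extract_amounts are hypothetical, and the sample text is the paragraph from the question with run-together words spaced out.

```python
import re

# Amount pattern as defined in the answer.
amount_patt = r"[\$]{1}[\d,]+\.?\d{0,2}"

# Assumption: allow 0-4 words before the amount and one optional word after it.
record_patt = re.compile(
    r"(?P<before>(?:\S+\s+){0,4})"        # up to four words of leading context
    r"(?P<amount>" + amount_patt + r")"   # the dollar amount itself
    r"(?:\s+(?P<word>\w+))?"              # optional word right after the amount
)

def extract_amounts(text):
    """Return (amount, amount + next word, context) tuples for each amount."""
    records = []
    for m in record_patt.finditer(text):
        amount = m.group("amount")
        word = m.group("word")
        amount_word = amount if word is None else amount + " " + word
        # Collapse line breaks and repeated spaces in the matched context.
        context = " ".join(m.group(0).split())
        records.append((amount, amount_word, context))
    return records

text = (
    "in-process research and development of $184.3 million and charges $120 of\n"
    "million for the impairment of long-lived assets. See Notes 2, 16 and 21 to the\n"
    "Consolidated Financial Statements. Income from continuing operations for the\n"
    "fiscal year ended September 30, 2001 also includes a net gain on sale of\n"
    "businesses and investments of $276.6 million and a net gain on the sale of\n"
    "common shares of a subsidiary of $64.1 million."
)

records = extract_amounts(text)
for record in records:
    print(record)
```

Because finditer returns non-overlapping matches, words already consumed by one match are not reused as leading context for the next, so amounts that sit close together may get fewer than four words of context. For the sample, the first record is ('$184.3', '$184.3 million', 'research and development of $184.3 million').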