
python - Extracting company names and time periods from a string

Reposted. Author: 太空宇宙. Updated: 2023-11-04 01:38:10

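I am using Python to extract information about certain companies from Reuters. So far I have been able to get the names, biographies, and compensation of officers/executives from this page.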

Now I want to extract previous positions and companies from the biography section, which looks like this:

Mr. Donald T. Grimes is Senior Vice President, Chief Financial Officer and Treasurer of Wolverine World Wide, Inc., since May 2008. From 2007 to 2008, he was the Executive Vice President and Chief Financial Officer for Keystone Automotive Operations, Inc., a distributor of automotive accessories and equipment. Prior to Keystone, Mr. Grimes held a series of senior corporate and divisional finance roles at Brown-Forman Corporation, a manufacturer and marketer of premium wines and spirits. During his employment at Brown-Forman, Mr. Grimes was Vice President, Director of Beverage Finance from 2006 to 2007; Vice President, Director of Corporate Planning and Analysis from 2003 to 2006; and Senior Vice President, Chief Financial Officer of Brown-Forman Spirits America from 1999 to 2003.

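I can get the year ranges with a simple regular expression, but I don't know how to write a regular expression that captures the title and the company name. I know the string format is inconsistent, so I will accept an answer that works in at least 70% of cases. This is the output I want: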

2007-2008, executive vice president and chief financial officer, Keystone Automotive operations

Best answer

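The problem you are trying to solve is well known and well studied. If you search Google for the terms "named entity extraction" and "relation extraction", you will find plenty of research papers describing methods and algorithms. Some good starting points are: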

These are just a few interesting links I found. There are many more, possibly better than these, but this should help you get started.
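As a stopgap before adopting a full entity-extraction approach, a plain regular expression can cover the single sentence shape "From YYYY to YYYY, he was the TITLE for COMPANY" seen in the sample biography. The sketch below is only a heuristic under that assumption; the names `POSITION_RE` and `extract_positions` are made up for illustration, and other phrasings in the biography (for example the semicolon-separated roles at Brown-Forman) are not handled.

```python
import re

# Heuristic pattern (an assumption, not a general solution): matches
# "From 2007 to 2008, he was the <title> for <Company Name>".
POSITION_RE = re.compile(
    r"From (?P<start>\d{4}) to (?P<end>\d{4}), "
    r"(?:he|she) (?:was|served as) (?:the |an? )?"
    r"(?P<title>.+?) (?:for|of|at) "
    # Company: a run of capitalized words; stops at the first comma.
    r"(?P<company>[A-Z][\w&.'-]*(?: [A-Z][\w&.'-]*)*)"
)


def extract_positions(bio):
    """Return (years, title, company) tuples for each matching sentence."""
    results = []
    for m in POSITION_RE.finditer(bio):
        years = m.group("start") + "-" + m.group("end")
        results.append((years, m.group("title").lower(), m.group("company")))
    return results


sample = (
    "From 2007 to 2008, he was the Executive Vice President and Chief "
    "Financial Officer for Keystone Automotive Operations, Inc., a "
    "distributor of automotive accessories and equipment."
)
print(extract_positions(sample))
# [('2007-2008', 'executive vice president and chief financial officer',
#   'Keystone Automotive Operations')]
```

Because the title is matched lazily up to the first "for", "of", or "at" followed by a capitalized word, titles that themselves contain such a phrase (for example "Director of Beverage Finance") would be split incorrectly, which is one reason the named entity and relation extraction methods mentioned above are the more robust route.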

Regarding "python - Extracting company names and time periods from a string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/7757554/
