python - Why is my program not detecting the words "Address:" or "Professional:"?

Reposted. Author: 行者123. Updated: 2023-12-01 00:58:56

I am trying to search plain text laid out like this:

Named H Man, MBA
Personal:
Address:
Professional:
0000 Something St
Apt 000
City, ST 12345-6789
No Business Contact Information.
Academic:
2019 Bachelors, Education - AF s

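My goal is to retrieve only the first part of the address in this text, i.e. the "0000 Something St" and "Apt 000" parts. This is complicated by the fact that some entries in the plain text are laid out differently, so I'm using a more general approach: I look for the line containing the word "Address:" or "Professional:" to get the line where the part of the text I want begins, and then find the first line after it that contains a comma and treat that as the end. Once that's done, I'll write code to strip everything I don't need from those lines. Most of the texts work with the program as written; only this one outputs nothing, and I think that's because for some reason it isn't correctly detecting the words "Address:" or "Professional:".

The code I've written so far is below, plus a method that outputs the results afterwards, which can't be the problem: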

def FindAddress(person):
    global address
    address = "NA"
    addressUncropped = ""
    lineBeforeAddress = 0
    lineAfterAddress = 0
    personLines = person.splitlines()
    wordList = []
    lineIndex = 0
    for line in personLines: # This sets up the before and after markers to be used later
        wordList = line.split(" ")
        for word in wordList:
            print(word)
            if word == "Address:" or word == "Professional:" and lineBeforeAddress == 0:
                lineBeforeAddress = lineIndex
            if "," in line and lineAfterAddress == 0 and lineIndex >= lineBeforeAddress:
                lineAfterAddress = lineIndex+1
        lineIndex += 1
    for line in personLines[lineBeforeAddress:lineAfterAddress]: # This uses the before and after markers to get the address
        addressUncropped += line

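If you have any other unrelated suggestions that might help with this task, I'd like to hear those too. Thanks!

Best answer

The problem is that this condition is true on the first line: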

if "," in line and lineAfterAddress == 0 and lineIndex >= lineBeforeAddress:

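The first line, Named H Man, MBA, contains a comma. lineAfterAddress and lineBeforeAddress are both zero, so lineIndex >= lineBeforeAddress is true. As a result, lineAfterAddress is set to 1 before either marker word has been found, and the final slice comes out empty. You need to check that lineBeforeAddress has been set, so the condition also needs lineBeforeAddress > 0.

Also, this test shouldn't be inside the for word in wordList loop, because it tests the whole line, not an individual word.

The final loop can be simplified to: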
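To make the failure concrete, here is a small sketch that evaluates the comma test against the first line of the sample text, using the same variable names as the question. The `sample` string and the names `original_test` and `guarded_test` are introduced only for this illustration.

```python
# The sample text from the question, one entry per line.
sample = """Named H Man, MBA
Personal:
Address:
Professional:
0000 Something St
Apt 000
City, ST 12345-6789
No Business Contact Information.
Academic:
2019 Bachelors, Education - AF s"""

lines = sample.splitlines()

# State at the very first line, before any marker word has been seen.
lineBeforeAddress = 0
lineAfterAddress = 0
lineIndex = 0
first_line = lines[lineIndex]

# The test as written in the question.
original_test = "," in first_line and lineAfterAddress == 0 and lineIndex >= lineBeforeAddress
# The same test with the extra guard suggested in the answer.
guarded_test = original_test and lineBeforeAddress > 0

print(original_test)  # True: the end marker is set on the name line
print(guarded_test)   # False: the name line is skipped

# With the unguarded test the markers end up as start=2, end=1,
# so the slice used to build the address is empty.
print(lines[2:1])     # []
```

Note that the `lineBeforeAddress > 0` guard treats 0 as "not set yet", so it assumes the marker word never appears on the very first line of the text.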

addressUncropped = "".join(personLines[lineBeforeAddress:lineAfterAddress])
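As a quick check that the two forms agree, the sketch below builds the string both ways from a short hand-written list. The list contents and the marker values are made up for the example; `spaced` is an extra variable showing what a separator would do, which is not something the original code uses.

```python
# Hypothetical lines and markers, chosen only to illustrate the equivalence.
personLines = ["Address:", "Professional:", "0000 Something St", "Apt 000"]
lineBeforeAddress, lineAfterAddress = 2, 4

# The loop from the question: append each line in the slice.
looped = ""
for line in personLines[lineBeforeAddress:lineAfterAddress]:
    looped += line

# The one-line replacement from the answer.
joined = "".join(personLines[lineBeforeAddress:lineAfterAddress])

print(looped == joined)  # True
print(joined)            # 0000 Something StApt 000

# Joining on "" fuses the lines together; a separator keeps them apart.
spaced = " ".join(personLines[lineBeforeAddress:lineAfterAddress])
print(spaced)            # 0000 Something St Apt 000
```

Both versions run the lines together with nothing between them, because `splitlines()` drops the line breaks. If the later cropping step needs to tell the lines apart, joining on a space or a newline may be more convenient.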

Full code:

def FindAddress(person):
    global address
    address = "NA"
    addressUncropped = ""
    lineBeforeAddress = 0
    lineAfterAddress = 0
    personLines = person.splitlines()
    wordList = []
    lineIndex = 0
    for line in personLines: # This sets up the before and after markers to be used later
        wordList = line.split(" ")
        for word in wordList:
            if (word == "Address:" or word == "Professional:") and lineBeforeAddress == 0:
                lineBeforeAddress = lineIndex
        if "," in line and lineAfterAddress == 0 and lineBeforeAddress > 0 and lineIndex >= lineBeforeAddress:
            lineAfterAddress = lineIndex+1
        lineIndex += 1
    addressUncropped = "".join(personLines[lineBeforeAddress:lineAfterAddress])
    return addressUncropped
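Since the question also asks for other suggestions, here is one possible variant of the same idea, offered as a sketch rather than as the answer's method. It uses `None` instead of 0 to mean "marker not found yet", so a marker on the very first line would also work, and it uses `enumerate` instead of a manual counter. The function name `find_address_lines` and the `MARKERS` tuple are hypothetical, and the function returns a list of lines rather than one joined string. It also splits on any whitespace with `split()` rather than `split(" ")`.

```python
MARKERS = ("Address:", "Professional:")  # marker words taken from the question


def find_address_lines(person):
    """Return the lines from the first marker line through the first
    line at or after it that contains a comma, or [] if none is found."""
    lines = person.splitlines()
    start = None  # None means no marker word has been seen yet
    for index, line in enumerate(lines):
        # Look for a marker word only until the first one is found.
        if start is None and any(word in MARKERS for word in line.split()):
            start = index
        # The comma test is done once per line, and only after a marker.
        if start is not None and "," in line:
            return lines[start:index + 1]
    return []


sample = """Named H Man, MBA
Personal:
Address:
Professional:
0000 Something St
Apt 000
City, ST 12345-6789
No Business Contact Information.
Academic:
2019 Bachelors, Education - AF s"""

for line in find_address_lines(sample):
    print(line)
# Address:
# Professional:
# 0000 Something St
# Apt 000
# City, ST 12345-6789
```

Like the answer's code, this still returns the marker lines and the city line along with the street lines, so the cropping step described in the question is still needed afterwards.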

Regarding "python - Why is my program not detecting the words "Address:" or "Professional:"?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55977165/
