python - How to combine separate string objects into a list in Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:25:54

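I end up with objects named "members" and "pcps" as the final product, but each is really a bunch of separate string objects. I need to combine them into a single list so that I can add them to a DataFrame and eventually export it as an Excel sheet.

The problem arises because I am scraping text data from a PDF, which does not come in a list-of-lists data structure. I am wondering whether the line where I create the "members" Series could somehow combine these separate objects into one list.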


import PyPDF2
import pandas as pd


def PDFsearch(origFileName):

    # creating a pdf File object of original pdf
    pdfFileObj = open(origFileName, 'rb')
    # creating a pdf Reader object
    pdfReader = PyPDF2.PdfFileReader(pdfFileObj)

    numPages = pdfReader.numPages
    print(numPages)
    for p in range(pdfReader.numPages):

        # creating page object
        pageObj = pdfReader.getPage(p)
        # extract txt from pageObj into unicode string object
        pages = pageObj.extractText()
        # loop through string object by page
        pges = []

        for page in pages.split("\n"):
            # split the pages into words
            pges.append(page)

            lns = []
            for lines in page.split(" "):
                for line in lines.split(","):  # separate the ,"This" from the last name
                    lns.append(line)

            names = list()
            if lns[0] == "Dear":  # If first word in a line is "Dear"
                names.append(lns[1:4])  # Get the 2nd and 3rd items (First and Last names)
                for name in names:
                    members = " ".join(name)  # These are the names we want

                PCPs = lns[78:85]
                pcps = " ".join(PCPs)

                # a brand-new one-element Series is created on every matching line
                providers = pd.Series(pcps)
                members = pd.Series(members)

'''This is what I get when I print the series 'members':

0 LAILIA TAYLOR
dtype: object
0 LATASIA WILLIS
dtype: object
0 LAURYN ALLEN
dtype: object
0 LAYLA ALVARADO
dtype: object
0 LAYLA BORELAND
dtype: object
0 LEANIAH MULLIGAN
dtype: object

All separate objects! Same with 'providers'. And when I create a DataFrame and export to Excel, I only get one row.'''

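Best answer

This is from a quick look, but I believe your problem is that you overwrite your Series on every iteration. Try something like this: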

# add at the beginning of your function 
temp = pd.DataFrame()
data = pd.DataFrame()

# this would replace where you assign to providers and members
temp['providers'] = pd.Series(pcps)
temp['members'] = pd.Series(members)
data = pd.concat([data, temp]).reset_index(drop=True)

This way temp is overwritten each time, but your data DataFrame will contain all the members and providers. I hope this helps, good luck!
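As an alternative to concatenating inside the loop, the values can be appended to plain Python lists and the DataFrame built once at the end. The following is only a minimal sketch of that idea, not the answerer's method: the function name `collect_members`, the sample lines, and the slice positions are all hypothetical stand-ins for the text that would come out of the PDF.

```python
import pandas as pd


def collect_members(lines):
    """Collect one (member, provider) pair per line that starts with 'Dear'.

    `lines` is a hypothetical list of already-extracted text lines; the
    slice positions below are illustrative, not the asker's real offsets.
    """
    members = []
    providers = []
    for line in lines:
        # Split on spaces, then on commas, dropping empty fragments.
        words = [w for chunk in line.split(" ") for w in chunk.split(",") if w]
        if words and words[0] == "Dear":
            members.append(" ".join(words[1:3]))   # first and last name
            providers.append(" ".join(words[3:]))  # assumption: rest of the line is the provider
    # Build the DataFrame once, after the loop, from the accumulated lists.
    return pd.DataFrame({"members": members, "providers": providers})


# Made-up sample input standing in for the extracted PDF text.
sample = [
    "Dear LAILIA TAYLOR,Dr Smith",
    "Some other line",
    "Dear LAYLA ALVARADO,Dr Jones",
]
df = collect_members(sample)
print(df)
```

Because every match adds a row to the same lists, the resulting DataFrame has one row per member, and a single call such as `df.to_excel("members.xlsx", index=False)` (the file name is a placeholder, and an Excel writer such as openpyxl must be installed) should then write all rows instead of just one.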

Regarding "python - How to combine separate string objects into a list in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57827329/
