python - Extracting a substring from a CSV table

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:10:22

I am trying to clean up the data in a CSV table that looks like this:

KATY PERRY@katyperry
1,084,149,282,038,820
Justin Bieber@justinbieber
10,527,300,631,674,900,000
Barack Obama@BarackObama
9,959,243,562,511,110,000

I only want to extract the "@" handles, for example:

@katyperry
@justinbieber
@BarackObama

This is the code I wrote, but all it does is repeat the second row of the table over and over:

import csv
import re
with open('C:\\Users\\TK\\Steemit\\Scripts\\twitter.csv', 'rt', encoding='UTF-8') as inp:
    read = csv.reader(inp)
    for row in read:
        for i in row:
            if i.isalpha():
                stringafterword = re.split('\\@\\',row)[-1]
                print(stringafterword)

Best answer

If you are willing to use re, you can get the list of strings in a single line:

import re

# content string added to make this a working example
content = """KATY PERRY@katyperry
1,084,149,282,038,820
Justin Bieber@justinbieber
10,527,300,631,674,900,000
Barack Obama@BarackObama
9,959,243,562,511,110,000"""

# solution using 're': '.' does not match a newline,
# so each match runs from '@' to the end of its line
m = re.findall(r'@.*', content)
print(m)

# option without 're', using str.find() and based on your loop:
for row in content.splitlines():
    pos_of_at = row.find('@')
    if pos_of_at > -1:  # -1 indicates "substring not found"
        print(row[pos_of_at:])

You should of course replace the content string with the contents of your file.
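As a rough sketch of that last step, the file can be read as plain text and passed to the same kind of regular expression. The helper names extract_handles and extract_handles_from_file are hypothetical, and the pattern @\w+ is an assumption that a handle consists only of letters, digits, and underscores; it is a tighter match than the '@.*' used in the answer, which runs to the end of the line.

```python
import re
from pathlib import Path


def extract_handles(text):
    """Return every '@handle' found in text, in order of appearance."""
    # Assumption: a handle is '@' followed by letters, digits, or underscores.
    return re.findall(r'@\w+', text)


def extract_handles_from_file(path, encoding='utf-8'):
    """Read the whole file as text and extract the handles from it."""
    # The file is treated as plain text, so csv.reader is not needed here.
    return extract_handles(Path(path).read_text(encoding=encoding))


if __name__ == '__main__':
    sample = (
        "KATY PERRY@katyperry\n"
        "1,084,149,282,038,820\n"
        "Justin Bieber@justinbieber\n"
    )
    print(extract_handles(sample))  # ['@katyperry', '@justinbieber']
```

To use it on the file from the question, call extract_handles_from_file with the path to twitter.csv. Reading the file as raw text sidesteps the commas inside the number rows, which csv.reader would otherwise split into separate fields.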

Regarding python - Extracting a substring from a CSV table, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48552288/
