python - Extracting information from a string

Reposted. Author: 行者123. Updated: 2023-11-28 01:38:49

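The following code works, but I can't extract the information I need from its result. Can I do this with BeautifulSoup, or do I need a regular expression?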

from bs4 import BeautifulSoup
import urllib2
mynumber='1234567890'
url="http://www.nccptrai.gov.in/nccpregistry/saveSearchSub.misc?phoneno="+mynumber
page=urllib2.urlopen(url)
soup = BeautifulSoup(page.read())

table = soup.findAll("table")[1]
myl=[item.text.strip() for item in table.find_all('td')]
import re
re.findall(r'is:\s*[^,]*' , myl[1])

The expected output is the 4 values contained in the first string of that list:

['2014-08-07 15:50:00', 'Andhra Pradesh', 'Unitech', '0']

(Note that the date has been changed to Y-M-D order.)

The string that is currently returned looks like this:

[u'is:\n 31-10-2009 01:11\n\n\nService Area : \n Mumbai\n\n\nService Provider :\n Idea\n\n\n\n\n\nYour Preference is :0']

Best answer

I would rely on the "The number is registered in NCPR" header (it is in a td tag with the class GridHeader) and get the rows that follow it via find_next_siblings():

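# Python 2: urllib2 and the print statement are Python 2 only.
import urllib2
from bs4 import BeautifulSoup

mynumber = '1234567890'
url = "http://www.nccptrai.gov.in/nccpregistry/saveSearchSub.misc?phoneno=" + mynumber

soup = BeautifulSoup(urllib2.urlopen(url))

# Locate the header cell, then walk the table rows that follow its row.
header = soup.find('td', class_='GridHeader')

result = []
for row in header.parent.find_next_siblings('tr'):
    cells = row.find_all('td')
    try:
        # The value sits in the third cell of each data row.
        result.append(cells[2].get_text(strip=True))
    except IndexError:
        # Skip rows that have fewer than three cells.
        continue
print result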

Prints:

[u'07-08-2014 15:50', u'Andhra Pradesh', u'Unitech', u'0']
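
The printed date is still in day-month-year order, while the question asks for Y-M-D, and the question also asks whether a regex could do the job. The following is a minimal sketch of both, written for Python 3 and not part of the original answer. It assumes the timestamp is always formatted as DD-MM-YYYY HH:MM and that the flattened text always carries the labels shown in the question; the helper names to_iso and parse_flat_text are made up for illustration.

```python
import re
from datetime import datetime


def to_iso(stamp):
    """Convert 'DD-MM-YYYY HH:MM' into 'YYYY-MM-DD HH:MM:SS'."""
    parsed = datetime.strptime(stamp, "%d-%m-%Y %H:%M")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


# Assumed layout of the flattened cell text, based on the sample in the question.
FLAT_PATTERN = re.compile(
    r"is:\s*(?P<date>\d{2}-\d{2}-\d{4} \d{2}:\d{2})"
    r"\s*Service Area\s*:\s*(?P<area>.+?)"
    r"\s*Service Provider\s*:\s*(?P<provider>.+?)"
    r"\s*Your Preference is\s*:\s*(?P<pref>\S+)",
    re.S,
)


def parse_flat_text(text):
    """Pull the four fields out of the flattened text, or return None."""
    match = FLAT_PATTERN.search(text)
    if match is None:
        return None
    return [
        to_iso(match.group("date")),
        match.group("area"),
        match.group("provider"),
        match.group("pref"),
    ]


# Route 1: normalize the date in the list produced by the answer above.
result = ['07-08-2014 15:50', 'Andhra Pradesh', 'Unitech', '0']
normalized = [to_iso(result[0])] + result[1:]
print(normalized)

# Route 2: regex over the flattened string shown in the question.
raw = ('is:\n 31-10-2009 01:11\n\n\nService Area : \n Mumbai\n\n\n'
       'Service Provider :\n Idea\n\n\n\n\n\nYour Preference is :0')
print(parse_flat_text(raw))
```

The regex route depends on the exact label wording, so it is likely to break if the page text changes. Walking the table rows as in the answer is probably the sturdier choice, with to_iso applied to the first element afterwards.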

This question, "python - Extracting information from a string", comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/27163597/
