python - How to pick out the millions from a string of numbers and commas?

Reposted by: 行者123 Updated: 2023-12-05 01:03:40

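I'm using BeautifulSoup to do some web scraping on Yahoo Finance for fun. The goal is to take the HTML file, find the financial data, and put it into an array. I've managed to get output in this format:

Total Revenue42,965,39136,483,93920,139,65822,588,85825,067,279

How do I split these numbers into millions? For example, we know that 42,965,39136,483,939 is really 42,965,391 and 36,483,939, but how do we code that? I tried using a regular expression, without success.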

import re
from bs4 import BeautifulSoup

# Read the saved Yahoo Finance page from disk
with open('Nucor Yahoo HTML.html', 'r') as html:
    content = html.read()
soup = BeautifulSoup(content, 'lxml')
# Each financial row on the page carries the class 'rw-expnded'
tables = soup.find_all(class_='rw-expnded')
for table in tables:
    pattern = re.compile(r'\d\d\d?,[0-9]{3},[0-9]{3}')
    matches = pattern.finditer(table.text)
    for match in matches:
        print(match)
    print(table.text)

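The HTML file is here: https://finance.yahoo.com/quote/NUE/financials?p=NUE

Best answer

I'd suggest changing the code that extracts the data and using this instead (I think it is much safer than trying to cut the string in the right places...):

Get the data: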

import requests
from bs4 import BeautifulSoup
resp = requests.get("https://finance.yahoo.com/quote/NUE/financials?p=NUE",
                    headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36'})

Parse the data and select the "Total Revenue" row:

soup = BeautifulSoup(resp.text, "html.parser")
# find_all returns every row element; keep the ones labelled "Total Revenue"
total_revenue = [row for row in soup.find_all("div", {"data-test": "fin-row"}) if "Total Revenue" in row.text]

Now you can select the columns and work with them:

columns = total_revenue[0].find_all("div", {"data-test": "fin-col"})
for col in columns:
    print(col.text)

Output: the text of each column is printed on its own line.
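If all you have is the concatenated string from the question, it can still be split mechanically, because every group after a comma is exactly three digits long. The following is a minimal sketch under that assumption (non-negative values of at least 1,000, with comma separators); `split_grouped_numbers` and `to_int` are hypothetical helper names, not part of the original answer.

```python
import re

# Assumption: every value is non-negative and uses comma thousands separators,
# so each group that follows a comma is exactly three digits long.
NUMBER = re.compile(r"\d{1,3}(?:,\d{3})+")


def split_grouped_numbers(text):
    """Return the comma-grouped numbers found in text, in order."""
    return NUMBER.findall(text)


def to_int(value):
    """Convert a string such as '42,965,391' to the integer 42965391."""
    return int(value.replace(",", ""))


row = "Total Revenue42,965,39136,483,93920,139,65822,588,85825,067,279"
parts = split_grouped_numbers(row)
print(parts)
# ['42,965,391', '36,483,939', '20,139,658', '22,588,858', '25,067,279']
print([to_int(p) for p in parts])
# [42965391, 36483939, 20139658, 22588858, 25067279]
```

A value below 1,000 has no comma, so it cannot be told apart from the leading digits of its neighbour. Reading the columns individually, as in the answer above, avoids that ambiguity, and `to_int` works on each column's text just as well.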

Regarding "python - How to pick out the millions from a string of numbers and commas?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73929885/
