python - 输出文件留下几个月没有天气数据的数据-6ren

python - 输出文件留下几个月没有天气数据的数据

转载作者：太空宇宙更新时间：2023-11-03 16:35:19

我正在尝试从天气网站 wunderground.com 抓取数据。我希望 1941-2016 年每个月(一月至十二月)都从费城发出。

起初我有这段代码，但这只是在 2016 年 1 月抓取并制作了一个文件。

#!/usr/bin/python
#weather.scraper

from bs4 import BeautifulSoup
import urllib
import json

def main():
    # weatherData = weather_philadelphia_data #Json beginns here
    # with open(jsonfile, 'w') as outputFile:
    #     json.dump(weatherData, outputFile)
    # #scrapping beginns here
    r = urllib.urlopen("https://www.wunderground.com/history/airport/KPHL/2016/1/1/MonthlyHistory.html?&reqdb.zip=&reqdb.magic=&reqdb.wmo=&MR=1").read()
    soup = BeautifulSoup(r, "html.parser")
    tables = soup.find_all("table", class_="responsive airport-history-summary-table")

    weatherdata = []
    for table in tables: #reason for it to do it 12x

        for tr in table.find_all("tr"):
            firstTd = tr.find("td")
            if firstTd and firstTd.has_attr("class") and "indent" in firstTd['class']:
                values = {}
                tds = tr.find_all("td")
                maxVal = tds[1].find("span", class_="wx-value")
                avgVal = tds[2].find("span", class_="wx-value")
                minVal = tds[3].find("span", class_="wx-value")
                if maxVal:
                    values['max'] = maxVal.text
                if avgVal:
                    values['avg'] = avgVal.text
                if minVal:
                    values['min'] = minVal.text
                if len(tds) > 4:
                    sumVal = tds[4].find("span", class_="wx-value")
                    if sumVal:
                        values['sum'] = sumVal.text
                scrapedData = {}
                scrapedData[firstTd.text] = values
                weatherdata.append(scrapedData)
        break
    with open ("january_2016.json", 'w' ) as outFile:
        json.dump(weatherdata, outFile, indent=2)


print "done"
if __name__ == "__main__":
    main()

我尝试制作一个循环遍历所有年份和月份的 for 循环。它创建了文件，但里面没有数据，只显示年份。这是新代码:

#!/usr/bin/python
#weather.scraper
from bs4 import BeautifulSoup
import urllib
import json

allData = []
# this loops through all the Weather years
for y in range(1941, 2017):
    yearData = {}
    yearData['year'] = y
    months = []
    for m in range(1, 13):
        def main():
        # weatherData = weather_philadelphia_data #Json beginns here
        # with open(jsonfile, 'w') as outputFile:
        #     json.dump(weatherData, outputFile)
        # scrapping beginns here
            url = "https://www.wunderground.com/history/airport/KPHL/%d/%d/1/MonthlyHistory.html" % (y, m)
            r = urllib.urlopen(url).read()
            soup = BeautifulSoup(r, "html.parser")
            tables = soup.find_all("table", class_="responsive airport-history-summary-table")

            weatherPerMonth = {}
            weatherdata = []
            for table in tables: #reason for it to do it 12x

                for tr in table.find_all("tr"):
                     firstTd = tr.find("td")
                     if firstTd and firstTd.has_attr("class") and "indent" in firstTd['class']:
                         values = {}
                         tds = tr.find_all("td")
                         maxVal = tds[1].find("span", class_="wx-value")
                         avgVal = tds[2].find("span", class_="wx-value")
                         minVal = tds[3].find("span", class_="wx-value")
                         if maxVal:
                             values['max'] = maxVal.text
                         if avgVal:
                             values['avg'] = avgVal.text
                         if minVal:
                             values['min'] = minVal.text
                         if len(tds) > 4:
                             sumVal = tds[4].find("span", class_="wx-value")
                             if sumVal:
                                 values['sum'] = sumVal.text
                         scrapedData = {}
                         scrapedData[firstTd.text] = values
                         weatherdata.append(scrapedData)
                         break
            monthData = {}
            monthData['month'] = m
            monthData['weather'] = weatherPerMonth
            months.append(monthData)
        yearData['months'] = months
        allData.append(yearData)

        with open ("allData_philly.json", 'w' ) as outFile:
            json.dump(allData, outFile, indent=2)


print "done"
if __name__ == "__main__":
    main()

这是它生成的输出文件的一部分。

[  
 {
  "months": [], 
  "year": 1941
 }, 
]

直到2016年都是这样。

问题如下。我想要一个文件，其中包含 1941-2016 年 12 个月(一月至十二月)的天气数据，它应该如下所示:

[
  {
    "months": [{
              'month': 12
              'weather' : {
                  "Max Temperature": {
                    "max": "18", 
                    "avg": "6", 
                    "min": "-2"
                  }
                }, 
                {
                  "Mean Temperature": {
                    "max": "12", 
                    "avg": "1", 
                    "min": "-6"
                  }
                }, 
                {
                  "Min Temperature": {
                    "max": "6", 
                    "avg": "-3", 
                    "min": "-11"
                  }

      }], 
    "year": 1941
  }, 
]

但我不明白为什么我的代码不起作用，我希望有人能提供帮助!

最佳答案

您的代码看起来不错，只是有一些小问题阻止您获得正确的输出。

def main(): 位于循环内部，因此当您调用 main() 时，它不会循环所有年份。在您的第一个示例中看起来不错。
将 weatherPerMonth 声明为空列表，然后将其分配给 monthData['weather']。您的实际数据位于 weatherdata 中，但它永远不会写入任何地方。
下面的代码只是对您的代码进行了较小的修改，进行了一些重新排列和缩进更改，但它应该会为您提供所需的输出。

<小时/>

#weather.scraper
from bs4 import BeautifulSoup
import urllib.request
import json

allData = []
# this loops through all the Weather years
for y in range(2012, 2014):
    yearData = {}
    yearData['year'] = y
    months = []
    for m in range(1, 13):
        # weatherData = weather_philadelphia_data #Json beginns here
        # with open(jsonfile, 'w') as outputFile:
        #     json.dump(weatherData, outputFile)
        # scrapping beginns here
        url = "https://www.wunderground.com/history/airport/KPHL/%d/%d/1/MonthlyHistory.html" % (y, m)
        r = urllib.request.urlopen(url).read()
        soup = BeautifulSoup(r, "html.parser")
        tables = soup.find_all("table", class_="responsive airport-history-summary-table")

        weatherPerMonth = {}
        weatherdata = []

        monthData = {}

        for table in tables: #reason for it to do it 12x

            for tr in table.find_all("tr"):
                 firstTd = tr.find("td")
                 if firstTd and firstTd.has_attr("class") and "indent" in firstTd['class']:
                     values = {}
                     tds = tr.find_all("td")

                     maxVal = tds[1].find("span", class_="wx-value")
                     avgVal = tds[2].find("span", class_="wx-value")
                     minVal = tds[3].find("span", class_="wx-value")
                     if maxVal:
                         values['max'] = maxVal.text
                     if avgVal:
                         values['avg'] = avgVal.text
                     if minVal:
                         values['min'] = minVal.text
                     if len(tds) > 4:
                         sumVal = tds[4].find("span", class_="wx-value")
                         if sumVal:
                             values['sum'] = sumVal.text
                     scrapedData = {}
                     scrapedData[firstTd.text] = values

                     weatherdata.append(scrapedData)
                     monthData['month'] = m
                     monthData['weather'] = values
                     break


        months.append(monthData)
    yearData['months'] = months
    allData.append(yearData)

with open ("allData_philly.json", 'w' ) as outFile:
    json.dump(allData, outFile, indent=2)

关于python - 输出文件留下几个月没有天气数据的数据，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/37280309/

文章推荐： ruby - 如何从网页中提取 HTML 主题标题？

文章推荐： Paypal 自适应支付重定向 URL GET 变量

文章推荐： c# - 光标移动到 DataGridView 中的下一个单元格

文章推荐：使用“立即购买”按钮的 PayPal SandBox 错误

php - 以编程方式创建冷锋矢量(天气)
有谁知道在给定顶点列表的情况下以编程方式创建冷/暖锋矢量的方法(注意顶点不包含小三 Angular 形或半圆；[-105W，40.45N]等...)？我正在尝试在网络 map 应用程序上放置“当前前
ios - UIBarButtonItem(天气)
在 iOS 天气应用程序的右下角，有一个表格/列表图标。我想知道这是一个栏按钮项目，还是一个自定义图标。最佳答案这是一个带有自定义图标的自定义按钮。要查找来自 Apple 的所有模板图标，请转到
Python 天气 API
很难说出这里问的是什么。这个问题是模棱两可的、模糊的、不完整的、过于宽泛的或修辞的，无法以目前的形式得到合理的回答。为了帮助澄清这个问题以便可以重新打开它，visit the help center
Win10系统桌面如何添加日历，天气，时钟等小工具？
很多用户为了提高自己的工作效率，都会在电脑桌面上添加日历，天气，时钟等小工具，防止忘记一些重要的事项。那如何在win10系统桌面添加日历，天气，时钟等小工具呢?下面就跟着小编一起来看看吧。 Win
weather - NOAA 天气 API
我试图使用 NOAA 进行每小时预报，但它不允许我输入东经。请求必须是西经。他们甚至为印度等亚洲国家提供东经服务。这是我使用的链接 http://graphical.weather.gov/xml
Python 天气 API 问题
我正在使用 pywapi 来获取不同地点的天气状况。我使用的命令是: alaska = pywapi.get_weather_from_noaa('PABI') 其中“PABI”是阿拉斯加的电台 ID
swift - 天气 Swift 首选项
我目前正在使用 xCode 开发一个应用程序，我是 iOS 新手。我的天气格式当前正在向我的数据库提交“摄氏度”和“华氏度”，而不是发送 C 或 F。我已经尝试了几件事，但看起来它也将我的显示文本更改
javascript - 天气 App 每小时滚动效果
我在一些地方看到人们正在寻找天气应用程序效果。这些问题中有很多是针对 Android 应用程序或 JAVA 的，但我正在寻找在 HTML/CSS/Javascript 网站上的使用。这可能吗？当您查
php - 解析来自雅虎的某些信息!天气 RSS 提要
我正在使用雅虎! Weather RSS Feed 以获取我所在城镇的预报。我可以解析 XML 并获取天气描述，但随后我得到如下结果: Current Conditions: Light Rain,
javascript - 天气 API 和 JSON
这个问题已经有答案了: How can I access and process nested objects, arrays, or JSON? (31 个回答) 已关闭 6 年前。所以，我正在尝
java - Google 天气 API 条件
我正在使用 SAX 从 google 天气 API 中提取信息，但我遇到了“条件”问题。具体来说，我正在使用这段代码: public void startElement (String uri, S
jquery - 天气 API(RESTful)给出错误
我正在使用一个简单的 Web API 来获取伦敦城市的天气详细信息。但是，我没有得到所需的响应，而是收到 401 错误。我做错了什么吗？ HTML $(document).ready(func
javascript - 天气 api key 不起作用
我正在使用 Open Weather API，但是当我发送城市数据请求时，出现网络错误。这是获取资源的操作。 actions/index.js import axios from "axios";
javascript - 天气 API (openweathermap) 显示温度错误
这段代码中的所有内容我都在尝试为用户提供本地的位置和温度但是不知何故温度以摄氏度显示的方式更少而且下面也没有更新是我尝试过的就像它是 4-5 小时回溯数据到 10 度摄氏度不太像如果温度是 22(摄氏
swift - CLLocation + 天气(Alamofire)问题
我正在尝试使用 CLLocation 来捕获经度和纬度，然后使用 Alamofire 中的经度和纬度来获取天气。每次，经度和纬度都不会停止更新，也不会打印天气数据(如果你想查看这里是数据的示例链接:h
python - Dialogflow 天气 webhook 连接错误
fullfillment page至于我个人的兴趣，我想使用天气 api 制作一个天气聊天机器人。为此，我使用对话流，但是当我按照 Dialogflow github 进行 webhook 连接时但出
ios - 多个问题(日期、天气、地点)
我正在开发一款应用程序，我相信它会对学生有所帮助。我想根据位置包括天气(仅温度)。例如，如果我要从一个城市前往另一个城市，我希望它能针对新城市自动更新。我该怎么做？对于我的第二个问题，我还想将日期(即
java - 如何制作像 Nexus one 天气/新闻应用这样的标签？
有谁知道如何制作像天气/新闻应用程序中的 Nexus One 选项卡这样的选项卡，我的意思是通过轻弹屏幕即可转到下一个选项卡的功能，谢谢视频:http://www.youtube.com/watch
iphone - iOS 在字符串中搜索字符串，或者如何使用 Yahoo 天气
我正在尝试制作一个天气应用程序。到目前为止，我设法从 rss 中获得了一些基本信息。我正在使用以下代码: if ([currentElement isEqualToString:@"descr
python - 天气 "events"根据 Pandas 的时差进行分组
我有一个由站点标识符代码(“usaf”)和日期组织的表面天气观测数据框(fzraHrObs)。 fzraHrObs 有几列天气数据。车站代码和日期(日期时间对象)如下所示: usaf dat

太空宇宙

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

python - 输出文件留下几个月没有天气数据的数据