python - Parsing a JSON element

Reposted. Author: 行者123. Updated: 2023-12-01 05:02:47

I am using Scrapy and a regex to parse some non-standard web page source. I then want to parse out the first element of the returned dictionary:

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import Selector
from scrapy.item import Item
from scrapy.spider import BaseSpider
from scrapy import log
from scrapy.cmdline import execute
from scrapy.utils.markup import remove_tags
import time
import re
import json
import requests


class ExampleSpider(CrawlSpider):
    name = "goal2"
    allowed_domains = ["whoscored.com"]
    start_urls = ["http://www.whoscored.com"]
    download_delay = 5

    rules = [Rule(SgmlLinkExtractor(allow=('\Teams'),deny=(),), follow=False, callback='parse_item')]

    def parse_item(self, response):

        sel = Selector(response)
        titles = sel.xpath("normalize-space(//title)")
        print '-' * 170
        myheader = titles.extract()[0]
        print '********** Page Title:', myheader.encode('utf-8'), '**********'
        print '-' * 170

        match1 = re.search(re.escape("DataStore.prime('stage-player-stat', defaultTeamPlayerStatsConfigParams.defaultParams , ") \
                           + '(\[.*\])' + re.escape(");"), response.body)


        if match1 is not None:
            playerdata1 = match1.group(1)

            teamid = json.loads(playerdata1[0])
            print teamid

The key of the first element of "playerdata1" is called "TeamId". I thought the approach above would work, but I get the following error:

    teamid = json.loads(playerdata1[0])
  File "C:\Python27\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python27\lib\json\decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
exceptions.ValueError: Expecting object: line 1 column 1 (char 0)

Can anyone see what is going wrong here?

Best answer

match1.group(1) returns a string. You then index that string:

teamid = json.loads(playerdata1[0])

Here, [0] gives you only the first character of that string. Remove the index expression so that the whole string is parsed:

teamid = json.loads(playerdata1)
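
To see the difference in isolation, here is a minimal sketch. It is written for Python 3 (the question's code is Python 2), and the sample JSON string with its values is made up for illustration; it only stands in for whatever match1.group(1) actually returns.

```python
import json

# Hypothetical stand-in for match1.group(1): a JSON array held in a plain string.
playerdata1 = '[{"TeamId": 32, "Name": "Player A"}, {"TeamId": 32, "Name": "Player B"}]'

# Indexing a string yields a one-character string, here "[".
first_char = playerdata1[0]

# A lone "[" is not valid JSON, so decoding it fails.
try:
    json.loads(first_char)
    error = None
except ValueError as exc:  # on Python 3, json.JSONDecodeError subclasses ValueError
    error = exc

# Decoding the whole string gives a list of dicts; index the result, not the string.
players = json.loads(playerdata1)
first_team_id = players[0]["TeamId"]
```

The index belongs after decoding: players[0] is the first player object, whereas playerdata1[0] is just the opening bracket.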

Now teamid is a list of player objects:

>>> len(teamid)
22
>>> teamid[0].keys()
[u'FirstName', u'LastName', u'KnownName', u'Field', u'GameStarted', u'AerialWon', u'TeamRegionCode', u'SecondYellow', u'ShotsBlocked', u'TotalShots', u'Assists', u'Red', u'Name', u'PositionText', u'Ranking', u'PositionLong', u'PlayerId', u'SubOff', u'Dispossesed', u'TeamId', u'TotalTackles', u'TotalLongBalls', u'Goals', u'SubOn', u'WasDribbled', u'AerialLost', u'Turnovers', u'ShotsOnTarget', u'WSName', u'Fouls', u'ManOfTheMatch', u'Height', u'TeamName', u'RegionCode', u'TotalPasses', u'TotalThroughBalls', u'Dribbles', u'DateOfBirth', u'OwnGoals', u'WasFouled', u'TotalClearances', u'Rating', u'PlayedPositionsRaw', u'Weight', u'AccurateLongBalls', u'OffsidesWon', u'AccuratePasses', u'Yellow', u'KeyPasses', u'TotalCrosses', u'AccurateCrosses', u'IsCurrentPlayer', u'Age', u'PositionShort', u'AccurateThroughBalls', u'Interceptions', u'Offsides']
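
Since the question asks for the "TeamId" of the first element, the whole path from page source to that value might look like the sketch below. It is Python 3 and makes several assumptions: extract_players and sample_body are hypothetical names, the sample page body and its values are invented, and the body is assumed to be a str (in a real spider that would probably mean decoded text rather than raw bytes).

```python
import json
import re

# Literal text that precedes the JSON array in the page source (taken from the question).
PREFIX = "DataStore.prime('stage-player-stat', defaultTeamPlayerStatsConfigParams.defaultParams , "


def extract_players(body):
    """Return the list of player dicts embedded in body, or [] if none is found."""
    pattern = re.escape(PREFIX) + r"(\[.*\])" + re.escape(");")
    match = re.search(pattern, body)
    if match is None:
        return []
    # Decode the entire captured string, not a single character of it.
    return json.loads(match.group(1))


# Invented sample page body, for illustration only.
sample_body = (
    "<script>"
    + PREFIX
    + '[{"TeamId": 32, "PlayerId": 1}, {"TeamId": 32, "PlayerId": 2}]);'
    + "</script>"
)

players = extract_players(sample_body)
team_id = players[0]["TeamId"] if players else None
```

Returning an empty list when the pattern is absent keeps the caller from indexing into None. Note that "." does not match newlines by default, so this pattern assumes the array sits on a single line.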

Regarding "python - Parsing a JSON element", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/25703740/
