
python - Contents of a Wikipedia infobox

Reposted. Author: 太空狗. Updated: 2023-10-29 22:07:08

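I need to get the contents of the infobox for any movie. I know the movie's name. One way is to fetch the full content of the Wikipedia page and parse it until I find {{Infobox, then extract the infobox contents.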
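The manual approach described above can be sketched roughly as follows. This is only an illustration, assuming Python 3 and that the page wikitext is already available as a string; the helper name `extract_infobox` and the sample text are hypothetical. It counts `{{` and `}}` pairs so that nested templates inside the infobox do not end the match early.

```python
def extract_infobox(wikitext):
    """Return the text of the first {{Infobox ...}} template, or None.

    Counts "{{" and "}}" pairs so that nested templates inside the infobox
    (for example {{Film date|...}}) do not end the match early.
    """
    start = wikitext.find("{{Infobox")
    if start == -1:
        return None
    depth = 0
    i = start
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                # Matching close of the infobox itself
                return wikitext[start:i]
        else:
            i += 1
    return None  # braces never balanced


# Hypothetical sample wikitext, shortened for illustration
sample = (
    "{{Use mdy dates}}\n"
    "{{Infobox film\n"
    "| name = Waking Life\n"
    "| released = {{Film date|2001|01|23}}\n"
    "}}\n"
    "Article text."
)
result = extract_infobox(sample)
print(result)
```

This sketch is case-sensitive and does not handle triple-brace parameters such as `{{{1}}}`, comments, or `<nowiki>` sections, which is one reason a real wikitext parser is usually the safer choice.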

Is there another way to do this using an API or a parser?

I am using Python and the pywikipediabot API.

I am also familiar with the wikitools API, so if anyone has a solution that uses wikitools instead of pywikipedia, please mention that as well.

Best answer

Another great MediaWiki parser is mwparserfromhell.

In [1]: import mwparserfromhell

In [2]: import pywikibot

In [3]: enwp = pywikibot.Site('en','wikipedia')

In [4]: page = pywikibot.Page(enwp, 'Waking Life')

In [5]: wikitext = page.get()

In [6]: wikicode = mwparserfromhell.parse(wikitext)

In [7]: templates = wikicode.filter_templates()

In [8]: templates?
Type: list
String Form:[u'{{Use mdy dates|date=September 2012}}', u"{{Infobox film\n| name = Waking Life\n| im <...> critic film|waking-life|Waking Life}}', u'{{Richard Linklater}}', u'{{DEFAULTSORT:Waking Life}}']
Length: 31
Docstring:
list() -> new empty list
list(iterable) -> new list initialized from iterable's items

In [10]: templates[:2]
Out[10]:
[u'{{Use mdy dates|date=September 2012}}',
u"{{Infobox film\n| name = Waking Life\n| image = Waking-Life-Poster.jpg\n| image_size = 220px\n| alt =\n| caption = Theatrical release poster\n| director = [[Richard Linklater]]\n| producer = [[Tommy Pallotta]]<br />[[Jonah Smith]]<br />Anne Walker-McBay<br />Palmer West\n| writer = Richard Linklater\n| starring = [[Wiley Wiggins]]\n| music = Glover Gill\n| cinematography = Richard Linklater<br />[[Tommy Pallotta]]\n| editing = Sandra Adair\n| studio = [[Thousand Words]]\n| distributor = [[Fox Searchlight Pictures]]\n| released = {{Film date|2001|01|23|[[Sundance Film Festival|Sundance]]|2001|10|19|United States}}\n| runtime = 101 minutes<!--Theatrical runtime: 100:40--><ref>{{cite web |title=''WAKING LIFE'' (15) |url=http://www.bbfc.co.uk/releases/waking-life-2002-3|work=[[British Board of Film Classification]]|date=September 19, 2001|accessdate=May 6, 2013}}</ref>\n| country = United States\n| language = English\n| budget =\n| gross = $3,176,880<ref>{{cite web|title=''Waking Life'' (2001)|work=[[Box Office Mojo]] |url=http://www.boxofficemojo.com/movies/?id=wakinglife.htm|accessdate=March 20, 2010}}</ref>\n}}"]

In [11]: infobox_film = templates[1]

In [12]: for param in infobox_film.params:
   ....:     print param.name, param.value
   ....:

name Waking Life

image Waking-Life-Poster.jpg

image_size 220px

alt

caption Theatrical release poster

director [[Richard Linklater]]

producer [[Tommy Pallotta]]<br />[[Jonah Smith]]<br />Anne Walker-McBay<br />Palmer West

writer Richard Linklater

starring [[Wiley Wiggins]]

music Glover Gill

cinematography Richard Linklater<br />[[Tommy Pallotta]]

editing Sandra Adair

studio [[Thousand Words]]

distributor [[Fox Searchlight Pictures]]

released {{Film date|2001|01|23|[[Sundance Film Festival|Sundance]]|2001|10|19|United States}}

runtime 101 minutes<!--Theatrical runtime: 100:40--><ref>{{cite web |title=''WAKING LIFE'' (15) |url=http://www.bbfc.co.uk/releases/waking-life-2002-3|work=[[British Board of Film Classification]]|date=September 19, 2001|accessdate=May 6, 2013}}</ref>

country United States

language English

budget

gross $3,176,880<ref>{{cite web|title=''Waking Life'' (2001)|work=[[Box Office Mojo]] |url=http://www.boxofficemojo.com/movies/?id=wakinglife.htm|accessdate=March 20, 2010}}</ref>

Don't forget that the parameters are mwparserfromhell objects too!

Regarding "python - Contents of a Wikipedia infobox", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/8088226/
