
Python 3.4 : LXML : Parsing Tables

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:39:13

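I am trying to parse an entire table from Yahoo Finance. As far as I understand, lxml does not register the "tbody" and "thead" tags as such, and their rows show up as additional TR elements, so I switched my XPath from:

/html/body/div[4]/div[4]/table[2]/tbody/tr[2]/td/table[2]/tbody/tr/td/table/tbody

to what is seen in the code below: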

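from lxml import html

url = 'http://finance.yahoo.com/q/is?s=MMM+Income+Statement&annual'

# Fetch and parse the page straight from the URL
tree = html.parse(url)

# Absolute XPath adapted from the one Chrome generated (tbody steps dropped)
tick_content = [td.text_content() for td in tree.xpath('/html/body/div[4]/div[4]/table[2]/tr[3]/td/table[2]/tr[1]/td/table/td[1]')]

print(tick_content)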

All I get back is a blank screen. Is there a special way to parse tables?

Best answer

Instead of using the huge, long XPath generated by Chrome, you can search directly for the table with the yfnc_tabledata1 class; there is only one:

>>> tree.xpath("//table[@class='yfnc_tabledata1']")
[<Element table at 0x10445e788>]
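
To illustrate why the class-based lookup is more robust than a browser-generated path, here is a minimal sketch. The SAMPLE markup is a hypothetical, heavily simplified stand-in for the Yahoo Finance page, not its real HTML. It assumes the served HTML contains no <tbody> tags: browsers add them to the DOM they display, while lxml keeps the tree as written in the source.

```python
from lxml import html

# Hypothetical, simplified stand-in for the page markup (not the real page).
SAMPLE = """
<html><body>
  <table class="layout"><tr><td>
    <table class="yfnc_tabledata1">
      <tr><td>Total Revenue</td><td>31,821,000</td></tr>
    </table>
  </td></tr></table>
</body></html>
"""

doc = html.fromstring(SAMPLE)

# Browser-style step through tbody: the source has no <tbody>, so nothing matches.
with_tbody = doc.xpath("//table[@class='yfnc_tabledata1']/tbody/tr")

# Class-based lookup does not depend on the exact nesting or position.
by_class = doc.xpath("//table[@class='yfnc_tabledata1']")

# Rows are direct children of the table in this tree.
rows = by_class[0].xpath("./tr")

print(len(with_tbody), len(by_class), len(rows))
```

An empty result from a long absolute path usually means that one step, such as a tbody or a positional index, does not exist in the tree lxml built.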

From that table, access your <td>:

>>> tree.xpath("//table[@class='yfnc_tabledata1']//td[1]")[0].text_content()
'Period EndingDec 31, 2014Dec 31, 2013Dec 31, 2012\n \n Total Revenue\n \n \n \n 31,821,000\xa0\xa0\n \n \n \n 30,871,000\xa0\xa0\n \n \n \n 29,904,000\xa0\xa0\n \n Cost of Revenue16,447,000\xa0\xa016,106,000\xa0\xa015,685,000\xa0\xa0\n \n Gross Profit\n \n \n \n 15,374,000\xa0\xa0\n \n \n \n 14,765,000\xa0\xa0\n \n \n \n 14,219,000\xa0\xa0\n \n \n \n Operating Expenses\n \n Research Development1,770,000\xa0\xa01,715,000\xa0\xa01,634,000\xa0\xa0\n \n Selling General and Administrative6,469,000\xa0\xa06,384,000\xa0\xa06,102,000\xa0\xa0\n \n Non Recurring\n -\n \xa0\n -\n \xa0\n -\n \xa0\n \n Others\n -\n \xa0\n -\n \xa0\n -\n \xa0\n \n \n \n Total Operating Expenses\n -\n \xa0\n -\n \xa0\n -\n \xa0\n \n Operating Income or Loss\n \n \n \n 7,135,000\xa0\xa0\n \n \n \n 6,666,000\xa0\xa0\n \n \n \n 6,483,000\xa0\xa0\n \n \n \n Income from Continuing Operations\n \n Total Other Income/Expenses Net33,000\xa0\xa041,000\xa0\xa039,000\xa0\xa0\n \n Earnings Before Interest And Taxes7,168,000\xa0\xa06,707,000\xa0\xa06,522,000\xa0\xa0\n \n Interest Expense142,000\xa0\xa0145,000\xa0\xa0171,000\xa0\xa0\n \n Income Before Tax7,026,000\xa0\xa06,562,000\xa0\xa06,351,000\xa0\xa0\n \n Income Tax Expense2,028,000\xa0\xa01,841,000\xa0\xa01,840,000\xa0\xa0\n \n Minority Interest(42,000)(62,000)(67,000)\n \n \n \n Net Income From Continuing Ops4,956,000\xa0\xa04,659,000\xa0\xa04,444,000\xa0\xa0\n \n Non-recurring Events\n \n Discontinued Operations\n -\n \xa0\n -\n \xa0\n -\n \xa0\n \n Extraordinary Items\n -\n \xa0\n -\n \xa0\n -\n \xa0\n \n Effect Of Accounting Changes\n -\n \xa0\n -\n \xa0\n -\n \xa0\n \n Other Items\n -\n \xa0\n -\n \xa0\n -\n \xa0\n \n Net Income\n \n \n \n 4,956,000\xa0\xa0\n \n \n \n 4,659,000\xa0\xa0\n \n \n \n 4,444,000\xa0\xa0\n \n Preferred Stock And Other Adjustments\n -\n \xa0\n -\n \xa0\n -\n \xa0\n \n Net Income Applicable To Common Shares\n \n \n \n 4,956,000\xa0\xa0\n \n \n \n 4,659,000\xa0\xa0\n \n \n \n 4,444,000\xa0\xa0\n \n '
>>> print(tree.xpath("//table[@class='yfnc_tabledata1']//td[1]")[0].text_content())
Period EndingDec 31, 2014Dec 31, 2013Dec 31, 2012

Total Revenue



31,821,000  



30,871,000  



29,904,000  

Cost of Revenue16,447,000  16,106,000  15,685,000  

Gross Profit



15,374,000  



14,765,000  



14,219,000  



Operating Expenses

Research Development1,770,000  1,715,000  1,634,000  

Selling General and Administrative6,469,000  6,384,000  6,102,000  

Non Recurring
-
 
-
 
-
 

Others
-
 
-
 
-
 



Total Operating Expenses
-
 
-
 
-
 

Operating Income or Loss



7,135,000  



6,666,000  



6,483,000  



Income from Continuing Operations

Total Other Income/Expenses Net33,000  41,000  39,000  

Earnings Before Interest And Taxes7,168,000  6,707,000  6,522,000  

Interest Expense142,000  145,000  171,000  

Income Before Tax7,026,000  6,562,000  6,351,000  

Income Tax Expense2,028,000  1,841,000  1,840,000  

Minority Interest(42,000)(62,000)(67,000)



Net Income From Continuing Ops4,956,000  4,659,000  4,444,000  

Non-recurring Events

Discontinued Operations
-
 
-
 
-
 

Extraordinary Items
-
 
-
 
-
 

Effect Of Accounting Changes
-
 
-
 
-
 

Other Items
-
 
-
 
-
 

Net Income



4,956,000  



4,659,000  



4,444,000  

Preferred Stock And Other Adjustments
-
 
-
 
-
 

Net Income Applicable To Common Shares



4,956,000  



4,659,000  



4,444,000  
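
The text_content() output above runs every cell together, because the first <td> wraps the whole nested table. One possible way to get structured data is to walk the rows and collect the cells individually. This is only a sketch: the table_rows helper and the SAMPLE markup are hypothetical, and the real page layout may differ.

```python
from lxml import html

# Hypothetical, simplified markup: an outer table whose single cell wraps
# the actual data table, loosely mirroring the output shown above.
SAMPLE = """
<html><body>
<table class="yfnc_tabledata1"><tr><td>
  <table>
    <tr><td>Period Ending</td><td>Dec 31, 2014</td><td>Dec 31, 2013</td></tr>
    <tr><td>Total Revenue</td><td>31,821,000&nbsp;&nbsp;</td><td>30,871,000&nbsp;&nbsp;</td></tr>
    <tr><td colspan="3"></td></tr>
    <tr><td>Cost of Revenue</td><td>16,447,000&nbsp;&nbsp;</td><td>16,106,000&nbsp;&nbsp;</td></tr>
  </table>
</td></tr></table>
</body></html>
"""


def table_rows(doc, css_class):
    """Return the non-empty rows of the table as lists of cleaned cell text."""
    rows = []
    # Keep only leaf rows: skip any <tr> that merely wraps a nested table.
    leaf_rows = doc.xpath("//table[@class=$cls]//tr[not(.//table)]", cls=css_class)
    for tr in leaf_rows:
        # str.split() with no argument also splits on non-breaking spaces.
        cells = [" ".join(td.text_content().split()) for td in tr.xpath("./td")]
        cells = [c for c in cells if c]  # drop empty spacer cells
        if cells:
            rows.append(cells)
    return rows


doc = html.fromstring(SAMPLE)
rows = table_rows(doc, "yfnc_tabledata1")
for row in rows:
    print(row)
```

Each row comes back as a list such as the label followed by one value per period, which is easier to work with than one long string.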

Regarding Python 3.4 : LXML : Parsing Tables, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30801462/
