javascript - How can I get JavaScript code from a textarea using lxml?

Reposted. Author: 行者123. Updated: 2023-11-28 03:27:01

I am trying to extract JavaScript code from a textarea. This is my code:

def getCode(self, request):
    # print "Extracting URL: " + request
    opener = self.login(self.username, self.password)
    html = etree.HTML(opener.open(request).read())

    textarea = html.xpath('//*[@id="codeText"]/text()')
    for code in textarea:
        return code

This is the HTML I am trying to extract it from:

<textarea onclick="javascript: this.select();" id="codeText" style="height: 300px;width:500px;">            <!-- Clickon Affiliate code start here -->
<object type="application/x-shockwave-flash" data="http://banners.clickon.co.il/LOVELY2_banners/swf/JWFLZxzNxjclWGP.swf?url=http://track.clickon.co.il/click/Q8uTE8BXZz1pskj/JWFLZxzNxjclWGP/TsQ8uTE8BXZz1pskjtS" width="728" height="90">
<param name="movie" value="http://banners.clickon.co.il/LOVELY2_banners/swf/JWFLZxzNxjclWGP.swf?url=http://track.clickon.co.il/click/Q8uTE8BXZz1pskj/JWFLZxzNxjclWGP/TsQ8uTE8BXZz1pskjtS" />
<param name="scale" value="exactfit" />
<param name="wmode" value="transparent" />
</object>
<img alt="" style="visibility: hidden;" src="http://track.clickon.co.il/imp/Q8uTE8BXZz1pskj/JWFLZxzNxjclWGP/TsQ8uTE8BXZz1pskjtS" />

</textarea>

My getCode function works fine when the textarea contains only links or plain text, but when it contains JavaScript code I cannot extract it. Can you help me?

Thanks,

Yaniv.

Best answer

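In your code, the for loop returns on its first iteration, so only the first text node is returned. The /text() step also selects only text nodes, so the tags that lxml parsed inside the textarea are never included.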
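To see why only a fragment comes back, it helps to look at how an ElementTree-style tree stores mixed content. The sketch below is an illustration and not part of the original answer. It uses the standard library's xml.etree.ElementTree, whose text/tail model lxml shares, on a small made-up, well-formed snippet. The name inner_markup is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up snippet: text and tags mixed inside one element.
snippet = '<textarea id="codeText">intro <b>bold</b> middle <i>it</i> end</textarea>'
textarea = ET.fromstring(snippet)

# Only the text before the first child is stored on the element itself.
leading_text = textarea.text

# Text that follows a child element is stored on that child's .tail.
tails = [child.tail for child in textarea]


def inner_markup(element):
    """Rebuild the inner markup: leading text plus every serialized child."""
    # ET.tostring() on a child also emits the child's tail text.
    children = (ET.tostring(child, encoding='unicode') for child in element)
    return (element.text or '') + ''.join(children)


print(repr(leading_text))
print(tails)
print(inner_markup(textarea))
```

Returning the first text node therefore yields only the leading text; the rest of the content sits on the child elements and their tails.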

If you want all of the tags and text, try the following.

import lxml.etree as etree

htmlchunk = '''
<textarea onclick="javascript: this.select();" id="codeText" style="height: 300px;width:500px;"> <!-- Clickon Affiliate code start here -->
<object type="application/x-shockwave-flash" data="http://banners.clickon.co.il/LOVELY2_banners/swf/JWFLZxzNxjclWGP.swf?url=http://track.clickon.co.il/click/Q8uTE8BXZz1pskj/JWFLZxzNxjclWGP/TsQ8uTE8BXZz1pskjtS" width="728" height="90">
<param name="movie" value="http://banners.clickon.co.il/LOVELY2_banners/swf/JWFLZxzNxjclWGP.swf?url=http://track.clickon.co.il/click/Q8uTE8BXZz1pskj/JWFLZxzNxjclWGP/TsQ8uTE8BXZz1pskjtS" />
<param name="scale" value="exactfit" />
<param name="wmode" value="transparent" />
</object>
<img alt="" style="visibility: hidden;" src="http://track.clickon.co.il/imp/Q8uTE8BXZz1pskj/JWFLZxzNxjclWGP/TsQ8uTE8BXZz1pskjtS" />

</textarea>
'''

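html = etree.HTML(htmlchunk)
textarea, = html.xpath('//*[@id="codeText"]')
# textarea.text holds the text before the first child; tostring() serializes
# each child together with its tail. encoding='unicode' returns str instead of
# bytes, so the join also works on Python 3. textarea.tail is left out because
# it is the text after </textarea> and may be None.
print((textarea.text or '') + ''.join(etree.tostring(code, encoding='unicode') for code in textarea))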

Output:

            <!-- Clickon Affiliate code start here -->
<object type="application/x-shockwave-flash" data="http://banners.clickon.co.il/LOVELY2_banners/swf/JWFLZxzNxjclWGP.swf?url=http://track.clickon.co.il/click/Q8uTE8BXZz1pskj/JWFLZxzNxjclWGP/TsQ8uTE8BXZz1pskjtS" width="728" height="90">
<param name="movie" value="http://banners.clickon.co.il/LOVELY2_banners/swf/JWFLZxzNxjclWGP.swf?url=http://track.clickon.co.il/click/Q8uTE8BXZz1pskj/JWFLZxzNxjclWGP/TsQ8uTE8BXZz1pskjtS"/>
<param name="scale" value="exactfit"/>
<param name="wmode" value="transparent"/>
</object>
<img alt="" style="visibility: hidden;" src="http://track.clickon.co.il/imp/Q8uTE8BXZz1pskj/JWFLZxzNxjclWGP/TsQ8uTE8BXZz1pskjtS"/>

Regarding "javascript - How can I get JavaScript code from a textarea using lxml?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/20584895/
