
Python - How to print out a line from a text

Reposted. Author: 行者123. Updated: 2023-12-01 01:21:03

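I have been practicing with bs4 and managed to print out some text. I can now print the script that begins with var ajaxsearch, but that script contains a lot more than that one variable.

I wrote some code that looks through every script tag of type text/javascript and prints the one whose text starts with var ajaxsearch: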

try:
    # `bs4` is presumably the parsed BeautifulSoup object here, not the module
    product_li_tags = bs4.find_all('script', {'type': 'text/javascript'})
except Exception:
    product_li_tags = []

special_code = ''
for s in product_li_tags:
    if s.text.strip().startswith('var ajaxsearch'):
        special_code = s.text
        break

print(special_code)

The output I get is:

var ajaxsearch = false;
var combinationsFromController ={
"224114": {
"attributes_values": {
"4": "5.5"
},
"attributes": [
22
],

"unit_impact": 0,
"minimal_quantity": "1",
"date_formatted": "",
"available_date": "",
"id_image": -1,
"list": "'22'"
},
"224140": {
"attributes_values": {
"4": "6"
},
"attributes": [
23
],
"unit_impact": 0,
"minimal_quantity": "1",
"date_formatted": "",
"available_date": "",
"id_image": -1,
"list": "'23'"
},
"224160": {
"attributes_values": {
"4": "6.5"
},
"attributes": [
24
],
"unit_impact": 0,
"minimal_quantity": "1",
"date_formatted": "",
"available_date": "",
"id_image": -1,
"list": "'24'"
},
"224139": {
"attributes_values": {
"4": "7"
},
"attributes": [
25
],
"unit_impact": 0,
"minimal_quantity": "1",
"date_formatted": "",
"available_date": "",
"id_image": -1,
"list": "'25'"
},
"224138": {
"attributes_values": {
"4": "7.5"
},
"attributes": [
26
],
"unit_impact": 0,
"minimal_quantity": "1",
"date_formatted": "",
"available_date": "",
"id_image": -1,
"list": "'26'"
},
"224113": {
"attributes_values": {
"4": "8"
},
"attributes": [
27
],
"unit_impact": 0,
"minimal_quantity": "1",
"date_formatted": "",
"available_date": "",
"id_image": -1,
"list": "'27'"
},
"224129": {
"attributes_values": {
"4": "8.5"
},
"attributes": [
28
],
"unit_impact": 0,
"minimal_quantity": "1",
"date_formatted": "",
"available_date": "",
"id_image": -1,
"list": "'28'"
},
"224161": {
"attributes_values": {
"4": "9"
},
"attributes": [
29
],
"unit_impact": 0,
"minimal_quantity": "1",
"date_formatted": "",
"available_date": "",
"id_image": -1,
"list": "'29'"
}
};
var contentOnly = false;
var Blank = 1;
var Format = 2;

This means that when I print s.text, I get the output above. Small edit: if I try if s.text.strip().startswith('var combinationsFromController'): it does not find the value, whereas if I change it to if 'var combinationsFromController' in s.text.strip(): it prints the same output as above.
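A small illustration of why the two checks behave differently: str.startswith only looks at the very beginning of the string, while in searches the whole string. The script_text value below is a shortened, made-up stand-in for s.text, not the real page content.

```python
# Shortened, hypothetical stand-in for s.text; the real script is much longer.
script_text = """
var ajaxsearch = false;
var combinationsFromController ={"224114": {"id_image": -1}};
var contentOnly = false;
"""

target = 'var combinationsFromController'
stripped = script_text.strip()

# startswith only checks the beginning, and the script begins with 'var ajaxsearch'.
starts_with_target = stripped.startswith(target)

# 'in' searches the entire string, so it finds the declaration on the second line.
contains_target = target in stripped

print(starts_with_target, contains_target)  # prints: False True
```

So the in check only says that the variable is somewhere in the script; it does not isolate it, which is why the whole script is still printed.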

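My question is this: I only want to print out var combinationsFromController and skip the rest, so that later I can use json.loads to convert the value to JSON. Before that, though, how can I print things so that I get only the value of var combinationsFromController?

Edit: possibly solved!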

import json

for s in product_li_tags:
    if 'var combinationsFromController' in s.text.strip():
        for line in s.text.splitlines():
            if line.startswith('var combinationsFromController'):
                get_full_text = line.strip()
                # Assumes the whole assignment sits on one line: "var name = {...};"
                get_config = get_full_text.split(" = ")
                # Drop the trailing ';' before parsing
                cut_text = get_config[1][:-1]
                get_json_values = json.loads(cut_text)

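Best answer

If I understand your question correctly, you have a 121-line string representing 5 JavaScript variables, and you want a substring that contains only the second variable.

You can use Python string operations as follows: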

lines = special_code.split('\n')
start = lines.index('var combinationsFromController ={')
# Search for the next declaration only after the start line
end = lines.index('var contentOnly = false;', start + 1)
# The slice stops just before 'var contentOnly'
print('\n'.join(lines[start:end]))

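This uses the index method on the list of lines to find where the JavaScript variable you need occurs. If the order of the variables is arbitrary, i.e. you don't know the name of the variable that follows the target, you can still use similar string operations to get the substring you want.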

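lines = special_code.split('\n')
start = lines.index('var combinationsFromController ={')
# Default to the last line in case no further declaration follows
end = len(lines) - 1
for i, line in enumerate(lines[start + 1:]):
    # The next top-level declaration marks the end of the target variable
    if line.startswith('var '):
        end = start + i
        break
print('\n'.join(lines[start:end + 1]))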
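Since the stated goal is to hand the value to json.loads afterwards, another option is to let the json module find the end of the object itself. The sketch below is not part of the original answer. It assumes the assigned value is valid JSON (as it appears to be in the output shown in the question) and uses json.JSONDecoder.raw_decode, which parses one JSON value starting at a given index and ignores whatever follows it. The helper name extract_js_object and the shortened sample text are made up for illustration.

```python
import json


def extract_js_object(script_text, var_name):
    """Return the JSON object assigned to `var_name` in `script_text` as a dict.

    Hypothetical helper: assumes the assigned value is valid JSON.
    """
    # Locate the declaration, then the first '{' after it.
    decl_pos = script_text.index('var ' + var_name)
    brace_pos = script_text.index('{', decl_pos)
    # raw_decode parses one JSON value starting at brace_pos and returns
    # (parsed_value, index_just_past_the_value); trailing text is ignored.
    value, _end = json.JSONDecoder().raw_decode(script_text, brace_pos)
    return value


# Shortened stand-in for special_code from the question.
special_code = """var ajaxsearch = false;
var combinationsFromController ={
"224114": {
"attributes_values": {
"4": "5.5"
},
"attributes": [
22
],
"minimal_quantity": "1",
"id_image": -1,
"list": "'22'"
}
};
var contentOnly = false;
"""

combinations = extract_js_object(special_code, 'combinationsFromController')
print(combinations['224114']['attributes_values']['4'])  # prints 5.5
```

This avoids having to know which variable comes next or whether the object spans one line or many, at the cost of raising ValueError if the variable is missing or its value is not valid JSON.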

Regarding "Python - How to print out a line from a text", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53831391/
