
python - Why is uTorrent's magnet-to-torrent-file fetch faster than my Python script?

Reposted. Author: 行者123. Updated: 2023-11-30 23:06:13

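I am trying to convert torrent magnet URLs into .torrent files with a Python script. The script connects to the DHT, waits for the metadata, and then creates the torrent file from it.

For example: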

#!/usr/bin/env python
'''
Created on Apr 19, 2012
@author: dan, Faless

GNU GENERAL PUBLIC LICENSE - Version 3

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

http://www.gnu.org/licenses/gpl-3.0.txt

'''

import shutil
import tempfile
import os.path as pt
import sys
import libtorrent as lt
from time import sleep


def magnet2torrent(magnet, output_name=None):
    if output_name and \
            not pt.isdir(output_name) and \
            not pt.isdir(pt.dirname(pt.abspath(output_name))):
        print("Invalid output folder: " + pt.dirname(pt.abspath(output_name)))
        print("")
        sys.exit(0)

    tempdir = tempfile.mkdtemp()
    ses = lt.session()
    params = {
        'save_path': tempdir,
        'storage_mode': lt.storage_mode_t(2),
        'paused': False,
        'auto_managed': True,
        'duplicate_is_error': True
    }
    handle = lt.add_magnet_uri(ses, magnet, params)

    print("Downloading Metadata (this may take a while)")
    while (not handle.has_metadata()):
        try:
            sleep(1)
        except KeyboardInterrupt:
            print("Aborting...")
            ses.pause()
            print("Cleanup dir " + tempdir)
            shutil.rmtree(tempdir)
            sys.exit(0)
    ses.pause()
    print("Done")

    torinfo = handle.get_torrent_info()
    torfile = lt.create_torrent(torinfo)

    output = pt.abspath(torinfo.name() + ".torrent")

    if output_name:
        if pt.isdir(output_name):
            output = pt.abspath(pt.join(
                output_name, torinfo.name() + ".torrent"))
        elif pt.isdir(pt.dirname(pt.abspath(output_name))):
            output = pt.abspath(output_name)

    print("Saving torrent file here : " + output + " ...")
    # Encode once and reuse the result when writing the file.
    torcontent = lt.bencode(torfile.generate())
    f = open(output, "wb")
    f.write(torcontent)
    f.close()
    print("Saved! Cleaning up dir: " + tempdir)
    ses.remove_torrent(handle)
    shutil.rmtree(tempdir)

    return output


def showHelp():
    print("")
    print("USAGE: " + pt.basename(sys.argv[0]) + " MAGNET [OUTPUT]")
    print(" MAGNET\t- the magnet url")
    print(" OUTPUT\t- the output torrent file name")
    print("")


def main():
    if len(sys.argv) < 2:
        showHelp()
        sys.exit(0)

    magnet = sys.argv[1]
    output_name = None

    if len(sys.argv) >= 3:
        output_name = sys.argv[2]

    magnet2torrent(magnet, output_name)


if __name__ == "__main__":
    main()

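The script above takes about a minute or more to fetch the metadata and create the .torrent file, whereas the uTorrent client takes only a few seconds. Why is that?

How can I make my script faster?

I want to fetch metadata for roughly 1k+ torrents.

Example magnet link:

magnet:?xt=urn:btih:BFEFB51F4670D682E98382ADF81014638A25105A&dn=openSUSE+13.2+DVD+x86_64.iso&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80

Update:

I have already specified well-known DHT routers in my script, like this: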

session = lt.session()
session.listen_on(6881, 6891)

session.add_dht_router("router.utorrent.com", 6881)
session.add_dht_router("router.bittorrent.com", 6881)
session.add_dht_router("dht.transmissionbt.com", 6881)
session.add_dht_router("router.bitcomet.com", 6881)
session.add_dht_router("dht.aelitis.com", 6881)
session.start_dht()

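But it is still slow, and sometimes I get errors like these:

DHT error [hostname lookup] (1) Host not found (authoritative)
could not map port using UPnP: no router found

Update:

I wrote this small script, which reads hex info-hashes from a database, tries to fetch their metadata from the DHT, and then inserts the resulting torrent files into the database.

I let it run indefinitely because I don't know how to save the state; the idea is that keeping it running will accumulate more peers and make fetching metadata faster.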

#!/usr/bin/env python
# Runs as a client or daemon and fetches torrent metadata (i.e. .torrent files) for magnet URIs.

import libtorrent as lt  # libtorrent library
import tempfile  # temporary save_path while fetching metadata
import sys  # reading shell arguments and exiting the script
from time import sleep
import shutil  # removing the temporary directory tree
import os.path  # working-directory and path helpers
from pprint import pprint  # for debugging, showing object data
import MySQLdb  # DB connectivity
import os
from datetime import date, timedelta

# Lock file intended to make sure only a single instance is running.
# Creating it is commented out, so the check below never triggers as written.
lock_file_name = "/daemon.lock"

if os.path.isfile(lock_file_name):
    sys.exit('another instance running')
#else:
    #f = open(lock_file_name, "w")
    #f.close()

session = lt.session()
session.listen_on(6881, 6891)

session.add_dht_router("router.utorrent.com", 6881)
session.add_dht_router("router.bittorrent.com", 6881)
session.add_dht_router("dht.transmissionbt.com", 6881)
session.add_dht_router("router.bitcomet.com", 6881)
session.add_dht_router("dht.aelitis.com", 6881)
session.start_dht()

alive = True
while alive:
    # Open a fresh database connection for each pass.
    db_conn = MySQLdb.connect(host='localhost', user='', passwd='', db='basesite', unix_socket='')
    subset_count = 5  # number of magnets resolved per batch

    # Select records uploaded since yesterday that are not yet enabled.
    yesterday = (date.today() - timedelta(1)).strftime('%Y-%m-%d %H:%M:%S')

    total_count = 0  # stays 0 if the query below fails
    try:
        total_count_cursor = db_conn.cursor()
        total_count_cursor.execute(
            "SELECT COUNT(*) AS total_count FROM content "
            "WHERE upload_date > %s AND enabled = '0'", [yesterday])
        total_count = total_count_cursor.fetchone()[0]
        print(total_count)
    except MySQLdb.Error:
        print("Error: unable to select data")

    # Round up so that a final partial batch is not skipped.
    total_pages = (total_count + subset_count - 1) // subset_count

    current_page = 1
    while current_page <= total_pages:
        from_count = (current_page - 1) * subset_count

        hashes = []
        try:
            get_mysql_data_cursor = db_conn.cursor()
            get_mysql_data_cursor.execute(
                "SELECT hash FROM content "
                "WHERE upload_date > %s AND enabled = '0' "
                "ORDER BY record_num ASC LIMIT %s, %s",
                [yesterday, from_count, subset_count])
            for row in get_mysql_data_cursor.fetchall():
                hashes.append(row[0].upper())
        except MySQLdb.Error:
            print("Error: unable to select data")

        print(hashes)

        handles = []

        for info_hash in hashes:
            tempdir = tempfile.mkdtemp()
            add_magnet_uri_params = {
                'save_path': tempdir,
                'storage_mode': lt.storage_mode_t(2),
                'paused': False,
                'auto_managed': True,
                'duplicate_is_error': True
            }
            magnet_uri = "magnet:?xt=urn:btih:" + info_hash.upper() + "&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80"
            handle = lt.add_magnet_uri(session, magnet_uri, add_magnet_uri_params)
            handles.append(handle)  # push handle onto the handles list

        while len(handles) != 0:
            # Iterate over a copy, because finished handles are removed from the list.
            for h in handles[:]:
                if h.has_metadata():
                    torinfo = h.get_torrent_info()
                    final_info_hash = str(torinfo.info_hash()).upper()
                    torfile = lt.create_torrent(torinfo)
                    torcontent = lt.bencode(torfile.generate())
                    try:
                        insert_cursor = db_conn.cursor()
                        insert_cursor.execute("""INSERT INTO dht_tfiles (hash, tdata) VALUES (%s, %s)""", [final_info_hash, torcontent])
                        db_conn.commit()
                    except MySQLdb.Error as e:
                        try:
                            print("MySQL Error [%d]: %s" % (e.args[0], e.args[1]))
                        except IndexError:
                            print("MySQL Error: %s" % str(e))

                    shutil.rmtree(h.save_path())  # remove temp data directory
                    session.remove_torrent(h)  # remove torrent handle from session
                    handles.remove(h)  # remove handle from list
                elif h.status().active_time > 600:
                    # Give up on handles older than 10 minutes (600 seconds).
                    shutil.rmtree(h.save_path())  # remove temp data directory
                    session.remove_torrent(h)  # remove torrent handle from session
                    handles.remove(h)  # remove handle from list
            sleep(1)

        print('sleep10')
        sleep(10)
        current_page = current_page + 1
    db_conn.close()
    sleep(20)

# Not reached while alive stays True.
os.remove(lock_file_name)
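
The magnet URI construction used in the script above can be pulled into a small helper. This is only a sketch in Python 3 (the scripts in the question target Python 2); `magnet_from_hash` and `DEFAULT_TRACKERS` are hypothetical names, and the tracker list is simply the one that appears in the question.

```python
import re
from urllib.parse import quote

# Trackers copied from the question's script; any list of announce URLs works.
DEFAULT_TRACKERS = [
    "udp://tracker.openbittorrent.com:80",
    "udp://tracker.publicbt.com:80",
    "udp://tracker.ccc.de:80",
]

_HEX_INFO_HASH = re.compile(r"[0-9A-Fa-f]{40}")


def magnet_from_hash(info_hash, trackers=DEFAULT_TRACKERS):
    """Build a magnet URI from a 40-character hex info-hash (hypothetical helper)."""
    info_hash = info_hash.strip()
    if not _HEX_INFO_HASH.fullmatch(info_hash):
        raise ValueError("expected a 40-character hex info-hash: %r" % info_hash)
    uri = "magnet:?xt=urn:btih:" + info_hash.upper()
    for tracker in trackers:
        # safe="" makes quote() percent-encode ':' and '/' as well.
        uri += "&tr=" + quote(tracker, safe="")
    return uri
```

Rejecting malformed hashes before they reach libtorrent keeps one bad database row from occupying a handle until the 10-minute timeout.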

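Now I need to implement the new things Arvid suggested.

Update:

I have successfully implemented Arvid's suggestions, plus a few more extensions I found on the Deluge support forum: http://forum.deluge-torrent.org/viewtopic.php?f=7&t=42299&start=10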

#!/usr/bin/env python

import libtorrent as lt  # libtorrent library
import tempfile  # temporary save_path while fetching metadata
import sys  # reading shell arguments and exiting the script
from time import sleep
import shutil  # removing the temporary directory tree
import os.path  # working-directory and path helpers
from pprint import pprint  # for debugging, showing object data
import MySQLdb  # DB connectivity
import os
from datetime import date, timedelta

def var_dump(obj):
    # Print every attribute of an object, for debugging alerts.
    for attr in dir(obj):
        print("obj.%s = %s" % (attr, getattr(obj, attr)))

session = lt.session()
session.add_extension('ut_pex')
session.add_extension('ut_metadata')
session.add_extension('smart_ban')
session.add_extension('metadata_transfer')

#session = lt.session(lt.fingerprint("DE", 0, 1, 0, 0), flags=1)

session_save_filename = "/tmp/new.client.save_state"

if os.path.isfile(session_save_filename):
    fileread = open(session_save_filename, 'rb')
    session.load_state(lt.bdecode(fileread.read()))
    fileread.close()
    print('session loaded from file')
else:
    print('new session started')

session.add_dht_router("router.utorrent.com", 6881)
session.add_dht_router("router.bittorrent.com", 6881)
session.add_dht_router("dht.transmissionbt.com", 6881)
session.add_dht_router("router.bitcomet.com", 6881)
session.add_dht_router("dht.aelitis.com", 6881)
session.start_dht()

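alive = True
while alive:
    # pop_alert() hands back one queued alert per call; drain the queue on
    # each pass (assumes it returns None once the queue is empty).
    print('----------')
    a = session.pop_alert()
    while a is not None:
        var_dump(a)
        a = session.pop_alert()

    print('sleep10')
    sleep(10)
    # Persist the session state (including the DHT routing table) on every pass.
    filewrite = open(session_save_filename, "wb")
    filewrite.write(lt.bencode(session.save_state()))
    filewrite.close()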

I let it run for a minute and received this alert:

obj.msg = no router found

Update:

After some testing, it looks like

session.add_dht_router("router.bitcomet.com", 6881)

causes

('%s: %s', 'alert', 'DHT error [hostname lookup] (1) Host not found (authoritative)')
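
One way to avoid this alert is to check which router hostnames resolve before handing them to the session. The following Python 3 sketch uses only the standard library; `resolvable_routers` is a hypothetical helper, and the router list is the one from the question.

```python
import socket

# Router list taken from the question's script.
ROUTERS = [
    ("router.utorrent.com", 6881),
    ("router.bittorrent.com", 6881),
    ("dht.transmissionbt.com", 6881),
    ("router.bitcomet.com", 6881),
    ("dht.aelitis.com", 6881),
]


def resolvable_routers(routers, resolve=socket.getaddrinfo):
    """Return only the (host, port) pairs whose hostname currently resolves."""
    usable = []
    for host, port in routers:
        try:
            resolve(host, port)
        except socket.gaierror as exc:
            print("skipping %s: %s" % (host, exc))
        else:
            usable.append((host, port))
    return usable


# Intended use with the question's session object:
#     for host, port in resolvable_routers(ROUTERS):
#         session.add_dht_router(host, port)
```

The `resolve` parameter exists only so the lookup can be replaced in tests; in normal use the default `socket.getaddrinfo` performs a real DNS query.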

Update: I added

session.start_dht()
session.start_lsd()
session.start_upnp()
session.start_natpmp()

and received this alert:

('%s: %s', 'portmap_error_alert', 'could not map port using UPnP: no router found')

Best answer

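As MatteoItalia points out, bootstrapping the DHT is not instantaneous and can sometimes take a while. There is no well-defined point in time at which the bootstrap process is complete; it is a continuum of becoming increasingly connected to the network.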

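The better connected you are and the more good, stable nodes you know about, the quicker lookups will be. One way to factor out most of the bootstrap process (for a more apples-to-apples comparison) is to start timing only after you receive the dht_bootstrap_alert, and to hold off on adding the magnet link until then.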
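
Holding off until the bootstrap alert arrives could look roughly like the Python 3 sketch below. `wait_for_alert` is a hypothetical helper, not a libtorrent function. It assumes the session exposes `pop_alert()` as in the question's script, that `pop_alert()` returns None when the queue is empty, that the alert's Python class is named `dht_bootstrap_alert`, and that the session's alert mask has been configured to deliver DHT alerts.

```python
import time


def wait_for_alert(session, alert_name, timeout=120.0, poll=0.5,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll session.pop_alert() until an alert of the named type arrives.

    Returns the alert, or None if the timeout expires first.
    """
    deadline = clock() + timeout
    while True:
        # Drain everything currently queued before sleeping again.
        alert = session.pop_alert()
        while alert is not None:
            if type(alert).__name__ == alert_name:
                return alert
            alert = session.pop_alert()
        if clock() >= deadline:
            return None
        sleep(poll)


# Intended use (lt, session, magnet and params as in the question):
#     if wait_for_alert(session, "dht_bootstrap_alert") is not None:
#         start = time.time()  # start timing only now
#         handle = lt.add_magnet_uri(session, magnet, params)
```

Matching on the class name keeps the helper free of a hard libtorrent import; newer bindings also offer `pop_alerts()`, which returns the whole queue at once.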

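Adding the DHT bootstrap nodes primarily makes it possible to bootstrap at all; it still won't necessarily be particularly quick. You typically need about 270 nodes or so (including replacement nodes) to be considered bootstrapped.

One thing you can do to speed up the bootstrap process is to make sure you save and load the session state, which includes the DHT routing table. This reloads all the nodes from the previous session into the routing table, and (assuming your IP has not changed and everything works correctly) bootstrapping should be quicker.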
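
If the state file is rewritten every few seconds, as in the question's loop, a crash in the middle of a write could leave a truncated file that fails to decode on the next start. A hedged Python 3 sketch of a safer write follows; `save_state_bytes` and `load_state_bytes` are hypothetical helpers, and `data` is assumed to be the output of `lt.bencode(session.save_state())`.

```python
import os
import tempfile


def save_state_bytes(path, data):
    """Write data to path via a temp file so a crash cannot leave a truncated file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # The rename is atomic on POSIX when both paths are on one filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state_bytes(path):
    """Return the saved bytes, or None if no state has been saved yet."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None


# Intended use with the question's session object:
#     save_state_bytes(session_save_filename, lt.bencode(session.save_state()))
#     raw = load_state_bytes(session_save_filename)
#     if raw is not None:
#         session.load_state(lt.bdecode(raw))
```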

Make sure you don't start the DHT in the session constructor (as the flags argument, just pass in add_default_plugins), then load the state, add the router nodes, and only then start the DHT.
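
The ordering described here can be sketched as a single function. This is an illustration under assumptions, not a definitive recipe: `start_session` is a hypothetical helper, `lt` is the imported libtorrent module passed in as a parameter, the constructor call mirrors the commented-out line in the question, and `flags=1` is assumed to correspond to add_default_plugins without start_default_features in the older binding API the question uses.

```python
# Routers taken from the question (the one reported as unresolvable is left out).
ROUTERS = [
    ("router.utorrent.com", 6881),
    ("router.bittorrent.com", 6881),
    ("dht.transmissionbt.com", 6881),
    ("dht.aelitis.com", 6881),
]


def start_session(lt, state_bytes=None, routers=ROUTERS):
    """Create a session and start the DHT in the order the answer describes."""
    # 1. Construct the session without starting the default features,
    #    so the DHT is not running yet (flags=1: see the assumption above).
    session = lt.session(lt.fingerprint("DE", 0, 1, 0, 0), flags=1)
    # 2. Load the previously saved state, which includes the DHT routing table.
    if state_bytes is not None:
        session.load_state(lt.bdecode(state_bytes))
    # 3. Add the bootstrap routers.
    for host, port in routers:
        session.add_dht_router(host, port)
    # 4. Only now start the DHT.
    session.start_dht()
    return session


# Intended use:
#     import libtorrent as lt
#     session = start_session(lt, state_bytes=open(path, "rb").read())
```

Passing the module in keeps the function importable on machines without libtorrent and makes the call order easy to check with a stand-in object.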

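Unfortunately, there are a lot of moving parts involved in making this work internally; the ordering matters, and there may be subtle issues.

Also note that keeping the DHT running continuously will be quicker, because reloading the state still goes through a bootstrap. It just has more nodes up front to ping and try to "connect" to.

Disabling the start_default_features flag also means that UPnP and NAT-PMP won't be started; if you use them, you have to start them manually as well.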

Regarding "python - Why is uTorrent's magnet-to-torrent-file fetch faster than my Python script?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32753998/
