
python - Why do multiple threads and different functions/scopes share a single import process?

Reposted. Author: 行者123. Updated: 2023-11-28 22:42:06

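In all the years I have used Python, this pitfall is the first bug that was genuinely hard to track down.

Let me show an oversimplified example. I have these files and directories: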

[xiaobai@xiaobai import_pitfall]$ tree -F -C -a
.
├── import_all_pitall/
│   ├── hello.py
│   └── __init__.py
└── thread_test.py

1 directory, 3 files
[xiaobai@xiaobai import_pitfall]$

Contents of thread_test.py:

[xiaobai@xiaobai import_pitfall]$ cat thread_test.py 
import time
import threading

def do_import1():
    print( "do_import 1A" )
    from import_all_pitall import hello
    print( "do_import 1B", id(hello), locals() )

def do_import2():
    print( "do_import 2A" )
    from import_all_pitall import hello as h
    print( "do_import 2B", id(h), locals() )

def do_import3():
    print( "do_import 3A" )
    import import_all_pitall.hello as h2
    #no problem if import different module #import urllib as h2
    print( "do_import 3B", id(h2), locals() )

print( "main 1" )
t = threading.Thread(target=do_import1)
print( "main 2" )
t.start()
print( "main 3" )
t2 = threading.Thread(target=do_import2)
print( "main 4" )
t2.start()
print( "main 5" )
print(globals()) #no such hello
#time.sleep(2) #slightly wait for do_import 1A import finished to test print hello below.
#print( "main 6", id(hello), locals() ) #"name 'hello' not defined" error even do_import1 was success
do_import3()
print( "main -1" )
[xiaobai@xiaobai import_pitfall]$

Contents of hello.py:

[xiaobai@xiaobai import_pitfall]$ cat import_all_pitall/hello.py
print( "haha0" )
import time
t = time.time()
print( "haha1" )
def do_task():
    success = 0
    while not success:
        try:
            time.sleep(1)
            undefined_func( "Done haha" )
            success = 1
        except Exception as e:
            print("exception occur", e)
            print( "haha time is ", t )
do_task()
print( "haha -1" )
[xiaobai@xiaobai import_pitfall]$

And import_all_pitall/__init__.py is an empty file.

Let's run it:

[xiaobai@xiaobai import_pitfall]$ python thread_test.py 
main 1
main 2
do_import 1A
main 3
haha0
haha1
main 4
do_import 2A
main 5
{'do_import1': <function do_import1 at 0x7f9d884760c8>, 'do_import3': <function do_import3 at 0x7f9d884a6758>, 'do_import2': <function do_import2 at 0x7f9d884a66e0>, '__builtins__': <module '__builtin__' (built-in)>, '__file__': 'thread_test.py', 't2': <Thread(Thread-2, started 140314429765376)>, '__package__': None, 'threading': <module 'threading' from '/usr/lib64/python2.7/threading.pyc'>, 't': <Thread(Thread-1, started 140314438158080)>, 'time': <module 'time' from '/usr/lib64/python2.7/lib-dynload/timemodule.so'>, '__name__': '__main__', '__doc__': None}
do_import 3A
('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
^C('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
^C('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
^C^C('exception occur', NameError("global name 'undefined_func' is not defined",))
('haha time is ', 1439451183.753475)
... #Forever

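Look carefully: where are "do_import 2B" and "do_import 3B"? Those two calls are stuck at the import statement and never even reach the first line of the imported module, since time.time() runs only once. They hang only because the same module was first imported in another thread/function, and that import is still stuck in its "unfinished" loop. My whole system is large and multithreaded, and this was very hard to debug before I understood what was going on.

After I comment out the undefined_func( "Done haha" ) call in hello.py: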

print( "haha0" )
import time
t = time.time()
print( "haha1" )
def do_task():
    success = 0
    while not success:
        try:
            time.sleep(1)
            #undefined_func( "Done haha" )
            success = 1
        except Exception as e:
            print("exception occur", e)
            print( "haha time is ", t )
do_task()
print( "haha -1" )

Then run it:

[xiaobai@xiaobai import_pitfall]$ python3 thread_test.py 
main 1
main 2
do_import 1A
main 3
main 4
do_import 2A
main 5
{'do_import3': <function do_import3 at 0x7f31a462c048>, '__package__': None, 't2': <Thread(Thread-2, started 139851179529984)>, '__name__': '__main__', '__cached__': None, 'threading': <module 'threading' from '/usr/lib64/python3.4/threading.py'>, '__doc__': None, 'do_import2': <function do_import2 at 0x7f31ac1d56a8>, 'do_import1': <function do_import1 at 0x7f31ac2c0bf8>, '__spec__': None, 't': <Thread(Thread-1, started 139851187922688)>, '__file__': 'thread_test.py', 'time': <module 'time' from '/usr/lib64/python3.4/lib-dynload/time.cpython-34m.so'>, '__loader__': <_frozen_importlib.SourceFileLoader object at 0x7f31ac297048>, '__builtins__': <module 'builtins' (built-in)>}
do_import 3A
haha0
haha1
haha -1
do_import 1B 139851188124312 {'hello': <module 'import_all_pitall.hello' from '/home/xiaobai/note/python/import_pitfall/import_all_pitall/hello.py'>}
do_import 2B 139851188124312 {'h': <module 'import_all_pitall.hello' from '/home/xiaobai/note/python/import_pitfall/import_all_pitall/hello.py'>}
do_import 3B 139851188124312 {'h2': <module 'import_all_pitall.hello' from '/home/xiaobai/note/python/import_pitfall/import_all_pitall/hello.py'>}
main -1
[xiaobai@xiaobai import_pitfall]$

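I printed the ids and found that they all share the same id, 139851188124312. So the three functions share the same imported object/process. But this does not make sense to me. I thought the object was local to the function, because if I try to print the imported "hello" object at global scope, it raises an error:

Edit thread_test.py to print the hello object at global scope: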

...
print( "main 5" )
print(globals()) #no such hello
time.sleep(2) #slightly wait for do_import 1A import finished to test print hello below.
print( "main 6", id(hello), locals() ) #"name 'hello' not defined" error even do_import1 was success
do_import3()
print( "main -1" )

Let's run it:

[xiaobai@xiaobai import_pitfall]$ python3 thread_test.py 
main 1
main 2
do_import 1A
main 3
main 4
do_import 2A
main 5
{'t': <Thread(Thread-1, started 140404878976768)>, '__spec__': None, 'time': <module 'time' from '/usr/lib64/python3.4/lib-dynload/time.cpython-34m.so'>, '__cached__': None, '__loader__': <_frozen_importlib.SourceFileLoader object at 0x7fb296b87048>, 'do_import2': <function do_import2 at 0x7fb296ac56a8>, 'do_import1': <function do_import1 at 0x7fb296bb0bf8>, '__doc__': None, '__file__': 'thread_test.py', 'do_import3': <function do_import3 at 0x7fb28ef19f28>, 't2': <Thread(Thread-2, started 140404870584064)>, '__name__': '__main__', '__package__': None, '__builtins__': <module 'builtins' (built-in)>, 'threading': <module 'threading' from '/usr/lib64/python3.4/threading.py'>}
haha0
haha1
haha -1
do_import 1B 140404879178392 {'hello': <module 'import_all_pitall.hello' from '/home/xiaobai/note/python/import_pitfall/import_all_pitall/hello.py'>}
do_import 2B 140404879178392 {'h': <module 'import_all_pitall.hello' from '/home/xiaobai/note/python/import_pitfall/import_all_pitall/hello.py'>}
Traceback (most recent call last):
  File "thread_test.py", line 31, in <module>
    print( "main 6", id(hello), locals() ) #"name 'hello' not defined" error even do_import1 was success
NameError: name 'hello' is not defined
[xiaobai@xiaobai import_pitfall]$

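hello is not global, so why can it be shared by different threads running different functions? Why doesn't Python allow a unique local import? Why does Python share the import process, and make all the other threads "wait" for no apparent reason just because one thread hangs during the import?

Best answer

To answer one of the questions -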

I print the id and figure they all share the same id 140589697897480. So 3 functions share the same import object/process.

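Yes. When you import a module, Python creates the module object and caches it in sys.modules. Any subsequent import of that module fetches the module object from sys.modules and returns it; the module is not imported again.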
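The caching can be checked directly. The following is a minimal sketch, not the asker's code: it uses the standard-library json module as a stand-in for hello, and the function names first and second are made up for illustration.

```python
import sys

def first():
    import json            # first import: module is loaded and cached
    return json

def second():
    import json as j       # later import: fetched from sys.modules
    return j

a = first()
b = second()

same_object = a is b                       # both functions got one object
cached = a is sys.modules["json"]          # and it is the cached entry
print(same_object, cached)
```

Both checks print True, which is why the ids printed by the three functions in the question are identical.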

For the second part of the same question -

But this doesn't make sense to me, i though object is local to the function, because if i try to print imported "hello" object on global scope, it will throw error

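Well, sys.modules is not local, but the name hello is local to the function. As said above, if you import the module again, Python first looks in sys.modules to see whether it has already been imported. If it is there, that module object is returned; otherwise the module is imported and added to sys.modules.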
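A short sketch of that distinction, assuming the standard-library colorsys module as a stand-in for hello (the names do_import and local_alias are hypothetical):

```python
import sys

def do_import():
    import colorsys as local_alias   # binds a name in this function only
    return sorted(locals())

local_names = do_import()
print(local_names)                   # the alias exists inside the function

# The module object itself lives in the process-wide cache...
in_cache = "colorsys" in sys.modules
print(in_cache)

# ...but the name was never bound at module (global) scope.
try:
    local_alias
    name_visible = True
except NameError as e:
    name_visible = False
    print("NameError:", e)
```

The import statement does two separate things: it makes sure the module is in sys.modules, and it binds a name in the current scope. Only the second step is local.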


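As for the first program: when a Python module is imported, its top-level code runs. Your hello.py calls do_task() at the top level, and its loop (while not success:) never exits, because undefined_func raises NameError on every iteration before success is set. So the import never completes.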
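This also suggests why the other threads wait. In CPython 3.3 and later, an import in progress holds a per-module import lock until the module's top-level code finishes (Python 2 used a single global import lock), so a second thread importing the same module blocks instead of receiving a half-initialized module. The sketch below is an illustration under those assumptions: slow_mod_demo is a hypothetical module written to a temporary directory, and its 0.5 s sleep stands in for the never-ending loop in hello.py.

```python
import importlib
import os
import sys
import tempfile
import threading
import time

# Hypothetical module whose top-level code needs 0.5 s to finish.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "slow_mod_demo.py"), "w") as f:
    f.write("import time\ntime.sleep(0.5)\nVALUE = 42\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

results = {}

def worker(name):
    # Blocks while another thread is still executing the module body.
    mod = importlib.import_module("slow_mod_demo")
    results[name] = (time.monotonic(), id(mod))

start = time.monotonic()
first = threading.Thread(target=worker, args=("first",))
second = threading.Thread(target=worker, args=("second",))
first.start()
time.sleep(0.1)          # give the first thread time to enter the import
second.start()
first.join()
second.join()

same_module = results["first"][1] == results["second"][1]
second_waited = results["second"][0] - start
print("same module object:", same_module)
print("second thread returned after %.2f s" % second_waited)
```

The second thread starts its import after only 0.1 s, yet it does not return until the module body has finished. If the module body never finishes, as in the question, every later importer of that module waits forever.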

If you do not want the infinite loop to run on import, put the code that should not run at import time inside -

if __name__ == '__main__':

The code inside the if statement above runs only when the script is executed directly; it does not run when the module is imported.
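One way hello.py could be restructured along these lines is sketched below. The max_attempts parameter and the bounded retry are assumptions added for illustration, not part of the original code; undefined_func is left undefined on purpose, as in the question.

```python
import time

t = time.time()

def do_task(max_attempts=3):
    """Retry a bounded number of times instead of looping forever.

    max_attempts is a hypothetical parameter added for this sketch.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            time.sleep(0.01)
            undefined_func("Done haha")   # still undefined, so this raises
            return True
        except Exception as e:
            print("exception occur", e, "attempt", attempt)
    return False

# Runs only when the file is executed directly, never on import.
if __name__ == "__main__":
    do_task()
```

With this layout, importing the module only defines do_task and returns immediately, so no other thread is left waiting on the import; the caller decides when to run the task.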


I guess that when you say -

After i comment out the '#undefined_func( "Done haha" )' in hello.py

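you effectively removed the infinite loop: without the failing call, success is set to 1 on the first iteration and the loop exits, so the import succeeded.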

Regarding "python - Why do multiple threads and different functions/scopes share a single import process?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31982561/
