Python: getting meaningful results from cProfile


Reposted by: IT老高 | Updated: 2023-10-28 21:58:01

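I have a Python script in a single file that takes just over 30 seconds to run. I am trying to profile it because I would like to cut that time down dramatically.

I am trying to profile the script with cProfile, but essentially all it seems to tell me is that yes, the main script takes a long time to run. It does not give the kind of breakdown I was expecting. At the terminal, I type something like: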

cat my_script_input.txt | python -m cProfile -s time my_script.py

The result I get is:

<my_script_output>
             683121 function calls (682169 primitive calls) in 32.133 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   31.980   31.980   32.133   32.133 my_script.py:18(<module>)
   121089    0.050    0.000    0.050    0.000 {method 'split' of 'str' objects}
   121090    0.038    0.000    0.049    0.000 fileinput.py:243(next)
        2    0.027    0.014    0.036    0.018 {method 'sort' of 'list' objects}
   121089    0.009    0.000    0.009    0.000 {method 'strip' of 'str' objects}
   201534    0.009    0.000    0.009    0.000 {method 'append' of 'list' objects}
   100858    0.009    0.000    0.009    0.000 my_script.py:51(<lambda>)
      952    0.008    0.000    0.008    0.000 {method 'readlines' of 'file' objects}
 1904/952    0.003    0.000    0.011    0.000 fileinput.py:292(readline)
    14412    0.001    0.000    0.001    0.000 {method 'add' of 'set' objects}
      182    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
        1    0.000    0.000    0.000    0.000 fileinput.py:80(<module>)
        1    0.000    0.000    0.000    0.000 fileinput.py:197(__init__)
        1    0.000    0.000    0.000    0.000 fileinput.py:266(nextfile)
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 fileinput.py:91(input)
        1    0.000    0.000    0.000    0.000 fileinput.py:184(FileInput)
        1    0.000    0.000    0.000    0.000 fileinput.py:240(__iter__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

This does not seem to tell me anything useful. The vast majority of the time is simply listed as:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   31.980   31.980   32.133   32.133 my_script.py:18(<module>)

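In my_script.py, line 18 is nothing more than the closing """ of the file's header block comment, so it is not as if a lot of work is concentrated on line 18. The script as a whole consists mostly of line-based processing, largely string splitting, sorting and set work, so I expected to find most of the time going to one or more of these activities. As it stands, having cProfile attribute all of the time to a comment line makes no sense, or at least sheds no light on what is actually consuming the time.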

EDIT: I have constructed a minimal working example, similar to the case above, that demonstrates the same behavior:

mwe.py

import fileinput

for line in fileinput.input():
    for i in range(10):
        y = int(line.strip()) + int(line.strip())

And call it with:

perl -e 'for(1..1000000){print "$_\n"}' | python -m cProfile -s time mwe.py

To get the result:

         22002536 function calls (22001694 primitive calls) in 9.433 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    8.004    8.004    9.433    9.433 mwe.py:1(<module>)
 20000000    1.021    0.000    1.021    0.000 {method 'strip' of 'str' objects}
  1000001    0.270    0.000    0.301    0.000 fileinput.py:243(next)
  1000000    0.107    0.000    0.107    0.000 {range}
      842    0.024    0.000    0.024    0.000 {method 'readlines' of 'file' objects}
 1684/842    0.007    0.000    0.032    0.000 fileinput.py:292(readline)
        1    0.000    0.000    0.000    0.000 fileinput.py:80(<module>)
        1    0.000    0.000    0.000    0.000 fileinput.py:91(input)
        1    0.000    0.000    0.000    0.000 fileinput.py:197(__init__)
        1    0.000    0.000    0.000    0.000 fileinput.py:184(FileInput)
        1    0.000    0.000    0.000    0.000 fileinput.py:266(nextfile)
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 fileinput.py:240(__iter__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Am I using cProfile incorrectly somehow?

Best answer

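As I mentioned in a comment, when you can't get cProfile to work externally, you can often use it internally instead. It's not that hard.

For example, when I run your example with -m cProfile under my Python 2.7, I get effectively the same results you did. But when I instrument your example program manually: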

import fileinput
import cProfile

pr = cProfile.Profile()
pr.enable()
for line in fileinput.input():
    for i in range(10):
        y = int(line.strip()) + int(line.strip())
pr.disable()
pr.print_stats(sort='time')

...here is what I get:

         22002533 function calls (22001691 primitive calls) in 3.352 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 20000000    2.326    0.000    2.326    0.000 {method 'strip' of 'str' objects}
  1000001    0.646    0.000    0.700    0.000 fileinput.py:243(next)
  1000000    0.325    0.000    0.325    0.000 {range}
      842    0.042    0.000    0.042    0.000 {method 'readlines' of 'file' objects}
 1684/842    0.013    0.000    0.055    0.000 fileinput.py:292(readline)
        1    0.000    0.000    0.000    0.000 fileinput.py:197(__init__)
        1    0.000    0.000    0.000    0.000 fileinput.py:91(input)
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 fileinput.py:266(nextfile)
        1    0.000    0.000    0.000    0.000 fileinput.py:240(__iter__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

That is a lot more useful: it tells you what you probably already expected, that more than half of your time is spent calling str.strip().
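If the full table is more than you need, the same internal approach can be combined with the standard pstats module to sort the entries and print only the top few. The following is a minimal sketch written for Python 3 rather than the Python 2.7 used above; the function name strip_and_add and the in-memory input list are assumptions made so the example runs without piped input.

```python
import cProfile
import io
import pstats


def strip_and_add(lines):
    """Same per-line work as mwe.py, on an in-memory list (an assumption)."""
    total = 0
    for line in lines:
        for _ in range(10):
            total += int(line.strip()) + int(line.strip())
    return total


# Hypothetical stand-in for the input that the question pipes in on stdin.
lines = ["%d\n" % n for n in range(1, 10001)]

pr = cProfile.Profile()
pr.enable()
result = strip_and_add(lines)
pr.disable()

# Collect the report in a string buffer instead of printing directly.
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf)
stats.sort_stats("tottime").print_stats(5)  # at most the 5 costliest entries
report = buf.getvalue()
print(report)
```

Sorting by "tottime" corresponds to the "internal time" ordering shown in the output above.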

Also, note that if you can't edit the file containing the code you want to profile (mwe.py), you can always do this:

import cProfile
pr = cProfile.Profile()
pr.enable()
import mwe
pr.disable()
pr.print_stats(sort='time')

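Even that doesn't always work. For example, if your program calls exit(), you will have to use a try:/finally: wrapper and/or atexit. And if it calls os._exit() or segfaults, you are probably completely out of luck. But that isn't very common.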
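The try:/finally: wrapper mentioned above could look roughly like this. It is only a sketch for Python 3: run_profiled and script_body are hypothetical names, and script_body stands in for a program that exits early through sys.exit().

```python
import cProfile
import io
import pstats
import sys


def run_profiled(func, stream):
    """Profile func() and write the report to stream, even if func exits."""
    pr = cProfile.Profile()
    pr.enable()
    try:
        func()
    finally:
        # Reached even when func() raises SystemExit via exit()/sys.exit().
        pr.disable()
        pstats.Stats(pr, stream=stream).sort_stats("tottime").print_stats()


def script_body():
    # Hypothetical stand-in for a script that calls sys.exit() part-way through.
    for line in ["1 \n", "2 \n", "3 \n"]:
        int(line.strip())
    sys.exit(0)


report = io.StringIO()
try:
    run_profiled(script_body, report)
except SystemExit:
    # Caught only so this sketch keeps running; a real wrapper would let it propagate.
    pass
print(report.getvalue())
```

The finally: clause runs while SystemExit is propagating, so the statistics are written before the interpreter shuts down. Registering a similar reporting function with atexit.register() is the other option named above. Neither helps with os._exit() or a segfault, since those skip normal interpreter cleanup.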

However, something I discovered later: if you move all of the code out of the global scope, -m cProfile seems to work, at least for this case. For example:

import fileinput

def f():
    for line in fileinput.input():
        for i in range(10):
            y = int(line.strip()) + int(line.strip())
f()

Now the output from -m cProfile includes, among other things:

  2000000    4.819    0.000    4.819    0.000 :0(strip)
   100001    0.288    0.000    0.295    0.000 fileinput.py:243(next)

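I don't know why this also made it twice as slow... or maybe that's just a cache effect; it has been a few minutes since I last ran it, and I did a lot of web browsing in between. But that's not important. What matters is that most of the time is being charged to reasonable places.

But if I change this so that the outer loop moves back to the global level and only its body becomes a function, most of the time disappears again.

Another alternative, which I wouldn't suggest except as a last resort...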

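I notice that if I use profile instead of cProfile, it works both internally and externally, charging time to the right calls. However, those calls are also about 5x slower. And there seems to be an additional 10 seconds of constant overhead (which gets charged to import profile if used internally, or to whatever is on line 1 if used externally). So, to find out that strip is using about 70% of my time, instead of waiting 4 seconds and computing 2.326 / 3.352, I have to wait 27 seconds and compute 10.93 / (26.34 - 10.01). Not much fun...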
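For reference, the two ratios above work out as follows. This is just the arithmetic from the previous paragraph. The variable names are mine, and reading 10.93 as the time charged to strip under profile is an assumption, since that output is not shown here.

```python
# Figures copied from the two profiler runs quoted in the answer.
cprofile_strip, cprofile_total = 2.326, 3.352

# Assumed meaning: time charged to strip, total run time, constant overhead.
profile_strip, profile_total, profile_overhead = 10.93, 26.34, 10.01

# Share of time spent in strip according to cProfile (used internally).
cprofile_share = cprofile_strip / cprofile_total

# Share according to profile, after subtracting the constant overhead.
profile_share = profile_strip / (profile_total - profile_overhead)

print("cProfile: %.1f%%" % (100 * cprofile_share))
print("profile:  %.1f%%" % (100 * profile_share))
```

Both shares come out a little under 70%, so the two profilers agree on where the time goes. The only difference is how long you wait to find out.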

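One last thing: I get the same results with a CPython 3.4 dev build. The results are correct when cProfile is used internally, and everything is charged to the first line of code when it is used externally. But PyPy 2.2/2.7.3 and PyPy3 2.1b1/3.2.3 both seem to give me correct results with -m cProfile. That may just mean that PyPy's cProfile is faked on top of profile because the pure-Python code is fast enough.

Anyway, if someone can figure out or explain why -m cProfile isn't working, that would be great... but otherwise, this is usually a perfectly good workaround.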

Regarding getting meaningful results from cProfile in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/21274898/
