python - Profiling code with memory_profiler increases execution time


Reposted by: 太空宇宙 · Updated: 2023-11-04 04:41:05

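I am writing a simple application that splits a large text file into several smaller files. I have written two versions of it, one using lists and one using generators. I profiled both versions with the memory_profiler module, and it clearly showed that the generator version is more memory-efficient. Strangely, though, profiling the generator version increases its execution time considerably. The demonstration below shows what I mean.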

Version using lists

from memory_profiler import profile


@profile()
def main():
    file_name = input("Enter the full path of file you want to split into smaller inputFiles: ")
    input_file = open(file_name).readlines()
    num_lines_orig = len(input_file)
    parts = int(input("Enter the number of parts you want to split in: "))
    output_files = [(file_name + str(i)) for i in range(1, parts + 1)]
    st = 0
    p = int(num_lines_orig / parts)
    ed = p
    for i in range(parts-1):
        with open(output_files[i], "w") as OF:
            OF.writelines(input_file[st:ed])
        st = ed
        ed = st + p

    with open(output_files[-1], "w") as OF:
        OF.writelines(input_file[st:])


if __name__ == "__main__":
    main()

When run with the profiler

$ time py36 Splitting\ text\ files_BAD_usingLists.py                                                                                                               

Enter the full path of file you want to split into smaller inputFiles: /apps/nttech/rbhanot/Downloads/test.txt
Enter the number of parts you want to split in: 3
Filename: Splitting text files_BAD_usingLists.py

Line #    Mem usage    Increment   Line Contents
================================================
     6     47.8 MiB      0.0 MiB   @profile()
     7                             def main():
     8     47.8 MiB      0.0 MiB       file_name = input("Enter the full path of file you want to split into smaller inputFiles: ")
     9    107.3 MiB     59.5 MiB       input_file = open(file_name).readlines()
    10    107.3 MiB      0.0 MiB       num_lines_orig = len(input_file)
    11    107.3 MiB      0.0 MiB       parts = int(input("Enter the number of parts you want to split in: "))
    12    107.3 MiB      0.0 MiB       output_files = [(file_name + str(i)) for i in range(1, parts + 1)]
    13    107.3 MiB      0.0 MiB       st = 0
    14    107.3 MiB      0.0 MiB       p = int(num_lines_orig / parts)
    15    107.3 MiB      0.0 MiB       ed = p
    16    108.1 MiB      0.7 MiB       for i in range(parts-1):
    17    107.6 MiB     -0.5 MiB           with open(output_files[i], "w") as OF:
    18    108.1 MiB      0.5 MiB               OF.writelines(input_file[st:ed])
    19    108.1 MiB      0.0 MiB           st = ed
    20    108.1 MiB      0.0 MiB           ed = st + p
    21                             
    22    108.1 MiB      0.0 MiB       with open(output_files[-1], "w") as OF:
    23    108.1 MiB      0.0 MiB           OF.writelines(input_file[st:])



real    0m6.115s
user    0m0.764s
sys     0m0.052s

When run without the profiler

$ time py36 Splitting\ text\ files_BAD_usingLists.py 
Enter the full path of file you want to split into smaller inputFiles: /apps/nttech/rbhanot/Downloads/test.txt
Enter the number of parts you want to split in: 3

real    0m5.916s
user    0m0.696s
sys     0m0.080s

Now using generators

from memory_profiler import profile


@profile()
def main():
    file_name = input("Enter the full path of file you want to split into smaller inputFiles: ")
    input_file = open(file_name)
    num_lines_orig = sum(1 for _ in input_file)
    input_file.seek(0)
    parts = int(input("Enter the number of parts you want to split in: "))
    output_files = ((file_name + str(i)) for i in range(1, parts + 1))
    st = 0
    p = int(num_lines_orig / parts)
    ed = p
    for i in range(parts-1):
        file = next(output_files)
        with open(file, "w") as OF:
            for _ in range(st, ed):
                OF.writelines(input_file.readline())

            st = ed
            ed = st + p
            if num_lines_orig - ed < p:
                ed = st + (num_lines_orig - ed) + p
            else:
                ed = st + p

    file = next(output_files)
    with open(file, "w") as OF:
        for _ in range(st, ed):
            OF.writelines(input_file.readline())


if __name__ == "__main__":
    main()

When run with the profiler option

$ time py36 -m memory_profiler Splitting\ text\ files_GOOD_usingGenerators.py                                                                                                                                      
Enter the full path of file you want to split into smaller inputFiles: /apps/nttech/rbhanot/Downloads/test.txt
Enter the number of parts you want to split in: 3
Filename: Splitting text files_GOOD_usingGenerators.py

Line #    Mem usage    Increment   Line Contents
================================================
     4   47.988 MiB    0.000 MiB   @profile()
     5                             def main():
     6   47.988 MiB    0.000 MiB       file_name = input("Enter the full path of file you want to split into smaller inputFiles: ")
     7   47.988 MiB    0.000 MiB       input_file = open(file_name)
     8   47.988 MiB    0.000 MiB       num_lines_orig = sum(1 for _ in input_file)
     9   47.988 MiB    0.000 MiB       input_file.seek(0)
    10   47.988 MiB    0.000 MiB       parts = int(input("Enter the number of parts you want to split in: "))
    11   48.703 MiB    0.715 MiB       output_files = ((file_name + str(i)) for i in range(1, parts + 1))
    12   47.988 MiB   -0.715 MiB       st = 0
    13   47.988 MiB    0.000 MiB       p = int(num_lines_orig / parts)
    14   47.988 MiB    0.000 MiB       ed = p
    15   48.703 MiB    0.715 MiB       for i in range(parts-1):
    16   48.703 MiB    0.000 MiB           file = next(output_files)
    17   48.703 MiB    0.000 MiB           with open(file, "w") as OF:
    18   48.703 MiB    0.000 MiB               for _ in range(st, ed):
    19   48.703 MiB    0.000 MiB                   OF.writelines(input_file.readline())
    20                             
    21   48.703 MiB    0.000 MiB               st = ed
    22   48.703 MiB    0.000 MiB               ed = st + p
    23   48.703 MiB    0.000 MiB               if num_lines_orig - ed < p:
    24   48.703 MiB    0.000 MiB                   ed = st + (num_lines_orig - ed) + p
    25                                         else:
    26   48.703 MiB    0.000 MiB                   ed = st + p
    27                             
    28   48.703 MiB    0.000 MiB       file = next(output_files)
    29   48.703 MiB    0.000 MiB       with open(file, "w") as OF:
    30   48.703 MiB    0.000 MiB           for _ in range(st, ed):
    31   48.703 MiB    0.000 MiB               OF.writelines(input_file.readline())



real    1m48.071s
user    1m13.144s
sys     0m19.652s

When run without the profiler

$ time py36  Splitting\ text\ files_GOOD_usingGenerators.py 
Enter the full path of file you want to split into smaller inputFiles: /apps/nttech/rbhanot/Downloads/test.txt
Enter the number of parts you want to split in: 3

real    0m10.429s
user    0m3.160s
sys     0m0.016s

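So, first, why does profiling slow my code down at all? And second, if profiling affects execution speed, why does that effect not show up in the version that uses lists?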

Best answer

I CPU-profiled the code with line_profiler, and this time I got my answer. The generator version takes more time because of the following lines:

    19         2      11126.0   5563.0      0.2          with open(file, "w") as OF:
    20    379886     200418.0      0.5      3.0              for _ in range(st, ed):
    21    379884    2348653.0      6.2     35.1                  OF.writelines(input_file.readline())

The list version does not slow down because:

    19         2       9419.0   4709.5      0.4          with open(output_files[i], "w") as OF:
    20         2    1654165.0 827082.5     65.1              OF.writelines(input_file[st:ed])
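The gap between 2 hits and roughly 380,000 hits can be reproduced on a small scale. The sketch below is not taken from the original post: it assumes that line-by-line profilers hook into Python's tracing mechanism (`sys.settrace`), which is my understanding of how memory_profiler's decorator works, and it simply counts how many "line" events a function generates. The helper names `count_line_events`, `copy_with_slice` and `copy_line_by_line` are made up for illustration.

```python
import sys


def count_line_events(func, *args):
    """Run func(*args) under a tracer and count the 'line' events raised inside func."""
    target_code = func.__code__
    counter = {"lines": 0}

    def local_tracer(frame, event, arg):
        # Called for every traced event in func's frame; we only count line events.
        if event == "line":
            counter["lines"] += 1
        return local_tracer

    def global_tracer(frame, event, arg):
        # Only attach the local tracer to frames that execute func itself.
        if event == "call" and frame.f_code is target_code:
            return local_tracer
        return None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return counter["lines"]


def copy_with_slice(lines, sink):
    # Mirrors the list version: one statement handles the whole chunk.
    sink.extend(lines[:])


def copy_line_by_line(lines, sink):
    # Mirrors the generator version: the loop body runs once per line.
    for line in lines:
        sink.append(line)


data = ["row\n"] * 10_000  # hypothetical stand-in for the lines of the input file
slice_events = count_line_events(copy_with_slice, data, [])
loop_events = count_line_events(copy_line_by_line, data, [])
print("line events with slicing:", slice_events)
print("line events with a loop: ", loop_events)
```

A profiler that takes a memory reading on every such event pays that cost once per executed line, so the loop-based function is charged thousands of times more often than the slice-based one.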

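With lists, each new file is written by simply taking a slice (a copy) of the list, which is effectively a single statement. In the generator version, however, each new file is filled by reading the input file line by line, so the memory profiler takes a measurement for every one of those lines, and that adds up to the extra CPU time.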
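If the goal is to keep the low memory footprint while giving a line-by-line profiler far fewer Python-level lines to trace, one option is to hand each chunk to `writelines` as a lazy iterator. This is a minimal sketch rather than the original author's code: `split_file` is a hypothetical name, and it assumes the same output naming as the question (the input path followed by 1, 2, ...), with the last part taking any leftover lines.

```python
from itertools import islice


def split_file(path, parts):
    """Split the text file at `path` into `parts` files named path + "1", path + "2", ..."""
    outputs = []
    with open(path) as source:
        # Count lines without keeping them in memory, then rewind.
        total_lines = sum(1 for _ in source)
        source.seek(0)
        chunk = total_lines // parts
        for index in range(1, parts + 1):
            out_path = path + str(index)
            # The last part takes every remaining line (islice with stop=None).
            stop = chunk if index < parts else None
            with open(out_path, "w") as out:
                # One statement per part: islice pulls lines lazily from the file
                # and writelines consumes them without a Python-level loop.
                out.writelines(islice(source, stop))
            outputs.append(out_path)
    return outputs
```

Calling `split_file("test.txt", 3)` on a 10-line file would produce `test.txt1`, `test.txt2` and `test.txt3` with 3, 3 and 4 lines. Because the per-line work happens inside `islice` and `writelines` instead of in the decorated function's own loop, a line-based profiler should see only a handful of executed lines per part; I have not measured this against the timings reported in the question.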

Regarding python - profiling code with memory_profiler increases execution time, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50627267/
