C++ - General strategies for memory/speed problems


Reposted by: 太空狗 | Updated: 2023-10-29 23:53:52

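I have a C++ program that runs through about 200 ASCII files, does some basic data processing, and outputs a single ASCII file containing (basically) all of the data.

The program runs very quickly at first, then slows down drastically partway through, perhaps slows a little more gradually, and then proceeds at a fairly slow pace through the rest. I.e., it gets through the first ~80 files in about 5 seconds, and all ~200 files in about 50 seconds. Each file is basically the same.

I'm looking for suggestions on how to track down the problem or memory leak.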

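Some more details: at first I would fopen(FILE *outputFile, "w") at the beginning of my program and fclose() at the end. The first ~40 files would take about 4 seconds; then about 1.5 minutes for all ~200 files.

I thought the output file might be clogging the memory, so I changed the code to fopen(outputFile, "a") on every iteration (i.e. each time I open a new input file) and fclose() each time I close the input file. This improved the total run time to about 50 seconds, as mentioned above.

It seems strange that this 'fix' would help so noticeably, but not entirely.

Also, I'm not dynamically allocating any memory (no calls to 'new' or 'delete' or 'free' or whatever), so I'm not even sure how I could have a memory leak.

Any help would be appreciated! Thanks!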

Code:

vector<string> dirCon;
// Uses boost::filesystem to store every file in directory
bool retVal = FileSystem::getDirectoryContents(HOME_DIR+HISTORY_DIR, &dirCon, 2);

int counter = 0;
for(int i = 0; i < dirCon.size(); i++) { 
    // Create output file
    FILE *outFile;
    string outputFileName = HOME_DIR ... ;
    // open file as append "a"
    bool ifRet = initFile(outFile, outputFileName.c_str(), "a");
    if(!ifRet) {
        fprintf(stderr, "ERROR ... ");
        return false;
    }       

    // Get the topmost directory name
    size_t loc = dirCon.at(i).find_last_of("/");
    string dirName = dirCon.at(i).substr(loc+1, (dirCon.at(i).size()-(loc+1)));

    // Get the top directory content
    vector<string> subDirCon;
    bool subRetVal = FileSystem::getDirectoryContents(dirCon.at(i), &subDirCon);
    if(!subRetVal) { fprintf(stderr, "ERROR\n"); return false; }

    // Go through each file in directory, look for the one that matches
    for(int j = 0; j < subDirCon.size(); j++) {

        // Get filename
        loc = subDirCon.at(j).find_last_of("/");
        string fileName = subDirCon.at(j).substr(loc+1, (subDirCon.at(j).size()-(loc+1)));

        // If filename matches desired station, process and store
        if( fileName == string(dirName ...) ) {
            // Open File
            FILE *inFile;
            if(!initFile(inFile, subDirCon.at(j).c_str(), "r")) { 
                fprintf(stderr, "ERROR: ... !\n");
                break;
            }

            // Parse file line-by-line
            char str[TB_CHARLIMIT_LARGE];
            const char *delim = ",";
            while(true) {
                vector<string> splitString;
                fgets(str, TB_CHARLIMIT_LARGE, inFile);

                if(feof(inFile)) { break; }     // break at end of file
                removeEndLine(str);

                // If non-comment line, parse
                if(str[0] != COMCHAR){
                    string strString(str);
                    // remove end line char
                    strString.erase(std::remove(strString.begin(), strString.end(), '\n'), strString.end());
                    strcpy(str, strString.c_str());

                    char *temp = strtok(str,delim);
                    char *lastTemp;
                    while(temp != NULL) {
                        splitString.push_back(string(temp));
                        temp = strtok(NULL,delim);
                    }
                    if(splitString.size() > 0) { 
                        DateTime dtTemp(splitString.at(0));  
                        goodLines++;

                        /*  ... process splitString, use dtTemp ... */

                        // Output to file
                        fprintf(outFile, "%s\n", strFromStrVec(splitString).c_str());
                    }
                }
            } //while
            fclose(inFile); 
        }
    } //j
    cout << "GoodLines = " << goodLines << endl;

    fclose(outFile);
} // i

bool getDirectoryContents(const string dirName, vector<string> *conts) {
    path p(dirName);
    try {
        // Confirm Exists
        if(!exists(p)) {
            fprintf(stderr, "ERROR: '%s' does not exist!\n", dirName.c_str());
            return false;
        }

        // Confirm Directory
        if(!is_directory(p)) {
            return false;
        }

        conts->clear();

        // Store paths to sort later
        typedef vector<path> vec;
        vec v;

        copy(directory_iterator(p), directory_iterator(), back_inserter(v));

        sort(v.begin(), v.end()); 

        for(vec::const_iterator it(v.begin()), it_end(v.end()); it != it_end; ++it) {
            conts->push_back(it->string());
        }


    } catch(const filesystem_error& ex) {
        fprintf(stderr, "ERROR: '%s'!\n", ex.what());
        return false;
    }   

    return true;
}

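Best answer

Without more information to go on, I'd guess that what you're dealing with is a Schlemiel the Painter's algorithm (see the original essay and the Wikipedia entry). These are incredibly easy to fall into when doing string processing. Let me give you an example.

I want to read every line in a file, process each line somehow, and run it through some intermediate processing. Then I want to gather up the results and maybe write them back to disk. Here's one way to do that. I make a single huge error that is easy to miss: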

// proc.cpp
class Foo
{
  public:
  std::string chew_on(std::string const& line_to_chew_on) {...}
  ...
};

Foo processor;
std::string buffer;

// Read/process
FILE *input=fopen(..., "r");
char linebuffer[1000+1];
for (char *line=fgets(linebuffer, 1000, input); line; 
     line=fgets(linebuffer, 1000, input) ) 
{
    buffer=buffer+processor.chew_on(line);  //(1)
}
fclose(input);

// Write
FILE *output=fopen(...,"w");
fwrite(buffer.data(), 1, buffer.size(), output);
fclose(output);

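The problem here, which is easy to miss at first glance, is that each time line (1) runs, the entire contents of buffer are copied. If there are 1,000 lines of 100 characters each, you end up spending time on 100+200+300+400+...+100,000 = 50,050,000 byte copies to run this. Increase to 10,000 lines? 5,000,500,000. That paint can is getting further and further away.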
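To make that arithmetic concrete, here is a small sketch that totals the bytes written when the buffer is rebuilt in full after every line. The helper name `bytes_copied_by_full_rebuild` is made up for illustration, and it assumes every line has exactly the same length.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the original answer): total bytes copied
// when the whole buffer is rebuilt after each of `lines` lines, assuming
// every line is exactly `line_len` bytes long.
std::uint64_t bytes_copied_by_full_rebuild(std::uint64_t lines,
                                           std::uint64_t line_len) {
    std::uint64_t total = 0;
    std::uint64_t buffer_size = 0;
    for (std::uint64_t i = 0; i < lines; ++i) {
        buffer_size += line_len;  // size of the buffer once this line is added
        total += buffer_size;     // that many bytes go into the freshly built string
    }
    return total;
}
```

For 1,000 lines of 100 bytes this returns 50,050,000, and for 10,000 lines it returns 5,000,500,000, which agrees with the closed form line_len * n * (n + 1) / 2. Ten times the input costs roughly a hundred times the copying.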

In this particular example, the fix is easy. Line (1) should read:

    buffer.append(processor.chew_on(line)); // (2)

or equivalently (thanks to Matthieu M.):

    buffer += processor.chew_on(line);

This helps because (usually) std::string doesn't need to make a full copy of buffer to perform the append, whereas in (1) we are insisting that a copy be made.
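The two forms can be put side by side as a rough sketch. The function names `join_with_plus` and `join_with_append` are hypothetical and exist only for this comparison; both produce the same string, and only the amount of copying differs.

```cpp
#include <string>
#include <vector>

// Form (1): operator+ builds a brand-new string that holds a copy of
// `buffer` followed by `piece`; the result is then assigned back.
std::string join_with_plus(const std::vector<std::string>& pieces) {
    std::string buffer;
    for (const std::string& piece : pieces) {
        buffer = buffer + piece;
    }
    return buffer;
}

// Form (2): operator+= appends in place. The existing contents are only
// copied when the string has to grow its storage, which typical
// implementations do geometrically, so the total work stays roughly linear.
std::string join_with_append(const std::vector<std::string>& pieces) {
    std::string buffer;
    for (const std::string& piece : pieces) {
        buffer += piece;
    }
    return buffer;
}
```

If the final size is known or can be estimated up front, calling buffer.reserve() before the loop can avoid most of the remaining reallocations as well.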

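More generally, suppose (a) the processing you're doing keeps state, (b) you reference all or most of that state often, and (c) that state grows over time. Then there is a fair to good chance that you've written a Θ(n²) time algorithm, which will exhibit exactly the type of behavior you're describing.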
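One low-tech way to check for this kind of growth is to time each unit of work (for example, each input file) separately and see whether the per-step cost climbs as the run progresses. Below is a minimal sketch, assuming the work can be wrapped in a callable; `time_each_step` is a hypothetical helper and does not come from the question's code.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: calls step(i) for i in [0, n) and records how long
// each individual call took, in microseconds.
std::vector<double> time_each_step(
        std::size_t n, const std::function<void(std::size_t)>& step) {
    using clock = std::chrono::steady_clock;
    std::vector<double> micros;
    micros.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const auto start = clock::now();
        step(i);
        const auto stop = clock::now();
        micros.push_back(
            std::chrono::duration<double, std::micro>(stop - start).count());
    }
    return micros;
}
```

If each step handles similar input, the recorded times should stay roughly flat. A steady upward trend suggests that some state touched on every step is growing.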

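Edit

Of course, the stock answer to "why is my code slow?" is "profile it." There are a number of tools and techniques for doing this. Some of the options include:

callgrind/kcachegrind (as suggested by David Schwartz)

Random Pausing (as suggested by Mike Dunlavey)

The GNU profiler, gprof

The GNU test coverage analyzer, gcov

oprofile

They all have their strengths. "Random Pausing" is probably the simplest to implement, though the results can be hard to interpret. gprof and gcov are basically useless on multi-threaded programs. Callgrind is thorough but slow, and can sometimes play strange tricks on multi-threaded programs. oprofile is fast and plays nicely with multi-threaded programs, but it can be difficult to use and can miss things.

However, if you're trying to profile a single-threaded program and are developing with the GNU toolchain, gprof can be a wonderful option. Take my proc.cpp, above. For purposes of demonstration, I'm going to profile an unoptimized run. First, I rebuild my program for profiling (adding -pg to both the compile and link steps):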

$ g++ -O0 -g -pg -o proc.o -c proc.cpp
$ g++ -pg -o proc proc.o

I run the program once to create the profiling information:

$ ./proc

In addition to doing whatever it would normally do, this run creates a file called 'gmon.out' in the current directory. Now I run gprof to interpret the result:

$ gprof ./proc
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
100.50      0.01     0.01   234937     0.00     0.00  std::basic_string<...> std::operator+<...>(...)
  0.00      0.01     0.00   234937     0.00     0.00  Foo::chew_on(std::string const&)
  0.00      0.01     0.00        1     0.00    10.05  do_processing(std::string const&, std::string const&)
...

Yes indeed, 100.5% of my program's time is spent in std::string operator+. Well, ok, up to some sampling error. (I'm running this in a VM ... it seems that the timing being captured by gprof is off. My program took much longer than 0.01 cumulative seconds to run...)

For my very simple example, gcov is a little less instructive. But here's what it happens to show. First, compile and run for gcov:

$ g++ -O0 -fprofile-arcs -ftest-coverage -o proc proc.cpp
$ ./proc
$ gcov ./proc
...

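This creates a bunch of files ending in .gcno, .gcda, and .gcov in the current directory. The .gcov files tell us how many times each line of code was executed during the run. So, in my example, my proc.cpp.gcov ends up looking like this: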

        -:    0:Source:proc.cpp
        -:    0:Graph:proc.gcno
        -:    0:Data:proc.gcda
        -:    0:Runs:1
        -:    0:Programs:1
        -:    1:#include
        -:    2:#include
        -:    4:class Foo
        -:    5:{
        -:    6:  public:
   234937:    7:  std::string chew_on(std::string const& line_to_chew_on) {return line_to_chew_on;}
        -:    8:};
        -:    9:
        -:   10:
        -:   11:
        1:   12:int do_processing(std::string const& infile, std::string const& outfile)
        -:   13:{
        -:   14:  Foo processor;
        2:   15:  std::string buffer;
        -:   16:
        -:   17:  // Read/process
        1:   18:  FILE *input=fopen(infile.c_str(), "r");
        -:   19:  char linebuffer[1000+1];
   234938:   20:  for (char *line=fgets(linebuffer, 1000, input); line;
        -:   21:       line=fgets(linebuffer, 1000, input) )
        -:   22:    {
   234937:   23:      buffer=buffer+processor.chew_on(line);  //(1)
        -:   24:    }
        1:   25:  fclose(input);
        -:   26:
        -:   27:  // Write
        1:   28:  FILE *output=fopen(outfile.c_str(),"w");
        1:   29:  fwrite(buffer.data(), 1, buffer.size(), output);
        1:   30:  fclose(output);
        1:   31:}
        -:   32:
        1:   33:int main()
        -:   34:{
        1:   35:  do_processing("/usr/share/dict/words","outfile");
        -:   36:}

So from this, I'm going to have to conclude that the std::string::operator+ at line 23 (which is executed 234,937 times) is a potential cause of my program's slowness.

As an aside, callgrind/kcachegrind work with multithreaded programs, and can provide much, much more information. For this program I run:

g++ -O0 -o proc proc.cpp
valgrind --tool=callgrind ./proc  # this takes forever to run
kcachegrind callgrind.out.*

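I get the following output, which shows that what's really eating up my cycles is lots and lots of memory copies (99.4% of execution time is spent in __memcpy_ssse3_back), and I can see that they all happen somewhere below line 23 in my source:

[kcachegrind screenshot]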

Regarding "C++ - General strategies for memory/speed problems", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/8860603/
