
C++ fstream writes wrong data

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 08:00:36

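Context first:

My program performs some parallel computations, which are recorded in a log file. Threads are grouped into blocks (I am using CUDA). The log file has the following format: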

#begin run
({blockIdx,threadIdx}) {thread_info}
({blockIdx,threadIdx}) {thread_info}
...
#end run

I wrote a function that is supposed to read the log file and sort the messages of each run by thread.

//------------------------------------------------------------------------------
// Comparison struct for log file sorting
//------------------------------------------------------------------------------
typedef struct
{
    bool operator()(const string &rString1 , const string &rString2)
    {
        int closeParenthesisLocation1 = rString1.find_first_of(')');
        int closeParenthesisLocation2 = rString2.find_first_of(')');
        int compResult = rString1.compare(0 , closeParenthesisLocation1 + 2 , rString2 , 0 , closeParenthesisLocation2 + 2);
        return (compResult < 0);
    }
} comp;

//------------------------------------------------------------------------------------
// Sort the log file. Lines with same prefix (blockIdx,ThreadIdx) will be grouped in file per run.
//------------------------------------------------------------------------------------
void CudaUnitTest::sortFile()
{
    comp comparison;
    deque<string> threadsPrintfs;
    ifstream inputFile(m_strInputFile);
    assert(inputFile.is_open());

    //Read whole input file and close it. Saves disk accesses.
    string strContent((std::istreambuf_iterator<char>(inputFile)), std::istreambuf_iterator<char>());
    inputFile.close();

    ofstream outputFile(m_strOutputFile);
    assert(outputFile.is_open());

    string strLine;
    int iBeginRunIdx = -10; //chosen so that the first find() in the while loop starts at index 0
    int iBeginRunNewLineOffset = 10; //offset from the start of a run to its newline char: the length of "#begin run"
    int iEndRunIdx;
    int iLastNewLineIdx;
    int iNewLineIdx;

    while((iBeginRunIdx = strContent.find("#begin run\n" , iBeginRunIdx + iBeginRunNewLineOffset)) != string::npos)
    {
        iEndRunIdx = strContent.find("#end run\n" , iBeginRunIdx + iBeginRunNewLineOffset);
        assert(iEndRunIdx != string::npos);

        iLastNewLineIdx = iBeginRunIdx + iBeginRunNewLineOffset;
        while((iNewLineIdx = strContent.find("\n" , iLastNewLineIdx + 1)) < iEndRunIdx)
        {
            strLine = strContent.substr(iLastNewLineIdx + 1 , iNewLineIdx);
            if(verifyPrefix(strLine))
                threadsPrintfs.push_back(strLine);
            iLastNewLineIdx = iNewLineIdx;
        }

        //sort last run info
        sort(threadsPrintfs.begin() , threadsPrintfs.end() , comparison);
        threadsPrintfs.push_front("#begin run\n");
        threadsPrintfs.push_back("#end run\n");

        //output it
        for(deque<string>::iterator it = threadsPrintfs.begin() ; it != threadsPrintfs.end() ; ++it)
        {
            assert(outputFile.good());
            outputFile.write(it->c_str() , it->size());
        }
        outputFile.flush();
        threadsPrintfs.clear();
    }

    outputFile.close();
}

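The problem is that the resulting file contains a lot of garbage data. For example, a 6 KB input log file produced a 192 KB output log! The output file seems to contain many repetitions of the input file. When debugging the code, however, the deque showed the correct values both before and after sorting. I think something is wrong with the ofstream write itself.

Edit: the function does not run in parallel.

Best answer

Just to show the final code. Note the change to substr, which now receives a length instead of an index.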

//------------------------------------------------------------------------------------
// Sort the log file. Lines with same prefix (blockIdx,ThreadIdx) will be grouped in file per run.
//------------------------------------------------------------------------------------
void CudaUnitTest::sortFile()
{
    comp comparison;
    deque<string> threadsPrintfs;
    ifstream inputFile(m_strInputFile);
    assert(inputFile.is_open());

    //Read whole input file and close it. Saves disk accesses.
    string strContent((std::istreambuf_iterator<char>(inputFile)), std::istreambuf_iterator<char>());
    inputFile.close();

    ofstream outputFile(m_strOutputFile);
    assert(outputFile.is_open());

    string strLine;
    int iBeginRunIdx = -10; //chosen so that the first find() in the while loop starts at index 0
    int iBeginRunNewLineOffset = 10; //offset from the start of a run to its newline char: the length of "#begin run"
    int iEndRunIdx;
    int iLastNewLineIdx;
    int iNewLineIdx;

    while((iBeginRunIdx = strContent.find("#begin run\n" , iBeginRunIdx + iBeginRunNewLineOffset)) != string::npos)
    {
        iEndRunIdx = strContent.find("#end run\n" , iBeginRunIdx + iBeginRunNewLineOffset);
        assert(iEndRunIdx != string::npos);

        iLastNewLineIdx = iBeginRunIdx + iBeginRunNewLineOffset;
        while((iNewLineIdx = strContent.find("\n" , iLastNewLineIdx + 1)) < iEndRunIdx)
        {
            //second argument of substr is a length, not an end index
            strLine = strContent.substr(iLastNewLineIdx + 1 , iNewLineIdx - iLastNewLineIdx);
            if(verifyPrefix(strLine))
                threadsPrintfs.push_back(strLine);
            iLastNewLineIdx = iNewLineIdx;
        }

        //sort last run info
        sort(threadsPrintfs.begin() , threadsPrintfs.end() , comparison);
        threadsPrintfs.push_front("#begin run\n");
        threadsPrintfs.push_back("#end run\n");

        //output it
        for(deque<string>::iterator it = threadsPrintfs.begin() ; it != threadsPrintfs.end() ; ++it)
        {
            assert(outputFile.good());
            outputFile.write(it->c_str() , it->size());
        }
        threadsPrintfs.clear();
    }

    outputFile.close();
}
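
The effect of the substr change can be seen in isolation. The following is a minimal sketch, not part of the original code: splitLines and totalSize are hypothetical helpers written for illustration, and the sample log lines are made up. It splits a string into lines twice, once passing the end index as the second argument of substr (the mistake in the question) and once passing a length, as in the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `content` into lines, keeping each trailing '\n'.
// With secondArgIsLength == false, the newline *index* is passed as the count,
// which reproduces the mistake from the question.
std::vector<std::string> splitLines(const std::string &content, bool secondArgIsLength)
{
    std::vector<std::string> lines;
    std::size_t lineStart = 0;
    std::size_t newLine;
    while ((newLine = content.find('\n', lineStart)) != std::string::npos)
    {
        assert(newLine >= lineStart);
        // substr(pos, count): `count` is a number of characters, not an end position.
        std::size_t count = secondArgIsLength ? newLine - lineStart + 1 : newLine;
        lines.push_back(content.substr(lineStart, count));
        lineStart = newLine + 1;
    }
    return lines;
}

// Hypothetical helper: total number of characters that would be written out.
std::size_t totalSize(const std::vector<std::string> &lines)
{
    std::size_t total = 0;
    for (const std::string &line : lines)
        total += line.size();
    return total;
}
```

For the 24-character input "(0,0) a\n(0,1) b\n(1,0) c\n", the length-based call yields three 8-character lines, while the index-based call yields a second element "(0,1) b\n(1,0) c" that swallows part of the following line, for 30 characters in total. Because the count grows with the position in the file, each extracted "line" drags along more and more of the text that follows it, which is consistent with an output file many times larger than the input and full of repeated content.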

For more on C++ fstream writing wrong data, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/8097719/
