
c++ - Idiomatically split a string_view

Reposted. Author: 搜寻专家. Updated: 2023-10-31 01:30:24

I read The most elegant way to iterate the words of a string and enjoyed the succinctness of the answer. Now I want to do the same thing for string_view. The problem is that stringstream cannot take a string_view:

#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>

int main() {
    using namespace std;
    string_view sentence = "And I feel fine...";
    istringstream iss(sentence); // <== error
    copy(istream_iterator<string_view>(iss),
         istream_iterator<string_view>(),
         ostream_iterator<string_view>(cout, "\n"));
}

So is there a way to do this? If not, what is the reasoning for why something like this is not idiomatic?

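Best answer

Splits by a delimiter and returns a vector<string_view>.

Designed for fast splitting of lines in a .csv file.

Tested under MSVC 2017 v15.9.6 and Intel Compiler v19.0, compiled as C++17 (which is required for string_view).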

#include <string_view>
#include <vector>

std::vector<std::string_view> Split(const std::string_view str, const char delim = ',')
{
    std::vector<std::string_view> result;

    int indexCommaToLeftOfColumn = 0;
    int indexCommaToRightOfColumn = -1;

    for (int i = 0; i < static_cast<int>(str.size()); i++)
    {
        if (str[i] == delim)
        {
            indexCommaToLeftOfColumn = indexCommaToRightOfColumn;
            indexCommaToRightOfColumn = i;
            int index = indexCommaToLeftOfColumn + 1;
            int length = indexCommaToRightOfColumn - index;

            // Bounds checking can be omitted as logically, this code can never be invoked
            // Try it: put a breakpoint here and run the unit tests.
            /*if (index + length >= static_cast<int>(str.size()))
            {
                length--;
            }
            if (length < 0)
            {
                length = 0;
            }*/

            std::string_view column(str.data() + index, length);
            result.push_back(column);
        }
    }
    const std::string_view finalColumn(str.data() + indexCommaToRightOfColumn + 1, str.size() - indexCommaToRightOfColumn - 1);
    result.push_back(finalColumn);
    return result;
}
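
The question itself asks about iterating whitespace-separated words rather than .csv columns. The same windowing idea can be adapted for that; the following is a minimal sketch, not part of the original answer. The name SplitWords and its default whitespace set are assumptions chosen for illustration. Unlike Split above, it skips empty tokens, similar to what operator>> does for std::string.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper (not from the original answer): splits on runs of
// whitespace and never produces empty tokens.
std::vector<std::string_view> SplitWords(std::string_view str,
                                         std::string_view whitespace = " \t\r\n")
{
    std::vector<std::string_view> words;

    // Position of the first character that is not whitespace, or npos.
    std::size_t begin = str.find_first_not_of(whitespace);
    while (begin != std::string_view::npos)
    {
        // end is npos for the last word; substr clamps the count to the
        // end of the view, so no special case is needed.
        const std::size_t end = str.find_first_of(whitespace, begin);
        words.push_back(str.substr(begin, end - begin));

        // Skip the whitespace run that follows; yields npos when done.
        begin = str.find_first_not_of(whitespace, end);
    }
    return words;
}
```

Calling SplitWords("And I feel fine...") yields the four views "And", "I", "feel" and "fine...", each pointing into the original character sequence.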

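Mind the lifetime: a string_view must not outlive the parent string it is a window into. If the parent string goes out of scope, what the string_view points to is no longer valid. In this particular case the API design makes it hard to get wrong, because both the input and the output are string_views, all of which are windows into the parent string. This ends up being quite efficient in terms of memory copying and CPU usage.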
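
To make the lifetime rule concrete, here is a small sketch. All names in it (SplitColumns, IsWindowInto, ToOwned) are hypothetical and not from the original answer; SplitColumns is a compact stand-in for Split built on string_view::find. It shows that the returned views alias the parent buffer, and one way to take copies when the results have to outlive the parent.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical compact splitter; keeps empty columns.
std::vector<std::string_view> SplitColumns(std::string_view str, char delim = ',')
{
    std::vector<std::string_view> result;
    std::size_t begin = 0;
    for (;;)
    {
        const std::size_t end = str.find(delim, begin);
        result.push_back(str.substr(begin, end - begin));  // count is clamped
        if (end == std::string_view::npos) break;
        begin = end + 1;
    }
    return result;
}

// True if `view` lies entirely inside the buffer owned by `parent`.
bool IsWindowInto(const std::string& parent, std::string_view view)
{
    const char* first = parent.data();
    const char* last = first + parent.size();
    const std::less_equal<const char*> le;  // well-defined for any pointers
    return le(first, view.data()) && le(view.data() + view.size(), last);
}

// Copies each view into its own std::string, for results that must
// outlive the parent string.
std::vector<std::string> ToOwned(const std::vector<std::string_view>& views)
{
    return std::vector<std::string>(views.begin(), views.end());
}

// Pitfall (left as a comment on purpose, since using the result is
// undefined behaviour):
//   auto bad = SplitColumns(std::string("A,B"));
//   // the temporary std::string is destroyed at the end of that statement,
//   // so every view in `bad` dangles.
```

With a named std::string line, every view in SplitColumns(line) satisfies IsWindowInto(line, view), so no characters are copied; the strings returned by ToOwned do not alias line and stay valid after it is destroyed.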

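Note that the only downside of using string_view is losing the implicit null termination. So use functions that support string_view, such as lexical_cast in Boost for converting strings to numbers.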
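
As an alternative to Boost that the original answer does not mention, the C++17 standard library offers std::from_chars in <charconv>, which takes a pointer range and therefore needs no null terminator. The sketch below assumes integer columns; the helper name ParseInt is hypothetical. Floating-point overloads of std::from_chars were implemented later in some standard libraries, so check your toolchain before relying on them.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses the whole view as a base-10 int.
// Returns std::nullopt on empty input, trailing garbage, or overflow.
std::optional<int> ParseInt(std::string_view text)
{
    int value = 0;
    const char* first = text.data();
    const char* last = first + text.size();

    // from_chars reads only [first, last); it never looks for '\0'.
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || ptr != last)
    {
        return std::nullopt;
    }
    return value;
}
```

For example, given std::string_view row = "12,34", ParseInt(row.substr(0, 2)) parses only the first column even though the character after it is a comma rather than a terminator. Note that std::from_chars accepts a leading minus sign but not leading whitespace or a plus sign.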

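I use this for fast parsing of .csv files. To get each line of the .csv file, I used istringstream and getline(), which is very fast (about 2GB per second on a single core, or 1,200,000 lines per second).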
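
The line-by-line loop could look roughly like the sketch below. It is an illustration under assumptions, not the code the throughput figures were measured with: the names ForEachColumn and CountColumnsPerLine are hypothetical, the splitter is a visitor-based variant that avoids allocating a vector per line, and quoted fields containing the delimiter are not handled.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical variant of the splitter: calls visit(column) for every
// column instead of collecting the views into a vector.
template <typename Visitor>
void ForEachColumn(std::string_view line, char delim, Visitor&& visit)
{
    std::size_t begin = 0;
    for (;;)
    {
        const std::size_t end = line.find(delim, begin);
        visit(line.substr(begin, end - begin));  // count is clamped at the end
        if (end == std::string_view::npos) break;
        begin = end + 1;
    }
}

// Reads csvText line by line and records how many columns each line has.
std::vector<std::size_t> CountColumnsPerLine(const std::string& csvText, char delim = ',')
{
    std::vector<std::size_t> counts;
    std::istringstream input(csvText);

    // The buffer is reused for every line, so any view into it is valid
    // only until the next std::getline call overwrites it.
    std::string line;
    while (std::getline(input, line))
    {
        std::size_t columns = 0;
        ForEachColumn(line, delim, [&columns](std::string_view) { ++columns; });
        counts.push_back(columns);
    }
    return counts;
}
```

Because the views are consumed inside the loop body, before the next std::getline call, the lifetime rule described above is respected without copying any column.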

Unit tests. Google Test is used for testing (I installed it using vcpkg).

// Google Test integrates into VS2017 if ReSharper is installed.
#include "gtest/gtest.h" // Can install using vcpkg
// In main(), call:
// ::testing::InitGoogleTest(&argc, argv); return RUN_ALL_TESTS();

TEST(Strings, Split)
{
    {
        const std::string str = "A,B,C";
        auto tokens = Split(str, ',');
        EXPECT_TRUE(tokens.size() == 3);
        EXPECT_TRUE(tokens[0] == "A");
        EXPECT_TRUE(tokens[1] == "B");
        EXPECT_TRUE(tokens[2] == "C");
    }
    {
        const std::string str = ",B,C";
        auto tokens = Split(str, ',');
        EXPECT_TRUE(tokens.size() == 3);
        EXPECT_TRUE(tokens[0] == "");
        EXPECT_TRUE(tokens[1] == "B");
        EXPECT_TRUE(tokens[2] == "C");
    }
    {
        const std::string str = "A,B,";
        auto tokens = Split(str, ',');
        EXPECT_TRUE(tokens.size() == 3);
        EXPECT_TRUE(tokens[0] == "A");
        EXPECT_TRUE(tokens[1] == "B");
        EXPECT_TRUE(tokens[2] == "");
    }
    {
        const std::string str = "";
        auto tokens = Split(str, ',');
        EXPECT_TRUE(tokens.size() == 1);
        EXPECT_TRUE(tokens[0] == "");
    }
    {
        const std::string str = "A";
        auto tokens = Split(str, ',');
        EXPECT_TRUE(tokens.size() == 1);
        EXPECT_TRUE(tokens[0] == "A");
    }
    {
        const std::string str = ",";
        auto tokens = Split(str, ',');
        EXPECT_TRUE(tokens.size() == 2);
        EXPECT_TRUE(tokens[0] == "");
        EXPECT_TRUE(tokens[1] == "");
    }
    {
        const std::string str = ",,";
        auto tokens = Split(str, ',');
        EXPECT_TRUE(tokens.size() == 3);
        EXPECT_TRUE(tokens[0] == "");
        EXPECT_TRUE(tokens[1] == "");
        EXPECT_TRUE(tokens[2] == "");
    }
    {
        const std::string str = "A,";
        auto tokens = Split(str, ',');
        EXPECT_TRUE(tokens.size() == 2);
        EXPECT_TRUE(tokens[0] == "A");
        EXPECT_TRUE(tokens[1] == "");
    }
    {
        const std::string str = ",B";
        auto tokens = Split(str, ',');
        EXPECT_TRUE(tokens.size() == 2);
        EXPECT_TRUE(tokens[0] == "");
        EXPECT_TRUE(tokens[1] == "B");
    }
}

Regarding c++ - Idiomatically split a string_view, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48012539/
