c++ - std::vector differences

Reposted by: IT老高. Updated: 2023-10-28 22:20:47

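How can I determine the differences between two vectors?

I have vector<int> v1 and vector<int> v2.

What I am looking for is a vector<int> vDifferences that contains only the elements that are in v1 alone or in v2 alone.

Is there a standard way to do this?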

Best answer

Here is a complete and correct answer. Before the set_symmetric_difference algorithm can be used, the source ranges must be sorted:

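  #include <algorithm> // sort, set_symmetric_difference
  #include <iterator>  // back_inserter
  #include <vector>

  using namespace std; // For brevity, don't do this in your own code...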

  vector<int> v1;
  vector<int> v2;

  // ... Populate v1 and v2

  // For the set_symmetric_difference algorithm to work, 
  // the source ranges must be ordered!    
  vector<int> sortedV1(v1);
  vector<int> sortedV2(v2);

  sort(sortedV1.begin(),sortedV1.end());
  sort(sortedV2.begin(),sortedV2.end());

  // Now that we have sorted ranges (i.e., containers), find the differences    
  vector<int> vDifferences;

  set_symmetric_difference(
    sortedV1.begin(),
    sortedV1.end(),
    sortedV2.begin(),
    sortedV2.end(),
    back_inserter(vDifferences));

  // ... do something with the differences
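
To make the behaviour concrete, here is a minimal sketch that wraps the same steps in a function. The name symmetric_difference_sorted is hypothetical and not part of the original answer. The parameters are taken by value, which plays the role of the sortedV1 and sortedV2 copies above.

```cpp
#include <algorithm> // std::sort, std::set_symmetric_difference
#include <iterator>  // std::back_inserter
#include <vector>

// Hypothetical helper (not from the original answer): returns the sorted
// symmetric difference of two vectors. Taking the arguments by value gives
// us private copies to sort, so the caller's vectors are left untouched.
std::vector<int> symmetric_difference_sorted(std::vector<int> a,
                                             std::vector<int> b)
{
  // The algorithm requires both input ranges to be sorted
  std::sort(a.begin(), a.end());
  std::sort(b.begin(), b.end());

  std::vector<int> result;
  std::set_symmetric_difference(a.begin(), a.end(),
                                b.begin(), b.end(),
                                std::back_inserter(result));
  return result;
}
```

For example, calling it with {5, 1, 3, 3} and {3, 4, 1} yields {3, 4, 5}. Duplicates are counted: a value that occurs m times in one range and n times in the other appears |m - n| times in the output, which is why one 3 survives.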

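It should be noted that sorting is an expensive operation (i.e., O(n log n) for common STL implementations). Especially when one or both of the containers are very large (i.e., millions of integers or more), a different algorithm that uses hash tables may be preferable on the grounds of algorithmic complexity. Here is a high-level description of that algorithm: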

1. Load each container into a hash table.

2. If the two containers differ in size, the hash table corresponding to the smaller one will be used for traversal in Step 3. Otherwise, the first of the two hash tables will be used.

3. Traverse the hash table chosen in Step 2, checking to see if each item is present in both hash tables. If it is, remove it from both of them. The smaller hash table is preferred for traversal because hash table lookups are on average O(1) regardless of container size. Therefore, the time to traverse is a linear function of n (i.e., O(n)), where n is the size of the hash table being traversed.

4. Take the union of the remaining items in the hash tables and store the result in a difference container.

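C++11 gives us some of what this solution needs by standardizing the unordered_multiset container. I have also used the new meaning of the auto keyword for explicit initializations, to make the following hash table-based solution more concise: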

#include <unordered_set> // unordered_multiset
#include <vector>

using namespace std; // For brevity, don't do this in your own code...

// The remove_common_items function template removes some and / or all of the
// items that appear in both of the multisets that are passed to it. It uses the
// items in the first multiset as the criteria for the multi-presence test.
template <typename tVal>
void remove_common_items(unordered_multiset<tVal> &ms1, 
                         unordered_multiset<tVal> &ms2)
{
  // Go through the first hash table
  for (auto cims1=ms1.cbegin();cims1!=ms1.cend();)
  {
    // Find the current item in the second hash table
    auto cims2=ms2.find(*cims1);

    // Is it present?
    if (cims2!=ms2.end())
    {
      // If so, remove it from both hash tables
      cims1=ms1.erase(cims1);
      ms2.erase(cims2);
    }
    else // If not
      ++cims1; // Move on to the next item
  }
}

int main()
{
  vector<int> v1;
  vector<int> v2;

  // ... Populate v1 and v2

  // Create two hash tables that contain the values
  // from their respective initial containers    
  unordered_multiset<int> ms1(v1.begin(),v1.end());
  unordered_multiset<int> ms2(v2.begin(),v2.end());

  // Remove common items from both containers, traversing the smaller one
  if (v1.size()<=v2.size())
    remove_common_items(ms1,ms2);
  else
    remove_common_items(ms2,ms1);

  // Create a vector of the union of the remaining items
  vector<int> vDifferences(ms1.begin(),ms1.end());

  vDifferences.insert(vDifferences.end(),ms2.begin(),ms2.end());

  // ... do something with the differences
}

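To determine which solution is better for a particular situation, profiling both algorithms would be a wise course of action. Although the hash table-based solution is O(n), it requires more code and does more work for each duplicate found (i.e., the hash table removals). It also (sadly) uses a custom differencing function rather than a standard STL algorithm.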
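
As a starting point for such a comparison, here is a rough sketch of a wall-clock timer. The name elapsed_ms is hypothetical, and a single timed run is only a crude stand-in for real profiling, so treat the numbers as indicative at best.

```cpp
#include <chrono>

// Hypothetical helper (not from the original answer): runs the callable once
// and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double elapsed_ms(F &&work)
{
  // steady_clock is monotonic, so the interval cannot go negative
  auto start = std::chrono::steady_clock::now();
  work();
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Each solution can be wrapped in a lambda, for instance elapsed_ms([&] { remove_common_items(ms1, ms2); }). For a fair comparison, the timed region should probably include the setup each approach needs (copying and sorting the vectors in one case, building the hash tables in the other), and the measurement should be repeated on representative data.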

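It should also be noted that both solutions present the differences in an order that is quite likely to be completely different from the order in which the elements appeared in the original containers. This can be addressed with a variation of the hash table solution. Here is a high-level description (it differs from the previous solution only in Step 4):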

1. Load each container into a hash table.

2. If the two containers differ in size, the smaller hash table will be used for traversal in Step 3. Otherwise, the first of the two will be used.

3. Traverse the hash table chosen in Step 2, checking to see if each item is present in both hash tables. If it is, remove it from both of them.

4. To form the difference container, traverse the original containers in order (i.e., the first container before the second). Look up each item from each container in its respective hash table. If it is found, the item is added to the difference container and removed from its hash table. Items not present in the respective hash tables are skipped. Thus, only the items that are present in the hash tables will wind up in the difference container, and their order of appearance will remain the same as it was in the original containers, because those containers dictate the order of the final traversal.
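
The original answer gives no code for this variant. The following is a minimal sketch of the four steps under my own assumptions: the function name ordered_difference and the lambda take_survivors are hypothetical, and the element type is fixed to int for brevity.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the original answer): symmetric difference
// of v1 and v2 that keeps the surviving elements in their original order,
// with those from v1 placed before those from v2.
std::vector<int> ordered_difference(const std::vector<int> &v1,
                                    const std::vector<int> &v2)
{
  // Step 1: load each container into a hash table
  std::unordered_multiset<int> ms1(v1.begin(), v1.end());
  std::unordered_multiset<int> ms2(v2.begin(), v2.end());

  // Step 2: pick the smaller table for traversal (the first one on a tie)
  const bool firstIsSmaller = ms1.size() <= ms2.size();
  auto &smaller = firstIsSmaller ? ms1 : ms2;
  auto &larger = firstIsSmaller ? ms2 : ms1;

  // Step 3: erase one occurrence from both tables for every common item
  for (auto it = smaller.begin(); it != smaller.end();)
  {
    auto match = larger.find(*it);
    if (match != larger.end())
    {
      it = smaller.erase(it);
      larger.erase(match);
    }
    else
      ++it;
  }

  // Step 4: walk the original containers in order; an item is kept only if
  // it is still present in its own hash table, and is then erased from it
  std::vector<int> differences;
  differences.reserve(ms1.size() + ms2.size());
  auto take_survivors = [&differences](const std::vector<int> &source,
                                       std::unordered_multiset<int> &survivors)
  {
    for (int value : source)
    {
      auto found = survivors.find(value);
      if (found != survivors.end())
      {
        differences.push_back(value);
        survivors.erase(found);
      }
    }
  };
  take_survivors(v1, ms1);
  take_survivors(v2, ms2);
  return differences;
}
```

With v1 = {3, 1, 3, 2} and v2 = {2, 3, 4}, this returns {3, 1, 4}. When a value has surplus duplicates, this sketch keeps the earliest occurrences from the original container. That is a design choice of the sketch, not something the description above specifies.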

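In order to maintain the original order, Step 4 has become more expensive than in the previous solution, especially if the number of items removed is high. This is because:

1. All items will be tested a second time, via a presence test in their respective hash tables, to determine whether they are eligible to appear in the difference container.
2. As the difference container is formed, the hash tables will have their remaining items removed, as part of the test described in item 1.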

Regarding c++ - std::vector differences, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/7771796/
