c++ - Finding duplicates in a JSON file after parsing it with Boost

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 07:48:38

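How can I find duplicates in a JSON file after parsing it as in the code below? I want to count the duplicates in the data, where two entries are duplicates if their first name, last name, and email address all match.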

The JSON file is fairly large, so I won't paste all of it here. This is a snippet of it:

[  
{
"firstName":"Cletus",
"lastName":"Defosses",
"emailAddress":"ea4ad81f-4111-4d8d-8738-ecf857bba992.Defosses@somedomain.org"
},
{
"firstName":"Sherron",
"lastName":"Siverd",
"emailAddress":"51c985c5-381d-4d0e-b5ee-83005f39ce17.Siverd@somedomain.org"
},
{
"firstName":"Garry",
"lastName":"Eirls",
"emailAddress":"cc43c2da-d12c-467f-9318-beb3379f6509.Eirls@somedomain.org"
}]

This is the main.cpp file:

#include <iostream>
#include <string>

#include "Customer.h"
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>
#include <boost/foreach.hpp>

using namespace std;

int main()
{
    int numOfCustomers;

    // parse the JSON file
    boost::property_tree::ptree file;
    boost::property_tree::read_json("customers.json", file);

    cout << "Reading file..." << endl;

    // number of top-level array elements
    numOfCustomers = file.size();

    // iterate over each top-level entry
    BOOST_FOREACH(boost::property_tree::ptree::value_type const& rowPair, file.get_child(""))
    {
        // rowPair.first == "" and rowPair.second is the subtree with names and emails

        // iterate over the fields of this entry
        BOOST_FOREACH(boost::property_tree::ptree::value_type const& itemPair, rowPair.second)
        {
            // e.g. itemPair.first == "firstName" or "lastName"
            cout << itemPair.first << ": ";
            // itemPair.second holds the actual name or email value
            cout << itemPair.second.get_value<std::string>() << endl;
        }
        cout << endl;
    }
    cout << endl;

    return 0;
}

The Customer class is just a plain class.

class Customer
{
private:
    std::string m_firstName;
    std::string m_lastName;
    std::string m_emailAddress;

public:
    std::string getFirstName();
    void setFirstName(std::string firstName);

    std::string getLastName();
    void setLastName(std::string lastName);

    std::string getEmailAddress();
    void setEmailAddress(std::string emailAddress);
};

Best answer

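You would typically insert the customer objects (or keys) into a std::set or std::map and define a total ordering on them, so that duplicates are discovered on insertion.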

Define a key function and a comparison object:

boost::tuple<string const&, string const&, string const&> key_of(Customer const& c) {
    return boost::tie(c.getFirstName(), c.getLastName(), c.getEmailAddress());
}

struct by_key {
    bool operator()(Customer const& a, Customer const& b) const {
        return key_of(a) < key_of(b);
    }
};
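
If you would rather not depend on Boost.Tuple for the key, the same comparison can be sketched with std::tie from the standard library (C++11). This is a swapped-in alternative, not the code from the answer. The Customer struct below is a simplified, hypothetical stand-in with public members instead of getters, used only to keep the sketch short.

```cpp
#include <string>
#include <tuple>

// Hypothetical stand-in for the Customer class: public members instead of
// getters, purely to keep this sketch self-contained.
struct Customer {
    std::string firstName;
    std::string lastName;
    std::string emailAddress;
};

// Build a tuple of references to the three fields that define identity.
inline std::tuple<std::string const&, std::string const&, std::string const&>
key_of(Customer const& c) {
    return std::tie(c.firstName, c.lastName, c.emailAddress);
}

// Lexicographic comparison: first name, then last name, then email address.
struct by_key {
    bool operator()(Customer const& a, Customer const& b) const {
        return key_of(a) < key_of(b);
    }
};
```

Two customers count as the same key exactly when neither compares less than the other, which is the equivalence std::set uses to reject a second insertion.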

Now you can simply insert the objects into a set<Customer, by_key>:

set<Customer, by_key> unique;

// iterate over each top-level array element
BOOST_FOREACH(boost::property_tree::ptree::value_type const& rowPair, file.get_child(""))
{
    Customer current;
    current.setFirstName   (rowPair.second.get("firstName",    "?"));
    current.setLastName    (rowPair.second.get("lastName",     "?"));
    current.setEmailAddress(rowPair.second.get("emailAddress", "?"));

    if (unique.insert(current).second)
        cout << current << "\n";
    else
        cout << "(duplicate skipped)\n";
}
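
The answer also mentions std::map. Since the question asks for a count, here is a minimal sketch of that variant, which maps each (first name, last name, email address) key to the number of times it occurs. The Record struct and the count_duplicates function are hypothetical names introduced for illustration, and the records are assumed to have been extracted from the ptree already.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical plain record holding the three fields read from each JSON object.
struct Record {
    std::string firstName;
    std::string lastName;
    std::string emailAddress;
};

// Returns how many records repeat a key that was already seen earlier.
inline std::size_t count_duplicates(std::vector<Record> const& records) {
    using Key = std::tuple<std::string, std::string, std::string>;
    std::map<Key, std::size_t> occurrences;

    std::size_t duplicates = 0;
    for (Record const& r : records) {
        // operator[] value-initialises the counter to 0 on first access.
        std::size_t& seen = occurrences[Key(r.firstName, r.lastName, r.emailAddress)];
        if (seen > 0)
            ++duplicates;  // this key has been seen before
        ++seen;
    }
    return duplicates;
}
```

The result equals the total number of records minus the number of distinct keys, and the map additionally records which keys were repeated and how often.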

Full demo

I've duplicated one entry in your sample JSON, and you can see it Live On Coliru:

#include <iostream>
#include <string>
#include <set>

#include "Customer.h"
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>
#include <boost/foreach.hpp>
#include <boost/tuple/tuple_comparison.hpp>

using namespace std;

namespace {

    boost::tuple<string const&, string const&, string const&> key_of(Customer const& c) {
        return boost::tie(c.getFirstName(), c.getLastName(), c.getEmailAddress());
    }

    struct by_key {
        bool operator()(Customer const& a, Customer const& b) const {
            return key_of(a) < key_of(b);
        }
    };

    inline ostream& operator<<(ostream& os, Customer const& c) {
        return os << "{ '"
            << c.getFirstName() << "', '"
            << c.getLastName() << "', '"
            << c.getEmailAddress() << " }";
    }
}

int main()
{
    // parse the JSON file
    boost::property_tree::ptree file;
    boost::property_tree::read_json("customers.json", file);

    cout << "Reading file..." << endl;

    set<Customer, by_key> unique;

    // iterate over each top-level array element
    BOOST_FOREACH(boost::property_tree::ptree::value_type const& rowPair, file.get_child(""))
    {
        Customer current;
        current.setFirstName   (rowPair.second.get("firstName",    "?"));
        current.setLastName    (rowPair.second.get("lastName",     "?"));
        current.setEmailAddress(rowPair.second.get("emailAddress", "?"));

        if (unique.insert(current).second)
            cout << current << "\n";
        else
            cout << "(duplicate skipped)\n";
    }

    // every entry that failed to insert was a duplicate
    cout << "\n" << (file.size() - unique.size()) << " duplicates were found\n";
}

Prints:

Reading file...
{ 'Sherron', 'Siverd', '51c985c5-381d-4d0e-b5ee-83005f39ce17.Siverd@somedomain.org }
{ 'Cletus', 'Defosses', 'ea4ad81f-4111-4d8d-8738-ecf857bba992.Defosses@somedomain.org }
(duplicate skipped)
{ 'Garry', 'Eirls', 'cc43c2da-d12c-467f-9318-beb3379f6509.Eirls@somedomain.org }

1 duplicates were found

NOTE: I've made the getters const and adjusted them to be less wasteful by returning const&:

std::string const& getFirstName() const    { return m_firstName; }
std::string const& getLastName() const     { return m_lastName; }
std::string const& getEmailAddress() const { return m_emailAddress; }
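
The reference-returning getters matter for more than efficiency: boost::tie stores references to its arguments, so it needs lvalues, and getters that return by value would hand it temporaries it cannot bind to. For completeness, here is a sketch of what the adjusted Customer.h could look like. The setter bodies and the include guard are assumptions, since the original post shows only the setter declarations.

```cpp
#ifndef CUSTOMER_H
#define CUSTOMER_H

#include <string>

class Customer
{
private:
    std::string m_firstName;
    std::string m_lastName;
    std::string m_emailAddress;

public:
    // const getters returning references: no copies, callable on const objects
    std::string const& getFirstName() const    { return m_firstName; }
    std::string const& getLastName() const     { return m_lastName; }
    std::string const& getEmailAddress() const { return m_emailAddress; }

    // setter bodies are assumed; the post only declares them
    void setFirstName(std::string firstName)       { m_firstName = firstName; }
    void setLastName(std::string lastName)         { m_lastName = lastName; }
    void setEmailAddress(std::string emailAddress) { m_emailAddress = emailAddress; }
};

#endif // CUSTOMER_H
```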

Bonus

Here is an equivalent program in 26 lines of C++14 code:

Live On Coliru

Regarding "c++ - Finding duplicates in a JSON file after parsing it with Boost", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/33445295/
