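I want to parse an XML file in chunks so that it does not exhaust memory, and I want to parse it in a column-store fashion, i.e. key1:value1, key2:value2, key3:value3, and so on.
Currently I am reading the file like this: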
#include <fstream>
#include <map>
#include <string>
using namespace std;

// Assumed definition; the original post does not show the User type.
struct User
{
    string Id;
    string DisplayName;
};

string parseFieldFromLine(const string &line, const string &key)
{
    // We're looking for a thing that looks like:
    //     [key]="[value]"
    // as part of a larger string.
    // We are given [key], and want to return [value].

    // Find the start of the pattern
    string keyPattern = key + "=\"";
    size_t idx = line.find(keyPattern);

    // No match
    if (idx == string::npos)
        return "";

    // Find the closing quote at the end of the pattern
    size_t start = idx + keyPattern.size();
    size_t end = line.find('"', start);

    // Unterminated value: stop here instead of reading past the end of the line
    if (end == string::npos)
        return "";

    // Extract [value] from the overall string and return it.
    // We have (start, end); substr() requires (start, length),
    // so we must compute the length.
    return line.substr(start, end - start);
}

map<string, User> users;

void readUsers(const string &filename)
{
    ifstream fin(filename);
    string line;
    while (getline(fin, line))
    {
        User u;
        u.Id = parseFieldFromLine(line, "Id");
        u.DisplayName = parseFieldFromLine(line, "DisplayName");
        users[u.Id] = u;
    }
}
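As you can see, I am calling a function that searches each line for a substring. This is wrong, because if the file (or a line in it) is malformed, I get unexpected values and the failure is silent.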
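One way to make that failure visible, sketched here under some assumptions, is to return an optional value so that "key missing or line malformed" is distinguishable from "value is empty". The name findField is hypothetical and not part of the original code. The sketch assumes attributes are separated by a single space and that values never contain a raw double quote (in well-formed XML such quotes are escaped as &quot;). It needs C++17 for std::optional.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper (not from the original post): returns the value of
// key="value" inside `line`, or std::nullopt when the key is missing or the
// closing quote is absent, so callers can tell "absent" from "empty".
std::optional<std::string> findField(const std::string &line, const std::string &key)
{
    assert(!key.empty());

    // Require a leading space so that "Id" does not match inside "OwnerUserId".
    // Assumption: attributes are separated by a single space character.
    const std::string pattern = " " + key + "=\"";
    const std::size_t idx = line.find(pattern);
    if (idx == std::string::npos)
        return std::nullopt; // key not present

    const std::size_t start = idx + pattern.size();
    const std::size_t end = line.find('"', start);
    if (end == std::string::npos)
        return std::nullopt; // unterminated value: malformed line

    return line.substr(start, end - start);
}
```

A caller such as readUsers could then skip or report a row when findField(line, "Id") has no value, instead of storing a user under an empty key. This is still substring matching rather than XML parsing: entities such as &lt; are not decoded.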
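I have read about using an XML parser, but being new to C++ I cannot tell which one works best for a key-value format, since I also know very little about testing and benchmarking efficiency. My current input data looks like this: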
<?xml version="1.0" encoding="utf-8"?>
<posts>
<row Id="1" PostTypeId="1" AcceptedAnswerId="509" CreationDate="2009-04-30T06:49:01.807" Score="13" ViewCount="903" Body="<p>Our nightly full (and periodic differential) backups are becoming quite large, due mostly to the amount of indexes on our tables; roughly half the backup size is comprised of indexes.</p>

<p>We're using the <strong>Simple</strong> recovery model for our backups.</p>

<p>Is there any way, through using <code>FileGroups</code> or some other file-partitioning method, to <strong>exclude</strong> indexes from the backups?</p>

<p>It would be nice if this could be extended to full-text catalogs, as well.</p>
" OwnerUserId="3" LastEditorUserId="919" LastEditorDisplayName="" LastEditDate="2009-05-04T02:11:16.667" LastActivityDate="2009-05-10T15:22:39.707" Title="How to exclude indexes from backups in SQL Server 2008" Tags="<sql-server><backup><sql-server-2008><indexes>" AnswerCount="3" CommentCount="0" FavoriteCount="3" />
<row Id="2" PostTypeId="1" AcceptedAnswerId="1238" CreationDate="2009-04-30T07:04:18.883" Score="18" ViewCount="1951" Body="<p>We've struggled with the RAID controller in our database server, a <a href="http://www.pc.ibm.com/europe/server/index.html?nl&amp;cc=nl" rel="nofollow">Lenovo ThinkServer</a> RD120. It is a rebranded Adaptec that Lenovo / IBM dubs the <a href="http://www.redbooks.ibm.com/abstracts/tips0054.html#ServeRAID-8k" rel="nofollow">ServeRAID 8k</a>.</p>

<p>We have patched this <a href="http://www.redbooks.ibm.com/abstracts/tips0054.html#ServeRAID-8k" rel="nofollow">ServeRAID 8k</a> up to the very latest and greatest:</p>

<ul>
<li>RAID bios version</li>
<li>RAID backplane bios version</li>
<li>Windows Server 2008 driver</li>
</ul>

<p>This RAID controller has had multiple critical BIOS updates even in the short 4 month time we've owned it, and the <a href="ftp://ftp.software.ibm.com/systems/support/system%5Fx/ibm%5Ffw%5Faacraid%5F5.2.0-15427%5Fanyos%5F32-64.chg" rel="nofollow">change history</a> is just.. well, scary. </p>

<p>We've tried both write-back and write-through strategies on the logical RAID drives. <strong>We still get intermittent I/O errors under heavy disk activity.</strong> They are not common, but serious when they happen, as they cause SQL Server 2008 I/O timeouts and sometimes failure of SQL connection pools.</p>

<p>We were at the end of our rope troubleshooting this problem. Short of hardcore stuff like replacing the entire server, or replacing the RAID hardware, we were getting desperate.</p>

<p>When I first got the server, I had a problem where drive bay #6 wasn't recognized. Switching out hard drives to a different brand, strangely, fixed this -- and updating the RAID BIOS (for the first of many times) fixed it permanently, so I was able to use the original "incompatible" drive in bay 6. On a hunch, I began to assume that <a href="http://www.newegg.com/Product/Product.aspx?Item=N82E16822136143" rel="nofollow">the Western Digital SATA hard drives</a> I chose were somehow incompatible with the ServeRAID 8k controller.</p>

<p>Buying 6 new hard drives was one of the cheaper options on the table, so I went for <a href="http://www.newegg.com/Product/Product.aspx?Item=N82E16822145215" rel="nofollow">6 Hitachi (aka IBM, aka Lenovo) hard drives</a> under the theory that an IBM/Lenovo RAID controller is more likely to work with the drives it's typically sold with.</p>

<p>Looks like that hunch paid off -- we've been through three of our heaviest load days (mon,tue,wed) without a single I/O error of any kind. Prior to this we regularly had at least one I/O "event" in this time frame. <strong>It sure looks like switching brands of hard drive has fixed our intermittent RAID I/O problems!</strong></p>

<p>While I understand that IBM/Lenovo probably tests their RAID controller exclusively with their own brand of hard drives, I'm disturbed that a RAID controller would have such subtle I/O problems with particular brands of hard drives.</p>

<p>So my question is, <strong>is this sort of SATA drive incompatibility common with RAID controllers?</strong> Are there some brands of drives that work better than others, or are "validated" against particular RAID controller? I had sort of assumed that all commodity SATA hard drives were alike and would work reasonably well in any given RAID controller (of sufficient quality).</p>
" OwnerUserId="1" LastActivityDate="2011-03-08T08:18:15.380" Title="Do RAID controllers commonly have SATA drive brand compatibility issues?" Tags="<raid><ibm><lenovo><serveraid8k>" AnswerCount="8" FavoriteCount="2" />
<row Id="3" PostTypeId="1" AcceptedAnswerId="104" CreationDate="2009-04-30T07:48:06.750" Score="26" ViewCount="692" Body="<ul>
<li>How do you keep your servers up to date?</li>
<li>When using a package manager like <a href="http://wiki.debian.org/Aptitude" rel="nofollow">Aptitude</a>, do you keep an upgrade / install history, and if so, how do you do it?</li>
<li>When installing or upgrading packages on multiple servers, are there any ways to speed the process up as much as possible?</li>
</ul>
" OwnerUserId="22" LastEditorUserId="22" LastEditorDisplayName="" LastEditDate="2009-04-30T08:05:02.217" LastActivityDate="2009-06-05T04:01:09.423" Title="Best practices for keeping UNIX packages up to date?" Tags="<unix><package-management><server-management>" AnswerCount="11" FavoriteCount="14" />
<row Id="4" PostTypeId="2" ParentId="3" CreationDate="2009-04-30T07:49:58.027" Score="10" ViewCount="" Body="<p>Regarding your third question: I always run a local repository. Even if it's only for one machine, it saves time in case I need to reinstall (I generally use something like aptitude autoclean), and for two machines, it almost always pays off.</p>

<p>For the clusters I admin, I don't generally keep explicit logs: I let the package manager do it for me. However, for those machines (as opposed to desktops), I don't use automatic installations, so I do have my notes about what I intended to install to all machines.</p>
" OwnerUserId="28" LastActivityDate="2009-04-30T07:49:58.027" CommentCount="1" />
<row Id="5" PostTypeId="2" ParentId="2" CreationDate="2009-04-30T07:56:20.070" Score="4" ViewCount="" Body="<p>I don't think it's common per se. However, as soon as you start using enterprise storage controllers, whether that be SAN's or standalone RAID controllers, you'll generally want to adhere to their compatibility list rather closely.</p>

<p>You may be able to save some bucks on the sticker price by buying a cheap range of disks, but that's probably one of the last areas I'd want to save money on - given the importance of data in most scenarios.</p>

<p>In other words, explicit incompatibility is very uncommon, but explicit compatibility adherence is recommendable.</p>
" OwnerUserId="24" LastActivityDate="2009-04-30T07:56:20.070" />
<row Id="6" PostTypeId="1" AcceptedAnswerId="537" CreationDate="2009-04-30T07:57:06.247" Score="8" ViewCount="2648" Body="<p>Our database currently only has one FileGroup, PRIMARY, which contains roughly 8GB of data (table rows, indexes, full-text catalog).</p>

<p>When is a good time to split this into secondary data files? What are some criteria that I should be aware of?</p>
" OwnerUserId="3" LastActivityDate="2009-07-08T07:23:49.527" Title="In SQL Server, when should you split your PRIMARY Data FileGroup into secondary data files?" Tags="<sql-server><files><filegroups>" AnswerCount="3" FavoriteCount="1" />
<row Id="7" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2009-04-30T07:57:09.117" Score="12" ViewCount="529" Body="<p>What enterprise virus-scanning systems do you recommend?</p>
" OwnerUserId="32" LastActivityDate="2009-04-30T11:51:09.290" Title="What is the best enterprise virus-scanning system?" Tags="<antivirus>" AnswerCount="8" CommentCount="3" FavoriteCount="2" />
<row Id="8" PostTypeId="2" ParentId="3" CreationDate="2009-04-30T07:57:15.653" Score="0" ViewCount="" Body="<p>You can have a local repository and configure all servers to point to it for updates. Not only you get speed of local downloads, you also get to control which official updates you want installed on your infrastructure in order to prevent any compatibility issues.</p>

<p>On the Windows side of things, I've used <a href="http://technet.microsoft.com/en-us/wsus/default.aspx" rel="nofollow">Windows Server Update Services</a> with very satisfying results.</p>
" OwnerUserId="36" LastActivityDate="2009-04-30T07:57:15.653" />
The other file:
<?xml version="1.0" encoding="utf-8"?>
<users>
<row Id="1" Reputation="4220" CreationDate="2009-04-30T07:08:27.067" DisplayName="Jeff Atwood" EmailHash="51d623f33f8b83095db84ff35e15dbe8" LastAccessDate="2011-09-03T13:30:29.990" WebsiteUrl="http://www.codinghorror.com/blog/" Location="El Cerrito, CA" Age="40" AboutMe="<p><img src="http://img377.imageshack.us/img377/4074/wargames1xr6.jpg" width="250"></p>

<p><a href="http://www.codinghorror.com/blog/archives/001169.html" rel="nofollow">Stack Overflow Valued Associate #00001</a></p>

<p>Wondering how our software development process works? <a href="http://www.youtube.com/watch?v=08xQLGWTSag" rel="nofollow">Take a look!</a></p>
" Views="3562" UpVotes="1995" DownVotes="31" />
<row Id="2" Reputation="697" CreationDate="2009-04-30T07:08:27.067" DisplayName="Geoff Dalgas" EmailHash="b437f461b3fd27387c5d8ab47a293d35" LastAccessDate="2011-09-05T22:14:06.527" WebsiteUrl="http://stackoverflow.com" Location="Corvallis, OR" Age="34" AboutMe="<p>Developer on the StackOverflow team. Find me on</p>

<p><a href="http://www.twitter.com/SuperDalgas" rel="nofollow">Twitter</a>
<br><br>
<a href="http://blog.stackoverflow.com/2009/05/welcome-stack-overflow-valued-associate-00003/" rel="nofollow">Stack Overflow Valued Associate #00003</a> </p>
" Views="291" UpVotes="46" DownVotes="2" />
<row Id="3" Reputation="259" CreationDate="2009-04-30T07:08:27.067" DisplayName="Jarrod Dixon" EmailHash="2dfa19bf5dc5826c1fe54c2c049a1ff1" LastAccessDate="2011-09-01T20:43:27.743" WebsiteUrl="http://stackoverflow.com" Location="New York, NY" Age="32" AboutMe="<p><a href="http://blog.stackoverflow.com/2009/01/welcome-stack-overflow-valued-associate-00002/" rel="nofollow">Developer on the Stack Overflow team</a>.</p>

<p>Was dubbed <strong>SALTY SAILOR</strong> by Jeff Atwood, as filth and flarn would oft-times fly when dealing with a particularly nasty bug!</p>

<ul>
<li>Twitter me: <a href="http://twitter.com/jarrod_dixon" rel="nofollow">jarrod_dixon</a></li>
<li>Email me: jarrod.m.dixon@gmail.com</li>
</ul>
" Views="210" UpVotes="259" DownVotes="4" />
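Best answer
I guess what you are looking for is a SAX parser. It does not read the whole document at once (as a DOM parser does); instead, you define callbacks for specific events (for example, the start of a new XML element). Since you are processing the file element by element, this sounds like a good fit for you.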
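To illustrate the callback idea, here is a minimal standard-library-only sketch. It is not a real SAX parser and not a replacement for one. The names Attributes, RowHandler, parseAttributes, streamRows and streamRowsFromString are all hypothetical. It assumes double-quoted attribute values with no raw double quote inside them, no whitespace around '=', and no comments or CDATA sections; entities such as &lt; are left undecoded.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

using Attributes = std::map<std::string, std::string>;
using RowHandler = std::function<void(const Attributes &)>;

// Splits the inside of a tag, e.g. `row Id="1" Name="x" /`, into attributes.
// Returns false on malformed input instead of guessing.
bool parseAttributes(const std::string &tag, Attributes &out)
{
    std::size_t pos = tag.find_first_of(" \t\r\n"); // skip the element name
    while (pos != std::string::npos) {
        pos = tag.find_first_not_of(" \t\r\n", pos);
        if (pos == std::string::npos || tag[pos] == '/')
            return true; // no more attributes
        const std::size_t eq = tag.find('=', pos);
        if (eq == std::string::npos || eq + 1 >= tag.size() || tag[eq + 1] != '"')
            return false; // missing '=' or opening quote
        const std::size_t close = tag.find('"', eq + 2);
        if (close == std::string::npos)
            return false; // missing closing quote
        out[tag.substr(pos, eq - pos)] = tag.substr(eq + 2, close - eq - 2);
        pos = close + 1;
    }
    return true;
}

// Reads the stream one character at a time and calls `onRow` for every
// <row ...> element, so only one element is held in memory at a time.
// Returns the number of rows delivered; throws on a malformed row.
std::size_t streamRows(std::istream &in, const RowHandler &onRow)
{
    assert(static_cast<bool>(onRow));
    std::size_t count = 0;
    std::string tag;
    bool inTag = false, inQuotes = false;
    char c;
    while (in.get(c)) {
        if (!inTag) {
            if (c == '<') { inTag = true; tag.clear(); }
            continue;
        }
        if (c == '"')
            inQuotes = !inQuotes;
        if (c == '>' && !inQuotes) { // end of the current tag
            inTag = false;
            if (tag.compare(0, 4, "row ") == 0) {
                Attributes attrs;
                if (!parseAttributes(tag, attrs))
                    throw std::runtime_error("malformed <row> element");
                onRow(attrs);
                ++count;
            }
            continue;
        }
        tag += c;
    }
    if (inTag)
        throw std::runtime_error("unexpected end of input inside a tag");
    return count;
}

// Convenience wrapper for small in-memory inputs.
std::size_t streamRowsFromString(const std::string &xml, const RowHandler &onRow)
{
    std::istringstream in(xml);
    return streamRows(in, onRow);
}
```

With an std::ifstream opened on the users file, a handler could copy attrs.at("Id") and attrs.at("DisplayName") into the users map, one row at a time. A real SAX library handles the same event flow but also covers entities, encodings, comments and error reporting, which this sketch deliberately leaves out.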
I must admit I have never done any XML parsing in C++, but these two libraries sound like a good fit for your problem:
Regarding "c++ - How to parse an XML file in a column-store format?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58574000/