
c - XML CharacterDataHandler callback called multiple times

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:24:44

I am learning libexpat. To get basic familiarity with the API, I cobbled together this example:

Code:

#include <stdio.h>
#include <expat.h>
#include <string.h>
#include <iostream>

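void start(void* userData, const char* name, const char* argv[])
{
    std::cout << "name: " << name << std::endl;

    int i = 0;

    // argv holds attribute name/value pairs and ends with a NULL pointer.
    while (argv[i])
    {
        // Caution: i is read and incremented in the same expression. Before
        // C++17 the evaluation order is not guaranteed, which would explain
        // why the output below starts at argv[1].
        std::cout << "argv[" << i << "] == " << argv[i++] << std::endl;
    }
}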

void end(void* userData, const char* name)
{
}

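void value(void* userData, const char* val, int len)
{
    // val is not NUL-terminated, so copy len bytes and terminate the copy.
    // Note: a variable-length array is a compiler extension in C++ (g++ accepts it).
    char str[len+1];
    strncpy(str, val, len);
    str[len] = '\0';

    std::cout << "value: " << str << std::endl;
}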

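int main(int argc, char* argv[], char* envz[])
{
    XML_Parser parser = XML_ParserCreate(NULL);
    XML_SetElementHandler(parser, start, end);
    XML_SetCharacterDataHandler(parser, value);

    int bytesRead = 0;
    char val[1024] = {};
    FILE* fp = fopen("./catalog.xml", "r");
    std::cout << "fp == 0x" << (void*)fp << std::endl;

    // Bail out if the file could not be opened; fread() on NULL is invalid.
    if (!fp)
    {
        XML_ParserFree(parser);
        return 1;
    }

    do
    {
        bytesRead = fread(val, 1, sizeof(val), fp);
        std::cout << "In while loop bytesRead==" << bytesRead << std::endl;

        // The last argument marks the final chunk. The loop exits only when
        // XML_Parse() returns 0 (XML_STATUS_ERROR), which here happens on the
        // call made after the final chunk has already been parsed.
        if (0 == XML_Parse(parser, val, bytesRead, (bytesRead < sizeof(val))))
        {
            break;
        }
    }
    while (1);

    fclose(fp);
    XML_ParserFree(parser);
    std::cout << __FUNCTION__ << " end" << std::endl;

    return 0;
}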

catalog.xml:

<CATALOG>
<CD key1="value1" key2="value2">
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<YEAR>1995</YEAR>
</CD>
</CATALOG>

Makefile:

xml: xml.o
	g++ xml.o -lexpat -o xml

xml.o: main.cpp Makefile
	g++ -g -c main.cpp -o xml.o

Output:

fp == 0x0x22beb50
In while loop bytesRead==148
name: CATALOG
value:

value:
name: CD
argv[1] == key1
argv[2] == value1
argv[3] == key2
argv[4] == value2
value:

value:
name: TITLE
value: Empire Burlesque
value:

value:
name: ARTIST
value: Bob Dylan
value:

value:
name: YEAR
value: 1995
value:

value:
value:

In while loop bytesRead==0
main end

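Question:

From the output, it appears that the callback I installed with XML_SetCharacterDataHandler() is called twice for the CATALOG, CD, TITLE, and ARTIST XML tags, and then multiple times for the YEAR tag. Can someone explain this behavior? From the catalog.xml shown above, it is not clear to me why there are (or ever would be) multiple values associated with any XML tag.

Thanks.

References:

Credit to this site for the basis of the sample code above.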

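Best answer

The expat parser may split a text node into multiple calls to the character data handler. To handle a text node correctly, you must accumulate the text across those calls and process it when you receive the "end" event for the containing tag.

This is true in general, even across different parsers and different languages; for example, the same applies in Java.

See, for example, http://marcomaggi.github.io/docs/expat.html#using-comm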
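
The accumulate-then-process approach described above can be sketched as follows. This is only an illustration, not code from the answer: TextCollector, onStart, onText, and onEnd are hypothetical names, the handlers merely mirror the signatures expat expects (assuming the default XML_Char of char), and the sketch assumes each element holds either text or child elements, not both.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical state object; with expat it would be handed to the
// handlers as userData.
struct TextCollector {
    std::string current;  // text gathered since the last start or end tag
    std::vector<std::pair<std::string, std::string> > elements;  // (name, text)
};

// Same shape as an expat start-element handler: begin a fresh buffer.
void onStart(void* userData, const char* /*name*/, const char** /*attrs*/) {
    static_cast<TextCollector*>(userData)->current.clear();
}

// Same shape as an expat character-data handler. The text is NOT
// NUL-terminated, so append exactly len bytes. May run many times per element.
void onText(void* userData, const char* text, int len) {
    assert(len >= 0);
    static_cast<TextCollector*>(userData)->current.append(
        text, static_cast<std::size_t>(len));
}

// Same shape as an expat end-element handler: only now is the text complete.
void onEnd(void* userData, const char* name) {
    TextCollector* c = static_cast<TextCollector*>(userData);
    bool onlySpace = true;
    for (std::size_t i = 0; i < c->current.size(); ++i) {
        if (!std::isspace(static_cast<unsigned char>(c->current[i]))) {
            onlySpace = false;
            break;
        }
    }
    // Skip whitespace-only runs such as the newlines and indentation between tags.
    if (!onlySpace) {
        c->elements.push_back(std::make_pair(std::string(name), c->current));
    }
    c->current.clear();
}
```

With expat, the collector would be attached with XML_SetUserData(parser, &collector) and the handlers registered through XML_SetElementHandler(parser, onStart, onEnd) and XML_SetCharacterDataHandler(parser, onText). For the catalog.xml in the question, this should yield one entry each for TITLE, ARTIST, and YEAR, no matter how many times the character data handler fires.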

A common first–time mistake with any of the event–oriented interfaces to an XML parser is to expect all the text contained in an element to be reported by a single call to the character data handler. Expat, like many other XML parsers, reports such data as a sequence of calls; there's no way to know when the end of the sequence is reached until a different callback is made.

Also, from the expat documentation:

A single block of contiguous text free of markup may still result in a sequence of calls to this handler. In other words, if you're searching for a pattern in the text, it may be split across calls to this handler.
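
To illustrate the pattern-search caveat in that quote, here is a small sketch. PatternSearch, onSearchText, and foundInWholeText are hypothetical names; the handler only mirrors the shape of an expat character data handler, and the way the text is split in the usage note is an assumed example, not something expat is guaranteed to do for this input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical search state, passed to the handler as userData.
struct PatternSearch {
    std::string pattern;   // text to look for
    std::string buffer;    // everything received so far
    bool foundPerChunk;    // true if any single chunk contained the pattern
    PatternSearch() : foundPerChunk(false) {}
};

// Same shape as an expat character-data handler.
void onSearchText(void* userData, const char* text, int len) {
    assert(len >= 0);
    PatternSearch* s = static_cast<PatternSearch*>(userData);
    const std::string chunk(text, static_cast<std::size_t>(len));
    // Naive approach: look only inside this one chunk.
    if (chunk.find(s->pattern) != std::string::npos) {
        s->foundPerChunk = true;
    }
    // Safer approach: keep the text so it can be searched as a whole later.
    s->buffer += chunk;
}

// Search the accumulated text, for instance from the end-element handler.
bool foundInWholeText(const PatternSearch& s) {
    return s.buffer.find(s.pattern) != std::string::npos;
}
```

If "Bob Dylan" happened to arrive as "Bob " followed by "Dylan", the per-chunk check would miss it while foundInWholeText would still report a match.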

Regarding "c - XML CharacterDataHandler callback called multiple times", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/42125772/
