
c++ - How do I initialize the pointer to the output buffer length?

Reposted. Author: 搜寻专家. Updated: 2023-10-31 01:37:34

I am using this code to perform a regex substitution with the pcre2 library:

PCRE2_SIZE outlengthptr=256;                       //this line
PCRE2_UCHAR* output_buffer; //this line
output_buffer=(PCRE2_UCHAR*)malloc(outlengthptr); //this line
uint32_t rplopts=PCRE2_SUBSTITUTE_GLOBAL;
int ret=pcre2_substitute(
re1234, /*Points to the compiled pattern*/
subject, /*Points to the subject string*/
subject_length, /*Length of the subject string*/
0, /*Offset in the subject at which to start matching*/
rplopts, /*Option bits*/
0, /*Points to a match data block, or is NULL*/
0, /*Points to a match context, or is NULL*/
replace, /*Points to the replacement string*/
replace_length, /*Length of the replacement string*/
output_buffer, /*Points to the output buffer*/
&outlengthptr /*Points to the length of the output buffer*/
);

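But I can't figure out how to correctly define output_buffer and the pointer to its length (outlengthptr).

When I give outlengthptr a fixed value, the code works, but the value stays fixed, i.e. it does not change to the new length of output_buffer. According to the pcre2_substitute() specification, however, it should be updated to the new length of output_buffer: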

The length, startoffset and rlength values are code units, not characters, as is the contents of the variable pointed at by outlengthptr, which is updated to the actual length of the new string.

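The problems are:

  1. When I set outlengthptr to a fixed value, the resulting string is truncated at that fixed length.
  2. If I do not initialize the variable outlengthptr, I get a segmentation fault.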

This is the prototype of the function:

 int pcre2_substitute(const pcre2_code *code, PCRE2_SPTR subject, PCRE2_SIZE length, PCRE2_SIZE startoffset, uint32_t options, pcre2_match_data *match_data, pcre2_match_context *mcontext, PCRE2_SPTR replacement, PCRE2_SIZE rlength, PCRE2_UCHAR *outputbuffer, PCRE2_SIZE *outlengthptr); 

This is the man page of the function.

Best answer

The pcre2api page says the following (emphasis mine):

The function returns the number of replacements that were made. This may be zero if no matches were found, and is never greater than 1 unless PCRE2_SUBSTITUTE_GLOBAL is set. In the event of an error, a negative error code is returned. Except for PCRE2_ERROR_NOMATCH (which is never returned), any errors from pcre2_match() or the substring copying functions are passed straight back. PCRE2_ERROR_BADREPLACEMENT is returned for an invalid replacement string (unrecognized sequence following a dollar sign), and PCRE2_ERROR_NOMEMORY is returned if the output buffer is not big enough.

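So start with an initial buffer that should hold most results, neither too large nor too small. The right size depends on your application.
For example, you could start with 120% of the input string length as a heuristic, since that seems a reasonable choice for most common regex substitution uses.

Then call the function with this buffer and pass it the buffer's size.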

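  • If you get a positive result (or zero), you are done.
  • If you get PCRE2_ERROR_NOMEMORY, double the buffer size and try again (repeat this step as many times as needed).
  • If you get a different error code, handle it as a genuine error.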
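
The steps above can be sketched as a small retry loop. This is only an illustration under stated assumptions, not code from the answer: `substitute_with_retry`, `Attempt`, and `kErrorNoMemory` are hypothetical names, and the actual call to pcre2_substitute() is abstracted behind a callable so the sketch compiles without the PCRE2 headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-in for PCRE2_ERROR_NOMEMORY so this sketch compiles without <pcre2.h>.
// Assumption: real code would use the constant from the header instead.
constexpr int kErrorNoMemory = -48;

// Hypothetical callable type. On entry *len holds the buffer capacity; on
// success the callable stores the actual output length in *len, mirroring the
// last two arguments of pcre2_substitute(). It returns the number of
// replacements (>= 0) or a negative error code.
using Attempt = std::function<int(unsigned char* buf, std::size_t* len)>;

// Calls `attempt` with a buffer that doubles in size until the call stops
// reporting kErrorNoMemory. On success `out` holds exactly the result;
// on any other error `out` is left empty and the error code is returned.
int substitute_with_retry(const Attempt& attempt,
                          std::size_t initial_capacity,
                          std::vector<unsigned char>& out)
{
    assert(initial_capacity > 0);  // a zero capacity could never grow by doubling
    std::size_t capacity = initial_capacity;
    for (;;) {
        out.assign(capacity, 0);
        std::size_t length = capacity;  // in: capacity, out: actual length
        const int rc = attempt(out.data(), &length);
        if (rc >= 0) {
            out.resize(length);  // keep only the code units actually written
            return rc;
        }
        // Give up on genuine errors, or if doubling would overflow size_t.
        if (rc != kErrorNoMemory || capacity > SIZE_MAX / 2) {
            out.clear();
            return rc;
        }
        capacity *= 2;
    }
}
```

With the 8-bit PCRE2 library, `attempt` would presumably be a lambda that forwards `buf` and `len` as the last two arguments of pcre2_substitute(), and `initial_capacity` could follow the 120% heuristic, for example `subject_length + subject_length / 5 + 1`. The length variable is reset to the buffer capacity before every call because the function reads it as input and overwrites it as output.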

Regarding "c++ - How do I initialize the pointer to the output buffer length?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33981841/
