gpt4 book ai didi

c++ - 如何在 C 或 C++ 中编写简单的正则表达式模式匹配函数?

转载 作者:IT老高 更新时间:2023-10-28 22:36:37 25 4
gpt4 key购买 nike

这是我今天笔试的一道题,函数签名是

int is_match(char* pattern,char* string)

模式仅限于ASCII字符和量化*?,所以比较简单。 is_match 如果匹配则返回 1,否则返回 0。

我该怎么做?

最佳答案

Brian Kernighan提供了一篇关于 A Regular Expression Matcher 的短文那Rob Pike作为他们正在编写的一本书的演示程序编写的。这篇文章读起来很不错,解释了一些关于代码和正则表达式的一般信息。

我玩过这段代码,做了一些更改以试验一些扩展,例如返回字符串中模式匹配的位置,以便可以从原始文本中复制匹配模式的子字符串。

来自文章:

I suggested to Rob that we needed to find the smallest regular expression package that would illustrate the basic ideas while still recognizing a useful and non-trivial class of patterns. Ideally, the code would fit on a single page.

Rob disappeared into his office, and at least as I remember it now, appeared again in no more than an hour or two with the 30 lines of C code that subsequently appeared in Chapter 9 of TPOP. That code implements a regular expression matcher that handles these constructs:

c    matches any literal character c
. matches any single character
^ matches the beginning of the input string
$ matches the end of the input string
* matches zero or more occurrences of the previous character

This is quite a useful class; in my own experience of using regular expressions on a day-to-day basis, it easily accounts for 95 percent of all instances. In many situations, solving the right problem is a big step on the road to a beautiful program. Rob deserves great credit for choosing so wisely, from among a wide set of options, a very small yet important, well-defined and extensible set of features.

Rob's implementation itself is a superb example of beautiful code: compact, elegant, efficient, and useful. It's one of the best examples of recursion that I have ever seen, and it shows the power of C pointers. Although at the time we were most interested in conveying the important role of a good notation in making a program easier to use and perhaps easier to write as well, the regular expression code has also been an excellent way to illustrate algorithms, data structures, testing, performance enhancement, and other important topics.

文章中的实际 C 源代码非常好。

/* match: search for regexp anywhere in text */
int match(char *regexp, char *text)
{
if (regexp[0] == '^')
return matchhere(regexp+1, text);
do { /* must look even if string is empty */
if (matchhere(regexp, text))
return 1;
} while (*text++ != '\0');
return 0;
}

/* matchhere: search for regexp at beginning of text */
int matchhere(char *regexp, char *text)
{
if (regexp[0] == '\0')
return 1;
if (regexp[1] == '*')
return matchstar(regexp[0], regexp+2, text);
if (regexp[0] == '$' && regexp[1] == '\0')
return *text == '\0';
if (*text!='\0' && (regexp[0]=='.' || regexp[0]==*text))
return matchhere(regexp+1, text+1);
return 0;
}

/* matchstar: search for c*regexp at beginning of text */
int matchstar(int c, char *regexp, char *text)
{
do { /* a * matches zero or more instances */
if (matchhere(regexp, text))
return 1;
} while (*text != '\0' && (*text++ == c || c == '.'));
return 0;
}

关于c++ - 如何在 C 或 C++ 中编写简单的正则表达式模式匹配函数?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/4040835/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com