C: searching for a word in a string

Reposted by: 太空宇宙 · Updated: 2023-11-04 07:03:56

I hope someone can help me. I think this is a simple problem: I want to write a program that searches a file for a word.

char *such = "Ingo";
char *fund;
FILE *datei;
char text[100];

datei = fopen("names.txt", "r");

if (datei == NULL) {
    printf("Fehler\n");
}
else 
{
    fscanf(datei, "%100c", text);
    text[100] = '\0';
    //i think this dont work
    if (fgets(text, 100, datei) != NULL)
    {
        printf("%s \n", text);
    }   
}

return 0;

The file contains the following:

Ingo Test Test 123 Test Ingo Ingo

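Now I want to count how often the name "Ingo" occurs in the file.

Is it also possible to search for several words, such as "ingo" and "test", and count them?

Best answer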

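You need to test several conditions to make sure you match whole words only. Below is one way to search for jury that matches jury and jury's, but not injury. You should also decide whether you want to match plural forms of a word (e.g. review and reviews). In the code below, a single set of delimiters (delim) is used to make sure a whole word was matched. If you want to match plurals or other suffixes, you could easily split it into two sets, one for the characters allowed before a word and one for the characters allowed after it.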

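The code expects the name of the file to search as its first argument and the search term (sterm) as its second. (With no arguments, it searches for 'the' in text read from stdin.) It reads each line of the file into a buffer called line and then scans line for the first character of sterm. When that character is found, the preceding character is checked to make sure it is a delimiter (unless the candidate starts the line), and the character that follows the word (sterm's length further on) is checked to make sure it is a delimiter as well. If the candidate begins with the same character as sterm and is delimited on both sides, its contents are compared with sterm using strncmp.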

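If all conditions are met, count is incremented and the match is printed together with its line number and its zero-based position in line. This is only a basic whole-word search and it is not optimized, but it should give you a starting point for telling whole words apart from substrings embedded in longer words (searching for 'the' will not match 'them', 'then', 'they', and so on). You could also turn this code into a function that stores the line number and position of each match in an array of structs and returns a pointer to that array. That way you can parse your text once and get back the location of every match. (That is for another day.)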

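Look over the code and let me know if you have any questions. If you do not care about matching whole words only, you can simply call strstr repeatedly on each line, advancing the pointer past each match, to count the occurrences of the search term. Use whichever best fits your needs.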

#include <stdio.h>
#include <string.h>

#define MAXS 256

int main (int argc, char **argv)
{
    char line[MAXS] = {0};  /* line buffer for fgets */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
    const char *sterm = argc > 2 ? argv[2] : "the";  /* search term */
    const char *delim = " \t\n\'\".";     /* chars that may border a word */
    size_t count = 0, idx = 0, slen = strlen (sterm);

    if (!fp) {
        fprintf (stderr, "error: file open failed '%s'\n", argv[1]);
        return 1;
    }

    while (fgets (line, MAXS, fp))
    {
        size_t i, llen = strlen (line);
        idx++;

        if (llen < slen)
            continue;       /* line shorter than search term */

        for (i = 0; i < llen - slen + 1; i++) {

            if (line[i] != *sterm)
                continue;   /* char != first char in sterm  */
            if (i && !strchr (delim, line[i-1]))
                continue;   /* prior char is not a delim    */
            if (!strchr (delim, line[i+slen]))
                continue;   /* next char is not a delim     */
            if (strncmp (&line[i], sterm, slen))
                continue;   /* chars don't match sterm      */

            printf (" line[%2zu] match %2zu. '%s' at location %zu\n",
                    idx, ++count, sterm, i);   /* i is already a size_t offset */
        }
    }
    if (fp != stdin) fclose (fp);

    printf ("\n total occurrences of '%s' in '%s' : %zu\n\n",
            sterm, argc > 1 ? argv[1] : "stdin", count);

    return 0;
}
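The idea mentioned earlier, moving the search into a function that records every match in an array of structs, could look roughly like the sketch below. The names match_t and find_matches are hypothetical and not part of the original answer; the whole-word checks are the same three tests used in the program above. Like that program, it numbers lines by fgets call, so a line longer than MAXS - 1 characters would be counted as more than one line.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXS 256

/* Location of one whole-word match. */
typedef struct {
    size_t line;    /* 1-based line number            */
    size_t pos;     /* zero-based offset in that line */
} match_t;

/* Hypothetical helper (an assumption, not from the answer): scan fp for
 * whole-word occurrences of sterm and return a malloc'd array of their
 * locations. *nmatch receives the number of entries. Returns NULL when
 * nothing matched or when allocation fails; the caller frees the array. */
match_t *find_matches (FILE *fp, const char *sterm, const char *delim,
                       size_t *nmatch)
{
    char line[MAXS];
    size_t idx = 0, cap = 0, slen = strlen (sterm);
    match_t *list = NULL;

    *nmatch = 0;
    if (slen == 0)
        return NULL;        /* an empty term has no meaningful matches */

    while (fgets (line, MAXS, fp)) {
        size_t i, llen = strlen (line);
        idx++;

        for (i = 0; i + slen <= llen; i++) {
            if (i && !strchr (delim, line[i-1]))
                continue;   /* prior char is not a delim */
            if (!strchr (delim, line[i+slen]))
                continue;   /* next char is not a delim ('\0' passes) */
            if (strncmp (&line[i], sterm, slen))
                continue;   /* chars don't match sterm */

            if (*nmatch == cap) {           /* grow the array when full */
                size_t newcap = cap ? cap * 2 : 8;
                match_t *tmp = realloc (list, newcap * sizeof *tmp);
                if (!tmp) {
                    free (list);
                    *nmatch = 0;
                    return NULL;
                }
                list = tmp;
                cap = newcap;
            }
            list[*nmatch].line = idx;
            list[*nmatch].pos  = i;
            (*nmatch)++;
        }
    }
    return list;
}
```

A caller would pass an open FILE pointer, the term, and the same delimiter string as above, loop over the returned array up to the reported count, and then free it.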

Example file

$ cat dat/damages.txt
Personal injury damage awards are unliquidated
and are not capable of certain measurement; thus, the
jury has broad discretion in assessing the amount of
damages in a personal injury case. Yet, at the same
time, a factual sufficiency review insures that the
evidence supports the jury's award; and, although
difficult, the law requires appellate courts to conduct
factual sufficiency reviews on damage awards in
personal injury cases. Thus, while a jury has latitude in
assessing intangible damages in personal injury cases,
a jury's damage award does not escape the scrutiny of
appellate review.

Because Texas law applies no physical manifestation
rule to restrict wrongful death recoveries, a
trial court in a death case is prudent when it chooses
to submit the issues of mental anguish and loss of
society and companionship. While there is a
presumption of mental anguish for the wrongful death
beneficiary, the Texas Supreme Court has not indicated
that reviewing courts should presume that the mental
anguish is sufficient to support a large award. Testimony
that proves the beneficiary suffered severe mental
anguish or severe grief should be a significant and
sometimes determining factor in a factual sufficiency
analysis of large non-pecuniary damage awards.

Output

$ ./bin/searchterm dat/damages.txt jury
 line[ 3] match  1. 'jury' at location 0
 line[ 6] match  2. 'jury' at location 22
 line[ 9] match  3. 'jury' at location 37
 line[11] match  4. 'jury' at location 2

 total occurrences of 'jury' in 'dat/damages.txt' : 4

or

$ ./bin/searchterm <dat/damages.txt
 line[ 2] match  1. 'the' at location 50
 line[ 3] match  2. 'the' at location 39
 line[ 4] match  3. 'the' at location 43
 line[ 5] match  4. 'the' at location 48
 line[ 6] match  5. 'the' at location 18
 line[ 7] match  6. 'the' at location 11
 line[11] match  7. 'the' at location 38
 line[17] match  8. 'the' at location 10
 line[19] match  9. 'the' at location 34
 line[20] match 10. 'the' at location 13
 line[21] match 11. 'the' at location 42
 line[23] match 12. 'the' at location 12

 total occurrences of 'the' in 'stdin' : 12
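For the simpler alternative mentioned earlier, counting with strstr when whole-word matching does not matter, a minimal sketch might look like this. The function name count_substr is made up for illustration. It counts non-overlapping occurrences, and because it matches substrings it also counts jury inside injury.

```c
#include <string.h>

/* Count non-overlapping occurrences of term in line using strstr.
 * Substring match only: "jury" is also found inside "injury". */
size_t count_substr (const char *line, const char *term)
{
    size_t n = 0, tlen = strlen (term);
    const char *p = line;

    if (tlen == 0)          /* strstr would match an empty term forever */
        return 0;

    while ((p = strstr (p, term)) != NULL) {
        n++;
        p += tlen;          /* advance the pointer past this match */
    }
    return n;
}
```

Inside the fgets loop you would add the return value for each line to a running total.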

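Using pointers instead of array-index notation

You may find pointers more natural than array-index notation (for example, using char *p = line; and advancing p rather than writing line[X]). If so, you can replace the read loop with the following: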

    while (fgets (line, MAXS, fp))
    {
        char *p = line;
        size_t llen = strlen (line);
        idx++;

        if (llen < slen)
            continue;       /* line shorter than search term */

        for (;p < (line + llen - slen + 1); p++) {

            if (*p != *sterm)
                continue;   /* char != first char in sterm  */
            if (p > line && !strchr (delim, *(p - 1)))
                continue;   /* prior char is not a delim    */
            if (!strchr (delim, *(p + slen)))
                continue;   /* next char is not a delim     */
            if (strncmp (p, sterm, slen))
                continue;   /* chars don't match sterm      */

            printf (" line[%2zu] match %2zu. '%s' at location %zu\n",
                    idx, ++count, sterm, (size_t)(p - line));  /* ptrdiff_t -> size_t for %zu */
        }
    }

Pointer notation may feel a little more natural in C. Let me know if you have any questions.
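The code above handles one case-sensitive term. For the second part of the question, counting several words such as "ingo" and "test", one possible sketch is shown below. The helper names equal_nocase, count_word_nocase and count_terms are hypothetical, and ignoring case is an assumption based on the lowercase terms in the question. It applies the same delimiter checks per line for each term.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if the first n chars of a and b are equal ignoring case. */
static int equal_nocase (const char *a, const char *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (tolower ((unsigned char)a[i]) != tolower ((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Count whole-word, case-insensitive occurrences of term in line. */
size_t count_word_nocase (const char *line, const char *term,
                          const char *delim)
{
    size_t i, n = 0, llen = strlen (line), tlen = strlen (term);

    if (tlen == 0)
        return 0;

    for (i = 0; i + tlen <= llen; i++) {
        if (i && !strchr (delim, line[i-1]))
            continue;   /* prior char is not a delim */
        if (!strchr (delim, line[i+tlen]))
            continue;   /* next char is not a delim ('\0' passes) */
        if (equal_nocase (line + i, term, tlen))
            n++;
    }
    return n;
}

/* Add this line's count for each term to counts[0..nterms-1]. */
void count_terms (const char *line, const char *terms[], size_t nterms,
                  const char *delim, size_t *counts)
{
    size_t t;
    for (t = 0; t < nterms; t++)
        counts[t] += count_word_nocase (line, terms[t], delim);
}
```

You would zero the counts array once, call count_terms for every line returned by fgets, and print the totals after the loop.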

This question and answer about searching for a word in a string in C come from a similar question on Stack Overflow: https://stackoverflow.com/questions/34907493/
