
c# - Linq get words in sentences

Reposted. Author: 太空狗. Updated: 2023-10-30 00:04:26

I have a list of words and a list of sentences. I want to know which words can be found in which sentences.

Here is my code:

List<string> sentences = new List<string>();
List<string> words = new List<string>();

sentences.Add("Gallia est omnis divisa in partes tres, quarum unam incolunt Belgae, aliam Aquitani, tertiam qui ipsorum lingua Celtae, nostra Galli appellantur.");
sentences.Add("Alea iacta est.");
sentences.Add("Libenter homines id, quod volunt, credunt.");

words.Add("est");
words.Add("homines");

List<string> myResults = sentences
    .Where(sentence => words
        .Any(word => sentence.Contains(word)))
    .ToList();

What I need is a list of tuples, each pairing a sentence with the word that was found in that sentence.

Best answer

First, we have to define what a word is. Let it be any non-empty combination of letters and apostrophes:

  Regex regex = new Regex(@"[\p{L}']+");
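
To see what this definition of a word does in practice, here is a minimal sketch that splits one of the question's sentences with the same pattern. The class name `WordSplitDemo` and the helper `ExtractWords` are hypothetical names introduced only for illustration; they are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordSplitDemo
{
    // Same pattern as in the answer: one or more Unicode letters or apostrophes.
    private static readonly Regex WordRegex = new Regex(@"[\p{L}']+");

    // Hypothetical helper: returns every match of the pattern, in order.
    public static string[] ExtractWords(string sentence) =>
        WordRegex.Matches(sentence)
                 .Cast<Match>()
                 .Select(m => m.Value)
                 .ToArray();

    public static void Main()
    {
        // Spaces and punctuation are not part of any match, so only words remain.
        Console.WriteLine(string.Join("|", ExtractWords("Alea iacta est.")));
        // prints: Alea|iacta|est
    }
}
```

Because the apostrophe is part of the character class, a contraction such as "don't" stays a single word, while digits and hyphens act as separators.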

Second, we should decide how to handle letter case. Let's implement a case-insensitive routine:

HashSet<string> wordsToFind = new HashSet<string>(StringComparer.OrdinalIgnoreCase) {
    "est",
    "homines"
};
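
The effect of the comparer can be checked with a short sketch. `CaseInsensitiveSetDemo` and `BuildSet` are hypothetical names used only for this illustration; the set itself is built exactly as in the answer.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveSetDemo
{
    // Hypothetical helper: builds the same set as in the answer.
    public static HashSet<string> BuildSet() =>
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "est", "homines" };

    public static void Main()
    {
        HashSet<string> wordsToFind = BuildSet();

        Console.WriteLine(wordsToFind.Contains("EST"));     // True: case is ignored
        Console.WriteLine(wordsToFind.Contains("Homines")); // True
        Console.WriteLine(wordsToFind.Contains("es"));      // False: only whole entries match
    }
}
```

Unlike `sentence.Contains(word)` in the question, a set lookup compares whole words, so a fragment such as "es" does not count as a hit.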

Then we can use the Regex to match the words in each sentence and Linq to query the sentences:

Code:

var actualWords = sentences
    .Select((text, index) => new {
        text = text,
        index = index,
        words = regex
            .Matches(text)
            .Cast<Match>()
            .Select(match => match.Value)
            .ToArray()
    })
    .SelectMany(item => item.words
        .Where(word => wordsToFind.Contains(word))
        .Select(word => Tuple.Create(word, item.index + 1)));

string report = string.Join(Environment.NewLine, actualWords);

Console.Write(report);

Result:

(est, 1)     // est appears in the 1st sentence
(est, 2)     // est appears in the 2nd sentence as well
(homines, 3) // homines appears in the 3rd sentence

If you want a Tuple<string, string> of word and sentence, just change Tuple.Create(word, item.index + 1) to Tuple.Create(word, item.text) in the final Select.
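
Putting the pieces together, the word/sentence variant might look like the sketch below. It is a self-contained program rather than the answer's exact code: the class `WordSentencePairs` and the method `Find` are hypothetical, and the `Distinct` call is an added assumption so that a word repeated within one sentence is reported only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordSentencePairs
{
    // A word is any non-empty run of Unicode letters and apostrophes.
    private static readonly Regex WordRegex = new Regex(@"[\p{L}']+");

    // Hypothetical helper: returns (word, sentence) pairs for every searched
    // word that occurs as a whole word in a sentence, ignoring case.
    public static List<Tuple<string, string>> Find(
        IEnumerable<string> sentences, IEnumerable<string> words)
    {
        var wordsToFind = new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);

        return sentences
            .SelectMany(text => WordRegex
                .Matches(text)
                .Cast<Match>()
                .Select(match => match.Value)
                .Where(word => wordsToFind.Contains(word))
                // Assumption: report each word at most once per sentence.
                .Distinct(StringComparer.OrdinalIgnoreCase)
                .Select(word => Tuple.Create(word, text)))
            .ToList();
    }

    public static void Main()
    {
        var sentences = new List<string>
        {
            "Alea iacta est.",
            "Libenter homines id, quod volunt, credunt."
        };
        var words = new List<string> { "est", "homines" };

        foreach (var pair in Find(sentences, words))
            Console.WriteLine($"{pair.Item1} -> {pair.Item2}");
    }
}
```

Each tuple carries the word as it appears in the sentence (its original casing) together with the full sentence text, which is the shape the question asks for.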

Regarding "c# - Linq get words in sentences", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56426412/
