MA-6ren">
gpt4 book ai didi

c# - 是否有可能找出字符串列表中的公共(public)部分

转载 作者:行者123 更新时间:2023-12-05 04:30:02 26 4
gpt4 key购买 nike

我正在努力找出字符串列表中的公共(public)字符串部分。如果我们取一个样本数据集

private readonly List<string> Xpath = new List<string>()
{
"BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>SECTION:nth-of-type(1)>H2:nth-of-type(1)",
"BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>SECTION:nth-of-type(2)>H2:nth-of-type(1)",
"BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>SECTION:nth-of-type(3)>H2:nth-of-type(1)",
"BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>SECTION:nth-of-type(4)>H2:nth-of-type(1)",
"BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>SECTION:nth-of-type(5)>H2:nth-of-type(1)",
"BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>SECTION:nth-of-type(6)>H2:nth-of-type(1)",
"BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>SECTION:nth-of-type(7)>H2:nth-of-type(1)",
"BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>SECTION:nth-of-type(8)>H2:nth-of-type(1)",
"BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>SECTION:nth-of-type(9)>H2:nth-of-type(1)"
};

由此,我想找出这些 child 与哪些 child 相似。 data 是一个 Xpath 列表。以编程方式我应该能够告诉

预期输出:

BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV

为了得到这个我做了这样的事情。我用 >> 分隔每个项目,然后最初为每个数据集创建一个项目列表。

然后用这个找出什么是独特的元素

private IEnumerable<T> GetCommonItems<T>(IEnumerable<T>[] lists)
{
HashSet<T> hs = new HashSet<T>(lists.First());
for (int i = 1; i < lists.Length; i++)
{
hs.IntersectWith(lists[i]);
}
return hs;
}

能够找出唯一值并再次创建数据集。但是,如果这包含 Ex:- Div 在两个地方,并且它也在每个原始数据集中,那么这个方法将只选择一个 Div。

从那时起我会得到这样的东西:

BODY>MAIN:nth-of-type(1)>DIV>SECTION

但是我需要这个

BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV

最佳答案

免责声明:这不是最高效的解决方案,但它有效:)

  • 让我们开始用 > 字符分割第一条路径
  • 对所有路径做同样的事情
char separator = '>';
IEnumerable<string> firstPathChunks = Xpath[0].Split(separator);
var chunks = Xpath.Select(path => path.Split(separator).ToList()).ToArray();
  • 遍历 firstPathChunks
    • 遍历 block
    • 如果有匹配则删除第一个元素
    • 如果所有第一个元素都被删除,则将匹配的前缀附加到 sb
void Process(StringBuilder sb)
{
foreach (var pathChunk in firstPathChunks)
{
foreach (var chunk in chunks)
{
if (chunk[0] != pathChunk)
{
return;
}
chunk.RemoveAt(0);
}
sb.Append(pathChunk);
sb.Append(separator);
}
}

示例用法

var sb = new StringBuilder();
Process(sb);
Console.WriteLine(sb.ToString());

输出

BODY>MAIN:nth-of-type(1)>DIV>SECTION>DIV>SECTION>DIV>DIV:nth-of-type(1)>DIV>DIV:nth-of-type(3)>DIV>ARTICLE>DIV>DIV>DIV>

关于c# - 是否有可能找出字符串列表中的公共(public)部分,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/72182204/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com