
c# - How to allow wildcards with a custom analyzer in Azure Search

Reposted. Author: 行者123. Updated: 2023-12-03 00:25:35

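Thanks in advance for your help.

I am building an indexer with the Azure Search .NET SDK, and I am currently also using a custom analyzer.

Before switching to the custom analyzer, I used the EnLucene analyzer, which let me search with the wildcard *. I used it to let users search with a trailing wildcard: if a user searches for "app", results such as "apple", "application", and "approach" are returned. Please do not suggest autocomplete or suggestions, because a suggester cannot be used together with a custom analyzer. I do not want to add 20 more search fields just for the suggester (one for suggestions and one for search).

Below is a sample of my custom analyzer. It does not let me use * for partial matching. I am not looking for an NGram solution for prefix or suffix partial matching; I really want to use the wildcard *. What can I do to allow wildcard search?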

var definition = new Index()
{
    Name = indexName,
    Fields = mapFields,
    Analyzers = new[]
    {
        new CustomAnalyzer
        {
            Name = "custom_analyzer",
            Tokenizer = TokenizerName.Whitespace,
            TokenFilters = new[]
            {
                TokenFilterName.AsciiFolding,
                TokenFilterName.Lowercase,
                TokenFilterName.Phonetic
            }
        }
    }
};

Best answer

Here is how you can do this:

  • Add a custom analyzer, as follows:

{
  "name": "names",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true, "searchable": false },
    { "name": "name", "type": "Edm.String", "analyzer": "my_standard" }
  ],
  "analyzers": [
    {
      "name": "my_standard",
      "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer": "standard",
      "tokenFilters": [ "lowercase", "asciifolding" ]
    }
  ]
}

// The snippet below creates the analyzer definition using C#
new CustomAnalyzer
{
    Name = "custom_analyzer",
    Tokenizer = TokenizerName.Standard,
    TokenFilters = new[]
    {
        TokenFilterName.Lowercase,
        TokenFilterName.AsciiFolding,
        TokenFilterName.Phonetic
    }
}

  • Then reference the custom analyzer when creating the document definition, as follows:

    [IsSearchable, IsFilterable, IsSortable, Analyzer("custom_analyzer")]
    public string Property { get; set; }
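
The steps above define and attach the analyzer but do not show the wildcard query itself. A minimal sketch of issuing one with the same Microsoft.Azure.Search SDK might look like the following. The service name and API key are placeholders, and the index name "names" and field "name" are borrowed from the JSON definition above purely for illustration. QueryType.Full selects the full Lucene syntax, although a plain trailing * should also work with the simple syntax. One caveat worth checking against the documentation: wildcard terms appear to bypass the analyzer at query time (apart from lowercasing), so a token filter that rewrites indexed tokens, such as Phonetic, may keep app* from matching.

```csharp
using System;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

class WildcardSearchSketch
{
    static void Main()
    {
        // Placeholders (assumptions): substitute your own service, index and query key.
        var indexClient = new SearchIndexClient(
            "my-search-service", "names", new SearchCredentials("<query-api-key>"));

        var parameters = new SearchParameters
        {
            QueryType = QueryType.Full,      // full Lucene query syntax
            SearchFields = new[] { "name" }  // field mapped to the custom analyzer
        };

        // "app*" is a trailing-wildcard (prefix) query.
        DocumentSearchResult<Document> result =
            indexClient.Documents.Search("app*", parameters);

        // Print the matched field value of every hit.
        foreach (SearchResult<Document> hit in result.Results)
        {
            Console.WriteLine(hit.Document["name"]);
        }
    }
}
```

If this returns nothing for a field whose analyzer includes Phonetic, trying the same query against a field analyzed only with lowercase and asciifolding is a quick way to see whether the phonetic filter is the cause.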

See this blog post for more information:

https://azure.microsoft.com/en-in/blog/custom-analyzers-in-azure-search/

Here is a sample test method for a custom analyzer:

[Fact]
public void CanSearchWithCustomAnalyzer()
{
    Run(() =>
    {
        const string CustomAnalyzerName = "my_email_analyzer";
        const string CustomCharFilterName = "my_email_filter";

        Index index = new Index()
        {
            Name = SearchTestUtilities.GenerateName(),
            Fields = new[]
            {
                new Field("id", DataType.String) { IsKey = true },
                new Field("message", (AnalyzerName)CustomAnalyzerName) { IsSearchable = true }
            },
            Analyzers = new[]
            {
                new CustomAnalyzer()
                {
                    Name = CustomAnalyzerName,
                    Tokenizer = TokenizerName.Standard,
                    CharFilters = new[] { (CharFilterName)CustomCharFilterName }
                }
            },
            CharFilters = new[] { new PatternReplaceCharFilter(CustomCharFilterName, "@", "_") }
        };

        Data.GetSearchServiceClient().Indexes.Create(index);

        SearchIndexClient indexClient = Data.GetSearchIndexClient(index.Name);

        var documents = new[]
        {
            new Document() { { "id", "1" }, { "message", "My email is someone@somewhere.something." } },
            new Document() { { "id", "2" }, { "message", "His email is someone@nowhere.nothing." } },
        };

        indexClient.Documents.Index(IndexBatch.Upload(documents));
        SearchTestUtilities.WaitForIndexing();

        DocumentSearchResult<Document> result = indexClient.Documents.Search("someone@somewhere.something");

        Assert.Equal("1", result.Results.Single().Document["id"]);
    });
}

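Feel free to tag me in the conversation. I hope this helps.

Regarding "c# - How to allow wildcards with a custom analyzer in Azure Search", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58822467/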
