c# - Azure Search Documents: adding custom analyzers, tokenizers, and token filters

Reposted. Author: 行者123. Updated: 2023-12-03 05:34:07

I am migrating the Azure Search SDK from Microsoft.Azure.Search (v10) to Azure.Search.Documents (v11).

Previously, in v10, we could use the C# SDK to create an index with custom analyzers and tokenizers, like this:

var index = new Microsoft.Azure.Search.Models.Index(
    name: GetIndexName(),
    defaultScoringProfile: defaultScoringProfile,
    fields: AzureQuestionItemDefinition.GetQuestionItemFieldsDefinition(),
    analyzers: new[] {
        new CustomAnalyzer
        {
            Name = "standardAnalyzer",
            Tokenizer = TokenizerName.Standard,
            TokenFilters = new[]
            {
                TokenFilterName.Lowercase,
                TokenFilterName.AsciiFolding,
                TokenFilterName.Phonetic,
            }
        },
        new CustomAnalyzer
        {
            Name = "prefixAnalyzer",
            Tokenizer = TokenizerName.Standard,
            TokenFilters = new[]
            {
                TokenFilterName.Lowercase,
                TokenFilterName.AsciiFolding,
                TokenFilterName.Phonetic,
                // Refers by name to the custom token filter defined below.
                "edgeNgramTokenFilter"
            }
        },
    },
    tokenFilters: new[]
    {
        new EdgeNGramTokenFilterV2("edgeNgramTokenFilter", minGram: 2, maxGram: 10, EdgeNGramTokenFilterSide.Front),
    },
    scoringProfiles: new[]
    {
        new ScoringProfile(defaultScoringProfile)
        {
            TextWeights = new TextWeights()
            {
                Weights = new Dictionary<string, double>() {
                    { nameof(QuestionItem.Text), 5.0 },
                    { nameof(QuestionItem.Context), 5.0 },
                    { $"{nameof(QuestionItem.Asker)}/{nameof(QuestionItem.Asker.Name)}", 3.0 },
                    { $"{nameof(QuestionItem.Answers)}/{nameof(AnswerItem.Text)}", 2.0 },
                    { $"{nameof(QuestionItem.Answers)}/{nameof(AnswerItem.AnswererName)}", 2.0 }
                }
            }
        }
    });

After moving to the new Azure.Search.Documents v11, I cannot find a way to create the index like this with the C# SDK.

I found that the collection properties on SearchIndex are read-only:

//
// Summary:
//     Represents a search index definition, which describes the fields and search behavior
//     of an index.
public class SearchIndex : IUtf8JsonSerializable
{
    //
    // Summary:
    //     Initializes a new instance of the Azure.Search.Documents.Indexes.Models.SearchIndex
    //     class.
    //
    // Parameters:
    //   name:
    //     The name of the index.
    //
    // Exceptions:
    //   T:System.ArgumentException:
    //     name is an empty string.
    //
    //   T:System.ArgumentNullException:
    //     name is null.
    public SearchIndex(string name);
    //
    // Summary:
    //     Initializes a new instance of the Azure.Search.Documents.Indexes.Models.SearchIndex
    //     class.
    //
    // Parameters:
    //   name:
    //     The name of the index.
    //
    //   fields:
    //     Fields to add to the index.
    //
    // Exceptions:
    //   T:System.ArgumentException:
    //     name is an empty string.
    //
    //   T:System.ArgumentNullException:
    //     name or fields is null.
    public SearchIndex(string name, IEnumerable<SearchField> fields);

    //
    // Summary:
    //     The name of the scoring profile to use if none is specified in the query. If
    //     this property is not set and no scoring profile is specified in the query, then
    //     default scoring (tf-idf) will be used.
    public string DefaultScoringProfile { get; set; }
    //
    // Summary:
    //     Options to control Cross-Origin Resource Sharing (CORS) for the index.
    public CorsOptions CorsOptions { get; set; }
    //
    // Summary:
    //     A description of an encryption key that you create in Azure Key Vault. This key
    //     is used to provide an additional level of encryption-at-rest for your data when
    //     you want full assurance that no one, not even Microsoft, can decrypt your data
    //     in Azure Cognitive Search. Once you have encrypted your data, it will always
    //     remain encrypted. Azure Cognitive Search will ignore attempts to set this property
    //     to null. You can change this property as needed if you want to rotate your encryption
    //     key; Your data will be unaffected. Encryption with customer-managed keys is not
    //     available for free search services, and is only available for paid services created
    //     on or after January 1, 2019.
    public SearchResourceEncryptionKey EncryptionKey { get; set; }
    //
    // Summary:
    //     The type of similarity algorithm to be used when scoring and ranking the documents
    //     matching a search query. The similarity algorithm can only be defined at index
    //     creation time and cannot be modified on existing indexes. If null, the ClassicSimilarity
    //     algorithm is used.
    public SimilarityAlgorithm Similarity { get; set; }
    //
    // Summary:
    //     Gets the name of the index.
    [CodeGenMemberAttribute("name")]
    public string Name { get; }
    //
    // Summary:
    //     Gets the analyzers for the index.
    public IList<LexicalAnalyzer> Analyzers { get; }
    //
    // Summary:
    //     Gets the character filters for the index.
    public IList<CharFilter> CharFilters { get; }
    //
    // Summary:
    //     Gets or sets the fields in the index. Use Azure.Search.Documents.Indexes.FieldBuilder
    //     to define fields based on a model class, or Azure.Search.Documents.Indexes.Models.SimpleField,
    //     Azure.Search.Documents.Indexes.Models.SearchableField, and Azure.Search.Documents.Indexes.Models.ComplexField
    //     to manually define fields. Index fields have many constraints that are not validated
    //     with Azure.Search.Documents.Indexes.Models.SearchField until the index is created
    //     on the server.
    public IList<SearchField> Fields { get; set; }
    //
    // Summary:
    //     Gets the scoring profiles for the index.
    public IList<ScoringProfile> ScoringProfiles { get; }
    //
    // Summary:
    //     Gets the suggesters for the index.
    public IList<SearchSuggester> Suggesters { get; }
    //
    // Summary:
    //     Gets the token filters for the index.
    public IList<TokenFilter> TokenFilters { get; }
    //
    // Summary:
    //     Gets the tokenizers for the index.
    public IList<LexicalTokenizer> Tokenizers { get; }
    //
    // Summary:
    //     The Azure.ETag of the Azure.Search.Documents.Indexes.Models.SearchIndex.
    public ETag? ETag { get; set; }
}

My question is how to set custom tokenizers, TokenFilters, ScoringProfiles...

Best answer

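In the new Azure client libraries for .NET, collection properties are initialized by default. Although you cannot assign to these properties, you can still call Add on each of them: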

var index = new SearchIndex("myindex");
index.ScoringProfiles.Add(new ScoringProfile(...));

Personally I find this less convenient, because I like to write expression-based code, so I have passed this feedback on to the Azure SDK team.
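For readers who also prefer an expression-based style: C# collection initializers work on get-only collection properties, because the compiler turns each element into an Add call on the already-initialized list. The sketch below is one possible v11 translation of the index from the question, not a verified migration. It assumes the Azure.Search.Documents v11 type names LexicalTokenizerName and EdgeNGramTokenFilter (the apparent counterparts of v10's TokenizerName and EdgeNGramTokenFilterV2). The class name QuestionIndexSketch and the fields "Id", "Text", and "Context" are hypothetical placeholders for the asker's own field definitions.

```csharp
using System.Collections.Generic;
using Azure.Search.Documents.Indexes.Models;

public static class QuestionIndexSketch
{
    // Builds the index definition in memory only; nothing is sent to the service.
    public static SearchIndex Build(string indexName, string profileName)
    {
        return new SearchIndex(indexName)
        {
            DefaultScoringProfile = profileName,
            // Placeholder fields (assumption); the question builds its fields elsewhere.
            Fields =
            {
                new SimpleField("Id", SearchFieldDataType.String) { IsKey = true },
                new SearchableField("Text"),
                new SearchableField("Context"),
            },
            // Each "Property = { ... }" below compiles to Add calls on the
            // get-only list, so no setter is needed.
            Analyzers =
            {
                new CustomAnalyzer("standardAnalyzer", LexicalTokenizerName.Standard)
                {
                    TokenFilters =
                    {
                        TokenFilterName.Lowercase,
                        TokenFilterName.AsciiFolding,
                        TokenFilterName.Phonetic,
                    },
                },
                new CustomAnalyzer("prefixAnalyzer", LexicalTokenizerName.Standard)
                {
                    TokenFilters =
                    {
                        TokenFilterName.Lowercase,
                        TokenFilterName.AsciiFolding,
                        TokenFilterName.Phonetic,
                        "edgeNgramTokenFilter", // custom filter below, referenced by name
                    },
                },
            },
            TokenFilters =
            {
                new EdgeNGramTokenFilter("edgeNgramTokenFilter")
                {
                    MinGram = 2,
                    MaxGram = 10,
                    Side = EdgeNGramTokenFilterSide.Front,
                },
            },
            ScoringProfiles =
            {
                new ScoringProfile(profileName)
                {
                    // Only two of the question's weights are shown, using the placeholder fields.
                    TextWeights = new TextWeights(new Dictionary<string, double>
                    {
                        { "Text", 5.0 },
                        { "Context", 5.0 },
                    }),
                },
            },
        };
    }
}
```

The resulting SearchIndex can then be passed to SearchIndexClient.CreateOrUpdateIndex. Field and analyzer constraints are only validated once the index is created on the server, so a definition that compiles can still be rejected by the service.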

Source: a similar question on Stack Overflow, "c# - Azure Search Documents: adding custom analyzers, tokenizers, and token filters": https://stackoverflow.com/questions/63780948/
