python - Creating a Gin index with Trigram (gin_trgm_ops) in a Django model

Reposted. Author: 太空狗. Updated: 2023-10-29 18:20:52

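The new TrigramSimilarity feature of django.contrib.postgres is a great fit for a problem I had. I use it in a search bar to find hard-to-spell Latin names. The problem is that there are more than 2 million names, and the search takes longer than I would like.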

I would like to create an index on the trigrams, as described in the postgres documentation.

But I am not sure how to do this in a way that the Django API will make use of it. For postgres text search there is a description of how to create an index, but not for trigram similarity.
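
For readers unfamiliar with what the similarity score measures, here is a minimal pure-Python sketch of the idea behind pg_trgm: each word is lower-cased, padded with two leading spaces and one trailing space, and cut into three-character pieces; the score is the number of shared trigrams divided by the number of distinct trigrams in both strings. The function names are made up for illustration, and this approximates the extension's behaviour rather than reproducing its implementation.

```python
import re


def trigrams(text):
    """Return a set of pg_trgm-style trigrams (illustrative approximation)."""
    grams = set()
    # Lower-case the input and split it into alphanumeric words.
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing space
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("cat")))
print(round(similarity("word", "two words"), 4))
```

A trigram index lets the database find rows that share trigrams with the search term without computing this score for every one of the 2 million names.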

This is what I have right now:

from django.contrib.postgres.indexes import GinIndex
from django.db import models

class NCBI_names(models.Model):
    tax_id = models.ForeignKey(NCBI_nodes, on_delete=models.CASCADE, default=0)
    name_txt = models.CharField(max_length=255, default='')
    name_class = models.CharField(max_length=32, db_index=True, default='')

    class Meta:
        # Plain GIN index, no trigram operator class yet
        indexes = [GinIndex(fields=['name_txt'])]

In the view's get_queryset method:

from django.contrib.postgres.search import TrigramSimilarity
from django.shortcuts import redirect
from django.views.generic import ListView

class TaxonSearchListView(ListView):
    #form_class=TaxonSearchForm
    template_name = 'collectie/taxon_list.html'
    paginate_by = 20
    model = NCBI_names
    context_object_name = 'taxon_list'

    def dispatch(self, request, *args, **kwargs):
        query = request.GET.get('q')
        if query:
            try:
                # An exact (case-insensitive) match goes straight to the detail page
                tax_id = self.model.objects.get(name_txt__iexact=query).tax_id.tax_id
                return redirect('collectie:taxon_detail', tax_id)
            except (self.model.DoesNotExist, self.model.MultipleObjectsReturned):
                return super(TaxonSearchListView, self).dispatch(request, *args, **kwargs)
        else:
            return super(TaxonSearchListView, self).dispatch(request, *args, **kwargs)

    def get_queryset(self):
        result = super(TaxonSearchListView, self).get_queryset()
        query = self.request.GET.get('q')
        if query:
            result = result.exclude(name_txt__icontains='sp.')
            # Keep names with trigram similarity above 0.3, best matches first
            result = result.annotate(similarity=TrigramSimilarity('name_txt', query)).filter(similarity__gt=0.3).order_by('-similarity')
        return result

Best answer

I found a 12/2020 article that does it like this, using the latest version of the Django ORM:

from django.contrib.postgres.indexes import GinIndex
from django.db import models

class Author(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)

    class Meta:
        indexes = [
            GinIndex(
                name='review_author_ln_gin_idx',
                fields=['last_name'],
                opclasses=['gin_trgm_ops'],
            )
        ]
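
The gin_trgm_ops operator class is provided by the pg_trgm extension, so the extension has to be enabled in the database before the index migration runs. If it is not enabled yet, one way to do that from Django is a migration using TrigramExtension from django.contrib.postgres.operations. The sketch below assumes an app labelled review with an existing 0001_initial migration; both names are placeholders, and creating an extension usually requires sufficient database privileges.

```python
from django.contrib.postgres.operations import TrigramExtension
from django.db import migrations


class Migration(migrations.Migration):
    # Hypothetical app label and migration name; adjust to your project.
    dependencies = [("review", "0001_initial")]

    # Runs CREATE EXTENSION IF NOT EXISTS pg_trgm on PostgreSQL.
    operations = [TrigramExtension()]
```

The migration that adds the GinIndex should then depend on this one so the operator class exists when the index is created.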

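If, like the original poster, you want to create an index that works with icontains, you have to index the UPPER() of the column, which requires special handling through OpClass: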

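from django.db import models
from django.db.models.functions import Upper
from django.contrib.postgres.indexes import GinIndex, OpClass

class Author(models.Model):
    last_name = models.CharField(max_length=100)

    class Meta:
        indexes = [
            GinIndex(
                # Index UPPER(last_name) with the trigram operator class
                OpClass(Upper('last_name'), name='gin_trgm_ops'),
                name='review_author_ln_gin_idx',
            )
        ]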

Inspired by an old article on this subject, I landed on a current one that gives the following solution for a GistIndex:

Update: since Django-1.11 things seem to be simpler, as this answer and the Django docs suggest:

from django.contrib.postgres.indexes import GinIndex
from django.db import models

class MyModel(models.Model):
    the_field = models.CharField(max_length=512, db_index=True)

    class Meta:
        indexes = [GinIndex(fields=['the_field'])]

From Django-2.2, an opclasses attribute will be available in class Index(fields=(), name=None, db_tablespace=None, opclasses=()) for this purpose.


# GistIndex subclass from the article mentioned above; it is only needed
# on Django versions where the opclasses argument is not available.
from django.contrib.postgres.indexes import GistIndex

class GistIndexTrgrmOps(GistIndex):
    def create_sql(self, model, schema_editor):
        # - this Statement is instantiated by the _create_index_sql()
        #   method of django.db.backends.base.schema.BaseDatabaseSchemaEditor,
        #   using the sql_create_index template from
        #   django.db.backends.postgresql.schema.DatabaseSchemaEditor
        # - the template has the original value:
        #   "CREATE INDEX %(name)s ON %(table)s%(using)s (%(columns)s)%(extra)s"
        statement = super().create_sql(model, schema_editor)
        # - however, we want to use a GIST index to accelerate trigram
        #   matching, so we want to add the gist_trgm_ops index operator
        #   class
        # - so we replace the template with:
        #   "CREATE INDEX %(name)s ON %(table)s%(using)s (%(columns)s gist_trgm_ops)%(extra)s"
        statement.template = \
            "CREATE INDEX %(name)s ON %(table)s%(using)s (%(columns)s gist_trgm_ops)%(extra)s"

        return statement

Then you can use it in your model class like this:

class YourModel(models.Model):
    some_field = models.TextField(...)

    class Meta:
        indexes = [
            GistIndexTrgrmOps(fields=['some_field'])
        ]
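
To make the effect of the overridden template in GistIndexTrgrmOps concrete, the sketch below renders it with plain Python %-formatting. The index name, table name and other values are hypothetical stand-ins for what Django fills in at migration time, so the exact SQL in a real project will differ.

```python
# Template assigned to statement.template in GistIndexTrgrmOps
template = (
    "CREATE INDEX %(name)s ON %(table)s%(using)s "
    "(%(columns)s gist_trgm_ops)%(extra)s"
)

# Hypothetical values; Django computes the real ones from the model.
parts = {
    "name": '"yourapp_yourmodel_some_field_gist"',
    "table": '"yourapp_yourmodel"',
    "using": " USING gist",
    "columns": '"some_field"',
    "extra": "",
}

sql = template % parts
print(sql)
```

The only difference from the stock statement is the gist_trgm_ops operator class after the column name, which is what lets PostgreSQL use the index for trigram matching.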

Regarding python - Creating a Gin index with Trigram (gin_trgm_ops) in a Django model, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44820345/
