python - Wagtail default search does not work with non-English fields

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:43:45

I am using the default database backend to implement search in my project:

from __future__ import absolute_import, unicode_literals

from django.core.paginator import EmptyPage, PageNotAnInteger, Paginator
from django.shortcuts import render

from home.models import BlogPage, get_all_tags
from wagtail.wagtailsearch.models import Query


def search(request):
    search_query = request.GET.get('query', None)
    page = request.GET.get('page', 1)

    # Search
    if search_query:
        search_results = BlogPage.objects.live().search(search_query)
        query = Query.get(search_query)

        # Record hit
        query.add_hit()
    else:
        search_results = BlogPage.objects.none()

    # Pagination
    paginator = Paginator(search_results, 10)
    try:
        search_results = paginator.page(page)
    except PageNotAnInteger:
        search_results = paginator.page(1)
    except EmptyPage:
        search_results = paginator.page(paginator.num_pages)

    return render(request, 'search/search.html', {
        'search_query': search_query,
        'blogpages': search_results,
        'tags': get_all_tags()
    })

The BlogPage model:

class BlogPage(Page):
    date = models.DateField("Post date")
    intro = models.CharField(max_length=250)
    body = StreamField([
        ('heading', blocks.CharBlock(classname="full title")),
        ('paragraph', blocks.RichTextBlock()),
        ('image', ImageChooserBlock()),
        ('code', CodeBlock()),
    ])
    tags = ClusterTaggableManager(through=BlogPageTag, blank=True)

    search_fields = Page.search_fields + [
        index.SearchField('intro'),
        index.SearchField('body'),
    ]
    ...

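Search works only when the body field of the BlogPage model is in English. If I put some Russian words in the body field, the search finds nothing. I looked in the database and saw that the body field of BlogPage is stored like this: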

[{"value": "\u0442\u0435\u0441\u0442\u043e\u0432\u044b\u0439", "id": "3343151a-edbc-4165-89f2-ce766922d68e", "type": "heading"}, {"value": "<p>\u0442\u0435\u0441\u0442\u0438\u043f\u0440</p>", "id": "22d3818d-8c69-4d72-967e-7c1f807e80b2", "type": "paragraph"}]

So the problem is that Wagtail saves the StreamField content with non-ASCII characters escaped as \uXXXX sequences. If I manually change the value in phpMyAdmin to:

[{"value": "Тест", "id": "3343151a-edbc-4165-89f2-ce766922d68e", "type": "heading"}, {"value": "<p>Тестовый</p>", "id": "22d3818d-8c69-4d72-967e-7c1f807e80b2", "type": "paragraph"}]

then search starts working. So maybe someone knows how to prevent Wagtail from saving StreamField content as escaped Unicode?
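
The escaped form shown in the database matches what Python's `json.dumps` produces by default (`ensure_ascii=True`). Whether Wagtail serializes StreamField data through exactly this call is an assumption here, not something the question confirms. The sketch below only illustrates why a plain substring match against the raw column misses Cyrillic words, even though both forms decode to the same data:

```python
import json

# One block shaped like the StreamField data in the question.
blocks = [{"value": "Тест",
           "id": "3343151a-edbc-4165-89f2-ce766922d68e",
           "type": "heading"}]

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(blocks)

# With ensure_ascii=False the Cyrillic text is written as-is.
readable = json.dumps(blocks, ensure_ascii=False)

print(escaped)
print(readable)

# A substring test, similar in spirit to a LIKE query on the raw column,
# only finds the word in the unescaped form.
print("Тест" in escaped)
print("Тест" in readable)

# Both strings still decode to identical Python data.
assert json.loads(escaped) == json.loads(readable)
```

This is consistent with the observation above that editing the stored value by hand makes the search work.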

Best answer

I hate this workaround, but I decided to add two more fields, search_body and search_intro, and then search on those instead:

class BlogPage(Page):
    date = models.DateField("Post date")
    intro = models.CharField(max_length=250)
    body = StreamField([
        ('heading', blocks.CharBlock(classname="full title")),
        ('paragraph', blocks.RichTextBlock()),
        ('image', ImageChooserBlock()),
        ('code', CodeBlock()),
    ])
    search_intro = models.CharField(max_length=250)
    search_body = models.CharField(max_length=50000)
    tags = ClusterTaggableManager(through=BlogPageTag, blank=True)

    def main_image(self):
        gallery_item = self.gallery_images.first()
        if gallery_item:
            return gallery_item.image
        else:
            return None

    def get_context(self, request):
        context = super(BlogPage, self).get_context(request)
        context['tags'] = get_all_tags()
        context['page_url'] = urllib.parse.urljoin(BASE_URL, self.url)
        return context

    def save(self, *args, **kwargs):
        if self.body.stream_data and isinstance(
                self.body.stream_data[0], tuple):
            self.search_body = ''
            for block in self.body.stream_data:
                if len(block) >= 2:
                    self.search_body += str(block[1])
        self.search_intro = self.intro.lower()
        self.search_body = self.search_body.lower()
        return super().save(*args, **kwargs)

    search_fields = Page.search_fields + [
        index.SearchField('search_intro'),
        index.SearchField('search_body'),
    ]
    ...

search/views.py:

def search(request):
    search_query = request.GET.get('query', None)
    page = request.GET.get('page', 1)

    # Search
    if search_query:
        search_results = BlogPage.objects.live().search(search_query.lower())
        query = Query.get(search_query)
    ...
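
The idea behind the workaround, copying block values into a plain lowercase text field and lowercasing the query as well, can be tried outside Wagtail. The sketch below is only an illustration under stated assumptions: `build_search_text` is a hypothetical helper, the sample data imitates `(block_type, value)` tuples, and the space separator and crude tag stripping are additions that the answer's `save()` does not perform.

```python
import re


def build_search_text(stream_data):
    """Join the values of (block_type, value, ...) tuples into one lowercase string.

    Hypothetical helper, not part of Wagtail or of the answer above.
    """
    parts = []
    for block in stream_data:
        # Mirror the answer's check: only tuples with at least two items.
        if isinstance(block, tuple) and len(block) >= 2:
            parts.append(str(block[1]))
    text = " ".join(parts)
    # Crude HTML tag removal (an addition, not in the answer).
    text = re.sub(r"<[^>]+>", " ", text)
    # Collapse whitespace and lowercase so matching is case-insensitive.
    return " ".join(text.split()).lower()


# Sample data imitating the blocks from the question.
sample = [("heading", "Тест"), ("paragraph", "<p>Тестовый</p>")]
search_text = build_search_text(sample)
print(search_text)

# Lowercase the query too, as the view in the answer does.
query = "ТЕСТОВЫЙ"
print(query.lower() in search_text)
```

Lowercasing both the stored text and the query is what makes the match case-insensitive. It has to be applied on both sides, otherwise a mixed-case query would miss the lowercased field.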

Regarding "python - Wagtail default search does not work with non-English fields", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/46412871/
