python - Wagtail default search does not work for non-English fields

Tags: python, wagtail

I use the default database backend to implement search in my project:

from __future__ import absolute_import, unicode_literals

from django.core.paginator import EmptyPage, PageNotAnInteger, Paginator
from django.shortcuts import render

from home.models import BlogPage, get_all_tags
from wagtail.wagtailsearch.models import Query


def search(request):
    search_query = request.GET.get('query', None)
    page = request.GET.get('page', 1)

    # Search
    if search_query:
        search_results = BlogPage.objects.live().search(search_query)
        query = Query.get(search_query)

        # Record hit
        query.add_hit()
    else:
        search_results = BlogPage.objects.none()

    # Pagination
    paginator = Paginator(search_results, 10)
    try:
        search_results = paginator.page(page)
    except PageNotAnInteger:
        search_results = paginator.page(1)
    except EmptyPage:
        search_results = paginator.page(paginator.num_pages)

    return render(request, 'search/search.html', {
        'search_query': search_query,
        'blogpages': search_results,
        'tags': get_all_tags()
    })

The BlogPage model:

class BlogPage(Page):
    date = models.DateField("Post date")
    intro = models.CharField(max_length=250)
    body = StreamField([
        ('heading', blocks.CharBlock(classname="full title")),
        ('paragraph', blocks.RichTextBlock()),
        ('image', ImageChooserBlock()),
        ('code', CodeBlock()),
    ])
    tags = ClusterTaggableManager(through=BlogPageTag, blank=True)

    search_fields = Page.search_fields + [
        index.SearchField('intro'),
        index.SearchField('body'),
    ]
    ...

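Search works only when the body field of the BlogPage model contains English text. If I put some Russian words in the body field, the search returns nothing. I looked in the database and found that the body field of BlogPage is stored like this: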

[{"value": "\u0442\u0435\u0441\u0442\u043e\u0432\u044b\u0439", "id": "3343151a-edbc-4165-89f2-ce766922d68e", "type": "heading"}, {"value": "<p>\u0442\u0435\u0441\u0442\u0438\u043f\u0440</p>", "id": "22d3818d-8c69-4d72-967e-7c1f807e80b2", "type": "paragraph"}]

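So the problem is that Wagtail saves the StreamField content with non-ASCII characters written as Unicode escape sequences (\uXXXX). If I manually change the value in phpMyAdmin to: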

[{"value": "Тест", "id": "3343151a-edbc-4165-89f2-ce766922d68e", "type": "heading"}, {"value": "<p>Тестовый</p>", "id": "22d3818d-8c69-4d72-967e-7c1f807e80b2", "type": "paragraph"}]

then the search starts working. So maybe someone knows how to prevent Wagtail from saving StreamField content as Unicode escapes?
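
The stored value looks like the output of Python's json.dumps with its default ensure_ascii=True, which escapes every non-ASCII character. The sketch below uses only the standard library to show why a plain substring match against such a column would miss Cyrillic terms. It assumes the database backend does a LIKE-style substring comparison on the raw column text; naive_contains is a hypothetical stand-in for that comparison, not a Wagtail function.

```python
import json

# Sample payload shaped like the StreamField rows in the question.
blocks = [
    {"type": "heading", "value": "Тест"},
    {"type": "paragraph", "value": "<p>Тестовый</p>"},
]

# Default: ensure_ascii=True turns each Cyrillic letter into a \uXXXX escape.
escaped = json.dumps(blocks)

# ensure_ascii=False keeps the Cyrillic text readable in the stored string.
readable = json.dumps(blocks, ensure_ascii=False)


def naive_contains(stored, term):
    """Hypothetical stand-in for a case-insensitive substring match on a column."""
    return term.lower() in stored.lower()


print(naive_contains(escaped, "тест"))   # False
print(naive_contains(readable, "тест"))  # True

# Both strings decode to the same data, so only the stored form differs.
print(json.loads(escaped) == json.loads(readable))  # True
```

Both forms are valid JSON for the same content, which would explain why the page still renders correctly while the search finds nothing.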

Best answer

I hate this workaround, but I decided to add two more fields, search_body and search_intro, and then search on them instead:

class BlogPage(Page):
    date = models.DateField("Post date")
    intro = models.CharField(max_length=250)
    body = StreamField([
        ('heading', blocks.CharBlock(classname="full title")),
        ('paragraph', blocks.RichTextBlock()),
        ('image', ImageChooserBlock()),
        ('code', CodeBlock()),
    ])
    search_intro = models.CharField(max_length=250)
    search_body = models.CharField(max_length=50000)
    tags = ClusterTaggableManager(through=BlogPageTag, blank=True)

    def main_image(self):
        gallery_item = self.gallery_images.first()
        if gallery_item:
            return gallery_item.image
        else:
            return None

    def get_context(self, request):
        context = super(BlogPage, self).get_context(request)
        context['tags'] = get_all_tags()
        context['page_url'] = urllib.parse.urljoin(BASE_URL, self.url)
        return context

    def save(self, *args, **kwargs):
        # Rebuild the plain-text search copy only when the stream holds
        # (block_type, value, ...) tuples.
        if self.body.stream_data and isinstance(
                self.body.stream_data[0], tuple):
            self.search_body = ''
            for block in self.body.stream_data:
                if len(block) >= 2:
                    self.search_body += str(block[1])
        # Store lowercase copies; the view lowercases the query to match.
        self.search_intro = self.intro.lower()
        self.search_body = self.search_body.lower()
        return super().save(*args, **kwargs)

    search_fields = Page.search_fields + [
        index.SearchField('search_intro'),
        index.SearchField('search_body'),
    ]
    ...

search/views.py:

def search(request):
    search_query = request.GET.get('query', None)
    page = request.GET.get('page', 1)

    # Search
    if search_query:
        search_results = BlogPage.objects.live().search(search_query.lower())
        query = Query.get(search_query)
    ...
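
The text-collection step in save() above can also be written as a standalone function, which makes it easy to test without a database. The following is a minimal sketch, not part of Wagtail: build_search_text is a hypothetical helper that assumes the blocks arrive as (block_type, value, ...) tuples, as the save() method checks for. Unlike the original loop, it also strips HTML tags and joins blocks with a space.

```python
import re
from html import unescape

# Matches anything that looks like an HTML tag, e.g. <p> or </b>.
TAG_RE = re.compile(r"<[^>]+>")


def build_search_text(stream_data):
    """Return lowercase plain text collected from (block_type, value, ...) tuples.

    Hypothetical helper for illustration only.
    """
    parts = []
    for block in stream_data:
        # Ignore entries that are not (type, value, ...) tuples.
        if not isinstance(block, tuple) or len(block) < 2:
            continue
        # str() mirrors the answer's str(block[1]); tags become spaces.
        text = unescape(TAG_RE.sub(" ", str(block[1])))
        # Collapse runs of whitespace left behind by removed tags.
        cleaned = " ".join(text.split())
        if cleaned:
            parts.append(cleaned)
    return " ".join(parts).lower()


sample = [
    ("heading", "Тест"),
    ("paragraph", "<p>Тестовый <b>абзац</b></p>"),
]
print(build_search_text(sample))  # тест тестовый абзац
```

Inside save(), the result could be assigned to self.search_body in place of the manual loop. Whether stripping tags is desirable depends on whether markup should be searchable.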

Source: a similar question on Stack Overflow, "python - Wagtail default search does not work for non-English fields": https://stackoverflow.com/questions/46412871/
