
Django Haystack: updating the index faster

Reposted. Author: 行者123. Updated: 2023-12-04 00:39:03

I have been using Django Haystack for a while now and it is great! I have a fairly heavy site whose data needs to be refreshed periodically (every 15 to 30 minutes).

Updating the data with python manage.py update_index takes a long time. Is there a way to speed this up? Or, if possible, to update only the data that has changed?

I am currently using Django Haystack 1.2.7 with Solr as the backend, on Django 1.4.

Thanks!

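Edit:

Yes, I have tried reading that part of the docs, but what I really need is a way to speed up indexing, perhaps by updating only the recent data instead of everything. I found get_updated_field but do not know how to use it. The docs only mention why it is used and do not show a real example.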

Edit 2:

start = DateTimeField(model_attr='start', null=True, faceted=True, --HERE?--)

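Edit 3:

OK, I have implemented the solution below, but when I tried to rebuild the index (with 45,000 records) it nearly crashed my computer. After waiting 10 minutes I got this error: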
  File "manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 443, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 382, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/rebuild_index.py", line 16, in handle
    call_command('update_index', **options)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 150, in call_command
    return klass.execute(*args, **defaults)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 193, in handle
    return super(Command, self).handle(*apps, **options)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 304, in handle
    app_output = self.handle_app(app, **options)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 229, in handle_app
    do_update(index, qs, start, end, total, self.verbosity)
  File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 109, in do_update
    index.backend.update(index, current_qs)
  File "/usr/local/lib/python2.7/dist-packages/haystack/backends/solr_backend.py", line 73, in update
    self.conn.add(docs, commit=commit, boost=index.get_field_weights())
  File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 686, in add
    m = ET.tostring(message, encoding='utf-8')
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1127, in tostring
    ElementTree(element).write(file, encoding, method=method)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 821, in write
    serialize(write, self._root, encoding, qnames, namespaces)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 940, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 940, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 915, in _serialize_xml
    write("<" + tag)
MemoryError

Best answer

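get_updated_field should return a string containing the name of the attribute on the model that holds the date the model was last updated (haystack docs). A DateField with auto_now=True would be ideal (Django docs).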

For example, my UserProfile model has a field named updated.

models.py

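from django.contrib.auth.models import User
from django.db import models


class UserProfile(models.Model):
    user = models.ForeignKey(User)
    # lots of other fields snipped
    updated = models.DateTimeField(auto_now=True)  # refreshed on every save()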

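search_indexes.py

from haystack.indexes import CharField, SearchIndex

from .models import UserProfile  # assumes this file sits next to models.py


class UserProfileIndex(SearchIndex):
    text = CharField(document=True, use_template=True)
    user = CharField(model_attr='user')
    user_fullname = CharField(model_attr='user__get_full_name')

    def get_model(self):
        return UserProfile

    def get_updated_field(self):
        # Name of the model attribute that holds the last-modified timestamp
        return "updated"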

Then when I run ./manage.py update_index --age=10 it only indexes the user profiles that were updated within the last 10 hours.
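
To make the effect of --age more concrete: conceptually, the command takes the field name returned by get_updated_field and keeps only the rows whose value is newer than "now minus N hours". The plain-Python sketch below mimics that filter on in-memory objects. The names updated_since and Profile are hypothetical and exist only for this illustration; they are not part of Haystack, and the real command applies the restriction as a queryset filter in the database rather than looping in Python.

```python
from datetime import datetime, timedelta


def updated_since(records, updated_field, age_hours, now=None):
    """Return the records whose `updated_field` lies within the last `age_hours` hours."""
    now = now or datetime.now()
    cutoff = now - timedelta(hours=age_hours)
    # Keep only records modified at or after the cutoff
    return [r for r in records if getattr(r, updated_field) >= cutoff]


class Profile(object):
    """Hypothetical stand-in for a model row (not a Django model)."""

    def __init__(self, name, updated):
        self.name = name
        self.updated = updated


# Fixed reference time so the example is deterministic
now = datetime(2023, 12, 4, 12, 0, 0)
profiles = [
    Profile("alice", now - timedelta(hours=2)),   # changed recently
    Profile("bob", now - timedelta(hours=30)),    # unchanged for over a day
]

recent = updated_since(profiles, "updated", age_hours=10, now=now)
print([p.name for p in recent])  # ['alice']
```

Since --age is given in whole hours, a site that refreshes every 15 to 30 minutes would presumably schedule the command with --age=1 and accept that some rows get reindexed more than once.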

Regarding "Django Haystack: updating the index faster", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13819130/
