
Hibernate Search indexes incomplete documents

Reposted. Author: 行者123. Last updated: 2023-12-02 02:08:59

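I have a problem when batch-indexing my data. I want to index a list of Article entities, using @IndexedEmbedded on the members whose information I need. Article pulls additional information from two other beans: Page and Articlefulltext.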

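Thanks to the Hibernate Search annotations, the batch correctly updates the database and adds new Documents to my Lucene index. However, the added documents have incomplete fields. It looks as if Hibernate Search does not see all of the annotations.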

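When I inspect the resulting Lucene index with Luke, I see fields for the Article and Page objects but none for Articlefulltext. The data in my database is correct, though, which means the persist() operations completed properly.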

I really need some help, because I cannot see any difference between how Page and Articlefulltext are mapped.

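The odd thing is that if I use a MassIndexer, it correctly adds the Article + Page + Articlefulltext data to the Lucene index. But I do not want to rebuild an index of several million documents every time I run a big update.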

I set the log4j level to DEBUG for Hibernate Search and Lucene, but the logs did not tell me much.

Here is the code for my beans and for the batch.

Thanks in advance for your help,

Article.java:

@Entity
@Table(name = "article", catalog = "test")
@Indexed(index="articleText")
@Analyzer(impl = FrenchAnalyzer.class)
public class Article implements java.io.Serializable {

@Id
@GeneratedValue(strategy = IDENTITY)
@Column(name = "id", unique = true, nullable = false)
@DocumentId
private Integer id;

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "firstpageid", nullable = false)
@IndexedEmbedded
private Page page;

@Column(name = "heading", length = 300)
@Field(name= "title", index = Index.YES, store = Store.YES)
@Boost(2.5f)
private String heading;

@Column(name = "subheading", length = 300)
private String subheading;

@OneToOne(fetch = FetchType.LAZY, mappedBy = "article")
@IndexedEmbedded
private Articlefulltext articlefulltext;
[... bean methods etc ...]

Page.java:

@Entity
@Table(name = "page", catalog = "test")
public class Page implements java.io.Serializable {

private Integer id;
@IndexedEmbedded
private Issue issue;
@ContainedIn
private Set<Article> articles = new HashSet<Article>(0);
[... bean method ...]

Articlefulltext.java

@Entity
@Table(name = "articlefulltext", catalog = "test")
@Analyzer(impl = FrenchAnalyzer.class)
public class Articlefulltext implements java.io.Serializable {

@GenericGenerator(name = "generator", strategy = "foreign", parameters = @Parameter(name = "property", value = "article"))
@Id
@GeneratedValue(generator = "generator")
@Column(name = "aid", unique = true, nullable = false)
private int aid;

@OneToOne(fetch = FetchType.LAZY)
@PrimaryKeyJoinColumn
@ContainedIn
private Article article;

@Column(name = "fulltextcontents", nullable = false)
@Field(store=Store.YES, index=Index.YES, analyzer = @Analyzer(impl = FrenchAnalyzer.class), bridge= @FieldBridge(impl = FulltextSplitBridge.class))
// This field is not added to the resulting Document. I put a log statement in FulltextSplitBridge, and it is never called during the batch process. With a MassIndexer, however, FulltextSplitBridge is called for each Articlefulltext.
private String fulltextcontents;
[... bean method ...]

Here is the code used to update the database and the Lucene index.

Batch source code:

FullTextEntityManager em = null;

@Override
protected void executeInternal(JobExecutionContext arg0) throws JobExecutionException {
ApplicationContext ap = null;
EntityManagerFactory emf = null;
EntityTransaction tx = null;


try {
ap = (ApplicationContext) arg0.getScheduler().getContext().get("applicationContext");
emf = (EntityManagerFactory) ap.getBean("entityManagerFactory", EntityManagerFactory.class);
em = Search.getFullTextEntityManager(emf.createEntityManager());
tx = em.getTransaction();


tx.begin();
// [... em.persist() some things which aren't lucene related, so i skip them ....]
for(File xmlFile : xmlList){
Reel reel = new Reel(title, reelpath);
em.persist(reel);
Article article = new Article();
// [... set Article fields, so i skip them ....]
Articlefulltext ft = new Articlefulltext();
// [... set Articlefulltext fields, so i skip them ....]
ft.setArticle(article);
ft.setFulltextcontents(bufferBlock.toString());
em.persist(ft); // ft is persisted before article because of FK issues
em.persist(article); // here the annotations update the Lucene index, but fulltextcontents is not indexed (see the description above)
if ( nbFileDone % 50 == 0 ) {
//flush a batch of inserts and release memory:
em.flush();
em.clear();
}
}
tx.commit();
}
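catch (Exception e) {
// tx is still null if the failure happened before em.getTransaction()
if (tx != null && tx.isActive()) {
tx.rollback();
}
}
// em is null if the entity manager could not be created
if (em != null) {
em.close();
}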
}

Best answer

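Well, you do not seem to set both sides of the relationship. I can see ft.setArticle(article), but no article.setFtArticle(ft). Both sides of the relationship need to be set. In your case Articlefulltext is the owner of the relation, but that does not mean you can skip setting both sides.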
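To illustrate the point, here is a minimal plain-Java sketch with no Hibernate dependencies. The two classes are stripped-down stand-ins for the entities in the question. The setter name setArticlefulltext, the helper attachFulltext, and the toDocument method are assumptions made for illustration; toDocument only imitates the idea behind @IndexedEmbedded (walking the in-memory object graph from Article) and is not Hibernate Search's actual implementation. It shows why a document built from an Article whose inverse side was never set has no full-text field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-ins for the entities; JPA and Hibernate Search annotations are omitted.
class Article {
    private String heading;
    private Articlefulltext articlefulltext; // inverse side (mappedBy = "article")

    String getHeading() { return heading; }
    void setHeading(String heading) { this.heading = heading; }
    Articlefulltext getArticlefulltext() { return articlefulltext; }
    void setArticlefulltext(Articlefulltext ft) { this.articlefulltext = ft; }

    // Hypothetical convenience method: wires both sides of the relation in one call.
    void attachFulltext(Articlefulltext ft) {
        this.articlefulltext = ft;
        if (ft != null) {
            ft.setArticle(this);
        }
    }
}

class Articlefulltext {
    private Article article; // owning side of the relation
    private String fulltextcontents;

    Article getArticle() { return article; }
    void setArticle(Article article) { this.article = article; }
    String getFulltextcontents() { return fulltextcontents; }
    void setFulltextcontents(String contents) { this.fulltextcontents = contents; }
}

class EmbeddedIndexingSketch {
    // Toy imitation of embedded indexing: start at Article and follow its
    // in-memory reference. If the reference is null, nothing is embedded.
    static Map<String, String> toDocument(Article article) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", article.getHeading());
        Articlefulltext ft = article.getArticlefulltext();
        if (ft != null) {
            doc.put("articlefulltext.fulltextcontents", ft.getFulltextcontents());
        }
        return doc;
    }

    public static void main(String[] args) {
        Article article = new Article();
        article.setHeading("Some heading");
        Articlefulltext ft = new Articlefulltext();
        ft.setFulltextcontents("Full text of the article");

        // Only the owning side is set, as in the batch code from the question.
        ft.setArticle(article);
        System.out.println("one side set:   " + toDocument(article).keySet());

        // Both sides are set: the embedded field now appears.
        article.attachFulltext(ft);
        System.out.println("both sides set: " + toDocument(article).keySet());
    }
}
```

In the batch from the question, the equivalent change would be to set the Article side as well before calling em.persist(article). This may also explain why the MassIndexer produces complete documents: it loads the entities from the database, where the association can be resolved, instead of relying on the objects built in memory during the batch.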

This post about Hibernate Search indexing incomplete documents is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/13743915/
