
command-line - Disabling the dictionary in Tesseract

Reposted. Author: 行者123. Updated: 2023-12-04 17:52:17

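How can I disable dictionary correction when running Tesseract for English?

I am currently running tesseract as a subprocess.

Best answer

Try setting these variables to false (put them in a config file):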

load_system_dawg 
load_freq_dawg
load_punc_dawg
load_number_dawg
load_unambig_dawg
load_bigram_dawg
load_fixed_length_dawgs
https://groups.google.com/forum/?fromgroups=#!searchin/tesseract-ocr/Disable$20dictionary$20in$20Tesseract/tesseract-ocr/5nvIo1DJxHE/f3gBi2pTKykJ
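
Since the question mentions running tesseract as a subprocess, here is a minimal Python sketch of how the advice above might be applied. It assumes the usual config file layout of one `name value` pair per line with `F` for false, and the usual invocation `tesseract <image> <outputbase> -l <lang> <configfile>`. The file names `input.png`, `output`, and `no_dict` are hypothetical, and some of the listed variables may not exist in every Tesseract version, so treat this as an illustration rather than a verified recipe.

```python
import subprocess
import tempfile
from pathlib import Path

# Variable names taken from the list in the answer above.
DAWG_VARIABLES = [
    "load_system_dawg",
    "load_freq_dawg",
    "load_punc_dawg",
    "load_number_dawg",
    "load_unambig_dawg",
    "load_bigram_dawg",
    "load_fixed_length_dawgs",
]


def write_no_dict_config(path):
    """Write one 'name F' line per variable and return the text written."""
    text = "".join(f"{name} F\n" for name in DAWG_VARIABLES)
    Path(path).write_text(text)
    return text


def build_command(image, output_base, config_path, lang="eng"):
    """Build the argument list; the config file goes last on the command line."""
    return ["tesseract", str(image), str(output_base), "-l", lang, str(config_path)]


def run_tesseract(image, output_base, config_path):
    """Run tesseract as a subprocess (requires tesseract on PATH)."""
    return subprocess.run(build_command(image, output_base, config_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    config = Path(tempfile.mkdtemp()) / "no_dict"
    write_no_dict_config(config)
    print(" ".join(build_command("input.png", "output", config)))
    # run_tesseract("input.png", "output", config)  # uncomment once input.png exists
```

Passing the arguments as a list avoids shell quoting problems when the image path contains spaces.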
Also read "How to increase the trust in/strength of the dictionary?" in the FAQ. From it:

For tesseract-ocr < 3.01 try upping NON_WERD and GARBAGE_STRING in dict/permute.cpp to maybe 3 or even 5.

For tesseract-ocr >= 3.01 try increasing the variables language_model_penalty_non_freq_dict_word and language_model_penalty_non_dict_word in a config file. By default they are 0.1 and 0.15 respectively.

Regarding "command-line - Disabling the dictionary in Tesseract", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/14364662/
