c++ - pocketsphinx simple example works in a basic C test but not when included in a C++ project

Reposted by: 行者123. Updated: 2023-11-30 17:20:52

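I'm trying to build a project with pocketsphinx. I'm at an early stage: I first tried a simple example from the documentation in a single main.c, which reads a file and detects the words. That works.

Now I'm trying to include it in my C++ SDL project, which reads audio data from the microphone via RtAudio, and it doesn't work.

I get: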

INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 0 words
ERROR: "ngram_search.c", line 1141: Couldn't find <s> in first frame

The microphone is initialized with:

RtAudio::StreamParameters parameters;
parameters.deviceId = adc.getDefaultInputDevice();
parameters.nChannels = 1;
parameters.firstChannel = 0;
unsigned int sampleRate = 16000;
unsigned int bufferFrames = 512;
adc.openStream(NULL, &parameters, RTAUDIO_SINT16, sampleRate, &bufferFrames, &rtCallback, info);
adc.startStream();
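
One way to confirm that the callback really receives signed 16-bit mono samples is to log the peak level of each block. The following is a minimal sketch under that assumption; `peakAmplitude` is a hypothetical helper name, not part of RtAudio or pocketsphinx.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical diagnostic helper: returns the largest absolute sample value
// in a block of signed 16-bit PCM (0 for an empty or all-zero block).
int peakAmplitude(const std::int16_t *samples, std::size_t count) {
    assert(samples != nullptr || count == 0);
    int peak = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Widen to int before abs() so that -32768 does not overflow.
        int value = std::abs(static_cast<int>(samples[i]));
        if (value > peak) {
            peak = value;
        }
    }
    return peak;
}
```

Called on each block inside the callback, a peak that stays at or near 0 while someone is speaking would point at the device or sample format rather than at the decoder.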

Then, in my C++ class, I have the following protected variables:

cmd_ln_t *decoderConfig;
ps_decoder_t *currentDecoder;
bool spaceDown_; // true as long as user holds space
bool startNextTime_; // true if user just pressed space first time
bool endNextTime_; // true if user just released space
int16 *detectionBuffer;
int detectionBufferSize;
int detectionBufferPos;

In the constructor I do this (error checking removed for readability):

#define MODELDIR "/usr/local/share/pocketsphinx/model"
decoderConfig = cmd_ln_init(NULL, ps_args(), TRUE,
                    "-hmm", MODELDIR "/hmm/en_US/hub4wsj_sc_8k",
                    "-lm", MODELDIR "/lm/en/turtle.DMP",
                    "-dict", MODELDIR "/lm/en/turtle.dic",
                    NULL);
currentDecoder = ps_init(decoderConfig);

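Then, when the user presses space, I set startNextTime_ and spaceDown_ to true, and when the user releases space, I set endNextTime_ to true.

The RtAudio callback calls a method of the class, which does the following (I copy all the audio data from the moment the user presses space until release into a buffer; I'm not sure this is necessary, but I guess it doesn't hurt):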

if (spaceDown()) {
    if (startNextTime()) {
        int rv = ps_start_utt(currentDecoder);
        if (rv < 0) {
            std::cout << "error on ps_start_utt" << std::endl;
        }
        setStartNextTime(false);
        if (detectionBuffer != 0) {
            free(detectionBuffer);
        }
        detectionBufferSize = 65536;
        detectionBuffer = (int16*)malloc(detectionBufferSize*sizeof(int16));
        detectionBufferPos = 0;
    }

    if (frames+detectionBufferPos > detectionBufferSize) {
        detectionBufferSize *= 2;
        detectionBuffer = (int16*)realloc(detectionBuffer, detectionBufferSize*sizeof(int16));
    }

    memcpy(detectionBuffer+detectionBufferPos, buf, frames*sizeof(int16));

    ps_process_raw(currentDecoder, detectionBuffer+detectionBufferPos, (size_t)frames, 0, 1);
    detectionBufferPos += frames;
    if (endNextTime()) {
        int rv = ps_end_utt(currentDecoder);
        int32 score = 0;
        char const *hyp = ps_get_hyp(currentDecoder, &score);
        if (hyp != NULL) {
            std::cout << "got " << hyp << " with score " << score << " and prob " << ps_get_prob(currentDecoder) << std::endl;
        } else {
            std::cout << "no hyp " << std::endl;
        }
        setSpaceDown(false);
        setEndNextTime(false);
    }
}
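
The press/feed/release flow above can also be sketched without manual malloc/realloc by letting a std::vector grow the recording buffer. This is only an illustration: `DecoderHooks` and `UtteranceRecorder` are hypothetical names, and the three callbacks stand in for ps_start_utt, ps_process_raw and ps_end_utt so that the sketch compiles without pocketsphinx. In the pocketsphinx headers I am aware of, the last parameter of ps_process_raw is full_utt, which marks the block as a complete utterance; the sketch passes false for partial blocks, which is an assumption worth checking against the installed version rather than a confirmed fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the decoder calls (not pocketsphinx API).
struct DecoderHooks {
    std::function<void()> start;                                        // e.g. ps_start_utt
    std::function<void(const std::int16_t *, std::size_t, bool)> process; // e.g. ps_process_raw
    std::function<void()> end;                                          // e.g. ps_end_utt
};

// Hypothetical push-to-talk recorder: buffers and forwards audio between
// press() and release().
class UtteranceRecorder {
public:
    explicit UtteranceRecorder(DecoderHooks hooks) : hooks_(std::move(hooks)) {}

    void press() {
        if (!active_) {
            active_ = true;
            startPending_ = true;
        }
    }

    void release() {
        if (active_) {
            endPending_ = true;
        }
    }

    // Call once per audio block, e.g. from the audio callback.
    void onAudio(const std::int16_t *samples, std::size_t frames) {
        assert(samples != nullptr || frames == 0);
        if (!active_) {
            return;  // key not held: ignore the block
        }
        if (startPending_) {
            hooks_.start();
            buffer_.clear();
            startPending_ = false;
        }
        // The vector grows as needed, replacing the manual realloc logic.
        buffer_.insert(buffer_.end(), samples, samples + frames);
        // Each block is only part of the utterance, so fullUtt is false.
        hooks_.process(samples, frames, false);
        if (endPending_) {
            hooks_.end();
            active_ = false;
            endPending_ = false;
        }
    }

    const std::vector<std::int16_t> &recorded() const { return buffer_; }
    bool active() const { return active_; }

private:
    DecoderHooks hooks_;
    std::vector<std::int16_t> buffer_;
    bool active_ = false;
    bool startPending_ = false;
    bool endPending_ = false;
};
```

As in the original code, the flags are set from the input thread and read from the audio thread; a real implementation would need to make that access thread-safe, which this sketch does not attempt.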

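After the pocketsphinx error message above, I get the "no hyp" output. I've compared it with my small C test file over and over, and the only differences are that a) I read the data from the microphone instead of a file, and b) it runs in a thread.

Any ideas?

Edit:

Here is the pocketsphinx log: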

    INFO: cmd_ln.c(697): Parsing command line:
\
    -hmm /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k \
    -lm /usr/local/share/pocketsphinx/model/lm/en/turtle.DMP \
    -dict /usr/local/share/pocketsphinx/model/lm/en/turtle.dic 

Current configuration:
[NAME]      [DEFLT]     [VALUE]
-agc        none        none
-agcthresh  2.0     2.000000e+00
-allphone           
-allphone_ci    no      no
-alpha      0.97        9.700000e-01
-ascale     20.0        2.000000e+01
-aw     1       1
-backtrace  no      no
-beam       1e-48       1.000000e-48
-bestpath   yes     yes
-bestpathlw 9.5     9.500000e+00
-ceplen     13      13
-cmn        current     current
-cmninit    8.0     8.0
-compallsen no      no
-debug              0
-dict               /usr/local/share/pocketsphinx/model/lm/en/turtle.dic
-dictcase   no      no
-dither     no      no
-doublebw   no      no
-ds     1       1
-fdict              
-feat       1s_c_d_dd   1s_c_d_dd
-featparams         
-fillprob   1e-8        1.000000e-08
-frate      100     100
-fsg                
-fsgusealtpron  yes     yes
-fsgusefiller   yes     yes
-fwdflat    yes     yes
-fwdflatbeam    1e-64       1.000000e-64
-fwdflatefwid   4       4
-fwdflatlw  8.5     8.500000e+00
-fwdflatsfwin   25      25
-fwdflatwbeam   7e-29       7.000000e-29
-fwdtree    yes     yes
-hmm                /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k
-input_endian   little      little
-jsgf               
-keyphrase          
-kws                
-kws_plp    1e-1        1.000000e-01
-kws_threshold  1       1.000000e+00
-latsize    5000        5000
-lda                
-ldadim     0       0
-lifter     0       0
-lm             /usr/local/share/pocketsphinx/model/lm/en/turtle.DMP
-lmctl              
-lmname             
-logbase    1.0001      1.000100e+00
-logfn              
-logspec    no      no
-lowerf     133.33334   1.333333e+02
-lpbeam     1e-40       1.000000e-40
-lponlybeam 7e-29       7.000000e-29
-lw     6.5     6.500000e+00
-maxhmmpf   30000       30000
-maxwpf     -1      -1
-mdef               
-mean               
-mfclogdir          
-min_endfr  0       0
-mixw               
-mixwfloor  0.0000001   1.000000e-07
-mllr               
-mmap       yes     yes
-ncep       13      13
-nfft       512     512
-nfilt      40      40
-nwpen      1.0     1.000000e+00
-pbeam      1e-48       1.000000e-48
-pip        1.0     1.000000e+00
-pl_beam    1e-10       1.000000e-10
-pl_pbeam   1e-10       1.000000e-10
-pl_pip     1.0     1.000000e+00
-pl_weight  3.0     3.000000e+00
-pl_window  5       5
-rawlogdir          
-remove_dc  no      no
-remove_noise   yes     yes
-remove_silence yes     yes
-round_filters  yes     yes
-samprate   16000       1.600000e+04
-seed       -1      -1
-sendump            
-senlogdir          
-senmgau            
-silprob    0.005       5.000000e-03
-smoothspec no      no
-svspec             
-tmat               
-tmatfloor  0.0001      1.000000e-04
-topn       4       4
-topn_beam  0       0
-toprule            
-transform  legacy      legacy
-unit_area  yes     yes
-upperf     6855.4976   6.855498e+03
-uw     1.0     1.000000e+00
-vad_postspeech 50      50
-vad_prespeech  10      10
-vad_threshold  2.0     2.000000e+00
-var                
-varfloor   0.0001      1.000000e-04
-varnorm    no      no
-verbose    no      no
-warp_params            
-warp_type  inverse_linear  inverse_linear
-wbeam      7e-29       7.000000e-29
-wip        0.65        6.500000e-01
-wlen       0.025625    2.562500e-02

INFO: cmd_ln.c(697): Parsing command line:
\
    -nfilt 20 \
    -lowerf 1 \
    -upperf 4000 \
    -wlen 0.025 \
    -transform dct \
    -round_filters no \
    -remove_dc yes \
    -remove_noise no \
    -svspec 0-12/13-25/26-38 \
    -feat 1s_c_d_dd \
    -agc none \
    -cmn current \
    -cmninit 45,-3,1 \
    -varnorm no 

Current configuration:
[NAME]      [DEFLT]     [VALUE]
-agc        none        none
-agcthresh  2.0     2.000000e+00
-alpha      0.97        9.700000e-01
-ceplen     13      13
-cmn        current     current
-cmninit    8.0     45,-3,1
-dither     no      no
-doublebw   no      no
-feat       1s_c_d_dd   1s_c_d_dd
-frate      100     100
-input_endian   little      little
-lda                
-ldadim     0       0
-lifter     0       0
-logspec    no      no
-lowerf     133.33334   1.000000e+00
-ncep       13      13
-nfft       512     512
-nfilt      40      20
-remove_dc  no      yes
-remove_noise   yes     no
-remove_silence yes     yes
-round_filters  yes     no
-samprate   16000       1.600000e+04
-seed       -1      -1
-smoothspec no      no
-svspec             0-12/13-25/26-38
-transform  legacy      dct
-unit_area  yes     yes
-upperf     6855.4976   4.000000e+03
-vad_postspeech 50      50
-vad_prespeech  10      10
-vad_threshold  2.0     2.000000e+00
-varnorm    no      no
-verbose    no      no
-warp_params            
-warp_type  inverse_linear  inverse_linear
-wlen       0.025625    2.500000e-02

INFO: acmod.c(252): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(171): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: bin_mdef.c(516): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
INFO: acmod.c(124): Attempting to use PTM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: ptm_mgau.c(805): Number of codebooks doesn't match number of ciphones, doesn't look like PTM: 1 != 50
INFO: acmod.c(126): Attempting to use semi-continuous computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(294):  256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(904): Loading senones from dump file /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
INFO: s2_semi_mgau.c(928): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(1023): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1294): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: phone_loop_search.c(115): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4217 * 32 bytes (131 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /usr/local/share/pocketsphinx/model/lm/en/turtle.dic
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 110 words read
INFO: dict.c(342): Reading filler dictionary: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(345): 11 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 60400 bytes (58 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 60400 bytes (58 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(77): No \data\ mark in LM file
INFO: ngram_model_dmp.c(142): Will use memory-mapped I/O for LM file
INFO: ngram_model_dmp.c(196): ngrams 1=91, 2=212, 3=177
INFO: ngram_model_dmp.c(242):       91 = LM.unigrams(+trailer) read
INFO: ngram_model_dmp.c(288):      212 = LM.bigrams(+trailer) read
INFO: ngram_model_dmp.c(314):      177 = LM.trigrams read
INFO: ngram_model_dmp.c(339):       20 = LM.prob2 entries read
INFO: ngram_model_dmp.c(359):       12 = LM.bo_wt2 entries read
INFO: ngram_model_dmp.c(379):       12 = LM.prob3 entries read
INFO: ngram_model_dmp.c(407):        1 = LM.tseg_base entries read
INFO: ngram_model_dmp.c(463):       91 = ascii word strings read
INFO: ngram_search_fwdtree.c(99): 67 unique initial diphones
INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 15 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 15 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 328
INFO: ngram_search_fwdtree.c(339): after: 67 root, 200 non-root channels, 14 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 0 words
ERROR: "ngram_search.c", line 1141: Couldn't find <s> in first frame
no hyp

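Best answer

I had a similar experience (parsing a file worked fine; switching to the microphone produced the "Couldn't find <s> in first frame" error). I inspected the microphone data stream and found that it had a WAV header. I switched the microphone stream to RAW instead of the default WAV, and it started working. I'm not familiar with RtAudio, but I suspect that either 1) it isn't raw PCM data, or 2) it starts with a header, which sphinx doesn't like.

Note: I'm using arecord to capture the microphone stream on a Raspberry Pi running Raspbian Jessie.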
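
The check described in this answer can be sketched as follows: look at the first bytes of the captured stream and see whether they form a RIFF/WAVE header, and if so, where the PCM data starts. `looksLikeWav` and `wavDataOffset` are hypothetical helper names; the sketch relies only on the RIFF layout ("RIFF" at byte 0, "WAVE" at byte 8, followed by chunks with a 4-byte id and a little-endian 4-byte size) and makes no claim about what RtAudio actually delivers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns true if the buffer begins with a RIFF/WAVE container header.
bool looksLikeWav(const unsigned char *data, std::size_t size) {
    assert(data != nullptr || size == 0);
    return size >= 12 &&
           std::memcmp(data, "RIFF", 4) == 0 &&
           std::memcmp(data + 8, "WAVE", 4) == 0;
}

// Returns the offset of the first PCM byte (just past the "data" chunk
// header), or 0 if the buffer is not a WAV file or has no "data" chunk
// within the given size.
std::size_t wavDataOffset(const unsigned char *data, std::size_t size) {
    if (!looksLikeWav(data, size)) {
        return 0;
    }
    std::size_t pos = 12;  // first chunk follows the 12-byte RIFF header
    while (pos + 8 <= size) {
        if (std::memcmp(data + pos, "data", 4) == 0) {
            return pos + 8;
        }
        // Chunk size is stored little-endian after the 4-byte chunk id.
        std::uint32_t chunkSize =
            static_cast<std::uint32_t>(data[pos + 4]) |
            (static_cast<std::uint32_t>(data[pos + 5]) << 8) |
            (static_cast<std::uint32_t>(data[pos + 6]) << 16) |
            (static_cast<std::uint32_t>(data[pos + 7]) << 24);
        // Skip this chunk; chunks are padded to an even number of bytes.
        pos += 8 + static_cast<std::size_t>(chunkSize) + (chunkSize & 1u);
    }
    return 0;
}
```

If `looksLikeWav` returns true for the first block, the samples passed to the decoder should start at the returned offset rather than at byte 0.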

Regarding "c++ - pocketsphinx simple example works in a basic C test but not when included in a C++ project", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28404163/
