android - PocketSphinx for an Android dictation app

Tags: android speech-recognition pocketsphinx pocketsphinx-android

I am trying to implement a "dictation" feature using PocketSphinx on Android together with one of Keith Vertanen's language models. I modified the sample to look like this:

private void setupRecognizer(File assetsDir) throws IOException {
    // Configure the decoder with the acoustic model and dictionary from the assets.
    recognizer = defaultSetup()
            .setAcousticModel(new File(assetsDir, "en-us-ptm"))
            .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
            .setRawLogDir(assetsDir)
            .setKeywordThreshold(1e-45f)
            .setBoolean("-allphone_ci", true)
            .getRecognizer();
    recognizer.addListener(this);

    // Register the ARPA language model as an n-gram search.
    File ngramModel = new File(assetsDir, "lm_csr_5k_nvp_2gram.arpa");
    recognizer.addNgramSearch(NGRAM_SEARCH, ngramModel);
}

Here lm_csr_5k_nvp_2gram.arpa comes from the 5K NVP 2-gram download on Keith Vertanen's website.

I get this error:

01-31 18:04:29.861 2837-2863/? I/SpeechRecognizer: Load N-gram model /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/lm_csr_5k_nvp_2gram.arpa
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(399): Trying to read LM in trie binary format
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(410): Header doesn't match
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
01-31 18:04:29.862 2837-2863/? E/cmusphinx: ERROR: "ngram_model_trie.c", line 103: Bad ngram count
01-31 18:04:29.862 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(489): Trying to read LM in DMP format
01-31 18:04:29.862 2837-2863/? E/cmusphinx: ERROR: "ngram_model_trie.c", line 500: Wrong magic header size number a5c6461: /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/lm_csr_5k_nvp_2gram.arpa is not a dump file
01-31 18:04:29.864 2837-2863/? E/AndroidRuntime: FATAL EXCEPTION: AsyncTask #1
                                                 Process: edu.cmu.sphinx.pocketsphinx, PID: 2837
                                                 java.lang.RuntimeException: An error occurred while executing doInBackground()
                                                     at android.os.AsyncTask$3.done(AsyncTask.java:309)
                                                     at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:354)
                                                     at java.util.concurrent.FutureTask.setException(FutureTask.java:223)
                                                     at java.util.concurrent.FutureTask.run(FutureTask.java:242)
                                                     at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:234)
                                                     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113)
                                                     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588)
                                                     at java.lang.Thread.run(Thread.java:818)
                                                  Caused by: java.lang.RuntimeException: Decoder_setLmFile returned -1
                                                     at edu.cmu.pocketsphinx.PocketSphinxJNI.Decoder_setLmFile(Native Method)
                                                     at edu.cmu.pocketsphinx.Decoder.setLmFile(Decoder.java:172)
                                                     at edu.cmu.pocketsphinx.SpeechRecognizer.addNgramSearch(SpeechRecognizer.java:247)
                                                     at edu.cmu.pocketsphinx.demo.PocketSphinxActivity.setupRecognizer(PocketSphinxActivity.java:161)
                                                     at edu.cmu.pocketsphinx.demo.PocketSphinxActivity.access$000(PocketSphinxActivity.java:50)
                                                     at edu.cmu.pocketsphinx.demo.PocketSphinxActivity$1.doInBackground(PocketSphinxActivity.java:72)
                                                     at edu.cmu.pocketsphinx.demo.PocketSphinxActivity$1.doInBackground(PocketSphinxActivity.java:66)
                                                     at android.os.AsyncTask$2.call(AsyncTask.java:295)
                                                     at java.util.concurrent.FutureTask.run(FutureTask.java:237)
                                                     at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:234) 
                                                     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113) 
                                                     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588) 
                                                     at java.lang.Thread.run(Thread.java:818) 

The lines

01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
01-31 18:04:29.862 2837-2863/? E/cmusphinx: ERROR: "ngram_model_trie.c", line 103: Bad ngram count

make me think that the lm_csr_5k_nvp_2gram.arpa file is malformed in some way. The file looks like this:

\data\
ngram 1=5000
ngram 2=4331397
ngram 3=0

\1-grams:
-2.11154    </s>    0
-99 <s> -3.13167
-0.3954594  <unk>   -0.4365645
-2.271447   a   -2.953606
-3.384721   a.  -1.85196
-5.788997   a.'s    -0.8137056
-4.139672   abandoned   -0.9728376
-3.904189   ability -1.838658
-4.360272   able    -2.161723
...

That at least looks like the example file here.

My only other idea is that the file extension might be wrong, because of this:

Language model can be stored and loaded in three different format - text ARPA format, binary format BIN and binary DMP format. ARPA format takes more space but it is possible to edit it. ARPA files have .lm extension. Binary format takes significantly less space and faster to load. Binary files have .lm.bin extension. It is also possible to convert between formats. DMP format is obsolete and not recommended.

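That makes it sound as if the file should be named lm_csr_5k_nvp_2gram.lm rather than lm_csr_5k_nvp_2gram.arpa. However, I did try renaming the file, and the exception did not change at all.

What is the correct way to do this?

Best answer

OK, this is a problem with the model format. This line in the n-gram model causes the problem: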

ngram 3=0

You can either delete the offending line or update pocketsphinx-android-demo; I have just pushed a new version that fixes this issue.
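
With more than four million bigram entries the file is awkward to edit by hand, so the deletion can also be scripted. The following is only a minimal sketch, assuming the sole problem is a zero-count entry such as "ngram 3=0" in the \data\ header. The class name ArpaHeaderFixer and the method stripEmptyOrders are hypothetical and not part of PocketSphinx. It copies every other line unchanged and does not remove an empty \3-grams: section if the file happens to contain one.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PocketSphinx.
public class ArpaHeaderFixer {
    // Matches header count lines whose count is exactly zero, e.g. "ngram 3=0".
    private static final Pattern ZERO_COUNT =
            Pattern.compile("ngram\\s+\\d+\\s*=\\s*0\\s*");

    /** Copies an ARPA model line by line, dropping zero-count lines from the \data\ header. */
    public static void stripEmptyOrders(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        boolean inHeader = false;
        String line;
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (trimmed.equals("\\data\\")) {
                inHeader = true;
            } else if (inHeader && trimmed.startsWith("\\")) {
                // The first "\N-grams:" marker ends the header.
                inHeader = false;
            }
            if (inHeader && ZERO_COUNT.matcher(trimmed).matches()) {
                continue; // skip the offending count line
            }
            writer.write(line);
            writer.write('\n');
        }
        writer.flush();
    }
}
```

It would probably be run once on a desktop machine, for example with a java.io.FileReader on the downloaded file and a java.io.FileWriter on a new file, before copying the result into the app's assets.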

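In general, dictation on a phone is not trivial, because phones are really slow. I would not recommend a 2-gram model; it is better to use a heavily pruned 3-gram model. You can prune it with SRILM.

You can also read the optimization doc for other things to tune.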

Source: Stack Overflow, https://stackoverflow.com/questions/35123338/
