java - Incorrect highlighting in a custom TextToSpeechService

Tags: java android audio text-to-speech wav

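I extended TextToSpeechService from the Android API to build my own custom TTS service. The service fetches its data from a TTS server. The server returns an audio buffer together with several lists that describe where each highlight starts in the text and at what time in the audio. The problem is that my highlight moves too early to the next word the TTS engine is about to speak. I can't find what is causing this. I suspected audioPositionMillis might be wrong, but as far as I can tell the calculation is correct. My impression is that audioPositionMillis runs about 700 ms ahead. I am probably overlooking something small.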

    @Override
    protected synchronized void onSynthesizeText(SynthesisRequest request, SynthesisCallback callback) {

        // Note that we call onLoadLanguage here since there is no guarantee
        // that there was a prior call to this function.
        int load = onLoadLanguage(request.getLanguage(), request.getCountry(), request.getVariant());

        // We might get requests for a language we don't support - in which case
        // we error out early before wasting too much time.
        if (load == TextToSpeech.LANG_NOT_SUPPORTED) {
            callback.error();
            return;
        }

        String ttsText = request.getCharSequenceText().toString();
        final int speechRate = mapSpeechRate(request.getSpeechRate());
        TtsParams ttsParams = new TtsParams(ttsText, currentVoice, speechRate, VOLUME,
                TIME_BETWEEN_SENTENCES_MILLIS, BIT_RATE, TtsParams.Format.WAV);

        try {
            TtsInfo data = null;
            Response<TtsInfo> response = serviceManager.getTtsInfo(ttsParams); //Synchronous call because methods executed on the synthesisCallback need to be called on the synth thread.
            if(response != null){
                data = response.body();
            }

            if(data == null){
                callback.error();
                return;
            }

            //Response does not make any sense to me, we modify its data
            List<Integer> wordPositionsMs = data.getAudioPos();
            List<Integer> wordStartPositions = data.getCharPos();
            List<Integer> wordLengths = data.getCharCount();

            wordStartPositions.add(0, 0);
            wordStartPositions.remove(wordStartPositions.size() - 1);

            wordPositionsMs.add(0, 102); //First word always starts at 102ms according to the docs
            wordPositionsMs.remove(wordPositionsMs.size() - 1); // drop the last entry, indexed by this list's own size

            callback.start(SAMPLING_RATE_HZ, AudioFormat.ENCODING_PCM_16BIT, CHANNEL_COUNT);
            int maxBufferSize = callback.getMaxBufferSize();
            byte[] audioBuffer = Base64.decode(data.getByteArray(), Base64.DEFAULT);
            int offset = 0;
            while (offset < audioBuffer.length) {
                int bytesToWrite = Math.min(maxBufferSize, audioBuffer.length - offset);
                if(callback.audioAvailable(audioBuffer, offset, bytesToWrite) != TextToSpeech.SUCCESS){
                    callback.error();
                    return;
                }

                if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
                    long audioPositionMillis = Math.round(offset / ((SAMPLING_RATE_HZ/1000D) * CHANNEL_COUNT * (BIT_DEPTH/8D)));
                    int wordIndex = -1;
                    for (int i = 0; i < wordPositionsMs.size(); i++) {
                        if (audioPositionMillis > wordPositionsMs.get(i)) {
                            wordIndex++;
                        } else {
                            break;
                        }
                    }

                    if (wordIndex > -1) {
                        int wordStart = wordStartPositions.get(wordIndex);
                        int wordLength = wordLengths.get(wordIndex);
                        callback.rangeStart(-1, wordStart, wordStart + wordLength);
                    }
                }

                offset += bytesToWrite;
            }
            callback.done();
        } catch (IOException | NoNetworkException e) {
            e.printStackTrace();
            callback.error();
        }
    }
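
To make the timing logic in the question easier to check in isolation, here is a minimal sketch of the two calculations it performs: converting a byte offset in raw PCM to milliseconds, and finding the last word whose start time has already passed. The class name HighlightLookup and both method names are hypothetical, and the sketch assumes the buffer holds plain PCM samples with a bit depth that is a multiple of 8.

```java
import java.util.List;

// Hypothetical helper, not part of the Android API or of the question's code.
final class HighlightLookup {

    /** Converts a byte offset into raw PCM data to milliseconds of audio. */
    static long bytesToMillis(long byteOffset, int sampleRateHz, int channelCount, int bitDepth) {
        // One frame holds one sample per channel (assumes bitDepth is a multiple of 8).
        long bytesPerFrame = (long) channelCount * (bitDepth / 8);
        long frames = byteOffset / bytesPerFrame;
        return Math.round(frames * 1000.0 / sampleRateHz);
    }

    /**
     * Returns the index of the last word whose start time lies strictly before
     * positionMs, or -1 if no word has started yet. Mirrors the loop in the
     * question and assumes wordStartsMs is sorted in ascending order.
     */
    static int wordIndexAt(long positionMs, List<Integer> wordStartsMs) {
        int index = -1;
        for (int i = 0; i < wordStartsMs.size(); i++) {
            if (positionMs > wordStartsMs.get(i)) {
                index = i;
            } else {
                break;
            }
        }
        return index;
    }
}
```

With 16 kHz, mono, 16-bit audio (assumed values, the question does not state its constants), a byte offset of 32000 corresponds to 1000 ms. Note that this value describes how much audio has been handed to audioAvailable so far, which is not necessarily the part that is audible at that moment, so the arithmetic can be correct while the highlight still appears early.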

Best answer

I was passing -1 as the markerInFrames argument to the rangeStart callback method, and that caused the problem.

Solution:

callback.rangeStart((int)(offset/(BIT_DEPTH/8D)), wordStart, wordStart + wordLength);
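
The expression in the solution divides the byte offset by the bytes per sample, which equals the frame position only for mono audio, since a frame contains one sample per channel. The following is a small sketch of a more general conversion; FrameMarker and its methods are hypothetical names, and the sketch assumes raw PCM with a bit depth that is a multiple of 8.

```java
// Hypothetical helper, not part of the Android API or of the answer's code.
final class FrameMarker {

    /** Converts a byte offset into raw PCM data to a frame index. */
    static int byteOffsetToFrames(long byteOffset, int channelCount, int bitDepth) {
        // A frame is one sample for every channel.
        int bytesPerFrame = channelCount * (bitDepth / 8);
        return (int) (byteOffset / bytesPerFrame);
    }

    /** Converts a time position in milliseconds to a frame index. */
    static int millisToFrames(long positionMs, int sampleRateHz) {
        return (int) (positionMs * sampleRateHz / 1000);
    }
}
```

In the question's loop this could be used as callback.rangeStart(FrameMarker.byteOffsetToFrames(offset, CHANNEL_COUNT, BIT_DEPTH), wordStart, wordStart + wordLength). A possible variation, not taken from the answer, is to place the marker at the word's own start time with millisToFrames(wordPositionsMs.get(wordIndex), SAMPLING_RATE_HZ), so that the highlight does not depend on where a chunk boundary happens to fall.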

This question, java - Incorrect highlighting in a custom TextToSpeechService, comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/60638540/
