python - Google Speech-to-Text API, InvalidArgument: 400 Must use single channel (mono)

Tags: python google-cloud-speech

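I keep getting the error InvalidArgument: 400 from Google Speech-to-Text. The problem seems to be that my audio has 2 channels (stereo), while the API expects a mono WAV file.

If I convert a file in an audio editor it would probably work, but I can't use an audio editor to convert a whole batch of files. Is there a way to change the channel layout in Python or in Google Cloud?

Note: I have already tried the wave module, but I keep getting error #7 because the file type is not recognized (I cannot read the WAV file with Python's wave module).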

-ERROR- InvalidArgument: 400 Must use single channel (mono) audio, but WAV header indicates 2 channels.

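Best answer

Assuming you are using the google-cloud-speech library, you can set the audio_channel_count property of your RecognitionConfig to the number of channels in the input audio data (it defaults to one channel, mono). You can do it like this: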

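from google.cloud import speech

client = speech.SpeechClient()

# Written for google-cloud-speech 2.x; in 1.x these types live under speech.types.
audio = speech.RecognitionAudio(uri='gs://your-bucket/recording.wav')
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    language_code='en-US',
    sample_rate_hertz=44100,
    audio_channel_count=2,  # must match the channel count in the WAV header
)
results = client.recognize(config=config, audio=audio)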

See the API doc for more information.
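
If you would rather send mono audio, as the question originally asked, the files can also be downmixed in Python before upload. The following is only a minimal sketch using the standard library: it assumes the input is 16-bit PCM stereo WAV that the wave module can open (the asker reports that wave rejects their files, so it may not apply to them as-is), and stereo_to_mono is a hypothetical helper name, not part of any Google library.

```python
import array
import sys
import wave


def stereo_to_mono(src_path, dst_path):
    """Downmix a 16-bit PCM stereo WAV file to mono by averaging both channels."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM input")
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # WAV samples are little-endian and interleaved: L0, R0, L1, R1, ...
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()

    # Average each left/right pair into a single mono sample.
    mono = array.array(
        "h",
        ((left + right) // 2 for left, right in zip(samples[0::2], samples[1::2])),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
```

For a batch, the helper can be called in a loop over the input files, for example over pathlib.Path(folder).glob('*.wav'). Files in other encodings would need a different tool for the conversion.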

A similar question about "python - Google Speech-to-Text API, InvalidArgument: 400 Must use single channel (mono)" can be found on Stack Overflow: https://stackoverflow.com/questions/55106509/
