python - Can someone help me find the error in my IBM speech-to-text code?

Tags: python websocket speech-to-text ibm-watson

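I am using websockets to send requests to IBM's speech-to-text API, and I keep getting a broken pipe error. The IBM Speech to Text API documentation says it can accept frames of up to 4 MB, but I can only send it about 70 KB before the connection breaks. https://www.ibm.com/watson/developercloud/doc/speech-to-text/websockets.html#WSopen Also, if I send a file smaller than 70 KB (about 5 seconds of audio), the connection does not break, but nothing is returned.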

    import websocket
    from requests import get
    import user_info
    import json
    import time
    import threading

    api_token = "https://stream.watsonplatform.net/authorization/api/v1/token"
    s2t_url = "wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize"
    s2t_model = 'es-ES_BroadbandModel'
    mb_chunk = 1024*50
    # https://pypi.python.org/pypi/websocket-clien*
    # https://www.ibm.com/watson/developercloud/doc/speech-to-text/websockets.html


    # -------
    # on_open
    # -------
    def on_open(ws):
        """
        Called by the websocket after it is
        opened; sends metadata about the sound file
        """
        print("--------------WebSocket is open--------------")
        message = {
            'action': 'start',
            'content-type': 'audio/wav'
        }
        #def send_binary(*args):
        ws.send(json.dumps(message))
        i = 0
        with open("Deepak2_hwv4122_uncompressed.wav", "rb") as wav:
            # while True:
            piece = wav.read(mb_chunk)
            ws.send(piece)
            print(i)
            i+=1
            if not piece:
                #break
                pass
            wav.close()
            # ws.close()
        #t = threading.Thread(target=send_binary)
        #t.start()


    # ----------
    # on_message
    # ----------
    def on_message(ws, message):
        print("------------------MESSAGE------------------")
        print(message)


    # --------
    # on_error
    # --------
    def on_error(ws, error):
        print(error)
        print("------------------ERROR------------------")
    # --------
    # on_close
    # --------
    def on_close(ws):
        print("------------Connection is Closed-----------")
        ws.close()

    # ----------------
    # get_token
    # ----------------
    def get_token():
        """
        REST request to get the watson voice service API token
        """
        url = api_token + "?url=" + user_info.AUTH['url']
        print("URL: " + url)
        res = get(url, auth=(user_info.AUTH['username'], user_info.AUTH['password']))
        print('Auth Token: ' + res.text)
        return res.text


    # ----
    # main
    # ----
    if __name__ == "__main__":
        global ws_url
        cur_token = get_token()
        ws_url = s2t_url + '?watson-token=' + cur_token + '&model=' + s2t_model
        print("ws_uri: " + ws_url)

        # Start WebSocket Connection
        websocket.enableTrace(True)
        ws = websocket.WebSocketApp(ws_url, on_message=on_message, on_error=on_error, on_close=on_close)
        ws.on_open = on_open
        ws.run_forever()

The error I am getting is:

    [Errno 32] Broken pipe
      File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_app.py", line 268, in _callback
        callback(self, *args)
      File "watson-test.py", line 35, in on_open
        ws.send(piece)
      File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_app.py", line 117, in send
        if not self.sock or self.sock.send(data, opcode) == 0:
      File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_core.py", line 234, in send
        return self.send_frame(frame)
      File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_core.py", line 259, in send_frame
        l = self._send(data)
      File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_core.py", line 423, in _send
        return send(self.sock, data)
      File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_socket.py", line 116, in send
        return sock.send(data)
      File "/usr/lib/python3.5/ssl.py", line 861, in send
        return self._sslobj.write(data)
      File "/usr/lib/python3.5/ssl.py", line 586, in write
        return self._sslobj.write(data)

Best answer

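I took a quick look at your code and found a missing piece: after pushing all the audio in the on_open method, you never signal the end of the audio stream. You can signal the end of the audio by sending either an empty binary message or a text message containing the string {"action": "stop"}, as described here: https://www.ibm.com/watson/developercloud/doc/speech-to-text/websockets.html I believe that is why you are not getting any results. Also make sure you do not close the websocket before the server has replied with the final results.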
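As a rough illustration of the point above, here is a minimal sketch that sends the start message, streams the file in chunks, and then sends the stop message. It assumes the `ws` object exposes `send(data, opcode=...)` the way websocket-client's `WebSocketApp` does. `stream_audio`, `is_final_result` and `RecordingSocket` are hypothetical names invented for this example, and the response shape checked in `is_final_result` (a `results` list whose entries carry a `final` flag) is an assumption about the service's JSON replies, not something shown in the question.

```python
import io
import json

# RFC 6455 binary opcode; websocket.ABNF.OPCODE_BINARY has the same value.
OPCODE_BINARY = 0x2


def stream_audio(ws, audio, chunk_size=50 * 1024):
    """Send 'start', every audio chunk as a binary frame, then 'stop'."""
    ws.send(json.dumps({"action": "start", "content-type": "audio/wav"}))
    chunks_sent = 0
    while True:
        piece = audio.read(chunk_size)
        if not piece:  # end of file: nothing left to send
            break
        ws.send(piece, opcode=OPCODE_BINARY)
        chunks_sent += 1
    # Tell the service the audio is complete so it can return final results.
    ws.send(json.dumps({"action": "stop"}))
    return chunks_sent


def is_final_result(message):
    """Assumed reply shape: {"results": [{"final": true, ...}]}."""
    data = json.loads(message)
    return any(result.get("final") for result in data.get("results", []))


class RecordingSocket:
    """Hypothetical stand-in for a WebSocketApp so the sketch runs offline."""

    def __init__(self):
        self.frames = []

    def send(self, data, opcode=0x1):
        self.frames.append((opcode, data))


fake = RecordingSocket()
chunks = stream_audio(fake, io.BytesIO(b"\x00" * 120000))
```

In a real client, `stream_audio(ws, wav)` would be called from `on_open` with the opened file, and `on_message` would call `ws.close()` only once `is_final_result(message)` returns true.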

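Thanks to Sayuri Mizuguchi for the answer. The code I wrote is actually hosted at https://github.com/watson-developer-cloud/speech-to-text-websockets-python , which is a simple example of interacting with Watson STT over websockets. That project is being integrated into the Watson Python SDK here: https://github.com/watson-developer-cloud/python-sdk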

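As for converting to base64, you only need to make sure the audio is sent as a binary message; websocket stacks generally let you send either text messages or binary messages.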
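To make the text-versus-binary distinction concrete, the sketch below picks a frame type from the payload: control messages go out as JSON text, raw audio goes out unchanged as binary. `frame_for` is a hypothetical helper written for this example; the opcode values are the ones defined in RFC 6455, which websocket-client exposes as `ABNF.OPCODE_TEXT` and `ABNF.OPCODE_BINARY`. The base64 size is computed only to show the overhead that sending raw bytes avoids.

```python
import base64
import json

OPCODE_TEXT = 0x1    # RFC 6455 text frame
OPCODE_BINARY = 0x2  # RFC 6455 binary frame


def frame_for(payload):
    """Return (opcode, data): bytes go out as binary, anything else as JSON text."""
    if isinstance(payload, (bytes, bytearray)):
        return OPCODE_BINARY, bytes(payload)
    return OPCODE_TEXT, json.dumps(payload)


audio_chunk = b"\x00" * (50 * 1024)

audio_opcode, audio_data = frame_for(audio_chunk)
control_opcode, control_data = frame_for({"action": "stop"})

# base64 would inflate the same chunk by roughly a third.
encoded_size = len(base64.b64encode(audio_chunk))
```

With websocket-client, the result would be passed along as `ws.send(audio_data, opcode=audio_opcode)`.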

Regarding "python - Can someone help me find the error in my IBM speech-to-text code?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44509361/
