python - PyAudio - Algorithm for converting sound data to a string


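Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don't allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

Closed 6 years ago.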

I'm using PyAudio to record sound and extract data from it. So far I have recorded a sound and displayed it with matplotlib.

import numpy
import pyaudio
import matplotlib.pyplot as plt

FORMAT = pyaudio.paFloat32   # 32-bit float samples
SAMPLEFREQ = 44100           # sample rate in Hz
FRAMESIZE = 1024             # samples per buffer
NOFFRAMES = 220              # number of buffers to record (about 5 seconds)
p = pyaudio.PyAudio()
print('running')

stream = p.open(format=FORMAT, channels=1, rate=SAMPLEFREQ, input=True, frames_per_buffer=FRAMESIZE)
data = stream.read(NOFFRAMES * FRAMESIZE)
# numpy.fromstring is deprecated for binary data; frombuffer reads the raw bytes directly
decoded = numpy.frombuffer(data, dtype=numpy.float32)
for x in decoded:
    if x != 0.0:
        print(x)  # decoded is very large, so print only the first non-zero sample
        break

stream.stop_stream()
stream.close()
p.terminate()
print('done')
plt.plot(decoded)
plt.show()

A sample output of this code:

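My main goal is to make sense of the floats in decoded and convert them to a string. For example, I want to detect whether I recorded "aaa": I want to process the recorded data and finally convert it to aaa. decoded is a huge list of floats, so I can't find a way to process it. I'm open to suggestions about libraries, and about the right algorithm for achieving this.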

It seems to me that I'm using the wrong library, but I couldn't find the right library or method for my goal.
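
On the size of decoded: a common first step with a long sample array is to cut it into fixed-size frames and reduce each frame to one number, such as its RMS energy, so that silence and sound can be told apart. The sketch below is only an illustration of that idea, not speech recognition; the function names and the 0.01 threshold are hypothetical, and the frame size of 1024 simply reuses FRAMESIZE from the question.

```python
import numpy as np


def frame_rms(samples: np.ndarray, frame_size: int = 1024) -> np.ndarray:
    """Return the RMS energy of each full frame of `samples`."""
    n_frames = len(samples) // frame_size
    # Drop the trailing partial frame, then view the signal as (n_frames, frame_size)
    frames = samples[: n_frames * frame_size].reshape(n_frames, frame_size)
    return np.sqrt(np.mean(frames.astype(np.float64) ** 2, axis=1))


def active_frames(samples: np.ndarray, frame_size: int = 1024,
                  threshold: float = 0.01) -> np.ndarray:
    """Return indices of frames whose RMS exceeds `threshold` (an assumed value)."""
    return np.flatnonzero(frame_rms(samples, frame_size) > threshold)
```

Calling active_frames(decoded) would give the frame indices where something was actually recorded, which reduces roughly 225,000 floats to a short list of regions worth examining.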

Best answer

It sounds like you are asking for advice on "speech (audio) to text (string)" conversion using Python. There are some great APIs and Python libraries for speech-to-text conversion:

Getting started with speech recognition and python

Pygrs

SpeechRecognition 3.4.6
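
As a rough sketch of how the recording from the question might be handed to the SpeechRecognition package: the float32 bytes returned by stream.read are first converted to 16-bit PCM, then wrapped in an AudioData object and passed to a recognizer. The helper names float32_to_pcm16 and transcribe are hypothetical, the conversion assumes a little-endian machine and samples in the range [-1.0, 1.0], and recognize_google needs network access.

```python
import array


def float32_to_pcm16(raw: bytes) -> bytes:
    """Convert float32 samples in [-1.0, 1.0] to 16-bit PCM bytes.

    Assumes native byte order is little-endian, as on most desktop machines.
    """
    samples = array.array('f')
    samples.frombytes(raw)
    # Clamp each sample, scale to the int16 range, and truncate to an integer
    pcm = array.array('h', (int(max(-1.0, min(1.0, s)) * 32767) for s in samples))
    return pcm.tobytes()


def transcribe(raw: bytes, sample_rate: int = 44100) -> str:
    """Send recorded float32 audio to a recognizer and return the text."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    # The last argument is the sample width in bytes (2 for 16-bit PCM)
    audio = sr.AudioData(float32_to_pcm16(raw), sample_rate, 2)
    recognizer = sr.Recognizer()
    return recognizer.recognize_google(audio)  # requires an internet connection
```

With the variables from the question, the call would be transcribe(data, SAMPLEFREQ). The import sits inside transcribe so that the conversion helper can be used without the package installed.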

Regarding python - PyAudio - Algorithm for converting sound data to a string, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/37618990/