go - How to get WAV audio from the microphone in Go

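My program uses the Go bindings for the Vosk speech recognition library, which accepts audio as a byte slice of mono WAV data. The program currently shells out to the external command arecord to get WAV audio from the microphone, but I would prefer to do this properly in Go, ideally without any shared library dependencies.

I tried the malgo package, but got stuck on converting the raw audio from the microphone to WAV. The WAV encoding packages I found can only write to a file (io.WriteSeeker), whereas I need to convert a continuous stream from the microphone for real-time speech recognition.

It needs to work on Linux at least.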

Best answer

I ended up using malgo as well, with malgo.FormatS16.

The bytes are produced in this callback:

    // Based on https://github.com/gen2brain/malgo/blob/master/_examples/capture/capture.go
    // pSampleData is a []byte declared outside the callback; it accumulates the captured audio.
    onRecvFrames := func(pOutputSample, pInputSamples []byte, framecount uint32) {
        // Empirically, len(pInputSamples) is 480, so for sample rate 44100 it's triggered about every 10ms.
        // sampleCount := framecount * deviceConfig.Capture.Channels * sizeInBytes
        pSampleData = append(pSampleData, pInputSamples...)
    }
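How often the callback fires follows from the chunk size and the negotiated device format, using the same quantities as the commented-out sampleCount line. The sketch below is a rough way to check that arithmetic; chunkDuration is a hypothetical helper, not part of malgo, and it assumes 2 bytes per sample for malgo.FormatS16.

```go
package main

import (
	"fmt"
	"time"
)

// chunkDuration reports how much audio a captured chunk of numBytes holds.
// Hypothetical helper: bytesPerSample is assumed to be 2 for malgo.FormatS16.
func chunkDuration(numBytes, sampleRate, channels, bytesPerSample int) time.Duration {
	frames := numBytes / (channels * bytesPerSample)
	return time.Duration(frames) * time.Second / time.Duration(sampleRate)
}

func main() {
	// 480 bytes of mono S16 audio at 44100 Hz is 240 frames, roughly 5.4ms.
	fmt.Println(chunkDuration(480, 44100, 1, 2))
	// 480 frames (960 bytes) of mono S16 audio at 48000 Hz is exactly 10ms.
	fmt.Println(chunkDuration(960, 48000, 1, 2))
}
```

The real interval depends on the sample rate, channel count and period size the device actually negotiates, which the answer does not state, so treat these figures as illustrations only.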

That can be converted to ints (I used GPT-4 for this):

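    // Requires: import "encoding/binary"
    func twoByteDataToIntSlice(audioData []byte) []int {
        intData := make([]int, len(audioData)/2)
        // i+1 < len guards against a trailing odd byte.
        for i := 0; i+1 < len(audioData); i += 2 {
            // FormatS16 samples are signed little-endian 16-bit values, so go
            // through int16 before widening; otherwise negative samples would
            // come out as large positive numbers.
            value := int(int16(binary.LittleEndian.Uint16(audioData[i : i+2])))
            intData[i/2] = value
        }
        return intData
    }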
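Samples in FormatS16 are signed, so the decoded uint16 has to pass through int16 before it is widened to int; otherwise a sample such as -1 would show up as 65535. A small standalone check of that behaviour, using the hypothetical name pcm16ToInts and hand-picked test bytes:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pcm16ToInts decodes little-endian signed 16-bit PCM into ints.
// A trailing odd byte, if any, is ignored.
func pcm16ToInts(pcm []byte) []int {
	out := make([]int, 0, len(pcm)/2)
	for i := 0; i+1 < len(pcm); i += 2 {
		out = append(out, int(int16(binary.LittleEndian.Uint16(pcm[i:]))))
	}
	return out
}

func main() {
	// 0xFFFF is -1, 0x8000 is -32768, 0x0001 is 1 in signed 16-bit.
	pcm := []byte{0xFF, 0xFF, 0x00, 0x80, 0x01, 0x00}
	fmt.Println(pcm16ToInts(pcm)) // [-1 -32768 1]
}
```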

Then "github.com/go-audio/wav" is used to produce the WAV bytes in memory (GPT-4 again came up with the in-memory file system hack to get around the io.WriteSeeker requirement):

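    // Create an in-memory file to satisfy the io.WriteSeeker that wav.NewEncoder
    // needs in order to finalize the headers. fs is not defined in this snippet;
    // it is presumably an in-memory file system (for example afero.NewMemMapFs()).
    inMemoryFilename := "in-memory-output.wav"
    inMemoryFile, err := fs.Create(inMemoryFilename)
    dbg(err) // dbg is the author's error helper, not shown here
    // We will call Close ourselves.

    // Convert audio data to IntBuffer
    inputBuffer := &audio.IntBuffer{Data: intData, Format: &audio.Format{SampleRate: iSampleRate, NumChannels: iNumChannels}}

    // Create a new WAV encoder
    bitDepth := 16
    audioFormat := 1 // 1 = uncompressed PCM
    wavEncoder := wav.NewEncoder(inMemoryFile, iSampleRate, bitDepth, iNumChannels, audioFormat)

    // Write the samples, then Close so the encoder fills in the header sizes.
    dbg(wavEncoder.Write(inputBuffer))
    dbg(wavEncoder.Close())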
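When the captured PCM for one chunk is already in memory, its length is known up front, so an alternative is to write the 44-byte RIFF/WAVE header by hand and skip the io.WriteSeeker entirely. This is a minimal sketch using only the standard library, not the approach the answer took; wavFromPCM and its parameters are hypothetical names, and it assumes uncompressed PCM (format tag 1).

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// wavFromPCM prepends a canonical 44-byte RIFF/WAVE header to raw PCM bytes.
// Assumption: pcm is uncompressed little-endian PCM, e.g. from malgo.FormatS16.
func wavFromPCM(pcm []byte, sampleRate, channels, bitsPerSample int) []byte {
	var buf bytes.Buffer
	le := binary.LittleEndian
	blockAlign := channels * bitsPerSample / 8
	byteRate := sampleRate * blockAlign

	buf.WriteString("RIFF")
	binary.Write(&buf, le, uint32(36+len(pcm))) // file size minus the first 8 bytes
	buf.WriteString("WAVE")
	buf.WriteString("fmt ")
	binary.Write(&buf, le, uint32(16)) // size of the fmt chunk
	binary.Write(&buf, le, uint16(1))  // format tag 1 = PCM
	binary.Write(&buf, le, uint16(channels))
	binary.Write(&buf, le, uint32(sampleRate))
	binary.Write(&buf, le, uint32(byteRate))
	binary.Write(&buf, le, uint16(blockAlign))
	binary.Write(&buf, le, uint16(bitsPerSample))
	buf.WriteString("data")
	binary.Write(&buf, le, uint32(len(pcm)))
	buf.Write(pcm)
	return buf.Bytes()
}

func main() {
	pcm := make([]byte, 32000) // one second of 16 kHz mono 16-bit silence
	wavBytes := wavFromPCM(pcm, 16000, 1, 16)
	fmt.Println(len(wavBytes), string(wavBytes[:4])) // 32044 RIFF
}
```

Because the sizes are written before the data, this only works per finished chunk; a truly open-ended stream would need either one header per chunk or a consumer that accepts raw PCM.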

I developed these snippets while trying to put together the kind of thing you are after, a streaming voice assistant [WIP]: https://github.com/Petrzlen/vocode-golang

Source: the Stack Overflow question at https://stackoverflow.com/questions/76096551/
