python - Detecting taps with pyaudio from a live microphone

Tags: python microphone pyaudio

How can I use pyaudio to detect a sudden tapping sound from a live microphone?

Best answer

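One approach:

  • Read a block of samples at a time, say 0.05 seconds' worth
  • Compute the RMS amplitude of the block (the square root of the mean of the squares of the individual samples)
  • If the block's RMS amplitude is greater than a threshold, it is a "noisy block"; otherwise it is a "quiet block"
  • A sudden tap is a quiet block, followed by a small number of noisy blocks, followed by a quiet block
  • If you never get a quiet block, your threshold is too low
  • If you never get a noisy block, your threshold is too high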
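
The quiet/noisy/quiet pattern in the list above can be sketched on a plain list of per-block RMS values, with no audio hardware involved. The function name find_taps and the default of 3 blocks for the longest allowed tap are assumptions made for illustration only.

```python
def find_taps(rms_values, threshold, max_tap_blocks=3):
    """Return the index of the first noisy block of every detected tap.

    A tap is a short run of noisy blocks (at most max_tap_blocks long)
    that has a quiet block on both sides.
    """
    taps = []
    noisy_run = 0        # length of the current run of noisy blocks
    seen_quiet = False   # a tap must be preceded by a quiet block
    for i, rms in enumerate(rms_values):
        if rms > threshold:
            noisy_run += 1
        else:
            # A quiet block closes the run; decide whether it was a tap.
            if seen_quiet and 1 <= noisy_run <= max_tap_blocks:
                taps.append(i - noisy_run)
            noisy_run = 0
            seen_quiet = True
    return taps


# Two quiet blocks, a 2-block tap, two quiet blocks, then a 5-block noise
# that is too long to count as a tap.
rms_values = [0.002, 0.003, 0.2, 0.15, 0.004, 0.003,
              0.3, 0.3, 0.3, 0.3, 0.3, 0.002]
print(find_taps(rms_values, threshold=0.010))  # prints [2]
```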

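My application records "interesting" noises unattended, so it records for as long as there are noisy blocks. It multiplies the threshold by 1.1 if there has been a 15-second noisy period ("covering its ears") and by 0.9 if there has been a 15-minute quiet period ("listening harder"). Your application will have different needs.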
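
A minimal sketch of that self-adjusting threshold, assuming 0.05-second blocks. The class name AdaptiveThreshold and its parameters are hypothetical, and resetting the run counter after each adjustment (so the threshold moves once per period) is an assumption of this sketch, not something the answer specifies.

```python
BLOCK_SECONDS = 0.05  # length of one block, as suggested in the answer


class AdaptiveThreshold:
    """Hypothetical helper that raises or lowers a noise threshold over time."""

    def __init__(self, threshold, noisy_limit_s=15.0, quiet_limit_s=15 * 60.0):
        self.threshold = threshold
        # Convert the time limits into whole numbers of blocks.
        self.noisy_limit = round(noisy_limit_s / BLOCK_SECONDS)
        self.quiet_limit = round(quiet_limit_s / BLOCK_SECONDS)
        self.noisy_run = 0
        self.quiet_run = 0

    def update(self, rms):
        """Classify one block, adapt the threshold, and return True if noisy."""
        noisy = rms > self.threshold
        if noisy:
            self.quiet_run = 0
            self.noisy_run += 1
            if self.noisy_run >= self.noisy_limit:
                self.threshold *= 1.1   # "cover the ears"
                self.noisy_run = 0      # assumption: adjust once per period
        else:
            self.noisy_run = 0
            self.quiet_run += 1
            if self.quiet_run >= self.quiet_limit:
                self.threshold *= 0.9   # "listen harder"
                self.quiet_run = 0      # assumption: adjust once per period
        return noisy
```

Calling update() once per block with that block's RMS amplitude keeps the threshold tracking the background noise level.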

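Also, I just noticed some comments in my code about observed RMS values. On the built-in microphone of a MacBook Pro, with audio data normalized to the range +/- 1.0 and the input volume set to maximum, some data points:

  • 0.003-0.006 (-50dB to -44dB): the (rather loud) central heating fan in my house
  • 0.010-0.40 (-40dB to -8dB): typing on the same laptop
  • 0.10 (-20dB): snapping fingers softly at 1' distance
  • 0.60 (-4.4dB): snapping fingers loudly at 1'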
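
The dB figures in the list above are consistent with the usual amplitude conversion, 20 * log10(rms), taken relative to a full scale of 1.0. A small sketch, where the function name rms_to_db is an assumption:

```python
import math


def rms_to_db(rms):
    """Convert an RMS amplitude on a +/- 1.0 scale to dB relative to full scale.

    rms must be greater than zero; the logarithm of 0 is undefined.
    """
    return 20.0 * math.log10(rms)


# Reproduce a few of the data points quoted above.
for rms in (0.003, 0.010, 0.10, 0.60):
    print("%.3f -> %.1f dB" % (rms, rms_to_db(rms)))
```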

Update: here's a sample to get you started.

#!/usr/bin/python

# open a microphone in pyAudio and listen for taps

import pyaudio
import struct
import math

INITIAL_TAP_THRESHOLD = 0.010
FORMAT = pyaudio.paInt16 
SHORT_NORMALIZE = (1.0/32768.0)
CHANNELS = 2
RATE = 44100  
INPUT_BLOCK_TIME = 0.05
INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
# if we get this many noisy blocks in a row, increase the threshold
OVERSENSITIVE = 15.0/INPUT_BLOCK_TIME                    
# if we get this many quiet blocks in a row, decrease the threshold
UNDERSENSITIVE = 120.0/INPUT_BLOCK_TIME 
# if the noise was longer than this many blocks, it's not a 'tap'
MAX_TAP_BLOCKS = 0.15/INPUT_BLOCK_TIME

def get_rms( block ):
    # RMS amplitude is defined as the square root of the 
    # mean over time of the square of the amplitude.
    # so we need to convert this string of bytes into 
    # a string of 16-bit samples...

    # we will get one short out for each
    # two bytes in the block; use integer division
    # so that count stays an int on Python 3.
    count = len(block) // 2
    fmt = "%dh" % count
    shorts = struct.unpack( fmt, block )

    # iterate over the block.
    sum_squares = 0.0
    for sample in shorts:
        # sample is a signed short in +/- 32768. 
        # normalize it to 1.0
        n = sample * SHORT_NORMALIZE
        sum_squares += n*n

    return math.sqrt( sum_squares / count )

class TapTester(object):
    def __init__(self):
        self.pa = pyaudio.PyAudio()
        self.stream = self.open_mic_stream()
        self.tap_threshold = INITIAL_TAP_THRESHOLD
        self.noisycount = MAX_TAP_BLOCKS+1 
        self.quietcount = 0 
        self.errorcount = 0

    def stop(self):
        # stop and close the stream, then release PortAudio
        self.stream.stop_stream()
        self.stream.close()
        self.pa.terminate()

    def find_input_device(self):
        device_index = None            
        for i in range( self.pa.get_device_count() ):     
            devinfo = self.pa.get_device_info_by_index(i)   
            print( "Device %d: %s"%(i,devinfo["name"]) )

            for keyword in ["mic","input"]:
                if keyword in devinfo["name"].lower():
                    print( "Found an input: device %d - %s"%(i,devinfo["name"]) )
                    device_index = i
                    return device_index

        if device_index is None:
            print( "No preferred input found; using default input device." )

        return device_index

    def open_mic_stream( self ):
        device_index = self.find_input_device()

        stream = self.pa.open(   format = FORMAT,
                                 channels = CHANNELS,
                                 rate = RATE,
                                 input = True,
                                 input_device_index = device_index,
                                 frames_per_buffer = INPUT_FRAMES_PER_BLOCK)

        return stream

    def tapDetected(self):
        print("Tap!")

    def listen(self):
        try:
            block = self.stream.read(INPUT_FRAMES_PER_BLOCK)
        except IOError as e:
            # the read failed (e.g. an input overflow);
            # count the error and skip this block.
            self.errorcount += 1
            print( "(%d) Error recording: %s"%(self.errorcount,e) )
            self.noisycount = 1
            return

        amplitude = get_rms( block )
        if amplitude > self.tap_threshold:
            # noisy block
            self.quietcount = 0
            self.noisycount += 1
            if self.noisycount > OVERSENSITIVE:
                # turn down the sensitivity
                self.tap_threshold *= 1.1
        else:            
            # quiet block.

            if 1 <= self.noisycount <= MAX_TAP_BLOCKS:
                self.tapDetected()
            self.noisycount = 0
            self.quietcount += 1
            if self.quietcount > UNDERSENSITIVE:
                # turn up the sensitivity
                self.tap_threshold *= 0.9

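if __name__ == "__main__":
    tt = TapTester()

    # 1000 blocks of 0.05 s each is about 50 seconds of listening
    try:
        for i in range(1000):
            tt.listen()
    finally:
        # always release the audio device
        tt.stop()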
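
To check the RMS computation without a microphone, one option is to pack a synthetic signal into the same byte layout that a paInt16 stream returns and confirm the result matches the expected value (a sine wave of amplitude A has an RMS of A divided by the square root of 2). The helper names rms_of_block and sine_block, the 400 Hz test tone, and the mono layout are assumptions of this sketch.

```python
import math
import struct


def rms_of_block(block):
    """RMS of a byte string of signed 16-bit samples, scaled to +/- 1.0."""
    count = len(block) // 2
    samples = struct.unpack("%dh" % count, block)
    return math.sqrt(sum((s / 32768.0) ** 2 for s in samples) / count)


def sine_block(amplitude, freq_hz=400.0, rate=44100, seconds=0.05):
    """Pack a sine wave of the given amplitude (0.0 to 1.0) into 16-bit samples."""
    n = round(rate * seconds)
    samples = [int(round(amplitude * 32767 *
                         math.sin(2 * math.pi * freq_hz * i / rate)))
               for i in range(n)]
    return struct.pack("%dh" % n, *samples)


# 400 Hz over 0.05 s is exactly 20 full periods, so the RMS should be
# very close to amplitude / sqrt(2).
print(rms_of_block(sine_block(0.5)), 0.5 / math.sqrt(2))
```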

For more on python - detecting taps with pyaudio from a live microphone, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4160175/
