I am reading a file's audio data as floats and get values like these, for example:
-4,151046E+34
-2,365558E+38
6,068741E+26
-4,141856E+34
-2,179363E+38
1,177772E-04
-1,035052E+34
-1,#QNAN
2,668123E-20
-1,0609E+37
-2,153349E+38
1,105884E-16
-4,25223E+37
-1,#QNAN
-3,718855E+22
-1,695596E+38
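I want to detect when silence starts and ends.
Do these values represent something directly related to the volume, or are they instantaneous values of the curve, sitting at 0 at a quiet moment as in this screenshot, so that I would need to look at quite a few of them to detect silence?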
Best answer
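Silence is a perceptual notion that has the attribute of time. Silence cannot happen in a single instant surrounded by loud audio, because it would not be perceived as silence.
Silence happens when the audio curve sits at, or does not vary much from, the zero crossing point for some perceptible period of time. You cannot have audible audio, then silence lasting just an instant, then audible audio again; that is not silence. In a quiet room your eardrum, or the membrane of a microphone, is not wobbling. As the loudness of the room rises from silence, those surfaces begin to wobble. The plot you show can be thought of as a visualization of this wobble. On that plot, the only silence is the flat-line period at the start.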
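To identify programmatically when silence happens, you need two parameters:
- a maximum height of the audio curve (its absolute amplitude), below which a sample counts as a candidate for silence
- a minimum length of time for which the audio curve must stay below that maximum height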
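Both parameters can be expressed in more familiar units and then converted. The sketch below is only an illustration: the helper names `samplesForDuration` and `amplitudeFromDBFS` are hypothetical, the 44100 Hz sampling rate is an assumption, and the amplitude conversion assumes float samples normalized to the range [-1, 1].

```go
package main

import (
	"fmt"
	"math"
)

// samplesForDuration converts a minimum silence duration in seconds into a
// number of samples for the given sampling rate (hypothetical helper).
func samplesForDuration(sampleRate int, seconds float64) int {
	return int(math.Round(seconds * float64(sampleRate)))
}

// amplitudeFromDBFS converts a level in dB relative to full scale (1.0) into
// a linear amplitude. Assumes samples are normalized to [-1, 1].
func amplitudeFromDBFS(db float64) float64 {
	return math.Pow(10, db/20)
}

func main() {
	sampleRate := 44100 // assumed sampling rate in Hz

	minSamples := samplesForDuration(sampleRate, 0.05) // 50 ms of quiet
	maxVol := amplitudeFromDBFS(-20)                   // -20 dBFS threshold

	fmt.Println("min_num_samples:", minSamples)
	fmt.Printf("max_vol: %.3f\n", maxVol)
}
```

At 44100 Hz, 50 ms corresponds to 2205 samples, and -20 dBFS corresponds to a linear amplitude of 0.1.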
You can start by guessing at these values. Now let's identify when silence happens:
package main

import (
    "fmt"
    "math"
)

func main() {

    // somehow your audio_buffer gets populated
    // (assumed here: float samples normalized to [-1, 1])
    var audio_buffer []float64

    flag_in_candidate_silence := false          // previous sample was quiet
    flag_currently_in_declared_silence := false // current stretch of samples is a declared silence period

    total_num_samples := len(audio_buffer) // identify how many samples

    max_vol := 0.1 // max absolute amplitude that is still a silence candidate

    // minimum number of consecutive quiet samples necessary to declare silence has happened
    // value used is dependent on sampling rate
    min_num_samples := 2000

    curr_num_samples_found := 0
    index_silence_starts := 0
    index_silence_ends := 0

    for curr_sample := 0; curr_sample < total_num_samples; curr_sample++ {

        // use the magnitude: the curve swings both above and below zero
        curr_amplitude := math.Abs(audio_buffer[curr_sample])

        // a NaN sample compares false here, so it is treated as not quiet
        if curr_amplitude < max_vol { // current sample is candidate for silence

            index_silence_ends = curr_sample

            if !flag_in_candidate_silence { // previous sample was not a candidate
                index_silence_starts = curr_sample
            }

            flag_in_candidate_silence = true
            curr_num_samples_found++ // increment counter of current stretch of silence candidates

            if curr_num_samples_found >= min_num_samples {
                // we are inside a period of silence
                flag_currently_in_declared_silence = true
            }

        } else {

            if flag_currently_in_declared_silence {
                fmt.Println("found silence stretch of samples from ", index_silence_starts, " to ", index_silence_ends)
            }

            flag_in_candidate_silence = false
            flag_currently_in_declared_silence = false
            curr_num_samples_found = 0
        }
    }

    // the buffer may end while still inside a declared silence
    if flag_currently_in_declared_silence {
        fmt.Println("found silence stretch of samples from ", index_silence_starts, " to ", index_silence_ends)
    }
}
(Code is untested, written straight off the top of my head.)
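Since the code above is untested, it may help to wrap the same idea in a function that can be run on a small synthetic buffer. This is a minimal sketch, not the answer's own code: `findSilences` and `silenceRange` are hypothetical names, the comparison uses the absolute value of each sample, and the input is assumed to be normalized floats. NaN samples compare as not quiet, so values such as those listed in the question would end any silent run.

```go
package main

import (
	"fmt"
	"math"
)

// silenceRange holds the first and last sample index (inclusive) of one silent stretch.
type silenceRange struct{ start, end int }

// findSilences returns every run of at least minSamples consecutive samples
// whose absolute value stays below maxVol.
func findSilences(samples []float64, maxVol float64, minSamples int) []silenceRange {
	var found []silenceRange
	runStart := -1 // -1 means "not currently inside a quiet run"

	for i, s := range samples {
		quiet := math.Abs(s) < maxVol // false for NaN, so NaN counts as loud
		if quiet {
			if runStart < 0 {
				runStart = i // first quiet sample of a new run
			}
			continue
		}
		// a loud sample ends the current run, if there is one
		if runStart >= 0 && i-runStart >= minSamples {
			found = append(found, silenceRange{runStart, i - 1})
		}
		runStart = -1
	}

	// a quiet run may extend to the end of the buffer
	if runStart >= 0 && len(samples)-runStart >= minSamples {
		found = append(found, silenceRange{runStart, len(samples) - 1})
	}
	return found
}

func main() {
	// synthetic buffer: 5 quiet samples, 3 loud samples, 4 quiet samples
	buf := []float64{0, 0.01, -0.02, 0, 0.01, 0.8, -0.9, 0.7, 0, 0, -0.01, 0}

	for _, r := range findSilences(buf, 0.1, 4) {
		fmt.Println("found silence stretch of samples from", r.start, "to", r.end)
	}
}
```

With a threshold of 0.1 and a minimum run of 4 samples, this reports two stretches: samples 0 to 4 and samples 8 to 11. For real audio, the minimum run would be in the thousands of samples, depending on the sampling rate.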
Source: the Stack Overflow question on detecting silence from float audio values: https://stackoverflow.com/questions/51771510/