python-3.x - Feature extraction using MFCC

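I would like to know how to take an audio signal (x.wav) and extract features from it using MFCC. I already know the steps involved in extracting audio features with MFCC; what I am looking for is the detailed Python code, in a project that uses the Django framework.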

Best answer

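This is the most important step in building a speech recognizer: once the speech signal has been converted to the frequency domain, it must be turned into feature vectors that a model can actually use.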

import matplotlib.pyplot as plt
from scipy.io import wavfile
from python_speech_features import mfcc, logfbank

# Read the WAV file: returns the sampling rate (Hz) and the samples
frequency_sampling, audio_signal = wavfile.read(
    "/home/user/Downloads/OSR_us_000_0010_8k.wav")

# Keep only the first 15000 samples for analysis
audio_signal = audio_signal[:15000]

# Compute the MFCC features (one row per analysis window)
features_mfcc = mfcc(audio_signal, frequency_sampling)

print('\nMFCC:\nNumber of windows =', features_mfcc.shape[0])
print('Length of each feature =', features_mfcc.shape[1])

# Transpose so that time runs along the horizontal axis, then plot
features_mfcc = features_mfcc.T
plt.matshow(features_mfcc)
plt.title('MFCC')

# Compute the log filter bank features for the same signal
filterbank_features = logfbank(audio_signal, frequency_sampling)

print('\nFilter bank:\nNumber of windows =', filterbank_features.shape[0])
print('Length of each feature =', filterbank_features.shape[1])

# Transpose and plot the filter bank features, then show both figures
filterbank_features = filterbank_features.T
plt.matshow(filterbank_features)
plt.title('Filter bank')
plt.show()
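To make sense of the "Number of windows" value printed above, it can help to work out how many analysis windows a signal of a given length is split into. The following is only a rough sketch: `count_frames` is a hypothetical helper, the 25 ms window and 10 ms step are the library's documented defaults, the 8 kHz sampling rate is merely guessed from the "8k" in the file name, and the formula assumes the last partial window is zero-padded rather than dropped.

```python
import math


def count_frames(num_samples, sample_rate, winlen=0.025, winstep=0.01):
    """Estimate the number of analysis windows for a signal.

    Hypothetical helper, not part of python_speech_features. Assumes the
    final partial window is padded rather than discarded.
    """
    frame_len = round(winlen * sample_rate)    # window length in samples
    frame_step = round(winstep * sample_rate)  # hop between windows in samples
    if num_samples <= frame_len:
        return 1
    return 1 + math.ceil((num_samples - frame_len) / frame_step)


# 15000 samples at an assumed 8 kHz: 200-sample windows, 80-sample step
print(count_frames(15000, 8000))  # 186
```

Under these assumptions, the MFCC matrix for the truncated signal would have 186 rows; the real value depends on the file's actual sampling rate.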

Alternatively, you can use this code to extract the features:

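import numpy as np
from sklearn import preprocessing
import python_speech_features as mfcc

def extract_features(audio, rate):
    """Extract 20-dim MFCC features from an audio signal, perform CMS by
    scaling them, and append the deltas to make a 40-dim feature vector."""
    mfcc_feature = mfcc.mfcc(audio, rate, 0.025, 0.01, 20, nfft=1200, appendEnergy=True)
    # Standardize each coefficient to zero mean and unit variance
    mfcc_feature = preprocessing.scale(mfcc_feature)
    # The original snippet called calculate_delta(), which it never defined;
    # the library's own delta() is substituted here (N=2 is an assumption)
    delta = mfcc.delta(mfcc_feature, 2)
    # Stack the MFCCs and their deltas side by side: 20 + 20 = 40 columns
    combined = np.hstack((mfcc_feature, delta))
    return combined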
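The snippet above appends delta features to the MFCCs. As a rough illustration of what a delta computation does, here is a minimal pure-Python sketch of the usual regression-based formula, in which each delta is a weighted slope over the `n` frames on either side. The names `simple_delta` and `append_deltas` are hypothetical, and repeating the first and last frames at the edges is an assumption; this is not necessarily the helper the answer had in mind.

```python
def simple_delta(features, n=2):
    """Regression-based deltas for a list of frames (each a list of floats).

    Hypothetical illustration: frames beyond either end of the sequence are
    replaced by the first or last frame.
    """
    num_frames = len(features)
    denom = 2 * sum(i * i for i in range(1, n + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(len(features[t])):
            acc = 0.0
            for i in range(1, n + 1):
                ahead = features[min(t + i, num_frames - 1)][k]
                behind = features[max(t - i, 0)][k]
                acc += i * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def append_deltas(features, n=2):
    """Concatenate each frame with its delta, doubling the feature length."""
    return [frame + d for frame, d in zip(features, simple_delta(features, n))]


# Two coefficients that rise linearly over six frames (slopes 1 and 2)
frames = [[float(t), 2.0 * t] for t in range(6)]
combined = append_deltas(frames)
print(len(combined[0]))  # 4: two coefficients plus two deltas
```

For a coefficient that changes linearly over time, the delta of an interior frame equals the slope; frames near the edges give smaller values because of the repeated boundary frames.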

Regarding python-3.x - feature extraction using MFCC, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54160128/
