c - Speech recognition using a Kohonen network with MFCC features. How do I set the distance between neurons and their weights?

I don't know how to set the position of each neuron in the map. Here are the neuron and the map:

typedef struct _neuron
{
    mfcc_frame *frames;
    char *name;
    double *weights;
    int num_weights;
    int x;
    int y;
} neuron;
typedef struct _map
{
    neuron *lattice;
    int latice_size;
    double mapRadius;
    int sideX, sideY; 
    int scale;
} map;

If I have several samples of the same word, how do I compute the distance between an input pattern (a word) and my neurons?

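I'm unsure about the weights. I defined the weights as the number of MFCC features of a word, but during training I need to update these weights based on the distance between neurons. I use the Euclidean distance between neurons. The problem is how to update the weights. Here is the code that initializes the map and the neurons: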

void init_neuron(neuron *n, int x, int y, mfcc_frame *mfcc_frames, unsigned int n_frames, char *name){
    unsigned int i, j;

    n->frames = mfcc_frames;
    /* One weight per MFCC coefficient of every frame. */
    n->num_weights = n_frames * N_MFCC;
    n->x = x;
    n->y = y;

    /* +1 for the terminating '\0' that strcpy writes. */
    n->name = malloc(strlen(name) + 1);
    strcpy(n->name, name);
    n->weights = malloc(n->num_weights * sizeof(double));

    /* Store every coefficient; indexing by i alone would keep only the last one per frame. */
    for (i = 0; i < n_frames; i++)
        for (j = 0; j < N_MFCC; j++)
            n->weights[i * N_MFCC + j] = mfcc_frames[i].features[j];

    printf("%s lattice %d, %d\n", n->name, n->x, n->y);
}

Initializing the map:

map* init_map(int sideX, int sideY, int scale){
    int x, y;
    unsigned int i, n = 0, count = 0;
    void **word_adresses;
    word *words = malloc(sizeof(word));

    map *_map = malloc(sizeof(map));
    _map->latice_size = sideX * sideY;
    _map->sideX       = sideX;
    _map->sideY       = sideY;
    _map->scale       = scale;
    /* calloc zeroes the cells that no word lands on, instead of leaving garbage. */
    _map->lattice     = calloc(_map->latice_size, sizeof(neuron));
    mt_seed ();

    if ((n = get_list(words))){
        word_adresses = malloc(n * sizeof(void *));
        while (words != NULL){
            /* Each word is placed on a random cell; two words may hit the same cell. */
            x = mt_rand() % sideX;
            y = mt_rand() % sideY;
            printf("y : %d  x: %d\n", y, x);
            init_neuron(_map->lattice + y * sideX + x, x, y, words->frames, words->n, words->name);

            word_adresses[count++] = words;
            words = words->next;
        }
        /* Release the list nodes once their data has been handed to the neurons. */
        for (i = 0; i < count; i++)
            free(word_adresses[i]);
        free(word_adresses);
    }

    return _map;
}

Best answer

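In a Kohonen SOM, the weights live in feature space, so each neuron holds one prototype vector. If the input is 12 MFCCs, each input looks like a vector of 12 double values, which means each neuron has 12 values, one per MFCC. Given an input, you find the best matching unit, then move that neuron's 12 codebook values a small amount toward the input vector, according to the learning rate.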

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38949078/