首先我要说的是,我了解非常基本的 C 语言。我已经使用 tutorialspoint 中的代码实现了一个哈希表。网站。
我的程序正在从文本文件中读取正整数,将数字及其在文件中出现的次数分别作为键和值。实际上它工作得很好,但我面临一个大问题:我需要事先知道哈希表的大小。我注意到,当我跳过负数时,它会奇怪地弄乱表格的结构。
这是我的代码(如果使用大小为 20):
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>
#define SIZE 20
struct DataItem {
int key;
int data;
};
struct DataItem* hashArray[SIZE];
struct DataItem* item;
int hashCode(int key) {
return key % SIZE;
}
struct DataItem *search(int key) {
int hashIndex = hashCode(key);
while(hashArray[hashIndex] != NULL) {
if(hashArray[hashIndex]->key == key)
return hashArray[hashIndex];
++hashIndex;
hashIndex %= SIZE;
}
return NULL;
}
void insert(int key,int data) {
struct DataItem *item = (struct DataItem*) malloc(sizeof(struct DataItem));
item->data = data;
item->key = key;
int hashIndex = hashCode(key);
while(hashArray[hashIndex] != NULL && hashArray[hashIndex]->key != -1) {
++hashIndex;
hashIndex %= SIZE;
}
hashArray[hashIndex] = item;
}
void display() {
int key_index;
for(key_index = 0; key_index<SIZE; key_index++) {
if(hashArray[key_index] != NULL) {
printf("%d has occurred %d time(s)\n", hashArray[key_index]->key, hashArray[key_index]->data);
}
}
printf("\n");
}
void sort(size_of_struct) {
int i, tempkey, tempdata;
if(size_of_struct == 1) {
return;
}
for(i=0; i<(size_of_struct-1); i++) {
if(hashArray[i] != NULL) {
if(hashArray[i]->key > hashArray[i+1]->key) {
tempkey = hashArray[i]->key;
hashArray[i]->key = hashArray[i+1]->key;
hashArray[i+1]->key = tempkey;
tempdata = hashArray[i]->data;
hashArray[i]->data = hashArray[i+1]->data;
hashArray[i+1]->data = tempdata;
}
}
}
sort(size_of_struct-1);
}
int main() {
FILE *myFile;
myFile = fopen("test.txt", "r");
int *numberArray = NULL;
int i, count, number, current_num, temp;
count = number = temp = 0;
numberArray = malloc(sizeof(int)*500);
while(fscanf(myFile, "%d", &number) > 0) {
if(number >= 0 && number <= 100) {
numberArray[count] = number;
count++;
}
}
for(i=0; i < count; i++)
{
item = search(numberArray[i]);
if(item == NULL) {
insert(numberArray[i], 1);
} else {
item = search(numberArray[i]);
item->data += 1;
}
free(numberArray);
numberArray = malloc(sizeof(int)*500);
}
display();
sort(SIZE);
}
如果我运行此代码,我会得到以下结果:
80 has occurred 1 time(s)
100 has occurred 1 time(s)
2 has occurred 1 time(s)
3 has occurred 2 time(s)
84 has occurred 1 time(s)
12 has occurred 1 time(s)
33 has occurred 1 time(s)
99 has occurred 3 time(s)
Segmentation fault: 11
如果我使用的 SIZE 为 8(这是我从文件中使用的数字总数),那么我会得到以下结果:
80 has occurred 1 time(s)
33 has occurred 1 time(s)
2 has occurred 1 time(s)
99 has occurred 3 time(s)
3 has occurred 2 time(s)
12 has occurred 1 time(s)
100 has occurred 1 time(s)
84 has occurred 1 time(s)
2 has occurred 1 time(s)
3 has occurred 2 time(s)
12 has occurred 1 time(s)
33 has occurred 1 time(s)
80 has occurred 1 time(s)
84 has occurred 1 time(s)
99 has occurred 3 time(s)
100 has occurred 1 time(s)
经过进一步调查,我注意到我的显示功能中发生了这种情况:
key_index: 0
80 has occurred 1 time(s)
key_index: 1
100 has occurred 1 time(s)
key_index: 2
2 has occurred 1 time(s)
key_index: 3
3 has occurred 2 time(s)
key_index: 4
84 has occurred 1 time(s)
key_index: 5
key_index: 6
key_index: 7
key_index: 8
key_index: 9
key_index: 10
key_index: 11
key_index: 12
12 has occurred 1 time(s)
key_index: 13
33 has occurred 1 time(s)
key_index: 14
key_index: 15
key_index: 16
key_index: 17
key_index: 18
key_index: 19
99 has occurred 3 time(s)
Segmentation fault: 11
我不完全确定发生了什么。我知道为什么会发生段错误:它试图引用该索引处为空的值。我不知道的是为什么(在显示函数内部)它会跳过 5-11 和 14-18 中的 key_index 值;这是我真正关心的问题。我假设因为我没有使用文件中的所有数字,所以该结构正在尝试填充它,以便始终有 20 个空格。我错了吗?有什么办法可以解决这个问题吗?
我的 test.txt 文件:
99 2 99
3
80 12 -12 33
3 99 100 1234 84
最佳答案
嘿,看起来你那里有一个段错误。该死,追踪这些可能非常困难。如@underscore_d提到,您可以通过调试器运行程序以在代码运行时检查代码。
但是,幸运的是,您提供了(大部分)Minimal, Complete and Verifiable例子给我们看一下。我提供了自己的 test.txt
文件,但能够重现崩溃。现在,这可能不是您看到的崩溃,因为我没有您的 test.txt
,但希望以下分析无论如何都能帮助您。
由于您的示例很短,您还可以考虑通过一些分析工具来运行您的代码。我定期通过 AddressSanitizer 运行我的所有代码和 Valgrind帮助避免引入此类错误。在您的情况下,AddressSanitizer 会查明错误的位置。
$ gcc -fsanitize=address -g test.c -o test
$ ./test
80 has occurred 1 time(s)
ASAN:DEADLYSIGNAL
=================================================================
==18300==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x561f6050b39f bp 0x7ffe4d6866b0 sp 0x7ffe4d686690 T0)
#0 0x561f6050b39e in sort /home/v1bri/test.c:67
#1 0x561f6050bae2 in main /home/v1bri/test.c:111
#2 0x7f60938ee3f0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x203f0)
#3 0x561f6050acb9 in _start (/home/v1bri/test+0xcb9)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/v1bri/test.c:67 in sort
==18300==ABORTING
错误发生在 sort()
函数的第 67 行。果然...
void sort(size_of_struct) {
// ...
for(i=0; i<(size_of_struct-1); i++) {
if(hashArray[i] != NULL) {
if(hashArray[i]->key > hashArray[i+1]->key) {
在取消引用 hashArray[i+1]
处的指针时,您忘记检查 NULL
。 Valgrind 会给你更多的错误,但也会找到罪魁祸首。
$ gcc -g test.c -o test
$ valgrind --tool=memcheck ./test
==18764== Memcheck, a memory error detector
==18764== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==18764== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
<snip>
==18764== Invalid read of size 4
==18764== at 0x108B0D: sort (test.c:67)
==18764== by 0x108DA7: main (test.c:111)
==18764== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==18764==
==18764==
==18764== Process terminating with default action of signal 11 (SIGSEGV)
==18764== Access not within mapped region at address 0x0
==18764== at 0x108B0D: sort (test.c:67)
==18764== by 0x108DA7: main (test.c:111)
我知道您提到过您了解“非常基础的 C 语言”,但希望您能看到使用现代调试工具来测试程序错误是多么容易。对于新程序员来说,学习工具可能会让人望而生畏,但这实际上是学习的最佳时机,因为它将为您节省大量时间。
关于C 哈希表大小问题,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/47579219/