This is my function:
void printStatistics(const char *current) {
    int count = 0, i = 0, length = strlen(current);
    int lowercaseLetters[26] = {0};
    int uppercaseLetters[26] = {0};
    char *token;
    for (i = 0; i < length; i++) {
        if (current[i] >= 'a' & current[i] <= 'z') {
            lowercaseLetters[current[i] - 'a']++;
        }
    }
    for (i = 0; i < length; i++) {
        if (current[i] >= 'A' & current[i] <= 'Z') {
            uppercaseLetters[current[i] - 'A']++;
        }
    }
    char tempToken[10] = "";
    strcpy(tempToken, current);
    token = strtok(tempToken, " ");
    while (token != NULL) {
        token = strtok(NULL, " ");
        count++;
    }
    printf("Statistics:\n"
           "\tlength:\t\t%d\n"
           "\tword:\t\t%d\n"
           "Frequency:\n", length, count);
    printf("Printing Uppercase matrix...\n");
    for (i = 0; i < 26; i++) {
        printf("\tfrequency of %c:\t%d\n", 'a' + i, uppercaseLetters[i]);
    }
    printf("Printing Lowercase matrix...\n");
    for (i = 0; i < 26; i++) {
        printf("\tfrequency of %c:\t%d\n", 'a' + i, lowercaseLetters[i]);
    }
}
This is the result I get when I try to check a string:
Statistics:
length: 74
word: 2
Frequency:
Printing Uppercase matrix...
frequency of a: 1734829927
frequency of b: 1734829927
frequency of c: 1107322727
frequency of d: 1111638594
frequency of e: 1111638594
frequency of f: 1111638594
frequency of g: 1111638594
frequency of h: 1111638594
frequency of i: 1111638594
frequency of j: 1111638594
frequency of k: 1111638594
frequency of l: 1111638594
frequency of m: 1111638594
frequency of n: 1111638594
frequency of o: 1111638594
frequency of p: 1111638594
frequency of q: 0
frequency of r: 0
frequency of s: 0
frequency of t: 0
frequency of u: 0
frequency of v: 0
frequency of w: 0
frequency of x: 0
frequency of y: 0
frequency of z: 0
Printing Lowercase matrix...
frequency of a: 0
frequency of b: 0
frequency of c: 0
frequency of d: 0
frequency of e: 0
frequency of f: 0
frequency of g: 20
frequency of h: 0
frequency of i: 0
frequency of j: 0
frequency of k: 0
frequency of l: 0
frequency of m: 0
frequency of n: 0
frequency of o: 0
frequency of p: 0
frequency of q: 0
frequency of r: 0
frequency of s: 0
frequency of t: 0
frequency of u: 0
frequency of v: 0
frequency of w: 0
frequency of x: 0
frequency of y: 0
frequency of z: 0
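Why do I get these strange long numbers in the uppercase matrix? It doesn't look like I'm indexing outside the uppercase array; I handle it in exactly the same way as the lowercase array.
What am I doing wrong here?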
Best answer
You are causing undefined behaviour by writing past the end of a buffer. The main problem is here:
char tempToken[10] = "";
strcpy(tempToken, current);
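Since you don't check the length of the string in current before copying it into tempToken, you are very likely to exceed the limit of 9 characters (plus one extra for the terminating '\0' byte) and corrupt memory allocated to other data.
In your case, this is what the stack looks like when the program calls printStatistics(): (but see the note below)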
+--------------------+--------------------------+--------------------------+--------------
| char tempToken[10] | int uppercaseLetters[26] | int lowercaseLetters[26] | token, etc...
+--------------------+--------------------------+--------------------------+--------------
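When you copy the string gggggggggggggggggggg BBBBBBBBBBBBBB... into tempToken, the first ten characters completely fill that array, and the rest are written into the uppercaseLetters array instead. So when you read data back from that array, you are actually reading those ASCII characters (1734829927 == 0x67676767 == "gggg"; 1111638594 == 0x42424242 == "BBBB").
If you copied a longer string, you would also overwrite lowercaseLetters, and then other variables (token, etc.).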
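The strncpy() function lets you limit how many bytes are copied, which avoids this kind of overflow. You should use it here, remembering that it does not add a terminating '\0' when the source is too long, so you must terminate the buffer yourself.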
Also, as others have pointed out, you are using the bitwise AND operator & where the logical AND operator && is needed.
- Note: Other systems and other compilers will store things differently, and will misbehave in other ways. Your code simply crashed when compiled on my computer.
Regarding "c - Frequency of characters in C - strange numbers", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39885417/