c - 使用动态分配的数组递归合并排序大量整数会引发错误

我编写了一个 C 程序来使用动态分配的数组对整数进行合并排序(递归)。它适用于最多 100k 个整数，但当我输入 100 万个整数时，它会抛出 Segmentation fault (core dumped) 错误。

为什么要这样做？我的 16GB RAM 不够好吗？如果我不使用动态分配的数组，它是否能够对更大数量的整数进行排序？

动态分配究竟是如何工作的？根据我的理解，当动态变量或动态数组中的元素被声明时，内存的一部分(RAM？)被搁置并设置为严格存储声明的变量，直到内存被释放。

当我的程序试图留出内存来容纳一百万个整数时，它是否会因为没有足够的可用内存而失败？

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BIL 1E9

//struct Sort allows dynamic allocation of the array used in insertion sort.
typedef struct {
    int *arr; //pointer to the dynamic array
    size_t used; //stores number of 'used' elements in the array
    size_t size; //stores number of total elements
} Sort;

//function prototypes to interact with the dynamic array
void freeSort(Sort *);
void initSort(Sort *, size_t);
void inSort(Sort *, int);

//prototypes for the Merge Sort
void mergeSort(Sort *, int, int, int []);
void merge(Sort *, int, int, int, int []);
void copyArray(int [], int, int, Sort *);



int main(){

    //declare Sort variable 'magic' to perform the magical insertion sort on the dynamic array.
    Sort magic;
    initSort(&magic, 10); //initialize magic with 10 elements

    //variables to allow the program to function
    int intin;
    char filename[15];

    //tosort is the file to sort.
    //sorted is the output file after sort.
    FILE *tosort, *sorted;

    //necessary variables to measure time
    struct timespec start, finish;

    //prompt user for file name.
    printf("Enter the name of file with a list of integers to sort: ");
    scanf("%s", filename);
    tosort = fopen(filename, "r"); //read 'tosort' file

    //write the 'sorted' file to 'filename.sorted'
    sorted = fopen(strcat(filename, ".sorted"), "w");

    //while loop stores every integer in the dynamically allocated magic array from tosort file.
    while (!feof(tosort)) {
        fscanf(tosort, "%d", &intin);
        inSort(&magic, intin);
    }

    //n stores number of integers to sort
    int n = magic.used;
    //temporary array for use with the merge sort
    int sortedArray [n];

    //measure time
    clock_gettime(CLOCK_REALTIME, &start);  //start

    //Merge Sort
    mergeSort(&magic, 0, n, sortedArray);

    clock_gettime(CLOCK_REALTIME, &finish); //finish

    //calculate the elapsed time in nanoseconds.
    double elapsed = (finish.tv_sec-start.tv_sec)+(finish.tv_nsec-start.tv_nsec)/BIL;

    printf("Merge Sort took %lf seconds\n", elapsed);

    //write the sorted array to 'sorted' ('filename'.sorted)
    for (int i = 0; i < n; i++) {
        fprintf(sorted, "%d\n", magic.arr[i]);
    }

    //free up the allocated memory for the Sort array and close the files.
    freeSort(&magic);
    fclose(tosort);
    fclose(sorted);

    return 0;
}

//initialize the dynamic array
void initSort(Sort *dynA, size_t initSize) {
    dynA->arr = (int *)malloc(initSize * sizeof(int));
    dynA->used = 0;
    dynA->size = initSize;
}

//add values to the elements of the dynamic array
void inSort(Sort *dynA, int val) {
    //if the array size is not big enough to fit new values, allocate 100 more elements.
    if (dynA->used == dynA->size) {
        dynA->size += 100;
        dynA->arr = (int *)realloc(dynA->arr, dynA->size * sizeof(int));
    }
    //'used' holds the number of used elements with values in the array.
    dynA->arr[dynA->used++] = val;
}

//free allocated memory for the dynamic array
void freeSort(Sort *dynA) {
  free(dynA->arr);
  dynA->arr = NULL;
  dynA->used = dynA->size = 0;
}


//split the array until size is 1
void mergeSort(Sort *dynA, int begin, int end, int tempA [])
{
    //if size is 1, done splitting.
    if(end-begin < 2)
        return;

    // recursively split the array
    int mid = (end+begin)/2; // mid = middle point
    mergeSort(dynA, begin, mid, tempA); // mergeSort left half
    mergeSort(dynA, mid, end, tempA); // mergeSort right half
    merge(dynA, begin, mid, end, tempA); // merge the two halves
    copyArray(tempA, begin, end, dynA); // copy the merged array to dynA
}

//merge the two arrays
void merge (Sort *dynA, int begin, int mid, int end, int tempA [])
{
    int i = begin; int j = mid;

    //from begin to end, compare the values of the two arrays
    for (int k = begin; k < end; k++)

        // store the smaller value into tempA[k]
        if (j >= end || (i < mid && dynA->arr[i] <= dynA->arr[j]))
            tempA[k] = dynA->arr[i++];
        else tempA[k] = dynA->arr[j++];
}

//copy the contents of the temporary array to the dynamic array
void copyArray(int tempA[], int begin, int end, Sort *dynA){
    for(int k = begin; k < end; k++)
        dynA->arr[k] = tempA[k];
}

当我输入一百万个整数进行排序时，Cygwin64 和 CommandPrompt 给出相同的错误。

最佳答案

您的错误是您正在使用基于大堆栈的 VLA sortedArray。对于 1,000,000 个值，您的数组为 4MB，由于数组超出预设的堆栈大小限制，您会因堆栈溢出而出现段错误。

例如，在 linux 下，堆栈限制约为 8MB [在我的系统上，我不得不将数组计数增加到 3,000,000 以重现段错误]

改变:

    //temporary array for use with the merge sort
    int sortedArray [n];

进入:

    //temporary array for use with the merge sort
    int *sortedArray = malloc(sizeof(int) * n);

可选地，如果您愿意，可以在 main 的底部添加一个 free(sortedArray) 以保持整洁。

此外，您可以使用 limit shell 命令查看 stacksize 参数设置的值。因此，解决此问题的另一种方法是使用命令来增加限制。但是，我不推荐这样做，因为与简单地使用 malloc 相比，它是一种不太通用且更脆弱的解决方案。

关于调试技巧...

为了找到您的问题，我做了以下工作:

使用调试信息编译程序:gcc -o blah -g blah.c

使用 gdb ./blah 调用调试器

使用run运行程序[在调试器内部]

我得到了下面的段错误信息:

Starting program: ...
Enter the name of file with a list of integers to sort: input2

Program received signal SIGSEGV, Segmentation fault.
0x00000000004009d1 in main () at blah.c:64
64      clock_gettime(CLOCK_REALTIME, &start);  //start

clock_gettime 出错没有多大意义，所以我查看了它上面的行:

    int sortedArray [n];

关于c - 使用动态分配的数组递归合并排序大量整数会引发错误，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/39883455/

c - 使用动态分配的数组递归合并排序大量整数会引发错误

上一篇：r - 基于字符向量进行子集化时，knit_expand 失败

下一篇：smtp - Telnet 到谷歌邮件服务器