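I'm benchmarking the execution time of solving matrix systems, but I can't get timings for anything larger than about 200x200, and I need to reach 1500x1500 or close to it. I'm running this in Visual Studio.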
#include <stdlib.h>
#include <stdio.h>
#include "lapacke.h"
#include "lapacke_config.h"
#include <time.h>
/* Auxiliary routines prototypes */
extern void print_matrix(lapack_complex_double *a, int m, int n);
extern void generate_matrix(lapack_complex_double * matrix, int w, int h);
/* Parameters */
#define N 179
/* Main program */
int main() {
    clock_t t;
    /* Locals */
    lapack_int n = N,info;
    lapack_int ipiv[N];
    lapack_complex_double a[N*N];
    lapack_complex_double b[N*N];
    FILE* fp1 = fopen("complexNumsLog.txt", "w");
    int w = 1, h = 1, h2 = 1, i = 0, j = 0;
    for(i = 1; i <= N; i++){
        for(j = 1; j <= N; j++){
            w = i;
            h = i;
            h2 = j;
            generate_matrix(a, w, h);
            generate_matrix(b, w, h2);
            // print_matrix(a, w, h);
            // print_matrix(b, w, h2);
            // getchar();
            t = clock();
            info = LAPACKE_zgesv(LAPACK_ROW_MAJOR, w, h2, a, h, ipiv, b, h2);
            t = clock() - t;
            fprintf(fp1, "Matrix A: %3dx%3d ", w, h);
            fprintf(fp1, "Matrix B: %3dx%3d ", w, h2);
            /* clock_t is not guaranteed to be int, so cast before printing */
            fprintf(fp1, "%3ld milliseconds %2.3f seconds\n",(long)t,((float)t)/CLOCKS_PER_SEC);
            /* Check for the exact singularity */
            if( info > 0 ) {
                printf( "The diagonal element of the triangular factor of A,\n" );
                printf( "U(%i,%i) is zero, so that A is singular;\n", info, info );
                printf( "the solution could not be computed.\n" );
                getchar();
                exit( 1 );
            }
        }
        printf("%d\n", i);
    }
    getchar();
    exit( 0 );
}
/* Print an m x n row-major matrix; the row stride is n, not m */
void print_matrix(lapack_complex_double* a, int m, int n) {
    int i, j;
    for( i = 0; i < m; i++ ) {
        for( j = 0; j < n; j++ )
            printf( " (%6.2f,%6.2f)", a[i*n+j].real, a[i*n+j].imag );
        printf( "\n" );
    }
    printf( "********************************************\n" );
}
/* Fill a w x h row-major matrix (w rows, h columns) with values in [-5, 5) */
void generate_matrix(lapack_complex_double * matrix, int w, int h){
    int i,j;
    double r;
    for(i = 0; i < w; i++){
        for(j = 0; j < h; j++){
            /* row stride is the column count h, matching the lda/ldb passed to zgesv */
            r = (rand()%1000 - 500)/100.0;
            matrix[i*h+j].real = r;
            r = (rand()%1000 - 500)/100.0;
            matrix[i*h+j].imag = r;
        }
    }
}
Best answer
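You've overflowed the stack. A program is usually given only a few MB of stack space, and you're trying to place all of your data on the stack. Once you go past roughly 200x200, you quickly run out of stack space.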
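To see why the limit sits near 200, it helps to total the three local arrays. This is only a rough sketch: it assumes 16 bytes per `lapack_complex_double`, 4 bytes per `lapack_int`, and a default stack of about 1 MB (typical for Visual Studio builds). All three are assumptions about the build, and `stack_bytes` is a hypothetical helper, not part of LAPACKE.

```c
#include <stddef.h>

/* Assumed element sizes; check sizeof() in your own build. */
enum { COMPLEX_BYTES = 16, INT_BYTES = 4 };

/* Bytes taken by ipiv[n], a[n*n] and b[n*n] when declared as locals. */
size_t stack_bytes(size_t n) {
    return n * INT_BYTES + 2 * n * n * COMPLEX_BYTES;
}
```

Under these assumptions `stack_bytes(179)` is 1,026,028 bytes, just under 1 MiB (1,048,576 bytes), `stack_bytes(200)` is 1,280,800 bytes, and `stack_bytes(1500)` is 72,006,000 bytes, far more than a default stack is likely to provide.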
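To fix this, allocate the memory at global scope or on the heap, where the size is limited only by the virtual address space and/or the total available physical memory. Since the size is known at compile time, the simplest option is to allocate it at global scope: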
/* Parameters */
#define N 179
lapack_int ipiv[N];
lapack_complex_double a[N*N];
lapack_complex_double b[N*N];
/* Main program */
int main() {
...
}
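The heap option mentioned above is the usual choice when the dimension is only known at run time. Here is a minimal sketch of it, not the answerer's code. It assumes a stand-in `complex16` struct with the same two-double layout the question's code relies on for `lapack_complex_double`, so the block compiles without LAPACKE. `matrix_bytes` and `alloc_matrix` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for lapack_complex_double (assumption: two doubles, 16 bytes). */
typedef struct { double real, imag; } complex16;

/* Bytes needed for a rows x cols matrix, or 0 if empty or too large. */
size_t matrix_bytes(size_t rows, size_t cols) {
    if (rows == 0 || cols == 0) return 0;
    if (rows > SIZE_MAX / cols) return 0;              /* rows*cols would overflow */
    if (rows * cols > SIZE_MAX / sizeof(complex16)) return 0;
    return rows * cols * sizeof(complex16);
}

/* Allocate a zero-initialised rows x cols matrix on the heap.
   Returns NULL on failure; the caller releases it with free(). */
complex16 *alloc_matrix(size_t rows, size_t cols) {
    assert(sizeof(complex16) == 16);                   /* layout assumption above */
    if (matrix_bytes(rows, cols) == 0) return NULL;
    return (complex16 *)calloc(rows * cols, sizeof(complex16));
}
```

In the benchmark this would amount to replacing the local arrays with something like `a = alloc_matrix(N, N)` and `b = alloc_matrix(N, N)` once before the loops, checking both for `NULL`, and calling `free` on them at the end. A 1500x1500 matrix then needs 36,000,000 bytes from the heap instead of the stack.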
Source: the Stack Overflow question "I can't allocate memory for matrices bigger than ~200x200 (LAPACK, Visual Studio, C)": https://stackoverflow.com/questions/21124185/