algorithm - MATLAB program takes more than 1 hour to execute

Tags: algorithm matlab memory-management

The following program finds k-clique communities in an input graph. The graph dataset can be found here.

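The first line of the dataset contains the number of nodes and the number of edges. Each following line, "node1 node2", denotes an edge between node1 and node2. For example: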

2500 6589    // number_of_nodes, number_of_edges    
0 5          // edge between node[0] and node[5]    
.    
.    
.    

The function k-clique( aCliqueSIZE, anAdjacencyMATRIX ) can be found here.

Run the following commands in the MATLAB Command Window:

x         = textread( 'amazon.graph.small' );            %% read the source input text file
s         = max(x(1,1), x(1,2));                         %% take largest dimension
adjMatrix = sparse(x(2:end,1)+1, x(2:end,2)+1, 1, s, s); %% now matrix is square
adjMatrix = adjMatrix | adjMatrix.';                     %% apply "or" with transpose to make symmetric
adjMatrix = full(adjMatrix);                             %% convert to full if needed
k         = 4;                                           %% clique size
[X,Y,Z]   = k_clique(k, adjMatrix);                      %% communities, cliques, k-clique matrix

% The output can be viewed by the following commands    
celldisp(X);    
celldisp(Y);    
Z

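The program above takes more than 1 hour to execute, which I don't think should be the case. While running it on Windows, I checked Task Manager and found that only 500 MB was allocated to the program. Is this why the program is slow? If so, how can I allocate more heap memory (close to 4 GB) to this program in MATLAB?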

Best answer

The problem does not seem to be memory-bound

A sparse, square, symmetric matrix of about 6.5k x 6.5k entries does not take much memory.
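As a rough check of that claim, the sketch below builds a matrix of the same size and asks MATLAB how many bytes it occupies. The dimension 6589 comes from the max() of the header line in the question; the random edge list is made up purely for illustration and stands in for the real dataset.

```matlab
% Rough memory check; the random edge list is a stand-in for the real data.
n      = 6589;                              % matrix dimension used in the question
nEdges = 6589;                              % number of edges from the header line
rows   = randi(n, nEdges, 1);               % hypothetical edge endpoints
cols   = randi(n, nEdges, 1);
A      = sparse(rows, cols, 1, n, n) ~= 0;  % sparse logical adjacency matrix
A      = A | A.';                           % symmetrize, as in the question
F      = full(A);                           % dense logical copy: one byte per entry

infoSparse = whos('A');
infoFull   = whos('F');
fprintf('sparse: %.2f MB, full logical: %.2f MB\n', ...
        infoSparse.bytes / 2^20, infoFull.bytes / 2^20);
```

Even the dense logical copy needs only 6589^2 bytes, roughly 41 MB, which is far below the 500 MB seen in Task Manager.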

The provided code has many for loops and recurses heavily in the nested function transfer_nodes() at the end of the file.

Add a "Stone-Age-Profiler" to the code

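To show how much time is spent in the CPU-bound parts of the processing, wrap the main sections of the code in the following construct: tic(); for .... end; toc(). It will print the elapsed time spent in each wrapped section of k_clique.m, giving you readings on the fly.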
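A minimal sketch of that tic/toc wrapping is shown here. The loop body is only a placeholder workload, and the variable names (nIter, acc, tStart, elapsed) are illustrative and not part of k_clique.m.

```matlab
% Minimal tic/toc timing sketch; the loop body is a placeholder workload.
nIter = 1e5;
acc   = 0;

tStart = tic;                 % start a timer and keep its handle
for ii = 1:nIter
    acc = acc + sqrt(ii);     % stand-in for the real work inside k_clique.m
end
elapsed = toc(tStart);        % elapsed wall-clock seconds since tStart

fprintf('loop section took %.3f s\n', elapsed);
```

Keeping the handle returned by tic lets several sections be timed independently, for example one timer per top-level loop. Note that tic/toc report elapsed wall-clock time, which is a reasonable proxy only while the section is CPU-bound. MATLAB's built-in profiler (profile on, run the code, then profile viewer) gives a per-line breakdown if finer detail is needed.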

Your original code, k_clique.m

function [components,cliques,CC] = k_clique(k,M)
% k-clique algorithm for detecting overlapping communities in a network 
% as defined in the paper "Uncovering the overlapping 
% community structure of complex networks in nature and society"
% 
% [X,Y,Z] = k_clique(k,A)
% 
% Inputs: 
% k - clique size 
% A - adjacency matrix
% 
% Outputs: 
% X - detected communities
% Y - all cliques (i.e. complete subgraphs that are not parts of larger
% complete subgraphs)
% Z - k-clique matrix


nb_nodes = size(M,1); % number of nodes 

% Find the largest possible clique size via the degree sequence:
% Let {d1,d2,...,dk} be the degree sequence of a graph. The largest
% possible clique size of the graph is the maximum value k such that
% dk >= k-1
degree_sequence = sort(sum(M,2) - 1,'descend');
%max_s = degree_sequence(1);
max_s = 0;
for i = 1:length(degree_sequence)
    if degree_sequence(i) >= i - 1
        max_s = i;
    else 
        break;
    end
end

cliques = cell(0);
% Find all s-size cliques in the graph
for s = max_s:-1:3
    M_aux = M;
    % Looping over nodes
    for n = 1:nb_nodes
        A = n; % Set of nodes all linked to each other
        B = setdiff(find(M_aux(n,:)==1),n); % Set of nodes that are linked to each node in A, but not necessarily to the nodes in B
        C = transfer_nodes(A,B,s,M_aux); % Enlarging A by transferring nodes from B
        if ~isempty(C)
            for i = 1:size(C,1) % store every clique row returned, not only the last one
                cliques = [cliques;{C(i,:)}];
            end
        end
        M_aux(n,:) = 0; % Remove the processed node
        M_aux(:,n) = 0;
    end
end

% Generating the clique-clique overlap matrix
CC = zeros(length(cliques));
for c1 = 1:length(cliques)
    for c2 = c1:length(cliques)
        if c1==c2
            CC(c1,c2) = numel(cliques{c1});
        else
            CC(c1,c2) = numel(intersect(cliques{c1},cliques{c2}));
            CC(c2,c1) = CC(c1,c2);
        end
    end
end

% Extracting the k-clique matrix from the clique-clique overlap matrix
% Off-diagonal elements <= k-1 --> 0
% Diagonal elements <= k --> 0
CC(eye(size(CC))==1) = CC(eye(size(CC))==1) - k;
CC(eye(size(CC))~=1) = CC(eye(size(CC))~=1) - k + 1;
CC(CC >= 0) = 1;
CC(CC < 0) = 0;

% Extracting components (or k-clique communities) from the k-clique matrix
components = [];
for i = 1:length(cliques)
    linked_cliques = find(CC(i,:)==1);
    new_component = [];
    for j = 1:length(linked_cliques)
        new_component = union(new_component,cliques{linked_cliques(j)});
    end
    found = false;
    if ~isempty(new_component)
        for j = 1:length(components)
            if all(ismember(new_component,components{j}))
                found = true;
            end
        end
        if ~found
            components = [components; {new_component}];
        end
    end
end


    function R = transfer_nodes(S1,S2,clique_size,C)
        % Recursive function to transfer nodes from set B to set A (as
        % defined above)

        % Check if the union of S1 and S2 or S1 is inside an already found larger
        % clique 
        found_s12 = false;
        found_s1 = false;
        for c = 1:length(cliques)
            for cc = 1:size(cliques{c},1)
                if all(ismember(S1,cliques{c}(cc,:)))
                    found_s1 = true;
                end
                if all(ismember(union(S1,S2),cliques{c}(cc,:)))
                    found_s12 = true;
                    break;
                end
            end
        end

        if found_s12 || (length(S1) ~= clique_size && isempty(S2))
            % If the union of the sets A and B can be included in an
            % already found (larger) clique, the recursion is stepped back
            % to check other possibilities
            R = [];
        elseif length(S1) == clique_size
            % The size of A reaches s, a new clique is found
            if found_s1
                R = [];
            else
                R = S1;
            end
        else
            % Check the remaining possible combinations of the neighbors
            % indices
            if isempty(find(S2>=max(S1),1))
                R = [];
            else
                R = [];
                for w = find(S2>=max(S1),1):length(S2)
                    S2_aux = S2;
                    S1_aux = S1;
                    S1_aux = [S1_aux S2_aux(w)];
                    S2_aux = setdiff(S2_aux(C(S2(w),S2_aux)==1),S2_aux(w));
                    R = [R;transfer_nodes(S1_aux,S2_aux,clique_size,C)];
                end
            end
        end
    end
end
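One of the nested loops above, the clique-clique overlap matrix, can probably be replaced by a single matrix product. The sketch below is an assumption about how that might look and is not part of the original k_clique.m: it builds a clique-by-node indicator matrix B, so that B*B.' holds the size of every pairwise intersection and the clique sizes on the diagonal. The three cliques are made-up example data.

```matlab
% Sketch: clique-clique overlap via a clique-by-node indicator matrix.
% The three cliques below are made-up example data.
cliques  = {[1 2 3 4]; [3 4 5 6]; [6 7 8 9]};
nb_nodes = 9;

nc = numel(cliques);
% Build one row index per clique member, so (rowIdx, node) pairs mark membership.
rowIdx = cell(nc, 1);
for c = 1:nc
    rowIdx{c} = c * ones(1, numel(cliques{c}));
end
B  = sparse([rowIdx{:}], [cliques{:}], 1, nc, nb_nodes);

% CC(i,j) = number of nodes shared by clique i and clique j;
% CC(i,i) = size of clique i.
CC = full(B * B.');
disp(CC);
```

This relies on each clique being a row vector with no repeated nodes, which is how the loop above stores them. Whether it actually helps depends on where the time goes, so it is worth timing before and after.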

Regarding "algorithm - MATLAB program takes more than 1 hour to execute", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26228106/
