非线性科学研究中心
 
Home | News | Research | People | Publications | Courses | NetworkScience | Conference | Workshop | Links | Softwares
   
 

 

推荐 | bvgraph:A Matlab class to work with the highly compressed BVGraph files.

2023年06月13日 08:43 Web 点击:[]

The bvgraph class makes working with enormous graph adjacency matrices in Matlab a snap. 

The class implements an interface to BVGraph files (http://webgraph.dsi.unimi.it) that emulates a Matlab sparse matrix.

http://cnets.indiana.edu/groups/nan/webgraph/

http://code.google.com/p/py-web-graph/

For example,

>> G = bvgraph(‘data/wb-cs.stanford’);

>> size(G);

>> y = G*rand(size(G,1),1);

are all valid operations. See the “bvgraph example” file for a more exhaustive explanation.

The goal of the library is to support computing PageRank with a 1 billion link graph and 120 million nodes possible inside Matlab with only 2 GB of memory. To load the matrix itself on a 64-bit platform would take around 10 GB of memory. As the “huge graphs” example shows, the bvgraph class achieves this goal.

To find additional datasets, see http://law.dsi.unimi.it/index.php?option=com_include&Itemid=65

The software was tested on

Matlab 2007b on amd64 Ubuntu 7.04
Matlab 2007a on amd64 Ubuntu 7.04
Matlab 2006b on amd64 Ubuntu 7.04

Matlab R14SP3 on qemu (i686) Ubuntu 6.06.1
Matlab 2007b on i686 Ubunutu 5.10

Matlab 7.0 on i386 Windows
Matlab 2007a on i386 Windows

Matlab 2007a on qemu (x86_64) WinXP64

Cite As

David Gleich (2023). bvgraph (https://www.mathworks.com/matlabcentral/fileexchange/16248-bvgraph), MATLAB Central File Exchange. Retrieved.

function bvg = bvgraph(filename,optionsu)
% BVGRAPH/BVGRAPH Construct a Matlab object around a Boldi-Vigna graph file.
%
% The bvgraph class represents a Boldi-Vigna compressed graph as a native
% Matlab object.  The class wraps a libbvg C object with a series of
% functions that make the graph look like a Matlab sparse matrix.  
% For the object G, the operations G*x, and G'*x, x'*G are all
% valid, and consequently, G can be used in many iterative methods without
% adaptation.
%
% G = mwebgraph(filename) loads the graph represented by filename
% G = mwebgraph(filename,options) specifies non-default options for loading
%
% options.trans: internally transpose the graph [{0} | 1]
% options.load_type : determine how to load the graph
%   ['stream' | 'offline' | 'inmemory' | 'online']
%   The options 'stream' and 'offline' are synonymous, as are 'inmemory'
%   and 'online'.  The memory required satisfies the hierarchy
%   'stream' << 'inmemory'
%
% The difference between 'inmemory'/'online' and 'offline'/'stream' is that
% 'online' reads the graph file into memory and the 'stream' option does
% not read the file into memory, but rather, iterates through it on disk.
%
% The filename is the 'base' filename for the graph and 'filename.graph' and
% 'filename.properties' must exist.
%
%Example:
%  G = bvgraph('data/wb-cs.stanford') % compute a PageRank vector using bicgstab
%  n = size(G,1); alpha = 0.85; v = ones(n,1)./n;
%  id = G*ones(n,1); id(id ~= 0) = 1./id(id ~= 0);
%  x = bicgstab(@(x) x - alpha*(G'*(id.*x)), v, 1e-8, 500); x = x./norm(x,1);
%  y = diag(G);
%
% David Gleich
% 21 May 2007
% Copyright, Stanford University, 2007
%
options = struct('load_type', 'online','trans',0);
if exist('optionsu','var')
   options = merge_structs(optionsu,options);
end
switch options.load_type
   case 'stream'
       offset_step = -1;
   case 'offline'
       offset_step = -1;
   case 'inmemory'
       offset_step = 0;
   case 'online'
       offset_step = 0;
%     case 'random'
%         offset_step = 1;
   otherwise
       error('bvgraph:invalidParameter','unknown load type: %s', options.load_type);
end
transp = 0;
if options.trans
   transp = 1;
end
if ~exist([filename '.graph'], 'file')
   error('bvgraph:fileNotFound', ...
       'The file %s.graph does not seem to exist!', filename);
end
if ~exist([filename '.properties'], 'file')
   error('bvgraph:fileNotFound', ...
       'The properties file %s.properties does not seem to exist!', filename);
end
if offset_step > 0 && ~exist([filename '.offsets'], 'file')
   error('bvgraph:fileNotFound', ...
       'The offsets file %s.offsets does not seem to exist!', filename);
end
[n,nz,smem,gmem,offsetmem] = bvgfun('load',filename,offset_step);
bvg = struct('n',n,'nz',nz,'smem',smem,'gmem',gmem,'offsetmem',offsetmem,...
   'offset_step', offset_step, 'filename', filename, ...
   'class', '', 'transp', transp);
bvg = class(bvg, 'bvgraph');

%%%%%%%%%%%%%%%%%%%%%

function get_src
curdir = pwd;
try
   [filepath filename] = fileparts(mfilename('fullpath'));
   cd (filepath);
   
   type readme.txt
   
   input(['Press any key to download the GPL protected codes\n' ...
          'or Ctrl-C to stop here\n']);
   
   try
       urlwrite('http://wilkinson.stanford.edu/~dgleich/codes/bvgraph-1.2-src.zip',...
           'bvgraph-1.2-src.zip');
       
   catch
       fprintf(['Eek!  Getting the source failed!!\n' ...
           'Please send mithandor@gmail.com an email immediately\n']);
       fprintf('I''m so sorry, but is not going to compile without the source.\n');
       rethrow(lasterror);
   end
   
   unzip('bvgraph-1.2-src.zip');
   
   cd(curdir);
catch
   % make sure they end up back where they started
   cd(curdir);
   rethrow(lasterror);
end    

%%%%%%%%%%%%%%%%%%%%%

function test_main
% TEST_MAIN The main test driver for the bvgraph library.
%
% David Gleich
% 3 September 2007
% Copyright, Stanford University, 2007
%
msgid = 'bvgraph:test';
%% Make sure the library is compiled
try
   G = bvgraph('../data/wb-cs.stanford');
   fprintf('The library was already compiled!\n');
catch
   s = lasterror;
   if strcmp(s.identifier, 'bvgfun:notCompiled')
       fprintf('The library is now compiled!\n');
   else
       error('bvgraph:compileError','the library could not automatically compile!');
   end
end
%% Test the code documentation
G = bvgraph('../data/wb-cs.stanford');
A = sparse(G);
x = ones(size(A,1),1);
if norm(G*x - A*x) > eps(1)
   error(msgid, 'the sparse routine example failed');
end
%% Additional tests
G = bvgraph('../data/wb-cs.stanford');
y = bvpagerank(G);
d = diag(G);
n = size(G,1);
x1 = G*ones(n,1);
x2 = G'*ones(n,1);
x3 = ones(1,n)*G;
if ~isequal(x2,x3')
   error(msgid,'there was an error multiplying two vectors');
end
G2 = bvgraph('../data/wb-cs.stanford',struct('offline',1));
n2 = size(G2,1);
y1 = G2*ones(n,1);
y2 = G2'*ones(n,1);
y3 = ones(1,n)*G2;
if ~isequal(x1,y1) || ...
  ~isequal(x2,y2) || ...
  ~isequal(x3,y3)
  error(msgid,'there was an error with the offline graph');
end
if ~isa(G,'bvgraph')
   error(msgid, 'bvgraph did not report the correct type');
end
if ~isa(G,'bvgraph')
   error(msgid, 'bvgraph did not pretend to be a logical');
end
nz = nnz(G);
A = sparse(G);
nz2 = nnz(A);
if nz ~= nz2
   error(msgid, 'bvgraph did not return the correct number of non-zeros');
end
%% Test the istrans function
Gt = G';
if ~istrans(Gt)
   error(msgid,'istrans reported the incorrect transpose state');
end
if istrans(G)
   error(msgid,'istrans reported the incorrect transpose state');
end
%% Test the sum function
A = sparse(G);
d_A = sum(A);
d_G = sum(G);
d_A_1 = sum(A,1);
d_G_1 = sum(G,1);
d_A_2 = sum(A,2);
d_G_2 = sum(G,2);
if ~isequal(d_A, d_G)
   error(msgid,'sum returned incorrect results');
end
if ~isequal(d_A_1, d_G_1)
   error(msgid,'sum1 returned incorrect results');
end
if ~isequal(d_A_2, d_G_2)
   error(msgid,'sum2 returned incorrect results');
end
Gt = G';
At = sparse(Gt);
d_A = sum(At);
d_G = sum(Gt);
d_A_1 = sum(At,1);
d_G_1 = sum(Gt,1);
d_A_2 = sum(At,2);
d_G_2 = sum(Gt,2);
if ~isequal(d_A, d_G)
   error(msgid,'sum returned incorrect results');
end
if ~isequal(d_A_1, d_G_1)
   error(msgid,'sum1 returned incorrect results');
end
if ~isequal(d_A_2, d_G_2)
   error(msgid,'sum2 returned incorrect results');
end
%% Test the stochastic mult function
rand('state',0);
x = rand(n,1);
G = bvgraph('../data/wb-cs.stanford');
A = sparse(G);
% create the substochastic matrix corresponding to a random walk on G/A
[i j v] = find(A);
d = sum(A,2);
P = sparse(i, j, v(i)./d(i), size(A,1), size(A,2));
y_P = P*x;
y_G = substochastic_mult(G,x);
if norm(y_P-y_G,inf) > 100*eps(1)
   error(msgid,'stochastic_mult results are not correct to 100*eps');
end
y_Pt = P'*x;
y_Gt = substochastic_mult(G',x);
if norm(y_Pt-y_Gt,inf) > 100*eps(1)
   error(msgid,'stochastic_mult(tranpose) results are not correct to 100*eps');
end


上一条:推荐 | MatlabBGL: Robust and efficient graph algorithms for Matlab using native data structures. 下一条:推荐 | A Matlab suite of drivers to compute the PageRank vector for a directed graph.

关闭

  读取内容中,请等待...

版权所有:Research Centre of Nonlinear Science 邮政编码:430073
E-mail:liujie@wtu.edu.cn 备案序号: 鄂ICP备15000386号 鄂公网安备 42011102000704号