GAP (HAP commands) - Chapter 15: Parallel computation

The following example creates fifteen child processes and uses them simultaneously to compute the second integral homology of each of the 2328 groups of order 128. The final command shows that

for the 2328-th group G in GAP's library of small groups. The penulimate command shows that the parallel computation achieves a speedup of 10.4 .

gap> Processes:=List([1..15],i->ChildProcess());;
gap> fn:=function(i);return GroupHomology(SmallGroup(128,i),2);end;;
gap> for p in Processes do
> ChildPut(fn,"fn",p);
> od;
gap> Exec("date +%s");L:=ParallelList([1..2328],"fn",Processes);;Exec("date +%s");
1716105545
1716105554
gap> Exec("date +%s");L1:=List([1..2328],fn);;Exec("date +%s");
1716105586
1716105680

gap> speedup:=1.0*(680-586)/(554-545);
10.4444

gap> L[2328];
[ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 ]

The function ParallelList() is built from HAP's six core functions for parallel computation.

15.2 A non-embarassingly parallel computation

The following commands use core functions to compute the product A=M× N of two random matrices by distributing the work over two processors.

gap> M:=RandomMat(10000,10000);;
gap> N:=RandomMat(10000,10000);;
gap> 
gap> s:=ChildProcess();;
gap> 
gap> Exec("date +%s");
1716109418
gap> Mtop:=M{[1..5000]};;
gap> Mbottom:=M{[5001..10000]};;
gap> ChildPut(Mtop,"Mtop",s);
gap> ChildPut(N,"N",s);
gap> NextAvailableChild([s]);;
gap> ChildCommand("Atop:=Mtop*N;;",s);;
gap> Abottom:=Mbottom*N;;
gap> A:=ChildGet("Atop",s);;
gap> Append(A,Abottom);;
gap> Exec("date +%s");
1716110143

gap> AA:=M*N;;Exec("date +%s");
1716111389

gap> speedup:=1.0*(111389-110143)/(110143-109418);
1.71862

The next commands compute the product A=M× N of two random matrices by distributing the work over fifteen processors. The parallelization is very naive (the entire matrices M and N are communicated to all processes) and the computation achieves a speedup of 7.6.

gap> M:=RandomMat(15000,15000);;
gap> N:=RandomMat(15000,15000);;
gap> S:=List([1..15],i->ChildCreate());;

gap> Exec("date +%s");
1716156583
gap> ChildPutObj(M,"M",S);
gap> ChildPutObj(N,"N",S);
gap> for i in [1..15] do
> cmd:=Concatenation("A:=M{[1..1000]+(",String(i),"-1)*1000}*N;");
> ChildCommand(cmd,S[i]);
> od;
gap> A:=[];;
gap> for i in [1..15] do
>  C:=ChildGet("A",S[i]);
>  Append(A,C);
> od;
gap> Exec("date +%s");
1716157489

gap> AA:=M*N;;Exec("date +%s");
1716164405

gap> speedup:=1.0*(64405-57489)/(57489-56583);
7.63355

15.3 Parallel persistent homology

Section 5.8 illustrates an alternative method of computing the persitent Betti numbers of a filtered pure cubical complex. The method lends itself to parallelisation. However, the following parallel computation of persistent Betti numbers achieves only a speedup of 1.5 due to a significant time spent transferring data structures between processes. On the other hand, the persistent Betti function could be used to distribute computations over several computers. This might be useful for larger computations that require significant memory resources.

gap> file:=HapFile("data247.txt");;
gap> Read(file);;
gap> F:=ThickeningFiltration(T,25);;
gap> S:=List([1..15],i->ChildCreate());;
gap> N:=[0,1,2];;
gap> Exec("date +%s");P:=ParallelPersistentBettiNumbers(F,N,S);;Exec("date +%s");
1717160785
1717161285

gap> Exec("date +%s");Q:=PersistentBettiNumbersAlt(F,N);;Exec("date +%s");
1717161528
1717162276
gap> speedup:=1.0*(1717162276-1717161528)/(1717161285-1717160785);
1.496

15 Parallel computation

15.1 An embarassingly parallel computation

15.2 A non-embarassingly parallel computation

15.3 Parallel persistent homology