PETSc KSP and PETSc PC

In this tutorial we will have an in-depth look in to the way we can use a PETSc KSP object to solve a linear system arising from the discretization of a linear partial differential equation. In particular, we will focus our attention the usual Poisson problem in variational form, i.e. find \(u_h\in P^k(\mathcal{T}_h)\) such for any \(v_h \in P^k(\mathcal{T}_h)\) the following equation is verified,

\[(\nabla u_h,\nabla v_h) = (f,v_h),\;\textrm{ for a certain data }f\in [P^k(\mathcal{T}_h)]^*.\]
[1]:
from ipyparallel import Cluster
c = await Cluster().start_and_connect(n=1, activate=True)
Starting 1 engines with <class 'ipyparallel.cluster.launcher.LocalEngineSetLauncher'>

Let’s test if the cluster has been initialized correctly by checking the size of the COMM_WORLD-

[2]:
%%px
from mpi4py.MPI import COMM_WORLD
COMM_WORLD.Get_size()
Out[0:1]: 1

First we need to construct the distributed mesh that will be interested in dealing with, then define the BilinearForm corresponding to the Poisson problem and the load vector f we are interested in solving for, which is a NGSolve LinearForm object.

[3]:
%%px
from ngsolve import Mesh, BilinearForm, LinearForm, H1
from ngsolve import x, y, dx, grad
from netgen.geom2d import unit_square

if COMM_WORLD.rank == 0:
    mesh = Mesh(unit_square.GenerateMesh(maxh=0.2).Distribute(COMM_WORLD))
else:
    mesh = Mesh(ngm.Mesh.Receive(COMM_WORLD))
fes = H1(mesh, order=3, dirichlet="left|right|top|bottom")
u,v = fes.TnT()
a = BilinearForm(grad(u)*grad(v)*dx).Assemble()
f = LinearForm(fes)
f += 32 * (y*(1-y)+x*(1-x)) * v * dx

We now initialize a new KrylovSolver which wraps a PETSc KSP object and can be used to solve the linear system \(A\vec{u}_h = \vec{f}_h\), obtained from the above discretization. In particular, it is possible to control the setup of the PETSc KSP solver using the solverParameters dictionary which is equivalent to passing PETSc command line options. More detail on available options and solvers can be found in the PETSc KSP Manual. Once the solver has been initialized it is possible to obtain a solution of the linear system in terms of GridFunction using the solve method.

[4]:
%%px
from ngsPETSc import KrylovSolver
from ngsolve.webgui import  Draw
solver = KrylovSolver(a,fes, solverParameters={'ksp_type': 'cg', 'pc_type': 'lu',
                                'pc_factor_mat_solver_type': 'mumps'})
gfu = solver.solve(f)
Draw(gfu,mesh, "solution")
solver.ksp.view()
[output:0]
[stdout:0] KSP Object: 1 MPI process
  type: cg
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
  type: lu
    out-of-place factorization
    tolerance for zero pivot 2.22045e-14
    matrix ordering: external
    factor fill ratio given 0., needed 0.      Factored matrix follows:
        Mat Object: 1 MPI process
          type: mumps
          rows=205, cols=205
          package used to perform factorization: mumps
          total: nonzeros=7603, allocated nonzeros=7603
            MUMPS run parameters:
              Use -ksp_view ::ascii_info_detail to display information for all processes
              RINFOG(1) (global estimated flops for the elimination after analysis): 153913.
              RINFOG(2) (global estimated flops for the assembly after factorization): 1916.
              RINFOG(3) (global estimated flops for the elimination after factorization): 153913.
              (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)              INFOG(3) (estimated real workspace for factors on all processors after analysis): 7603
              INFOG(4) (estimated integer workspace for factors on all processors after analysis): 998
              INFOG(5) (estimated maximum front size in the complete tree): 35
              INFOG(6) (number of nodes in the complete tree): 14
              INFOG(7) (ordering option effectively used after analysis): 2
              INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
              INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 7603
              INFOG(10) (total integer space store the matrix factors after factorization): 998
              INFOG(11) (order of largest frontal matrix after factorization): 35
              INFOG(12) (number of off-diagonal pivots): 0
              INFOG(13) (number of delayed pivots after factorization): 0
              INFOG(14) (number of memory compress after factorization): 0
              INFOG(15) (number of steps of iterative refinement after solution): 0
              INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 0
              INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 0
              INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 0
              INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 0
              INFOG(20) (estimated number of entries in the factors): 7603
              INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 0
              INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 0
              INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
              INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
              INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
              INFOG(28) (after factorization: number of null pivots encountered): 0
              INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 7603
              INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0
              INFOG(32) (after analysis: type of analysis done): 1
              INFOG(33) (value used for ICNTL(8)): 7
              INFOG(34) (exponent of the determinant if determinant is requested): 0
              INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 7603
              INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0
              INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0
              INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0
              INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0
  linear system matrix = precond matrix:
  Mat Object: 1 MPI process
    type: seqaij
    rows=205, cols=205
    total: nonzeros=2863, allocated nonzeros=2863
    total number of mallocs used during MatSetValues calls=0
      using I-node routines: found 137 nodes, limit used is 5

In the previous example we have solved using a CG method and a LU factorization preconditioned computed using MUMPS. We can access all this information from the using the view of the solver.ksp object. We can also solve using more exotic methods and preconditioners, such as a biconjugate gradient method using HYPRE for preconditioning.

[5]:
%%px
solver = KrylovSolver(a,fes, solverParameters={'ksp_type': 'bcgs', 'pc_type': 'hypre'})
gfu = solver.solve(f)
Draw(gfu,mesh, "solution")
solver.ksp.view()
[output:0]
[stdout:0] KSP Object: 1 MPI process
  type: bcgs
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
  type: hypre
    HYPRE BoomerAMG preconditioning
      Cycle type V
      Maximum number of levels 25
      Maximum number of iterations PER hypre call 1
      Convergence tolerance PER hypre call 0.
      Threshold for strong coupling 0.25
      Interpolation truncation factor 0.
      Interpolation: max elements per row 0
      Number of levels of aggressive coarsening 0
      Number of paths for aggressive coarsening 1
      Maximum row sums 0.9
      Sweeps down         1
      Sweeps up           1
      Sweeps on coarse    1
      Relax down          symmetric-SOR/Jacobi
      Relax up            symmetric-SOR/Jacobi
      Relax on coarse     Gaussian-elimination
      Relax weight  (all)      1.
      Outer relax weight (all) 1.
      Maximum size of coarsest grid 9
      Minimum size of coarsest grid 1
      Using CF-relaxation
      Not using more complex smoothers.
      Measure type        local
      Coarsen type        Falgout
      Interpolation type  classical
      SpGEMM type         hypre
  linear system matrix = precond matrix:
  Mat Object: 1 MPI process
    type: seqaij
    rows=205, cols=205
    total: nonzeros=2863, allocated nonzeros=2863
    total number of mallocs used during MatSetValues calls=0
      using I-node routines: found 137 nodes, limit used is 5

It is also possible to import any PETSc PC object and used it NGSolve using the PETScPC NGSolve Preconditioner. This is very useful if one is interested in using NGSolve linear algebra functionality with “foreign” preconditioner.

[6]:
%%px
from ngsPETSc import pc
from ngsolve import Preconditioner, GridFunction
from ngsolve.solvers import CG
pre = Preconditioner(a, "PETScPC", pc_type="hypre")
gfu = GridFunction(fes)
gfu.vec.data = CG(a.mat, rhs=f.vec, pre=pre, printrates=mesh.comm.rank==0)
Draw(gfu,mesh, "solution")
[stdout:0] CG iteration 1, residual = 2.3515313068754367
CG iteration 2, residual = 0.11703951230766897
CG iteration 3, residual = 0.02532238998559266
CG iteration 4, residual = 0.002762280521164809
CG iteration 5, residual = 0.00036992984473413113
CG iteration 6, residual = 5.0193813921945675e-05
CG iteration 7, residual = 8.048185966060108e-06
CG iteration 8, residual = 1.1206592184815855e-06
CG iteration 9, residual = 1.3930271656429977e-07
CG iteration 10, residual = 2.358932103568244e-08
CG iteration 11, residual = 3.2889334523253205e-09
CG iteration 12, residual = 4.688724686311323e-10
CG iteration 13, residual = 6.51694553700458e-11
CG iteration 14, residual = 9.11213172981437e-12
CG iteration 15, residual = 7.541816903801115e-13

[stderr:0] WARNING: kwarg 'pc_type' is an undocumented flags option for class <class 'ngsolve.comp.Preconditioner'>, maybe there is a typo?

[output:0]
Out[0:5]: BaseWebGuiScene