Timothée Nicolas
2015-10-14 02:23:17 UTC
Dear all,
I have been playing around with multigrid recently, namely with
/ksp/ksp/examples/tutorials/ex42.c, with /snes/examples/tutorials/ex5.c and
with my own implementation of a Laplacian-type problem. In all cases, I
have seen almost no improvement in performance, whether in CPU
time or in KSP iterations, from varying the number of levels of the multigrid
solver. As an example, I have attached the log_summary output for ex5.c with
nlevels = 2 to 7, launched by
    mpiexec -n 1 ./ex5 -da_grid_x 21 -da_grid_y 21 -ksp_rtol 1.0e-9 \
        -da_refine 6 -pc_type mg -pc_mg_levels <nlevels> \
        -snes_monitor -ksp_monitor -log_summary
where <nlevels> is a number between 2 and 7.
So there is a noticeable CPU time improvement from 2 levels to 3 levels
(30%), and then no improvement whatsoever. I am surprised because with 6
levels of refinement of the DMDA the fine grid has more than 1200 points in
each direction, so with 3 levels the coarse grid still has more than 300
points in each direction, which is still pretty large (I assume the ratio
between grids is 2). How does the coarse solver efficiently solve the
problem on a coarse grid with such a large number of points? Given that the
principle of multigrid is to remove the smooth part of the error on coarser
grids, because relaxation methods are usually efficient only for the
high-frequency part, I would expect optimal performance when the coarse
grid is basically just a few points in each direction. Does anyone know why
the performance saturates at a low number of levels? What happens
internally seems to be quite different from what I would expect.
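
For reference, here is a small sketch of how I estimate the grid sizes per direction. It assumes that a non-periodic DMDA with M points refines to 2(M-1)+1 points, and that each multigrid level coarsens the fine grid by the same ratio of 2. The function names are mine and purely illustrative; they are not PETSc calls.

```python
def refine(points, ratio=2):
    """Points per direction after one refinement (assumed rule: ratio*(M-1)+1)."""
    return ratio * (points - 1) + 1


def grid_hierarchy(base_points, n_refine, n_levels, ratio=2):
    """Points per direction on each multigrid level, finest first.

    Assumes the fine grid comes from n_refine refinements of base_points
    and that every coarser level divides the number of intervals by ratio.
    """
    fine = base_points
    for _ in range(n_refine):
        fine = refine(fine, ratio)

    sizes = [fine]
    for _ in range(n_levels - 1):
        intervals, remainder = divmod(sizes[-1] - 1, ratio)
        if remainder != 0:
            raise ValueError("grid cannot be coarsened evenly any further")
        sizes.append(intervals + 1)
    return sizes


if __name__ == "__main__":
    # Same setup as my run: 21 points per direction, 6 refinements.
    for n_levels in range(2, 8):
        sizes = grid_hierarchy(21, 6, n_levels)
        print(f"levels={n_levels}: fine={sizes[0]}, coarsest={sizes[-1]}")
```

Under these assumptions, only 7 levels brings the coarsest grid back down to the original 21 points per direction.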
Best
Timothee