Discussion:
[petsc-users] Configuration of Hybrid MPI-OpenMP
Danyang Su
2014-01-29 18:28:05 UTC
Hi All,

Is it possible to configure PETSc with support for hybrid MPI-OpenMP?
My code contains many OpenMP directives and they work well. I have
implemented a PETSc solver in the code, and the solver also works and
provides a speedup. Currently I have the following observations:

1. If I configure PETSc with OpenMP, the OpenMP directives work well
and I see a significant speedup in that part of the code, but the
speedup for the solver is very small or nonexistent. I have asked about
this before; it may be due to bus bandwidth, but I am not sure.

2. If I configure PETSc with MPI, the speedup for the solver is
significant, but the OpenMP directives no longer work: the sections
that use them run up to 10 times slower.

I tried to configure PETSc with both MPI and OpenMP enabled, but failed.

BTW: I have also tried "PETSc for Windows 2.0"
(http://www.msic.ch/Software). There, both the OpenMP directives and
the PETSc solver (MPI) work well, but the solver speedup is not as good
as with PETSc 3.4.3 compiled under Cygwin; the solver is usually about
1.5 times slower.

Is it possible to configure PETSc with hybrid MPI-OpenMP? How?

Thanks and regards,

Danyang
Karl Rupp
2014-01-29 21:23:42 UTC
Hi Danyang,

PETSc is supposed to work with MPI-OpenMP hybridization just as you
describe.
Post by Danyang Su
I tried to configure PETSc with both MPI and OpenMP enabled, but failed.
Please provide details on why it failed. Please send configure.log and
make.log to petsc-maint.
Post by Danyang Su
BTW: I have also tried to use "PETSc for WINDOWS2.0"
(http://www.msic.ch/Software). Both the OpenMP instructions and PETSc
solver (MPI) can work well, but the solver speedup is not as significant
as compiling in CYGWIN with PETSc 3.4.3. Usually the speedup of solver
is 1.5 times slower.
Note that because the memory channels saturate, you have to be careful
about what you define as 'speedup' and how you compare execution times.
Also, thread affinity becomes quite important on a NUMA machine.
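For example, with most OpenMP runtimes you can control the thread count
and pinning through environment variables (the exact variable names
depend on your compiler/runtime; these are just common ones):

   export OMP_NUM_THREADS=4
   export OMP_PROC_BIND=true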
Post by Danyang Su
Is it possible to configure PETSc with hybrid MPI-OpenMP? How?
--with-threadcomm --with-openmp should suffice.
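For instance, a full configure line might look like this (the compiler
names are placeholders for whatever you use on your system):

   ./configure --with-cc=gcc --with-fc=gfortran --with-threadcomm --with-openmp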

Best regards,
Karli
Danyang Su
2014-01-30 01:46:14 UTC
Hi Karli,

"--with-threadcomm --with-openmp" can work when configure PETSc with
MPI-OpenMP. Sorry for making a mistake before.
The program can be compiled but I got a new error while running my program.

Error: Attempting to use an MPI routine before initializing MPICH

This error occurs when calling MPI_SCATTERV. I have already called
PetscInitialize, and the MPI_BCAST just before the MPI_SCATTERV call
works without throwing an error.

When PETSc is configured without OpenMP, there is no error in this
section.
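The call pattern is roughly the following (a sketch; the buffer, count,
and displacement variables are placeholders, not my actual names):

   call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
   ! This broadcast succeeds:
   call MPI_Bcast(n, 1, MPI_INTEGER, 0, PETSC_COMM_WORLD, ierr)
   ! This scatter fails with the MPICH initialization error:
   call MPI_Scatterv(sendbuf, counts, displs, MPI_DOUBLE_PRECISION, &
                     recvbuf, nlocal, MPI_DOUBLE_PRECISION, 0,      &
                     PETSC_COMM_WORLD, ierr)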

Thanks and regards,

Danyang
Post by Karl Rupp
Hi Danyang,
PETSc is supposed to work with MPI-OpenMP hybridization just as you
describe.
Post by Danyang Su
I tried to configure PETSc with both MPI and OpenMP enabled, but
failed.
Please provide details on why it failed. Please send configure.log and
make.log to petsc-maint.
Post by Danyang Su
BTW: I have also tried "PETSc for Windows 2.0"
(http://www.msic.ch/Software). There, both the OpenMP directives and
the PETSc solver (MPI) work well, but the solver speedup is not as good
as with PETSc 3.4.3 compiled under Cygwin; the solver is usually about
1.5 times slower.
Note that because the memory channels saturate, you have to be careful
about what you define as 'speedup' and how you compare execution times.
Also, thread affinity becomes quite important on a NUMA machine.
Post by Danyang Su
Is it possible to configure PETSc with hybrid MPI-OpenMP? How?
--with-threadcomm --with-openmp should suffice.
Best regards,
Karli
Jed Brown
2014-01-30 02:08:03 UTC
Post by Danyang Su
Hi Karli,
"--with-threadcomm --with-openmp" can work when configure PETSc with
MPI-OpenMP. Sorry for making a mistake before.
The program can be compiled but I got a new error while running my program.
Error: Attempting to use an MPI routine before initializing MPICH
This error occurs when calling MPI_SCATTERV. I have already called
PetscInitialize, and MPI_BCAST, which is just before the calling of
MPI_SCATTERV, can also work without throwing error.
When PETSc is configured without openmp, there is no error in this section.
Are you calling this inside an omp parallel block? Are you initializing
MPI with MPI_THREAD_MULTIPLE? Do you have other threads doing something
with MPI?
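If you are unsure, you can check the thread support level that MPI
actually provides with something like this (a sketch; 'provided' is an
integer):

   call MPI_Query_thread(provided, ierr)
   if (provided < MPI_THREAD_MULTIPLE) then
      print *, 'MPI thread support level is only ', provided
   end if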

I'm afraid we'll need a reproducible test case if it still doesn't work
for you.
Danyang Su
2014-01-30 02:29:54 UTC
Post by Jed Brown
Post by Danyang Su
Hi Karli,
"--with-threadcomm --with-openmp" can work when configure PETSc with
MPI-OpenMP. Sorry for making a mistake before.
The program can be compiled but I got a new error while running my program.
Error: Attempting to use an MPI routine before initializing MPICH
This error occurs when calling MPI_SCATTERV. I have already called
PetscInitialize, and MPI_BCAST, which is just before the calling of
MPI_SCATTERV, can also work without throwing error.
When PETSc is configured without openmp, there is no error in this section.
Are you calling this inside an omp parallel block? Are you initializing
MPI with MPI_THREAD_MULTIPLE? Do you have other threads doing something
with MPI?
No, this call is outside any OpenMP parallel block. I didn't initialize
MPI with MPI_THREAD_MULTIPLE, and the PETSc-related code is unchanged.
No other threads are doing anything with MPI. The program crashes
during the initialization of the simulation.

But it is strange that the code works with "PETSc for Windows 2.0",
which is based on PETSc 3.4.2.
Post by Jed Brown
I'm afraid we'll need a reproducible test case if it still doesn't work
for you.
Danyang Su
2014-01-30 17:27:27 UTC
I checked the initialization of PETSc again and found that it does not
take effect. The code is as follows.

call PetscInitialize(PETSC_NULL_CHARACTER,ierrcode)
call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierrcode)
call MPI_Comm_size(PETSC_COMM_WORLD,nprcs,ierrcode)

The values of rank and nprcs are always 0 and 1, respectively, no
matter how many processes the program is launched with.

Danyang
Post by Jed Brown
Post by Danyang Su
Hi Karli,
"--with-threadcomm --with-openmp" can work when configure PETSc with
MPI-OpenMP. Sorry for making a mistake before.
The program can be compiled but I got a new error while running my program.
Error: Attempting to use an MPI routine before initializing MPICH
This error occurs when calling MPI_SCATTERV. I have already called
PetscInitialize, and MPI_BCAST, which is just before the calling of
MPI_SCATTERV, can also work without throwing error.
When PETSc is configured without openmp, there is no error in this section.
Are you calling this inside an omp parallel block? Are you initializing
MPI with MPI_THREAD_MULTIPLE? Do you have other threads doing something
with MPI?
I'm afraid we'll need a reproducible test case if it still doesn't work
for you.
Jed Brown
2014-01-30 17:30:50 UTC
Post by Danyang Su
I checked the initialization of PETSc again and found that it does not
take effect. The code is as follows.
call PetscInitialize(PETSC_NULL_CHARACTER,ierrcode)
call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierrcode)
call MPI_Comm_size(PETSC_COMM_WORLD,nprcs,ierrcode)
The values of rank and nprcs are always 0 and 1, respectively, no
matter how many processes the program is launched with.
The most common reason for this is that you have more than one MPI
implementation on your system and they are getting mixed up.
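A quick way to check is to make sure the mpiexec you launch with
belongs to the same MPI that PETSc was built against, e.g. (where
./yourprogram stands for your executable):

   which mpiexec
   mpiexec -n 4 ./yourprogram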
Danyang Su
2014-01-30 18:59:31 UTC
Post by Jed Brown
Post by Danyang Su
I checked the initialization of PETSc again and found that it does not
take effect. The code is as follows.
call PetscInitialize(PETSC_NULL_CHARACTER,ierrcode)
call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierrcode)
call MPI_Comm_size(PETSC_COMM_WORLD,nprcs,ierrcode)
The values of rank and nprcs are always 0 and 1, respectively, no
matter how many processes the program is launched with.
The most common reason for this is that you have more than one MPI
implementation on your system and they are getting mixed up.
Yes, I have both MPICH2 and Microsoft HPC on the same system. PETSc was
built with MPICH2. I will uninstall Microsoft HPC to see if that fixes
it.

Thanks,

Danyang
