Discussion:
[petsc-users] DMPlex natural to global mappings
Lawrence Mitchell
2015-10-23 10:52:23 UTC
Dear all,

I'm trying to understand how to use the natural-to-global mappings that have recently made it into DMPlex. The goal is to do checkpoint/restart (onto, perhaps, a different number of processes) with DMs that have a non-zero overlap (I can see that this is currently unsupported, so let's try the zero-overlap case first).

I'm running into a number of problems that may be misunderstandings, but are possibly bugs.

Let's try the simplest thing first, build a DMPlex on two processes, dump it to disk and then try to reload.

#include <petsc.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  MPI_Comm       comm;
  DM             dm, newdm, pardm;
  PetscViewer    vwr;
  const char    *name = "dump.h5";

  PetscInitialize(&argc, &argv, NULL, NULL);

  comm = PETSC_COMM_WORLD;
  /* Build a 2D box mesh and distribute it with zero overlap */
  ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, &dm); CHKERRQ(ierr);
  ierr = DMPlexDistribute(dm, 0, NULL, &pardm); CHKERRQ(ierr);
  if (pardm) {
    ierr = DMDestroy(&dm); CHKERRQ(ierr);
    dm = pardm;
  }

  /* Dump the distributed DM to HDF5 */
  ierr = PetscViewerCreate(comm, &vwr); CHKERRQ(ierr);
  ierr = PetscViewerSetType(vwr, PETSCVIEWERHDF5); CHKERRQ(ierr);
  ierr = PetscViewerFileSetMode(vwr, FILE_MODE_WRITE); CHKERRQ(ierr);
  ierr = PetscViewerSetFormat(vwr, PETSC_VIEWER_NATIVE); CHKERRQ(ierr);
  ierr = PetscViewerFileSetName(vwr, name); CHKERRQ(ierr);
  ierr = DMView(dm, vwr); CHKERRQ(ierr);

  ierr = PetscViewerDestroy(&vwr); CHKERRQ(ierr);

  /* Reload into a fresh DM and redistribute */
  ierr = DMCreate(comm, &newdm); CHKERRQ(ierr);
  ierr = DMSetType(newdm, DMPLEX); CHKERRQ(ierr);

  ierr = PetscViewerCreate(comm, &vwr); CHKERRQ(ierr);
  ierr = PetscViewerSetType(vwr, PETSCVIEWERHDF5); CHKERRQ(ierr);
  ierr = PetscViewerFileSetMode(vwr, FILE_MODE_READ); CHKERRQ(ierr);
  ierr = PetscViewerSetFormat(vwr, PETSC_VIEWER_NATIVE); CHKERRQ(ierr);
  ierr = PetscViewerFileSetName(vwr, name); CHKERRQ(ierr);

  ierr = DMLoad(newdm, vwr); CHKERRQ(ierr);

  ierr = PetscViewerDestroy(&vwr); CHKERRQ(ierr);

  ierr = DMPlexDistribute(newdm, 0, NULL, &pardm); CHKERRQ(ierr);

  if (pardm) {
    ierr = DMDestroy(&newdm); CHKERRQ(ierr);
    newdm = pardm;
  }

  ierr = DMDestroy(&dm); CHKERRQ(ierr);
  ierr = DMDestroy(&newdm); CHKERRQ(ierr);
  PetscFinalize();

  return 0;
}


This fails in DMPlexDistributeCoordinates, at the call to VecSetBlockSize, with the following error:

[0]PETSC ERROR: Invalid argument
[0]PETSC ERROR: Int value must be same on all processes, argument # 2

Digging a bit, I think this is because in DMCreateLocalVector the block size is determined from the block size of the section. On rank 1, however, the section is empty, so the created Vec never gets a block size set. We then go on to copy this block size over to the distributed coordinates and hit the error above, because rank 0 passes a block size of 2 to VecSetBlockSize while rank 1 passes 1.
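
For what it's worth, the inconsistency is easy to see with a small diagnostic (my own sketch, not part of the program above), which queries the block size of the reloaded DM's local coordinate Vec on each rank just before the second DMPlexDistribute:

/* Diagnostic sketch: print the per-rank block size of the reloaded DM's
   local coordinate Vec before redistributing.  On rank 0 I expect 2; on
   rank 1 (empty coordinate section) it presumably comes back as 1,
   matching the error above. */
{
  Vec         coords;
  PetscInt    bs;
  PetscMPIInt rank;

  ierr = MPI_Comm_rank(comm, &rank); CHKERRQ(ierr);
  ierr = DMGetCoordinatesLocal(newdm, &coords); CHKERRQ(ierr);
  if (coords) {
    ierr = VecGetBlockSize(coords, &bs); CHKERRQ(ierr);
    ierr = PetscSynchronizedPrintf(comm, "[%d] local coordinate bs = %D\n", rank, bs); CHKERRQ(ierr);
  }
  ierr = PetscSynchronizedFlush(comm, PETSC_STDOUT); CHKERRQ(ierr);
}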

I fix this in the following way:

commit aef309e0b49bf6e55e6f970bc0bdabf021b346f6
Author: Lawrence Mitchell <***@imperial.ac.uk>
Date: Fri Oct 23 09:32:36 2015 +0100

plexdistribute: Use better guess for local coordinate block size

If we create local coordinates on a process with an empty section, we end
up setting a block size of 1 on the newly distributed local
coordinates (since the local block size is unset). In this case, check
if we have global coordinates in place and try using the block size
they provide instead.

diff --git a/src/dm/impls/plex/plexdistribute.c b/src/dm/impls/plex/plexdistribute.c
index 24d39b8..8049019 100644
--- a/src/dm/impls/plex/plexdistribute.c
+++ b/src/dm/impls/plex/plexdistribute.c
@@ -1040,7 +1040,7 @@ PetscErrorCode DMPlexDistributeCoordinates(DM dm, PetscSF migrationSF, DM dmParallel)
{
MPI_Comm comm;
PetscSection originalCoordSection, newCoordSection;
- Vec originalCoordinates, newCoordinates;
+ Vec originalCoordinates, newCoordinates, globalCoordinates;
PetscInt bs;
const char *name;
const PetscReal *maxCell, *L;
@@ -1055,6 +1055,7 @@ PetscErrorCode DMPlexDistributeCoordinates(DM dm, PetscSF migrationSF, DM dmParallel)
ierr = DMGetCoordinateSection(dm, &originalCoordSection);CHKERRQ(ierr);
ierr = DMGetCoordinateSection(dmParallel, &newCoordSection);CHKERRQ(ierr);
ierr = DMGetCoordinatesLocal(dm, &originalCoordinates);CHKERRQ(ierr);
+ ierr = DMGetCoordinates(dm, &globalCoordinates);CHKERRQ(ierr);
if (originalCoordinates) {
ierr = VecCreate(comm, &newCoordinates);CHKERRQ(ierr);
ierr = PetscObjectGetName((PetscObject) originalCoordinates, &name);CHKERRQ(ierr);
@@ -1063,6 +1064,15 @@ PetscErrorCode DMPlexDistributeCoordinates(DM dm, PetscSF migrationSF, DM dmParallel)
ierr = DMPlexDistributeField(dm, migrationSF, originalCoordSection, originalCoordinates, newCoordSection, newCoordinates);CHKERRQ(ierr);
ierr = DMSetCoordinatesLocal(dmParallel, newCoordinates);CHKERRQ(ierr);
ierr = VecGetBlockSize(originalCoordinates, &bs);CHKERRQ(ierr);
+ if (globalCoordinates) {
+ PetscLayout map, gmap;
+ ierr = VecGetLayout(originalCoordinates, &map); CHKERRQ(ierr);
+ ierr = VecGetLayout(globalCoordinates, &gmap); CHKERRQ(ierr);
+ /* If local block size not set but global size is, pick global size */
+ if (map->bs < 0 && gmap->bs > 0) {
+ ierr = PetscLayoutGetBlockSize(gmap, &bs); CHKERRQ(ierr);
+ }
+ }
ierr = VecSetBlockSize(newCoordinates, bs);CHKERRQ(ierr);
ierr = VecDestroy(&newCoordinates);CHKERRQ(ierr);
}

Now at least I can dump and reload my DM.
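
As a quick sanity check (again just a sketch of mine, not part of the program above), I print the local cell and vertex counts of the reloaded, redistributed DM so the round trip can be eyeballed against the original:

/* Sanity-check sketch: report per-rank cell and vertex counts of the
   reloaded DM after redistribution. */
{
  PetscInt cStart, cEnd, vStart, vEnd;

  ierr = DMPlexGetHeightStratum(newdm, 0, &cStart, &cEnd); CHKERRQ(ierr);
  ierr = DMPlexGetDepthStratum(newdm, 0, &vStart, &vEnd); CHKERRQ(ierr);
  ierr = PetscSynchronizedPrintf(comm, "reloaded: %D cells, %D vertices\n",
                                 cEnd - cStart, vEnd - vStart); CHKERRQ(ierr);
  ierr = PetscSynchronizedFlush(comm, PETSC_STDOUT); CHKERRQ(ierr);
}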

OK, now let's think about the natural-to-global support. If I understand this correctly, I need to set a section on the serial DM (before distribution) and call DMSetUseNatural; the natural-to-global SF is then created during distribution.
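
In code, I read that as something like the following (an untested sketch of my understanding only; the one-dof-per-vertex section is just a placeholder):

/* Sketch of the ordering I believe is required (untested): both the
   section and DMSetUseNatural must be in place on the serial dm before
   DMPlexDistribute, so that the natural-to-global SF gets built. */
PetscSection section;
PetscInt     pStart, pEnd, vStart, vEnd, p;

/* Placeholder section: one dof per vertex on the serial dm */
ierr = DMPlexGetChart(dm, &pStart, &pEnd); CHKERRQ(ierr);
ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); CHKERRQ(ierr);
ierr = PetscSectionCreate(comm, &section); CHKERRQ(ierr);
ierr = PetscSectionSetChart(section, pStart, pEnd); CHKERRQ(ierr);
for (p = vStart; p < vEnd; ++p) {
  ierr = PetscSectionSetDof(section, p, 1); CHKERRQ(ierr);
}
ierr = PetscSectionSetUp(section); CHKERRQ(ierr);

ierr = DMSetDefaultSection(dm, section); CHKERRQ(ierr);
ierr = PetscSectionDestroy(&section); CHKERRQ(ierr);
ierr = DMSetUseNatural(dm, PETSC_TRUE); CHKERRQ(ierr);
ierr = DMPlexDistribute(dm, 0, NULL, &pardm); CHKERRQ(ierr);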

This is pretty awkward for me to deal with, since I don't have a section at all before distributing, and would rather not build it then. Is this a hard limitation? Naively it seems like I should be able to derive a natural numbering from the distributed DM by using the point migration SF. Thoughts?


Cheers,

Lawrence
