[petsc-users] Crash when using valgrind
Dominik Szczerba
2013-04-17 11:41:57 UTC
I have been using valgrind successfully with PETSc for a long time, but
now it suddenly refuses to work. For example, running an otherwise properly
functioning program under valgrind crashes:

mpiexec -n 2 valgrind --tool=memcheck -q --num-callers=20 MySolver

cr_libinit.c:183 cri_init: sigaction() failed: Invalid argument
cr_libinit.c:183 cri_init: sigaction() failed: Invalid argument

=====================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 134
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)

The same problem occurs if I run without mpiexec at all, on a single process.

Google found only one related page, on the valgrind site, but I was not
able to conclude much from it. Has anyone else experienced the same problem?

Thanks,
Dominik
Jed Brown
2013-04-17 13:43:13 UTC
Can you get a stack trace? Does this happen on a different machine?

Should we turn off signal handling when running in valgrind?
Dominik Szczerba
2013-04-17 15:07:44 UTC
Post by Jed Brown
Can you get a stack trace? Does this happen on a different machine?
Stack trace of what exactly? I do not seem to be able to run gdb with
valgrind...?

gdb valgrind --tool=memcheck -q --num-callers=20 MySolver
gdb: unrecognized option '--tool=memcheck'

I have no other machine to check at the moment, but am trying to set one up.

Thanks
Dominik
Jed Brown
2013-04-17 22:03:07 UTC
Post by Dominik Szczerba
Post by Jed Brown
Can you get a stack trace? Does this happen on a different machine?
Stack trace of what exactly? I do not seem to be able to run gdb with
valgrind...?
gdb valgrind --tool=memcheck -q --num-callers=20 MySolver
gdb: unrecognized option '--tool=memcheck'
Sometimes it helps to use 'valgrind --db-attach=yes'.

What happens when you pass -no_signal_handler to the PETSc program?
Dominik Szczerba
2013-04-18 09:12:38 UTC
Post by Jed Brown
Sometimes it helps to use 'valgrind --db-attach=yes'.
valgrind --db-attach=yes --tool=memcheck -q --num-callers=20 MySolver
bt

#0 0x0000000007cf2425 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x0000000007cf5b8b in __GI_abort () at abort.c:91
#2 0x000000000987c98d in ?? () from /usr/lib/libcr.so.0
#3 0x000000000400f306 in call_init (l=<optimized out>, argc=2, argv=0x7ff000138, env=0x7ff000150) at dl-init.c:85
#4 0x000000000400f3df in call_init (env=<optimized out>, argv=<optimized out>, argc=<optimized out>, l=<optimized out>) at dl-init.c:52
#5 _dl_init (main_map=0x42242c8, argc=2, argv=0x7ff000138, env=0x7ff000150) at dl-init.c:134
#6 0x00000000040016ea in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#7 0x0000000000000002 in ?? ()
#8 0x00000007ff0003d7 in ?? ()
#9 0x00000007ff000412 in ?? ()
#10 0x0000000000000000 in ?? ()
Post by Jed Brown
What happens when you pass -no_signal_handler to the PETSc program?
valgrind --tool=memcheck -q --num-callers=20 MySolver -no_signal_handler

No change, i.e.:

cr_libinit.c:183 cri_init: sigaction() failed: Invalid argument
Aborted (core dumped)

Any more ideas?

Many thanks
Dominik
Jed Brown
2013-04-18 13:29:16 UTC
Post by Dominik Szczerba
Post by Jed Brown
What happens when you pass -no_signal_handler to the PETSc program?
valgrind --tool=memcheck -q --num-callers=20 MySolver -no_signal_handler
cr_libinit.c:183 cri_init: sigaction() failed: Invalid argument
Aborted (core dumped)
Can you run simple BLCR-using programs in Valgrind? You might have to
do your debugging in a non-BLCR build.
Dominik Szczerba
2013-04-19 11:36:00 UTC
Post by Jed Brown
Post by Dominik Szczerba
Post by Jed Brown
What happens when you pass -no_signal_handler to the PETSc program?
valgrind --tool=memcheck -q --num-callers=20 MySolver -no_signal_handler
cr_libinit.c:183 cri_init: sigaction() failed: Invalid argument
Aborted (core dumped)
Can you run simple BLCR-using programs in Valgrind? You might have to
do your debugging in a non-BLCR build.
I have now found that this seems to be an issue with the system mpich2
that I am using. Thanks for the pointer.

Dominik
