I am running into a segfault where a TSD key is NULL during MPI_Finalize; see the backtrace below. Hypothesis: MPI_Finalize might be called from a different thread than the one the progress thread runs on. This can leave the opal_tsd_key_t inside an opal_tsd_tracked_key_t set to NULL. Should we guard the delete with an if condition, or rely on the ULT implementation to handle NULL keys correctly?
Observation:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
#0 0x0000000000000000 in ?? ()
#1 0x00007ffff5043522 in qthread_key_delete (key=0x0) at ../../src/tls.c:44
#2 0x00007ffff6926659 in opal_tsd_key_delete (key=0x0) at ../../../../opal/mca/threads/qthreads/threads_qthreads_tsd.h:36
#3 0x00007ffff69269c5 in opal_tsd_tracked_key_destructor (key=0x7ffff7bbd260 <print_args_tsd_key>) at ../../../../opal/mca/threads/base/tsd.c:34
#4 0x00007ffff783bf8b in opal_obj_run_destructors (object=0x7ffff7bbd260 <print_args_tsd_key>) at ../../opal/class/opal_object.h:483
#5 0x00007ffff78415e0 in ompi_rte_finalize () at ../../ompi/runtime/ompi_rte.c:955
#6 0x00007ffff78398d4 in ompi_mpi_finalize () at ../../ompi/runtime/ompi_mpi_finalize.c:468
#7 0x00007ffff787f717 in PMPI_Finalize () at pfinalize.c:54
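To make the "if condition" option concrete, here is a minimal, self-contained sketch of a NULL guard in a tracked-key destructor. It does not use the real OPAL or Qthreads code: every name prefixed with mock_ is a hypothetical stand-in, and the mock backend simply asserts on NULL to imitate a delete routine that cannot handle a NULL key, as frame #1 of the backtrace suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real OPAL or Qthreads types. */
typedef struct mock_key { int id; } *mock_key_t;

typedef struct {
    mock_key_t key;               /* stays NULL if the key was never created */
} mock_tracked_key_t;

static int mock_deleted_count = 0; /* counts successful backend deletes */

/* Mock backend delete that, by assumption, does not tolerate NULL keys. */
static void mock_backend_key_delete(mock_key_t key)
{
    assert(key != NULL);          /* stands in for the crash on key=0x0 */
    free(key);
    mock_deleted_count++;
}

/* Create the underlying key; returns 0 on success, -1 on allocation failure. */
static int mock_tracked_key_create(mock_tracked_key_t *tracked, int id)
{
    tracked->key = malloc(sizeof(*tracked->key));
    if (NULL == tracked->key) {
        return -1;
    }
    tracked->key->id = id;
    return 0;
}

/* Guarded destructor: skip the backend call when no key exists. */
static void mock_tracked_key_destructor(mock_tracked_key_t *tracked)
{
    if (NULL == tracked->key) {
        return;                   /* nothing to delete */
    }
    mock_backend_key_delete(tracked->key);
    tracked->key = NULL;          /* makes a second destructor call harmless */
}
```

With a guard of this shape, destroying a tracked key that was never created is a no-op, and the behavior no longer depends on how each threading backend treats NULL. Whether the guard belongs in the tracked-key destructor or in each backend's delete wrapper is a separate design question that this sketch does not settle.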
You are perhaps running into an issue because PMIx has its own progress thread and knows nothing about the OPAL thread abstraction? It does use TSD, but in its own context of course.