Parallel Debugging Example

The following command starts a parallel debugging session on an Intel® MPI job with 8 processes.

% mpiexec -idb -n 8 cpi
Intel® Debugger for applications running on Intel® 64, Version X 
Attaching to program: /usr/bin/python, process 17717
Reading symbols from /usr/bin/python...(no debugging symbols found)...done.
[New Thread 182902515936 (LWP 17717)]__select_nocancel () in /lib64/tls/libc-2.3.2.so
Info: Optimized variables show as <no value> when no location is allocated.
Continuing.
MPIR_Breakpoint () at /tmp/vgusev.xtmpdir.svsmpi020.1167/mpi2.32e.svsmpi020.20080917/dev/src/pm/mpd/mtv.c:100
No source file named /tmp/vgusev.xtmpdir.svsmpi020.1167/mpi2.32e.svsmpi020.20080917/dev/src/pm/mpd/mtv.c.
(idb)

The following output is aggregated from processes 0 through 7. The [0:7] prefix identifies the processes that produced each line.

[0:7] Intel® Debugger for applications running on Intel® 64, Version X
%1 [0:7] Attaching to program: ~/test/cpi, process [17729;17737]
   [0:7] Reading symbols from  ~/test/cpi...done.

The following aggregated message combines output that differs from process to process; 2 is the message ID. In this case, the LWP IDs differ between processes, so they are shown as the range [17729;17737].

%2 [0:7] [New Thread 182908720320 (LWP [17729;17737])]
   [3,5] syscall () in /lib64/tls/libc-2.3.2.so
   [0:2,4,6:7] MPIR_WaitForDebugger () at /tmp/vgusev.xtmpdir.svsmpi020.1167/mpi2.32e.svsmpi020.20080917/dev/src/mpi/debugger/dbginit.c:139
(idb) 
   [0:7] stopped at [int main(int, char**):22 0x0000000000400ab1]
   [0:7]      22     MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
(idb) 
   [0:7]      18     char processor_name[MPI_MAX_PROCESSOR_NAME];
   [0:7]      19     int gate = 0;
   [0:7]      20 
   [0:7]      21     MPI_Init(&argc,&argv);
   [0:7] >    22     MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
   [0:7]      23     MPI_Comm_rank(MPI_COMM_WORLD,&myid);
   [0:7]      24     MPI_Get_processor_name(processor_name,&namelen);
   [0:7]      25 
   [0:7]      26     fprintf(stderr,"Process %d on %s\n",
(idb) 
(idb) b f 
(idb) 
   [0:7] Breakpoint 1 at 0x400a41: file ~/test/cpi.c, line 8.
(idb) c 
(idb) 
   [0:7] Continuing.
   [0:7] 
%3 [0:7] Breakpoint 1, f (a=[0.0050000000000000001;0.074999999999999997]) at  ~/test/cpi.c:8
   [0:7] 8          return (4.0 / (1.0 + a*a));
(idb) where 
(idb) 
%4 [0:7] #0  0x0000000000400a41 in f (a=[0.0050000000000000001;0.074999999999999997]) at ~/test/cpi.c:8
%5 [0:7] #1  0x0000000000400bf3 in main (argc=1, argv=0x7fbfe7d358) at ~/test/cpi.c:52
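
The ranges shown for a in messages %3 and %4 are consistent with each process evaluating a different slice of the integration interval. The Python sketch below reproduces that arithmetic. It assumes that cpi.c follows the widely distributed MPICH example (midpoint rule, 100 intervals, each rank starting at interval rank + 1 and stepping by the number of processes). The loop itself does not appear in this transcript, so the function names and default parameters here are illustrative assumptions only.

```python
def f_arguments(rank, num_procs, intervals):
    """Return the x values one rank would pass to f, in call order.

    Assumption: x = h * (i - 0.5) for i = rank + 1, rank + 1 + num_procs, ...
    as in the common MPICH cpi example; not confirmed by this transcript.
    """
    h = 1.0 / intervals
    return [h * (i - 0.5) for i in range(rank + 1, intervals + 1, num_procs)]


def argument_range(ranks, call_index, num_procs=8, intervals=100):
    """Return (min, max) of the argument to f over the given ranks
    for their call number call_index (0 is the first call)."""
    values = [f_arguments(r, num_procs, intervals)[call_index] for r in ranks]
    return min(values), max(values)


# First hit of the breakpoint on all eight processes.
print(argument_range(range(0, 8), 0))
# Second hit of the breakpoint on processes 4 through 7 only.
print(argument_range(range(4, 8), 1))
```

Under these assumptions, the first call spans roughly 0.005 to 0.075 across processes 0 to 7, and the second call spans roughly 0.125 to 0.155 across processes 4 to 7, which agrees with the ranges reported at Breakpoint 1 in this session.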

The following command sets the current process set to include processes 4, 5, 6, and 7.

(idb) focus [4:7] 
(idb) c 
(idb)

The prefix on the following output shows the current process set.

   [4:7] Continuing.
   [4:7] 
%6 [4:7] Breakpoint 1, f (a=[0.125;0.155]) at ~/test/cpi.c:8
   [4:7] 8          return (4.0 / (1.0 + a*a));
(idb) where 
(idb) 
%7 [4:7] #0  0x0000000000400a41 in f (a=[0.125;0.155]) at ~/test/cpi.c:8
%8 [4:7] #1  0x0000000000400bf3 in main (argc=1, argv=0x7fbff7d7d8) at ~/test/cpi.c:52
(idb) focus [*] 
(idb) n 
(idb) 
%9 [0:7] main (argc=1, argv=0x7fbff2a468) at ~/test/cpi.c:52
   [0:7] 52                    sum += f(x);
(idb) where 
(idb) 
%10 [0:7] #0  0x0000000000400bf3 in main (argc=1, argv=0x7fbfe7d358) at ~/test/cpi.c:52
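
The process set notation used by focus and by the output prefixes (for example [4:7], [3,5], [0:2,4,6:7], and [*]) can be illustrated with a short Python sketch. The reading of the notation is inferred from this transcript: a:b is taken as an inclusive range, commas separate items, and * is taken to mean every process in the job. The function names are hypothetical and are not part of the debugger.

```python
def expand_process_set(spec, total):
    """Expand a set such as "[0:2,4,6:7]" into a sorted list of ranks.

    total is the number of processes in the job; it is only needed
    to resolve "[*]" (assumed here to mean all processes).
    """
    body = spec.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("process set must be enclosed in brackets")
    body = body[1:-1].strip()
    if body == "*":
        return list(range(total))
    ranks = set()
    for item in body.split(","):
        first, sep, last = item.partition(":")
        low = int(first)
        high = int(last) if sep else low
        ranks.update(range(low, high + 1))  # ranges are inclusive
    return sorted(ranks)


def format_process_set(ranks):
    """Compress ranks into the compact notation used in the prefixes."""
    runs = []
    for rank in sorted(set(ranks)):
        if runs and rank == runs[-1][1] + 1:
            runs[-1][1] = rank          # extend the current run
        else:
            runs.append([rank, rank])   # start a new run
    items = [str(a) if a == b else f"{a}:{b}" for a, b in runs]
    return "[" + ",".join(items) + "]"


print(expand_process_set("[4:7]", 8))
print(format_process_set([0, 1, 2, 4, 6, 7]))
```

With an eight-process job, expanding "[*]" in this sketch yields ranks 0 through 7, which corresponds to the [0:7] prefix seen after focus [*].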

The following command displays all the aggregated messages saved in the message list.

(idb) show aggregated message 
%1 [0:7] Attaching to program: ~/test/cpi, process [17729;17737]
%2 [0:7] [New Thread 182908720320 (LWP [17729;17737])]
%3 [0:7] Breakpoint 1, f (a=[0.0050000000000000001;0.074999999999999997]) at ~/test/cpi.c:8
%4 [0:7] #0  0x0000000000400a41 in f (a=[0.0050000000000000001;0.074999999999999997]) at ~/test/cpi.c:8
%5 [0:7] #1  0x0000000000400bf3 in main (argc=1, argv=0x7fbfe7d358) at ~/test/cpi.c:52
%6 [4:7] Breakpoint 1, f (a=[0.125;0.155]) at ~/test/cpi.c:8
%7 [4:7] #0  0x0000000000400a41 in f (a=[0.125;0.155]) at ~/test/cpi.c:8
%8 [4:7] #1  0x0000000000400bf3 in main (argc=1, argv=0x7fbff7d7d8) at ~/test/cpi.c:52
%9 [0:7] main (argc=1, argv=0x7fbff2a468) at ~/test/cpi.c:52
%10 [0:7] #0  0x0000000000400bf3 in main (argc=1, argv=0x7fbfe7d358) at ~/test/cpi.c:52

The following command expands the aggregated message with message ID 1.

(idb) expand aggregated message 1 
%1 [0:7] Attaching to program: ~/test/cpi, process [17729;17737]
 [3] Attaching to program: ~/test/cpi, process 17732
 [5] Attaching to program: ~/test/cpi, process 17734
 [2] Attaching to program: ~/test/cpi, process 17730
 [4] Attaching to program: ~/test/cpi, process 17733
 [0] Attaching to program: ~/test/cpi, process 17737
 [1] Attaching to program: ~/test/cpi, process 17729
 [7] Attaching to program: ~/test/cpi, process 17736
 [6] Attaching to program: ~/test/cpi, process 17735
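
The relationship between the per-process lines and the aggregated line can be sketched in Python. This is an illustration of the behavior visible in the transcript, not the debugger's actual implementation: it assumes that lines are merged when they differ only in numeric fields, and that each differing field is shown as [min;max]. The names NUMBER and aggregate are hypothetical.

```python
import re

# Matches an unsigned integer or decimal number inside a message.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def aggregate(messages):
    """Merge per-process lines that differ only in their numbers.

    messages maps a process rank to its output line. Returns a single
    line in which every differing number is replaced by [min;max], or
    None when the lines differ in more than their numeric fields.
    """
    texts = list(messages.values())
    skeletons = {NUMBER.sub("#", text) for text in texts}
    if len(skeletons) != 1:
        return None
    merged = []
    # Each column holds the values of one numeric field across all lines.
    for column in zip(*(NUMBER.findall(text) for text in texts)):
        if len(set(column)) == 1:
            merged.append(column[0])
        else:
            low = min(column, key=float)
            high = max(column, key=float)
            merged.append(f"[{low};{high}]")
    fields = iter(merged)
    return NUMBER.sub(lambda match: next(fields), texts[0])


pids = {3: 17732, 5: 17734, 2: 17730, 4: 17733,
        0: 17737, 1: 17729, 7: 17736, 6: 17735}
lines = {rank: f"Attaching to program: ~/test/cpi, process {pid}"
         for rank, pid in pids.items()}
print(aggregate(lines))
```

Fed the eight lines from the expansion above, the sketch produces the same text as aggregated message %1 (without the %1 [0:7] prefix).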

The following commands disable breakpoint 1, run the job to completion, and exit the debugger.

(idb) disable 1 
(idb) 
(idb) c 
(idb) 
    [0:7] Continuing.
pi is approximately 3.1416009869231245, Error is 0.0000083333333314
wall clock time = 120.800664
    [0:7] Program exited normally.
(idb) 
(idb) quit 

Copyright © 1996-2010, Intel Corporation. All rights reserved.