The following command starts a parallel debugging session on an Intel® MPI job with 8 processes.
% mpiexec -idb -n 8 cpi
Intel® Debugger for applications running on Intel® 64, Version X
Attaching to program: /usr/bin/python, process 17717
Reading symbols from /usr/bin/python...(no debugging symbols found)...done.
[New Thread 182902515936 (LWP 17717)]
__select_nocancel () in /lib64/tls/libc-2.3.2.so
Info: Optimized variables show as <no value> when no location is allocated.
Continuing.
MPIR_Breakpoint () at /tmp/vgusev.xtmpdir.svsmpi020.1167/mpi2.32e.svsmpi020.20080917/dev/src/pm/mpd/mtv.c:100
No source file named /tmp/vgusev.xtmpdir.svsmpi020.1167/mpi2.32e.svsmpi020.20080917/dev/src/pm/mpd/mtv.c.
(idb)
The following is a message from processes 0 to 7.
[0:7] Intel® Debugger for applications running on Intel® 64, Version X
%1 [0:7] Attaching to program: ~/test/cpi, process [17729;17737]
[0:7] Reading symbols from ~/test/cpi...done.
The following aggregated message, with message ID 2, contains portions that differ between processes. In this case, the LWP IDs differ from process to process, and the differing values are shown as the range [17729;17737].
%2 [0:7] [New Thread 182908720320 (LWP [17729;17737])]
[3,5] syscall () in /lib64/tls/libc-2.3.2.so
[0:2,4,6:7] MPIR_WaitForDebugger () at /tmp/vgusev.xtmpdir.svsmpi020.1167/mpi2.32e.svsmpi020.20080917/dev/src/mpi/debugger/dbginit.c:139
(idb)
[0:7] stopped at [int main(int, char**):22 0x0000000000400ab1]
[0:7] 22 MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
(idb)
[0:7] 18 char processor_name[MPI_MAX_PROCESSOR_NAME];
[0:7] 19 int gate = 0;
[0:7] 20
[0:7] 21 MPI_Init(&argc,&argv);
[0:7] > 22 MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
[0:7] 23 MPI_Comm_rank(MPI_COMM_WORLD,&myid);
[0:7] 24 MPI_Get_processor_name(processor_name,&namelen);
[0:7] 25
[0:7] 26 fprintf(stderr,"Process %d on %s\n",
(idb)
(idb) b f
(idb)
[0:7] Breakpoint 1 at 0x400a41: file ~/test/cpi.c, line 8.
(idb) c
(idb)
[0:7] Continuing.
[0:7]
%3 [0:7] Breakpoint 1, f (a=[0.0050000000000000001;0.074999999999999997]) at ~/test/cpi.c:8
[0:7] 8 return (4.0 / (1.0 + a*a));
(idb) where
(idb)
%4 [0:7] #0 0x0000000000400a41 in f (a=[0.0050000000000000001;0.074999999999999997]) at ~/test/cpi.c:8
%5 [0:7] #1 0x0000000000400bf3 in main (argc=1, argv=0x7fbfe7d358) at ~/test/cpi.c:52
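The way differing portions collapse into a [min;max] range can be illustrated with a short Python sketch. This is not the debugger's implementation; the function name aggregate and the grouping rule (messages are merged when they differ only in decimal numbers) are assumptions made for illustration, and hexadecimal values are deliberately not handled.

```python
import re
from collections import defaultdict

# Decimal integers or decimals only; hexadecimal values are not handled here.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def aggregate(messages):
    """Group per-rank messages that differ only in their numeric fields.

    `messages` maps rank -> message text (hypothetical input format).
    Returns a list of (ranks, text) pairs in which numbers that differ
    between ranks are replaced by a "[min;max]" range.
    """
    groups = defaultdict(list)  # literal text pieces -> [(rank, numbers)]
    for rank in sorted(messages):
        text = messages[rank]
        literal = tuple(NUMBER.split(text))
        groups[literal].append((rank, NUMBER.findall(text)))

    result = []
    for literal, members in groups.items():
        ranks = [rank for rank, _ in members]
        merged = literal[0]
        # One column per numeric field, holding that field's value on each rank.
        columns = zip(*(numbers for _, numbers in members))
        for column, tail in zip(columns, literal[1:]):
            if len(set(column)) == 1:
                merged += column[0]
            else:
                merged += f"[{min(column, key=float)};{max(column, key=float)}]"
            merged += tail
        result.append((ranks, merged))
    return result


if __name__ == "__main__":
    demo = {
        0: "[New Thread 182908720320 (LWP 17737)]",
        1: "[New Thread 182908720320 (LWP 17729)]",
    }
    for ranks, text in aggregate(demo):
        print(ranks, text)
```

In this sketch the input dictionary plays the role of the expanded form, and the returned pairs play the role of the aggregated form.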
The following command sets the current process set to include processes 4, 5, 6, and 7.
(idb) focus [4:7]
(idb) c
(idb)
The following prompt shows the current process set.
[4:7] Continuing.
[4:7]
%6 [4:7] Breakpoint 1, f (a=[0.125;0.155]) at ~/test/cpi.c:8
[4:7] 8 return (4.0 / (1.0 + a*a));
(idb) where
(idb)
%7 [4:7] #0 0x0000000000400a41 in f (a=[0.125;0.155]) at ~/test/cpi.c:8
%8 [4:7] #1 0x0000000000400bf3 in main (argc=1, argv=0x7fbff7d7d8) at ~/test/cpi.c:52
(idb) focus [*]
(idb) n
(idb)
%9 [0:7] main (argc=1, argv=0x7fbff2a468) at ~/test/cpi.c:52
[0:7] 52 sum += f(x);
(idb) where
(idb)
%10 [0:7] #0 0x0000000000400bf3 in main (argc=1, argv=0x7fbfe7d358) at ~/test/cpi.c:52
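The process-set notation seen in this session ([4:7], [3,5], [0:2,4,6:7], and [*] for all processes) can be read as a comma-separated list of ranks and inclusive ranges. The following Python sketch shows one way to expand and re-compact such sets. The function names parse_process_set and format_process_set are hypothetical, and the grammar is inferred only from the examples above, so treat it as an illustration rather than the debugger's own parser.

```python
def parse_process_set(spec, nprocs):
    """Expand a set such as "[0:2,4,6:7]" into a sorted list of ranks.

    "[*]" is taken to mean all `nprocs` processes. Ranges are inclusive.
    """
    body = spec.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    if body.strip() == "*":
        return list(range(nprocs))
    ranks = set()
    for part in body.split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            low, high = (int(piece) for piece in part.split(":"))
            ranks.update(range(low, high + 1))
        else:
            ranks.add(int(part))
    return sorted(ranks)


def format_process_set(ranks):
    """Collapse ranks into the compact form, merging consecutive runs."""
    ordered = sorted(set(ranks))
    parts = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next rank is consecutive.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        parts.append(str(ordered[i]) if i == j else f"{ordered[i]}:{ordered[j]}")
        i = j + 1
    return "[" + ",".join(parts) + "]"


if __name__ == "__main__":
    print(parse_process_set("[4:7]", 8))
    print(format_process_set([0, 1, 2, 4, 6, 7]))
```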
The following command displays all the aggregated messages saved in the message list.
(idb) show aggregated message
%1 [0:7] Attaching to program: ~/test/cpi, process [17729;17737]
%2 [0:7] [New Thread 182908720320 (LWP [17729;17737])]
%3 [0:7] Breakpoint 1, f (a=[0.0050000000000000001;0.074999999999999997]) at ~/test/cpi.c:8
%4 [0:7] #0 0x0000000000400a41 in f (a=[0.0050000000000000001;0.074999999999999997]) at ~/test/cpi.c:8
%5 [0:7] #1 0x0000000000400bf3 in main (argc=1, argv=0x7fbfe7d358) at ~/test/cpi.c:52
%6 [4:7] Breakpoint 1, f (a=[0.125;0.155]) at ~/test/cpi.c:8
%7 [4:7] #0 0x0000000000400a41 in f (a=[0.125;0.155]) at ~/test/cpi.c:8
%8 [4:7] #1 0x0000000000400bf3 in main (argc=1, argv=0x7fbff7d7d8) at ~/test/cpi.c:52
%9 [0:7] main (argc=1, argv=0x7fbff2a468) at ~/test/cpi.c:52
%10 [0:7] #0 0x0000000000400bf3 in main (argc=1, argv=0x7fbfe7d358) at ~/test/cpi.c:52

The following command expands the aggregated message with message ID 1.

(idb) expand aggregated message 1
%1 [0:7] Attaching to program: ~/test/cpi, process [17729;17737]
[3] Attaching to program: ~/test/cpi, process 17732
[5] Attaching to program: ~/test/cpi, process 17734
[2] Attaching to program: ~/test/cpi, process 17730
[4] Attaching to program: ~/test/cpi, process 17733
[0] Attaching to program: ~/test/cpi, process 17737
[1] Attaching to program: ~/test/cpi, process 17729
[7] Attaching to program: ~/test/cpi, process 17736
[6] Attaching to program: ~/test/cpi, process 17735
(idb) disable 1
(idb)
(idb) c
(idb)
[0:7] Continuing.
pi is approximately 3.1416009869231245, Error is 0.0000083333333314
wall clock time = 120.800664
[0:7] Program exited normally.
(idb)
(idb) quit
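The argument ranges reported at the breakpoint in f, and the final error value, are consistent with a midpoint-rule integration of 4/(1+x*x) over [0,1] in which each rank takes every eighth interval. The Python sketch below reproduces those numbers under two assumptions that the session does not state: the program uses n = 100 intervals, and rank r handles intervals r+1, r+1+numprocs, and so on. The helper names rank_points and estimate_pi are hypothetical; only f mirrors line 8 of cpi.c as shown in the session.

```python
import math


def f(a):
    """Integrand shown at cpi.c line 8 in the session."""
    return 4.0 / (1.0 + a * a)


def rank_points(rank, numprocs, n):
    """Midpoints evaluated by one rank, assuming a cyclic distribution."""
    h = 1.0 / n
    return [h * (i - 0.5) for i in range(rank + 1, n + 1, numprocs)]


def estimate_pi(numprocs, n):
    """Sum the contributions of all ranks (serially, with no MPI involved)."""
    h = 1.0 / n
    total = 0.0
    for rank in range(numprocs):
        total += sum(f(x) for x in rank_points(rank, numprocs, n))
    return h * total


if __name__ == "__main__":
    # First breakpoint hit: ranks 0 and 7 give the ends of the reported range.
    print(rank_points(0, 8, 100)[0], rank_points(7, 8, 100)[0])
    # Second hit for ranks 4 to 7.
    print(rank_points(4, 8, 100)[1], rank_points(7, 8, 100)[1])
    estimate = estimate_pi(8, 100)
    print(estimate, abs(estimate - math.pi))
```

Under these assumptions the first arguments seen by ranks 0 and 7 are close to 0.005 and 0.075, and the second arguments seen by ranks 4 and 7 are close to 0.125 and 0.155, which matches the ranges in messages %3 and %6. The last digits of the estimate can differ from the session output because the order of floating-point additions differs.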
Copyright © 1996-2010, Intel Corporation. All rights reserved.