The biggest challenge of debugging massively parallel applications is coping with large quantities of output from debuggers controlling the parallel application's processes. The Intel® Debugger helps you manage this output by aggregating similar output into groups. The debugger aggregates output by using the following two strategies:
It condenses identical output messages into a single output message. When the debugger displays an aggregated message, it prefixes the message with the range of user process IDs to which the output applies. The processes in that range are not necessarily consecutive. The debugger aggregates all processes with the same output into one final output message. For example, in the following message, [0-41] is the process range:
[0-41] Linux Application Debugger for Itanium®-based applications, Version XX
Outputs that have different hexadecimal digits, but are otherwise identical, are condensed by aggregating the differing digits into a range. For example, in the following message, [0-41] is the process range, and [0;41] is the value range:
[0-41]>2 0x120006d6c in feedback(myid=[0;41],np=42,name=0x11fffe018="mytest") "mytest.c":41
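The two strategies above can be illustrated with a small sketch. This is not the debugger's implementation, which this topic does not describe. The names aggregate, process_range, and NUM are hypothetical, and the sketch makes three assumptions: only whole numeric literals (decimal or 0x-prefixed) count as fields that may differ, the process range is printed as lowest-highest ID, and a space separates the prefix from the message.

```python
import re
from collections import defaultdict

# Assumption: a "differing value" is a whole decimal or 0x-prefixed literal.
NUM = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+)\b")


def to_int(literal):
    """Parse a decimal or 0x-prefixed literal into an integer."""
    return int(literal, 16) if literal.lower().startswith("0x") else int(literal)


def process_range(pids):
    """Format the process prefix as [lowest-highest]; IDs need not be consecutive."""
    lo, hi = min(pids), max(pids)
    return f"[{lo}]" if lo == hi else f"[{lo}-{hi}]"


def aggregate(outputs):
    """Condense per-process messages (dict: pid -> message) into prefixed lines."""
    groups = defaultdict(list)
    for pid, msg in outputs.items():
        # Messages share a group when the text around their numbers is identical.
        key = tuple(NUM.split(msg))
        groups[key].append((pid, NUM.findall(msg)))

    lines = []
    for parts, members in groups.items():
        pids = [pid for pid, _ in members]
        merged = []
        for column in zip(*(fields for _, fields in members)):
            if len(set(column)) == 1:
                merged.append(column[0])  # same value in every process
            else:
                lo = min(column, key=to_int)
                hi = max(column, key=to_int)
                merged.append(f"[{lo};{hi}]")  # value range
        # Re-interleave the fixed text with the merged numeric fields.
        body = "".join(text + field for text, field in zip(parts, merged + [""]))
        lines.append(f"{process_range(pids)} {body}")
    return lines
```

Passing 42 messages that differ only in the value of myid (0 through 41) to aggregate yields one line with the prefix [0-41] and the field myid=[0;41], mirroring the example above.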
Another challenge of debugging massively parallel applications is using a debugger to control all of the application's processes, or subsets of them, in a consistent manner. The Intel® Debugger provides this control through a single user interface.
At the startup of a parallel debugging session:
The root debugger is responsible for starting your parallel application and serves as your user interface. The aggregators perform output consolidation as described previously. The leaf debuggers control and query your application processes.
The branching factor is the factor used to build the n-ary tree and to determine the number of aggregators in the tree. For example, for 16 processes:
The $parallel_branchingfactor variable has a default value of 8. You can set it to any value equal to or greater than 2 in the debugger initialization file.
When you delete $parallel_branchingfactor from the initialization file, the branching factor used in the startup mechanism is the default value.
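To get a feel for how the branching factor affects the number of aggregators, the following sketch estimates the size of each aggregator level. It rests on an assumption that this topic does not confirm: every aggregator, and the root, takes at most branching_factor children, and levels are added until the root can hold what remains. The function name aggregator_levels is hypothetical.

```python
def aggregator_levels(num_processes, branching_factor=8):
    """Estimate aggregator counts per level, from just above the leaf
    debuggers up to just below the root.

    Assumption (not taken from the debugger documentation): each node
    has at most `branching_factor` children.
    """
    if branching_factor < 2:
        raise ValueError("branching factor must be 2 or greater")
    levels = []
    width = num_processes  # one leaf debugger per application process
    while width > branching_factor:
        # Ceiling division: how many parents are needed for `width` children.
        width = (width + branching_factor - 1) // branching_factor
        levels.append(width)
    return levels
```

Under this assumption, 16 processes with the default branching factor of 8 need a single level of 2 aggregators, while a branching factor of 2 needs three levels of 8, 4, and 2 aggregators. A smaller factor gives each aggregator less to do but adds levels that output must pass through.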
Aggregator delay specifies the time that aggregators wait, when not all of the expected messages have been received, before they aggregate and send messages down to the next level.
You can change the value of the $parallel_aggregatordelay variable from its default value of 3000 milliseconds in the debugger initialization file. For more information, see Parallel Debugging Tips.
When you delete $parallel_aggregatordelay from the debugger initialization file, the aggregator delay used in the startup mechanism is the default value.
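The aggregator delay can be pictured as a deadline on collecting messages. The sketch below is an illustration only, not the debugger's code. The function collect and its parameters are hypothetical, and it assumes the delay is measured from the moment the aggregator starts waiting, which this topic does not specify.

```python
import queue
import time


def collect(inbox, expected, delay_ms=3000):
    """Gather up to `expected` messages from `inbox` (a queue.Queue),
    giving up once `delay_ms` milliseconds have passed.

    Assumption: the delay starts when collection starts.
    """
    deadline = time.monotonic() + delay_ms / 1000.0
    received = []
    while len(received) < expected:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # delay expired: forward whatever has arrived
        try:
            received.append(inbox.get(timeout=remaining))
        except queue.Empty:
            break  # nothing more arrived before the deadline
    return received
```

If all expected messages are already queued, collect returns at once. If some are missing, it returns the partial set after the delay, so a longer delay favors fuller aggregation and a shorter one favors responsiveness.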
Copyright © 1996-2010, Intel Corporation. All rights reserved.