| At a given iteration of the main loop, and because of the Cartesian
property of the distribution scheme,  each panel factorization  occurs in
one column of processes.   This  particular part of the computation  lies
on the critical path of  the overall algorithm.  The user is  offered the
choice of three  (Crout, left- and right-looking)  matrix-multiply  based 
recursive variants. The software also allows the user to choose how
many sub-panels the current panel should be divided into during the
recursion. Furthermore, one can select at run-time the recursion
stopping criterion in terms of the number of columns left to factorize.
When this threshold is reached, the sub-panel is then factorized
using one of the three matrix-vector based variants (Crout, left- or
right-looking). Finally, for each panel column the pivot search, the associated
swap and broadcast operation of the pivot row are combined into a
single communication step. A binary-exchange (leave-on-all) reduction
performs these three operations at once. |  |
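
The recursion described above can be illustrated with a small sequential sketch. It is not the library's code: it implements only a right-looking recursive variant on a single tall panel, uses exact `Fraction` arithmetic so results are easy to check, and the names `lu_panel`, `factor_panel`, `factor_columns`, `update_right`, `ndiv` (number of sub-panels) and `nbmin` (stopping threshold in columns) are all chosen here for illustration.

```python
from fractions import Fraction


def factor_columns(a, j0, n, piv):
    """Base case: unblocked right-looking LU on columns j0..j0+n-1."""
    m = len(a)
    for k in range(j0, j0 + n):
        # Pivot search: row with the largest magnitude in column k.
        p = max(range(k, m), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0:
            raise ValueError("panel is rank deficient")
        piv[k] = p
        a[k], a[p] = a[p], a[k]  # swap whole rows of the panel
        for i in range(k + 1, m):
            a[i][k] /= a[k][k]  # multiplier l_ik, stored in place
            for j in range(k + 1, j0 + n):
                a[i][j] -= a[i][k] * a[k][j]


def update_right(a, j0, w, end):
    """Apply factored columns j0..j0+w-1 to columns j0+w..end-1."""
    for c in range(j0 + w, end):
        for i in range(j0 + 1, len(a)):
            # Rows inside the diagonal block only use the rows above them
            # (unit lower triangular solve); rows below use all w rows.
            for r in range(j0, min(i, j0 + w)):
                a[i][c] -= a[i][r] * a[r][c]


def factor_panel(a, j0, n, piv, ndiv, nbmin):
    """Recursively factor columns j0..j0+n-1 of the panel in place."""
    if n <= nbmin:  # stopping criterion: few enough columns left
        factor_columns(a, j0, n, piv)
        return
    width = -(-n // ndiv)  # ceil(n / ndiv) columns per sub-panel
    j, end = j0, j0 + n
    while j < end:
        w = min(width, end - j)
        factor_panel(a, j, w, piv, ndiv, nbmin)
        update_right(a, j, w, end)
        j += w


def lu_panel(rows, ndiv=2, nbmin=2):
    """Return (LU factors stored in one array, pivot indices)."""
    if ndiv < 2 or nbmin < 1:
        raise ValueError("need ndiv >= 2 and nbmin >= 1")
    a = [[Fraction(x) for x in row] for row in rows]
    ncols = len(a[0])
    if len(a) < ncols:
        raise ValueError("panel must have at least as many rows as columns")
    piv = [0] * ncols
    factor_panel(a, 0, ncols, piv, ndiv, nbmin)
    return a, piv


panel = [[2, 1, 1, 0], [4, 3, 3, 1], [8, 7, 9, 5],
         [6, 7, 9, 8], [1, 0, 2, 1], [3, 5, 1, 2]]
lu, piv = lu_panel(panel, ndiv=2, nbmin=1)
```

In exact arithmetic the choice of `ndiv` and `nbmin` does not change the factors, only the order in which the work is done, which is why these parameters can be treated purely as performance knobs.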
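
The combined pivot step can be sketched as a simulation of one process column. This is an illustration under stated assumptions, not the library's message layout: the number of processes is assumed to be a power of two, each process is modelled as a dictionary from global row index to row contents, and the names `local_message`, `combine`, `binary_exchange` and `pivot_step` are hypothetical. Each message carries both the best local pivot candidate and the current row, so once the exchange finishes every process knows the pivot row (broadcast) and the owner of the pivot row holds what it needs to complete the swap.

```python
def local_message(rows, k):
    """Best local pivot candidate for column k, plus row k if owned."""
    eligible = [g for g in rows if g >= k]
    if eligible:
        g = max(eligible, key=lambda g: (abs(rows[g][k]), -g))
        cand = (abs(rows[g][k]), g, list(rows[g]))
    else:
        cand = (-1, -1, None)  # this process has nothing to offer
    cur = list(rows[k]) if k in rows else None
    return cand, cur


def combine(x, y):
    """Merge two messages: keep the larger candidate and the current row."""
    (cx, cur_x), (cy, cur_y) = x, y
    best = cx if (cx[0], -cx[1]) >= (cy[0], -cy[1]) else cy
    return best, (cur_x if cur_x is not None else cur_y)


def binary_exchange(msgs):
    """Pairwise exchange with partner rank ^ d; result is left on all ranks."""
    p = len(msgs)
    if p == 0 or p & (p - 1):
        raise ValueError("this sketch assumes a power-of-two process count")
    d = 1
    while d < p:
        msgs = [combine(msgs[r], msgs[r ^ d]) for r in range(p)]
        d *= 2
    return msgs


def pivot_step(procs, k):
    """Pivot search, swap and broadcast for column k in one reduction."""
    msgs = binary_exchange([local_message(rows, k) for rows in procs])
    for rows, ((_, g, pivot_row), cur_row) in zip(procs, msgs):
        if g in rows:
            rows[g] = list(cur_row)    # old row k moves into the pivot's slot
        if k in rows:
            rows[k] = list(pivot_row)  # pivot row becomes the new row k
    return msgs


# Eight rows dealt cyclically to four processes; each row is [value, tag].
column = [1, -7, 3, 2, 5, -2, 9, 4]
procs = [{g: [column[g], 10 * g] for g in range(8) if g % 4 == r}
         for r in range(4)]
msgs = pivot_step(procs, 0)
```

With four processes the exchange takes two rounds, and afterwards every entry of `msgs` is identical, which is the "leave-on-all" property mentioned above.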