Schur Complement Factorization and Parallel O(Log N) Algorithms
for Computation of Operational Space Mass Matrix and Its Inverse
Amir Fijany
Jet Propulsion Laboratory, California Institute of Technology
Pasadena, CA 91109
Abstract - In this paper, new factorization techniques for computation of the Operational Space Mass Matrix ($\Lambda$) and its inverse ($\Lambda^{-1}$) are developed. Starting with a new factorization of the inverse of the mass matrix ($\mathcal{M}^{-1}$) in the form of Schur Complement as $\mathcal{M}^{-1} = \mathcal{C} - \mathcal{B}^T\mathcal{A}^{-1}\mathcal{B}$, where $\mathcal{A}$ and $\mathcal{B}$ are block tridiagonal matrices and $\mathcal{C}$ is a tridiagonal matrix, similar factorizations for $\Lambda$ and $\Lambda^{-1}$ are derived. Specifically, the Schur Complement factorizations of $\Lambda^{-1}$ and $\Lambda$ are derived as $\Lambda^{-1} = \mathcal{D} - \mathcal{E}^T\mathcal{A}^{-1}\mathcal{E}$ and $\Lambda = \mathcal{S} - \mathcal{R}^T\mathcal{A}'^{-1}\mathcal{R}$, where $\mathcal{E}$ and $\mathcal{R}$ are sparse matrices and $\mathcal{D}$ and $\mathcal{S}$ are 6x6 matrices. The Schur Complement factorization provides a unified framework for computation of $\mathcal{M}^{-1}$, $\Lambda^{-1}$, and $\Lambda$. It also provides a deeper physical insight as well as simple physical interpretations of these factorizations. However, the main advantage of these new factorizations is that they are highly efficient for parallel computation. With O(N) processors, the computation of $\Lambda^{-1}$ and $\Lambda$ as well as their operator applications can be performed in O(Log N) steps. This represents both time- and processor-optimal parallel algorithms for their computations. To our knowledge, these are the first parallel algorithms that achieve the time lower bound of O(Log N) in the computation.

I. Introduction

The computation of the Operational Space Mass Matrix (OSMM), $\Lambda$, is fundamental in implementation of operational space dynamic control of robot arms [1]. The dynamic simulation of closed-chain robot manipulator systems (both single closed-chain systems and multiple arms forming a closed-chain system) requires the computation of the inverse of the OSMM, $\Lambda^{-1}$, and the inverse of the mass matrix, $\mathcal{M}^{-1}$.

In [2] a recursive O(N) algorithm for computation of $\Lambda$ is developed. Recursive O(N) algorithms for computation of $\Lambda^{-1}$ are developed in [3,4]. Once $\Lambda$ ($\Lambda^{-1}$) is computed then $\Lambda^{-1}$ ($\Lambda$) can be obtained by inverting a 6x6 matrix with a cost of O(1). Using the recursive O(N) algorithms for the dynamic simulation (or, forward dynamics) of single open-chain arms [5,6] along with the recursive O(N) algorithms for computation of $\Lambda$ or $\Lambda^{-1}$, the dynamic simulation of closed-chain systems can then be performed with a cost of O(N). These algorithms represent the asymptotically optimal serial algorithms for computation of both operational space dynamic control and dynamic simulation of closed-chain systems.

It seems, however, that there is no report on the development of efficient parallel algorithms for computation of $\Lambda$ and $\Lambda^{-1}$. A more general (and, as will be shown, a closely related) issue is regarding the existence of an optimal parallel algorithm, i.e., an O(Log N) algorithm with O(N) processors, for solution of the forward dynamics of an open-chain arm (or, operator application of $\mathcal{M}^{-1}$). An investigation of parallelism in this problem by analyzing the efficiency of existing algorithms for parallel computation is reported in [7]. Two main conclusions of this investigation can be summarized as follows.

1. The existing O(N) algorithms are strictly sequential, that is, parallelism in their computation is bounded. More precisely, the main bottleneck in parallel computation of O(N) algorithms is in parallelization of the nonlinear recurrences for computation of the articulated-body inertia. Note that the recursive O(N) algorithms in [2,3] for computation of $\Lambda$ and $\Lambda^{-1}$ also require the solution of similar nonlinear recurrences. This seems to imply that these algorithms are also strictly sequential.

2. If there indeed can be such an optimal parallel algorithm for the problem, then it must be derivable from an O(N) serial algorithm. Since existing O(N) algorithms are strictly sequential, the first step in deriving the optimal parallel algorithm is to develop new serial O(N) algorithms with efficiency for parallelization in mind. Such O(N) algorithms can only be developed by a global reformulation of the problem and not by an algebraic transformation in the computation of existing O(N) algorithms.

From a physical viewpoint, a given algorithm for the problem can be classified according to its interbody force decomposition strategy. From the standpoint of computation, the algorithm can be
NOMENCLATURE

$N$: Number of total Degrees-Of-Freedom (DOF) of the system
$P_{i,j}$: Position vector from $O_j$ to $O_i$, with $P_{i+1,i} = P_i$
$m_i$: Mass of link i
$h_i$, $k_i$: First and second moment of mass of link i about point $O_i$
$J_i$: Second moment of mass of link i about its center of mass, $C_i$
$\omega_i, \dot{\omega}_i \in R^{3\times1}$: Angular velocity and acceleration of link i (frame i+1)
$v_i, \dot{v}_i \in R^{3\times1}$: Linear velocity and acceleration of link i (point $O_i$)
$f_i$, $n_i$: Force and moment of interaction between link i-1 and link i
$I_i = \begin{bmatrix} k_i & \tilde{h}_i \\ -\tilde{h}_i & m_iU \end{bmatrix}$: 6x6 spatial inertia of link i about point $O_i$
$I_{i,C_i} = \begin{bmatrix} J_i & 0 \\ 0 & m_iU \end{bmatrix}$: 6x6 spatial inertia of link i about its center of mass
$V_i$: 6x1 spatial velocity of link i
$\dot{V}_i$: 6x1 spatial acceleration of link i
$F_i$: 6x1 spatial force of interaction between link i-1 and link i
$H_i \in R^{6\times1}$: 6x1 spatial axis (map matrix) of joint i
$\mathcal{I} \triangleq \mathrm{diag}\{I_i\} \in R^{6N\times6N}$: 6Nx6N global matrix of spatial inertias, i = N to 1
$\mathcal{H} \triangleq \mathrm{diag}\{H_i\}$: 6NxN global matrix of spatial axes, i = N to 1
$\mathcal{M} \in R^{N\times N}$: Symmetric Positive Definite (SPD) mass matrix
$\mathcal{J} \in R^{6\times N}$: Jacobian matrix
$\mathcal{V} \triangleq \mathrm{col}\{V_i\}$: 6Nx1 global vector of link velocities, i = N to 1
$\dot{\mathcal{V}} \triangleq \mathrm{col}\{\dot{V}_i\}$: 6Nx1 global vector of link accelerations, i = N to 1
$\mathcal{F} \triangleq \mathrm{col}\{F_i\}$: 6Nx1 global vector of interaction forces, i = N to 1
$\mathcal{Q} \triangleq \mathrm{col}\{Q_i\}$: Nx1 global vector of joint positions, i = N to 1
$\dot{\mathcal{Q}} \triangleq \mathrm{col}\{\dot{Q}_i\}$: Nx1 global vector of joint velocities, i = N to 1
$\ddot{\mathcal{Q}} \triangleq \mathrm{col}\{\ddot{Q}_i\}$: Nx1 global vector of joint accelerations, i = N to 1
$\mathcal{T} \triangleq \mathrm{col}\{\tau_i\}$: Nx1 global vector of applied joint forces, i = N to 1

Figure 1. Links, Frames, and Position Vectors ($C_i$: Center of Mass of Link i)
classified based on the resulting factorization of the mass matrix which corresponds to the specific force decomposition strategy (see [8] for a more detailed discussion). A new algorithm based on a global reformulation of the problem is then the one that starts with a different and new force decomposition strategy and results in a new factorization of the mass matrix.

Interestingly, a recently developed iterative algorithm in [9,10] for open-chain systems represents such a global reformulation of the problem. It differs from the existing O(N) algorithms in the sense that it is based on a different strategy for force decomposition. In [8,11], we have shown that this strategy leads to a new and completely different factorization of $\mathcal{M}^{-1}$ in the form of Schur Complement. This factorization, in turn, results in a new O(N) algorithm for the problem which is strictly efficient for parallel computation, that is, it is less efficient than other O(N) algorithms for serial computation but it can be parallelized to achieve the time lower bound of O(Log N) with O(N) processors.

In this paper, we show that this factorization of $\mathcal{M}^{-1}$ directly results in a new Schur Complement factorization for $\Lambda^{-1}$ and subsequently for $\Lambda$. As for $\mathcal{M}^{-1}$, these factorizations provide a much deeper physical insight as well as a simple physical interpretation of both $\Lambda^{-1}$ and $\Lambda$. They also result in O(N) algorithms for computation of $\Lambda^{-1}$ and $\Lambda$ as well as their operator applications. These O(N) algorithms, though seemingly not competitive for serial computation, can be efficiently parallelized, leading to O(Log N) parallel algorithms with O(N) processors.

This paper is organized as follows. In §II notation and some preliminaries are presented. The Schur Complement factorizations of $\Lambda^{-1}$ and $\Lambda$ are derived in §III. Serial and parallel computation of $\Lambda^{-1}$ and $\Lambda$ are discussed in §IV. Finally, some concluding remarks are made in §V.

II. Notation and Preliminaries

A. Spatial and Global Notation

In the following derivations, we use spatial notation which, for the sake of clarity, is shown with upper-case italic letters. Here, only joints with one revolute DOF are considered. However, all results can be extended to systems with joints having different and/or more DOFs.

With any vector $v$, a tensor $\tilde{v}$ can be associated whose representation in any frame is a skew symmetric matrix:
$$ \tilde{v} = \begin{bmatrix} 0 & -v_{(z)} & v_{(y)} \\ v_{(z)} & 0 & -v_{(x)} \\ -v_{(y)} & v_{(x)} & 0 \end{bmatrix} $$
where $v_{(x)}$, $v_{(y)}$, and $v_{(z)}$ are the components of $v$ in the frame considered. The tensor $\tilde{v}$ has the properties that $\tilde{v}^T = -\tilde{v}$ and $\tilde{v}_1 v_2 = v_1 \times v_2$, i.e., it is a vector cross-product operator (T denotes the transpose). A matrix $\hat{v}$ associated to the vector $v$ is defined as
$$ \hat{v} = \begin{bmatrix} U & \tilde{v} \\ 0 & U \end{bmatrix} \quad \text{and} \quad \hat{v}^T = \begin{bmatrix} U & 0 \\ -\tilde{v} & U \end{bmatrix} \in R^{6\times6} $$
where here (and through the rest of the paper) $U$ and $0$ stand for unit and zero matrices of appropriate size. The spatial velocities of two rigidly connected points A and B are related as
$$ V_A = \hat{P}_{A,B}^T V_B $$
where $P_{A,B}$ denotes the position vector from B to A. The matrix $\hat{P}_{A,B}$ has the properties
$$ \hat{P}_{A,B}\hat{P}_{B,C} = \hat{P}_{A,C} \quad \text{and} \quad (\hat{P}_{A,B})^{-1} = \hat{P}_{B,A} \tag{1} $$
The spatial forces acting at two rigidly connected points A and B are related as
$$ F_B = \hat{P}_{A,B} F_A $$
If the linear and angular velocities of point A are zero then
$$ \dot{V}_A = \hat{P}_{A,B}^T \dot{V}_B $$
In general, the spatial inertia of link i about point j is denoted by $I_{i,j}$. The spatial inertia of link i about its center of mass is designated by $I_{i,C_i}$. The spatial inertia of body i about point $O_i$ (designated as $I_i$) is obtained as
$$ I_i = \hat{S}_i I_{i,C_i} \hat{S}_i^T \tag{2} $$
which represents the parallel axis theorem for propagation of spatial inertia.

In our derivations, we also make use of global matrices and vectors which lead to a compact representation of various factorizations. For the sake of clarity, the global quantities are shown with upper-case script letters. A bidiagonal block matrix $\mathcal{P}$ is defined as
$$ \mathcal{P} = \begin{bmatrix} U & 0 & \cdots & 0 & 0 \\ -\hat{P}_{N-1} & U & \cdots & 0 & 0 \\ 0 & -\hat{P}_{N-2} & \ddots & & \vdots \\ \vdots & & \ddots & U & 0 \\ 0 & 0 & \cdots & -\hat{P}_1 & U \end{bmatrix} \in R^{6N\times6N} $$
$\mathcal{P}^{-1}$ is a lower triangular block matrix which, from the properties in Eq. (1), is given by
$$ \mathcal{P}^{-1} = \begin{bmatrix} U & 0 & \cdots & 0 \\ \hat{P}_{N,N-1} & U & \cdots & 0 \\ \hat{P}_{N,N-2} & \hat{P}_{N-1,N-2} & \ddots & \vdots \\ \vdots & \vdots & & 0 \\ \hat{P}_{N,1} & \hat{P}_{N-1,1} & \cdots & U \end{bmatrix} $$
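The properties of the spatial notation in this subsection can be checked numerically. The following is a minimal sketch, assuming NumPy and hypothetical helper names (`skew`, `hat`, `bidiagonal_P`) that are not part of the paper; it verifies the cross-product property of the tilde operator, the two properties in Eq. (1), and that the inverse of the bidiagonal block matrix is lower triangular with blocks built from sums of position vectors.

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix of v, so that skew(v) @ w equals np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def hat(p):
    """6x6 matrix [[U, skew(p)], [0, U]] associated with a 3-vector p."""
    m = np.eye(6)
    m[:3, 3:] = skew(p)
    return m

def bidiagonal_P(p_list):
    """Block bidiagonal matrix for links ordered N..1 (hypothetical helper).

    p_list holds the position vectors [P_{N-1}, ..., P_1]; the block below
    each diagonal unit block is -hat(P_i).
    """
    n = len(p_list) + 1
    P = np.eye(6 * n)
    for r, p in enumerate(p_list):
        P[6 * (r + 1):6 * (r + 2), 6 * r:6 * (r + 1)] = -hat(p)
    return P

rng = np.random.default_rng(0)
p_ab, p_bc, w = rng.standard_normal((3, 3))

# Tilde operator acts as a cross product.
cross_ok = np.allclose(skew(p_ab) @ w, np.cross(p_ab, w))
# Eq. (1): composition adds position vectors, inversion reverses them.
compose_ok = np.allclose(hat(p_ab) @ hat(p_bc), hat(p_ab + p_bc))
inverse_ok = np.allclose(np.linalg.inv(hat(p_ab)), hat(-p_ab))

# Three-link example: the inverse is lower triangular and its corner block
# is the hat matrix of the summed position vectors.
p2, p1 = rng.standard_normal((2, 3))
P_inv = np.linalg.inv(bidiagonal_P([p2, p1]))
corner_ok = np.allclose(P_inv[12:, :6], hat(p1 + p2))
upper_ok = np.allclose(P_inv[:6, 6:], 0.0)
print(cross_ok, compose_ok, inverse_ok, corner_ok, upper_ok)
```

Because the inverse is available in closed form, applying it to a vector amounts to a linear recurrence along the chain rather than a general matrix inversion.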
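B. An Operator Expression of Jacobian Matrix

Following the treatment in [4], a factorization of the Jacobian matrix by using our notation is derived as follows. The velocity propagation for a serial chain of interconnected rigid bodies is given by (Fig. 1)
$$ V_i - \hat{P}_{i-1}^T V_{i-1} = H_i \dot{Q}_i \tag{3} $$
which, by using the matrix $\mathcal{P}$, can be expressed in a global form as
$$ \mathcal{P}^T\mathcal{V} = \mathcal{H}\dot{\mathcal{Q}} \;\Rightarrow\; \mathcal{V} = (\mathcal{P}^T)^{-1}\mathcal{H}\dot{\mathcal{Q}} \tag{4} $$
The end-effector (EE) spatial velocity, $V_{N+1}$, is obtained by writing Eq. (3) for i = N+1 as
$$ V_{N+1} - \hat{P}_N^T V_N = 0 \;\Rightarrow\; V_{N+1} = \hat{P}_N^T V_N \tag{5} $$
Let us define a matrix $\beta = [\hat{P}_N^T \;\; 0 \;\; 0 \;\cdots\; 0] \in R^{6\times6N}$. From Eqs. (4)-(5), we get
$$ V_{N+1} = \beta\mathcal{V} = \beta(\mathcal{P}^T)^{-1}\mathcal{H}\dot{\mathcal{Q}} \tag{6} $$
The Jacobian matrix is defined by relating the EE spatial velocity and joint velocities as
$$ V_{N+1} = \mathcal{J}\dot{\mathcal{Q}} \tag{7} $$
From Eqs. (6)-(7) a factorization of the Jacobian matrix is then derived as
$$ \mathcal{J} = \beta(\mathcal{P}^T)^{-1}\mathcal{H} \tag{8} $$

C. Equations of Motion

The equations of motion for a single chain arm are given by
$$ \mathcal{M}\ddot{\mathcal{Q}} = \mathcal{T} - b(\mathcal{Q},\dot{\mathcal{Q}},F_{N+1}), \text{ or} \tag{9} $$
$$ \mathcal{M}\ddot{\mathcal{Q}} = \mathcal{F}_T \;\Rightarrow\; \ddot{\mathcal{Q}} = \mathcal{M}^{-1}\mathcal{F}_T \tag{10} $$
where $\mathcal{F}_T = \mathcal{T} - b(\mathcal{Q},\dot{\mathcal{Q}},F_{N+1})$. The vector $b(\mathcal{Q},\dot{\mathcal{Q}},F_{N+1})$ represents the contribution of nonlinear terms and the external spatial force ($F_{N+1}$), which can be computed by using the Newton-Euler (N-E) algorithm [12] while setting $\ddot{\mathcal{Q}}$ to zero. In Eq. (10), $\mathcal{F}_T \triangleq \mathrm{col}\{F_{Ti}\} \in R^{N\times1}$ represents the acceleration-dependent component of the control force.

In deriving the factorization of the mass matrix, it is assumed that the vector $b(\mathcal{Q},\dot{\mathcal{Q}},F_{N+1})$ and subsequently $\mathcal{F}_T$ are explicitly computed. Thus, the multibody system can be assumed as a system at rest which upon the application of the control force $\mathcal{F}_T$ accelerates in space. The propagation of accelerations and forces among the links of the serial chain are then given by
$$ \dot{V}_i - \hat{P}_{i-1}^T \dot{V}_{i-1} = H_i \ddot{Q}_i \tag{11} $$
$$ F_i - \hat{P}_i F_{i+1} = I_i \dot{V}_i \tag{12} $$

III. Schur Complement Factorization of $\Lambda^{-1}$ and $\Lambda$

A. The Interbody Force Decomposition Strategy

The iterative algorithms in [9,10] for forward dynamics solution of open-chain arms are based on a decomposition of the interbody force of the form:
$$ F_i = H_i F_{Ti} + W_i F_{Si} \tag{13} $$
where $F_{Si}$ is the constraint force and $W_i$ is the orthogonal complement of $H_i$ [13,14], that is,
$$ W_i^T H_i = 0 \quad \text{and} \quad H_i^T W_i = 0 \tag{14} $$
For a joint i with multiple DOFs, say $n_i < 6$ DOFs, $H_i \in R^{6\times n_i}$ and $W_i \in R^{6\times(6-n_i)}$. Insofar as the axes of the DOFs are orthogonal (which is the case considered in this paper) the matrix $H_i$ is a projection matrix [13] and hence
$$ H_i^T H_i = U \tag{15} $$
It then follows that the matrix $W_i$ is also a projection matrix [13], i.e.,
$$ W_i^T W_i = U \tag{16} $$
$$ H_i H_i^T + W_i W_i^T = U \tag{17} $$
For a more detailed discussion on these matrices see [13,14].

B. A Schur Complement Factorization of $\mathcal{M}^{-1}$

In [8,11], we have shown that the force decomposition in Eq. (13) leads to a new factorization of $\mathcal{M}^{-1}$ and subsequently a new O(N) algorithm for the forward dynamics of open-chain arms. We briefly review this factorization of $\mathcal{M}^{-1}$ since it is essential in deriving the factorization of $\Lambda^{-1}$ and $\Lambda$.

To begin, let us define the following global matrix and vector for i = N to 1:
$$ \mathcal{W} \triangleq \mathrm{diag}\{W_i\} \in R^{6N\times5N} \quad \text{and} \quad \mathcal{F}_S \triangleq \mathrm{col}\{F_{Si}\} \in R^{5N} $$
Equations (11)-(12) and (13)-(17) can now be written in global form as
$$ \mathcal{P}^T\dot{\mathcal{V}} = \mathcal{H}\ddot{\mathcal{Q}} \tag{18} $$
$$ \mathcal{P}\mathcal{F} = \mathcal{I}\dot{\mathcal{V}} \tag{19} $$
$$ \mathcal{F} = \mathcal{H}\mathcal{F}_T + \mathcal{W}\mathcal{F}_S \tag{20} $$
$$ \mathcal{W}^T\mathcal{H} = 0 \quad \text{and} \quad \mathcal{H}^T\mathcal{W} = 0 \tag{21} $$
$$ \mathcal{H}^T\mathcal{H} = U \quad \text{and} \quad \mathcal{W}^T\mathcal{W} = U \tag{22} $$
$$ \mathcal{H}\mathcal{H}^T + \mathcal{W}\mathcal{W}^T = U \tag{23} $$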
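The force decomposition and the projection properties in Eqs. (13)-(17) can be illustrated for a single joint. This is a minimal sketch, assuming NumPy, a hypothetical helper `joint_maps`, and the assumption (not stated explicitly in the text) that 6x1 spatial vectors stack the angular part first, so that a revolute joint about a unit axis has $H_i$ equal to the axis followed by three zeros. The orthogonal complement $W_i$ is built here with a QR factorization, which is only one of several possible constructions.

```python
import numpy as np

def joint_maps(axis):
    """Return H (6x1) and W (6x5) for a one-DOF revolute joint about `axis`.

    Assumes angular components are stacked first in a spatial vector.
    """
    axis = np.asarray(axis, dtype=float)
    H = np.concatenate([axis / np.linalg.norm(axis), np.zeros(3)]).reshape(6, 1)
    # Complete H to an orthonormal basis of R^6; the remaining five columns
    # span the orthogonal complement of H.
    Q, _ = np.linalg.qr(H, mode="complete")
    W = Q[:, 1:]
    return H, W

H, W = joint_maps([0.0, 0.0, 1.0])
checks = {
    "(14)": np.allclose(W.T @ H, 0.0) and np.allclose(H.T @ W, 0.0),
    "(15)": np.allclose(H.T @ H, 1.0),
    "(16)": np.allclose(W.T @ W, np.eye(5)),
    "(17)": np.allclose(H @ H.T + W @ W.T, np.eye(6)),
}

# Eq. (13): any interbody force splits into a component along the joint axis
# and a constraint component in the complement.
F = np.arange(1.0, 7.0).reshape(6, 1)
F_T = H.T @ F          # 1x1 component along the joint axis
F_S = W.T @ F          # 5x1 constraint component
split_ok = np.allclose(H @ F_T + W @ F_S, F)
print(checks, split_ok)
```

The same construction applied joint by joint gives the block diagonal global matrices, for which the global relations (21)-(23) hold blockwise.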
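From Eqs. (18), (19), and (21) it follows that
$$ \dot{\mathcal{V}} = \mathcal{I}^{-1}\mathcal{P}\mathcal{F} \tag{24} $$
$$ \mathcal{W}^T\mathcal{P}^T\dot{\mathcal{V}} = \mathcal{W}^T\mathcal{H}\ddot{\mathcal{Q}} = 0 \tag{25} $$
Replacing Eq. (24) into Eq. (25), we get
$$ \mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{F} = 0 \tag{26} $$
Substituting Eq. (20) into Eq. (26) yields
$$ \mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}(\mathcal{H}\mathcal{F}_T + \mathcal{W}\mathcal{F}_S) = 0, \text{ or } \mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W}\mathcal{F}_S = -\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H}\mathcal{F}_T \;\Rightarrow\; \mathcal{A}\mathcal{F}_S = -\mathcal{B}\mathcal{F}_T \tag{27} $$
where $\mathcal{A} \triangleq \mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W} \in R^{5N\times5N}$ and $\mathcal{B} \triangleq \mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H} \in R^{5N\times N}$ are block tridiagonal matrices. From Eqs. (27) and (20) it follows that
$$ \mathcal{F} = \left[\mathcal{H} - \mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H}\right]\mathcal{F}_T \tag{28} $$
and substituting Eq. (28) into Eq. (24) leads to
$$ \dot{\mathcal{V}} = \mathcal{I}^{-1}\mathcal{P}\left[\mathcal{H} - \mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H}\right]\mathcal{F}_T \tag{29} $$
By multiplying both sides of Eq. (18) by $\mathcal{H}^T$ and using Eq. (22), $\ddot{\mathcal{Q}}$ is computed as
$$ \mathcal{H}^T\mathcal{H}\ddot{\mathcal{Q}} = \mathcal{H}^T\mathcal{P}^T\dot{\mathcal{V}} \;\Rightarrow\; \ddot{\mathcal{Q}} = \mathcal{H}^T\mathcal{P}^T\dot{\mathcal{V}} \tag{30} $$
Finally, from Eqs. (29) and (30) it follows that
$$ \ddot{\mathcal{Q}} = \left[\mathcal{H}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H} - \mathcal{H}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H}\right]\mathcal{F}_T $$
In comparison with Eq. (10), an operator factorization of $\mathcal{M}^{-1}$, in terms of its decomposition into a set of simpler operators, is then given by
$$ \mathcal{M}^{-1} = \mathcal{H}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H} - \mathcal{H}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H} $$
Let $\mathcal{C} \triangleq \mathcal{H}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H} \in R^{N\times N}$. $\mathcal{M}^{-1}$ is now expressed as
$$ \mathcal{M}^{-1} = \mathcal{C} - \mathcal{B}^T\mathcal{A}^{-1}\mathcal{B} \tag{31} $$
$\mathcal{C}$ is a tridiagonal matrix. As shown in [15], $\mathcal{A}$ and $\mathcal{C}$ are symmetric and positive definite (SPD). This guarantees the existence of $\mathcal{A}^{-1}$.

The operator form of $\mathcal{M}^{-1}$ given by Eq. (31) represents an interesting mathematical construct. If a matrix $\mathcal{X}_1$ is defined as
$$ \mathcal{X}_1 \triangleq \begin{bmatrix} \mathcal{A} & \mathcal{B} \\ \mathcal{B}^T & \mathcal{C} \end{bmatrix} \in R^{6N\times6N} $$
then $\mathcal{C} - \mathcal{B}^T\mathcal{A}^{-1}\mathcal{B}$ is the Schur Complement of $\mathcal{A}$ in $\mathcal{X}_1$ [16]. The structure of matrix $\mathcal{X}_1$ not only provides a deeper physical insight into the computation but it also motivates a different and a much simpler approach for derivation of the factorization of $\mathcal{M}^{-1}$ and its associated O(N) algorithm (see [8,15]).

C. A Schur Complement Factorization of $\Lambda^{-1}$

The new factorization of $\mathcal{M}^{-1}$ directly results in a new factorization of the OSMM and its inverse. The matrices $\Lambda^{-1}$ and $\Lambda$ are defined as [1]
$$ \Lambda^{-1} = \mathcal{J}\mathcal{M}^{-1}\mathcal{J}^T \quad \text{and} \quad \Lambda = (\mathcal{J}\mathcal{M}^{-1}\mathcal{J}^T)^{-1} \in R^{6\times6} \tag{32} $$
Substituting the factorizations of $\mathcal{J}$, given by Eq. (8), and $\mathcal{M}^{-1}$, given by Eq. (31), into Eq. (32):
$$ \Lambda^{-1} = \beta(\mathcal{P}^T)^{-1}\mathcal{H}\left\{\mathcal{H}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H} - \mathcal{H}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{H}\right\}\mathcal{H}^T\mathcal{P}^{-1}\beta^T $$
which can be written as
$$ \Lambda^{-1} = \beta(\mathcal{P}^T)^{-1}(\mathcal{H}\mathcal{H}^T)\mathcal{P}^T\left\{\mathcal{I}^{-1} - \mathcal{I}^{-1}\mathcal{P}\mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\right\}\mathcal{P}(\mathcal{H}\mathcal{H}^T)\mathcal{P}^{-1}\beta^T \tag{33} $$
The key to simplification of this expression is the fact that, from Eq. (23), we have
$$ \mathcal{H}\mathcal{H}^T = U - \mathcal{W}\mathcal{W}^T \tag{34} $$
By replacing Eq. (34) into Eq. (33) and after some involved algebraic manipulations, a simple operator expression of $\Lambda^{-1}$ is derived as
$$ \Lambda^{-1} = \beta\mathcal{I}^{-1}\beta^T - \beta\mathcal{I}^{-1}\mathcal{P}\mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\beta^T \tag{35} $$
This expression can be further simplified since
$$ \mathcal{E}^T \triangleq \beta\mathcal{I}^{-1}\mathcal{P}\mathcal{W} = [\hat{P}_N^T I_N^{-1} W_N \;\; 0 \;\; 0 \;\cdots\; 0] \in R^{6\times5N} \tag{36} $$
$$ \mathcal{D} \triangleq \beta\mathcal{I}^{-1}\beta^T = \hat{P}_N^T I_N^{-1} \hat{P}_N \tag{37} $$
The parallel axis theorem in Eq. (2) can also be used for propagation of the inverse of spatial inertias. To this end, by using Eqs. (1)-(2), Eq. (37) can be rewritten as
$$ \mathcal{D} = \left((\hat{P}_N)^{-1} I_N (\hat{P}_N^T)^{-1}\right)^{-1} = \left(\hat{P}_{N,N+1} I_N \hat{P}_{N,N+1}^T\right)^{-1} = I_{N,N+1}^{-1} \tag{38} $$
that is, the matrix $\mathcal{D}$ is just the inverse of the spatial inertia of link N about point $O_{N+1}$.

This factorization of $\Lambda^{-1}$ can be written in the form of Schur Complement as
$$ \Lambda^{-1} = \mathcal{D} - \mathcal{E}^T\mathcal{A}^{-1}\mathcal{E} \tag{39} $$
Note that the matrix $\mathcal{A}$ is the same as in Eq. (31). Let us define a matrix $\mathcal{X}_2$:
$$ \mathcal{X}_2 \triangleq \begin{bmatrix} \mathcal{A} & \mathcal{E} \\ \mathcal{E}^T & \mathcal{D} \end{bmatrix} \in R^{(5N+6)\times(5N+6)} $$
$\Lambda^{-1}$ is then the Schur Complement of $\mathcal{A}$ in $\mathcal{X}_2$. Similar to $\mathcal{M}^{-1}$ (see [8]), the Schur Complement factorization of $\Lambda^{-1}$ and the structure of matrix $\mathcal{X}_2$ allow a simple physical interpretation of this factorization as well as a simpler and direct approach (without using the factorization of $\mathcal{M}^{-1}$) for its derivation [17].

However, it should be emphasized that the similarity in the factorizations of $\mathcal{M}^{-1}$ and $\Lambda^{-1}$ is not limited to their analytical form (i.e., the Schur Complement form) but it further extends to their physical interpretation. To see this, let us rewrite $\mathcal{M}^{-1}$ and $\Lambda^{-1}$ as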
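$$ \mathcal{M}^{-1} = \mathcal{H}^T\mathcal{P}^T\left(\mathcal{I}^{-1} - \mathcal{I}^{-1}\mathcal{P}\mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\right)\mathcal{P}\mathcal{H} $$
$$ \Lambda^{-1} = \beta\left(\mathcal{I}^{-1} - \mathcal{I}^{-1}\mathcal{P}\mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\right)\beta^T \tag{40} $$
Let us also define a matrix $\mathcal{K}$ as
$$ \mathcal{K} = \mathcal{I}^{-1} - \mathcal{I}^{-1}\mathcal{P}\mathcal{W}(\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\mathcal{P}\mathcal{W})^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1} $$
$\mathcal{M}^{-1}$ and $\Lambda^{-1}$ can now be expressed as
$$ \mathcal{M}^{-1} = \mathcal{H}^T\mathcal{P}^T\mathcal{K}\mathcal{P}\mathcal{H} \quad \text{and} \quad \Lambda^{-1} = \beta\mathcal{K}\beta^T $$
As shown in [17], the matrix $\mathcal{K}$ has a simple physical interpretation. The fact that $\mathcal{M}^{-1}$ and $\Lambda^{-1}$ can both be derived from $\mathcal{K}$ then allows a unified and alternate physical interpretation of the factorizations of $\mathcal{M}^{-1}$ and $\Lambda^{-1}$ based on the physical interpretation of matrix $\mathcal{K}$.

From a computational perspective, the advantage of this structural similarity resides in the improved efficiency in both serial and parallel computation. For the cases (such as the forward dynamics of closed-chain systems) wherein the computation of both $\mathcal{M}^{-1}$ and $\Lambda^{-1}$ is needed, this structural similarity can be exploited to increase the computational efficiency.

D. A Schur Complement Factorization of $\Lambda$

Once $\Lambda^{-1}$ is computed and assuming that its inverse exists (i.e., $\Lambda^{-1}$ is nonsingular), $\Lambda$ can then be obtained by performing a 6x6 matrix inversion. However, this corresponds to a numerical evaluation of $\Lambda$. Interestingly, it is possible to derive a factorization of $\Lambda$ which allows its direct computation without any need for computing $\Lambda^{-1}$. It also provides a deeper physical insight into the structure as well as a simple physical interpretation of matrix $\Lambda$.

The factorization of $\Lambda$ is derived by using the matrix identity [18]
$$ (E - XDY)^{-1} = E^{-1} + E^{-1}X(D^{-1} - YE^{-1}X)^{-1}YE^{-1} $$
for inverting the matrix $\Lambda^{-1}$ given by Eq. (39) as
$$ \Lambda = (\mathcal{D} - \mathcal{E}^T\mathcal{A}^{-1}\mathcal{E})^{-1} = \mathcal{D}^{-1} - \mathcal{D}^{-1}\mathcal{E}^T(\mathcal{E}\mathcal{D}^{-1}\mathcal{E}^T - \mathcal{A})^{-1}\mathcal{E}\mathcal{D}^{-1} $$
$$ = (\beta\mathcal{I}^{-1}\beta^T)^{-1} - (\beta\mathcal{I}^{-1}\beta^T)^{-1}\beta\mathcal{I}^{-1}\mathcal{P}\mathcal{W}\left\{\mathcal{W}^T\mathcal{P}^T\left(\mathcal{I}^{-1}\beta^T(\beta\mathcal{I}^{-1}\beta^T)^{-1}\beta\mathcal{I}^{-1} - \mathcal{I}^{-1}\right)\mathcal{P}\mathcal{W}\right\}^{-1}\mathcal{W}^T\mathcal{P}^T\mathcal{I}^{-1}\beta^T(\beta\mathcal{I}^{-1}\beta^T)^{-1} $$
This inversion, in addition to the nonsingularity of $\Lambda^{-1}$, requires that the matrix $\mathcal{E}\mathcal{D}^{-1}\mathcal{E}^T - \mathcal{A}$ be nonsingular (note that $\mathcal{D}$ is positive definite and hence $\mathcal{D}^{-1}$ exists). It should be mentioned that there are other possible forms of the inverse of $\Lambda^{-1}$ which only require the nonsingularity of $\Lambda^{-1}$ [18]. These forms and their computations are extensively discussed in [17]. The above expression of $\Lambda$ can be further simplified by noting that
$$ \mathcal{S} \triangleq \mathcal{D}^{-1} = (I_{N,N+1}^{-1})^{-1} = I_{N,N+1} \tag{41} $$
$$ \mathcal{R}^T \triangleq (\beta\mathcal{I}^{-1}\beta^T)^{-1}\beta\mathcal{I}^{-1}\mathcal{P}\mathcal{W} = [(\hat{P}_N)^{-1} I_N (\hat{P}_N^T)^{-1}\hat{P}_N^T I_N^{-1} W_N \;\; 0 \;\cdots\; 0] = [\hat{P}_{N,N+1} W_N \;\; 0 \;\; 0 \;\cdots\; 0] \in R^{6\times5N} \tag{42} $$
$$ \mathcal{I}'^{-1} \triangleq \mathcal{I}^{-1}\beta^T(\beta\mathcal{I}^{-1}\beta^T)^{-1}\beta\mathcal{I}^{-1} - \mathcal{I}^{-1} = \mathrm{Diag}\{I_i'^{-1}\} $$
with $I_N'^{-1} = 0$ and $I_i'^{-1} = -I_i^{-1}$, i = N-1 to 1.

Let $\mathcal{A}' \triangleq \mathcal{W}^T\mathcal{P}^T\mathcal{I}'^{-1}\mathcal{P}\mathcal{W}$, where $\mathcal{A}'$ is a symmetric block tridiagonal matrix. $\mathcal{A}'$ is a rank one modification of matrix $\mathcal{A}$. In fact, apart from the sign carried by $I_i'^{-1} = -I_i^{-1}$, $\mathcal{A}'$ differs from $\mathcal{A}$ only in the leading element. The factorization of $\Lambda$ is then written in terms of Schur Complement as
$$ \Lambda = \mathcal{S} - \mathcal{R}^T\mathcal{A}'^{-1}\mathcal{R} \tag{43} $$
If a matrix $\mathcal{X}_3$ is defined as
$$ \mathcal{X}_3 \triangleq \begin{bmatrix} \mathcal{A}' & \mathcal{R} \\ \mathcal{R}^T & \mathcal{S} \end{bmatrix} \in R^{(5N+6)\times(5N+6)} $$
then $\Lambda$ is the Schur Complement of $\mathcal{A}'$ in $\mathcal{X}_3$. Again, the structure of matrix $\mathcal{X}_3$ allows a simple physical interpretation and an alternate direct approach for derivation of the Schur Complement factorization of $\Lambda$ [17].

IV. Serial and Parallel Computation of $\Lambda^{-1}$ and $\Lambda$

A. O(N) Serial Computation of $\Lambda^{-1}$ and $\Lambda$

The main kernels in computation of $\Lambda^{-1}$ and $\Lambda$ are the explicit computation and inversion of matrices $\mathcal{A}$ and $\mathcal{A}'$. The matrix $\mathcal{A}$ and its elements are given as
$$ \mathcal{A} = \mathrm{Tridiag}[B_i, A_i, B_{i-1}^T] $$
$$ A_i = W_i^T\left(I_i^{-1} + \hat{P}_{i-1}^T I_{i-1}^{-1}\hat{P}_{i-1}\right)W_i, \quad i = N \text{ to } 1 \tag{44} $$
$$ B_i = -W_i^T I_i^{-1}\hat{P}_i W_{i+1}, \quad i = N-1 \text{ to } 1 \tag{45} $$
As stated before, the matrix $\mathcal{A}'$ differs from $\mathcal{A}$ only in the leading element, i.e., $A_N'$, which is given as $A_N' = W_N^T\hat{P}_{N-1}^T I_{N-1}'^{-1}\hat{P}_{N-1}W_N$. From Eqs. (44)-(45) the elements of matrix $\mathcal{A}$ (and hence $\mathcal{A}'$) can be computed in O(N) steps. Efficient computation of matrix $\mathcal{A}$ by using an optimal frame for projection of Eqs. (44)-(45) is extensively discussed in [8,11,15].

The explicit computation of $\Lambda^{-1}$ from Eq. (39) can be performed in O(N) steps as follows. The computation of $\mathcal{A}^{-1}\mathcal{E}$ corresponds to the solution of the system
$$ \mathcal{A}\eta = \mathcal{E} \tag{46} $$
for $\eta$. This represents the solution of an SPD block tridiagonal system for six right-hand side vectors which, by using the block $LDL^T$ algorithm [19], can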
be obtained in O(N) steps. Exploiting the sparse structure of $\mathcal{E}^T$, the computation of $\mathcal{E}^T\eta$ can be reduced to
$$ \mathcal{E}^T\eta = \mathcal{E}_N^T\eta_N \tag{47} $$
where $\mathcal{E}_N^T \in R^{6\times5}$ and $\eta_N \in R^{5\times6}$ are the Nth elements of $\mathcal{E}^T$ and $\eta$. The multiplication in Eq. (47) can be performed with a cost of O(1). $\Lambda^{-1}$ can then be obtained by adding two 6x6 matrices with a cost of O(1), leading to an O(N) complexity for the overall computation.

The computation of $\Lambda$ from Eq. (43) can also be performed in O(N) steps in a fashion similar to that of $\Lambda^{-1}$. Note, however, that usually the operator applications of $\Lambda^{-1}$ and $\Lambda$, i.e., multiplication of $\Lambda^{-1}$ by a vector (say $\dot{V}_{N+1}$) and multiplication of $\Lambda$ by a vector (say $F_{N+1}$), rather than their explicit computations are required. In this case, it is significantly more efficient to directly compute $\Lambda^{-1}\dot{V}_{N+1}$ by first computing $\mathcal{E}\dot{V}_{N+1}$ (which involves a simple matrix-vector multiplication with a cost of O(1)) and then solving Eq. (46). The greater computational efficiency results from the fact that in this case the solution of Eq. (46) for only one right-hand side vector is needed.

B. O(Log N) Parallel Computation of $\Lambda^{-1}$ and $\Lambda$

As can be seen, the computation of the elements of matrix $\mathcal{A}$ (and hence $\mathcal{A}'$) is fully decoupled for i = N to 1. Thus, by using O(N) processors, this computation as well as the required projections can be performed in O(1) while involving only nearest neighbor communication among processors.

The block $LDL^T$ algorithm, while highly efficient for serial solution of block tridiagonal systems, seems to be strictly sequential and, in fact, there is no report on its parallelization. However, the Block Cyclic Reduction (BCR) algorithm [20], while less competitive for serial computation, can be efficiently parallelized. By using the BCR algorithm, the system in Eq. (46) can be solved in O(Log N) steps with O(N) processors. The computation of Eq. (47) and the final matrix addition for computation of $\Lambda^{-1}$ can each be performed in O(1) with one processor, i.e., in a serial fashion. This results in a complexity of O(Log N) + O(1) for parallel computation of $\Lambda^{-1}$ with O(N) processors, which indicates a both time- and processor-optimal parallel algorithm. The parallel computation of $\Lambda$ as well as the operator applications of both $\Lambda^{-1}$ and $\Lambda$ can also be performed in a similar fashion with a complexity of O(Log N) + O(1) with O(N) processors.

It should be emphasized that efficient parallel solution of block tridiagonal systems is the key to efficient parallel computation of the Schur Complement factorizations of $\mathcal{M}^{-1}$, $\Lambda^{-1}$, and $\Lambda$. Motivated by this fact, we have developed a more efficient variant of the BCR algorithm [21,22] which is particularly suitable for implementation on coarse grain MIMD parallel architectures since it significantly reduces the communication overhead by providing a high degree of overlapping between communication and computation. We have implemented the parallel O(Log N) algorithm for computation of the forward dynamics of a serial chain by using the Schur Complement factorization of $\mathcal{M}^{-1}$ on a Hypercube architecture [22]. Our results clearly validate the efficiency of this variant of the BCR algorithm as well as the Schur Complement factorization of $\mathcal{M}^{-1}$ for practical implementation on coarse grain MIMD architectures.

V. Discussion and Conclusion

We presented a new factorization technique for computation of $\Lambda^{-1}$ and $\Lambda$. This technique results in Schur Complement factorizations of both $\Lambda^{-1}$ and $\Lambda$ and subsequently in new O(N) algorithms for their computation. These O(N) algorithms are highly efficient for parallel computation. To our knowledge, they represent the first algorithms that can be fully parallelized, resulting in both time- and processor-optimal parallel algorithms.

The manifestation of the Schur Complement in the factorization of $\mathcal{M}^{-1}$, $\Lambda^{-1}$, and $\Lambda$ provides a unified framework not only for their computations but also for their physical interpretations. Such a physical interpretation for $\mathcal{M}^{-1}$ is discussed in [8,15]. Here, due to the lack of space, we did not discuss the physical interpretation for $\Lambda^{-1}$ and $\Lambda$. This and the practical implementation of parallel algorithms for computation of $\Lambda^{-1}$ and $\Lambda$ will be discussed in a forthcoming report.
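The computational pattern of Section IV (solve a tridiagonal system by cyclic reduction, then assemble a Schur complement from the leading block only) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation reported in [21,22]: it uses scalar entries in place of the 5x5 blocks, a system size of the form 2^k - 1, made-up test data, and a recursive serial function `cyclic_reduction` (a hypothetical name) whose per-level array operations are the parts that would be distributed over O(N) processors.

```python
import numpy as np

def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by cyclic reduction.

    a, b, c: sub-, main and super-diagonal (length n, with a[0] = c[-1] = 0).
    d: right-hand sides, shape (n, m). Requires n = 2**k - 1.
    """
    n = len(b)
    assert (n + 1) & n == 0, "n must be of the form 2**k - 1"
    if n == 1:
        return d / b[0]
    i = np.arange(1, n, 2)                      # unknowns kept at this level
    alpha, gamma = a[i] / b[i - 1], c[i] / b[i + 1]
    # Every reduced row uses only its two neighbours, so all rows of one
    # level are independent of each other.
    b2 = b[i] - alpha * c[i - 1] - gamma * a[i + 1]
    a2, c2 = -alpha * a[i - 1], -gamma * c[i + 1]
    d2 = d[i] - alpha[:, None] * d[i - 1] - gamma[:, None] * d[i + 1]
    x = np.zeros_like(d)
    x[i] = cyclic_reduction(a2, b2, c2, d2)     # about log2(n) levels in total
    # Back-substitution for the eliminated unknowns, again independent per row.
    e = np.arange(0, n, 2)
    pad = np.zeros((1, d.shape[1]))
    xp = np.vstack([pad, x, pad])               # xp[j + 1] == x[j]
    x[e] = (d[e] - a[e][:, None] * xp[e] - c[e][:, None] * xp[e + 2]) / b[e][:, None]
    return x

# Made-up SPD tridiagonal test system standing in for the block matrix in Eq. (46).
n, m = 7, 2
rng = np.random.default_rng(1)
a = np.r_[0.0, -np.ones(n - 1)]
c = np.r_[-np.ones(n - 1), 0.0]
b = 4.0 * np.ones(n)
E = np.zeros((n, m))
E[0] = rng.standard_normal(m)                   # only the leading row is nonzero
D = 3.0 * np.eye(m)

eta = cyclic_reduction(a, b, c, E)              # solve for eta, cf. Eq. (46)
schur = D - E[0:1].T @ eta[0:1]                 # leading block only, cf. Eq. (47)

# Dense reference for comparison.
A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
reference = D - E.T @ np.linalg.solve(A, E)
print(np.allclose(schur, reference))
```

In a block version the divisions become small dense solves, and the number of reduction levels still grows only logarithmically with the number of unknowns.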
Acknowledgments
The research in this paper was performed at the
Jet Propulsion Laboratory, California Institute
of Technology, under contract with the National
Aeronautics and Space Administration (NASA).
REFERENCES

1. O. Khatib, "A Unified Approach for Motion and Force Control of Robot Manipulators: The Operational Space Formulation," IEEE J. Robotics & Automation, Vol. RA-3(1), 1987.
2. K.W. Lilly and D.E. Orin, "Efficient O(N) Computation of the Operational Space Inertia Matrix," Proc. IEEE Int. Conf. Robotics & Automation, pp. 1014-1019, May 1990.
3. G. Rodriguez, A. Jain, and K. Kreutz-Delgado, "A Spatial Operator Algebra for Manipulator Modeling and Control," Int. J. Robotics Res., Vol. 10(4), pp. 371-381, Aug. 1991.
4. K. Kreutz-Delgado, A. Jain, and G. Rodriguez, "Recursive Formulation of Operational Space Control," Int. J. Robotics Res., Vol. 11(4), pp. 320-328, Aug. 1992.
5. R. Featherstone, "The Calculation of Robot Dynamics Using Articulated-Body Inertia," Int. J. Robotics Res., Vol. 2(1), pp. 13-30, 1983.
6. G. Rodriguez and K. Kreutz-Delgado, "Spatial Operator Factorization and Inversion of the Manipulator Mass Matrix," IEEE Trans. Robotics & Automation, Vol. 8(1), pp. 65-76, Feb. 1992.
7. A. Fijany and A.K. Bejczy, "Techniques for Parallel Computation of Mechanical Manipulator Dynamics. Part II: Forward Dynamics," in Advances in Control and Dynamic Systems, Vol. 40: Advances in Robotic Systems Dynamics and Control, C.T. Leondes (Ed.), pp. 357-410, Academic Press, March 1991.
8. A. Fijany, "Parallel O(Log N) Algorithms for Open- and Closed-Chain Rigid Multibody Systems based on a new Mass Matrix Factorization Technique," Proc. 5th NASA Workshop on Aerospace Computational Control, pp. 243-266, Santa Barbara, Aug. 1993.
9. I. Sharf, Parallel Simulation Dynamics for Open Multibody Chains, Ph.D. Diss., Univ. of Toronto, Canada, Nov. 1990.
10. I. Sharf and G.M.T. D'Eleuterio, "Parallel Simulation Dynamics for Rigid Multibody Chains," Proc. 12th Biennial ASME Conf. on Mechanical Vibration and Noise, Sept. 1989.
11. A. Fijany, I. Sharf, and G.M.T. D'Eleuterio, "Parallel O(Log N) Algorithms for Computation of Manipulator Forward Dynamics," Submitted to IEEE Trans. Robotics & Automation.
12. J.Y.S. Luh, M.W. Walker, and R.P.C. Paul, "On-Line Computational Scheme for Mechanical Manipulator," ASME J. Dynamic Syst., Meas., Control, Vol. 102, pp. 69-76, June 1980.
13. P.C. Hughes and G.B. Sincarsin, "Dynamics of an Elastic Multibody Chain. Part B: Global Dynamics," Int. J. Dynamics & Stability of Systems, Vol. 4(3&4), pp. 227-244, 1989.
14. C.J. Damaren and G.M.T. D'Eleuterio, "On the Relationship between Discrete-Time Optimal Control and Recursive Dynamics for Elastic Multibody Chains," Contemporary Mathematics, Vol. 97, pp. 61-77, 1989.
15. A. Fijany, "Parallel O(log N) Algorithms for Rigid Multibody Dynamics," JPL Engineering Memorandum, EM 343-92-1258, Aug. 1992.
16. R.W. Cottle, "Manifestations of the Schur Complement," Lin. Algebra and its Applications, Vol. 8, pp. 189-211, 1974.
17. A. Fijany, "New Factorizations with Simple Physical Interpretations for Computation of Robot Dynamics," In preparation.
18. H.V. Henderson and S.R. Searle, "On Deriving the Inverse of a Sum of Matrices," SIAM Rev., Vol. 23(1), pp. 53-60, Jan. 1981.
19. G.H. Golub and C.F. Van Loan, Matrix Computations, 2nd Edition, The Johns Hopkins Univ. Press, 1989.
20. R.W. Hockney and C.R. Jesshope, Parallel Computers, Adam Hilger Ltd, 1981.
21. A. Fijany and N. Bagherzadeh, "Communication Efficient Cyclic Reduction Algorithms for Parallel Solution of Block Tridiagonal Systems," Submitted to Inf. Processing Lett.
22. A. Fijany, G. Kwan, and N. Bagherzadeh, "A Fast Algorithm for Parallel Computation of Multibody Dynamics on MIMD Parallel Architectures," To be presented at Computing in Aerospace 9 Conf., San Diego, CA, Oct. 1993.