A cyclic block-tridiagonal solver
M. Batista*
Faculty of Maritime Studies and Transportation, University of Ljubljana, Pot pomorscakov 4, SI-6320 Portoroz, Slovenia
Received 8 September 2004; received in revised form 27 April 2005; accepted 27 April 2005. Available online 11 July 2005

Abstract
A simple algorithm for solving a cyclic block-tridiagonal system of equations is presented. By introducing a new variable of a special form, the system is split into two block-tridiagonal systems, which can be solved by known methods. Implementation details of the algorithm are discussed, and numerical examples with diagonal and randomly generated systems are presented.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Cyclic tridiagonal systems; Block-tridiagonal systems
1. Introduction

It is well known that many numerical schemes for solving partial differential equations lead to tridiagonal systems (TS) or block-tridiagonal systems (BTS) of linear algebraic equations. Periodic boundary conditions break the tridiagonal structure of the system by adding extra elements in the corners of the system matrix, which results in a cyclic tridiagonal system (CTS) or a cyclic block-tridiagonal system (CBTS) of linear algebraic equations. These systems can be classified as sparse linear systems [1]. A relatively large number of good general or special purpose programs exist for solving these systems via direct or iterative methods [2–4]. In particular, TS can be solved very efficiently by the so-called Thomas algorithm or Tridiagonal Matrix algorithm [1,5–9], or in the case of BTS by the block-elimination method [5–7]. (Recently, the development of the Thomas algorithm and its extensions was presented by Bieniasz [10].) Press et al. [1] published a simple algorithm, implemented as the subroutine cyclic, for solving CTS based on the Sherman–Morrison formula; however, apparently no similar algorithms exist for CBTS. The aim of this paper is to propose a method that extends the algorithm of cyclic for solving CTS [1], so that it is applicable to CBTS. The procedure, a detailed description of the implementation, and numerical examples will be presented.
2. Solving procedure

Consider a cyclic block-tridiagonal system of linear algebraic equations

$$
\begin{aligned}
B_1 x_1 + C_1 x_2 + A_1 x_n &= f_1 \\
A_k x_{k-1} + B_k x_k + C_k x_{k+1} &= f_k \qquad (k = 2, \ldots, n-1) \\
C_n x_1 + A_n x_{n-1} + B_n x_n &= f_n
\end{aligned}
\tag{1}
$$

where $A_k$, $B_k$, $C_k$ ($k = 1, \ldots, n$) are $m \times m$ matrices, and $f_k$ and $x_k$ are $n$ known vectors and $n$ unknown vectors of size $m$, respectively. In what follows all other matrices also have size $m \times m$ and all other vectors have size $m$. The unit matrix is denoted as $I$.

Motivated by the procedure for solving CTS [1], a new unknown is introduced

$$
u = \alpha A_1 x_n + \gamma C_n x_1
\tag{2}
$$

where $\alpha$ and $\gamma$ are two parameters different from zero. Substituting (2) into the first and last equations in (1) yields the tridiagonal form

$$
\begin{aligned}
B'_1 x_1 + C_1 x_2 &= f_1 - u/\alpha \\
A_k x_{k-1} + B_k x_k + C_k x_{k+1} &= f_k \qquad (k = 2, \ldots, n-1) \\
A_n x_{n-1} + B'_n x_n &= f_n - u/\gamma
\end{aligned}
\tag{3}
$$

Advances in Engineering Software 37 (2006) 69–74. www.elsevier.com/locate/advengsoft
0965-9978/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.advengsoft.2005.04.004
* Tel.: +386 5 6767219; fax: +386 5 6767130. E-mail address: milan.batista@fpp.edu.

where

$$
B'_1 \equiv B_1 - \frac{\gamma}{\alpha} C_n, \qquad B'_n \equiv B_n - \frac{\alpha}{\gamma} A_1
\tag{4}
$$

The expressions in (4) make the role of the parameters $\alpha$ and $\gamma$ clear: their values must be chosen in such a way that singularity of $B'_1$ and $B'_n$ is avoided. The form of system (3) suggests that the solution can be sought in the form

$$
x_k = y_k - Z_k u \qquad (k = 1, \ldots, n)
\tag{5}
$$

where $y_k$ are new unknown vectors and $Z_k$ new unknown matrices. By substituting (5) into (3), two sets of BTS

$$
\begin{aligned}
B'_1 y_1 + C_1 y_2 &= f_1 \\
A_k y_{k-1} + B_k y_k + C_k y_{k+1} &= f_k \qquad (k = 2, \ldots, n-1) \\
A_n y_{n-1} + B'_n y_n &= f_n
\end{aligned}
\tag{6}
$$

and

$$
\begin{aligned}
B'_1 Z_1 + C_1 Z_2 &= I/\alpha \\
A_k Z_{k-1} + B_k Z_k + C_k Z_{k+1} &= 0 \qquad (k = 2, \ldots, n-1) \\
A_n Z_{n-1} + B'_n Z_n &= I/\gamma
\end{aligned}
\tag{7}
$$

with equal system matrices are obtained.

Both sets can be solved by the block elimination method [5–7]. This method requires the computation of matrices $E_k$, $F_k$, $G_k$ and vectors $g_k$ via the following steps:
• factorization

$$
E_1 = B_1'^{-1}, \qquad F_1 = E_1 C_1
\tag{8}
$$

$$
E_k = (B_k - A_k F_{k-1})^{-1} \qquad (k = 2, \ldots, n)
\tag{9}
$$

$$
F_k = E_k C_k \qquad (k = 2, \ldots, n-1)
\tag{10}
$$

For $k = n$ the matrix $B_n$ in (9) is to be read as $B'_n$ from (4).

• intermediate solution

$$
g_1 = E_1 f_1, \qquad g_k = E_k (f_k - A_k g_{k-1}) \qquad (k = 2, \ldots, n)
\tag{11}
$$

$$
G_1 = E_1/\alpha, \qquad G_k = -E_k A_k G_{k-1} \quad (k = 2, \ldots, n-1), \qquad G_n = E_n (I/\gamma - A_n G_{n-1})
\tag{12}
$$

• final solution

$$
y_n = g_n, \qquad y_k = g_k - F_k y_{k+1} \qquad (k = n-1, \ldots, 1)
\tag{13}
$$

$$
Z_n = G_n, \qquad Z_k = G_k - F_k Z_{k+1} \qquad (k = n-1, n-2, \ldots, 1)
\tag{14}
$$

Once (6) and (7) are solved, $u$ can be calculated by substituting (5) into (2) for $k = 1$ and $n$. After some algebraic manipulations the following expression is obtained

$$
u = (I + \alpha A_1 Z_n + \gamma C_n Z_1)^{-1} (\alpha A_1 y_n + \gamma C_n y_1)
\tag{15}
$$

In this way, (1) is solved by (5).

The present algorithm requires $n+1$ matrix inversions, $5n-3$ products of two square matrices and $4n+1$ products of a square matrix with a vector. The inverse of an $m \times m$ matrix via Gaussian elimination requires $m^3$ operations, the product of two $m \times m$ matrices also $m^3$ operations, while the product of a vector of order $m$ with a matrix of order $m \times m$ requires $m^2$ operations [6]. Compared to the solving of BTS, which requires $(3n-2)(m^3+m^2)$ operations [6], the total number of operations for solving CBTS is $(6n-3)m^3 + (4n+1)m^2$. Thus, solving CBTS with the presented algorithm requires approximately twice as many operations as solving BTS. The comparison is presented in Table 1, where the estimated total memory space needed to store data and work arrays is also given. The memory space is estimated by assuming that data arrays are not overwritten by the solution process, so CBTS requires approximately 20% more space.

Obviously, the proposed algorithm works if all matrices that have to be inverted in the solving process of systems (6), (7) and (15) are non-singular. It is known that the factorization step can be carried out if the submatrices $B'_1$, $B'_n$, $B_k$ ($k = 2, \ldots, n-1$) on the diagonal of the system are non-singular [6,7]. However, this is not sufficient because of (15), where calculation of $u$ requires that $I + \alpha A_1 Z_n + \gamma C_n Z_1$ is non-singular. Also, since the kernel of the present algorithm is the block elimination method, all problems associated with that method are inherited by the present algorithm. In particular, the non-singularity of the matrices which are inverted in the factorization step of the algorithm does not guarantee the stability of the overall process [7].

Before proceeding with the implementation details some remarks regarding choosing values for the parameters $\alpha$ and $\gamma$ will be made. Their values are set in advance, so the non-singularity of $B'_1$, $B'_n$ can also be tested in advance; but this is not the case with $I + \alpha A_1 Z_n + \gamma C_n Z_1$ because $Z_1$ and $Z_n$ are results of the solution process. A bad choice of parameters can make solvable systems unsolvable, but if the values are appropriate they cannot affect the final solution, provided the arithmetic is performed exactly.

To further illustrate the role of $\alpha$ and $\gamma$ the following system will be considered

$$
\begin{aligned}
\beta x_1 + x_2 + x_n &= c \\
x_{k-1} + \beta x_k + x_{k+1} &= c \qquad (k = 2, \ldots, n-1) \\
x_1 + x_{n-1} + \beta x_n &= c
\end{aligned}
\tag{16}
$$

Table 1. Estimated flops and memory space

Algorithm   Flops               Memory space
BTS         3nm^3 + O(nm^2)     5nm^2 + 2nm
CBTS        6nm^3 + O(nm^2)     6nm^2 + 2nm

where $c$ is a constant vector independent of $k$. For $\beta \neq -2$, the system (16) has the solution

$$
x_k = \frac{c}{2 + \beta} \qquad (k = 1, \ldots, n)
\tag{17}
$$

The approach taken to solve (16) with the algorithm is to substitute $f_k = c$, $y_k = y_k c$ (with a scalar $y_k$ on the right-hand side) and $Z_k = z_k I$ into (6) and (7). In this way, (16) splits into two scalar TS. In particular, (7) yields

$$
\begin{aligned}
\left(\beta - \frac{\gamma}{\alpha}\right) z_1 + z_2 &= 1/\alpha \\
z_{k-1} + \beta z_k + z_{k+1} &= 0 \qquad (k = 2, \ldots, n-1) \\
z_{n-1} + \left(\beta - \frac{\alpha}{\gamma}\right) z_n &= 1/\gamma
\end{aligned}
\tag{18}
$$

Similarly, (15) reduces to

$$
u = \frac{\alpha y_n + \gamma y_1}{\Delta}\, c, \qquad \Delta = 1 + \alpha z_n + \gamma z_1 \neq 0
\tag{19}
$$

System (18) can be solved analytically [11] in the form $z_k = \xi^k$, where $\xi$ must be suitably determined, but we omit the details. The essential results concerning the selection of $\alpha$ and $\gamma$ are:
• For $|\beta| \neq 2$ system (18) is solved if

$$
\frac{\alpha}{\gamma} \neq \frac{\beta \pm \sqrt{\beta^2 - 4}}{2}
\tag{20}
$$

In this case also $\Delta \neq 0$, so the algorithm solves (16).

• For $\beta = 2$ the solution of (18) is

$$
z_k = -\frac{(-1)^k}{\alpha - \gamma} - \frac{(-1)^k \left[(-1)^n - 1\right] \left[k(\alpha - \gamma) + \gamma\right]}{n (\alpha - \gamma)^2} \qquad (k = 1, \ldots, n)
\tag{21}
$$

This solution exists if

$$
\frac{\alpha}{\gamma} \neq 1
\tag{22}
$$

In this case $\Delta = 2\alpha\gamma\left[(-1)^n - 1\right] / \left[n(\alpha - \gamma)^2\right]$, so for odd $n$ $\Delta \neq 0$ and for even $n$ $\Delta \equiv 0$. Consequently, the present algorithm fails for even $n$ even though the solution exists.

• For $\beta = -2$ the solution of (18) is

$$
z_k = -\frac{1}{\alpha + \gamma} \qquad (k = 1, \ldots, n)
\tag{23}
$$

This solution exists if

$$
\frac{\alpha}{\gamma} \neq -1
\tag{24}
$$

But for this case $\Delta \equiv 0$ for any $n$, so the algorithm fails.

• For $\beta = 0$ and for $\beta = \alpha/\gamma$ or $\beta = \gamma/\alpha$, (18) has a solution, but this solution cannot be obtained by the present algorithm since the inversion of the diagonal elements of (18) fails.

This example illustrates that there is no general rule for selecting $\alpha$ and $\gamma$. One must keep in mind that there are CBTS that have solutions which cannot be obtained by the present algorithm regardless of the selected $\alpha$ and $\gamma$.
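
The behaviour described for system (16) can be checked with a small experiment. The following is a minimal Python sketch, not the author's Fortran90 code, of steps (8)–(15) specialised to scalar blocks ($m = 1$). It uses exact rational arithmetic (`fractions.Fraction`) so that the case $\Delta = 0$ is detected exactly rather than through a pivot tolerance. The names `solve_cyclic_scalar` and `system16` are illustrative assumptions, and the choice $\alpha = 3$, $\gamma = 1$ is arbitrary apart from satisfying (22).

```python
from fractions import Fraction


def solve_cyclic_scalar(a, b, c, f, alpha=1, gamma=1):
    """Scalar (m = 1) version of steps (8)-(15), in exact rational arithmetic.

    Row k (0-based) reads a[k]*x[k-1] + b[k]*x[k] + c[k]*x[k+1] = f[k],
    with cyclic wrap-around: a[0] multiplies x[n-1], c[n-1] multiplies x[0].
    """
    n = len(b)
    if n < 3:
        raise ValueError("need at least three equations")
    alpha, gamma = Fraction(alpha), Fraction(gamma)
    b = [Fraction(v) for v in b]
    b[0] -= gamma / alpha * c[n - 1]      # B'_1 of Eq. (4)
    b[n - 1] -= alpha / gamma * a[0]      # B'_n of Eq. (4)

    F, g, G = [0] * n, [0] * n, [0] * n
    for k in range(n):                    # factorization and intermediate solution
        pivot = b[k] - (a[k] * F[k - 1] if k else 0)
        if pivot == 0:
            raise ZeroDivisionError(f"zero pivot at row {k + 1}")
        E = 1 / pivot                                       # Eqs. (8), (9)
        F[k] = E * c[k]                                     # Eq. (10); F[n-1] is unused
        g[k] = E * (f[k] - (a[k] * g[k - 1] if k else 0))   # Eq. (11)
        if k == 0:                                          # Eq. (12)
            G[k] = E / alpha
        elif k < n - 1:
            G[k] = -E * a[k] * G[k - 1]
        else:
            G[k] = E * (1 / gamma - a[k] * G[k - 1])

    y, z = g[:], G[:]
    for k in range(n - 2, -1, -1):        # final solution, Eqs. (13), (14)
        y[k] = g[k] - F[k] * y[k + 1]
        z[k] = G[k] - F[k] * z[k + 1]

    delta = 1 + alpha * a[0] * z[n - 1] + gamma * c[n - 1] * z[0]
    if delta == 0:
        raise ZeroDivisionError("Delta = 0: u cannot be computed from Eq. (15)")
    u = (alpha * a[0] * y[n - 1] + gamma * c[n - 1] * y[0]) / delta
    return [y[k] - z[k] * u for k in range(n)]              # Eq. (5)


def system16(n, beta, c=1, alpha=1, gamma=1):
    """System (16) with scalar blocks: unit off-diagonals, beta on the diagonal."""
    ones = [1] * n
    return solve_cyclic_scalar(ones, [beta] * n, ones, [c] * n, alpha, gamma)


print(system16(6, 4))                      # every entry equals c/(2 + beta) = 1/6
print(system16(5, 2, alpha=3, gamma=1))    # odd n: every entry equals 1/4
try:
    system16(6, 2, alpha=3, gamma=1)       # even n with beta = 2
except ZeroDivisionError as err:
    print("failed:", err)
```

With exact arithmetic the third call stops at $\Delta = 0$, which is the even-$n$ failure described for $\beta = 2$. In floating point the same quantity would typically come out as a tiny non-zero number, which is one way to read the later remarks on pivot tolerances.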
3. Implementation

The implementation of the algorithm can be done in practice with several simplifications regarding computer memory. First, $C_k$ can be overwritten by $F_k$, and $f_k$ by $g_k$. Furthermore, $g_k$ can be overwritten by $y_k$ and finally $y_k$ by $x_k$. $A_1$ and $C_n$ remain unchanged during computation. If the system is set up anew each time, then $B_k$ can also be overwritten (first by $G_k$ and then by $Z_k$). Thus if the factorization, intermediate solution and final solution steps of the algorithm are not separated, then no extra work storage is necessary in addition to the memory required for the system matrix and RHS vector data. The complete algorithm written in Fortran90-like syntax has the following form

    subroutine tricyc
      Input/Output: A_k, B_k, C_k, f_k (k = 1, …, n)
      Optional input: α, γ
      Locals: T, u

      B_1 := B_1 − (γ/α) C_n              ! B'_1 = B_1 − (γ/α) C_n
      B_n := B_n − (α/γ) A_1              ! B'_n = B_n − (α/γ) A_1
      T   := B_1^{-1}                     ! E_1 = B'_1^{-1}
      C_1 := T C_1                        ! F_1 = E_1 C_1
      f_1 := T f_1                        ! g_1 = E_1 f_1
      B_1 := T/α                          ! G_1 = E_1/α
      do k = 2, n
        T   := (B_k − A_k C_{k−1})^{-1}   ! E_k = (B_k − A_k F_{k−1})^{-1}
        f_k := T (f_k − A_k f_{k−1})      ! g_k = E_k (f_k − A_k g_{k−1})
        B_k := −T A_k B_{k−1}             ! G_k = −E_k A_k G_{k−1}
        if (k <> n) then C_k := T C_k     ! F_k = E_k C_k
      enddo
      B_n := T/γ + B_n                    ! G_n = E_n (I/γ − A_n G_{n−1})
      do k = n−1, 1, −1
        f_k := f_k − C_k f_{k+1}          ! y_k = g_k − F_k y_{k+1}
        B_k := B_k − C_k B_{k+1}          ! Z_k = G_k − F_k Z_{k+1}
      enddo
      u := (I + α A_1 B_n + γ C_n B_1)^{-1} (α A_1 f_n + γ C_n f_1)
      do k = 1, n
        f_k := f_k − B_k u                ! x_k = y_k − Z_k u
      enddo
    end subroutine

In the above algorithm := represents assignment, <> means not equal and ! represents a comment.

Table 2. Execution time, Mflops, errors and residuals for system (16) with $m = 8$, $n = 100{,}000$, $\alpha = -\beta$, $\gamma = 1$

β       Time    Mflops   Error          Res
−4      2.402   138.6    0.444×10^-15   0.167×10^-14
−2      2.384   139.6    0.100×10^1     0.0
−1      2.383   139.7    0.0            0.0
1       2.389   139.3    0.0            0.0
2       2.386   139.5    0.122×10^1     0.178×10^-14
2 (a)   2.383   139.7    0.143×10^-8    0.111×10^-14
4       2.397   138.8    0.444×10^-15   0.178×10^-14

(a) n = 100,001.

There are of course other possibilities for the implementation, depending on the problem being solved and/or the system structure. If, for example, the decomposition and back-substitution phases of the algorithm are implemented in separate routines, as in the case when several RHS vectors have the same system matrices, then $B_k$ can be overwritten by $E_k$ and extra storage is needed for the $n$ matrices $Z_k$. Note also that in this case the intermediate solution phase of the algorithm can be included in the decomposition routine, so $A_1$ can be overwritten by $(I + \alpha A_1 Z_n + \gamma C_n Z_1)^{-1} (\alpha A_1)$ and $C_n$ by $(I + \alpha A_1 Z_n + \gamma C_n Z_1)^{-1} (\gamma C_n)$; consequently, no matrix inversion is needed in the implementation of the final solution step. Obviously, several simplifications in implementing the present algorithm are also possible in the case when $A_k = C_k = I$ ($k = 1, \ldots, n$). Only the $B_k$ matrices remain as input data, so extra storage for the $n$ matrices $Z_k$ is also needed.

The presented algorithms were implemented as a module tri in Fortran90 [12]. The module contains subroutines for solving TS, BTS, CTS and CBTS. Subroutines that solve TS and BTS are accessible through the generic name tridga, while subroutines that solve CTS and CBTS are accessible through the generic name tricyc. All matrix multiplications are implemented using the Fortran90 intrinsic function matmul. For matrix inversion the invert subroutine was used [13].
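
The Fortran90 module itself is not reproduced in this text. As a rough illustration of how the block steps (8)–(15) fit together, the following Python sketch solves a small CBTS with $2 \times 2$ blocks. It is an assumption-laden stand-in rather than the tri module: nested lists of `fractions.Fraction` replace Fortran arrays, the helper names (`mat_mul`, `mat_lin`, `mat_scale`, `mat_inv`, `solve_cbts`) are hypothetical, vectors are stored as $m \times 1$ columns, and, unlike tricyc, nothing is overwritten in place. As in the numerical experiments, the right-hand side is generated so that the exact solution is $x_k = 1$.

```python
from fractions import Fraction


def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def mat_lin(P, Q, s=1):
    """Return P + s*Q elementwise."""
    return [[p + s * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def mat_scale(s, P):
    return [[s * p for p in row] for row in P]


def mat_inv(P):
    """Gauss-Jordan inverse with row exchanges, exact for rational entries."""
    m = len(P)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(m)]
           for i, row in enumerate(P)]
    for col in range(m):
        piv = next((r for r in range(col, m) if aug[r][col] != 0), None)
        if piv is None:
            raise ZeroDivisionError("singular block")
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(m):
            if r != col:
                t = aug[r][col]
                aug[r] = [vr - t * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]


def solve_cbts(A, B, C, f, alpha=1, gamma=1):
    """Steps (8)-(15); A, B, C are lists of m x m blocks, f a list of m x 1 columns."""
    n, m = len(B), len(B[0])
    if n < 3:
        raise ValueError("need at least three block rows")
    alpha, gamma = Fraction(alpha), Fraction(gamma)
    I = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    B = list(B)                                           # leave the caller's list intact
    B[0] = mat_lin(B[0], C[n - 1], -gamma / alpha)        # B'_1, Eq. (4)
    B[n - 1] = mat_lin(B[n - 1], A[0], -alpha / gamma)    # B'_n, Eq. (4)
    F, g, G = [None] * n, [None] * n, [None] * n
    for k in range(n):
        if k == 0:
            E = mat_inv(B[0])                                              # Eq. (8)
            g[0], G[0] = mat_mul(E, f[0]), mat_scale(1 / alpha, E)
        else:
            E = mat_inv(mat_lin(B[k], mat_mul(A[k], F[k - 1]), -1))        # Eq. (9)
            g[k] = mat_mul(E, mat_lin(f[k], mat_mul(A[k], g[k - 1]), -1))  # Eq. (11)
            rhs = mat_scale(1 / gamma if k == n - 1 else 0, I)
            G[k] = mat_mul(E, mat_lin(rhs, mat_mul(A[k], G[k - 1]), -1))   # Eq. (12)
        F[k] = mat_mul(E, C[k])                                            # Eq. (10)
    y, Z = g[:], G[:]
    for k in range(n - 2, -1, -1):                                         # Eqs. (13), (14)
        y[k] = mat_lin(g[k], mat_mul(F[k], y[k + 1]), -1)
        Z[k] = mat_lin(G[k], mat_mul(F[k], Z[k + 1]), -1)
    aA1, gCn = mat_scale(alpha, A[0]), mat_scale(gamma, C[n - 1])
    D = mat_lin(mat_lin(I, mat_mul(aA1, Z[n - 1])), mat_mul(gCn, Z[0]))
    u = mat_mul(mat_inv(D), mat_lin(mat_mul(aA1, y[n - 1]), mat_mul(gCn, y[0])))  # Eq. (15)
    return [mat_lin(y[k], mat_mul(Z[k], u), -1) for k in range(n)]         # Eq. (5)


n = 4
A = [[[1, 0], [1, 1]]] * n
B = [[[6, 1], [1, 7]]] * n
C = [[[1, 2], [0, 1]]] * n
f = [[[11], [11]]] * n      # (A_k + B_k + C_k) applied to x_k = (1, 1)^T
x = solve_cbts(A, B, C, f)
print(x)                    # each block equals the column (1, 1)^T
```

The example blocks are strictly diagonally dominant, so every inverse in the sketch exists. For general data the same singularity caveats as in Section 2 apply.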
4. Numerical experiments

For numerical experiments, the tri module was compiled using the Compaq Visual Fortran 6.6C compiler with the optimize-for-speed compiler options. All computational experiments were performed on a PC with a 1.4 GHz Pentium 4 processor and 512 MB RAM, running the Windows 2000 operating system. For all tests $f_k$ is generated from the condition that the exact solution is $x_k = 1$ for all $k$. When $\beta$ is used, it denotes a constant added to the diagonal elements of the $B_k$ matrices. Errors are computed as the maximal absolute difference between the computed and exact solutions; residuals are computed as the maximal difference between the RHS and the LHS of the system evaluated with the computed solution. When a time is given, it is CPU time in seconds, obtained by averaging 10 program runs. The Mflops rate was calculated on the basis of the estimated flops in Table 1 and the measured CPU time.

As a first example system (16) was solved. For the numerical experiment $m = 8$, $n = 100{,}000$ and $\alpha = -\beta$, $\gamma = 1$ with various values of $\beta$ were taken. The choice of $\alpha$ and $\gamma$ was motivated by subroutine cyclic [1]. Results are shown in Table 2. The time of a run was about 2.4 s and the Mflops rate was about 139. For $\beta = 0$ the program reports singularity of diagonal matrices. As was shown, for $\beta = -2$ the algorithm fails, but the program does not report singularity if pivots in the matrix inversion routine are tested only for being exactly zero. Instead the program yields wrong results while the residual is zero. Similar problems occurred when $\beta = 2$, but if the number of equations is increased by one, thus becoming odd, the error drops to $1.4 \times 10^{-9}$. This supports the theoretical results given at the end of Section 2.

To establish the smallest acceptable pivot size several runs were made for $\beta = -2$ with different $n$. It was found that for $n = 1000$ the smallest pivot was zero, for $n = 10{,}000$ the smallest pivot was $1.33 \times 10^{-15}$, and for $n = 100{,}000$ the smallest pivot was $1.87 \times 10^{-14}$. No general rule for selecting the pivot tolerance is suggested, only a warning that this should be done carefully. For this purpose an optional argument is provided for setting the acceptable pivot size in subroutine tricyc.

The next experiment using the same system was made to determine how the choice of $\alpha$ and $\gamma$ influences the accuracy
of a calculation. For this the value range from $10^{-12}$ to $10^{12}$ was taken. The results of the calculation are provided in Table 3. It can be seen that the maximum and average error increase near the interval limits for the asymmetrical choice of parameters and have a constant value for the symmetrical case $\alpha = \gamma$. This experiment suggests the symmetrical choice of parameters, but this cannot be a general recommendation. Note also that for this test, when $\alpha/\gamma = 3.732051$ was chosen, i.e. the value corresponding to (20), the maximal error was 1.00 and the residual 3.00. This again is an avoidable wrong solution, which could simply be corrected by choosing a suitable pivot size.

Table 3. Error and residuals for $m = 4$, $n = 100{,}000$, $\beta = 4$ with various choices of $\alpha$ and $\gamma$

          α = δ, γ = 1                 α = γ = δ                    α = 1, γ = δ
δ         Error         Res            Error         Res            Error         Res
10^-12    5.94×10^-5    1.03×10^-9     3.33×10^-16   2.22×10^-16    4.25×10^-5    7.36×10^-10
10^-6     6.72×10^-11   1.39×10^-15    3.33×10^-16   2.22×10^-16    5.94×10^-5    1.03×10^-9
1         3.33×10^-16   2.22×10^-16    3.33×10^-16   2.22×10^-16    3.33×10^-16   2.22×10^-16
10^6      2.69×10^-11   6.88×10^-16    3.33×10^-16   2.22×10^-16    3.14×10^-12   2.76×10^-16
10^12     4.25×10^-5    7.36×10^-10    3.33×10^-16   2.22×10^-16    5.94×10^-5    1.03×10^-9

As a third example, the numerical performance on randomly generated matrices was tested. The standard Fortran90 intrinsic subroutine random_number was used for the random generation of $A_k$, $B_k$ and $C_k$ with the value $\beta = m$. Table 4 shows the results for $n = 100{,}000$ and $\alpha = \gamma = 1$ and various values of $m$. The results show an acceptable time and error level for all cases. Note also that the Mflops rate increases with $m$; this is not a property of the algorithm, but probably of the compiler and/or computer used.

Table 4. Execution time, Mflops, errors and residuals for random matrices for $n = 100{,}000$ and $\alpha = \gamma = 1$, $\beta = m$

m    Time    Mflops   Error          Res
1    0.072   13.9     0.154×10^-10   0.886×10^-12
2    0.141   45.5     0.445×10^-11   0.171×10^-11
3    0.272   72.8     0.244×10^-14   0.377×10^-14
4    0.550   81.5     0.155×10^-14   0.533×10^-14
5    0.913   93.2     0.178×10^-14   0.622×10^-14
6    1.348   106.8    0.200×10^-14   0.844×10^-14
7    2.030   111.1    0.178×10^-14   0.977×10^-14
8    2.502   133.0    0.178×10^-14   0.124×10^-13

As the penultimate example we compare the performance of the algorithm with subroutine cyctri, which is a block-tridiagonal solver based on Gaussian elimination optimized for the case when the individual block matrix sizes are $3 \times 3$ [4]. When comparing cyctri and tricyc one must keep in mind that cyctri is not intended to be a general-purpose solver; rather, it was designed for very specific problems. Two cases have been tested. The first is system (16) and the second random matrices with $\beta = 1$. The results of the comparison are shown in Table 5.

While the errors of both subroutines are of the same order, cyctri is about 20% faster than tricyc. This can be explained by the presence of unrolled loops in cyctri, while tricyc uses the intrinsic function matmul for all matrix multiplications. With diagonal matrices using $\beta = \pm 2$, cyctri has the same problems as tricyc when the zero-tolerance pivot test is used and, as opposed to tricyc, cyctri also fails when $\beta = 1$.

Finally, a test was also conducted using the program shall4 [4] to see how the algorithm behaves in a practical simulation tool. When used in the shall4 program, tricyc gives results which differ from those obtained by cyctri by at most about 5%. The execution time of tricyc was 12.8 s; the time for cyctri was 9.8 s, i.e. 30% faster. This increase can be explained by the fact that all matrices passed to tricyc were duplicated and then transposed, since cyctri stores a matrix as an array of consecutive rows.
5. Conclusions

An algorithm that extends the known CTS solver cyclic [1] to CBTS was developed. A comparison between the BTS and CBTS solvers shows that the CBTS solver requires approximately twice as many operations as the BTS solver. The algorithm was implemented as a Fortran90 module. Results of numerical experiments show that the presented algorithm produces relatively accurate results within an acceptable execution time. The advantage of the presented algorithm is that it is simple and relatively easy to program, and it should be regarded as yet another possibility for solving CBTS.

Acknowledgements

The author wishes to thank Prof. Dr I. Michael Navon from Florida State University, Tallahassee, for providing him with the source code of the shall4 program.

References

[1] Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical recipes in C: the art of scientific computing. Cambridge, UK: Cambridge University Press; 1992.
[2] http://www.netlib.org.
[3] Swarztrauber PN, Sweet RA. The Fourier and cyclic reduction methods for solving Poisson's equation. In: Schetz JA, Fuhs AE, editors. Handbook of fluid dynamics and fluid machinery. New York: Wiley; 1996.
[4] Navon IM, Riphagen HA. SHALL4 - An implicit compact fourth-order FORTRAN program for solving the shallow-water equations in conservation-law form. Comput Geosci 1986;12(2):129–50.
[5] Varga RS. Matrix iterative analysis. Englewood Cliffs, NJ: Prentice Hall; 1962.

Table 5. Comparison of execution time, errors and residuals between cyctri and tricyc when $\alpha = \gamma = 1$, $m = 3$

                    System (16) with β = 4                 Random matrices with β = 1
          n         Time    Error          Res             Time    Error          Res
CYCTRI    10,000    0.114   0.444×10^-15   0.178×10^-14    0.114   0.888×10^-15   0.622×10^-14
          100,000   1.109   0.444×10^-15   0.178×10^-14    1.106   0.111×10^-14   0.666×10^-14
          500,000   5.511   0.444×10^-15   0.178×10^-14    5.492   0.133×10^-14   0.755×10^-14
TRICYC    10,000    0.140   0.666×10^-15   0.244×10^-14    0.139   0.888×10^-15   0.622×10^-14
          100,000   1.250   0.666×10^-15   0.244×10^-14    1.248   0.888×10^-15   0.533×10^-14
          500,000   6.184   0.666×10^-15   0.244×10^-14    6.195   0.111×10^-14   0.555×10^-14
