A New Design of Scientific Software Using Python and XML

Lutz Gross,1 Hans Mühlhaus,1 Elspeth Thorne,1 and Ken Steube1
Abstract—
In this paper we advance the development of our python-based package for the solution of partial differential equations using spatial discretization techniques such as the finite element method (FEM) via two approaches. First, we define a Model class object which makes it easy to break down a complex simulation into simpler submodels, which can then be linked together into a highly efficient whole. Second, we implement an XML schema in which we can save an entire simulation. This allows the implementation of checkpointing in addition to graphical user interfaces, which enable non-programmers to use models developed for their research. These features are built upon our escript module, a software package designed to develop numerical models in a very abstract way while still allowing the use of computational components implemented in C and C++ to achieve extreme high performance for time-intensive calculations.

Key words: Partial differential equations, mathematical modelling, XML schema, Drucker–Prager flow.
1. Introduction
Presently, numerical simulations require a team effort unifying a variety of skills. In a very simple approach we can identify four groups of people involved: researchers using numerical simulation techniques to improve the understanding and prediction of phenomena in science and engineering, modelers developing and validating mathematical models, computational scientists implementing the underlying numerical methods, and software engineers implementing and optimizing algorithms for a particular architecture. Each of these groups employs its own terminology: A researcher uses terms such as stress, temperature and constitutive laws, while modelers express their models through functions and partial differential equations. Computational scientists work with grids and matrices. Software engineers work with arrays and data structures. Finally, an object such as stress used by a researcher is represented as a suitable, maybe platform-dependent, data structure after the modeler has interpreted it as a function of spatial coordinates and the computational scientist as values at the center of elements in a finite-element mesh. When moving from the software engineer's view towards the view of the researcher, these data structures undergo an abstraction process until finally only the concept of stress is seen, ignoring the fact that it is a function in the $L^2$ Sobolev space and represented using a finite-element mesh.

1 Earth Systems Science Computational Center, The University of Queensland, St. Lucia, QLD 4072, Australia. E-mail: gross@uq.edu.au; h.muhlhaus@uq.edu.au; e.thorne@uq.edu.au; k.steube@uq.edu.au
Pure appl. geophys. 165 (2008) 653–670, 0033–4553/08/030653–18, DOI 10.1007/s00024-008-0327-7, Birkhäuser Verlag, Basel, 2008, Pure and Applied Geophysics

It is also important to point out that each of these layers has an appropriate user environment. For the researcher, this is a set of input files describing the problem to be solved, typically in XML (http://www.w3.org/XML). Modelers, mostly not trained as software engineers, prefer to work in script-based environments, such as
python (Lutz, 2001) or MATLAB, while computational scientists and software engineers work with programming languages such as C, C++ or FORTRAN in order to achieve the best possible computational efficiency.

Various efforts have been made to provide tools for computational scientists to develop numerical algorithms, for instance PETSc (Pacheco, 1997), which is widely used. These tools provide linear algebra concepts such as vectors and matrices while hiding data structures from the user. The need of researchers for a user-friendly environment has been addressed in various projects with the development of problem solving environments (PSEs) (Houstis et al., 2000), some of which take the form of a simple graphical user interface. In contrast, there have been relatively few activities engaged in the development of environments for modelers. Two examples for partial differential equation (PDE) based modeling are ELLPACK (Rice and Boisvert, 1985) and FASTFLO (Luo et al., 1996). Both products use their own programming language and as such are not powerful enough to deal with complex and coupled problems in an easy and efficient way.

The
escript module (Gross et al., 2005; Davies et al., 2004) is an environment in which modelers can develop PDE-based models using python. It is designed to solve general, coupled, time-dependent, nonlinear systems of PDEs. It is a fundamental design feature of escript that it is not tied to a particular spatial discretization technique or PDE solver library. It is seamlessly linked with other tools such as linear algebra tools (Greenfield et al., 2001) and visualization tools (Kitware, Inc., 2003). We refer to https://shake200.esscc.uq.edu.au for access to the escript user's guide, example scripts and the source code.

In the first part of the paper we will give an overview of the basic concepts of escript from a modeler's point of view. We will illustrate its usage in the implementation of the Drucker–Prager flow model. In the second part we will present the modelframe module within escript. It provides a framework to implement mathematical models as python objects which can then be plugged together to build simulations. We illustrate this approach for the Drucker–Prager flow and demonstrate how this model can be linked with a temperature advection–diffusion model without modifying any of the code for the models. We will then discuss how XML can be used not only to set simulation parameters but also to define an entire simulation from existing models. The XML files provide an ideal method to build simulations out of PSEs or from web services. The presented implementation of the Drucker–Prager flow has been validated on some test problems, but a detailed discussion of these results is beyond the scope of this paper.
2. Drucker–Prager Flow Model

2.1. The General Framework
We will illustrate the above-mentioned concepts for the deformation of a three-dimensional body loaded by a time-dependent volume force, $f_i$, and by a surface load, $g_i$. If $u_i$ and $\sigma_{ij}$ define the displacement and stress state at a given time $t$ and $\delta t$ gives a time increment, then the general framework for implementing a model in the updated Lagrangian framework is given below:

0. start time integration, set $t = 0$
1. start iteration at time $t$, set $k = 0$
   1.0. calculate displacement increment $\delta u_i$ from $\sigma_{ij}^{-}$ and current force $f_i$.
   1.1. update geometry
   1.2. calculate stretching $D_{ij}$ and spin $W_{ij}$ from $v_i$.
   1.3. calculate new stress $\sigma_{ij}$ from $\sigma_{ij}^{-}$ using $D_{ij}$ and $W_{ij}$.
   1.4. update material properties from $\sigma_{ij}$
   1.5. if not converged, $k \leftarrow k + 1$ and goto 1.0.
2. set $t \leftarrow t + \delta t$ and goto 1.

The superscript '$-$' refers to values at the previous iteration or time step. To terminate the iteration process at a time step one can use the relative changes in the velocity $v = (u - u^{-})/\delta t$:

$$ \|\delta u\|_\infty \le \zeta_{iter} \, \|u\|_\infty, \tag{2.1} $$
where $\|\cdot\|_\infty$ denotes the maximum norm and $\zeta_{iter}$ is a specified, positive relative tolerance. Alternatively, one can check the stress change.

To ensure that the relative time integration error is forced below a specified tolerance $\zeta_{time}$, the time step $\delta t$ for the next time step is bounded by $\delta t_{max}$ given below:

$$ \delta t_{max} = \frac{\delta t \, \|u\|_\infty}{\|v - v^{-}\|_\infty} \, \zeta_{time}, \tag{2.2} $$

which controls an estimate of the local time discretization error for the total displacement $u$.

The stretching $D_{ij}$ and spin $W_{ij}$ are defined as the symmetric and antisymmetric parts of the gradient of $\delta u_i$:

$$ D_{ij} = \frac{1}{2} (\delta u_{i,j} + \delta u_{j,i}), \tag{2.3} $$

$$ W_{ij} = \frac{1}{2} (\delta u_{i,j} - \delta u_{j,i}), \tag{2.4} $$

where for any function $Z$, $Z_{,i}$ denotes the derivative of $Z$ with respect to $x_i$.
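The termination test (2.1) and the split of the displacement-increment gradient into stretching (2.3) and spin (2.4) can be sketched in a few lines of plain python. This is only an illustration and not escript code: the function names `max_norm`, `iteration_converged` and `stretching_and_spin` are hypothetical, nodal values are assumed to be stored in flat lists, and the gradient at a single point is assumed to be available as a small nested list.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def max_norm(values: Sequence[float]) -> float:
    """Maximum norm of a flat sequence of values (0.0 for an empty sequence)."""
    return max((abs(x) for x in values), default=0.0)


def iteration_converged(du: Sequence[float], u: Sequence[float],
                        zeta_iter: float) -> bool:
    """Termination test (2.1): ||du||_inf <= zeta_iter * ||u||_inf."""
    return max_norm(du) <= zeta_iter * max_norm(u)


def stretching_and_spin(grad_du: Matrix) -> Tuple[Matrix, Matrix]:
    """Split grad_du[i][j] = d(du_i)/dx_j into D (2.3) and W (2.4)."""
    n = len(grad_du)
    # Symmetric part of the gradient.
    D = [[0.5 * (grad_du[i][j] + grad_du[j][i]) for j in range(n)]
         for i in range(n)]
    # Antisymmetric part of the gradient.
    W = [[0.5 * (grad_du[i][j] - grad_du[j][i]) for j in range(n)]
         for i in range(n)]
    return D, W
```

By construction $D_{ij} + W_{ij}$ recovers the original gradient, which gives a quick consistency check for any implementation of (2.3) and (2.4).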
To calculate the displacement increment $\delta u_i$ one has to solve a partial differential equation on the deformed domain $\Omega^{-}$, which in tensor notation takes the form:

$$ -(S_{ijkl} \, \delta u_{k,l})_{,j} = (\sigma_{ij}^{-} - K \epsilon_{th} \delta_{ij})_{,j} + f_i \tag{2.5} $$
where $K$ is the bulk modulus and $S_{ijkl}$ is the tangential tensor, which depends on the rheology that is used. The argument of the divergence expression on the right-hand side is the Cauchy stress at time $t - \delta t$ with the mechanical stress $\sigma_{ij}^{-}$ at time $t - \delta t$. The term $K \epsilon_{th} \delta_{ij}$ is the thermal stress, where the thermal volume strain $\epsilon_{th}$ is given as

$$ \epsilon_{th} = \theta (T - T_{ref}) \tag{2.6} $$

with thermal expansion coefficient $\theta$, current temperature $T$ and the reference temperature $T_{ref}$. In geoscience applications, $f_i$ typically takes the form of the gravitational force

$$ f_i = -\rho g \delta_{id} \tag{2.7} $$

which acts oppositely to the positive $d$ direction. The constants $g$ and $\rho$ are the gravity and density, respectively. The density is often given in the form

$$ \rho = \rho_0 (1 - \theta (T - T_0)), \tag{2.8} $$

where $\rho_0$ is the density at the reference temperature $T_0$.

The displacement increment has to fulfill the conditions
$$ n_j (S_{ijkl} \, \delta u_{k,l} + \sigma_{ij}^{-} - K \epsilon_{th} \delta_{ij}) = g_i \tag{2.9} $$

on the boundary of the domain, where $g_i$ is a time-dependent surface load. Moreover, the displacement increment has to meet a constraint of the form

$$ \delta u_i = \delta t \cdot V_i \quad \text{on } \Gamma_i, \tag{2.10} $$

where $\Gamma_i$ is a subset of the boundary of the domain. At the first iteration step $V_i$ gives the $i$-th velocity component acting on the subset $\Gamma_i$ of the deformed domain and is set to zero for the following steps.
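The temperature-dependent terms (2.6) to (2.8) are simple pointwise formulas, and a minimal python sketch may make the signs explicit. The function names below are hypothetical and the code works on plain floats for a single point. In escript the same expressions would presumably be evaluated on distributed data objects, which is not shown here.

```python
from typing import List


def thermal_volume_strain(theta: float, T: float, T_ref: float) -> float:
    """Equation (2.6): eps_th = theta * (T - T_ref)."""
    return theta * (T - T_ref)


def density(rho_0: float, theta: float, T: float, T_0: float) -> float:
    """Equation (2.8): rho = rho_0 * (1 - theta * (T - T_0))."""
    return rho_0 * (1.0 - theta * (T - T_0))


def gravity_force(rho: float, g: float, d: int, dim: int = 3) -> List[float]:
    """Equation (2.7): f_i = -rho * g * delta_{id}.

    d is the zero-based index of the vertical axis (an assumption of this
    sketch); the force points against the positive d direction.
    """
    return [-rho * g if i == d else 0.0 for i in range(dim)]
```

With illustrative values, `gravity_force(density(3300.0, 3e-5, 1300.0, 300.0), 9.81, d=2)` returns a vector whose only non-zero component is the negative third one, reflecting that heating reduces the density and hence the magnitude of the body force.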
2.2. The Drucker–Prager Model
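For the Drucker–Prager model the stress is updated in the following procedure: First the stress state at time $t + \delta t$ due to elastic deformation, $\sigma_{ij}^{e}$, is calculated:

$$ \sigma_{ij}^{e} = \sigma_{ij}^{-} + K \epsilon_{th} \delta_{ij} + \delta t \left( 2 G D_{ij} + \left( K - \frac{2}{3} G \right) D_{kk} \delta_{ij} + W_{ik} \sigma_{kj}^{-} - \sigma_{ik}^{-} W_{kj} \right), \tag{2.11} $$

where $G$ is the shear modulus. The yield function $F$ is evaluated as

$$ F = \tau^{e} - \alpha \cdot p^{e} - \tau_Y, \tag{2.12} $$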
where

$$ p^{e} = -\frac{1}{3} \sigma_{kk}^{e}, \tag{2.13} $$

$$ \tau^{e} = \sqrt{\frac{1}{2} (\sigma_{ij}^{e})' (\sigma_{ij}^{e})'} \tag{2.14} $$

with the deviatoric stress

$$ (\sigma_{ij}^{e})' = \sigma_{ij}^{e} + p^{e} \delta_{ij}. \tag{2.15} $$
The value $\tau_Y$ is the current shear length and $\alpha$ is the friction parameter, both of which are given as a function of the plastic shear stress $\gamma^{p}$. We require that the yield function does not become positive. The factor $\chi$ marks when the yield condition is violated:

$$ \chi = \begin{cases} 0 & \text{for } F < 0 \\ 1 & \text{else.} \end{cases} \tag{2.16} $$
With current hardening modulus $h = \frac{d\tau_Y}{d\gamma^{p}}$ and dilatancy parameter $\beta$, which is again a given function of the plastic shear stress $\gamma^{p}$, we can define the plastic shear stress increment as

$$ \lambda = \chi \cdot \frac{F}{h + G + \alpha \beta K}. \tag{2.17} $$
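We can then calculate a new stress as

$$ \tau = \tau^{e} - \lambda G, \tag{2.18} $$

$$ \sigma_{ij} = \frac{\tau}{\tau^{e}} (\sigma_{ij}^{e})' - (p^{e} + \lambda \beta K) \delta_{ij}. \tag{2.19} $$

Finally we can update the plastic shear stress

$$ \gamma^{p} = \gamma^{p-} + \lambda \tag{2.20} $$

and the hardening parameter

$$ h = \frac{\tau_Y - \tau_Y^{-}}{\gamma^{p} - \gamma^{p-}}. \tag{2.21} $$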
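The stress update (2.12) to (2.20) amounts to a return of the elastic trial stress to the yield surface, and it can be sketched at a single material point as follows. This is a simplified illustration rather than the escript implementation: the function name `drucker_prager_update` is hypothetical, the parameters $\alpha$, $\beta$, $\tau_Y$ and $h$ are passed in as constants although the text defines them as functions of $\gamma^{p}$, the denominator is taken as $h + G + \alpha\beta K$ as in the tangential tensor (2.22), and the guard against a vanishing trial shear stress is an added assumption. The hardening update (2.21) is left out.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def drucker_prager_update(sigma_e: Matrix, K: float, G: float, alpha: float,
                          beta: float, tau_Y: float, h: float,
                          gamma_p_old: float) -> Tuple[Matrix, float, float]:
    """Return (sigma, gamma_p, lam) from the 3x3 elastic trial stress sigma_e."""
    n = 3
    # (2.13): pressure of the trial stress.
    p_e = -sum(sigma_e[k][k] for k in range(n)) / 3.0
    # (2.15): deviatoric part of the trial stress.
    dev = [[sigma_e[i][j] + (p_e if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    # (2.14): equivalent shear stress.
    tau_e = math.sqrt(0.5 * sum(dev[i][j] ** 2
                                for i in range(n) for j in range(n)))
    # (2.12) and (2.16): yield function and yield marker.
    F = tau_e - alpha * p_e - tau_Y
    chi = 0.0 if F < 0.0 else 1.0
    # (2.17): plastic increment.
    lam = chi * F / (h + G + alpha * beta * K)
    # (2.18): reduced shear stress.
    tau = tau_e - lam * G
    # Guard for tau_e == 0 (an assumption of this sketch, not from the text).
    scale = tau / tau_e if tau_e > 0.0 else 0.0
    p = p_e + lam * beta * K
    # (2.19): new stress.
    sigma = [[scale * dev[i][j] - (p if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # (2.20): accumulated plastic measure.
    return sigma, gamma_p_old + lam, lam
```

For a trial state inside the yield surface the function returns the trial stress unchanged with `lam == 0.0`. For a yielding state with $h = 0$ the returned stress satisfies $\tau - \alpha p - \tau_Y = 0$ up to rounding, which is a convenient check of the signs in (2.17) to (2.19).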
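For the Drucker–Prager model the tangential tensor is given as

$$ \begin{aligned} S_{ijkl} = {} & G (\delta_{ik} \delta_{jl} + \delta_{jk} \delta_{il}) + \left( K - \frac{2}{3} G \right) \delta_{ij} \delta_{kl} + (\sigma_{ij} \delta_{kl} - \sigma_{il} \delta_{jk}) \\ & + \frac{1}{2} (\delta_{ik} \sigma_{lj} - \delta_{jl} \sigma_{ik} + \delta_{jk} \sigma_{il} - \delta_{il} \sigma_{kj}) \\ & - \frac{\chi}{h + G + \alpha \beta K} \left( G \frac{(\sigma_{ij})'}{\tau} + \beta K \delta_{ij} \right) \left( G \frac{(\sigma_{kl})'}{\tau} + \alpha K \delta_{kl} \right). \end{aligned} \tag{2.22} $$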
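To make the index structure of (2.22) concrete, the following python sketch assembles the tangential tensor at a single point as a nested list. It is an illustration under stated assumptions, not the escript implementation: the names `delta` and `tangential_tensor` are hypothetical, the stress and its deviatoric part are passed in as 3x3 nested lists, and the plastic term is skipped when $\chi = 0$ so that $\tau$ is only required to be positive in the yielding case.

```python
from typing import List

Matrix = List[List[float]]
Tensor4 = List[List[List[List[float]]]]


def delta(i: int, j: int) -> float:
    """Kronecker delta."""
    return 1.0 if i == j else 0.0


def tangential_tensor(sigma: Matrix, sigma_dev: Matrix, tau: float, K: float,
                      G: float, alpha: float, beta: float, h: float,
                      chi: float) -> Tensor4:
    """Assemble S[i][j][k][l] following (2.22) term by term."""
    n = 3
    S = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    c = chi / (h + G + alpha * beta * K)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    # Isotropic elastic part.
                    elastic = (G * (delta(i, k) * delta(j, l)
                                    + delta(j, k) * delta(i, l))
                               + (K - 2.0 * G / 3.0) * delta(i, j) * delta(k, l))
                    # Stress-dependent (geometric) part.
                    geometric = (sigma[i][j] * delta(k, l)
                                 - sigma[i][l] * delta(j, k)
                                 + 0.5 * (delta(i, k) * sigma[l][j]
                                          - delta(j, l) * sigma[i][k]
                                          + delta(j, k) * sigma[i][l]
                                          - delta(i, l) * sigma[k][j]))
                    # Plastic correction, only active when yielding.
                    plastic = 0.0
                    if chi:
                        a = G * sigma_dev[i][j] / tau + beta * K * delta(i, j)
                        b = G * sigma_dev[k][l] / tau + alpha * K * delta(k, l)
                        plastic = c * a * b
                    S[i][j][k][l] = elastic + geometric - plastic
    return S
```

For zero stress and $\chi = 0$ the result reduces to the isotropic elastic tensor, so that $S_{ijkl} D_{kl} = 2 G D_{ij} + (K - \frac{2}{3} G) D_{kk} \delta_{ij}$ for symmetric $D_{kl}$, in agreement with the elastic terms in (2.11).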
In the implementation of the model we have to monitor the total stress $\sigma$, plastic stress $\gamma^{p}$, the shear length $\tau_Y$ as well as displacement $u$ and velocity $v$.