generate::Macrofort::genFor
-- FORTRAN
code generatorMac::genFor
(where
Mac:=generate::Macrofort
) is the Macrofort package for
generating FORTRAN code.
The Macrofort package allows the user to make complete standard FORTRAN 77 code while remaining in the MuPAD environment. Necessities of FORTRAN code such as label numbering and complicated FORTRAN loops are handled automatically. Macrofort has capabilities for:
Furthermore, the combination of MuPAD's symbolic manipulation
capabilities and, in particular, its optimizer generate::optimize
can help yield
efficient codes for ambitious goals in numerical computation. MuPAD can
be also used as a preprocessor for numerical analysis.
Macrofort overlaps with MuPAD's generate::fortran
but
it is more general program for generation FORTRAN code and has many
more options.
generate::Macrofort::genFor(l)
l |
- | list or list of lists |
the void object of domain type DOM_NULL
writes FORTRAN code into an ascii file
generate::Macrofort::init
,
generate::Macrofort::setOptimizedOption
,
generate::Macrofort::setIOSettings
,
generate::Macrofort::setPrecisionOption
,
generate::Macrofort::setAutoComment
,
generate::Macrofort::openOutputFile
,
generate::Macrofort::closeOutputFile
,
generate::optimize
,
generate::fortran
m
appears at the end of the keyword. The
following input list of keywords and their corresponding FORTRAN output
refer to Macrofort single instructions:
["call" , name, list] |
call name (list) |
["close" , n] |
close (n) |
["comment" , string] |
c string |
["common" , name, list] |
common /name/ list |
["continue" , label] |
label continue |
["declare" , type, list] |
type list |
["do" , label, index, |
do label,index=start,end |
start, end] | |
["do" , label, index, |
do label,index=start,end,step |
start, end, step] | |
["else" ] |
else |
["end" ] |
end |
["endif" ] |
endif |
["equal" , var, expression] |
var=expression |
["format" , label, list] |
label format (list) |
["function" , type, |
type function name (list) |
name, list] | |
["goto" , label] |
goto label |
["if_goto" , cond, label] |
if (cond) goto label |
["if_then" , cond] |
if (cond) then |
["parameter" , list] |
parameter (list) |
["program" , name] |
program name |
["read" , file, label, list] |
read (file,label) list |
["return" ] |
return |
["subroutine" , name, list] |
subroutine name (list) |
["write" , file, label, list] |
write (file, label) list |
["open" , n, file, st] |
open(unit=n,file='file',status='st') |
where n appears in cases such as "open"
correspond to unit numbers (or channel numbers) for FORTRAN I/O
instructions and where cond
appears as a condition
argument of a Macrofort instruction. Examples of conditions are:
[ïf_then",a>=b].
_not
, _and
and
_or
in a condition, you have to use the names
"NOT"
, ÄND"
and ÖR"
within a
list notation. For instance:[ïf_then",[ÖR",a=b,["NOT",c<d)]].
label
appears as an argument of a Macrofort instruction, a MuPAD variable
must be inserted. The label is retained within Macrofort it can always
be avoided using the macro instructions.
When list
appears as an argument of a Macrofort
instruction, it corresponds to an argument FORTRAN list which you have
to write as a MuPAD list. For instance:
["call",foo,[a,b,c]]or
["format",[`2x,e14.7`],[x,y]].
["do_m" , index, |
do label,index=start,end,step |
start, end, step, doList] | doList |
label continue | |
["do_m" , index, |
do label,index=start,end |
start, end, doList] | doList |
label continue | |
["functionm" , type, |
type function name (list) |
name, list, bodyList] | bodyList |
end | |
["if_then_else_m" , cond, |
if cond then |
thenList, elseList] | thenList |
else | |
elseList | |
endif | |
["if_then_m" , cond, thenList] |
if cond then |
thenList | |
endif | |
["programm" , name, bodyList] |
program name |
bodyList | |
end | |
["readm" , file, |
read (file,label) varList |
formatList, varList] | label format (formatList) |
["subroutinem" , name, |
subroutine name (list) |
list, bodyList] | bodyList |
end | |
["writem" , file, |
write (file,label) varList |
formatList, varList] | label format (formatList) |
["declarem" , type, list] |
type list |
["commonm" , name, list] |
common / name / list |
["openm" , n, |
open(unit=n,file='file',status='st') |
file, st, bodyList] | bodyList |
close (n) |
"commonm"
and
"declarem"
and their single instruction counterparts,
namely "common"
and "declare"
is that one can
put the macro instructions everywhere in the list describing the
program and Macrofort puts them at the right place in the generated
FORTRAN code. This allow us to declare a variable only when it is used
in the body of the program. These macros only work within the
"programm"
, "functionm"
or
"subroutinem"
instructions. Otherwise there are
ignored.["whilem",condition,initList,whileList,whileMax(optional)]:
<initList>
while condition do
<whileList>
end.
[üntilm",condition,initList,untilList,untilMax(optional)]:
<initList>where
do <untilList>
until condition
end.
whileMax
and untilMax
are optional
arguments and respectively the maximum number of iterations for the
WHILE and UNTIL loops. When the maximum is reached, the loop stops and
a message is issued. If this argument is not given, there is no maximum
number of iterations. Note that doList
,
thenList
, elseList
, bodyList
,
initList
, untilList
and
whileList
arguments must be MuPAD lists describing FORTRAN
statements with a Macrofort syntax. You can nest as many loops as you
want.[
"matrixm",var,matrix] is another very useful macro
instruction. It is used to make assignations of the elements of a
matrix or array. Here, var
is the name of a FORTRAN matrix
and matrix
is a MuPAD matrix (see first example).macrofort
domain. Please note that before any call to
Mac::genFor
(where Mac:=generate::Macrofort
),
the procedure Mac::init
must be used! The latter sets
these global variables to their default values and it initializes the
various counters used internally by Macrofort (e.g.: for label
generation). Deviations from these settings are done by calls to
Mac::setIOSettings
, Mac::setOptimizedOption
,
Mac::setPrecisionOption
and
Mac::setAutoComment
(see the help page for
Mac::init
and these other procedures for details).
Furthermore, any call to Mac::genFor
also needs a call to
Mac::openOutputFile
to name and open the ascii FORTRAN
file and a call to Mac::closeOutputFile
to close that same
file.Calculation of a matrix.
Initialize Macrofort and open the file
"matrix.f"
>> Mac := generate::Macrofort: Mac::init(): Mac::openOutputFile("matrix.f"):
Create a 2 x 2 array with symbolic entries called
a
>> a := array(1..2, 1..2, [[x^2, x - y], [x/y, x^2 - 1]])
+- -+ | 2 | | x , x - y | | | | x 2 | | -, x - 1 | | y | +- -+
and generate a FORTRAN array v
for
a
using "matrixm"
in genFor:
>> Mac::genFor(["matrixm", v, a]):
Close the file "matrix.f"
>> Mac::closeOutputFile(): delete a:
The output file matrix.f
is:
v(1,1) = x**2 v(1,2) = x-y v(2,1) = x/y v(2,2) = x**2-1
The calculation of a polynomial in Horner form.
Open the file "demo.f"
and define the
function using the macro "functionm"
. p
is a
list containing all specifications in the form of lists.
"declarem"
defines the FORTRAN variables and their types
and "do_m"
creates the do-loop by which the polynomial is
summed in Horner form.
>> Mac::openOutputFile("demo.f"): p := ["functionm", doubleprecision, horner,[x,n,a], [["declarem", doubleprecision, [a(n), x]], ["equal", horner, a[n]], ["do_m", i, n - 1, 0, -1, ["equal", horner, a[i] + horner*x]]]]: Mac::genFor(p): Mac::closeOutputFile(): delete p,a,x,n,i:
The output file demo.f
is:
c c FUNCTION horner c doubleprecision function horner(x,n,a) doubleprecision a(n),x horner = a(n) c do 1000, i=n-1,0,-1 horner = x*horner+a(i) 1000 continue c end
Bear in mind that this demo example is lame as the Macrofort MuPAD code is at least as extensive as the resulting FORTRAN output which could have been readily obtained directly by hand. The dividends of Macrofort become more clear in the following examples.
A recursive function on a tree.
Although it is possible to write FORTRAN programs for recursive problems, this can be very tedious for complicated recursions and even if the recursion is not so complicated, other problems can arise as shown here. In this example, we consider a binary tree with nodes labeled by pairs of integers (i,j) where (1,1) is the root of the tree, (2,1) and (2,2) are the children nodes of the root and recursively (i+1,2j-1) and (i+1,2j) are the children nodes of node (i,j). Node (i,j) is at level i. Consider now the following sequence:
where g is a given function. We want to compute the values of the sequence up to a given level N, i.e. the 2 ^N-1 values f(N,1) ... f(N,2 ^N-1). To write the corresponding FORTRAN code, you only have to write two loops:
real f(n,m) do 1,i=1,n do 2,j=1,2**(n-1)-1,2 f(i,j)=g(f(i-1,(j+1)/2)) 2 continue do 3,j=2,2**(n-1),2 f(i,j)=g(f(i-1,j/2)) 3 continue 1 continue
but the dimension m of the array f is 2 ^N-1. In other words, we would have to retain the storage of N x 2 ^N-1 real values instead of 2 ^N-1 (which corresponds to 5 times more storage for N equal to 10).
A way to avoid this waste, is to have an array for each level, i.e. to have FORTRAN arrays f1(1), f2(2), f3(4) and so forth but it then becomes very tricky to write the resulting FORTRAN program by hand. However, this can be readily solved using a MuPAD program within Macrofort.
We suppose that a FORTRAN function g has already been defined. The MuPAD function which generates the FORTRAN program is:
>> Mac::openOutputFile("Func.f"): pushe := proc(e,l) begin [op(eval(l)),e] end_proc: gen_Func := proc(n) local i,pg,temp,temp1,temp2,temp3; begin pg:=[]: // declaration of the arrays for i from 1 to n do temp:=eval(text2expr( _concat("f",expr2text(i),"[",expr2text(2^(i-1)),"]"))): pg:=pushe(["declare",real,[temp]],pg): end_for: // loops for each array for i from 2 to n do temp1:=eval(text2expr(_concat("f",expr2text(i),"[j]"))): temp2:=eval(text2expr(_concat("f",expr2text(i-1),"[Hold(j+1)/2]"))): temp3:=eval(text2expr(_concat("f",expr2text(i-1),"[j/2]"))): pg:=pushe(["do_m",j,1,2^(i-1)-1,2,["equal",temp1,g(temp2)]],pg): pg:=pushe(["do_m",j,2,2^(i-1),2,["equal",temp1,g(temp3)]],pg): end_for: pg:=["programm",Func,pg]: Mac::genFor(pg): end_proc: gen_Func(4): Mac::closeOutputFile(): delete j,pushe,gen_Func:
The output file Func.f
is:
c c MAIN PROGRAM Func c program Func real f1(1) real f2(2) real f3(4) real f4(8) c do 1000, j=1,1,2 f2(j) = g(f1((j + 1)/2)) 1000 continue c c do 1001, j=2,2,2 f2(j) = g(f1(j/2)) 1001 continue c c do 1002, j=1,3,2 f3(j) = g(f2((j + 1)/2)) 1002 continue c c do 1003, j=2,4,2 f3(j) = g(f2(j/2)) 1003 continue c c do 1004, j=1,7,2 f4(j) = g(f3((j + 1)/2)) 1004 continue c c do 1005, j=2,8,2 f4(j) = g(f3(j/2)) 1005 continue c end
By calling gen_Func(n)
for larger n
, you
can readily have Macrofort generate the needed code for larger and
larger n. Of course, the output is proportionally larger but
still ``digestible'' to modern-day compilers for a wide range of
n.
Optimized Vectorized FORTRAN code for Molecular
Integrals.
Here, we present a method for the rapid numerical
evaluation of molecular integrals which appear in the areas of Quantum
Chemistry and Molecular Physics. The method is based on the
exploitation of common intermediates using MuPAD's optimizer (see
generate::optimize
) and the optimization can be adjusted
to both serial and vectorised computations.
Integrals based on atom-centered Gaussian-type functions known as the R-integrals are given by the recurrence relations:
R_j [t+1,m,n ] = x R_j+1 [t,m,n] + t R_j+1 [t-1,m,n]
R_j [t,m+1,n] = y R_j+1 [t,m,n] + m R_j+1 [t,m-1,n]
R_j [t,m,n+1] = z R_j+1 [t,m,n] + n R_j+1 [t,m,n-1]
R_j [0,0,0] = (-2 p ) ^j F_j ( alpha QP ^2 )
where
F_m (W) = int(exp [ -W t ^2] t ^2m,t=0..1),
where the vector QP=-(x,y,z) and alpha are given geometrical quantities (determined on input) for a particular molecular geometry, and F is a known function in quantum chemistry for which there already exist various algorithms for its rapid computation (in this example, we are only interested in the computation of the polynomial part of R(t,u,v). The summation indices are bounded by zero and t+m+n <= L where L is a total angular quantum number for a given molecular problem, and consequently these induces adhere to a polyhedral structure. The total number N of R functions to compute for a given L is given by N = (L+1) (L+2) (L+3) /6 . In the diatomic molecular case (i.e. a molecule made of only two ``atoms'' whose centers are set on the z-axis), the R-integrals form a sparse three-dimensional matrix (see the work of the references for a fuller understanding of the framework).
Vectorization involved the introduction of a vectorization index, denoted M, which is the first index of all the arrays involved in the computation. The FORTRAN code obtained from the compiler is then ``sandwiched'' within do-loops. The code shown here is generated for the R integrals up to L=3 although it can generate the code for all the way up to L=16.
This example makes use of MuPAD's optimizer (see
generate::optimize
) and therefore, one obtains optimized
vectorized FORTRAN code. This code had been tested on a CRAY and the
FLOP (floating-point operations) rate was about 85 percent of its peak
efficiency and has been used in specific relativistic quantum chemistry
calculations.
Input code:
First we construct the S
functions by which the
R
functions are later defined as shown in the work of V.
Saunders (see references). This definition essentially exploits a
decomposition in odd and even symmetries within these functions and
provides an economy in the computations.
>> S:=proc(j,p,QP,QP2,t,u,v) begin if (t<0) or (u<0) or (v<0) then 0; elif (t=0) and (u=0) and (v=0) then ((-2*p)^j)*FS[M,j+1]; elif (t>0) and (t mod 2 =1) then S(j+1,p,QP,QP2,t-1,u,v)+(t-1)*S(j+1,p,QP,QP2,t-2,u,v); elif (u>0) and (u mod 2 =1) then S(j+1,p,QP,QP2,t,u-1,v)+(u-1)*S(j+1,p,QP,QP2,t,u-2,v); elif (v>0) and (v mod 2 =1) then S(j+1,p,QP,QP2,t,u,v-1)+(v-1)*S(j+1,p,QP,QP2,t,u,v-2); elif (t>0) and (t mod 2 =0) then QP2[M,1]*S(j+1,p,QP,QP2,t-1,u,v)+(t-1)*S(j+1,p,QP,QP2,t-2,u,v); elif (u>0) and (u mod 2 =0) then QP2[M,2]*S(j+1,p,QP,QP2,t,u-1,v)+(u-1)*S(j+1,p,QP,QP2,t,u-2,v); elif (v>0) and (v mod 2 =0) then QP2[M,3]*S(j+1,p,QP,QP2,t,u,v-1)+(v-1)*S(j+1,p,QP,QP2,t,u,v-2); end_if; end_proc: Xi:=proc(xbar,i) begin if (QP[M,1] <> 0) then (-xbar)^(i mod 2); elif ((i mod 2) > 0) then 0; else 1; end_if; end_proc:
Finally, we construct the R functions
>> R:=proc(j,p,QP,QP2,t,u,v) local X,Y,Z; begin X:=Xi(QP[M,1],t); Y:=Xi(QP[M,2],u); Z:=Xi(QP[M,3],v); S(j,p,QP,QP2,t,u,v)*(X*Y*Z); end_proc:
We restrict ourselves to the Diatomic case i.e. only the z-axis (the 3rd axis) has non-zero components.
>> QP[M, 1] := 0: QP[M, 2] := 0: QP[M, 3] := QP[M]: QP2[M, 1] := 0: QP2[M, 2] := 0: QP2[M, 3] := QP2[M]:
We ensure plenty of guard digits for L up to L=16
>> DIGITS:=60: subr:=null(): for LL from 0 to 4 do tuv:=0: ds:=null(): for mu from 0 to LL do t:=LL-mu; for nu from 0 to mu do u:=mu-nu; for tau from 0 to nu do v:=nu-tau; tuv:=tuv+1; Rtuv:=float(R(0,ALPHA[M],QP,QP2,t,u,v)); if (Rtuv <>0 and Rtuv <> 0.0) then ds:=ds,[RC[M,tuv],Rtuv]: end_if; end_for: end_for: end_for: subr:=subr,["if_then_m",L=LL, ["do_m",M,1,MAXM,["equal",[ds]]]]; end_for:
Construct the input list for the FORTRAN Subroutine
>> delete QP2: subr := ["subroutinem", RMAKE, [L, FS, ALPHA, QP2, MAXM, RC], [["declare", "implicit doubleprecision", ["(a-h,o-z)"]], ["parameter", [LMAX = 12, MAXB = 32, MAXB2 = "MAXB*MAXB", MAXR = 455]], ["declare", dimension, ["FS(MAXB2,LMAX)", "ALPHA(MAXB2)", "QP2(MAXB2)", "RC(MAXB2,MAXR)"]], subr]]:
Initialization of Macrofort:
>> Mac:=generate::Macrofort: Mac::init():
Open file "vectorized.f"
and switch
optimizer on. The desired precision for the FORTRAN code is double:
>> Mac::openOutputFile("vectorized.f"): Mac::setOptimizedOption(TRUE): Mac::setPrecisionOption("double"):
Code Generation:
>> subr: Mac::genFor(subr): Mac::closeOutputFile(): delete subr,S,R,LL,Xi,t,u,v,Rtuv,tuv,ds,tu,mu,nu,QP,QP2,M:
The output file vectorized.f
is:
c c SUBROUTINE RMAKE c subroutine RMAKE(L,FS,ALPHA,QP2,MAXM,RC) implicit doubleprecision (a-h,o-z) parameter (LMAX=12,MAXB=32,MAXB2=MAXB*MAXB,MAXR=455) dimension FS(MAXB2,LMAX),ALPHA(MAXB2),QP2(MAXB2),RC(MAXB2,MAXR) if (L.eq.0) then c do 1000, M=1,MAXM RC(M, 1) = FS(M,1) 1000 continue c endif if (L.eq.1) then c do 1001, M=1,MAXM RC(M, 4) = FS(M,1) 1001 continue c endif if (L.eq.2) then c do 1002, M=1,MAXM t1 = ALPHA(M) t2 = FS(M,2) RC(M, 1) = -0.2D1*t1*t2 RC(M, 5) = RC(M,1) RC(M, 8) = RC(M,1)+0.4D1*t1**2*QP2(M)*FS(M,3) RC(M, 10) = FS(M,1) 1002 continue c endif if (L.eq.3) then c do 1003, M=1,MAXM t3 = ALPHA(M) t4 = FS(M,2) RC(M, 4) = -0.2D1*t3*t4 RC(M, 13) = RC(M,4) RC(M, 18) = RC(M,4)+0.4D1*t3**2*QP2(M)*FS(M,3) RC(M, 20) = FS(M,1) 1003 continue c endif if (L.eq.4) then c do 1004, M=1,MAXM t5 = ALPHA(M) t6 = t5**2 t7 = FS(M,3) RC(M, 1) = 0.12D2*t6*t7 RC(M, 5) = 0.4D1*t6*t7 t8 = QP2(M) t9 = FS(M,4) t10 = -0.8D1*t5*t6*t8*t9 RC(M, 8) = t10+RC(M,5) t11 = FS(M,2) RC(M, 10) = -0.2D1*t5*t11 RC(M, 21) = RC(M,1) RC(M, 24) = RC(M,8) RC(M, 26) = RC(M,10) RC(M, 31) = t8*(0.16D2*t6**2*t8*FS(M,5)-0.24D2*t5*t6*t9) #+RC(M,1)-0.24D2*t5*t6*t8*t9 RC(M, 33) = 0.4D1*t6*t7*t8+RC(M,10) RC(M, 35) = FS(M,1) 1004 continue c endif end
The benefits of MuPAD's optimizer become more evident at higher L but we can already see its effects. Common intermediates are exploited and in a number of cases, only re-assignations and not actual computation were needed.