A Tutorial Introduction to

                                   Sisal

                         A High Performance, Portable,

                         Parallel Programming Language


                The mission of the Sisal Language Project is 
                to develop high-performance functional 
                compilers and runtime systems to simplify 
                the process of writing scientific programs 
                on parallel supercomputers and to help 
                programmers develop functional scientific 
                applications.

If you have any questions about this page, or want more information about the
Sisal Language Project, contact:

John Feo at (510) 422-6389 or feo@diego.llnl.gov, or
Tom DeBoni at (510) 423-3793 or deboni@llnl.gov.

The Sisal Language Project has been approved as a Designated Unclassified
Subject Area (DUSA) K19222, as of 07 August, 1991.


Table of Contents
1. Introduction - What is Sisal?		
2. Parallel Programming and Other Virtues		4
3. Expression Syntax					6
4. More Syntax						10
5. Exercise Set 1: Syntax Elements			12
6. User-Defined Types and Dynamic Values		14
7. Exercise Set 2: Aggregate Types			19
8. Loops and Parallelism				21
9. Exercise Set 3: Basic Loops			23
10. More on Loops						25
11. Exercise Set 4: Advanced Loops			29
12. Advanced Array Operations				31
13. Exercise Set 5: Advanced Array Operations	32
14. Functions and Programs				33
15. Exercise Set 6: Functions and Programs	38
16. Sequential Looping with For-Initial Loops	41
17. Exercise Set 7: Sequential Loops		47
18. Compiling and Running Sisal Programs		52
19. Sisal I/O						58
20. Exercise Set 8: Programs				63
21. Advanced Features					64
22. How to Get the Sisal Software			66
23. For More Information...				66

DISCLAIMERS
NOTICE: Information from this server resides on a computer system funded by the U.S. Department of Energy. Anyone using this system consents to monitoring of this use by system or security personnel.

DISCLAIMER OF LIABILITY: With respect to documents available from this server, neither the United States Government nor the University of California nor any of their employees, makes any warranty, express or implied, including the warranties of merchantability and fitness for a particular purpose, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights.

DISCLAIMER OF ENDORSEMENT: Reference herein to any specific commercial products, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or the University of California, and shall not be used for advertising or product endorsement purposes.

COPYRIGHT STATUS: LLNL authored documents are sponsored by the U.S. Department of Energy under Contract W-7405-Eng-48. Accordingly, the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce these documents, or allow others to do so, for U.S. Government purposes. All documents available from this server may be protected under the U.S. and Foreign Copyright Laws. Permission to reproduce may be required.
1. Introduction - What is Sisal?

Sisal is a functional parallel programming language. It combines modern 
language features with readable syntax and mathematically sound semantic 
foundations to provide the easiest vehicle for parallel programming. Its 
optimizing compiler and runtime support system software provide portability, 
high performance, and determinate execution behavior. It is available on all 
Unix-based uniprocessor and shared memory multiprocessor systems, and 
developmental versions exist for several distributed memory systems, as well.

Why do we think you should be interested in Sisal?

Anyone who's written a computer program knows the extent to which attention 
to detail is required in order to succeed at it. Much of this detail is due 
to factors that are not present in the problem being solved, arising from 
what computer scientists would call the "model of computation" that underlies 
the programming language being used. Unfortunately, the standard 
computational model on which most languages are based is not an easy one to 
deal with. The primal operations are quite simple, and their order is a prime 
determinant of the outcome of a program. On the other hand, most scientific 
or technical programming is based on mathematics, which is quite powerful, 
and in a typical "statement" of mathematics, such as a derivation, the order 
of the expressions does not affect the truth of the ultimate equation. It 
seems reasonable, therefore, that a programming language based on mathematics 
could offer some leverage in the programming process - make the programmer's 
job simpler and the expression of the ideas in the program more 
straightforward. That is exactly what this text deals with - a programming 
language that is closer to mathematics than most, called Sisal. Sisal is a 
simple language. Mathematicians tend to like it. However, if you have any 
experience programming in languages like Fortran, Cobol, C, or Pascal, there 
is a change of mind-set necessary in learning to use it. The effort is not 
great, and we think the results  are well worth it.
 
Call Our Bluff!

Read the following code and decide what you think it does. Then look at the answer underneath it to see if you're correct.

define main

type OneDim = array [ real ];
type TwoDim = array [ OneDim ];

function generate( n : integer
                   returns TwoDim, TwoDim )
   for i in 1, n cross j in 1, n
   returns array of real(i)/real(j)
           array of real(i)*real(j)
   end for
end function % generate

function doit( n : integer; A, B : TwoDim
               returns TwoDim )
   for i in 1, n cross
       j in 1, n
      c := for k in 1, n
              t := A[i,k] * B[k,j]
            returns value of sum t
            end for
   returns array of c
   end for
end function % doit

function main(  n : integer returns TwoDim )
let A, B := generate( n )
in  doit( n, A, B )
end let
end function % main

The Answer:

If you determined that the code on the previous page generated and then 
multiplied two n x n matrices, you were correct. The details of function 
generate are unimportant; we simply needed two matrices, so we built them 
there, according to a completely ad hoc formula. Function doit multiplies 
them. Function main simply invokes the other two functions to accomplish 
this.

So What's the Point?

The point is twofold. First, you probably recognized what the code does, 
which shows that Sisal is easy to read and expressive. Second, the program, 
which runs on just about anything from a PC to a Cray, is not just a program, 
but a parallel program. The fact that there's no reference in it to execution 
environment, number of processors, or any other machine feature, nor to any 
sort of parallelization primitive, shows that Sisal is able to automatically 
manage the parallelism in your program and the resources of the system 
executing it. It's no idle boast to say that this program will give linear 
speedups on a Cray C-90, when a large matrix size is used.

2. Parallel Programming and Other Virtues

It's been said that programming computers is the most difficult abstract 
endeavor ever attempted by human beings. Those who have written complex 
programs, such as large scientific modeling codes, can certainly attest to 
this. Now, however, we are attempting to do something even harder, namely 
write programs for concurrent execution on multiple processors. Parallelism 
adds a new dimension to those already present within programs.

In a sequential program, the order of operations and the locations of data 
are specified, albeit often implicitly thanks to the abstractions available 
in modern programming languages. Parallelism, or what operations can go on 
simultaneously, and how they can interact, is the third dimension. The added 
complexity of parallelism can be staggering. That's why we advocate 
programming in a language with mathematical foundations. Such a formal 
basis allows compilers and run-time support systems to correctly analyze the 
program code and make intelligent decisions about what is safe to execute in 
parallel.

And safety, in the form of program correctness, is of paramount importance, 
here. The number of ways programs can go wrong undetectably is already very 
large. Parallelism adds the possibility of "race conditions", or time-
dependent behavior, to our problems. Such misbehaviors may be extremely 
difficult to find, because they may not show up in every execution, every 
execution environment, or on every platform. Mathematical semantics and 
careful analysis can either spot such dangers or guarantee that they cannot 
happen. This is the case with Sisal.

In fact, Sisal carries a very important guarantee: Sisal programs are 
determinate, regardless of platform or environment. This means that a 
Sisal program will produce the same answers, regardless of what computer it 
executes on, or how many processors (how much parallelism) it uses. This is 
because the Sisal language is based on mathematical principles that give 
syntactically correct Sisal programs very definite formal meanings.

If the Sisal compiler accepts a program, it has extracted complete 
information about what the program does and how its parts interact. In 
languages such as Fortran or C, compilers typically "understand" what the 
parts do, but not how they interact. The Sisal compiler is able to use its 
complete information to automatically reason about how execute the program, 
without possibility of order-dependent or time-dependent errors. This enables 
the parallelism in the program to be exploited with certainty of correctness. 
Determinacy, and the automatic management of parallelism, remove a huge 
burden from the shoulders of the programmer.

But these are not Sisal's only advantages. It is also a very good language 
for rapid prototyping. Sisal programs tend to be shorter, and easier to read 
and modify, than those in other languages. Sisal programmers tend to be able 
to write code faster, since they must describe only what the computation is, 
not the order it must proceed in.

Another very important advantage is that Sisal code can be mixed with C 
or Fortran, to effect hybrid applications, where each language can be used 
for the parts of the work it excels at. For instance, one might chose to use 
mathematical solver libraries written in Fortran from within the parallel 
kernel of a Sisal program. Similarly, one might compose the computational 
kernel of a program in Sisal, so it can deal with the parallelism present in 
the application, while keeping the outer shell in another language, for 
purposes of user interaction, problem setup, and file I/O. We will attempt to 
demonstrate many of these claims in subsequent sections of this tutorial, and 
references will be provided to other sources for further information.

3. Sisal Syntax: Expressions

All Sisal statements are expressions, in exactly the same way that "2+3" is 
one; it has a well-defined value that can be substituted for the original 
expression with no loss of meaning. Each statement of the Sisal language, in 
actual, correct use in a program, has a well-defined value. We say the 
statements "return" these values, as the result of they're having executed; 
but the truth is, the program statements and the values they return, are 
exactly equivalent to each other. Exchanging one for the other has no effect 
on the results of the rest of the program.

This idea may not sound strange, but in fact it does not hold in all 
programming languages; some, such as Fortran and C, allow program expressions 
or statements to have "side effects" in addition to their explicit values. 
Sisal is a "functional" language, and its property of "referential 
transparency" prevents side effects and guarantees a specific and well-
defined meaning for each statement.

The principle of referential transparency derives from mathematics. In 
mathematical derivations, there is often much substitution of equivalent 
expressions for each other, in the search for the most concise or useful 
rendering of a truth. Coding in Sisal is very much like doing math, in its 
reliance on functions and substitutable expressions, and there is usually a 
straightforward transformation from mathematical notation to Sisal code. This 
will become clearer as we explore the nature of the language.

3.1 Expressions
All Sisal statements are expressions, and all expressions evaluate to well-
defined values. An expression is a combination of identifiers and operators 
that is equivalent to a value. Examples of Sisal expressions range from the 
obvious to the arcane. All of the following are correct Sisal expressions:

						1.3
						26
					"hello world"
					  2 + 3 * 2
				  1.5 * 3.14159 / 2.0

These illustrate arithmetic operations on the usual sorts of numeric values, 
and they work just like intuition says they should. Operational precedence in 
Sisal is just what it is in other languages, which is to say that the value 
of the fourth expression above is eight, not ten. Mixing modes or operand 
types is not correct in Sisal. The following are not correct Sisal 
expressions:

					1.3 + 26
				   5 * "good-bye"
				    3.14159 / 2

Two possible corrections for the first one above are

				     1.3 + 26.0
				   1.3 + real(26)

and two for the third one above are

				    3.14159 * 0.5
				    3.14159 / 2.0

There is no possible correct way to multiply a character string by an 
integer, so the second expression above is uncorrectable.

All Sisal statements are expressions; each returns one or more values, and 
each can be used within any other Sisal expression. For instance, the 
following will look familiar:

			Fibonacci(2 + 3) + Factorial(4 * 5)

These are, of course, merely calls to (presumably) user-defined functions, 
with expressions for arguments; one also presumes they each return a single 
numeric value and can thus be added. Such function and expression nesting can 
be carried on to any depth, and the result of evaluating an expression or 
function call can be substituted at the site of invocation without changing 
the resulting value. This is the principle of "referential transparency", the 
mathematical foundation we have alluded to. This principle is all-important, 
and we will refer to it often. Sisal expressions are quite flexible. For 
example, the following is also a legal Sisal expression:

			1 + (if X > Y then 2 else 3 end if)

Here, we have put parentheses in to simplify reading the statement, but they 
are not required. The if-statement compares the magnitudes of X and Y to 
determine which of two values to return. The value of the if-statement is 
then added to 1 to produce the value of the overall expression. The value of 
this entire expression cannot be determined without the values of X and Y, 
but it will have a definite value if they are defined.

All this said, it remains to describe the elements that can make up Sisal 
expressions and then go on to the statements available to the programmer. 
This is available in detail in the Sisal language manual, [33]. Here we will 
merely give some general rules.

3.2 Expression rules

0)  All legal Sisal expressions have values, and all values have types. The 
basic arithmetic types are predefined and available, as discussed below. 
There are also predefined Boolean and character types. Programmers can also 
make up their own data types, as in other languages, such as Pascal, but we 
will save discussing user-defined data types for a later section.

1)  The normal alphabetic letters and numbers, along with the underscore 
character, can be used to make up Sisal identifiers.

					anIdentifier
				   Another_identifier

2)  Identifiers can represent single values or function calls, and their 
names are significant up to 31 characters.

			a_value := calculate_function(x,y,z)

3)  Case is not significant.

					The_Identifier
is the same name as
					tHE_iDENTIFIER

4)  Numerical values are integers, reals, or double_reals. Integers do not 
contain decimal points. Reals either contain decimal points or are expressed 
in scientific notation, with the letter "e" denoting the exponent of ten. 
Double_reals must be expressed in scientific notation, and must have the 
letter "d" denoting the exponent. The numeric literals

						1.
					     1.0
					     1e0
					    1.e0
					   1.0e0

are all reals, and the literals

					    1d0
					    1.d0
					   1.0d0

are all double_reals.

5)  Mixed mode arithmetic is forbidden. Explicit type casting is required. 
The casting operators are simply the names of the desired result types, used 
as functions. For example, to cast the integer value 10 into types real and 
double_real, use

					real(10)
and
				    double_real(10),
respectively.

To cast reals and double_reals into integers, there are three functions: 
integer, trunc, and round. Their operation is described in the Sisal 
language manual. The differences among them are in the way they treat signed 
operands.

6)  The standard arithmetic operators +, -, *, and / are available for 
numeric values and expressions. The divide operator produces truncated 
integer results when applied to integers. Raising a number to a power is 
accomplished by way of the "exp(b,e)" function, which raises the value "b" 
to the power "e". The two arguments to exp may be any combination of numeric 
types. The type of the result will be the higher type of the two arguments, 
where double_real is higher than real which is higher than integer

7) The standard mathematical library functions are available in Sisal via an 
underlying C math library. These include trigonometric, exponential, and 
logarithmic functions. Variants are available for all numeric types.

8)  Boolean values and expressions are possible (and common) in Sisal. The 
primitive Boolean elements are "true" and "false". The  Boolean operators 
are "&" (and), "|" (or), and "~" (not).

9)  Logical operations producing Boolean values are available in Sisal. The 
logical operators are

	>	(greater than)
	>=	(greater than or equal to)
	<	(less than)
	<=	(less than or equal to)
	=	(equal to)
	~=	(not equal to)

10)  Character and string values, expressions, and aggregates are possible in 
Sisal, and their correct uses are described in the Sisal language manual.

The above syntax resembles that of many modern programming languages. Other 
parts of Sisal syntax, such as if-statements, will also look familiar. 
However, the style in which such statements are combined will deviate 
slightly from expectations, so the next aspects of Sisal are areas in which 
it deviates significantly enough from more common imperative languages that 
they deserve sections for themselves.

4. More Syntax

Expressions in Sisal go beyond arithmetic sort, as we have already seen; 
mathematical and computer science considerations arise. For instance, the 
mathematical notion of a conditional expression - one whose value varies 
based on a predicate - is very useful. Also useful is the idea of nested 
scope; scope is a concept applied to names that are associated with values, 
and it specifies the region in which such an association is valid. We will 
describe the application of these ideas to Sisal here, and use them to 
introduce the concept of "arity" as it applies to Sisal programming.

4.1 Arity and Conditionals

All Sisal constructs are expressions, and all expressions have values, but 
sometimes one value is not enough. Therefore, Sisal expressions can return 
multiple values. The number of values returned by an expression is called its 
"arity." Following is an example of a conditional assignment of multiple 
arity:

ROOT1, ROOT2 :=
    if DISCRIM > 0.0 then
        sqrt( DISCRIM ) / DENOM,
        -sqrt( DISCRIM ) / DENOM
    else
        0.0, 0.0
    end if

The if-statement above has arity of two, meaning it returns two values. This 
code fragment is a multi-assignment of two values to two names. Each  branch 
of the if-statement within the assignment must return two values. Further, 
the corresponding result values in each branch must have the same types; this 
means that if an if-statement's then-clause returns two values, say an 
integer and an array of reals, in that order, then the else-clause must also 
return an integer and an array of reals, in that order. In the example above, 
both values are of type real, as seen by the literal zeroes in the else-
branch. In any multiple-valued expression, the number of returned values must 
equal the number of names they are assigned to, and the names and values 
associate from left to right. This means that the statement we are 
considering, above, is equivalent to the following:

ROOT1 := if DISCRIM > 0.0 then
            sqrt( DISCRIM ) / DENOM
         else
            0.0
         end if;
ROOT2 := if DISCRIM > 0.0 then
            -sqrt( DISCRIM ) / DENOM
         else
            0.0
         end if;

In a multiple-valued expression (also termed a multi-expression), the 
individual component expressions or values are separated by commas, as shown 
above. The reader is cautioned not to become confused by the semicolons in 
the program text; semicolons are statement separators. The rules for 
constructing and evaluating expressions, even those of multiple arity, are 
likely to be intuitive to anyone with a background in basic mathematics.

4.2 Let-In Statements

One problem with languages like Sisal is that they tend to be wordy. A large 
or complex expression that is used several times in succession can needlessly 
clutter an otherwise clear program. Sisal takes another hint from 
mathematical convention in allowing a convenience notation known as the "let-
in statement." It basically allows the programmer to associate a name with 
the value of an expression, and then use the name as often and wherever is 
handy. Here's an example of this:

let one := sin(x)*sin(x) + cos(x)*cos(x);
    two := sqrt( one+one+one+one);
    three := one + two
in  one, two, three, sqrt(two)
end let

This shows the names "one", "two", and "three" being assigned values in the 
let-clause, and demonstrates the utility of the let-in statement. The name 
"one" gets a value based on several function calls; "two" gets a value that 
uses the value of "one"; and  "three" gets a value based on both the 
previous values. Then in the in-clause, all three names are used to specify 
four values, which constitute the value of the let-in statement. This let-in 
statement has arity four.

The if-statement and the let-in statement typify statements that can have 
multiple arity. The let-in statement brings us to the idea of nested scopes. 
Each instance of a let-in statement constitutes a scope wherein names may be 
assigned to values, or defined. Names used in such a way are invisible 
outside the let-in statement in which they are defined. This allows names to 
be reused inside nested scopes, and such nesting can go on to effectively any 
depth. Here is an absurd example of such a legal (but unwise) nesting:

let x := 1;
    y := let x := 2;
             y := 3
         in  x + y
         end let
in x + y
end let

The value of this nested pair of let-in statements is 6, and in spite of its 
tendency to confuse human readers, it is legal Sisal syntax. Many clearer and 
more compelling uses for statement nests can be found, and will readily occur 
to an experienced Sisal programmer.

In both the if-statement and the let-in statement, the statements are 
terminated by an end-if and end-let, respectively. These  bracketing keywords  
help both the compiler and the reader keep track of where they are in the 
code, and establishes the pattern for all Sisal statement types. Each of 
these two statement types can be used together and nested in any convenient 
way, as long as commas, semicolons, and end-keywords are placed correctly. 

The best way to appreciate such syntactic elements is to use them, so let us 
now go on to some exercises.

5. Exercise Set 1

Exercise 1.1: The Circle
Given:  a real value, r, for the radius of a circle;
Compose: a let-in statement that returns the area and circumference of a 
circle of the given radius.

Exercise 1.2:  Statistics
Given: four values, a, b, c, and d, all real numbers;
Given: the formulas for the mean and standard deviation
            mean = (a+b+c+d)/4;
            standard deviation = (((a-mean)**2 + (b-mean)**2 +
                                   (c-mean)**2 + (d-mean)**2) / 3)**0.5
Compose: a let-in statement that returns the mean and standard deviation of 
the given set of numbers.

Exercise 1.3:  Cubic Roots
Given: two real values, a, and  b, coefficients of the cubic
            equation x**3 + ax + b = 0;
Given: the formulas for the roots x = C + D,
                                      -(C + D)/2 + 3**0.5*(C - D)/2, and
                                      -(C + D)/2 - 3**0.5*(C - D)/2,
            where C = (-b/2 + ((b**2)/4 + (a**3)/27)**0.5)**(1/3), and
                  D = (-b/2 + ((b**2)/4 - (a**3)/27)**0.5)**(1/3);
Compose: a let-in statement that returns the roots of the cubic equation.

Answers to Exercise Set 1

Exercise 1.1: The Circle
Given:  a real value, r, for the radius of a circle;
Compose: a let-in statement that returns the area and circumference of a 
circle of the given radius.
area, circumference :=
        let pi := 3.14159
        in  pi * r * r,
            2.0 * pi * r
        end let

Exercise 1.2:  Statistics
Given: four values, a, b, c, and d, all real numbers;
Given: the formulas for the mean and standard deviation
            mean = (a+b+c+d)/4;
            standard deviation = (((a-mean)**2 + (b-mean)**2 +
                                   (c-mean)**2 + (d-mean)**2) / 3)**0.5
Compose: a let-in statement that returns the mean and standard deviation of 
the given set of numbers.
let mean := (a + b + c + d) / 4.0;
    a_diff := a - mean;
    b_diff := b - mean;
    c_diff := c - mean;
    d_diff := d - mean;

    a_diff_sq := a_diff * a_diff;
    b_diff_sq := b_diff * b_diff;
    c_diff_sq := c_diff * c_diff;
    d_diff_sq := d_diff * d_diff;

    diff_sq_sum := a_diff_sq + b_diff_sq +
                   c_diff_sq + d_diff_sq;

    sigma := sqrt( diff_sq_sum / 3.0 )

in  mean, sigma

end let

Exercise 1.3:  Cubic Roots
Given: two real values, a, and b, coefficients of the cubic
            equation x**3 + ax + b = 0;
Given: the formulas for the roots x = C + D,
                                      -(C + D)/2 + 3**0.5*(C - D)/2, and
                                      -(C + D)/2 - 3**0.5*(C - D)/2,
            where C = (-b/2 + ((b**2)/4 + (a**3)/27)**0.5)**(1/3), and
                  D = (-b/2 + ((b**2)/4 - (a**3)/27)**0.5)**(1/3);
Compose: a let-in statement that returns the roots of the cubic equation.
x1, x2, x3 :=
let term1   := -b / 2.0;
    term2   := b * b / 4.0;
    term3   := a * a * a / 27.0;
    radical := sqrt(term2 + term3);
    C       := exp(term1 + radical, 1.0/3.0);
    D       := exp(term1 - radical, 1.0/3.0);
in  C + D,
    -(C + D)/2.0 + sqrt(3.0) * (C - D)/2.0,
    -(C + D)/2.0 - sqrt(3.0) * (C - D)/2.0
end let 

6. User-Defined Types and Dynamic Values

Another area where we offer convenience to the programmer is in specifying  
data types. Sisal comes with the standard complement of built-in data types, 
as described above, but these can be used to build further types. Arrays and 
records are the two most important sorts of user-defined date types. But 
before we describe this, a bit more Sisal philosophy is necessary to set the 
stage for it.

In conventional programming languages, programmers use "variables" as names 
in their programs. These names are, strictly speaking, equivalent to storage 
locations. When a variable name is used, it implicitly means that the value 
stored in that location is wanted. This is why statements such as "I = I + 1" 
make sense in languages like as Fortran. It means that the contents of 
location "I" are being changed by an addition operation.

In Sisal, however, names refer to values, not locations. We really do not 
care about storage in Sisal, so the location of a value is unimportant to us; 
only the values themselves concern us in writing Sisal code.  A statement 
such as the increment operation above is incorrect in Sisal, both 
syntactically and semantically. The name "I" must correspond to a value, and 
if a new or different value is desired, it must be given a new name. But this 
need not be as awkward in practice as it may seem to be in concept. It is 
consistent with mathematical convention. In Sisal programs, values and their 
associated names have transitory natures, existing only between the times of 
their creation and their final use by other program elements.

Here's the the let-in example we last dealt with:

let one := sin(x)*sin(x) + cos(x)*cos(x);
    two := sqrt(one+one+one+one);
    three := one + two
in  one, two, three, sqrt(two)
end let

In it, the names "one", "two", and "three" are defined and used, and are then 
discarded. They are said to be undefined outside the scope of that statement. 
This dynamism, while familiar to mathematicians, is quite foreign to most 
programmers, and takes some getting used to. User-defined data types in Sisal 
reflect the dynamic nature of the language as well as names, but in a 
slightly different way.

6.1 Aggregate Data Types

The most important of the Sisal data types are the aggregates, that is, array 
and record types. The use of aggregates in Sisal, as in other languages, give 
the programmer leverage in manipulating program data with minimal effort at 
naming. They allow giving  ensembles of data single names, and grouping 
similar data items in intuitive ways. They also allow easy identification of 
the potential for parallel execution, as we will see in the section on loops. 
First, though, we briefly describe the two sorts of aggregate data types.

6.1.1 Array Types
In Sisal, as in other languages, arrays are collections of elements of 
homogeneous types. In Fortran, for example, an array type is specified by the 
number of dimensions it has, and by its static size at compile-time. An array 
defined to be of size 100 by 50 can never change size or shape. In Sisal, the 
number of dimensions an array contains is specified by its type, but its 
sizes in each dimension are completely dynamic. But let's take this in small 
steps; consider the Sisal type declarations shown below.

type float = double_real;
type One_Dim_R = array [ float ];
type One_Dim_I = array [ integer ];
type Two_Dim_I = array [ One_Dim_I ];

The first statement defines a convenient type-name, "float", by which to 
refer to values of type double_real. The second defines a convenient type-
name for an array whose elements are of type float. The third line specifies 
a type that is an array of integer elements, and the fourth line specifies 
an array of elements which are themselves one-dimensional arrays; this fourth 
type is an example of what we mean by a two-dimensional array in Sisal.

Note that in the array type declarations, no sizes are specified; this is 
because size and index range are not characteristics of array types, but only 
of specific values which are instances of array types. Arrays of type  
One_Dim_R can have as many elements as desired in the program; they can also 
have any index range the programmer finds expedient, as long as the upper 
index bound is at least as large as the lower bound. The same is true for 
arrays of type Two_Dim_I. Furthermore, the one-dimensional arrays making up 
the rows of a two-dimensional array can all have different sizes and index 
ranges, leading to what are called "ragged arrays".

These matters will be made clearer in the examples to follow. For the moment, 
the key point to focus on is that all values are dynamic and transitory in 
Sisal programs. If a statement returns, or evaluates to, a value that is a 
one-dimensional array, that array's size and index range are defined solely 
by the statement whose value it is; and, the value itself exists only until 
it is finally consumed by a further statement in the program.

One other matter concerning arrays should be mentioned at this time: indexing 
and logical structure. We have just seen that a two-dimensional array may be 
considered an array of one-dimensional arrays. This inductive hierarchical 
structuring continues into greater numbers of dimensions; for instance, a 
three-dimensional array is an array of two-dimensional arrays. When indexing 
an array, the elements accessed get smaller (or simpler) as more indices are 
specified from left to right. For instance, if we have the type declarations 
shown above, and array A is of type Two_Dim_I, then A[6] is the sixth row 
of A, and A[4,7] is the seventh element of the fourth row. It is also 
correct syntax to specify this element as A[4][7], which clarifies the fact 
that we are specifying the seventh element of the fourth element of A. 

There is no notation in Sisal for specifying a single column of a two-
dimensional array, nor is there a correct notation for omitting the left or 
middle index from an access to a three dimensional array. Indices must be 
supplied from the left and omitted, if they are not all supplied, from the 
right.

Similarly, suppose we add the following declaration to the above set:

type Three_Dim_I = array [ Two_Dim_I ];

And, suppose we have an array B of type Three_Dim_I. If we index as B[i], 
we are specifying the two-dimensional array formed by all the i-th rows of 
each plane. The B[i,j]-th element is the row of all the elements from row i, 
column j, in each plane. This indexing treatment is termed "row_major," and 
means that in walking through an array of any dimensionality, the leftmost 
array index varies fastest and the rightmost varies slowest. This is 
consistent with the indexing conventions found in the C language but not the 
same as in Fortran.

6.1.2 Record Types
Records are collections of elements of  heterogeneous types. The best way to 
see this is to define an example.

type element_record = record [ name : array [ character ];
                               number : integer;
                               weight : real ]
type atomic_table = array [ element_record ];

Here we see a record type for chemical elements, containing fields for a 
name, atomic number, and atomic weight, with each field having a different 
type. We also see an array type for a collection of element records. Such 
aggregations can be nested to effectively any useful depth.

The fields of record-type data items are accessed by giving the name of the 
record-type value, followed by a period and the field name. Given the above 
declarations, suppose we have the value Fe, of type element_record. Then 
specifying Fe.name could yield the value "Iron" and specifying Fe.number 
could yield the value 56. For a value named Column1 of type atomic_table, 
we might have the access Column1[1] yield a record of type element_record 
whose contents are [name: "Hydrogen"; number: 1; weight: 1.0].

In practice, records usually are aggregated into arrays, and they sometimes 
contain array-type fields. This is natural, convenient, and will be familiar 
to users of languages such as Pascal. While some of the standard uses for 
records, such as implementing stacks and queues, can be accomplished with 
arrays of records, they can also be done using another built-in aggregate 
type called a "stream." Streams will not be dealt with here, and the 
interested reader is referred to the Sisal Language Manual [33].

6.1.3 A Few Examples of Array and Record Use
Array and record types make composing readable Sisal programs easy and 
convenient. The syntax rules for using them correctly arise immediately from 
the semantic rules we have already touched on: every Sisal statement is an 
expression, and all expressions return specific values of specific types. 
Consider the following example:

let x_diff := x[1] - x[2];
    y_diff := y[1] - y[2];
    xd_sq := x_diff * x_diff;
    yd_sq := y_diff * Y_diff
in  sqrt( xd_sq + yd_sq )
end let

This is the calculation of the magnitude of separation of two two-space 
vectors. The  vectors are named x and y, and their components are indexed in 
the usual way, using square brackets to separate the index expressions from 
the value names. Alternatively, we could represent the vectors as records, 
whose fields contain the components, and the calculation would them be as 
follows:

let x_diff := v1.x - v2.x;
    y_diff := v1.y - v2.y;
    xd_sq := x_diff * x_diff;
    yd_sq := y_diff * Y_diff
in  sqrt( xd_sq + yd_sq )
end let

Here we have the two records, named v1 and v2, with fields in each named x 
and y. The record notation allows us to name the fields, and refer to them by 
use of the record and field names separated by a period. A third alternative 
we could use would allow a combination of these two notations, as follows:

let x_diff := vectors[1].x - vectors[2].x;
    y_diff := vectors[1].y - vectors[2].y;
    xd_sq := x_diff * x_diff;
    yd_sq := y_diff * Y_diff
in  sqrt( xd_sq + yd_sq )
end let

Here we have an array, named vectors, whose elements are records. We index 
the individual records as array elements, and their components as record 
fields. But both these examples assume aggregates that already exist. How 
does one go about creating them?

6.1.4 Record Creation
To illustrate record creation, we will use the example declarations we used 
earlier:

type element_record = record [ name : array [ character ];
                               number : integer;
                               weight : real ]
type atomic_table = array [ element_record ];

When creating items of record types, we specify the matched pairs of field 
names and contents, in the order given in the type definition. Here are 
examples:

let hydrogen := record element_record [ name: "Hydrogen"; number: 1;
                                            weight: 1.0 ];
    helium   := record element_record [ name: "Helium"; number: 2; 
                                        weight: 4.0 ];
    element_115 := record element_record [ name: ""; number: 115;
                                           weight: -999.999];
    column1 := array atomic_table [ 1: hydrogen, helium, element_115 ]

in hydrogen, helium, element_115, column1
end let

Here we see three records being created, each of which has its type, 
element_record, specified along with it. This is sometimes optional, but 
it's a good practice to adopt, nonetheless, as it helps clarify the code to 
human readers. The first two records, hydrogen and helium have their field 
names and values specified in the order given in the original declaration, 
and with reasonable values. The field names must be paired with the field 
values, with a colon separating each name and value, and a semicolon 
separating the pairs, as in the definition of the type. In the third record, 
element_115, we see and empty string used for the name and a negative value 
for the weight, used as flag values. This is still a legal record, since all 
the fields are specified in order and with correct types.

The fourth item created and returned above is an array named column1 of type 
atomic_table. We will see more about creating arrays in the next section.

6.1.5 Array Creation
Arrays offer the greatest number of options for data creation. They are most 
often created in loops, in which context they offer great power to the 
programmer in the expression of parallelism. However, they can also be 
created in other ways. Since we have not yet discussed loops, we will 
illustrate the other creation methods, using the declarations of array types 
we used above: 

type float = double_real;
type One_Dim_R = array [ float ];
type One_Dim_I = array [ integer ];
type Two_Dim_I = array [ One_Dim_I ];

With these declarations, we can define a set of arrays in the following code:

let x := array One_Dim_R [ 1,4: 1d0, 2d0, 3d0, 4d0 ];
    a := array One_Dim_I [ 0,4: 1, 3, 5, 7, 9 ];
    b := array           [ 1: 2, 4, 6, 8 ];
    c := array Two_Dim_I [ 3: a, b, a ]
    d := array_fill( 2, 10, 0d0 );
    e := array One_Dim_I [];
in  foo(x[2], a, b, c[4,4], d, e)
end let

7. Exercise Set 2

Exercise 2.1: Three Circles
Given: an array of three real values, r, for the radii of three circles;
Compose: a let-in statement that returns two arrays containing the areas and 
circumferences of the circles of the given radii.

Exercise 2.2:  More Statistics
Given: four three-element arrays, a, b, c, and d, all containing real 
numbers;
Given: the formulas for the mean and standard deviation
            mean = (a+b+c+d)/4;
            standard deviation = (((a-mean)**2 + (b-mean)**2 +
                                   (c-mean)**2 + (d-mean)**2) / 3)**0.5
Compose: a let-in statement that returns an array of records, each containing 
the mean and standard deviation of one of the given sets of numbers.

Answers to Exercise Set 2

Exercise 2.1: Three Circles
Given: an array of three real values, r, for the radii of three circles;
Compose: a let-in statement that returns two arrays containing the areas and 
circumferences of the circles of the given radii.

let pi := 3.14159;
    area1 :=  r[1] * r[1] * pi;
    area2 :=  r[2] * r[2] * pi;
    area3 :=  r[3] * r[3] * pi;
    circ1 := 2.0 * r[1] * pi;
    circ2 := 2.0 * r[2] * pi;
    circ3 := 2.0 * r[3] * pi;

in  array [ 1: area1, area2, area3 ],
    array [ 1: circ1, circ2, circ3 ]
end let

Exercise 2.2:  More Statistics
Given: four three-element arrays, a, b, c, and d, all containing real 
numbers;
Given: the formulas for the mean and standard deviation
            mean = (a+b+c+d)/4;
            standard deviation = (((a-mean)**2 + (b-mean)**2 +
                                   (c-mean)**2 + (d-mean)**2) / 3)**0.5
Compose: a let-in statement that returns an array of records, each containing 
the mean and standard deviation of one of the given sets of numbers.

let mean_a := (a[1] + a[2] + a[3])/3.0;
    mean_b := (b[1] + b[2] + b[3])/3.0;
    mean_c := (c[1] + c[2] + c[3])/3.0;
    mean_d := (d[1] + d[2] + d[3])/3.0;

    a_diff_sq_sum := (a[1] - mean_a) * (a[1] - mean_a) +
                     (a[2] - mean_a) * (a[2] - mean_a) +
                     (a[3] - mean_a) * (a[3] - mean_a);
    b_diff_sq_sum := (b[1] - mean_b) * (b[1] - mean_b) +
                     (b[2] - mean_b) * (b[2] - mean_b) +
                     (b[3] - mean_b) * (b[3] - mean_b);
    c_diff_sq_sum := (c[1] - mean_c) * (c[1] - mean_c) +
                     (c[2] - mean_c) * (c[2] - mean_c) +
                     (c[3] - mean_c) * (c[3] - mean_c);
    d_diff_sq_sum := (d[1] - mean_d) * (d[1] - mean_d) +
                     (d[2] - mean_d) * (d[2] - mean_d) +
                     (d[3] - mean_d) * (d[3] - mean_d);

    sigma_a := sqrt( a_diff_sq_sum / 2.0 );
    sigma_b := sqrt( b_diff_sq_sum / 2.0 );
    sigma_c := sqrt( c_diff_sq_sum / 2.0 );
    sigma_d := sqrt( d_diff_sq_sum / 2.0 );

    a_record := record [ mean: mean_a; st_dev: sigma_a ];
    b_record := record [ mean: mean_b; st_dev: sigma_b ];
    c_record := record [ mean: mean_c; st_dev: sigma_c ];
    d_record := record [ mean: mean_d; st_dev: sigma_d ];

in  array [ 1: a_record, b_record, c_record, d_record ]
end let

8. Loops and Parallelism

The preceding exercise set should have motivated the need for looping 
constructs. Performing repetitive operations is verbose and tedious without 
them. In most conventional languages, no distinction is made between the two 
basic kinds of loops, those in which the computations in the loop body are 
independent, and those in which they not. The first kind contains the 
potential for parallel execution, while the second kind, which we normally 
call "iteration" does not. It is the first kind, called the "for_loop", which 
is the easiest and most rewarding to use, in terms of return for effort, and 
we will discuss it first.

8.1 For-Loops

The Sisal for-loop is used for repetitive computations that are independent, 
meaning that no information flows among them. Examples of this might be the 
incrementing of an array of counters, or the element-wise addition of two 
vectors. None of the individual increment or addition operations depends on 
any of the others, so, potentially, they can all occur simultaneously, or in 
parallel. The Sisal for-loop is designed to expose this potential, so the 
language translation and support software can determine whether and how to 
exploit it. This allows the programmer to concentrate on the algorithms used 
and the problems being solved, and not get bogged down in the details of how 
the computer executes the program implementing the algorithm.

The Sisal for-loop contains three parts: the range, the body, and the returns 
clause. The syntax looks like this:

     for {range}
        {optional body}
     returns {returns clause}
     end for

The range specifies the set of values controlling the number of instances of 
the body and returns clause that will be generated. Typically, it consists of 
an index identifier and a set of numeric values, such as "i in 1, n," but it 
can also consist of a pair of names for an element and an aggregate, such as 
"a in counters". In these ways, the number of instances of the loop that 
will be calculated is established, as well as the specific information needed 
by each one.

The loop body is optional, and may contain the definitions of the values that 
are to be returned or any intermediate values the calculation needs.

The returns clause controls which and how many data values will be returned 
by the loop, and it contains one or more aggregation or reduction operators 
(which will look familiar). But before we explain them, let's look at a 
simple for-loop that increments an array of counters, to see how these 
elements interact in practice.

for i in 1, num_counters
   new_count := counters[i] + increment[i]
returns array of new_count
end for

Here we see a loop whose range is from 1 to the value named "num_counters;" 
we assume this value exists and makes sense in the present context. In the 
second line of the loop we see this loop's body. In this loop, the body is 
where all the work is done, namely the addition of the elements of the two 
vectors "counters" and "increment." Both vectors are assumed to be of such 
size that there will be an element of each available for the entire range of 
values of the loop index, "i." For each value of this index, a new value is 
calculated, named "new_count," and all instances of this calculation are 
independent. Finally, in the returns clause, all the instances of 
"new_count" are aggregated into an array, and that array is returned as the 
value of the loop.

Recalling that its index range is an integral part of a Sisal array, we must 
specify the index range of the array returned by this loop. This is simple 
and automatic; it is simply the set of values in the loop's range. In the 
current example, the loop's returned array has range [1..num_counters].  
This is characteristic of all Sisal for-loops that return arrays, except for 
cases we will discuss in a future section on advanced looping concepts.

As before, the for-loop is an expression, and has a well-defined value, which 
is the value defined by its returns clause. In the above case, it is an 
array. However, we can also return reduced values, such as in the following 
loop:

for i in 1, num_counters
   new_count := counters[i] + increment[i]
returns value of sum counters[i]
        value of sum new_count
        array of new_count
end for

This loop returns three values: a scalar (the sum of all the values in the 
array "counters"), an array (containing all the values of "new_count"), 
and another scalar (the sum of all the values of "new_count"). This 
demonstrates aggregation and reduction in the same loop.

The body of a for-loop is similar to a let-in statement, in that temporary 
names can be defined in it that are used subsequently in the loop and/or in 
the returns clause. But strictly speaking, the loop body is not generally 
needed at all. The following two loops have the same value as the one 
immediately above:

for i in 1, num_counters
   old_count := counters[i];
   new_count := old_count + increment[i]
returns value of sum old_count
        value of sum new_count
        array of new_count
end for

for i in 1, num_counters
returns value of sum counters[i]
        value of sum counters[i] + increment[i]
        array of sum counters[i] + increment[i]
end for

The first of these two loops demonstrates that names, such as "old_count" 
may be assigned to values within the loop, and then used both in the loop and 
in the returns clause.  Again, a different instance of "old_count" is 
generated for each value of the loop's range, and none of the instances can 
affect each other in any way outside of the loop's returns clause. This 
guarantees that no matter how many or how few of the loop instances are 
computed in parallel, or in what order they are computed, that same set of 
intermediate values will be generated and the same set will be returned as 
the value(s) of the loop itself.

The second of the above loops demonstrates that all the work can occur in the 
loop's returns clause, leaving an empty loop body. This in no way changes 
what we just said:  each instance of the loop's calculation is independent of 
all the others, and they can occur in any order and give the same results.

There is one more thing to consider before we leave this section, and that is 
the alternate form of range generator we alluded to above - the one 
containing a pair of identifiers. Simply speaking, if we don't need the loop 
index as a numerical value, we don't have to use it. We can instead refer by 
name to the individual elements of the array we are working on, as in the 
following example:

for elt in counters
   new_count := elt + 1
returns value of sum elt
        value of sum new_count
        array of new_count
end for

This loop assumes that the same value will be used to increment each element 
of the array "counters," namely one. So, no loop index is needed, since each 
instance of the loop gets one instance of "elt" from the array "counters" 
to work with. More advanced forms of loop range generation are also possible, 
and we will discuss them in a later section.

9. Exercise Set 3

Exercise 3.1: Dot Products
Given: two vectors, x and y, of equal size and range;
Compose: a loop that returns their dot (or inner) product.

Exercise 3.2: Vector Max and Min
Given: two vectors, x and y, of equal size and range;
Compose: a loop that returns two vectors, containing the larger and smaller 
elements, respectively, of each position in the two vectors.

Exercise 3.3: Vector Sum and Difference
Given: two vectors, x and y, of equal size and range;
Compose: a loop that returns four vectors, containing, respectively, the 
element-wise sum and difference of the two vectors, and the sums of the 
elements in each of the two vectors.

Answers to Exercise Set 3]

Exercise 3.1: Dot Products
Given: two vectors, x and y, of equal size and range;
Compose: a loop that returns their dot (or inner) product.

for i in 1, size
   elt_prod := x[i] * y[i]
returns value of sum elt_prod
end for

Exercise 3.2: Vector Max and Min
Given: two vectors, x and y, of equal size and range;
Compose: a loop that returns two vectors, containing the larger and smaller 
elements, respectively, of each position in the two vectors.

for i in 1, size
   minimum,
   maximum := if x[i] < y[i] then
                 x[i], y[i]
              else
                 y[i], x[i]
              end if
returns array of maximum
        array of minimum
end for

Exercise 3.3: Vector Sum and Difference
Given: two vectors, x and y, of equal size and range;
Compose: a loop that returns four vectors, containing, respectively, the 
element-wise sum and difference of the two vectors, and the sums of the 
elements in each of the two vectors.

for in 1, size
returns array of x[i] + y[i]
        array of x[i] - y[i]
        value of sum x[i]
        value of sum y[i]
end for

10. More on Loops

In this section, we tackle some advanced aspects of looping. While we're 
going to do this in the context of parallel loops, for simplicity's sake, 
most of what we say will also apply to sequential, or iterative, looping, 
which we are saving for a later section. We have already seen two forms of 
range generators, and two forms of returns clauses. Sisal has more to offer 
in both of these areas. As in all languages, looping constructs can be nested 
to effectively any depth. Nested looping is one way to generate 
multidimensional arrays. There are also many ways to return values, both 
simple and compound, from loops. Here we will explore these topics in a bit 
more detail, since they add greatly to the power of the language in 
expressing algorithms and exploiting parallelism.

10.1 Range Generators

The possibilities for simple loop range generators extends to the combination 
of the two we have already seen. For simplicity, we will show examples of the 
three together before describing them.

        for i in 1, n

        for a in a_array

        for a in a_array at i

Here, we see the two familiar forms followed by a new hybrid form. The first 
one, as we have said, generates a set of independent values, all named i, 
from the inclusive range [1,n]. The second generates a set of independent 
values, all named a, from the aggregate named a_array. The third generates 
both the element a, from the aggregate a_array, and the index of a, i, in 
the index range of a_array. This last form may seem strange, but it is 
useful in situations where the index range of an array is not known ahead of 
time.

In addition to the above simple range generators, we can have compound or 
multidimensional ones, such as the following:

        for i in m, n cross j in p, q

        for a in a_array cross b in b_array

        for k in r, s dot l in x, y

        for c in c_array dot d in d_array

Here we have the first two forms of simple range generator combined into 
compound generators. The keywords "dot" and "cross" determine how the index 
or element value sets will be formed. In the first example, we have an index 
value set consisting of the cross product of the ranges from the two simple 
generators. In other words, every value of i in the range [m,n] will be 
paired with every value of j in the range [p,q]. In the second example, we 
have another cross product, with every element a from a_array paired with 
every element i from b_array.

In the dot-product range generators, we again have a pairing of values, but 
no value from either set is used more than once. In the third example, the 
first value of k will be r, and it will be paired with the first value of l, 
which will be x. The second values of k and l will be r+1 and x+1, 
respectively; this continues until the smaller-sized range is completely used 
up, at which point the loop's range is complete. Similarly, in the fourth 
example, the first value of c will be the first element of c_array, and it 
will be paired with the first element of d_array. In the case of arrays, 
when we use the term "first element" we mean the element with the lowest 
index.

These compound ranges are very powerful, and they can be extended effectively 
without limit. Semantically, compound ranges are equivalent to loop nests. 
The following two loops are equivalent in range and in the types of values 
they return:

for i in m, n cross j in p, q
   x := i + j
returns array of x
        value of sum x
end for

for i in m, n
   x_array, x_scalar := for j in p, q
                           x := i + j
                        returns array of x
                                value of sum x
                        end for
returns array of x_array
        value of sum x_scalar
end for

Each of the above two loops produces a two-dimensional array and a scalar. 
The index range of the arrays are [m, n] by [p, q], and the elements in 
corresponding positions of the arrays are identical.

Just to make certain all this is clear, we should consider a few more 
examples. The following loop sums all the elements of a two_dimensional 
array:

for i in 1, num_rows cross j in 1, num_columns
returns value of sum a[i,j]
end for

The next example returns the vector consisting of the major diagonal of a 
two_dimensional array:

for i in 1, num_rows dot j in 1, num_columns
returns array of a[i,j]
end for

The above example is a bit tricky. Since the array a may not be square (i.e., 
have the same number of rows as it does columns), the resulting vector will 
actually be the major diagonal of the largest square subarray of array a.

Here's another tricky example, the transposition of a rectangular array:

for j in 1, num_columns cross i in 1, num_rows
returns array of a[i,j]
end for

The above deserves careful study. It hasn't been emphasized yet, but the 
shape of an array returned by a loop nest is strictly determined by the order 
of the ranges of the loops as well as the ranges themselves. The array a, 
above, has shape [1, num_rows] by [1, num_columns], which is to say it has 
num_rows rows and num_columns columns. We want the resulting transposed 
array to have num_columns rows and num_rows columns, and to have as it's 
[i,j]-th element the [j,i]-th element of a. We must build the array in the 
shape we want, so we specify the outer loop to run over the desired index 
range for rows, and the inner loop to run over the that for columns. We then 
must be careful to access the elements of the original array, a, so as to 
avoid addressing outside of its index ranges. If we do all that correctly, 
the resulting value returned by the loop will be correctly transposed in 
size, shape, and element values. The reader should take the time firmly 
understand the above loop before proceeding with the next set of exercises.

10.2 Returns Clauses

We have also merely scratched the surface in the area of returns clauses. We 
have seen the use of the "array of" aggregation and the "value of sum" 
reduction. There are also "value of" and "stream of" aggregations, which we 
will not go into at this point, and reductions of the form "value of 
product", value of least", "value of greatest", and "value of catenate". The 
sum, product, least, and greatest reductions produce scalars, and operate on 
numeric and Boolean types. Here's an example of their use in a nonnumeric 
context:

for B_val in Boolean_array
returns value of sum B_val
        value of product B_val
end for 

This loop's return clause performs the inclusive "or" and the "and" functions 
of the elements of Boolean_array. The value of sum b_val reduction 
performs a Boolean summation, which returns true if any of its operands are 
true, and value of product b_val reduction performs a Boolean 
multiplication, returning true only if all its operands are true.

The catenate reduction produces a flattened aggregate. This latter point is 
nontrivial. If a two-level loop produces an array via the "array of" 
aggregation, it will be a two-dimensional array, with ranges corresponding to 
those of the loops. If, on the other hand, it returns "value of catenate", 
the resulting array will be one-dimensional, consisting of all the rows 
produced by the inner loop joined together into a larger vector. A few 
examples at this point would likely be helpful.

The following loop finds the largest and smallest elements of a three-
dimensional array:

for a_two_d in a_three_d cross
    a_one_d in a_two_d cross
    a_elt in 1, a_one_d

returns value of least a_elt
        value of greatest a_elt
end for

The following loop flattens a three-dimensional array into a one_dimensional 
vector by concatenating all its rows together:

for row_plane in a_3d_array cross
    row in row_plane

returns value of catenate row
end for 

The number of elements of the array resulting from the above loop will be the 
sum of the numbers of elements in each row of a_3d_array. The index range 
will begin at the index range of the first row and proceed upward from there.

10.3 Masked Returns Clauses

The returns clause is capable of aggregating and reducing values computed in 
its for-loop, and it can also exercise selection criteria in determining 
which values to so treat. This is accomplished with one or more "masks," 
which are simply predicates that can be applied to the loop-generated values 
before further action is taken on them by the returns clause. These masks 
take the form of the keywords "when" or "unless" followed by a logical 
expression. The truth value of the logical expression interacts with the 
chosen mask keyword to determine whether the loop-generated values will be 
among those returned. Here's an example of a masked loop:

for val in an_array
returns value of sum 1 when val > 0.0
end for 

The above loop sums a set of integer ones corresponding to the positive 
elements of an_array. This is a simple way of counting the positive 
elements. Another common way to use masked returns clauses is in weeding out 
values from a set to be aggregated and returned, such as the following 
situation:

for i in 1, number_of_values
    new_val := deep_thought( value[i] )
returns array of new_val unless new_val > upper_val | new_val < lower_val
end for 

The above loop returns an array of only those instances of new_val that fall 
within the desired range [lower_val, upper_val]. The array so generated 
will not have any empty elements; every element in its index range will have 
a corresponding legal value, but its index range will not necessarily be 
equal to that of the loop, so care must be taken in subsequent dealings with 
it.

The for-loop is incapable of doing the sorts of calculations that involve 
information moving between the instances of the loop's calculation, such as 
iterative convergence calculations. It is also incapable, as we now know it, 
of doing the more general form of looping across irregular index ranges or 
value sets. The first of these problems will be dealt with when we discuss 
the other form of Sisal loop statement, and the second when we deal with 
advanced looping concepts. For now, let us do a few exercises that will 
expose more of the power of Sisal loops.

11.Exercise Set 4

Exercise 4.1: Multidimensional Dot Products
Given: two two_dimensional arrays, x and y, of equal size and range;
Compose: a loop that returns an array of dot (or inner) product of each pair 
of corresponding rows.

Exercise 4.2: Multidimensional Max
Given: two two_dimensional arrays, x and y, of equal size and range;
Compose: a loop that returns the value of the largest dot (or inner) product 
of each pair of corresponding rows.

Exercise 4.3: Character Exclusion
Given: an array of characters, chr_array;
Compose: a loop that returns an array of the characters from chr_array that 
are either alphabetic characters or the space character.

Exercise 4.4: General Character Exclusion
Given: an array of characters, chr_array and an array of forbidden  
characters, forbidden_chr_array;
Compose: a loop that returns an array of the characters from chr_array that 
are not elements of the array forbidden_chr_array.

Answers to Exercise Set 4

Exercise 4.1: Multidimensional Dot Products
Given: two three_dimensional arrays, x and y, of equal size and range;
Compose: a loop that returns an array of dot (or inner) product of each pair 
of corresponding rows.

for x_row in x dot y_row in y
returns array of for i in 1, row_size
                 returns value of sum x_row[i] * y_row[i]
                 end for
end for

Exercise 4.2: Multidimensional Max
Given: two two_dimensional arrays, x and y, of equal size and range;
Compose: a loop that returns the value of the largest dot (or inner) product 
of each pair of corresponding rows.

for x_row in x dot y_row in y
returns value of greatest for i in 1, row_size
                          returns value of sum x_row[i] * y_row[i]
                          end for
end for

Exercise 4.3: Character Exclusion
Given: an array of characters, chr_array;
Compose: a loop that returns an array of the characters from chr_array that 
are either alphabetic characters or the space character.

for char in chr_array
returns array of char when (char >= 'A' & char <= 'z') | char = ' '
end for

Exercise 4.4: General Character Exclusion
Given: an array of characters, chr_array and an array of forbidden  
characters, forbidden_chr_array;
Compose: a loop that returns an array of the characters from chr_array that 
are not elements of the array forbidden_chr_array.

for chr in chr_array
returns array of chr unless for forbidden_chr in forbidden_chr_array
                            returns value of sum chr = forbidden_chr
                            end for
end for

12. Advanced Array Operations

The exercises involving arrays up to this point may have made the point that 
further smoothing is needed in how Sisal handles arrays. There are indeed 
facilities to make array operations friendlier. For instance, we can acquire 
the size and bounds of arrays by using the Sisal primitives array_size(), 
array_liml(), and array_limh(), respectively. These primitives allow us 
to form loops such as the following:

for i in array_liml(X), array_limh(X) cross
    j in array_liml(X[i]), array_limh(X[i])
    q := deep_thought(x, i, j)
returns array of q
end for

This sort of thing can be very useful when dealing with ragged arrays. For 
instance, the situation often develops in the modeling of physical systems 
that a system array with unchanging boundaries must be evolved and returned 
with the original boundaries. In the following example, we will use the Sisal 
operator for array concatenation, "||".

let new_core   := evolve(system_array);        % new system with no 
boundaries
    top_indx   := array_liml(system_array);    % positions of top and bottom
    btm_indx   := array_limh(system_array);    %   boundaries from old system

    new_top    := system_array[top_indx];      % actual top boundary
    new_btm    := system_array[btm_indx];      % actual bottom boundary

    left_indx  := array_liml(system_array[1]); % positions of left and right
    right_indx := array_limh(system_array[1]); %  boundaries from old system
                                               %  (all rows have same 
indices)

    long_core  := for row in core at i         % add side boundaries to rows

                      left_bound_elt    := system_array[ i, left_indx ];	
                      left_bound_array  := array [ 1: left_bound_elt ]; 
                      right_bound_elt   := system_array[ i, right_indx ];
                      right_bound_array := array [ 1: right_bound_elt ]

                  returns array of left_bound_array || row  ||
                                   right_bound_array
                  end for;

    long_tall  := new_top || long_core || new_btm   % add top and bottom

in  long_tall   % The newly evolved system, complete with boundaries.
end let

This elegant bit of code builds a new system array in three steps. First, 
function "evolve" is invoked with the old system array as its argument, and 
the new system array is returned. Then the short rows of the new array are 
made the correct length by concatenating the boundaries from the rows of the 
old system array onto them in a for-loop. In the loop we see that the values 
left_bound_elt and right_bound_elt are made into arrays of one element. 
This is necessary, as the concatenation operator requires that both its 
arguments be arrays. Finally, the top and bottom rows of the old system array 
are concatenated onto the widened core of the new system array. The result of 
these steps is an array the same size and shape as the old system array.

This example also includes a use of the hybrid range generator "row in core 
at i". Use of this form of generator relieves us of the need to specify the 
index range of the array core, and yet still gives us an index value to use 
in the array index expressions within the loop.

These techniques are so powerful and useful that we will explore them further 
in an exercise.

13. Exercise Set 5

Exercise 5.1: Thick Boundaries
Given: a system array with constant boundaries of size k, that is to be 
evolved by a given function "evolve", which returns only the core of the new 
system array;
Compose: a code fragment that evolves the system and puts the boundary rows 
and columns onto the core system returned by function "evolve".

Answers to Exercise Set 5

Exercise 5.1: Thick Boundaries
Given: a system array with constant boundaries of size k, that is to be 
evolved by a given function "evolve", which returns only the core of the new 
system array;
Compose: a code fragment that evolves the system and puts the boundary rows 
and columns onto the core system returned by function "evolve".

let new_core   := evolve(system_array); % returns only core of new system
    top_indx   := array_liml(system_array);
    btm_indx   := array_limh(system_array);

    new_top    := for i in top_indx, top_indx+k-1
                      top_bound_row := system_array[i]
                  returns array of top_bound_row					
				end for;
    new_btm    := for i in btm_indx-k+1, btm_indx
                      btm_bound_row := system_array[i]
                  returns array of btm_bound_row
			end for;

    left_indx  := array_liml(system_array[1]); % all rows have same range
    right_indx := array_limh(system_array[1]);

    long_core  := for row in new_core at i
                      left_bound_elts  := for j in left_indx, left_indx+k-1
                                              left_elts  := system_array[i,j]
                                          returns array of left_elts
                                          end for;
                      right_bound_elts := for j in right_indx-k+1, right_indx
                                              right_elts := system_array[i,j]
                                          returns array of right_elts
                                          end for;

                  returns array of left_bound_elts || row || right_bound_elts
                  end for;
    long_tall  := new_top || long_core || new_btm

in  long_tall % The newly evolved system, complete with boundaries.
end let

Note: If you came close to an answer equivalent to this one, you're to be 
congratulated. This is a nontrivial piece of Sisal coding, and indicates a 
good understanding of all that's gone before. If you didn't, don't be 
discouraged. This sort of coding requires a change of mind-set from that 
needed for programming in other languages. We believe that, once you make the 
shift to thinking functionally, the Sisal language will make you a faster and 
more productive programmer. If you require further proof of this, try writing 
the above code fragment in your favorite imperative language, and see if you 
think it's shorter, neater, or easier to read.

14. Functions and Programs

By now the reader must be ready to write whole programs instead of program 
fragments. To develop this topic properly we must deal with function 
definitions and invocations, as well as the extra linguistic glue necessary 
to build complete programs out of them.

14.1 Functions

The function definition and invocation is the heart and soul of the Sisal 
language. It will allow us to simplify several of the examples and exercises 
already presented in this tutorial. The basic idea, here, is similar to that 
of function and subroutine subprograms in Fortran and procedures in Pascal. 
The user composes code with dummy or formal arguments and then invokes it 
wherever necessary in a larger program with real arguments substituted for 
the formal ones. Sisal functions are more solidly founded, semantically, than 
those of Fortran because of the language's insistence on referential 
transparency and tight typing. This means that a function can get no 
information other than that provided through its argument list, and can 
return no information other than its value(s). Further, it means that the 
types of all arguments and returned values must be declared in the function 
definition and strictly adhered to in its use.

The syntax rules are simple:

function ( {optional argument list} returns {returned value list} )
{body expression}
end function

The argument list, if present, consists of a list of one or more argument 
name-type declarations, with some allowed shorthand notations. The returned 
value list must have at least one type declaration for the value of the 
function. No name is needed for the returned value, any more than is needed 
for the value of "2+2"; the value itself suffices. However, the body of a 
function can contain any legal sisal statement, including further function 
definitions, function invocations (including recursive ones), let-in 
statements, if-statements, for-loops, etc. Several easy and obvious examples 
are possible at this point.

The following function computes the dot product of two argument vectors:

function dot_product( x, y : array [ real ] returns real )

   for a in x dot b in y
   returns value of sum a * b
   end for

end function

The points to note about this example are that the argument arrays have the 
same type and can therefore be declared together, with their names separated 
by a comma, and separated from their type by a colon. If there were other 
arguments, they would be separated from other declarations with a semicolon. 
As with other Sisal language statements, there is an "end function" keyword 
pair to close the definition. Here is another function definition that 
invokes the function defined above:

type One_Dim_R = array [ real ];
type Two_Dim_R = array [ One_Dim_R ];

function matrix_mult( x, y_transposed : Two_Dim_R
                      returns Two_Dim_R )

   for   x_row in x                % for all rows of x
   cross y_col in y_transposed     % & all columns of y (rows of 
y_transposed)

   returns array of dot_product(x_row, y_col) 
   end for

end function % matrix_mult

The above function shows how the type declaration of an anonymous (unnamed) 
two-dimensional matrix looks. It also shows the syntax of Sisal commentary in 
the comments in the body of the function, as well as that used after the "end 
function" keywords to identify the function. We have shown comment text in 
earlier examples, but without discussing it. Sisal comments begin with a 
percent character "%" and continue until the end of the line. Commentary can 
be embedded in other statements without affect, as shown in the for-loop.

The invocation of the previously defined function dot_product occurs in the 
returns clause of the for-loop. A function invocation, like any other Sisal 
statement, can occur anyplace a constant, identifier, or expression can occur 
in any Sisal statement. Since the for-loop has a two dimensional  range 
generator, its value will be a two dimensional array. The positioning of a 
specific dot-product result in it is determined by the values of i and j in 
that instance of the loop's body. Note that we assume in this function that 
the argument arrays have correctly conformable shape and size, and that all 
their various index ranges have lower bounds of one. The above could, of 
course, be written to allow for other cases.

It is also possible to write recursive functions in Sisal. This style of 
programming is very natural for some applications, and while it raises 
efficiency questions, if rapid and correct prototyping is the concern, 
recursive formulations are sometimes the most expedient means to use. We will 
not discuss this topic in depth, since it should be familiar to all who have 
seen it in other languages (except Fortran). We will simply present one 
example, for the sake of completeness:

function factorial( n : integer returns integer )

if n <=1 then 1
else n * factorial( n - 1 )
end if

end function

14.2 Programs

Now we will see how to pull the above two functions together into a complete 
program that can actually be compiled and run. There is relatively little to 
add to what we've already seen, mostly details of coordination. First of all, 
we must tell the Sisal compiler what function is to serve as the program's 
entry point, and what name or names are to be exported. The latter allows us 
to load separately compiled modules together. The default entry point name is 
"main", but another name can be specified with the compiler pragma %$entry= 
placed at the head of the program text, or with an option flag passed 
directly to the compiler. (Sisal compiler options that are included in 
program source files always begin with the comment character so they will not 
be mistaken for source code statements themselves.)

Second we define new type names for convience and readability. Third, we 
declare the names, formal parameters, types of any external functions. This 
includes any functions from the math library, such as square root, 
trigonometrics, transcendentals, etc., which we will discuss further below. 
It also includes any functions from other modules that are compiled 
separately and invoked in the program. The declarations of such functions are 
merely function headers, consisting of the function names, argument 
declaration lists, and return value declaration lists.

Fourth, we must provide the user-defined functions that constitute our Sisal 
source. The current Sisal compiler requires that functions be defined before 
they are invoked, so definitions must occur in the source before any 
references to them. However, there is a way around this order restriction, 
called a "forward declaration." In it, a function's header is provided, 
preceded by the keyword forward. The full definition can then be placed 
anywhere in the program source. There is a required ordering to the full set 
of "glue" statements that can or must appear in a Sisal program, and we will 
fully detail it later. 

14.3 Importing Library Functions

We've mentioned in passing that Sisal programs have access to all the 
routines in the Unix math libraries. However, since the routines in that 
library are all designed to be very general, we must inform the Sisal 
compiler exactly how we intend to use them in our programs. to do this, we 
use a "global" declaration. This is easily described by example:

global sin ( x: real returns real );
global cos( x: double_real returns double_real );
global sqrt( x: real returns real );
global log( x: double_real returns double_real );

The above declarations specify how some of the math library functions will be 
used in a Sisal program.  Such declarations are necessary because the 
compiler must know what argument and result types to expect to be associated 
with invocations of these functions. These declarations show that the 
intended use for some math library  functions is different from that of 
others, by way of the argument and returned value types we have declared for 
them. Since the actual library functions can deal with many such type 
arrangements, we are free to use them as we see fit. However, there can be 
only one declaration for each function name. The declarations themselves must 
occur after the type statements and before any other function definitions.

Following is a complete matrix multiply program, with examples of each type 
of glue statement described above:

% MATRIX MULTIPLICATION PROGRAM
%========================================================================
% The "defines" statement is needed in every complete Sisal Program;
% the "$entry=" statement is needed if the entry point isn't named "main".
%------------------------------------------------------------------------
define matrix_mult   % Specifies function name(s) exported from 
module/program
%$entry=matrix_mult  % Specifies entry point(s) for this module/program

%-----------------------------
% Type declarations come next.
%-----------------------------
type One_Dim_R = array [ real ];
type Two_Dim_R = array [ One_Dim_R ];

%---------------------------------------------------------------------
% Then come global declarations for functions imported from libraries.
% These aren't used in the program, and are for illustration only.
%---------------------------------------------------------------------
global sqrt( x: real returns real );
global sin( x: double_real returns double_real)

%------------------------------------------------------------------------
% Next come forward function declarations; these are not needed, in this
% program, since the full function definitions are given in definition-
% before-use order.
%------------------------------------------------------------------------
forward function dot_product( x, y : One_Dim_R returns real )

forward function matrix_mult( x, y :  Two_Dim_R
                              returns Two_Dim_R )

%----------------------------------------
% Finally, the full function definitions.
%----------------------------------------
function dot_product( x, y : One_Dim_R returns real )

   for a in x dot b in y
   returns value of sum a * b
   end for

end function % dot_product

%............................................................................
.

function matrix_mult( x, y_transposed : Two_Dim_R
                      returns Two_Dim_R )

   for   x_row in x                % for all rows of x
   cross y_col in y_transposed     % & all columns of y (rows of 
y_transposed)

   returns array of dot_product(x_row, y_col)
   end for

end function % matrix_mult

%............................................................................
.

The comment statements describe the various parts of the program source. It 
isn't shown, above, but it is also possible to nest function definitions. 
That is, one function can have another defined and used within it. This can 
be notationally convenient, but the function so defined can be used only by 
the function it is defined within. There are a few more specific points about 
Sisal programs in general that can be illustrated by the above program.

Notice that the program above consists of two functions. They are defined at 
the same "level", but matrix_mult invokes dot_product, and is specified as 
the entry point by the $entry= declaration. The entry or outermost function 
corresponds to the main routine in a Fortran program. Here, the entry 
function is not invoked by any others in the program. This is generally the 
case, but not necessarily so; for instance, mutual recursion between two 
functions may cause the entry function to be invoked by another in the 
program. Notice also that the outermost function has arguments and returns 
values, just as the inner functions do. This is how Sisal programs get input 
and produce output. The arguments to the outermost function are the program's 
input, and the results it returns are the program's output. These may come 
from and go to files, as will be presented in a future section on compiling 
and running Sisal programs.

Now it's time for a few simple exercises in constructing whole functions and 
programs.

15. Exercise Set 6

Exercise 6.1: Quadratic Roots Function
Given:  three real values, a, b, and c, and the formula for the roots of a 
quadratic equation
     x1 = b + (b**2-4ac)**0.5/2a and
     x2 = b - (b**2-4ac)**0.5/2a;
Compose: a function of a, b, and c that calculates and returns the roots or 
some flag values if the roots cannot be calculated.

Exercise 6.2: Relaxation Function
Given:  a two-dimensional array, A, of real values
Compose: a function of A that performs one relaxation step, returning the new 
array whose values at each position [i, j] are the average of the values 
from A of that position and that position's eight nearest neighbors, [i+/-1, 
j+/-1].

Exercise 6.3: Polynomial  Evaluation Program
Given:  a two-dimensional array, C, of real valued polynomial coefficients, 
each row of which contains the coefficients of a single polynomial;
Given: for each row of C, the index of each coefficient's position is the 
exponent of x for that term;
Given: a one-dimensional array X of real valued polynomial unknowns, with 
each element corresponding to the unknown for the whole set of polynomials;
Compose: a function of C and X that evaluates each polynomial with each 
unknown, and returns the two-dimensional array containing in each row the 
value of each polynomial for one of the unknowns.

Answers to Exercise Set 6

Exercise 6.1: Quadratic Roots Function
Given:  three real values, a, b, and c, and the formula for the roots of a 
quadratic equation
     x1 = b + (b**2-4ac)**0.5/2a and
     x2 = b - (b**2-4ac)**0.5/2a;
Compose: a function of a, b, and c that calculates and returns the roots or 
some flag values if the roots cannot be calculated.

function quad.roots( a, b, c : real returns real, real )
let denom = 2.0 * a;
    discrim := b * b - 4.0 * a * c;

in  if discrim >= 0.0 then
       b + sqrt( discrim ) / denom,
       b - sqrt( discrim ) / denom
    else -9.99e99, -9.99e99
    end if
end let
end function

Exercise 6.2: Relaxation Function
Given:  a two-dimensional array, A, of real values;
Compose: a function of A that performs one relaxation step, returning the new 
array whose values at each position [i, j] are the average of the values 
from A of that position and that position's eight nearest neighbors, [i+/-1, 
j+/-1].

function relax ( A : array [ array [ real ] ]
                returns array [ array [ real ] ] )

for row in A at i cross elt in row at j
   avg := (elt + A[i, j-1] + A[i, j+1] +
           A[i+1, j-1] + A[i+1, j] + A[i+1, j+1] +
           A[i-1, j-1] + A[i-1, j] + A[i-1, j+1]) / 9.0
returns array of avg
end for

end function

Exercise 6.3: Polynomial  Evaluation Program
Given:  a two-dimensional array, C, of real valued polynomial coefficients, 
each row of which contains the coefficients of a single polynomial;
Given: for each row of C, the index of each coefficient's position is the 
exponent of x for that term;
Given: a one-dimensional array X of real valued polynomial unknowns, with 
each element corresponding to the unknown for the whole set of polynomials;
Compose: a function of C and X that evaluates each polynomial with each 
unknown, and returns the two-dimensional array containing in each row the 
value of each polynomial for one of the unknowns.

define multi_poly
%$entry = multi_poly

type Poly_Coeffs = array [ real ];        % Coefficients of one polynomial     
type Many_Polys = array [  Poly_Coeffs ]; % Coefficients of many polynomials
type Unknowns = array [ real ];           % Values for X, the unknowns
type Values = array [ real ];             % Values of all polys for one x
type Many_Values = array [ Values ];      % Values of all polys for all x

function eval_poly( coeffs : Poly_Coeffs; x_val := real returns Values )
   for c in coeffs at i
      term := c * exp(x_val, real(i))
   returns value of sum term
   end for

end function % eval_poly

function multi_poly( C : Many_Polys; X : Unknowns returns Many_Values )

   for x_val in X
      row_of_poly_vals := 
         for coeffs in C
            poly_value := eval_poly(coeffs, x_val)
         returns array of poly_value
         end for                     % one row of result array - all polys 
for one x
   returns array of row_of_poly_vals
   end for

end function % multi_poly


16. Sequential Looping with For-Initial Loops

In this section we will discuss the sequential or iterative looping construct 
in Sisal. Unlike the for-loop, in which all instances of the loop body are 
independent, the Sisal "for-initial" loop involves the explicit passing of 
information between consecutive instances of the loop body. This is 
visualized as a graph with a sequence of nodes, representing loop bodies, and 
the arcs between them representing the data dependencies between the loop 
bodies. The bodies (nodes) bear what we term a "producer-consumer" 
relationship to one another, and this establishes their interdependency. 
Syntactically, the data calculated in one body that is to be made available 
to the next body is identified in the consuming body by use of the keyword 
"old" before the identifier name. But it will be much easier to explain by 
example, so consider the following:

16.1 Two-Body Loops

The for-initial loop consists of two major parts, the initial-clause and the 
loop body proper. One way to think of this type of loop is to consider that 
the initial clause is the first "iteration", or the first actual loop body 
that gets executed; after it executes exactly once, zero or more instances of 
the loop body proper execute. The need for a distinguished first "iteration" 
is easy to explain by example, and we will look at one directly. However, for 
the moment, keep in mind that the loop must always return a specified number 
of results of specified types. These results are initially generated in the 
initial-clause, and then may get further treatment in the loop body proper, 
if it executes. With this in mind, consider the following example:

for initial
   i := array_liml(A);
   tmp := A[i];
   running_sum := tmp
while i < array_limh(A) repeat
   i := old i + 1;
   tmp := A[i];
   running_sum := old running_sum + tmp
returns value of running_sum
        array of running_sum
end for

In this example we first see the loop keywords  "for initial" which 
identify it as a sequential loop, and which open what we call the "initial 
clause" of the loop. This clause contains the initial definitions of the 
loop-carried values, "i", and "running_sum", which implement the data 
dependencies between the contiguous body instances.

Like the body of a for-loop, the initial clause of a for-initial loop can 
also contain the definitions of loop-local names, which are temporary 
identifiers used subsequently in the calculation but not passed between loop 
bodies; this is the purpose of "tmp". Next, we see the iteration termination 
test. This is a predicate, evaluating to a Boolean value, which uses either 
the "while" or "until" keyword to specify how the test is to be applied. The 
termination test can be at either the top or the bottom of the loop body; in 
this case it is at the top, and we refer to this form of loop as "top-
tested."  If it were at the bottom of the loop body, we would call it a 
"bottom-tested loop", and the keyword "repeat" would still be used to 
demarcate the top of the loop body.

The body of this loop shows the use of the keyword "old" to specify data 
passed in from previous bodies. In fact, it is more technically correct to 
call them "scopes" than "bodies", but the effect is the same. The statement 
"i := old i + 1" specifies that the value named "i" from the previous scope 
(i.e. "old i") is to be incremented by one and assigned the name "i" in this 
scope. While this may look like a violation of the single-assignment rule, it 
actually is not; the values named "i" and "old i" are two different 
entities, having independent existences, except for the value of the former 
being determined by the latter. The same thing holds for "running_sum", and 
we refer collectively to all of the instances of "i", "running_sum", "old 
i", and "old running_sum" as the "loop-carried values 'i' and 
"running_sum'", respectively.

So, the way the values are passed between the loop bodies is like this. The
first time the statement "i := old i + 1" in loop body is executed,
the meaning of "old i" is that value of "i" defined in the initial clause of 
the loop. Subsequent to this, it means that value of "i" defined in the most 
recent body before the current one. The ordering of the loop bodies' 
execution is totally determined by the effects of the termination test and 
the loop-carried values. Notice that the identifier "tmp" does not appear 
with the modifier keyword "old"; this is because its value is not passed 
between loop bodies, but is calculated anew in each body. This "loop-local" 
identifier is similar to those temporaries that are defined in the bodies of 
parallel for-loops, in that each instance of  it is independent of all the 
others, and we simply allow the reuse of its name for the convenience of the 
programmer.

At the end of the body we see the returns clause, just like that of the for-
loop. It can contain the same aggregation and reduction operations, which can 
operate on any or all of the loop-carried values; they can also contain the 
same Boolean masks, using the keywords "when" or "unless", to allow 
operation on subsets of the instances of any or all of the loop-carried 
values. The loop above returns both a value and an array; it adds up all the 
elements of the array "A", aggregates all the running subtotals into an 
array, and returns the total and the array of subtotals.

16.2 One-Body Loops

Now we need to return to the philosophy of sequential loops, to make sure all 
their ramifications are well explained. The for-initial loop is an 
expression, and all expressions have values. A given loop must return the 
same number and types of results, regardless of how many or few times it 
"iterates". If the body of the loop above never executes, because the array 
"A" is empty and the termination criterion is met the first time it 
is encountered, then the loop will still try to return the same two values. 
If we are sure that the array "A" contains at least one value, then 
the loop as written above will behave correctly, and there will be an initial 
i-th element to be assigned to the name "tmp"; otherwise, that array 
reference in the initial-clause will fail with an address-out-of-range 
runtime error. This situation must be planned for carefully! In the case 
above, the initial clause is actually the first real instance of the body of 
the loop. In other circumstances, we might want to program the loop so that 
the initial-clause sets up the values to be returned in the case where the 
body is not executable. In such situations, we are actually providing, in the 
initial-clause, default values to be returned if the body does not execute, 
and these default values can be considered the identity elements for the 
operations in the returns clause. In this a case, the loop might change to 
that shown below:

for initial
   i := array_liml(A) - 1;
   tmp := 0;
   running_sum := tmp
while i < array_limh(A) & array_size(A) ~= 0 repeat
   i := old i + 1;
   tmp := A[i];
   running_sum := old running_sum + tmp
returns value of running_sum
        array of running_sum
end for

In the above version of the loop, we use the initial-clause to set up for the 
actual loop body, defining the first value of the loop index, "i" as 
one less than the lower bound of the array, and the value of "running_sum" 
as zero. Then, the loop body uses a value for "i" that is legal within the 
array, if the termination test allows the body to execute. The returns clause 
in this version will return zero and an array containing the single element 
zero, if the loop body does not execute. The previous version would have 
failed with a run-time error, if this had occurred. We also see, in the 
second version, that the termination test is altered, to allow for the array 
"A" to be empty and to have a lower bound that is greater than its upper 
bound by more than one. Recall that an empty array can have any bounds as 
long as the lower bound is greater 
than its upper bound; it is possible that an algorithmically-generated array 
might have bounds of, say [-5, 0 ], due to unexpected behavior of the code 
the generates it. The addition of the array-size primitive protects the loop 
against this possibility.

16.3 Sequential Array Manipulation

But suppose we did not want to return an array containing the single element 
"0", as shown above, in the case of the body not executing; a more reasonable 
thing to return in this case might be an empty array. This can be done if we 
can define an empty array and then go on to fill it one or more elements at a 
time. This is possible in Sisal, and the code below shows another version of 
the loop that accomplishes it:

for initial
   i := array_liml(A) - 1;
   tmp := 0;
   running_sum := tmp;
   running_sum_array := array []
while i < array_limh(A) & array_size(A) ~= 0 repeat
   i := old i + 1;
   tmp := A[i];
   running_sum := old running_sum + tmp;
   running_sum_array := array_addh(old running_sum_array, running_sum)
returns value of running_sum
        value of running_sum_array
end for

Here we have added another loop-carried value, "running_sum_array", 
which is initialized to an empty array in the initial clause and returned in 
the returns clause. In the body, we see it is operated on by the Sisal 
primitive "array_addh"; this primitive extends the array given as 
its first argument, at the upper end of its index range, by one element whose 
value is the second argument. This affects both the contents of the array and 
its index bounds, and is a very handy primitive to have in such a case as 
this. Note that in the returns clause, the second value being returned is now 
generated by a "value-of" operation, rather than by an operation. Since the 
array we wish to return was created in the initial-clause and is being 
incrementally filled by the loop body, we simply return the "last" version of 
it generated by the loop as a whole. While the incremental expansion of this 
array as the loop executes would normally be a source of inefficiency, the 
Optimizing Sisal Compiler is able to optimize such programs so that the 
structure is built in place quite efficiently.

As one might expect by now, there is also a primitive to extend an array at 
the bottom end of its index range, as well as primitives to change one or 
more elements of an array, to shrink arrays at either end, and primitives to 
adjust the index range of arrays. These are specified in the following table, 
along with all the other array operations we have already seen; in the table, 
"A" is an array, "v" is a value of the proper type for elements of "A", and 
"low" and "high" are integers:

array primitive           effect
========================================================
array_size(A)            get number of elements
array_limh(A)            get upper bound
array_liml(A)            get lower bound
array_addh(A,v)          extend at high end
array_addl(A,v)          extend at low end
array_remh(A)            remove at high end
array_reml(A)            remove at low end
array_adjust(A,low,high) set both bounds
array_setl(A,low)        set lower bound
A[ i : v ]               change value at index i to v
A[ i: u, v, w ]          change values at indices 
                            i, i+1, and i+2 to 
                            u, v, and w, respectively
A[ i: u; j: v ]          change values at indices i and j
                            to u and v, respectively

The above primitives allow us to operate on arrays in useful ways, and are 
most often used in sequential loops, such as we have been discussing. 
However, they can also occur in other code, such as let-in statements, and it 
bears emphasizing that the arrays they produce are different values from 
those they operate on. In other words, we could have the following code:

let A := array [ 1: 5, 10, 15, 20, 25];
    B := array_addl(A, -5);
    C := array_addh(A, 30);
    D := array_reml(A);
    E := array_adjust(A, 3, 4);
    F := A[ 1: -5; 3: -15, -20 ]
in  A, B, C, D, E, F
end let

The above let-in statement returns the following values:

array [ 1: 5, 10, 15, 20, 25]       % The array A
array [ 0: -5, 5, 10, 15, 20, 25 ]  % B = A with new first element and 
                                    %     lower bound
array [ 1: 5, 10, 15, 20, 25, 30 ]  % C = A with new last element and 
                                    %     upper bound
array [ 2: 10, 15, 20, 25 ]         % D = A minus first element, with new 
                                    %     lower bound
array [ 3: 15, 20 ]                 % E = A minus several elements, with 
                                    %     new bounds
array [ 1: -5, 10, -15, -20, 25 ]   % F = A with several elements changed

Note that none of the operations on "A" changes its value; Rather, 
they produce new values of array types whose elements and index ranges are 
related to those of their argument array.

16.4 Termination Tests and Generation of "old" Values

The need to pass values between consecutive instances of a loop body, and in 
so doing, define an identifier "old {name}" and simultaneously undefine an
identifier {name} requires some careful timing. In the case of a top-tested 
loop, it is not possible to reference both {name} and "old {name}" in the 
termination test, because the the first time the test is evaluated, only one 
of these values is defined. The redefinition of names of loop-carried values 
takes place immediately after the termination test. In the case of the loop 
shown schematically below,

        for initial
           val := initial_value
        while {test(val)} repeat
           val := f(old val)
        returns val
        end for 

the value associated with the name "val" changes its name to
"old val" right after "test(val)" is executed, and the
named "val" becomes undefined at the same time. Note that the
test can contain references to "val" but not "old val",
since the latter is not defined the first time the test is encountered.
This form of loop is called a "top-tested loop."

If a test which refers to both names is desired, it must be placed at the
bottom of the loop body, as in the following example:

        for initial
           val := initial_value
        repeat
           val := f(old val)
        while {test(val, old val)} 
        returns val
        end for 

The first time "test(val, old val)" is executed, both names are
defined, since the former was just calculated and the latter was calculated
in the initial clause. Immediately after the test, the value named "val" 
changes its name to "old val" and the value "val" becomes undefined at the 
same time. This form of loop is called a "bottom-tested loop."

Next are a few exercises intended to cement the concepts discussed above. 

17. Exercise Set 7

Exercise 7.1: Newton Square Root Loop
Given: a real value Y, and the formula for Newton's square root algorithm,
        x(i) = (Y/x(i-1) + x(i-1))/2, where x(0)=Y/2;
Given: termination criterion: |Y - x(i)**2| < epsilon, a real value;
Compose: a for-initial loop that calculates successive approximations x(i) 
until the termination criterion is met, and returns the final value of the 
approximation as well as the number of approximations, x(i), i > 0, that have 
been calculated.

Exercise 7.2: First Maximum and Minimum Loop
Given: a one-dimensional array, A, of real values;
Compose: a for-initial loop that finds the first instance of the minimum and 
maximum values present in the array, and returns these values and their index 
positions.

Exercise 7.3: Successive Relaxation Loop
Given: a two-dimensional array, A, of real values;
Given: a predefined termination test function converged(x,y,eps), which 
takes two conformable two_dimensional arrays, x, and y, and a real value 
eps, and returns a Boolean value if its metric is met with respect to its 
third argument;
Compose: a function containing a for_initial loop that uses the function from 
exercise 16 to successively relax A until convergence occurs, and returns the 
converged value of A and the number of relaxations that have occurred.
Hint: Since the convergence function requires two arrays, one could use a
bottom-tested loop...
NOTE:�Unlike the previous exercise involving relaxation, the entire system
array here should be relaxed. The averaging forumla should be modified to
account for fewer neighbors are the edges of the system.

Exercise 7.4: Conway's Game of Life
Given: a two-dimensional array, Board, of integer cell values, zero for a 
dead cell and positive nonzero for a live one;
Given: the boundaries cells around the edge of Board are always dead;
Given: the rules by which a non-boundary cell lives or dies:
     if three or fewer neighbor cells are alive, then the cell dies;
     if five or more neighbors cells are alive then the cell dies;
     if four neighbor cells are alive, then increment the current cell value;
Compose: a program that plays evolves the Board array until some input 
number of steps are completed, and returns an array of all the values taken 
on by Board.

Answers to Exercise Set 7

Exercise 7.1: Newton Square Root Loop
Given: a real value Y, and the formula for Newton's square root algorithm,
        x(i) = (Y/x(i-1) + x(i-1))/2, where x(0)=Y/2;
Given: termination criterion: |Y - x(i)**2| < epsilon, a real value;
Compose: a for-initial loop that calculates successive approximations x(i) 
until the termination criterion is met, and returns the final value of the 
approximation as well as the number of approximations, x(i), i > 0, that have 
been calculated.

for initial
   x    := Y / 2.0;
  count := 0
while abs(Y - x*x) > epsilon repeat
   x := (Y/old x + old x) / 2.0;
   count := old count + 1
returns value of x
        value of count
end for

Exercise 7.2: First Maximum and Minimum Loop
Given: a one-dimensional array, A, of real values;
Compose: a for-initial loop that finds the first instance of the minimum and 
maximum values present in the array, and returns these values and their index 
positions.

for initial
   i := array_liml(A);
   max_val := A[i];
   min_val := A[i];
   max_ind := i;
   min_ind := i;
while i < array_limh(A) repeat
   i := old i + 1;
   max_val, max_ind := if A[i] >  old max_val then
			  A[i], i
		       else
			  old max_val, old  i
		       end if;
   min_val, min_ind := if A[i] <  old min_val then
			  A[i], i
		       else
			  old min_val, old  i
		       end if;
returns value of max_val
	value of max_pos
	value of min_val
	value of min_ind
end for 

Exercise 7.3: Successive Relaxation Loop
Given: a two-dimensional array, A, of real values;
Given: a predefined termination test function converged(x,y,eps), which 
takes two conformable two_dimensional arrays, x, and y, and a real value 
eps, and  returns a Boolean value if its metric is met with respect to its 
third argument;
Compose: a function containing a for_initial loop that uses the function from 
exercise 16 to successively relax A until convergence occurs, and returns the 
converged value of A and the number of relaxations that have occurred.
Hint: Since the convergence function requires two arrays, one could use a 
bottom-tested loop...
NOTE:�Unlike the previous exercise involving relaxation, the entire system
array here should be relaxed. The averaging forumla should be modified to
account for fewer neighbors are the edges of the system.

type OneDim = array [ real ];
type TwoDim = array [ OneDim ];

function relax( A : TwoDim returns TwoDim )

for i in array_liml(A),    array_limh(A) cross
    j in array_liml(A[i]), array_limh(A[i])
% NOTE:�This form of the code in this loop should vectorize.
   top_row  := i = array_liml(A);
   btm_row  := i = array_limh(A);
   lft_side := j = array_liml(A[i]);
   rgt_side := j = array_limh(A[i]);

   top_val, cnt1 := if top_row then 0.0, 0
                               else A[i-1,j], 1 end if;
   btm_val, cnt2 := if btm_row then 0.0, cnt1
                               else A[i+1,j], cnt1+1 end if;
   lft_val, cnt3  := if lft_side then 0.0, cnt2
                                 else A[i,j-1], cnt2+1 end if;
   rgt_val, cnt4 := if rgt_side then 0.0, cnt3
                                else A[i,j+1], cnt3+1 end if;
   lft_btm_val, cnt5 := if lft_side | btm_row then 0.0, cnt4
                                              else A[i+1,j-1], cnt4+1 end if;
   rgt_btm_val, cnt6 := if rgt_side | btm_row then 0.0, cnt5
                                              else A[i+1,j+1], cnt5+1 end if;
   lft_top_val, cnt7 := if lft_side | top_row then 0.0, cnt6
                                              else A[i-1,j-1], cnt6+1 end if;
   rgt_top_val, cnt8 := if rgt_side | top_row then 0.0, cnt7
                                              else A[i-1,j+1], cnt7+1 end if;
   divisor := real(cnt8+1);
   avg := (A[i,j]      +
           top_val     + btm_val     + lft_val     + rgt_val +
           lft_btm_val + rgt_btm_val + lft_top_val + rgt_top_val) / divisor
returns array of avg
end for

end function % relax

function successive_relaxation( A : TwoDim returns TwoDim, integer )
for initial
   relaxed_A  := A;
   step_count := 0;
repeat
   relaxed_A  := relax(old relaxed_A );
   step_count := old step_count + 1;

until converged(relaxed_A, old relaxed_A, eps)
returns value of relaxed_A
        value of step_count
end for

end function % successive_relaxation

Exercise 7.4: Conway's Game of Life
Given: a two-dimensional array, Board, of integer cell values, zero for a 
dead cell and positive nonzero for a live one;
Given: the boundaries cells around the edge of Board are always dead;
Given: the rules by which a non-boundary cell lives or dies:
     if three or fewer neighbor cells are alive, then the cell dies;
     if five or more neighbors cells are alive then the cell dies;
     if four neighbor cells are alive, then increment the current cell value;
Compose:  a program that plays evolves the Board array until some input 
number of steps are completed, and returns an array of all the values taken 
on by Board.

%==========================================================================

define main

% Game of Life by John Conway. Values of the Board cells are 0, for dead
% cells, or positive integers for live cells.
% Each cell has 8 neighbors; boundaries always contain dead cells.
% On each iteration, the cells are updated as follows:
%  -- If a cell has more than four live neighbors, then it should die.
%  -- If a cell has fewer than four live neighbors, then it should die.
%  -- If a cell has exactly four live neighbors, then 
%        if it is deal, then it should be born;
%        if is alive, then it should stay alive.
%  -- If a cell stays alive in a step, its value incremented by one.
% The simulation of life iterates "Iterations" times.
% The board rows are all indexed from 1 to N, and the columns are all
% indexed from 1 to M; M and N may not be equal.

type OneDim = array [ integer ];
type TwoDim = array [ OneDim ];
type ThreeDim = array [ TwoDim ];

%==========================================================================

function decide( B : TwoDim; I : integer; J : integer returns integer )
% This function decides which cells should live or die.

  %----------------------------------------------------------
  function test( B : TwoDim; p, q : integer returns integer )
     % This embedded function determines if a given cell is
     % currently alive; it returns numeric values, 1 for alive
     % and 0 for dead, so its results can be added by its
     % caller to similar results for other cells.

     if B[p, q] > 0 then 1 else 0 end if
  end function % test
  %----------------------------------------------------------

  % count the neighbors of cell [I,J]
  let Neighbor_count := test(B,I-1,J-1) + test(B,I-1,J) + test(B,I-1,J+1) +
                        test(B,I,  J-1) +                 test(B,I,  J+1) +
                        test(B,I+1,J-1) + test(B,I+1,J) + test(B,I+1,J+1)

  % change its state, if necessary
  in  if     ( Neighbor_count >= 5 ) then 0
      elseif ( Neighbor_count  = 4 ) then B[i,j] + 1
      elseif ( Neighbor_count <= 3 ) then 0
      end if 
  end let
end function % decide

%==========================================================================

function evolve( B : TwoDim returns TwoDim )
  let
    Num_Rows  := array_size(B);
    Num_Cols  := array_size(B[1]);
    First_row := B[1];
    Last_row  := B[Num_Rows];

    Core      := % evolve the central portion of B
                 for I in 2, Num_Rows-1
                    Mid_row  := % evolve the central portion of row I
                                for J in 2, Num_Cols-1
                                returns array of decide(B,I,J)
                 returns array of Mid_row
                 end for;
             
    Long_Core := % add the left and right boundaries to each core row
                 for m in 2, Num_Rows-1
                    Long_Row := array_addl( Mid_row, 0 );
                 returns array of array_addh( Long_Row, 0 )
                 end for

  in  % add the top and bottom boundary rows to the long core
      array_addl(array_addh(Core,Last_row),First_row)
  end let
end function % evolve

%==========================================================================

function main( Iterations : integer; Board : TwoDim returns Three_Dim )

   for initial
      Count := Iterations;
      B := Board;
   while ( Count > 0 ) repeat
      Count := old Count - 1;
      B := evolve( old B ); 
   returns array of B
   end for

end function % main

%==========================================================================

18. Compiling and Running Sisal Programs

Up to this point we have been dealing with Sisal and a pedagogical tool. But 
since it is intended to support real-world programming on contemporary 
uniprocessor and multiprocessor systems, we should discuss how to use compile 
and run Sisal programs, using the Optimizing Sisal Compiler (OSC) and runtime 
system. OSC is available for most Unix systems, and provides for parallel 
execution on a number of shared memory systems. Prototype versions of OSC 
exist for distributed memory and multithreaded systems, but performance and 
robustness on these systems is not likely to be as high as on shared memory 
systems. A later section will describe how to acquire and install the Sisal 
system software, For now, we will assume the compiler is available and 
describe how it works in actual use.

Assuming that the user has a Sisal program composed and ready to compile, the 
first step is to run it through OSC for syntax checking. Once the program is 
syntactically correct, it is guaranteed to be semantically correct, and is 
likely to be at least close logically to what the programmer intended. So, 
let us take a minute to discuss the overall structure of most Sisal programs 
by looking at an example. Then we will describe how to compile and run the 
example on a real parallel computer.

The program following is a standard example, used to illustrate the structure 
of a typical technical application written in Sisal.

18.1 Trapezoidal Rule Integrator

define main

% This module is a trapezoidal rule integration program.
% For any supplied function f(X:double_real), it takes
% double real inputs defining endpoints of the interval
% to be integrated over, a double real tolerance, and an
% integer count of sub-intervals to divide the interval
% into. It returns an array of double_real values of the
% numerical value of the definite integral over the interval,
% the number of refinement iterations made, the number of
% subintervals actually used to arrive at the final result,
% and the total number of subinterval area calculations made.

type One_Dim_D = array [ double_real ];
type Two_Dim_D = array [ One_Dim_D ];

% INSERT FUNCTION F HERE
%
%===============================================================

function f( x : double_real returns double_real )

15.0d0 * x * x * x - 6.375d0 * x * x + 11.9d0 * x - 33.999d0

end function %% f

%===============================================================

function trap_area( left_end, right_end : double_real
                    returns double_real )
% Calculates the area of a trapezoid
let     h     := right_end - left_end;
        f_avg := ( f(left_end) + f(right_end) )/2.0d0

in      h * f_avg

end let
end function %% trap_area

%===============================================================

function main( a, b : double_real;	         % Endpoints of integration
                                             %    interval
	              epsilon : double_real;   % tolerance
	              initial_n : integer;	   % Initial number of
                                             %    subintervals

               returns One_Dim_D,	         % values of approximation
                                             %    of definite integral
		                     integer,    % iteration counter
		                     integer )   % number of subintervals
                                             %    finally used
for initial
   N       := initial_n;
   Total_N := N;
   Cnt     := 1;
   H       := (b - a) / double_real(N);
   Area    := for i in 1, N
        		ai := a + double_real(i-1) * h;
        	    	bi := ai + h;
   		  returns value of sum trap_area( ai,  bi )
   		  end for
repeat
   N       := old N * 2;
   Total_N := old Total_N + N;
   Cnt     := old Cnt + 1;
   H       := old H / 2.0d0;
   Area    := for i in 1, N
        		ai := a + double_real(i-1) * h;
        	    	bi := ai + h;
   	         returns value of sum trap_area( ai, bi )
	         end for

until abs( area - old area ) < epsilon | Cnt > 12

returns array of Area
  	  value of Cnt
  	  value of N
        value of Total_N
end for

end function %% main
%

Looking at the commentary at the head of the program, and at the main 
function (function "main), we can see that the program is an integration 
approximator, that recursively refines an initial number of subdivisions of 
an integration interval by halving the width of each subinterval (doubling 
their number) and using the total area of the trapezoidal approximation for 
each subinterval as the definite integral. This process stops when either a 
sufficient number or recursive refinements have taken place, or when a 
convergence test is past. The arguments to the program are the x-values of 
the integration interval endpoints, the convergence tolerance, and the 
initial number if subintervals. The outputs are the integral approximation, 
the number of refinement steps or iterations taken, the number of 
subintervals finally use to produce the final answer, and the total number of 
subinterval area values calculated. The function being integrated must be 
inserted into the code of this simple sample program; there is one present 
already, a polynomial, whose integral can readily be calculated by hand as a 
check on the program's operation.

In structure, the program has a sequential outer loop, the initial clause of 
which computes the initial approximation, given the initial number of 
subintervals. In the loop body, the subintervals are refined and another 
approximation is calculated. The convergence test is at the bottom of the 
loop, and uses both the freshly computed approximation and the most recent 
previous one to determine convergence. The outer sequential loop contains a 
for-loop which ranges over the number of subintervals currently being used. 
In this loop a function named "trap_area" is called to calculate the area of 
one trapezoid. Function "f" is called by "trap_area" to calculate the 
function values at the trapezoidal endpoints. The doubling of the number of 
subintervals in each iteration of the outer loop causes the range of the for-
loop to double, increasing the amount of work that is exploitable for 
parallel execution. This doubling justifies the use for an iteration ceiling 
in the convergence test, to make sure the program does not run away with 
itself under pathological numerical conditions, and generate an unbounded 
amount of work, swamping the runtime system in the process. The code in the 
file accessed by the above anchor should be in shape for compilation. The 
contents of the file accessible below are sample inputs for the program, once 
it is ready to run.

18.2 Trapezoidal Rule Integrator Program Input

1.0d0 2.0d0 1.0d-6 10

The shell actions and results of compiling and executing the program can be 
seen by clicking on the following anchor:

18.3 Compiling and Executing the Trapezoidal Rule 
Integrator Program

unixbox> osc traprule2.sis.html
Production OSC version V12_9_1 
 LL Parse, using binary files
* Reading file: trap.sis...

version 1.8       (Mar 28, 1989)

 accepted
   82 lines in program
    0 errors ( calls to corrector)
    0 tokens inserted;    0 tokens deleted.
    0 semantic errors
if2mem:  W - array of ON LINE 77 OF main IN trap.sis MAY INTRODUCE COPYING

unixbox> s.out
SGI SISAL 1.2 V12_9_1 
1.0d0 2.0d0 1.0d-6 10
[ 1,11: # DRC=1 PRC=1
    2.532788e+01
    2.525147e+01
    2.523237e+01
    2.522759e+01
    2.522640e+01
    2.522610e+01
    2.522602e+01
    2.522601e+01
    2.522600e+01
    2.522600e+01
    2.522600e+01
]
11
10240
2047

The first line shows OSC being invoked, with no arguments other than the name 
of the source file. The Optimizing Sisal Compiler has a large number of 
options allowing in-depth control of the entire multi-phasic compilation and 
optimization process. These options are documented in the "OSC" man pages 
which are installed during the compiler installation process. The compilation 
proceeds through various stages, translating Sisal source into a data flow 
graph language, called IF1, translating that into an annotated memory 
management graph language called IF2, and finally producing C code; the 
target machine's compiler is then invoked to complete the process and leave 
an executable memory image, called "s.out" by default, containing the runtime 
system for I/O and parallel task management. The compiler can be ordered to 
return varying amounts of information to the terminal. Of particular interest 
are syntax error messages, which are not shown in this example.

When compilation is completed, execution is ordered by typing the name of the 
executable, which in this case is the default "s.out." Extensive options also 
exist for controlling the execution, and these, too, are documented in the 
"s.out" man pages which are automatically installed during compiler 
installation. In this example, no options are present. The runtime system 
presents a startup message to the terminal, and then waits for inputs from 
the Unix input channel "standard Input," since no file name was given. The 
inputs are typed in manually on the next line, and execution proceeds. Again, 
no output file name was given, output is sent to the Unix output channel 
"standard output", and appears on the terminal. The outputs are those 
specified in the program: (1) the array of subinterval approximations - 
eleven of them, since we asked for ten iterations to be taken; (2) the total 
number of approximations steps, including the first one in the initial clause 
of the outer loop;  (3) the total number of subintervals used to compute the 
final answer; and, (4) and the total number of area evaluations performed in 
the program's execution. The odd form of the array of area approximations is 
due to the fact that Sisal has a syntax for its inputs and outputs. This is 
because, since all objects are dynamic, their boundaries must be denoted for 
proper handling on input, and because one Sisal program's output is 
immediately suitable for use as input to other Sisal programs. The name of 
the simple I/O language is "Fibre," and more information on it can be found 
in the Fibre manual [47].

Next to observe are the results of more complex executions:

18.4 Executing the Trapezoidal Rule Integrator Program in 
Parallel

unixbox> cat > trap.inp
1.0d0 2.0d0 1.0d-6 10

unixbox> s.out trap.inp trap.out
SGI SISAL 1.2 V12_9_1 

unixbox> s.out -r -w1 -z inp
SGI SISAL 1.2 V12_9_1 

unixbox> s.out -r -w4 -z inp
SGI SISAL 1.2 V12_9_1 

unixbox> more s.info


  Workers   DsaSize  ExactFit  DsaHelps
        1  2000000b      100b         0

     MemW      MemU  LpSliceV   ArrayEx
    1040b     1256b         1       100

  CpuTime  WallTime    CpuUse
   0.0800    0.0800    100.0%


  Workers   DsaSize  ExactFit  DsaHelps
        4  2000000b      100b         0

     MemW      MemU  LpSliceV   ArrayEx
    1240b     1528b         4       100

  CpuTime  WallTime    CpuUse
   0.0200    0.0200    100.0%
   0.0200    0.0200    100.0%
   0.0200    0.0200    100.0%
   0.0200    0.0200    100.0%

In this file, we see the results of a parallel execution. First, the input is 
put into a file for convenience, using the Unix "cat" command. Next, 
execution is performed with the names of input and output files specified. 
Then execution is performed again with options "-r -w1 -z". These tell the 
runtime system to turn on execution performance monitoring, run with one 
worker process, and suppress output, respectively. The next execution uses 
the same options, except for "-w4", which specifies the use of four worker 
processes. This allows the exploitation of parallelism, with the runtime 
system slicing the available loops and partitioning the work to the worker 
processes. Finally, see the display of the execution info file, "s.info," 
through the use of the Unix "more" command. This file contains the worker 
process execution times and memory usages for the two executions made with 
the "-r" option specified. The sequential run shows the single-worker cpu 
time is shown as 0.08 seconds, and each of the worker cpu times for the four-
worker run are shown as 0.02 seconds. This indicates that the compiler and 
runtime system did a good job of exploiting the available parallelism, and 
the result was an execution speedup of four.

19. Sisal I/O

There remains one more topic to cover, that of dealing with I/O in Sisal 
programs. The general topic of I/O in functional programs is one of hot 
debate and current research. This is because reading and writing are 
inherently non-functional. These operations are in fact supposed to have 
side-effects, so they cannot be done within a program, at least not with the 
freedom to be found in imperative languages. While these matters are being 
addressed, and there are interim work-arounds for the most needed forms of 
I/O, the designers of Sisal 1.2 took a conservative tack in implementing the 
I/O facilities they added to the language. To preserve functional semantics, 
the I/O that can be done in Sisal is either a bit stiff or a bit uncertain. 
We will look at both of these areas in the following sections, starting with 
the formal and ending with the informal.

19.1 Sisal I/O: The Fibre Syntax

We have already seen how I/O is done in general in Sisal programs. The main 
function gets all the inputs as its arguments or parameters, and produces all 
the outputs as its results. This is workable for programs of the sort that we 
have seen in our examples - numerical, scientific ones that tend to transform 
their inputs into their outputs without much I/O needed during processing.
But what we haven't explored are effects of Sisal's dynamic objects on 
exactly how input data and output results are specified.

Since all arrays are of dynamic size, when one or more arrays are provided as 
inputs, they must be distinguishable from each other and from other inputs. 
If a number of values are provided, some of which are array elements, there 
is no way for the compiler or runtime systems to predict how many of them 
belong to the arrays. We solve this problem, maintaining the freedom to 
change input array sizes, by using a separate small language for the inputs 
(and outputs) of Sisal code. This language, called "Fibre", has its own small 
set of syntax rules that allow Sisal's I/O routines to determine the 
boundaries of aggregates. As one might imagine, this syntax is quite simple, 
and is completely documented in [47].

19.1.1 Scalars
In a nutshell, numbers of all types are specified the same as in program 
expressions (see sections 3 and 4). Boolean values are specified by the 
single unquoted characters T and F. Characters are specified by embedding 
them in single quote characters, such as 'Q'. (Nonprinting characters are 
specified using quoted character pairs that begin with the backslash 
character \; for details on this, the reader is referred to the Fibre 
manual.) Fibre also allows for commentary, in the form of text begun with the 
pound sign character # and continuing to the end of the line. The 
specification of aggregates is where the syntactic elements come into play.

19.1.2 Arrays
Arrays in Fibre look very much like they do in Sisal expressions. They begin 
with an opening square bracket, [, followed by a lower index bound; next is 
an optional comma and upper bound; then, a colon, to separate the bound(s) 
from the elements; the elements are separated by one or more spaces, and a 
final closing square bracket ] closes the array. Here are a few examples to 
illustrate this:

   [1: ]
   [4,2: ] 
   [0: 21 22 23 ]
   [ 1 : 1.0 2.0 3.0 4.0 ]
   [ 1, 4 : 1.0 2.0 3.0 4.0 ]
   [ 0, 5: 'h' 'e' 'l 'l' 'o' '!' ]
   

The first two arrays are empty; the first has lower bound 1 and upper bound 
0, by default; the second has specified bounds, but the lower is greater than 
the upper. The third one is an array of integers with lower bound 0 and 
(implied) upper bound 2. The fourth is an array of reals with lower bound 1 
and upper bound 4. The fifth is the same array as the fourth, but with the 
upper bound specified rather than implied. The sixth example is an array of 
characters. Obviously, the spacings of the brackets and bounds are free-form; 
white space is not significant and arrays may break across line boundaries.

Arrays of characters can be represented with a special string notation using 
double quotes, as follows:

        " Sisal is cool!"
is the same array as
        [1: ' ' 'S' 'i' 's' 'a' 'l' ' ' 'i' 's' ' ' 'c' 'o' 'o' 'l' '!' 
]<PRE>

Similarly, multidimensional arrays also look just like they do in program 
code. Since each element of a two-or-greater-dimensional array is itself an 
array, the above rules are applied recursively. Here are a few more examples 
of this:

   [1: [1:] ]
   [1: [ 1: 10 20 30 40]
       [ 3: 5 10 15 ]
       [ 2, 3: 101 102 ] ]
   [ 1, 2 : [ 1, 2 : 1d0 2.0d0 ] [ 3.0d0 4d0 ] ]
   [ 1, 2: [ 1: 'g' 'o' 'o' 'd' ] [ 1: 'b' 'y' 'e' ] ]
   [ 1: [ 1: [ 1: [ 1: 11 12 13 ]
                  [ 1: 21 22 23 ] ]
             [ 1: [1 : 31 32 33 ] ] ]
        [ 1: [ 1: [1: 41 42 43 ] ] ] ]

The first is a non-empty two-dimensional empty whose only element is an empty 
one-dimensional array. (This sort of thing can get confusing, so caution is 
urged!) The second example is a two-dimensional array of integers; note that 
the individual row arrays have different index ranges. The third example is a 
two-dimensional array of double_reals. The fourth is a two-dimensional array 
of characters. The last example shows a four-dimensional array of integers. 
Indents and line breaks, while optional, can help clarify complex aggregates.

19.1.2 Records
Record syntax involves using angle brackets, <>, to mark record boundaries, 
which causes the Fibre syntax to deviate from the Sisal syntax; in Sisal 
code, records are enclosed in square brackets. Field names are not needed, 
but the field values must be specified in order. So, given the declarations

   type complex = record [ re, im : real ];
   type label = record [ name, addr1, addr2, city, state : array [ char ];
                         zip : integer ]; 

 and a program whose outermost function expected two arguments, of type 
complex and label, respectively, we could have the following Fibre data:

<3.1416, 2.1828>
<"DeBoni" "LLNL" "P.O. Box 808" Livermore "CA" 94550>

19.1.3 Other aggregates
Sisal syntax also allows for two other kinds of aggregate data types, streams 
and unions. Since we didn't treat these types in this tutorial, we will not 
do so here, either. The reader is referred to the Sisal Language Manual [33] 
and the Fibre manual [47] for more information.

19.2 Using Fibre

Fibre is the syntax for both input and output. When executing a program, as 
we saw in section 18, data can be provided directly from the keyboard, and 
execution will not begin until the Fibre parser recognizes all the values it 
was told by the compiler to expect. or, it can be put into a file and 
directed to the executable, as we also saw. This is generally the most 
convenient method, since most programs have more than just a few input 
values. But there's also another virtue to doing it this way. If input and 
output come from and go to files, the output file from one run can easily 
become the input file for a subsequent one. This requires forethought in 
program design, to insure that everything needed as input is also produced as 
output, but it's a good ides to return all major inputs as outputs, anyway, 
for purposes of documenting the run.

The only real downside to using Fibre is the fact that there currently are 
only minimal facilities  for formatting output in Sisal's support software. 
This means that the user is often presented with array outputs with one 
element per line. Some editing is usually required to change output files 
into readable form, and sometimes the number of lines can make this 
troublesome. However, there is a very useful trick that one of the Sisal 
users came up with, which deserves mentioning, even thought its details are 
outside the scope of this tutorial. So we will describe it briefly in the 
next section, and let the interested programmer experiment or ask us for 
further guidance on it.

19.3 Sisal I/O: The Real World

While the Fibre I/O language is simple and useful, it tends to feel 
incomplete on first exposure, sine the requirements of real-world programming 
and its needs for I/O services are generally more complicated. Some of this 
can be addressed, but a significant portion is a set of unanswered questions. 
Let us briefly describe these matters and then go on to some real-world 
expedients we can resort to, to get around them.

19.3.1 Getting Output
Real scientific and numerical programs tend not to produce a single result at 
termination time. Rather, they usually produce output continuously, as they 
proceed toward termination. for at least two purposes. First, they often 
generate a sequence of partial answers, which either aggregate to or point to 
the complete result. This can take the form of a series of approximations to 
a final result, as in convergence algorithms; or, it can be a sequence of 
completely evolved simulated systems, at intermediate steps between 
initiation and completion. In either case, the intermediate results are of 
interest for two reasons. They can be useful in charting the progress of a 
program and perhaps identifying where it goes wrong (or gets interesting in 
unexpected ways); and, they can prevent having to restart a lengthy 
simulation run at the beginning, after an abnormal termination due to 
hardware problems.

In both cases, it is a good idea to produce all the intermediate values as 
well as the final one. In Sisal, the typical technical program is composed as 
a sequential outer loop containing a set of parallel loops. The outermost 
loop steps the algorithm, using the results of one step as input to the next. 
It is no greater problem to return "array of system", that it is to return 
"value of system" at the bottom of the outer loop, and this is an obvious way 
to produce the intermediate steps' values. However, this array result will 
still not appear until the program terminates, since it is not fully formed 
until then, so this is at best a partial solution. Further, it adds to the 
problem by causing the answers produced by such programs to require a 
potentially unbounded amount of space to store them as they are built.

19.3.2 Stream Output
The "stream" data type was originally designed to provide another part of the 
solution. Streams, unlike arrays, are produced sequentially by elements, not 
all at once. This allows the storage requirements to be bounded by the size 
of a small number of the stream elements ("system", in our current example ). 
If the outer sequential loop of a program were to return "stream of system", 
then in theory, each element "system" would be available as it was completed 
by an iteration of the loop. However, in the current implementation of the 
Optimizing Sisal Compiler and Runtime System, streams are not built this way; 
instead they are treated as arrays, for practical reasons involving program 
performance,  optimization, and efficiency. A limited form of stream data 
types is now being designed for reimplementation, and future versions of the 
compiler should handle them acceptably. For now, streams can be used, but 
will act as arrays. So the problem remains unanswered.

19.3.3 Peek Output
However, the needs for intermediate output, added to the requirements of 
debugging Sisal programs, stimulated the introduction of a "meta-feature" 
into Sisal, called "peek". This is a primitive that serves no purpose in the 
language, and is used in programs only to provide quick-and-dirty output. 
Basically, peek is called with a single argument, any value name or 
expression, and it returns an integer which can be subsequently ignored. Peek 
is used not for its results, though, but for its side effects, which are to 
write the value of its argument to the Unix output file called "standard 
error." This normally appears at the terminal, but can also be directed into 
a file along with other program output. We will offer no examples of peek's 
use here, because it is dangerous, not in changing program results or causing 
abnormal terminations, but because calls to peek are sometimes removed by the 
optimization stages of the compiler. This is because its results are usually 
not used, and so have no effect on program execution. When they survive 
optimization, the calls to peek are often reordered, along with other 
sections of code, so results sometimes do not come out in the order 
expected.Still, when used with discipline, for simple debugging or saving 
intermediate values, peek can be useful and even dependable.

19.3.4 Advanced Output Trickery
The I/O trick we alluded to above, for producing intermediate results in 
intelligible form, is this. Sisal has a very advanced feature, called the 
Foreign Language Interface, that allows what we call "mixed language 
programming." This allows the integration of Sisal code with Fortran and/or 
C, with appropriate data being passed into and out of the Sisal runtime 
system. The data so passed is NOT in Fibre form, which allows I/O to be done 
with relative ease from outside of Sisal. The very clever user we mentioned 
above figured out how to invoke X-Windows display software from within an 
outer program shell written in C. He then let the outer program invoke a 
Sisal program to solve a big problem with parallel execution, and return 
solution data to the outer program. The outer program then displayed the 
answers using X graphics. Carried a bit further, the outer sequential loop 
could be written in the outer program, so that the intermediate results of 
each invocation of the Sisal program could be displayed as soon as they were 
returned from the Sisal portion of the program.

We strongly believe that using multiple languages, at various levels within a 
program's structure, to do work appropriate for the language and level 
represents a strong approach to the problem of parallel programming in 
general. It also adds to the programmer's burden the need for fluency in 
multiple languages as well as the interfaces among them. But this is grist 
for the research mill, and we believe that acceptable solutions are in the 
offing.

20. Exercise Set 8

These exercises are more extensive and less well defined than those given 
previously, to offer the reader more freedom in adapting them to her or his 
immediate purposes in learning the Sisal language. Also, no suggested answers 
are provided for them. However, the Sisal project staff stands ready to 
assist sincere efforts. To reach us, see the section titled "23. For More 
Information..."

Exercise 8.1: Computing Pi
Assuming that you already have a good machine-precision value for Pi, write a 
program that will determine what value of N will provide equivalent precision 
on your computer, using the classical expression

             n  (-1)^(i+1)
     pi/4 = sum ----------
            i=1   (2i-1)

Exercise 8.2 Computing the Mandelbrot Set
Given the Mandelbrot formula
        z(n+1) = z(n)**2 + c,
write a program that accepts coordinates of a rectangular region of the 
complex plane, a size for each dimension in points, and a value for c as 
inputs, and then iterates every point in your region until |z| > 2, counting 
the iterations needed for each point. The output should be the array of 
iteration counts.

Exercise 8.3 Bubble Sort
Write a program that sorts a set of values via the bubble sort method. This 
method iteratively compares adjacent pairs of values, swapping them is the 
smaller value has the higher index value; this causes the smaller (or larger) 
values to bubble up (or down) depending on which direction through the index 
range the search is started at. This iteration occurs N-1 times, for a set of 
N values being sorted. A standard refinement has the final index position 
treated removed from the next phase, since it has received its correct value, 
already; so, each phase is one position shorter than the previous one.

Exercise 8.4 Compound Interest with Increasing Principal
The formula for the value of a savings account with a principal p, which is 
invested at compound interest, with an annual rate r, for m years, compounded 
n times per year is
     c * (1+r/n)**(m/n).
Assuming a savings plan which takes the principal p by amount s very year, 
write a program that takes p, r, m, n, s , and the number of years, y, as 
inputs, and computes the total value of the savings program at the end of 
each year.

Exercise 8.5 Prime Factorizations
Every integer has a unique factorization into primes. Given that the largest 
prime factor of any integer will be one half its value, and given also a 
large list of primes which you supply as input, write a program that, given 
an argument number, generates and array of counts corresponding to the input 
list of factors, where each count is the number or times the corresponding 
prime occurs in the prime factorization of the argument number. For instance, 
the input list of primes might be [1: 2 3 5 7 11 13 17 19 23 ], and the 
argument number and their factorizations might be 
        46 and [1: 1 0 0 0 0 0 0 0 1 ]
        42 and [1: 1 1 0 1 0 0 0 0 0 ]
        24 and [1: 3 1 0 0 0 0 0 0 0 ]

You might want to take an array of numbers as input and return an array of 
factorizations. You might also want to calculate the array or primes, to get 
values that reach or exceed half the argument number.

Exercise 8.6 Triangular System Solution
Given an upper triangular coefficient matrix for a linear system, along with 
a vector of right-hand side values for the system, write a program that 
solves for the values of the unknowns using the back-substitution method. For 
instance, for a 4 by 4 system the last
unknown is calculated as
     x(4 ) = b(4)/ a(4,4); and
     x(3) = 1/a(3,3)*(b(3)-x(4*a(4,4).
Your program should be general enough for any size system, even one that is 
over-specified (has more equations than unknowns), and the output should the 
be array of values for the unknowns.

21. Advanced Features

Sisal has a number of other features which deserve mention here, even though 
detailed description of them is beyond the scope of this tutorial. We will 
briefly describe several of them, and give a reference to a document or 
person to use as a source for further information.

21.1 Union Types

In order to produce data structures that can take on a recursive structure, 
such as lists and trees, it is necessary that Sisal have a data structuring 
ability equivalent to Pascal's variant records. Sisal's equivalent of this is 
the Union Type; it allows multiple forms of data to occur under a single 
type, identified by a tag that can be tested by user code. This allows, for 
instance, a tree to be constructed of a union type consisting of internal 
nodes or leaf nodes. For more information on Union Types, see the Sisal 
Language Manual [33].

21.2 Streams

Sisal includes a data type that allows producers and consumers of information 
to execute simultaneously on a shared structure, called a Stream. Streams are 
homogeneous aggregates, like arrays, except that they can be accessed only 
for their first element or all but their first element. This allows the 
construction of programs that operate on continuous values to or from other 
programs or outside environments. For more information on Streams, see the 
Sisal Language Manual [33].

21.3 Debugging

Debugging functional programs, despite the tendency to make fewer mistakes in 
composing them, tends to be somewhat difficult, mainly because functional 
language semantics makes input and output difficult, at best. However, we 
have developed a state-of-the-art symbolic debugging system, incorporating 
its own compiler and graph interpreter. This debugging system is called 
Twine, and it is supported as an experimental tool. For more information on 
Twine, see [36].

21.4 Mixed Language Programming

It is possible to merge programs in Fortran or C with Sisal code, in order to 
best use the features of each language, via a facility called the Foreign 
Language Interface. For instance, Sisal can be used to exploit parallelism in 
an algorithm, while the imperative languages can be used for I/O or access to 
special purpose libraries. For more information on mixed language programming 
and the Foreign Language Interface, see [10].

21.5 Distributed Memory Sisal

The currently available version of the Optimizing Sisal Compiler (OSC) and 
Runtime System support parallelism only on computer systems having a shared 
memory programming model. These include such common computers as 
multiprocessor Sparcs, SGI's, Crays, and even a few machines with physically 
distributed memory that can be used in a shared address space mode, such as 
the BBN TC-2000, nCube, and Encore Multimax. However, prototype versions of 
Sisal exist for machines with only distributed memory programming models, 
such as the Thinking Machines CM-5 and Cray T3D. For more information on 
Distributed Memory Sisal, see [1].

21.6 Newer Versions of Sisal

Since the currently available Sisal language, Sisal 1.2, is based on a design 
that is approaching ten years of age, we are continuously exploring 
improvements and enhancements to it as an exercise in language design. 
However, our intentions are to produce a working programming language with a 
high performance compiler and support software. To this end, we produced a 
paper language, Sisal 2.0, to foster debate as to what features were 
appropriate and useful for a real language. For more information on Sisal 
2.0, see [4]. We are also working on a much expanded version of Sisal, called 
Sisal-90, that will arrive on the scene sometime in the next year, with a 
compiler and other support software. This new version will not only contain 
more modern features, but will be much more expandable that the current 
version, while retaining enough syntactic similarity with the Sisal 1.2 to 
allow some automatic translation assistance. It will continue to use our 
intermediate form languages, IF1 and IF2, and will continue to utilize the 
powerful optimization back-end of our current compiler. The Sisal-90 manual 
is in preparation; for more information on Sisal-90, contact the author of 
this tutorial, or see the section titled "23. For More Information...".

21.7 Performance Monitoring Tools

Developing parallel programs seems to be only the first part of the parallel 
programming problem. After the program is running correctly, it must be tuned 
for efficient utilization of machine resources, such as memory and 
communications bandwidth. To support this, we are working on experimental 
tools designed to provide the programmer feedback as to how a program is 
running on a parallel system. To learn more about this effort, contact the 
author of this tutorial, or see the section titled "23. For More 
Information...".

22. How to Get the Sisal Software

The currently released versions of the Sisal compiler and runtime system 
support software is available from the anonymous ftp server operated by the 
Computer Research Group at Lawrence Livermore National Laboratory. This 
server contains directories and compressed tar files of Sisal software, 
source programs, and documentation, and is located at 
"ftp://sisal.llnl.gov/pub/sisal".

In addition, we maintain a number of mailing lists for interested parties and 
Sisal developers:

sisal-info@sisal.llnl.gov    : general sisal-related information (BROADCAST)
sisal-request@sisal.llnl.gov : requests to added or deleted from sisal-info
sisal-bugs@sisal.llnl.gov    : to report sisal software bugs
ssci-help@sisal.llnl.gov     : SSCI consulting help

23. For More Information...

The Sisal Project Home page is located at "http://www.llnl.gov/sisal".

The following is a partial bibliography of our publications on Sisal and 
related topics. Papers may be ordered directly from

Judy Michels
P. O. Box 808, L-417
Lawrence Livermore National Laboratory
Livermore, CA 94551
(510) 422-4236
judy@diego.llnl.gov

or from either of

John Feo at (510) 422-6389 or feo@diego.llnl.gov, or
Tom DeBoni at (510) 423-3793 or deboni@llnl.gov.

23.1 Sisal Publication Bibliography

[1] Beard, Patrick; An Implementation of SISAL for Distributed-Memory 
Architectures; Jun-95;

[2] Cann, D.; Feo, J.; Sisal versus FORTRAN: A Comparison Using the Livermore 
Loops; Apr-90; UCRL-102263 (2);

[3] Cann, D.; Feo, J.; Bohm, W.; Oldehoeft, R.; SISAL: Toward Resolving the 
Parallel Programming Crisis; May-92; UCRL-JC-109774;

[4] Cann, D.; Feo, J.; Bohm, W.; Oldehoeft, R.; The SISAL 2.0 Reference 
Manual; Dec-91; UCRL-MA-109098;

[5] Cann, D.; Feo, J.; Bohm, W.; Oldehoeft, R.; A Guide to the Optimizing 
SISAL Compiler; Sep-91; UCRL-MA-108369;

[6] Cann, D.; Feo, J.; DeBoni, T.; SISAL 1.2: High Performance Applicative 
Computing; May-90; UCRL-JC-103980;

[7] Cann, D.; Wolski, R.; Feo, J.; Parallel Functional Computation: Current 
Results and Observations; Jun-91; UCRL-JC-107022;

[8] Cann, David; SISAL: 1.2 A Brief Introduction and Tutorial; May-92 UCRL-
MA-110620;

[9] Cann, David; Framework Preconstruction and Aggregate Storage Subsumption 
Optimizations for Sisal 1.2; May-92; UCRL-MA-110701;

[10] Cann, David; The Optimizing SISAL Compiler: Version 12.0; Apr-92; UCRL-
MA-110080;

[11] Cann, David; Retire Fortran? A Debate Rekindled; Mar-92; UCRL-JC-107018 
(2);

[12] Cann, David; Retire Fortran? A Debate Rekindled; Jul-91; UCRL-JC-
107018(1);

[13] Cann, David; Retire Fortran? A Debate Rekindled; Apr-91; UCRL-JC-107018;

[14] Cann, David; Vectorization of an Applicative Language: Current Results 
and Future Directions; Nov-90; UCRL-JC-105654;

[15] Cann, David; Compilation Techniques for High Performance Applicative 
Computation; May-89; CS-89-108;

[16] Cann, David; Sisal Multiprocessing Support (July 1987); Jul-87; UCID-
21115;

[17] Cann, David; Feo, John; Sisal versus FORTRAN: A Comparison Using the 
Livermore Loops; Nov-89; UCRL-102263 (1);
[18] Denton, Scott; Optimizing Parallel Reduction Operations;

[19] Feo, J.; Cann, D.; Oldehoeft, R.; A Report on the Sisal Language 
Project; UCRL-102440 (1);

[20] Feo, J.; DeBoni, T.; Caffey, H.; Hausheer, F.; Developing Molecular 
Dynamics Simulation Codes Using Mixed Language Programming; May-94; UCRL-JC-
116641;

[21] Feo, J.; DeBoni, T.; Rodrigue, G.; Muller, J.; Implementation and 
Performance of Domain Decomposition Algorithm in Sisal; Sep-93; UCRL-JC-
115029;

[22] Feo, John; Sisal; The Salishan Problems in Sisal; Jul-92; UCRL-JC-
110915;

[23] Feo, John; Arrays in Sisal; Sep-90 UCRL-JC-106081;

[24] Feo, John; An Analysis of the Computational and Parallel Complexity of 
the Livermore Loops; Parallel Computing; Jan-88;

[25] Feo, John; The Livermore Loops in Sisal; Aug-87; UCID-21159;

[26] Feo, John; DeBoni, Thomas; A Sisal Workshop (Ohio Supercomputer Center);

[27] Grit, D; McGraw, J.; Programming Divide and Conquer on a Multiprocessor; 
May-83; UCRL-88710;

[28] Hendrickson, Chris; Programming A Real Code In A Functional Language 
(Part 1); Sep-91; UCRL-JC-108326;

[29] Lee, C.; Skedzielewski, Steve; On the Implementation of Applicative 
Languages on Shared-Memory Mimd Multiprocessors; Apr-88; UCRL-97980;

[30] Lee, Ching-Cheng; Experience of Implementing Applicative Parallelism on 
Cray X-MP; May-88; UCRL-98303;

[31] (1); McGraw, James; The Val Language - Description and Analysis; Dec-80; 
UCRL-83251;

[32] McGraw, Jim; Parallel Functional Programming in Sisal: Fictions, Facts 
and Future; Jul-93; UCRL-JC-114360;

[33] McGraw, Jim; Skedzielewski, Steve; Sisal Streams and Iteration in A 
single-Assignment Language Version 1.2; Mar-85; M-146 (1);

[34] McGraw, Jim; Skedzielewski, Steve; Sisal Streams and Iteration in A 
single-Assignment Language Version 1.1; Jul-83; M-146;

[35] McGraw, Jim; Skedzielewski, Steve; Streams and Iteration in Val; 
Additions to a Data Flow Language; Mar-82; UCRL-87414;

[36] Miller, Patrick; Cedeno, W.; A User's Guide to Twine; Jul-93;

[37] Miller, Patrick; The Efficient Implementation of Error Values in 
Functional Languages; Nov-91; UCRL-JC-105794;

[38] Miller, Patrick; Implementing Error Values in Applicative Languages; 
Aug-91; UCRL-JC 106539;

[39] Miller, Patrick; Twineman - Man Pages for the Twine Debugging System;

[40] Miller, Patrick; Implementing Error-valued Semantics for Dataflow 
Languages on Imperative Machines; UCRL-JC-111550;

[41] Oldehoeft, Rod; Mixed Applicative and Imperative Programs; Feb-87; UCRL-
96244;

[42] Ranelletti, John; Graph Transformation Algorithms for Array Memory 
Optimization in Applicative Languages (Ph.D. Thesis); Nov-87; UCRL-53832;

[43] Sarkar, Vivek; Partitioning and Scheduling Parallel Programs for 
execution on Multiprocessors; Apr-87; CSL-TR-87-328;

[44] Sarkar, Vivek; Compile-time Partitioning and Scheduling of Parallel 
Programs;

[45] Sarkar, Vivek; Cann, David; POSC - a Partitioning and Optimizing Sisal 
Compiler; Apr-90; UCRL-102737 (1);

[46] Sarkar, Vivek; Skedzielewski, Steve; An Automatically Partitioning 
Compiler for Sisal; Dec-88; UCRL-98289;

[47] Skedzielewski, Steve; Yates, Robert; Fibre: An External Format for SISAL 
and IF1 Data Objects, Version 1.1; Apr-88; M-154, Rev. 1;

[48] Skedzielewski, Steve; Data Flow Graph Optimization in IF1 (Dec. 2, 
1987); Dec-87; UCRL-92122 (1);

[49] Skedzielewski, Steve; DI: An Interactive Debugging Interpreter for 
Applicative Languages; Mar-87; UCRL-95709;

[50] Skedzielewski, Steve; IF1 An Intermediate Form for Applicative 
Languages, Version 1.0; Jul-85; M-170;

[51] Skedzielewski, Steve; A Simple Method to Remove Reference Counting in 
Applicative Programs; UCRL-100156;

[52] Welcome, Mike; IF2 An Applicative Language Intermediate Form with 
Explicit Memory Management; Dec-86; M-195;

[53] Wolski, R.; Feo, J.; Cann, D.; A Prototype Functional Langauge 
Implementation for Hierarchical-Memory Architectures; Jun-91; UCRL-JC-107437 
(1);

[54] Wolski, R.; Feo, J.; Cann, D.; Implementing Functional Languages to 
Exploit Locality; Jun-91; UCRL-JC-107491;

[55] Wolski, Richard; Feo, John; Program Partitioning for NUMA Multiprocessor 
Computer Systems; Oct-92; UCRL-JC-112183;

[56] Yates, Robert; Di Tutorial; Apr-88; UCID-21393;

[57] Proceedings of the Third Sisal User's Conference; Oct-93; CONF-9310206;

[58] Proceedings of the Second Sisal User's Conference; Oct-92; CONF-9210270;

[59] LLNL SISAL Compiler User's Manual, Version 1.1; Aug-86; M-191;