A Tutorial Introduction to Sisal A High Performance, Portable, Parallel Programming Language The mission of the Sisal Language Project is to develop high-performance functional compilers and runtime systems to simplify the process of writing scientific programs on parallel supercomputers and to help programmers develop functional scientific applications. If you have any questions about this page, or want more information about the Sisal Language Project, contact: John Feo at (510) 422-6389 or feo@diego.llnl.gov, or Tom DeBoni at (510) 423-3793 or deboni@llnl.gov. The Sisal Language Project has been approved as a Designated Unclassified Subject Area (DUSA) K19222, as of 07 August, 1991. Table of Contents 1. Introduction - What is Sisal? 2. Parallel Programming and Other Virtues 4 3. Expression Syntax 6 4. More Syntax 10 5. Exercise Set 1: Syntax Elements 12 6. User-Defined Types and Dynamic Values 14 7. Exercise Set 2: Aggregate Types 19 8. Loops and Parallelism 21 9. Exercise Set 3: Basic Loops 23 10. More on Loops 25 11. Exercise Set 4: Advanced Loops 29 12. Advanced Array Operations 31 13. Exercise Set 5: Advanced Array Operations 32 14. Functions and Programs 33 15. Exercise Set 6: Functions and Programs 38 16. Sequential Looping with For-Initial Loops 41 17. Exercise Set 7: Sequential Loops 47 18. Compiling and Running Sisal Programs 52 19. Sisal I/O 58 20. Exercise Set 8: Programs 63 21. Advanced Features 64 22. How to Get the Sisal Software 66 23. For More Information... 66 DISCLAIMERS NOTICE: Information from this server resides on a computer system funded by the U.S. Department of Energy. Anyone using this system consents to monitoring of this use by system or security personnel. DISCLAIMER OF LIABILITY: With respect to documents available from this server, neither the United States Government nor the University of California nor any of their employees, makes any warranty, express or implied, including the warranties of merchantability and fitness for a particular purpose, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. DISCLAIMER OF ENDORSEMENT: Reference herein to any specific commercial products, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or the University of California, and shall not be used for advertising or product endorsement purposes. COPYRIGHT STATUS: LLNL authored documents are sponsored by the U.S. Department of Energy under Contract W-7405-Eng-48. Accordingly, the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce these documents, or allow others to do so, for U.S. Government purposes. All documents available from this server may be protected under the U.S. and Foreign Copyright Laws. Permission to reproduce may be required. 1. Introduction - What is Sisal? Sisal is a functional parallel programming language. It combines modern language features with readable syntax and mathematically sound semantic foundations to provide the easiest vehicle for parallel programming. Its optimizing compiler and runtime support system software provide portability, high performance, and determinate execution behavior. It is available on all Unix-based uniprocessor and shared memory multiprocessor systems, and developmental versions exist for several distributed memory systems, as well. Why do we think you should be interested in Sisal? Anyone who's written a computer program knows the extent to which attention to detail is required in order to succeed at it. Much of this detail is due to factors that are not present in the problem being solved, arising from what computer scientists would call the "model of computation" that underlies the programming language being used. Unfortunately, the standard computational model on which most languages are based is not an easy one to deal with. The primal operations are quite simple, and their order is a prime determinant of the outcome of a program. On the other hand, most scientific or technical programming is based on mathematics, which is quite powerful, and in a typical "statement" of mathematics, such as a derivation, the order of the expressions does not affect the truth of the ultimate equation. It seems reasonable, therefore, that a programming language based on mathematics could offer some leverage in the programming process - make the programmer's job simpler and the expression of the ideas in the program more straightforward. That is exactly what this text deals with - a programming language that is closer to mathematics than most, called Sisal. Sisal is a simple language. Mathematicians tend to like it. However, if you have any experience programming in languages like Fortran, Cobol, C, or Pascal, there is a change of mind-set necessary in learning to use it. The effort is not great, and we think the results are well worth it. Call Our Bluff! Read the following code and decide what you think it does. Then look at the answer underneath it to see if you're correct. define main type OneDim = array [ real ]; type TwoDim = array [ OneDim ]; function generate( n : integer returns TwoDim, TwoDim ) for i in 1, n cross j in 1, n returns array of real(i)/real(j) array of real(i)*real(j) end for end function % generate function doit( n : integer; A, B : TwoDim returns TwoDim ) for i in 1, n cross j in 1, n c := for k in 1, n t := A[i,k] * B[k,j] returns value of sum t end for returns array of c end for end function % doit function main( n : integer returns TwoDim ) let A, B := generate( n ) in doit( n, A, B ) end let end function % main The Answer: If you determined that the code on the previous page generated and then multiplied two n x n matrices, you were correct. The details of function generate are unimportant; we simply needed two matrices, so we built them there, according to a completely ad hoc formula. Function doit multiplies them. Function main simply invokes the other two functions to accomplish this. So What's the Point? The point is twofold. First, you probably recognized what the code does, which shows that Sisal is easy to read and expressive. Second, the program, which runs on just about anything from a PC to a Cray, is not just a program, but a parallel program. The fact that there's no reference in it to execution environment, number of processors, or any other machine feature, nor to any sort of parallelization primitive, shows that Sisal is able to automatically manage the parallelism in your program and the resources of the system executing it. It's no idle boast to say that this program will give linear speedups on a Cray C-90, when a large matrix size is used. 2. Parallel Programming and Other Virtues It's been said that programming computers is the most difficult abstract endeavor ever attempted by human beings. Those who have written complex programs, such as large scientific modeling codes, can certainly attest to this. Now, however, we are attempting to do something even harder, namely write programs for concurrent execution on multiple processors. Parallelism adds a new dimension to those already present within programs. In a sequential program, the order of operations and the locations of data are specified, albeit often implicitly thanks to the abstractions available in modern programming languages. Parallelism, or what operations can go on simultaneously, and how they can interact, is the third dimension. The added complexity of parallelism can be staggering. That's why we advocate programming in a language with mathematical foundations. Such a formal basis allows compilers and run-time support systems to correctly analyze the program code and make intelligent decisions about what is safe to execute in parallel. And safety, in the form of program correctness, is of paramount importance, here. The number of ways programs can go wrong undetectably is already very large. Parallelism adds the possibility of "race conditions", or time- dependent behavior, to our problems. Such misbehaviors may be extremely difficult to find, because they may not show up in every execution, every execution environment, or on every platform. Mathematical semantics and careful analysis can either spot such dangers or guarantee that they cannot happen. This is the case with Sisal. In fact, Sisal carries a very important guarantee: Sisal programs are determinate, regardless of platform or environment. This means that a Sisal program will produce the same answers, regardless of what computer it executes on, or how many processors (how much parallelism) it uses. This is because the Sisal language is based on mathematical principles that give syntactically correct Sisal programs very definite formal meanings. If the Sisal compiler accepts a program, it has extracted complete information about what the program does and how its parts interact. In languages such as Fortran or C, compilers typically "understand" what the parts do, but not how they interact. The Sisal compiler is able to use its complete information to automatically reason about how execute the program, without possibility of order-dependent or time-dependent errors. This enables the parallelism in the program to be exploited with certainty of correctness. Determinacy, and the automatic management of parallelism, remove a huge burden from the shoulders of the programmer. But these are not Sisal's only advantages. It is also a very good language for rapid prototyping. Sisal programs tend to be shorter, and easier to read and modify, than those in other languages. Sisal programmers tend to be able to write code faster, since they must describe only what the computation is, not the order it must proceed in. Another very important advantage is that Sisal code can be mixed with C or Fortran, to effect hybrid applications, where each language can be used for the parts of the work it excels at. For instance, one might chose to use mathematical solver libraries written in Fortran from within the parallel kernel of a Sisal program. Similarly, one might compose the computational kernel of a program in Sisal, so it can deal with the parallelism present in the application, while keeping the outer shell in another language, for purposes of user interaction, problem setup, and file I/O. We will attempt to demonstrate many of these claims in subsequent sections of this tutorial, and references will be provided to other sources for further information. 3. Sisal Syntax: Expressions All Sisal statements are expressions, in exactly the same way that "2+3" is one; it has a well-defined value that can be substituted for the original expression with no loss of meaning. Each statement of the Sisal language, in actual, correct use in a program, has a well-defined value. We say the statements "return" these values, as the result of they're having executed; but the truth is, the program statements and the values they return, are exactly equivalent to each other. Exchanging one for the other has no effect on the results of the rest of the program. This idea may not sound strange, but in fact it does not hold in all programming languages; some, such as Fortran and C, allow program expressions or statements to have "side effects" in addition to their explicit values. Sisal is a "functional" language, and its property of "referential transparency" prevents side effects and guarantees a specific and well- defined meaning for each statement. The principle of referential transparency derives from mathematics. In mathematical derivations, there is often much substitution of equivalent expressions for each other, in the search for the most concise or useful rendering of a truth. Coding in Sisal is very much like doing math, in its reliance on functions and substitutable expressions, and there is usually a straightforward transformation from mathematical notation to Sisal code. This will become clearer as we explore the nature of the language. 3.1 Expressions All Sisal statements are expressions, and all expressions evaluate to well- defined values. An expression is a combination of identifiers and operators that is equivalent to a value. Examples of Sisal expressions range from the obvious to the arcane. All of the following are correct Sisal expressions: 1.3 26 "hello world" 2 + 3 * 2 1.5 * 3.14159 / 2.0 These illustrate arithmetic operations on the usual sorts of numeric values, and they work just like intuition says they should. Operational precedence in Sisal is just what it is in other languages, which is to say that the value of the fourth expression above is eight, not ten. Mixing modes or operand types is not correct in Sisal. The following are not correct Sisal expressions: 1.3 + 26 5 * "good-bye" 3.14159 / 2 Two possible corrections for the first one above are 1.3 + 26.0 1.3 + real(26) and two for the third one above are 3.14159 * 0.5 3.14159 / 2.0 There is no possible correct way to multiply a character string by an integer, so the second expression above is uncorrectable. All Sisal statements are expressions; each returns one or more values, and each can be used within any other Sisal expression. For instance, the following will look familiar: Fibonacci(2 + 3) + Factorial(4 * 5) These are, of course, merely calls to (presumably) user-defined functions, with expressions for arguments; one also presumes they each return a single numeric value and can thus be added. Such function and expression nesting can be carried on to any depth, and the result of evaluating an expression or function call can be substituted at the site of invocation without changing the resulting value. This is the principle of "referential transparency", the mathematical foundation we have alluded to. This principle is all-important, and we will refer to it often. Sisal expressions are quite flexible. For example, the following is also a legal Sisal expression: 1 + (if X > Y then 2 else 3 end if) Here, we have put parentheses in to simplify reading the statement, but they are not required. The if-statement compares the magnitudes of X and Y to determine which of two values to return. The value of the if-statement is then added to 1 to produce the value of the overall expression. The value of this entire expression cannot be determined without the values of X and Y, but it will have a definite value if they are defined. All this said, it remains to describe the elements that can make up Sisal expressions and then go on to the statements available to the programmer. This is available in detail in the Sisal language manual, [33]. Here we will merely give some general rules. 3.2 Expression rules 0) All legal Sisal expressions have values, and all values have types. The basic arithmetic types are predefined and available, as discussed below. There are also predefined Boolean and character types. Programmers can also make up their own data types, as in other languages, such as Pascal, but we will save discussing user-defined data types for a later section. 1) The normal alphabetic letters and numbers, along with the underscore character, can be used to make up Sisal identifiers. anIdentifier Another_identifier 2) Identifiers can represent single values or function calls, and their names are significant up to 31 characters. a_value := calculate_function(x,y,z) 3) Case is not significant. The_Identifier is the same name as tHE_iDENTIFIER 4) Numerical values are integers, reals, or double_reals. Integers do not contain decimal points. Reals either contain decimal points or are expressed in scientific notation, with the letter "e" denoting the exponent of ten. Double_reals must be expressed in scientific notation, and must have the letter "d" denoting the exponent. The numeric literals 1. 1.0 1e0 1.e0 1.0e0 are all reals, and the literals 1d0 1.d0 1.0d0 are all double_reals. 5) Mixed mode arithmetic is forbidden. Explicit type casting is required. The casting operators are simply the names of the desired result types, used as functions. For example, to cast the integer value 10 into types real and double_real, use real(10) and double_real(10), respectively. To cast reals and double_reals into integers, there are three functions: integer, trunc, and round. Their operation is described in the Sisal language manual. The differences among them are in the way they treat signed operands. 6) The standard arithmetic operators +, -, *, and / are available for numeric values and expressions. The divide operator produces truncated integer results when applied to integers. Raising a number to a power is accomplished by way of the "exp(b,e)" function, which raises the value "b" to the power "e". The two arguments to exp may be any combination of numeric types. The type of the result will be the higher type of the two arguments, where double_real is higher than real which is higher than integer 7) The standard mathematical library functions are available in Sisal via an underlying C math library. These include trigonometric, exponential, and logarithmic functions. Variants are available for all numeric types. 8) Boolean values and expressions are possible (and common) in Sisal. The primitive Boolean elements are "true" and "false". The Boolean operators are "&" (and), "|" (or), and "~" (not). 9) Logical operations producing Boolean values are available in Sisal. The logical operators are > (greater than) >= (greater than or equal to) < (less than) <= (less than or equal to) = (equal to) ~= (not equal to) 10) Character and string values, expressions, and aggregates are possible in Sisal, and their correct uses are described in the Sisal language manual. The above syntax resembles that of many modern programming languages. Other parts of Sisal syntax, such as if-statements, will also look familiar. However, the style in which such statements are combined will deviate slightly from expectations, so the next aspects of Sisal are areas in which it deviates significantly enough from more common imperative languages that they deserve sections for themselves. 4. More Syntax Expressions in Sisal go beyond arithmetic sort, as we have already seen; mathematical and computer science considerations arise. For instance, the mathematical notion of a conditional expression - one whose value varies based on a predicate - is very useful. Also useful is the idea of nested scope; scope is a concept applied to names that are associated with values, and it specifies the region in which such an association is valid. We will describe the application of these ideas to Sisal here, and use them to introduce the concept of "arity" as it applies to Sisal programming. 4.1 Arity and Conditionals All Sisal constructs are expressions, and all expressions have values, but sometimes one value is not enough. Therefore, Sisal expressions can return multiple values. The number of values returned by an expression is called its "arity." Following is an example of a conditional assignment of multiple arity: ROOT1, ROOT2 := if DISCRIM > 0.0 then sqrt( DISCRIM ) / DENOM, -sqrt( DISCRIM ) / DENOM else 0.0, 0.0 end if The if-statement above has arity of two, meaning it returns two values. This code fragment is a multi-assignment of two values to two names. Each branch of the if-statement within the assignment must return two values. Further, the corresponding result values in each branch must have the same types; this means that if an if-statement's then-clause returns two values, say an integer and an array of reals, in that order, then the else-clause must also return an integer and an array of reals, in that order. In the example above, both values are of type real, as seen by the literal zeroes in the else- branch. In any multiple-valued expression, the number of returned values must equal the number of names they are assigned to, and the names and values associate from left to right. This means that the statement we are considering, above, is equivalent to the following: ROOT1 := if DISCRIM > 0.0 then sqrt( DISCRIM ) / DENOM else 0.0 end if; ROOT2 := if DISCRIM > 0.0 then -sqrt( DISCRIM ) / DENOM else 0.0 end if; In a multiple-valued expression (also termed a multi-expression), the individual component expressions or values are separated by commas, as shown above. The reader is cautioned not to become confused by the semicolons in the program text; semicolons are statement separators. The rules for constructing and evaluating expressions, even those of multiple arity, are likely to be intuitive to anyone with a background in basic mathematics. 4.2 Let-In Statements One problem with languages like Sisal is that they tend to be wordy. A large or complex expression that is used several times in succession can needlessly clutter an otherwise clear program. Sisal takes another hint from mathematical convention in allowing a convenience notation known as the "let- in statement." It basically allows the programmer to associate a name with the value of an expression, and then use the name as often and wherever is handy. Here's an example of this: let one := sin(x)*sin(x) + cos(x)*cos(x); two := sqrt( one+one+one+one); three := one + two in one, two, three, sqrt(two) end let This shows the names "one", "two", and "three" being assigned values in the let-clause, and demonstrates the utility of the let-in statement. The name "one" gets a value based on several function calls; "two" gets a value that uses the value of "one"; and "three" gets a value based on both the previous values. Then in the in-clause, all three names are used to specify four values, which constitute the value of the let-in statement. This let-in statement has arity four. The if-statement and the let-in statement typify statements that can have multiple arity. The let-in statement brings us to the idea of nested scopes. Each instance of a let-in statement constitutes a scope wherein names may be assigned to values, or defined. Names used in such a way are invisible outside the let-in statement in which they are defined. This allows names to be reused inside nested scopes, and such nesting can go on to effectively any depth. Here is an absurd example of such a legal (but unwise) nesting: let x := 1; y := let x := 2; y := 3 in x + y end let in x + y end let The value of this nested pair of let-in statements is 6, and in spite of its tendency to confuse human readers, it is legal Sisal syntax. Many clearer and more compelling uses for statement nests can be found, and will readily occur to an experienced Sisal programmer. In both the if-statement and the let-in statement, the statements are terminated by an end-if and end-let, respectively. These bracketing keywords help both the compiler and the reader keep track of where they are in the code, and establishes the pattern for all Sisal statement types. Each of these two statement types can be used together and nested in any convenient way, as long as commas, semicolons, and end-keywords are placed correctly. The best way to appreciate such syntactic elements is to use them, so let us now go on to some exercises. 5. Exercise Set 1 Exercise 1.1: The Circle Given: a real value, r, for the radius of a circle; Compose: a let-in statement that returns the area and circumference of a circle of the given radius. Exercise 1.2: Statistics Given: four values, a, b, c, and d, all real numbers; Given: the formulas for the mean and standard deviation mean = (a+b+c+d)/4; standard deviation = (((a-mean)**2 + (b-mean)**2 + (c-mean)**2 + (d-mean)**2) / 3)**0.5 Compose: a let-in statement that returns the mean and standard deviation of the given set of numbers. Exercise 1.3: Cubic Roots Given: two real values, a, and b, coefficients of the cubic equation x**3 + ax + b = 0; Given: the formulas for the roots x = C + D, -(C + D)/2 + 3**0.5*(C - D)/2, and -(C + D)/2 - 3**0.5*(C - D)/2, where C = (-b/2 + ((b**2)/4 + (a**3)/27)**0.5)**(1/3), and D = (-b/2 + ((b**2)/4 - (a**3)/27)**0.5)**(1/3); Compose: a let-in statement that returns the roots of the cubic equation. Answers to Exercise Set 1 Exercise 1.1: The Circle Given: a real value, r, for the radius of a circle; Compose: a let-in statement that returns the area and circumference of a circle of the given radius. area, circumference := let pi := 3.14159 in pi * r * r, 2.0 * pi * r end let Exercise 1.2: Statistics Given: four values, a, b, c, and d, all real numbers; Given: the formulas for the mean and standard deviation mean = (a+b+c+d)/4; standard deviation = (((a-mean)**2 + (b-mean)**2 + (c-mean)**2 + (d-mean)**2) / 3)**0.5 Compose: a let-in statement that returns the mean and standard deviation of the given set of numbers. let mean := (a + b + c + d) / 4.0; a_diff := a - mean; b_diff := b - mean; c_diff := c - mean; d_diff := d - mean; a_diff_sq := a_diff * a_diff; b_diff_sq := b_diff * b_diff; c_diff_sq := c_diff * c_diff; d_diff_sq := d_diff * d_diff; diff_sq_sum := a_diff_sq + b_diff_sq + c_diff_sq + d_diff_sq; sigma := sqrt( diff_sq_sum / 3.0 ) in mean, sigma end let Exercise 1.3: Cubic Roots Given: two real values, a, and b, coefficients of the cubic equation x**3 + ax + b = 0; Given: the formulas for the roots x = C + D, -(C + D)/2 + 3**0.5*(C - D)/2, and -(C + D)/2 - 3**0.5*(C - D)/2, where C = (-b/2 + ((b**2)/4 + (a**3)/27)**0.5)**(1/3), and D = (-b/2 + ((b**2)/4 - (a**3)/27)**0.5)**(1/3); Compose: a let-in statement that returns the roots of the cubic equation. x1, x2, x3 := let term1 := -b / 2.0; term2 := b * b / 4.0; term3 := a * a * a / 27.0; radical := sqrt(term2 + term3); C := exp(term1 + radical, 1.0/3.0); D := exp(term1 - radical, 1.0/3.0); in C + D, -(C + D)/2.0 + sqrt(3.0) * (C - D)/2.0, -(C + D)/2.0 - sqrt(3.0) * (C - D)/2.0 end let 6. User-Defined Types and Dynamic Values Another area where we offer convenience to the programmer is in specifying data types. Sisal comes with the standard complement of built-in data types, as described above, but these can be used to build further types. Arrays and records are the two most important sorts of user-defined date types. But before we describe this, a bit more Sisal philosophy is necessary to set the stage for it. In conventional programming languages, programmers use "variables" as names in their programs. These names are, strictly speaking, equivalent to storage locations. When a variable name is used, it implicitly means that the value stored in that location is wanted. This is why statements such as "I = I + 1" make sense in languages like as Fortran. It means that the contents of location "I" are being changed by an addition operation. In Sisal, however, names refer to values, not locations. We really do not care about storage in Sisal, so the location of a value is unimportant to us; only the values themselves concern us in writing Sisal code. A statement such as the increment operation above is incorrect in Sisal, both syntactically and semantically. The name "I" must correspond to a value, and if a new or different value is desired, it must be given a new name. But this need not be as awkward in practice as it may seem to be in concept. It is consistent with mathematical convention. In Sisal programs, values and their associated names have transitory natures, existing only between the times of their creation and their final use by other program elements. Here's the the let-in example we last dealt with: let one := sin(x)*sin(x) + cos(x)*cos(x); two := sqrt(one+one+one+one); three := one + two in one, two, three, sqrt(two) end let In it, the names "one", "two", and "three" are defined and used, and are then discarded. They are said to be undefined outside the scope of that statement. This dynamism, while familiar to mathematicians, is quite foreign to most programmers, and takes some getting used to. User-defined data types in Sisal reflect the dynamic nature of the language as well as names, but in a slightly different way. 6.1 Aggregate Data Types The most important of the Sisal data types are the aggregates, that is, array and record types. The use of aggregates in Sisal, as in other languages, give the programmer leverage in manipulating program data with minimal effort at naming. They allow giving ensembles of data single names, and grouping similar data items in intuitive ways. They also allow easy identification of the potential for parallel execution, as we will see in the section on loops. First, though, we briefly describe the two sorts of aggregate data types. 6.1.1 Array Types In Sisal, as in other languages, arrays are collections of elements of homogeneous types. In Fortran, for example, an array type is specified by the number of dimensions it has, and by its static size at compile-time. An array defined to be of size 100 by 50 can never change size or shape. In Sisal, the number of dimensions an array contains is specified by its type, but its sizes in each dimension are completely dynamic. But let's take this in small steps; consider the Sisal type declarations shown below. type float = double_real; type One_Dim_R = array [ float ]; type One_Dim_I = array [ integer ]; type Two_Dim_I = array [ One_Dim_I ]; The first statement defines a convenient type-name, "float", by which to refer to values of type double_real. The second defines a convenient type- name for an array whose elements are of type float. The third line specifies a type that is an array of integer elements, and the fourth line specifies an array of elements which are themselves one-dimensional arrays; this fourth type is an example of what we mean by a two-dimensional array in Sisal. Note that in the array type declarations, no sizes are specified; this is because size and index range are not characteristics of array types, but only of specific values which are instances of array types. Arrays of type One_Dim_R can have as many elements as desired in the program; they can also have any index range the programmer finds expedient, as long as the upper index bound is at least as large as the lower bound. The same is true for arrays of type Two_Dim_I. Furthermore, the one-dimensional arrays making up the rows of a two-dimensional array can all have different sizes and index ranges, leading to what are called "ragged arrays". These matters will be made clearer in the examples to follow. For the moment, the key point to focus on is that all values are dynamic and transitory in Sisal programs. If a statement returns, or evaluates to, a value that is a one-dimensional array, that array's size and index range are defined solely by the statement whose value it is; and, the value itself exists only until it is finally consumed by a further statement in the program. One other matter concerning arrays should be mentioned at this time: indexing and logical structure. We have just seen that a two-dimensional array may be considered an array of one-dimensional arrays. This inductive hierarchical structuring continues into greater numbers of dimensions; for instance, a three-dimensional array is an array of two-dimensional arrays. When indexing an array, the elements accessed get smaller (or simpler) as more indices are specified from left to right. For instance, if we have the type declarations shown above, and array A is of type Two_Dim_I, then A[6] is the sixth row of A, and A[4,7] is the seventh element of the fourth row. It is also correct syntax to specify this element as A[4][7], which clarifies the fact that we are specifying the seventh element of the fourth element of A. There is no notation in Sisal for specifying a single column of a two- dimensional array, nor is there a correct notation for omitting the left or middle index from an access to a three dimensional array. Indices must be supplied from the left and omitted, if they are not all supplied, from the right. Similarly, suppose we add the following declaration to the above set: type Three_Dim_I = array [ Two_Dim_I ]; And, suppose we have an array B of type Three_Dim_I. If we index as B[i], we are specifying the two-dimensional array formed by all the i-th rows of each plane. The B[i,j]-th element is the row of all the elements from row i, column j, in each plane. This indexing treatment is termed "row_major," and means that in walking through an array of any dimensionality, the leftmost array index varies fastest and the rightmost varies slowest. This is consistent with the indexing conventions found in the C language but not the same as in Fortran. 6.1.2 Record Types Records are collections of elements of heterogeneous types. The best way to see this is to define an example. type element_record = record [ name : array [ character ]; number : integer; weight : real ] type atomic_table = array [ element_record ]; Here we see a record type for chemical elements, containing fields for a name, atomic number, and atomic weight, with each field having a different type. We also see an array type for a collection of element records. Such aggregations can be nested to effectively any useful depth. The fields of record-type data items are accessed by giving the name of the record-type value, followed by a period and the field name. Given the above declarations, suppose we have the value Fe, of type element_record. Then specifying Fe.name could yield the value "Iron" and specifying Fe.number could yield the value 56. For a value named Column1 of type atomic_table, we might have the access Column1[1] yield a record of type element_record whose contents are [name: "Hydrogen"; number: 1; weight: 1.0]. In practice, records usually are aggregated into arrays, and they sometimes contain array-type fields. This is natural, convenient, and will be familiar to users of languages such as Pascal. While some of the standard uses for records, such as implementing stacks and queues, can be accomplished with arrays of records, they can also be done using another built-in aggregate type called a "stream." Streams will not be dealt with here, and the interested reader is referred to the Sisal Language Manual [33]. 6.1.3 A Few Examples of Array and Record Use Array and record types make composing readable Sisal programs easy and convenient. The syntax rules for using them correctly arise immediately from the semantic rules we have already touched on: every Sisal statement is an expression, and all expressions return specific values of specific types. Consider the following example: let x_diff := x[1] - x[2]; y_diff := y[1] - y[2]; xd_sq := x_diff * x_diff; yd_sq := y_diff * Y_diff in sqrt( xd_sq + yd_sq ) end let This is the calculation of the magnitude of separation of two two-space vectors. The vectors are named x and y, and their components are indexed in the usual way, using square brackets to separate the index expressions from the value names. Alternatively, we could represent the vectors as records, whose fields contain the components, and the calculation would them be as follows: let x_diff := v1.x - v2.x; y_diff := v1.y - v2.y; xd_sq := x_diff * x_diff; yd_sq := y_diff * Y_diff in sqrt( xd_sq + yd_sq ) end let Here we have the two records, named v1 and v2, with fields in each named x and y. The record notation allows us to name the fields, and refer to them by use of the record and field names separated by a period. A third alternative we could use would allow a combination of these two notations, as follows: let x_diff := vectors[1].x - vectors[2].x; y_diff := vectors[1].y - vectors[2].y; xd_sq := x_diff * x_diff; yd_sq := y_diff * Y_diff in sqrt( xd_sq + yd_sq ) end let Here we have an array, named vectors, whose elements are records. We index the individual records as array elements, and their components as record fields. But both these examples assume aggregates that already exist. How does one go about creating them? 6.1.4 Record Creation To illustrate record creation, we will use the example declarations we used earlier: type element_record = record [ name : array [ character ]; number : integer; weight : real ] type atomic_table = array [ element_record ]; When creating items of record types, we specify the matched pairs of field names and contents, in the order given in the type definition. Here are examples: let hydrogen := record element_record [ name: "Hydrogen"; number: 1; weight: 1.0 ]; helium := record element_record [ name: "Helium"; number: 2; weight: 4.0 ]; element_115 := record element_record [ name: ""; number: 115; weight: -999.999]; column1 := array atomic_table [ 1: hydrogen, helium, element_115 ] in hydrogen, helium, element_115, column1 end let Here we see three records being created, each of which has its type, element_record, specified along with it. This is sometimes optional, but it's a good practice to adopt, nonetheless, as it helps clarify the code to human readers. The first two records, hydrogen and helium have their field names and values specified in the order given in the original declaration, and with reasonable values. The field names must be paired with the field values, with a colon separating each name and value, and a semicolon separating the pairs, as in the definition of the type. In the third record, element_115, we see and empty string used for the name and a negative value for the weight, used as flag values. This is still a legal record, since all the fields are specified in order and with correct types. The fourth item created and returned above is an array named column1 of type atomic_table. We will see more about creating arrays in the next section. 6.1.5 Array Creation Arrays offer the greatest number of options for data creation. They are most often created in loops, in which context they offer great power to the programmer in the expression of parallelism. However, they can also be created in other ways. Since we have not yet discussed loops, we will illustrate the other creation methods, using the declarations of array types we used above: type float = double_real; type One_Dim_R = array [ float ]; type One_Dim_I = array [ integer ]; type Two_Dim_I = array [ One_Dim_I ]; With these declarations, we can define a set of arrays in the following code: let x := array One_Dim_R [ 1,4: 1d0, 2d0, 3d0, 4d0 ]; a := array One_Dim_I [ 0,4: 1, 3, 5, 7, 9 ]; b := array [ 1: 2, 4, 6, 8 ]; c := array Two_Dim_I [ 3: a, b, a ] d := array_fill( 2, 10, 0d0 ); e := array One_Dim_I []; in foo(x[2], a, b, c[4,4], d, e) end let 7. Exercise Set 2 Exercise 2.1: Three Circles Given: an array of three real values, r, for the radii of three circles; Compose: a let-in statement that returns two arrays containing the areas and circumferences of the circles of the given radii. Exercise 2.2: More Statistics Given: four three-element arrays, a, b, c, and d, all containing real numbers; Given: the formulas for the mean and standard deviation mean = (a+b+c+d)/4; standard deviation = (((a-mean)**2 + (b-mean)**2 + (c-mean)**2 + (d-mean)**2) / 3)**0.5 Compose: a let-in statement that returns an array of records, each containing the mean and standard deviation of one of the given sets of numbers. Answers to Exercise Set 2 Exercise 2.1: Three Circles Given: an array of three real values, r, for the radii of three circles; Compose: a let-in statement that returns two arrays containing the areas and circumferences of the circles of the given radii. let pi := 3.14159; area1 := r[1] * r[1] * pi; area2 := r[2] * r[2] * pi; area3 := r[3] * r[3] * pi; circ1 := 2.0 * r[1] * pi; circ2 := 2.0 * r[2] * pi; circ3 := 2.0 * r[3] * pi; in array [ 1: area1, area2, area3 ], array [ 1: circ1, circ2, circ3 ] end let Exercise 2.2: More Statistics Given: four three-element arrays, a, b, c, and d, all containing real numbers; Given: the formulas for the mean and standard deviation mean = (a+b+c+d)/4; standard deviation = (((a-mean)**2 + (b-mean)**2 + (c-mean)**2 + (d-mean)**2) / 3)**0.5 Compose: a let-in statement that returns an array of records, each containing the mean and standard deviation of one of the given sets of numbers. let mean_a := (a[1] + a[2] + a[3])/3.0; mean_b := (b[1] + b[2] + b[3])/3.0; mean_c := (c[1] + c[2] + c[3])/3.0; mean_d := (d[1] + d[2] + d[3])/3.0; a_diff_sq_sum := (a[1] - mean_a) * (a[1] - mean_a) + (a[2] - mean_a) * (a[2] - mean_a) + (a[3] - mean_a) * (a[3] - mean_a); b_diff_sq_sum := (b[1] - mean_b) * (b[1] - mean_b) + (b[2] - mean_b) * (b[2] - mean_b) + (b[3] - mean_b) * (b[3] - mean_b); c_diff_sq_sum := (c[1] - mean_c) * (c[1] - mean_c) + (c[2] - mean_c) * (c[2] - mean_c) + (c[3] - mean_c) * (c[3] - mean_c); d_diff_sq_sum := (d[1] - mean_d) * (d[1] - mean_d) + (d[2] - mean_d) * (d[2] - mean_d) + (d[3] - mean_d) * (d[3] - mean_d); sigma_a := sqrt( a_diff_sq_sum / 2.0 ); sigma_b := sqrt( b_diff_sq_sum / 2.0 ); sigma_c := sqrt( c_diff_sq_sum / 2.0 ); sigma_d := sqrt( d_diff_sq_sum / 2.0 ); a_record := record [ mean: mean_a; st_dev: sigma_a ]; b_record := record [ mean: mean_b; st_dev: sigma_b ]; c_record := record [ mean: mean_c; st_dev: sigma_c ]; d_record := record [ mean: mean_d; st_dev: sigma_d ]; in array [ 1: a_record, b_record, c_record, d_record ] end let 8. Loops and Parallelism The preceding exercise set should have motivated the need for looping constructs. Performing repetitive operations is verbose and tedious without them. In most conventional languages, no distinction is made between the two basic kinds of loops, those in which the computations in the loop body are independent, and those in which they not. The first kind contains the potential for parallel execution, while the second kind, which we normally call "iteration" does not. It is the first kind, called the "for_loop", which is the easiest and most rewarding to use, in terms of return for effort, and we will discuss it first. 8.1 For-Loops The Sisal for-loop is used for repetitive computations that are independent, meaning that no information flows among them. Examples of this might be the incrementing of an array of counters, or the element-wise addition of two vectors. None of the individual increment or addition operations depends on any of the others, so, potentially, they can all occur simultaneously, or in parallel. The Sisal for-loop is designed to expose this potential, so the language translation and support software can determine whether and how to exploit it. This allows the programmer to concentrate on the algorithms used and the problems being solved, and not get bogged down in the details of how the computer executes the program implementing the algorithm. The Sisal for-loop contains three parts: the range, the body, and the returns clause. The syntax looks like this: for {range} {optional body} returns {returns clause} end for The range specifies the set of values controlling the number of instances of the body and returns clause that will be generated. Typically, it consists of an index identifier and a set of numeric values, such as "i in 1, n," but it can also consist of a pair of names for an element and an aggregate, such as "a in counters". In these ways, the number of instances of the loop that will be calculated is established, as well as the specific information needed by each one. The loop body is optional, and may contain the definitions of the values that are to be returned or any intermediate values the calculation needs. The returns clause controls which and how many data values will be returned by the loop, and it contains one or more aggregation or reduction operators (which will look familiar). But before we explain them, let's look at a simple for-loop that increments an array of counters, to see how these elements interact in practice. for i in 1, num_counters new_count := counters[i] + increment[i] returns array of new_count end for Here we see a loop whose range is from 1 to the value named "num_counters;" we assume this value exists and makes sense in the present context. In the second line of the loop we see this loop's body. In this loop, the body is where all the work is done, namely the addition of the elements of the two vectors "counters" and "increment." Both vectors are assumed to be of such size that there will be an element of each available for the entire range of values of the loop index, "i." For each value of this index, a new value is calculated, named "new_count," and all instances of this calculation are independent. Finally, in the returns clause, all the instances of "new_count" are aggregated into an array, and that array is returned as the value of the loop. Recalling that its index range is an integral part of a Sisal array, we must specify the index range of the array returned by this loop. This is simple and automatic; it is simply the set of values in the loop's range. In the current example, the loop's returned array has range [1..num_counters]. This is characteristic of all Sisal for-loops that return arrays, except for cases we will discuss in a future section on advanced looping concepts. As before, the for-loop is an expression, and has a well-defined value, which is the value defined by its returns clause. In the above case, it is an array. However, we can also return reduced values, such as in the following loop: for i in 1, num_counters new_count := counters[i] + increment[i] returns value of sum counters[i] value of sum new_count array of new_count end for This loop returns three values: a scalar (the sum of all the values in the array "counters"), an array (containing all the values of "new_count"), and another scalar (the sum of all the values of "new_count"). This demonstrates aggregation and reduction in the same loop. The body of a for-loop is similar to a let-in statement, in that temporary names can be defined in it that are used subsequently in the loop and/or in the returns clause. But strictly speaking, the loop body is not generally needed at all. The following two loops have the same value as the one immediately above: for i in 1, num_counters old_count := counters[i]; new_count := old_count + increment[i] returns value of sum old_count value of sum new_count array of new_count end for for i in 1, num_counters returns value of sum counters[i] value of sum counters[i] + increment[i] array of sum counters[i] + increment[i] end for The first of these two loops demonstrates that names, such as "old_count" may be assigned to values within the loop, and then used both in the loop and in the returns clause. Again, a different instance of "old_count" is generated for each value of the loop's range, and none of the instances can affect each other in any way outside of the loop's returns clause. This guarantees that no matter how many or how few of the loop instances are computed in parallel, or in what order they are computed, that same set of intermediate values will be generated and the same set will be returned as the value(s) of the loop itself. The second of the above loops demonstrates that all the work can occur in the loop's returns clause, leaving an empty loop body. This in no way changes what we just said: each instance of the loop's calculation is independent of all the others, and they can occur in any order and give the same results. There is one more thing to consider before we leave this section, and that is the alternate form of range generator we alluded to above - the one containing a pair of identifiers. Simply speaking, if we don't need the loop index as a numerical value, we don't have to use it. We can instead refer by name to the individual elements of the array we are working on, as in the following example: for elt in counters new_count := elt + 1 returns value of sum elt value of sum new_count array of new_count end for This loop assumes that the same value will be used to increment each element of the array "counters," namely one. So, no loop index is needed, since each instance of the loop gets one instance of "elt" from the array "counters" to work with. More advanced forms of loop range generation are also possible, and we will discuss them in a later section. 9. Exercise Set 3 Exercise 3.1: Dot Products Given: two vectors, x and y, of equal size and range; Compose: a loop that returns their dot (or inner) product. Exercise 3.2: Vector Max and Min Given: two vectors, x and y, of equal size and range; Compose: a loop that returns two vectors, containing the larger and smaller elements, respectively, of each position in the two vectors. Exercise 3.3: Vector Sum and Difference Given: two vectors, x and y, of equal size and range; Compose: a loop that returns four vectors, containing, respectively, the element-wise sum and difference of the two vectors, and the sums of the elements in each of the two vectors. Answers to Exercise Set 3] Exercise 3.1: Dot Products Given: two vectors, x and y, of equal size and range; Compose: a loop that returns their dot (or inner) product. for i in 1, size elt_prod := x[i] * y[i] returns value of sum elt_prod end for Exercise 3.2: Vector Max and Min Given: two vectors, x and y, of equal size and range; Compose: a loop that returns two vectors, containing the larger and smaller elements, respectively, of each position in the two vectors. for i in 1, size minimum, maximum := if x[i] < y[i] then x[i], y[i] else y[i], x[i] end if returns array of maximum array of minimum end for Exercise 3.3: Vector Sum and Difference Given: two vectors, x and y, of equal size and range; Compose: a loop that returns four vectors, containing, respectively, the element-wise sum and difference of the two vectors, and the sums of the elements in each of the two vectors. for in 1, size returns array of x[i] + y[i] array of x[i] - y[i] value of sum x[i] value of sum y[i] end for 10. More on Loops In this section, we tackle some advanced aspects of looping. While we're going to do this in the context of parallel loops, for simplicity's sake, most of what we say will also apply to sequential, or iterative, looping, which we are saving for a later section. We have already seen two forms of range generators, and two forms of returns clauses. Sisal has more to offer in both of these areas. As in all languages, looping constructs can be nested to effectively any depth. Nested looping is one way to generate multidimensional arrays. There are also many ways to return values, both simple and compound, from loops. Here we will explore these topics in a bit more detail, since they add greatly to the power of the language in expressing algorithms and exploiting parallelism. 10.1 Range Generators The possibilities for simple loop range generators extends to the combination of the two we have already seen. For simplicity, we will show examples of the three together before describing them. for i in 1, n for a in a_array for a in a_array at i Here, we see the two familiar forms followed by a new hybrid form. The first one, as we have said, generates a set of independent values, all named i, from the inclusive range [1,n]. The second generates a set of independent values, all named a, from the aggregate named a_array. The third generates both the element a, from the aggregate a_array, and the index of a, i, in the index range of a_array. This last form may seem strange, but it is useful in situations where the index range of an array is not known ahead of time. In addition to the above simple range generators, we can have compound or multidimensional ones, such as the following: for i in m, n cross j in p, q for a in a_array cross b in b_array for k in r, s dot l in x, y for c in c_array dot d in d_array Here we have the first two forms of simple range generator combined into compound generators. The keywords "dot" and "cross" determine how the index or element value sets will be formed. In the first example, we have an index value set consisting of the cross product of the ranges from the two simple generators. In other words, every value of i in the range [m,n] will be paired with every value of j in the range [p,q]. In the second example, we have another cross product, with every element a from a_array paired with every element i from b_array. In the dot-product range generators, we again have a pairing of values, but no value from either set is used more than once. In the third example, the first value of k will be r, and it will be paired with the first value of l, which will be x. The second values of k and l will be r+1 and x+1, respectively; this continues until the smaller-sized range is completely used up, at which point the loop's range is complete. Similarly, in the fourth example, the first value of c will be the first element of c_array, and it will be paired with the first element of d_array. In the case of arrays, when we use the term "first element" we mean the element with the lowest index. These compound ranges are very powerful, and they can be extended effectively without limit. Semantically, compound ranges are equivalent to loop nests. The following two loops are equivalent in range and in the types of values they return: for i in m, n cross j in p, q x := i + j returns array of x value of sum x end for for i in m, n x_array, x_scalar := for j in p, q x := i + j returns array of x value of sum x end for returns array of x_array value of sum x_scalar end for Each of the above two loops produces a two-dimensional array and a scalar. The index range of the arrays are [m, n] by [p, q], and the elements in corresponding positions of the arrays are identical. Just to make certain all this is clear, we should consider a few more examples. The following loop sums all the elements of a two_dimensional array: for i in 1, num_rows cross j in 1, num_columns returns value of sum a[i,j] end for The next example returns the vector consisting of the major diagonal of a two_dimensional array: for i in 1, num_rows dot j in 1, num_columns returns array of a[i,j] end for The above example is a bit tricky. Since the array a may not be square (i.e., have the same number of rows as it does columns), the resulting vector will actually be the major diagonal of the largest square subarray of array a. Here's another tricky example, the transposition of a rectangular array: for j in 1, num_columns cross i in 1, num_rows returns array of a[i,j] end for The above deserves careful study. It hasn't been emphasized yet, but the shape of an array returned by a loop nest is strictly determined by the order of the ranges of the loops as well as the ranges themselves. The array a, above, has shape [1, num_rows] by [1, num_columns], which is to say it has num_rows rows and num_columns columns. We want the resulting transposed array to have num_columns rows and num_rows columns, and to have as it's [i,j]-th element the [j,i]-th element of a. We must build the array in the shape we want, so we specify the outer loop to run over the desired index range for rows, and the inner loop to run over the that for columns. We then must be careful to access the elements of the original array, a, so as to avoid addressing outside of its index ranges. If we do all that correctly, the resulting value returned by the loop will be correctly transposed in size, shape, and element values. The reader should take the time firmly understand the above loop before proceeding with the next set of exercises. 10.2 Returns Clauses We have also merely scratched the surface in the area of returns clauses. We have seen the use of the "array of" aggregation and the "value of sum" reduction. There are also "value of" and "stream of" aggregations, which we will not go into at this point, and reductions of the form "value of product", value of least", "value of greatest", and "value of catenate". The sum, product, least, and greatest reductions produce scalars, and operate on numeric and Boolean types. Here's an example of their use in a nonnumeric context: for B_val in Boolean_array returns value of sum B_val value of product B_val end for This loop's return clause performs the inclusive "or" and the "and" functions of the elements of Boolean_array. The value of sum b_val reduction performs a Boolean summation, which returns true if any of its operands are true, and value of product b_val reduction performs a Boolean multiplication, returning true only if all its operands are true. The catenate reduction produces a flattened aggregate. This latter point is nontrivial. If a two-level loop produces an array via the "array of" aggregation, it will be a two-dimensional array, with ranges corresponding to those of the loops. If, on the other hand, it returns "value of catenate", the resulting array will be one-dimensional, consisting of all the rows produced by the inner loop joined together into a larger vector. A few examples at this point would likely be helpful. The following loop finds the largest and smallest elements of a three- dimensional array: for a_two_d in a_three_d cross a_one_d in a_two_d cross a_elt in 1, a_one_d returns value of least a_elt value of greatest a_elt end for The following loop flattens a three-dimensional array into a one_dimensional vector by concatenating all its rows together: for row_plane in a_3d_array cross row in row_plane returns value of catenate row end for The number of elements of the array resulting from the above loop will be the sum of the numbers of elements in each row of a_3d_array. The index range will begin at the index range of the first row and proceed upward from there. 10.3 Masked Returns Clauses The returns clause is capable of aggregating and reducing values computed in its for-loop, and it can also exercise selection criteria in determining which values to so treat. This is accomplished with one or more "masks," which are simply predicates that can be applied to the loop-generated values before further action is taken on them by the returns clause. These masks take the form of the keywords "when" or "unless" followed by a logical expression. The truth value of the logical expression interacts with the chosen mask keyword to determine whether the loop-generated values will be among those returned. Here's an example of a masked loop: for val in an_array returns value of sum 1 when val > 0.0 end for The above loop sums a set of integer ones corresponding to the positive elements of an_array. This is a simple way of counting the positive elements. Another common way to use masked returns clauses is in weeding out values from a set to be aggregated and returned, such as the following situation: for i in 1, number_of_values new_val := deep_thought( value[i] ) returns array of new_val unless new_val > upper_val | new_val < lower_val end for The above loop returns an array of only those instances of new_val that fall within the desired range [lower_val, upper_val]. The array so generated will not have any empty elements; every element in its index range will have a corresponding legal value, but its index range will not necessarily be equal to that of the loop, so care must be taken in subsequent dealings with it. The for-loop is incapable of doing the sorts of calculations that involve information moving between the instances of the loop's calculation, such as iterative convergence calculations. It is also incapable, as we now know it, of doing the more general form of looping across irregular index ranges or value sets. The first of these problems will be dealt with when we discuss the other form of Sisal loop statement, and the second when we deal with advanced looping concepts. For now, let us do a few exercises that will expose more of the power of Sisal loops. 11.Exercise Set 4 Exercise 4.1: Multidimensional Dot Products Given: two two_dimensional arrays, x and y, of equal size and range; Compose: a loop that returns an array of dot (or inner) product of each pair of corresponding rows. Exercise 4.2: Multidimensional Max Given: two two_dimensional arrays, x and y, of equal size and range; Compose: a loop that returns the value of the largest dot (or inner) product of each pair of corresponding rows. Exercise 4.3: Character Exclusion Given: an array of characters, chr_array; Compose: a loop that returns an array of the characters from chr_array that are either alphabetic characters or the space character. Exercise 4.4: General Character Exclusion Given: an array of characters, chr_array and an array of forbidden characters, forbidden_chr_array; Compose: a loop that returns an array of the characters from chr_array that are not elements of the array forbidden_chr_array. Answers to Exercise Set 4 Exercise 4.1: Multidimensional Dot Products Given: two three_dimensional arrays, x and y, of equal size and range; Compose: a loop that returns an array of dot (or inner) product of each pair of corresponding rows. for x_row in x dot y_row in y returns array of for i in 1, row_size returns value of sum x_row[i] * y_row[i] end for end for Exercise 4.2: Multidimensional Max Given: two two_dimensional arrays, x and y, of equal size and range; Compose: a loop that returns the value of the largest dot (or inner) product of each pair of corresponding rows. for x_row in x dot y_row in y returns value of greatest for i in 1, row_size returns value of sum x_row[i] * y_row[i] end for end for Exercise 4.3: Character Exclusion Given: an array of characters, chr_array; Compose: a loop that returns an array of the characters from chr_array that are either alphabetic characters or the space character. for char in chr_array returns array of char when (char >= 'A' & char <= 'z') | char = ' ' end for Exercise 4.4: General Character Exclusion Given: an array of characters, chr_array and an array of forbidden characters, forbidden_chr_array; Compose: a loop that returns an array of the characters from chr_array that are not elements of the array forbidden_chr_array. for chr in chr_array returns array of chr unless for forbidden_chr in forbidden_chr_array returns value of sum chr = forbidden_chr end for end for 12. Advanced Array Operations The exercises involving arrays up to this point may have made the point that further smoothing is needed in how Sisal handles arrays. There are indeed facilities to make array operations friendlier. For instance, we can acquire the size and bounds of arrays by using the Sisal primitives array_size(), array_liml(), and array_limh(), respectively. These primitives allow us to form loops such as the following: for i in array_liml(X), array_limh(X) cross j in array_liml(X[i]), array_limh(X[i]) q := deep_thought(x, i, j) returns array of q end for This sort of thing can be very useful when dealing with ragged arrays. For instance, the situation often develops in the modeling of physical systems that a system array with unchanging boundaries must be evolved and returned with the original boundaries. In the following example, we will use the Sisal operator for array concatenation, "||". let new_core := evolve(system_array); % new system with no boundaries top_indx := array_liml(system_array); % positions of top and bottom btm_indx := array_limh(system_array); % boundaries from old system new_top := system_array[top_indx]; % actual top boundary new_btm := system_array[btm_indx]; % actual bottom boundary left_indx := array_liml(system_array[1]); % positions of left and right right_indx := array_limh(system_array[1]); % boundaries from old system % (all rows have same indices) long_core := for row in core at i % add side boundaries to rows left_bound_elt := system_array[ i, left_indx ]; left_bound_array := array [ 1: left_bound_elt ]; right_bound_elt := system_array[ i, right_indx ]; right_bound_array := array [ 1: right_bound_elt ] returns array of left_bound_array || row || right_bound_array end for; long_tall := new_top || long_core || new_btm % add top and bottom in long_tall % The newly evolved system, complete with boundaries. end let This elegant bit of code builds a new system array in three steps. First, function "evolve" is invoked with the old system array as its argument, and the new system array is returned. Then the short rows of the new array are made the correct length by concatenating the boundaries from the rows of the old system array onto them in a for-loop. In the loop we see that the values left_bound_elt and right_bound_elt are made into arrays of one element. This is necessary, as the concatenation operator requires that both its arguments be arrays. Finally, the top and bottom rows of the old system array are concatenated onto the widened core of the new system array. The result of these steps is an array the same size and shape as the old system array. This example also includes a use of the hybrid range generator "row in core at i". Use of this form of generator relieves us of the need to specify the index range of the array core, and yet still gives us an index value to use in the array index expressions within the loop. These techniques are so powerful and useful that we will explore them further in an exercise. 13. Exercise Set 5 Exercise 5.1: Thick Boundaries Given: a system array with constant boundaries of size k, that is to be evolved by a given function "evolve", which returns only the core of the new system array; Compose: a code fragment that evolves the system and puts the boundary rows and columns onto the core system returned by function "evolve". Answers to Exercise Set 5 Exercise 5.1: Thick Boundaries Given: a system array with constant boundaries of size k, that is to be evolved by a given function "evolve", which returns only the core of the new system array; Compose: a code fragment that evolves the system and puts the boundary rows and columns onto the core system returned by function "evolve". let new_core := evolve(system_array); % returns only core of new system top_indx := array_liml(system_array); btm_indx := array_limh(system_array); new_top := for i in top_indx, top_indx+k-1 top_bound_row := system_array[i] returns array of top_bound_row end for; new_btm := for i in btm_indx-k+1, btm_indx btm_bound_row := system_array[i] returns array of btm_bound_row end for; left_indx := array_liml(system_array[1]); % all rows have same range right_indx := array_limh(system_array[1]); long_core := for row in new_core at i left_bound_elts := for j in left_indx, left_indx+k-1 left_elts := system_array[i,j] returns array of left_elts end for; right_bound_elts := for j in right_indx-k+1, right_indx right_elts := system_array[i,j] returns array of right_elts end for; returns array of left_bound_elts || row || right_bound_elts end for; long_tall := new_top || long_core || new_btm in long_tall % The newly evolved system, complete with boundaries. end let Note: If you came close to an answer equivalent to this one, you're to be congratulated. This is a nontrivial piece of Sisal coding, and indicates a good understanding of all that's gone before. If you didn't, don't be discouraged. This sort of coding requires a change of mind-set from that needed for programming in other languages. We believe that, once you make the shift to thinking functionally, the Sisal language will make you a faster and more productive programmer. If you require further proof of this, try writing the above code fragment in your favorite imperative language, and see if you think it's shorter, neater, or easier to read. 14. Functions and Programs By now the reader must be ready to write whole programs instead of program fragments. To develop this topic properly we must deal with function definitions and invocations, as well as the extra linguistic glue necessary to build complete programs out of them. 14.1 Functions The function definition and invocation is the heart and soul of the Sisal language. It will allow us to simplify several of the examples and exercises already presented in this tutorial. The basic idea, here, is similar to that of function and subroutine subprograms in Fortran and procedures in Pascal. The user composes code with dummy or formal arguments and then invokes it wherever necessary in a larger program with real arguments substituted for the formal ones. Sisal functions are more solidly founded, semantically, than those of Fortran because of the language's insistence on referential transparency and tight typing. This means that a function can get no information other than that provided through its argument list, and can return no information other than its value(s). Further, it means that the types of all arguments and returned values must be declared in the function definition and strictly adhered to in its use. The syntax rules are simple: function ( {optional argument list} returns {returned value list} ) {body expression} end function The argument list, if present, consists of a list of one or more argument name-type declarations, with some allowed shorthand notations. The returned value list must have at least one type declaration for the value of the function. No name is needed for the returned value, any more than is needed for the value of "2+2"; the value itself suffices. However, the body of a function can contain any legal sisal statement, including further function definitions, function invocations (including recursive ones), let-in statements, if-statements, for-loops, etc. Several easy and obvious examples are possible at this point. The following function computes the dot product of two argument vectors: function dot_product( x, y : array [ real ] returns real ) for a in x dot b in y returns value of sum a * b end for end function The points to note about this example are that the argument arrays have the same type and can therefore be declared together, with their names separated by a comma, and separated from their type by a colon. If there were other arguments, they would be separated from other declarations with a semicolon. As with other Sisal language statements, there is an "end function" keyword pair to close the definition. Here is another function definition that invokes the function defined above: type One_Dim_R = array [ real ]; type Two_Dim_R = array [ One_Dim_R ]; function matrix_mult( x, y_transposed : Two_Dim_R returns Two_Dim_R ) for x_row in x % for all rows of x cross y_col in y_transposed % & all columns of y (rows of y_transposed) returns array of dot_product(x_row, y_col) end for end function % matrix_mult The above function shows how the type declaration of an anonymous (unnamed) two-dimensional matrix looks. It also shows the syntax of Sisal commentary in the comments in the body of the function, as well as that used after the "end function" keywords to identify the function. We have shown comment text in earlier examples, but without discussing it. Sisal comments begin with a percent character "%" and continue until the end of the line. Commentary can be embedded in other statements without affect, as shown in the for-loop. The invocation of the previously defined function dot_product occurs in the returns clause of the for-loop. A function invocation, like any other Sisal statement, can occur anyplace a constant, identifier, or expression can occur in any Sisal statement. Since the for-loop has a two dimensional range generator, its value will be a two dimensional array. The positioning of a specific dot-product result in it is determined by the values of i and j in that instance of the loop's body. Note that we assume in this function that the argument arrays have correctly conformable shape and size, and that all their various index ranges have lower bounds of one. The above could, of course, be written to allow for other cases. It is also possible to write recursive functions in Sisal. This style of programming is very natural for some applications, and while it raises efficiency questions, if rapid and correct prototyping is the concern, recursive formulations are sometimes the most expedient means to use. We will not discuss this topic in depth, since it should be familiar to all who have seen it in other languages (except Fortran). We will simply present one example, for the sake of completeness: function factorial( n : integer returns integer ) if n <=1 then 1 else n * factorial( n - 1 ) end if end function 14.2 Programs Now we will see how to pull the above two functions together into a complete program that can actually be compiled and run. There is relatively little to add to what we've already seen, mostly details of coordination. First of all, we must tell the Sisal compiler what function is to serve as the program's entry point, and what name or names are to be exported. The latter allows us to load separately compiled modules together. The default entry point name is "main", but another name can be specified with the compiler pragma %$entry= placed at the head of the program text, or with an option flag passed directly to the compiler. (Sisal compiler options that are included in program source files always begin with the comment character so they will not be mistaken for source code statements themselves.) Second we define new type names for convience and readability. Third, we declare the names, formal parameters, types of any external functions. This includes any functions from the math library, such as square root, trigonometrics, transcendentals, etc., which we will discuss further below. It also includes any functions from other modules that are compiled separately and invoked in the program. The declarations of such functions are merely function headers, consisting of the function names, argument declaration lists, and return value declaration lists. Fourth, we must provide the user-defined functions that constitute our Sisal source. The current Sisal compiler requires that functions be defined before they are invoked, so definitions must occur in the source before any references to them. However, there is a way around this order restriction, called a "forward declaration." In it, a function's header is provided, preceded by the keyword forward. The full definition can then be placed anywhere in the program source. There is a required ordering to the full set of "glue" statements that can or must appear in a Sisal program, and we will fully detail it later. 14.3 Importing Library Functions We've mentioned in passing that Sisal programs have access to all the routines in the Unix math libraries. However, since the routines in that library are all designed to be very general, we must inform the Sisal compiler exactly how we intend to use them in our programs. to do this, we use a "global" declaration. This is easily described by example: global sin ( x: real returns real ); global cos( x: double_real returns double_real ); global sqrt( x: real returns real ); global log( x: double_real returns double_real ); The above declarations specify how some of the math library functions will be used in a Sisal program. Such declarations are necessary because the compiler must know what argument and result types to expect to be associated with invocations of these functions. These declarations show that the intended use for some math library functions is different from that of others, by way of the argument and returned value types we have declared for them. Since the actual library functions can deal with many such type arrangements, we are free to use them as we see fit. However, there can be only one declaration for each function name. The declarations themselves must occur after the type statements and before any other function definitions. Following is a complete matrix multiply program, with examples of each type of glue statement described above: % MATRIX MULTIPLICATION PROGRAM %======================================================================== % The "defines" statement is needed in every complete Sisal Program; % the "$entry=" statement is needed if the entry point isn't named "main". %------------------------------------------------------------------------ define matrix_mult % Specifies function name(s) exported from module/program %$entry=matrix_mult % Specifies entry point(s) for this module/program %----------------------------- % Type declarations come next. %----------------------------- type One_Dim_R = array [ real ]; type Two_Dim_R = array [ One_Dim_R ]; %--------------------------------------------------------------------- % Then come global declarations for functions imported from libraries. % These aren't used in the program, and are for illustration only. %--------------------------------------------------------------------- global sqrt( x: real returns real ); global sin( x: double_real returns double_real) %------------------------------------------------------------------------ % Next come forward function declarations; these are not needed, in this % program, since the full function definitions are given in definition- % before-use order. %------------------------------------------------------------------------ forward function dot_product( x, y : One_Dim_R returns real ) forward function matrix_mult( x, y : Two_Dim_R returns Two_Dim_R ) %---------------------------------------- % Finally, the full function definitions. %---------------------------------------- function dot_product( x, y : One_Dim_R returns real ) for a in x dot b in y returns value of sum a * b end for end function % dot_product %............................................................................ . function matrix_mult( x, y_transposed : Two_Dim_R returns Two_Dim_R ) for x_row in x % for all rows of x cross y_col in y_transposed % & all columns of y (rows of y_transposed) returns array of dot_product(x_row, y_col) end for end function % matrix_mult %............................................................................ . The comment statements describe the various parts of the program source. It isn't shown, above, but it is also possible to nest function definitions. That is, one function can have another defined and used within it. This can be notationally convenient, but the function so defined can be used only by the function it is defined within. There are a few more specific points about Sisal programs in general that can be illustrated by the above program. Notice that the program above consists of two functions. They are defined at the same "level", but matrix_mult invokes dot_product, and is specified as the entry point by the $entry= declaration. The entry or outermost function corresponds to the main routine in a Fortran program. Here, the entry function is not invoked by any others in the program. This is generally the case, but not necessarily so; for instance, mutual recursion between two functions may cause the entry function to be invoked by another in the program. Notice also that the outermost function has arguments and returns values, just as the inner functions do. This is how Sisal programs get input and produce output. The arguments to the outermost function are the program's input, and the results it returns are the program's output. These may come from and go to files, as will be presented in a future section on compiling and running Sisal programs. Now it's time for a few simple exercises in constructing whole functions and programs. 15. Exercise Set 6 Exercise 6.1: Quadratic Roots Function Given: three real values, a, b, and c, and the formula for the roots of a quadratic equation x1 = b + (b**2-4ac)**0.5/2a and x2 = b - (b**2-4ac)**0.5/2a; Compose: a function of a, b, and c that calculates and returns the roots or some flag values if the roots cannot be calculated. Exercise 6.2: Relaxation Function Given: a two-dimensional array, A, of real values Compose: a function of A that performs one relaxation step, returning the new array whose values at each position [i, j] are the average of the values from A of that position and that position's eight nearest neighbors, [i+/-1, j+/-1]. Exercise 6.3: Polynomial Evaluation Program Given: a two-dimensional array, C, of real valued polynomial coefficients, each row of which contains the coefficients of a single polynomial; Given: for each row of C, the index of each coefficient's position is the exponent of x for that term; Given: a one-dimensional array X of real valued polynomial unknowns, with each element corresponding to the unknown for the whole set of polynomials; Compose: a function of C and X that evaluates each polynomial with each unknown, and returns the two-dimensional array containing in each row the value of each polynomial for one of the unknowns. Answers to Exercise Set 6 Exercise 6.1: Quadratic Roots Function Given: three real values, a, b, and c, and the formula for the roots of a quadratic equation x1 = b + (b**2-4ac)**0.5/2a and x2 = b - (b**2-4ac)**0.5/2a; Compose: a function of a, b, and c that calculates and returns the roots or some flag values if the roots cannot be calculated. function quad.roots( a, b, c : real returns real, real ) let denom = 2.0 * a; discrim := b * b - 4.0 * a * c; in if discrim >= 0.0 then b + sqrt( discrim ) / denom, b - sqrt( discrim ) / denom else -9.99e99, -9.99e99 end if end let end function Exercise 6.2: Relaxation Function Given: a two-dimensional array, A, of real values; Compose: a function of A that performs one relaxation step, returning the new array whose values at each position [i, j] are the average of the values from A of that position and that position's eight nearest neighbors, [i+/-1, j+/-1]. function relax ( A : array [ array [ real ] ] returns array [ array [ real ] ] ) for row in A at i cross elt in row at j avg := (elt + A[i, j-1] + A[i, j+1] + A[i+1, j-1] + A[i+1, j] + A[i+1, j+1] + A[i-1, j-1] + A[i-1, j] + A[i-1, j+1]) / 9.0 returns array of avg end for end function Exercise 6.3: Polynomial Evaluation Program Given: a two-dimensional array, C, of real valued polynomial coefficients, each row of which contains the coefficients of a single polynomial; Given: for each row of C, the index of each coefficient's position is the exponent of x for that term; Given: a one-dimensional array X of real valued polynomial unknowns, with each element corresponding to the unknown for the whole set of polynomials; Compose: a function of C and X that evaluates each polynomial with each unknown, and returns the two-dimensional array containing in each row the value of each polynomial for one of the unknowns. define multi_poly %$entry = multi_poly type Poly_Coeffs = array [ real ]; % Coefficients of one polynomial type Many_Polys = array [ Poly_Coeffs ]; % Coefficients of many polynomials type Unknowns = array [ real ]; % Values for X, the unknowns type Values = array [ real ]; % Values of all polys for one x type Many_Values = array [ Values ]; % Values of all polys for all x function eval_poly( coeffs : Poly_Coeffs; x_val := real returns Values ) for c in coeffs at i term := c * exp(x_val, real(i)) returns value of sum term end for end function % eval_poly function multi_poly( C : Many_Polys; X : Unknowns returns Many_Values ) for x_val in X row_of_poly_vals := for coeffs in C poly_value := eval_poly(coeffs, x_val) returns array of poly_value end for % one row of result array - all polys for one x returns array of row_of_poly_vals end for end function % multi_poly 16. Sequential Looping with For-Initial Loops In this section we will discuss the sequential or iterative looping construct in Sisal. Unlike the for-loop, in which all instances of the loop body are independent, the Sisal "for-initial" loop involves the explicit passing of information between consecutive instances of the loop body. This is visualized as a graph with a sequence of nodes, representing loop bodies, and the arcs between them representing the data dependencies between the loop bodies. The bodies (nodes) bear what we term a "producer-consumer" relationship to one another, and this establishes their interdependency. Syntactically, the data calculated in one body that is to be made available to the next body is identified in the consuming body by use of the keyword "old" before the identifier name. But it will be much easier to explain by example, so consider the following: 16.1 Two-Body Loops The for-initial loop consists of two major parts, the initial-clause and the loop body proper. One way to think of this type of loop is to consider that the initial clause is the first "iteration", or the first actual loop body that gets executed; after it executes exactly once, zero or more instances of the loop body proper execute. The need for a distinguished first "iteration" is easy to explain by example, and we will look at one directly. However, for the moment, keep in mind that the loop must always return a specified number of results of specified types. These results are initially generated in the initial-clause, and then may get further treatment in the loop body proper, if it executes. With this in mind, consider the following example: for initial i := array_liml(A); tmp := A[i]; running_sum := tmp while i < array_limh(A) repeat i := old i + 1; tmp := A[i]; running_sum := old running_sum + tmp returns value of running_sum array of running_sum end for In this example we first see the loop keywords "for initial" which identify it as a sequential loop, and which open what we call the "initial clause" of the loop. This clause contains the initial definitions of the loop-carried values, "i", and "running_sum", which implement the data dependencies between the contiguous body instances. Like the body of a for-loop, the initial clause of a for-initial loop can also contain the definitions of loop-local names, which are temporary identifiers used subsequently in the calculation but not passed between loop bodies; this is the purpose of "tmp". Next, we see the iteration termination test. This is a predicate, evaluating to a Boolean value, which uses either the "while" or "until" keyword to specify how the test is to be applied. The termination test can be at either the top or the bottom of the loop body; in this case it is at the top, and we refer to this form of loop as "top- tested." If it were at the bottom of the loop body, we would call it a "bottom-tested loop", and the keyword "repeat" would still be used to demarcate the top of the loop body. The body of this loop shows the use of the keyword "old" to specify data passed in from previous bodies. In fact, it is more technically correct to call them "scopes" than "bodies", but the effect is the same. The statement "i := old i + 1" specifies that the value named "i" from the previous scope (i.e. "old i") is to be incremented by one and assigned the name "i" in this scope. While this may look like a violation of the single-assignment rule, it actually is not; the values named "i" and "old i" are two different entities, having independent existences, except for the value of the former being determined by the latter. The same thing holds for "running_sum", and we refer collectively to all of the instances of "i", "running_sum", "old i", and "old running_sum" as the "loop-carried values 'i' and "running_sum'", respectively. So, the way the values are passed between the loop bodies is like this. The first time the statement "i := old i + 1" in loop body is executed, the meaning of "old i" is that value of "i" defined in the initial clause of the loop. Subsequent to this, it means that value of "i" defined in the most recent body before the current one. The ordering of the loop bodies' execution is totally determined by the effects of the termination test and the loop-carried values. Notice that the identifier "tmp" does not appear with the modifier keyword "old"; this is because its value is not passed between loop bodies, but is calculated anew in each body. This "loop-local" identifier is similar to those temporaries that are defined in the bodies of parallel for-loops, in that each instance of it is independent of all the others, and we simply allow the reuse of its name for the convenience of the programmer. At the end of the body we see the returns clause, just like that of the for- loop. It can contain the same aggregation and reduction operations, which can operate on any or all of the loop-carried values; they can also contain the same Boolean masks, using the keywords "when" or "unless", to allow operation on subsets of the instances of any or all of the loop-carried values. The loop above returns both a value and an array; it adds up all the elements of the array "A", aggregates all the running subtotals into an array, and returns the total and the array of subtotals. 16.2 One-Body Loops Now we need to return to the philosophy of sequential loops, to make sure all their ramifications are well explained. The for-initial loop is an expression, and all expressions have values. A given loop must return the same number and types of results, regardless of how many or few times it "iterates". If the body of the loop above never executes, because the array "A" is empty and the termination criterion is met the first time it is encountered, then the loop will still try to return the same two values. If we are sure that the array "A" contains at least one value, then the loop as written above will behave correctly, and there will be an initial i-th element to be assigned to the name "tmp"; otherwise, that array reference in the initial-clause will fail with an address-out-of-range runtime error. This situation must be planned for carefully! In the case above, the initial clause is actually the first real instance of the body of the loop. In other circumstances, we might want to program the loop so that the initial-clause sets up the values to be returned in the case where the body is not executable. In such situations, we are actually providing, in the initial-clause, default values to be returned if the body does not execute, and these default values can be considered the identity elements for the operations in the returns clause. In this a case, the loop might change to that shown below: for initial i := array_liml(A) - 1; tmp := 0; running_sum := tmp while i < array_limh(A) & array_size(A) ~= 0 repeat i := old i + 1; tmp := A[i]; running_sum := old running_sum + tmp returns value of running_sum array of running_sum end for In the above version of the loop, we use the initial-clause to set up for the actual loop body, defining the first value of the loop index, "i" as one less than the lower bound of the array, and the value of "running_sum" as zero. Then, the loop body uses a value for "i" that is legal within the array, if the termination test allows the body to execute. The returns clause in this version will return zero and an array containing the single element zero, if the loop body does not execute. The previous version would have failed with a run-time error, if this had occurred. We also see, in the second version, that the termination test is altered, to allow for the array "A" to be empty and to have a lower bound that is greater than its upper bound by more than one. Recall that an empty array can have any bounds as long as the lower bound is greater than its upper bound; it is possible that an algorithmically-generated array might have bounds of, say [-5, 0 ], due to unexpected behavior of the code the generates it. The addition of the array-size primitive protects the loop against this possibility. 16.3 Sequential Array Manipulation But suppose we did not want to return an array containing the single element "0", as shown above, in the case of the body not executing; a more reasonable thing to return in this case might be an empty array. This can be done if we can define an empty array and then go on to fill it one or more elements at a time. This is possible in Sisal, and the code below shows another version of the loop that accomplishes it: for initial i := array_liml(A) - 1; tmp := 0; running_sum := tmp; running_sum_array := array [] while i < array_limh(A) & array_size(A) ~= 0 repeat i := old i + 1; tmp := A[i]; running_sum := old running_sum + tmp; running_sum_array := array_addh(old running_sum_array, running_sum) returns value of running_sum value of running_sum_array end for Here we have added another loop-carried value, "running_sum_array", which is initialized to an empty array in the initial clause and returned in the returns clause. In the body, we see it is operated on by the Sisal primitive "array_addh"; this primitive extends the array given as its first argument, at the upper end of its index range, by one element whose value is the second argument. This affects both the contents of the array and its index bounds, and is a very handy primitive to have in such a case as this. Note that in the returns clause, the second value being returned is now generated by a "value-of" operation, rather than by an operation. Since the array we wish to return was created in the initial-clause and is being incrementally filled by the loop body, we simply return the "last" version of it generated by the loop as a whole. While the incremental expansion of this array as the loop executes would normally be a source of inefficiency, the Optimizing Sisal Compiler is able to optimize such programs so that the structure is built in place quite efficiently. As one might expect by now, there is also a primitive to extend an array at the bottom end of its index range, as well as primitives to change one or more elements of an array, to shrink arrays at either end, and primitives to adjust the index range of arrays. These are specified in the following table, along with all the other array operations we have already seen; in the table, "A" is an array, "v" is a value of the proper type for elements of "A", and "low" and "high" are integers: array primitive effect ======================================================== array_size(A) get number of elements array_limh(A) get upper bound array_liml(A) get lower bound array_addh(A,v) extend at high end array_addl(A,v) extend at low end array_remh(A) remove at high end array_reml(A) remove at low end array_adjust(A,low,high) set both bounds array_setl(A,low) set lower bound A[ i : v ] change value at index i to v A[ i: u, v, w ] change values at indices i, i+1, and i+2 to u, v, and w, respectively A[ i: u; j: v ] change values at indices i and j to u and v, respectively The above primitives allow us to operate on arrays in useful ways, and are most often used in sequential loops, such as we have been discussing. However, they can also occur in other code, such as let-in statements, and it bears emphasizing that the arrays they produce are different values from those they operate on. In other words, we could have the following code: let A := array [ 1: 5, 10, 15, 20, 25]; B := array_addl(A, -5); C := array_addh(A, 30); D := array_reml(A); E := array_adjust(A, 3, 4); F := A[ 1: -5; 3: -15, -20 ] in A, B, C, D, E, F end let The above let-in statement returns the following values: array [ 1: 5, 10, 15, 20, 25] % The array A array [ 0: -5, 5, 10, 15, 20, 25 ] % B = A with new first element and % lower bound array [ 1: 5, 10, 15, 20, 25, 30 ] % C = A with new last element and % upper bound array [ 2: 10, 15, 20, 25 ] % D = A minus first element, with new % lower bound array [ 3: 15, 20 ] % E = A minus several elements, with % new bounds array [ 1: -5, 10, -15, -20, 25 ] % F = A with several elements changed Note that none of the operations on "A" changes its value; Rather, they produce new values of array types whose elements and index ranges are related to those of their argument array. 16.4 Termination Tests and Generation of "old" Values The need to pass values between consecutive instances of a loop body, and in so doing, define an identifier "old {name}" and simultaneously undefine an identifier {name} requires some careful timing. In the case of a top-tested loop, it is not possible to reference both {name} and "old {name}" in the termination test, because the the first time the test is evaluated, only one of these values is defined. The redefinition of names of loop-carried values takes place immediately after the termination test. In the case of the loop shown schematically below, for initial val := initial_value while {test(val)} repeat val := f(old val) returns val end for the value associated with the name "val" changes its name to "old val" right after "test(val)" is executed, and the named "val" becomes undefined at the same time. Note that the test can contain references to "val" but not "old val", since the latter is not defined the first time the test is encountered. This form of loop is called a "top-tested loop." If a test which refers to both names is desired, it must be placed at the bottom of the loop body, as in the following example: for initial val := initial_value repeat val := f(old val) while {test(val, old val)} returns val end for The first time "test(val, old val)" is executed, both names are defined, since the former was just calculated and the latter was calculated in the initial clause. Immediately after the test, the value named "val" changes its name to "old val" and the value "val" becomes undefined at the same time. This form of loop is called a "bottom-tested loop." Next are a few exercises intended to cement the concepts discussed above. 17. Exercise Set 7 Exercise 7.1: Newton Square Root Loop Given: a real value Y, and the formula for Newton's square root algorithm, x(i) = (Y/x(i-1) + x(i-1))/2, where x(0)=Y/2; Given: termination criterion: |Y - x(i)**2| < epsilon, a real value; Compose: a for-initial loop that calculates successive approximations x(i) until the termination criterion is met, and returns the final value of the approximation as well as the number of approximations, x(i), i > 0, that have been calculated. Exercise 7.2: First Maximum and Minimum Loop Given: a one-dimensional array, A, of real values; Compose: a for-initial loop that finds the first instance of the minimum and maximum values present in the array, and returns these values and their index positions. Exercise 7.3: Successive Relaxation Loop Given: a two-dimensional array, A, of real values; Given: a predefined termination test function converged(x,y,eps), which takes two conformable two_dimensional arrays, x, and y, and a real value eps, and returns a Boolean value if its metric is met with respect to its third argument; Compose: a function containing a for_initial loop that uses the function from exercise 16 to successively relax A until convergence occurs, and returns the converged value of A and the number of relaxations that have occurred. Hint: Since the convergence function requires two arrays, one could use a bottom-tested loop... NOTE:ÊUnlike the previous exercise involving relaxation, the entire system array here should be relaxed. The averaging forumla should be modified to account for fewer neighbors are the edges of the system. Exercise 7.4: Conway's Game of Life Given: a two-dimensional array, Board, of integer cell values, zero for a dead cell and positive nonzero for a live one; Given: the boundaries cells around the edge of Board are always dead; Given: the rules by which a non-boundary cell lives or dies: if three or fewer neighbor cells are alive, then the cell dies; if five or more neighbors cells are alive then the cell dies; if four neighbor cells are alive, then increment the current cell value; Compose: a program that plays evolves the Board array until some input number of steps are completed, and returns an array of all the values taken on by Board. Answers to Exercise Set 7 Exercise 7.1: Newton Square Root Loop Given: a real value Y, and the formula for Newton's square root algorithm, x(i) = (Y/x(i-1) + x(i-1))/2, where x(0)=Y/2; Given: termination criterion: |Y - x(i)**2| < epsilon, a real value; Compose: a for-initial loop that calculates successive approximations x(i) until the termination criterion is met, and returns the final value of the approximation as well as the number of approximations, x(i), i > 0, that have been calculated. for initial x := Y / 2.0; count := 0 while abs(Y - x*x) > epsilon repeat x := (Y/old x + old x) / 2.0; count := old count + 1 returns value of x value of count end for Exercise 7.2: First Maximum and Minimum Loop Given: a one-dimensional array, A, of real values; Compose: a for-initial loop that finds the first instance of the minimum and maximum values present in the array, and returns these values and their index positions. for initial i := array_liml(A); max_val := A[i]; min_val := A[i]; max_ind := i; min_ind := i; while i < array_limh(A) repeat i := old i + 1; max_val, max_ind := if A[i] > old max_val then A[i], i else old max_val, old i end if; min_val, min_ind := if A[i] < old min_val then A[i], i else old min_val, old i end if; returns value of max_val value of max_pos value of min_val value of min_ind end for Exercise 7.3: Successive Relaxation Loop Given: a two-dimensional array, A, of real values; Given: a predefined termination test function converged(x,y,eps), which takes two conformable two_dimensional arrays, x, and y, and a real value eps, and returns a Boolean value if its metric is met with respect to its third argument; Compose: a function containing a for_initial loop that uses the function from exercise 16 to successively relax A until convergence occurs, and returns the converged value of A and the number of relaxations that have occurred. Hint: Since the convergence function requires two arrays, one could use a bottom-tested loop... NOTE:ÊUnlike the previous exercise involving relaxation, the entire system array here should be relaxed. The averaging forumla should be modified to account for fewer neighbors are the edges of the system. type OneDim = array [ real ]; type TwoDim = array [ OneDim ]; function relax( A : TwoDim returns TwoDim ) for i in array_liml(A), array_limh(A) cross j in array_liml(A[i]), array_limh(A[i]) % NOTE:ÊThis form of the code in this loop should vectorize. top_row := i = array_liml(A); btm_row := i = array_limh(A); lft_side := j = array_liml(A[i]); rgt_side := j = array_limh(A[i]); top_val, cnt1 := if top_row then 0.0, 0 else A[i-1,j], 1 end if; btm_val, cnt2 := if btm_row then 0.0, cnt1 else A[i+1,j], cnt1+1 end if; lft_val, cnt3 := if lft_side then 0.0, cnt2 else A[i,j-1], cnt2+1 end if; rgt_val, cnt4 := if rgt_side then 0.0, cnt3 else A[i,j+1], cnt3+1 end if; lft_btm_val, cnt5 := if lft_side | btm_row then 0.0, cnt4 else A[i+1,j-1], cnt4+1 end if; rgt_btm_val, cnt6 := if rgt_side | btm_row then 0.0, cnt5 else A[i+1,j+1], cnt5+1 end if; lft_top_val, cnt7 := if lft_side | top_row then 0.0, cnt6 else A[i-1,j-1], cnt6+1 end if; rgt_top_val, cnt8 := if rgt_side | top_row then 0.0, cnt7 else A[i-1,j+1], cnt7+1 end if; divisor := real(cnt8+1); avg := (A[i,j] + top_val + btm_val + lft_val + rgt_val + lft_btm_val + rgt_btm_val + lft_top_val + rgt_top_val) / divisor returns array of avg end for end function % relax function successive_relaxation( A : TwoDim returns TwoDim, integer ) for initial relaxed_A := A; step_count := 0; repeat relaxed_A := relax(old relaxed_A ); step_count := old step_count + 1; until converged(relaxed_A, old relaxed_A, eps) returns value of relaxed_A value of step_count end for end function % successive_relaxation Exercise 7.4: Conway's Game of Life Given: a two-dimensional array, Board, of integer cell values, zero for a dead cell and positive nonzero for a live one; Given: the boundaries cells around the edge of Board are always dead; Given: the rules by which a non-boundary cell lives or dies: if three or fewer neighbor cells are alive, then the cell dies; if five or more neighbors cells are alive then the cell dies; if four neighbor cells are alive, then increment the current cell value; Compose: a program that plays evolves the Board array until some input number of steps are completed, and returns an array of all the values taken on by Board. %========================================================================== define main % Game of Life by John Conway. Values of the Board cells are 0, for dead % cells, or positive integers for live cells. % Each cell has 8 neighbors; boundaries always contain dead cells. % On each iteration, the cells are updated as follows: % -- If a cell has more than four live neighbors, then it should die. % -- If a cell has fewer than four live neighbors, then it should die. % -- If a cell has exactly four live neighbors, then % if it is deal, then it should be born; % if is alive, then it should stay alive. % -- If a cell stays alive in a step, its value incremented by one. % The simulation of life iterates "Iterations" times. % The board rows are all indexed from 1 to N, and the columns are all % indexed from 1 to M; M and N may not be equal. type OneDim = array [ integer ]; type TwoDim = array [ OneDim ]; type ThreeDim = array [ TwoDim ]; %========================================================================== function decide( B : TwoDim; I : integer; J : integer returns integer ) % This function decides which cells should live or die. %---------------------------------------------------------- function test( B : TwoDim; p, q : integer returns integer ) % This embedded function determines if a given cell is % currently alive; it returns numeric values, 1 for alive % and 0 for dead, so its results can be added by its % caller to similar results for other cells. if B[p, q] > 0 then 1 else 0 end if end function % test %---------------------------------------------------------- % count the neighbors of cell [I,J] let Neighbor_count := test(B,I-1,J-1) + test(B,I-1,J) + test(B,I-1,J+1) + test(B,I, J-1) + test(B,I, J+1) + test(B,I+1,J-1) + test(B,I+1,J) + test(B,I+1,J+1) % change its state, if necessary in if ( Neighbor_count >= 5 ) then 0 elseif ( Neighbor_count = 4 ) then B[i,j] + 1 elseif ( Neighbor_count <= 3 ) then 0 end if end let end function % decide %========================================================================== function evolve( B : TwoDim returns TwoDim ) let Num_Rows := array_size(B); Num_Cols := array_size(B[1]); First_row := B[1]; Last_row := B[Num_Rows]; Core := % evolve the central portion of B for I in 2, Num_Rows-1 Mid_row := % evolve the central portion of row I for J in 2, Num_Cols-1 returns array of decide(B,I,J) returns array of Mid_row end for; Long_Core := % add the left and right boundaries to each core row for m in 2, Num_Rows-1 Long_Row := array_addl( Mid_row, 0 ); returns array of array_addh( Long_Row, 0 ) end for in % add the top and bottom boundary rows to the long core array_addl(array_addh(Core,Last_row),First_row) end let end function % evolve %========================================================================== function main( Iterations : integer; Board : TwoDim returns Three_Dim ) for initial Count := Iterations; B := Board; while ( Count > 0 ) repeat Count := old Count - 1; B := evolve( old B ); returns array of B end for end function % main %========================================================================== 18. Compiling and Running Sisal Programs Up to this point we have been dealing with Sisal and a pedagogical tool. But since it is intended to support real-world programming on contemporary uniprocessor and multiprocessor systems, we should discuss how to use compile and run Sisal programs, using the Optimizing Sisal Compiler (OSC) and runtime system. OSC is available for most Unix systems, and provides for parallel execution on a number of shared memory systems. Prototype versions of OSC exist for distributed memory and multithreaded systems, but performance and robustness on these systems is not likely to be as high as on shared memory systems. A later section will describe how to acquire and install the Sisal system software, For now, we will assume the compiler is available and describe how it works in actual use. Assuming that the user has a Sisal program composed and ready to compile, the first step is to run it through OSC for syntax checking. Once the program is syntactically correct, it is guaranteed to be semantically correct, and is likely to be at least close logically to what the programmer intended. So, let us take a minute to discuss the overall structure of most Sisal programs by looking at an example. Then we will describe how to compile and run the example on a real parallel computer. The program following is a standard example, used to illustrate the structure of a typical technical application written in Sisal. 18.1 Trapezoidal Rule Integrator define main % This module is a trapezoidal rule integration program. % For any supplied function f(X:double_real), it takes % double real inputs defining endpoints of the interval % to be integrated over, a double real tolerance, and an % integer count of sub-intervals to divide the interval % into. It returns an array of double_real values of the % numerical value of the definite integral over the interval, % the number of refinement iterations made, the number of % subintervals actually used to arrive at the final result, % and the total number of subinterval area calculations made. type One_Dim_D = array [ double_real ]; type Two_Dim_D = array [ One_Dim_D ]; % INSERT FUNCTION F HERE % %=============================================================== function f( x : double_real returns double_real ) 15.0d0 * x * x * x - 6.375d0 * x * x + 11.9d0 * x - 33.999d0 end function %% f %=============================================================== function trap_area( left_end, right_end : double_real returns double_real ) % Calculates the area of a trapezoid let h := right_end - left_end; f_avg := ( f(left_end) + f(right_end) )/2.0d0 in h * f_avg end let end function %% trap_area %=============================================================== function main( a, b : double_real; % Endpoints of integration % interval epsilon : double_real; % tolerance initial_n : integer; % Initial number of % subintervals returns One_Dim_D, % values of approximation % of definite integral integer, % iteration counter integer ) % number of subintervals % finally used for initial N := initial_n; Total_N := N; Cnt := 1; H := (b - a) / double_real(N); Area := for i in 1, N ai := a + double_real(i-1) * h; bi := ai + h; returns value of sum trap_area( ai, bi ) end for repeat N := old N * 2; Total_N := old Total_N + N; Cnt := old Cnt + 1; H := old H / 2.0d0; Area := for i in 1, N ai := a + double_real(i-1) * h; bi := ai + h; returns value of sum trap_area( ai, bi ) end for until abs( area - old area ) < epsilon | Cnt > 12 returns array of Area value of Cnt value of N value of Total_N end for end function %% main % Looking at the commentary at the head of the program, and at the main function (function "main), we can see that the program is an integration approximator, that recursively refines an initial number of subdivisions of an integration interval by halving the width of each subinterval (doubling their number) and using the total area of the trapezoidal approximation for each subinterval as the definite integral. This process stops when either a sufficient number or recursive refinements have taken place, or when a convergence test is past. The arguments to the program are the x-values of the integration interval endpoints, the convergence tolerance, and the initial number if subintervals. The outputs are the integral approximation, the number of refinement steps or iterations taken, the number of subintervals finally use to produce the final answer, and the total number of subinterval area values calculated. The function being integrated must be inserted into the code of this simple sample program; there is one present already, a polynomial, whose integral can readily be calculated by hand as a check on the program's operation. In structure, the program has a sequential outer loop, the initial clause of which computes the initial approximation, given the initial number of subintervals. In the loop body, the subintervals are refined and another approximation is calculated. The convergence test is at the bottom of the loop, and uses both the freshly computed approximation and the most recent previous one to determine convergence. The outer sequential loop contains a for-loop which ranges over the number of subintervals currently being used. In this loop a function named "trap_area" is called to calculate the area of one trapezoid. Function "f" is called by "trap_area" to calculate the function values at the trapezoidal endpoints. The doubling of the number of subintervals in each iteration of the outer loop causes the range of the for- loop to double, increasing the amount of work that is exploitable for parallel execution. This doubling justifies the use for an iteration ceiling in the convergence test, to make sure the program does not run away with itself under pathological numerical conditions, and generate an unbounded amount of work, swamping the runtime system in the process. The code in the file accessed by the above anchor should be in shape for compilation. The contents of the file accessible below are sample inputs for the program, once it is ready to run. 18.2 Trapezoidal Rule Integrator Program Input 1.0d0 2.0d0 1.0d-6 10 The shell actions and results of compiling and executing the program can be seen by clicking on the following anchor: 18.3 Compiling and Executing the Trapezoidal Rule Integrator Program unixbox> osc traprule2.sis.html Production OSC version V12_9_1 LL Parse, using binary files * Reading file: trap.sis... version 1.8 (Mar 28, 1989) accepted 82 lines in program 0 errors ( calls to corrector) 0 tokens inserted; 0 tokens deleted. 0 semantic errors if2mem: W - array of ON LINE 77 OF main IN trap.sis MAY INTRODUCE COPYING unixbox> s.out SGI SISAL 1.2 V12_9_1 1.0d0 2.0d0 1.0d-6 10 [ 1,11: # DRC=1 PRC=1 2.532788e+01 2.525147e+01 2.523237e+01 2.522759e+01 2.522640e+01 2.522610e+01 2.522602e+01 2.522601e+01 2.522600e+01 2.522600e+01 2.522600e+01 ] 11 10240 2047 The first line shows OSC being invoked, with no arguments other than the name of the source file. The Optimizing Sisal Compiler has a large number of options allowing in-depth control of the entire multi-phasic compilation and optimization process. These options are documented in the "OSC" man pages which are installed during the compiler installation process. The compilation proceeds through various stages, translating Sisal source into a data flow graph language, called IF1, translating that into an annotated memory management graph language called IF2, and finally producing C code; the target machine's compiler is then invoked to complete the process and leave an executable memory image, called "s.out" by default, containing the runtime system for I/O and parallel task management. The compiler can be ordered to return varying amounts of information to the terminal. Of particular interest are syntax error messages, which are not shown in this example. When compilation is completed, execution is ordered by typing the name of the executable, which in this case is the default "s.out." Extensive options also exist for controlling the execution, and these, too, are documented in the "s.out" man pages which are automatically installed during compiler installation. In this example, no options are present. The runtime system presents a startup message to the terminal, and then waits for inputs from the Unix input channel "standard Input," since no file name was given. The inputs are typed in manually on the next line, and execution proceeds. Again, no output file name was given, output is sent to the Unix output channel "standard output", and appears on the terminal. The outputs are those specified in the program: (1) the array of subinterval approximations - eleven of them, since we asked for ten iterations to be taken; (2) the total number of approximations steps, including the first one in the initial clause of the outer loop; (3) the total number of subintervals used to compute the final answer; and, (4) and the total number of area evaluations performed in the program's execution. The odd form of the array of area approximations is due to the fact that Sisal has a syntax for its inputs and outputs. This is because, since all objects are dynamic, their boundaries must be denoted for proper handling on input, and because one Sisal program's output is immediately suitable for use as input to other Sisal programs. The name of the simple I/O language is "Fibre," and more information on it can be found in the Fibre manual [47]. Next to observe are the results of more complex executions: 18.4 Executing the Trapezoidal Rule Integrator Program in Parallel unixbox> cat > trap.inp 1.0d0 2.0d0 1.0d-6 10 unixbox> s.out trap.inp trap.out SGI SISAL 1.2 V12_9_1 unixbox> s.out -r -w1 -z inp SGI SISAL 1.2 V12_9_1 unixbox> s.out -r -w4 -z inp SGI SISAL 1.2 V12_9_1 unixbox> more s.info Workers DsaSize ExactFit DsaHelps 1 2000000b 100b 0 MemW MemU LpSliceV ArrayEx 1040b 1256b 1 100 CpuTime WallTime CpuUse 0.0800 0.0800 100.0% Workers DsaSize ExactFit DsaHelps 4 2000000b 100b 0 MemW MemU LpSliceV ArrayEx 1240b 1528b 4 100 CpuTime WallTime CpuUse 0.0200 0.0200 100.0% 0.0200 0.0200 100.0% 0.0200 0.0200 100.0% 0.0200 0.0200 100.0% In this file, we see the results of a parallel execution. First, the input is put into a file for convenience, using the Unix "cat" command. Next, execution is performed with the names of input and output files specified. Then execution is performed again with options "-r -w1 -z". These tell the runtime system to turn on execution performance monitoring, run with one worker process, and suppress output, respectively. The next execution uses the same options, except for "-w4", which specifies the use of four worker processes. This allows the exploitation of parallelism, with the runtime system slicing the available loops and partitioning the work to the worker processes. Finally, see the display of the execution info file, "s.info," through the use of the Unix "more" command. This file contains the worker process execution times and memory usages for the two executions made with the "-r" option specified. The sequential run shows the single-worker cpu time is shown as 0.08 seconds, and each of the worker cpu times for the four- worker run are shown as 0.02 seconds. This indicates that the compiler and runtime system did a good job of exploiting the available parallelism, and the result was an execution speedup of four. 19. Sisal I/O There remains one more topic to cover, that of dealing with I/O in Sisal programs. The general topic of I/O in functional programs is one of hot debate and current research. This is because reading and writing are inherently non-functional. These operations are in fact supposed to have side-effects, so they cannot be done within a program, at least not with the freedom to be found in imperative languages. While these matters are being addressed, and there are interim work-arounds for the most needed forms of I/O, the designers of Sisal 1.2 took a conservative tack in implementing the I/O facilities they added to the language. To preserve functional semantics, the I/O that can be done in Sisal is either a bit stiff or a bit uncertain. We will look at both of these areas in the following sections, starting with the formal and ending with the informal. 19.1 Sisal I/O: The Fibre Syntax We have already seen how I/O is done in general in Sisal programs. The main function gets all the inputs as its arguments or parameters, and produces all the outputs as its results. This is workable for programs of the sort that we have seen in our examples - numerical, scientific ones that tend to transform their inputs into their outputs without much I/O needed during processing. But what we haven't explored are effects of Sisal's dynamic objects on exactly how input data and output results are specified. Since all arrays are of dynamic size, when one or more arrays are provided as inputs, they must be distinguishable from each other and from other inputs. If a number of values are provided, some of which are array elements, there is no way for the compiler or runtime systems to predict how many of them belong to the arrays. We solve this problem, maintaining the freedom to change input array sizes, by using a separate small language for the inputs (and outputs) of Sisal code. This language, called "Fibre", has its own small set of syntax rules that allow Sisal's I/O routines to determine the boundaries of aggregates. As one might imagine, this syntax is quite simple, and is completely documented in [47]. 19.1.1 Scalars In a nutshell, numbers of all types are specified the same as in program expressions (see sections 3 and 4). Boolean values are specified by the single unquoted characters T and F. Characters are specified by embedding them in single quote characters, such as 'Q'. (Nonprinting characters are specified using quoted character pairs that begin with the backslash character \; for details on this, the reader is referred to the Fibre manual.) Fibre also allows for commentary, in the form of text begun with the pound sign character # and continuing to the end of the line. The specification of aggregates is where the syntactic elements come into play. 19.1.2 Arrays Arrays in Fibre look very much like they do in Sisal expressions. They begin with an opening square bracket, [, followed by a lower index bound; next is an optional comma and upper bound; then, a colon, to separate the bound(s) from the elements; the elements are separated by one or more spaces, and a final closing square bracket ] closes the array. Here are a few examples to illustrate this: [1: ] [4,2: ] [0: 21 22 23 ] [ 1 : 1.0 2.0 3.0 4.0 ] [ 1, 4 : 1.0 2.0 3.0 4.0 ] [ 0, 5: 'h' 'e' 'l 'l' 'o' '!' ] The first two arrays are empty; the first has lower bound 1 and upper bound 0, by default; the second has specified bounds, but the lower is greater than the upper. The third one is an array of integers with lower bound 0 and (implied) upper bound 2. The fourth is an array of reals with lower bound 1 and upper bound 4. The fifth is the same array as the fourth, but with the upper bound specified rather than implied. The sixth example is an array of characters. Obviously, the spacings of the brackets and bounds are free-form; white space is not significant and arrays may break across line boundaries. Arrays of characters can be represented with a special string notation using double quotes, as follows: " Sisal is cool!" is the same array as [1: ' ' 'S' 'i' 's' 'a' 'l' ' ' 'i' 's' ' ' 'c' 'o' 'o' 'l' '!' ]

Similarly, multidimensional arrays also look just like they do in program 
code. Since each element of a two-or-greater-dimensional array is itself an 
array, the above rules are applied recursively. Here are a few more examples 
of this:

   [1: [1:] ]
   [1: [ 1: 10 20 30 40]
       [ 3: 5 10 15 ]
       [ 2, 3: 101 102 ] ]
   [ 1, 2 : [ 1, 2 : 1d0 2.0d0 ] [ 3.0d0 4d0 ] ]
   [ 1, 2: [ 1: 'g' 'o' 'o' 'd' ] [ 1: 'b' 'y' 'e' ] ]
   [ 1: [ 1: [ 1: [ 1: 11 12 13 ]
                  [ 1: 21 22 23 ] ]
             [ 1: [1 : 31 32 33 ] ] ]
        [ 1: [ 1: [1: 41 42 43 ] ] ] ]

The first is a non-empty two-dimensional empty whose only element is an empty 
one-dimensional array. (This sort of thing can get confusing, so caution is 
urged!) The second example is a two-dimensional array of integers; note that 
the individual row arrays have different index ranges. The third example is a 
two-dimensional array of double_reals. The fourth is a two-dimensional array 
of characters. The last example shows a four-dimensional array of integers. 
Indents and line breaks, while optional, can help clarify complex aggregates.

19.1.2 Records
Record syntax involves using angle brackets, <>, to mark record boundaries, 
which causes the Fibre syntax to deviate from the Sisal syntax; in Sisal 
code, records are enclosed in square brackets. Field names are not needed, 
but the field values must be specified in order. So, given the declarations

   type complex = record [ re, im : real ];
   type label = record [ name, addr1, addr2, city, state : array [ char ];
                         zip : integer ]; 

 and a program whose outermost function expected two arguments, of type 
complex and label, respectively, we could have the following Fibre data:

<3.1416, 2.1828>
<"DeBoni" "LLNL" "P.O. Box 808" Livermore "CA" 94550>

19.1.3 Other aggregates
Sisal syntax also allows for two other kinds of aggregate data types, streams 
and unions. Since we didn't treat these types in this tutorial, we will not 
do so here, either. The reader is referred to the Sisal Language Manual [33] 
and the Fibre manual [47] for more information.

19.2 Using Fibre

Fibre is the syntax for both input and output. When executing a program, as 
we saw in section 18, data can be provided directly from the keyboard, and 
execution will not begin until the Fibre parser recognizes all the values it 
was told by the compiler to expect. or, it can be put into a file and 
directed to the executable, as we also saw. This is generally the most 
convenient method, since most programs have more than just a few input 
values. But there's also another virtue to doing it this way. If input and 
output come from and go to files, the output file from one run can easily 
become the input file for a subsequent one. This requires forethought in 
program design, to insure that everything needed as input is also produced as 
output, but it's a good ides to return all major inputs as outputs, anyway, 
for purposes of documenting the run.

The only real downside to using Fibre is the fact that there currently are 
only minimal facilities  for formatting output in Sisal's support software. 
This means that the user is often presented with array outputs with one 
element per line. Some editing is usually required to change output files 
into readable form, and sometimes the number of lines can make this 
troublesome. However, there is a very useful trick that one of the Sisal 
users came up with, which deserves mentioning, even thought its details are 
outside the scope of this tutorial. So we will describe it briefly in the 
next section, and let the interested programmer experiment or ask us for 
further guidance on it.

19.3 Sisal I/O: The Real World

While the Fibre I/O language is simple and useful, it tends to feel 
incomplete on first exposure, sine the requirements of real-world programming 
and its needs for I/O services are generally more complicated. Some of this 
can be addressed, but a significant portion is a set of unanswered questions. 
Let us briefly describe these matters and then go on to some real-world 
expedients we can resort to, to get around them.

19.3.1 Getting Output
Real scientific and numerical programs tend not to produce a single result at 
termination time. Rather, they usually produce output continuously, as they 
proceed toward termination. for at least two purposes. First, they often 
generate a sequence of partial answers, which either aggregate to or point to 
the complete result. This can take the form of a series of approximations to 
a final result, as in convergence algorithms; or, it can be a sequence of 
completely evolved simulated systems, at intermediate steps between 
initiation and completion. In either case, the intermediate results are of 
interest for two reasons. They can be useful in charting the progress of a 
program and perhaps identifying where it goes wrong (or gets interesting in 
unexpected ways); and, they can prevent having to restart a lengthy 
simulation run at the beginning, after an abnormal termination due to 
hardware problems.

In both cases, it is a good idea to produce all the intermediate values as 
well as the final one. In Sisal, the typical technical program is composed as 
a sequential outer loop containing a set of parallel loops. The outermost 
loop steps the algorithm, using the results of one step as input to the next. 
It is no greater problem to return "array of system", that it is to return 
"value of system" at the bottom of the outer loop, and this is an obvious way 
to produce the intermediate steps' values. However, this array result will 
still not appear until the program terminates, since it is not fully formed 
until then, so this is at best a partial solution. Further, it adds to the 
problem by causing the answers produced by such programs to require a 
potentially unbounded amount of space to store them as they are built.

19.3.2 Stream Output
The "stream" data type was originally designed to provide another part of the 
solution. Streams, unlike arrays, are produced sequentially by elements, not 
all at once. This allows the storage requirements to be bounded by the size 
of a small number of the stream elements ("system", in our current example ). 
If the outer sequential loop of a program were to return "stream of system", 
then in theory, each element "system" would be available as it was completed 
by an iteration of the loop. However, in the current implementation of the 
Optimizing Sisal Compiler and Runtime System, streams are not built this way; 
instead they are treated as arrays, for practical reasons involving program 
performance,  optimization, and efficiency. A limited form of stream data 
types is now being designed for reimplementation, and future versions of the 
compiler should handle them acceptably. For now, streams can be used, but 
will act as arrays. So the problem remains unanswered.

19.3.3 Peek Output
However, the needs for intermediate output, added to the requirements of 
debugging Sisal programs, stimulated the introduction of a "meta-feature" 
into Sisal, called "peek". This is a primitive that serves no purpose in the 
language, and is used in programs only to provide quick-and-dirty output. 
Basically, peek is called with a single argument, any value name or 
expression, and it returns an integer which can be subsequently ignored. Peek 
is used not for its results, though, but for its side effects, which are to 
write the value of its argument to the Unix output file called "standard 
error." This normally appears at the terminal, but can also be directed into 
a file along with other program output. We will offer no examples of peek's 
use here, because it is dangerous, not in changing program results or causing 
abnormal terminations, but because calls to peek are sometimes removed by the 
optimization stages of the compiler. This is because its results are usually 
not used, and so have no effect on program execution. When they survive 
optimization, the calls to peek are often reordered, along with other 
sections of code, so results sometimes do not come out in the order 
expected.Still, when used with discipline, for simple debugging or saving 
intermediate values, peek can be useful and even dependable.

19.3.4 Advanced Output Trickery
The I/O trick we alluded to above, for producing intermediate results in 
intelligible form, is this. Sisal has a very advanced feature, called the 
Foreign Language Interface, that allows what we call "mixed language 
programming." This allows the integration of Sisal code with Fortran and/or 
C, with appropriate data being passed into and out of the Sisal runtime 
system. The data so passed is NOT in Fibre form, which allows I/O to be done 
with relative ease from outside of Sisal. The very clever user we mentioned 
above figured out how to invoke X-Windows display software from within an 
outer program shell written in C. He then let the outer program invoke a 
Sisal program to solve a big problem with parallel execution, and return 
solution data to the outer program. The outer program then displayed the 
answers using X graphics. Carried a bit further, the outer sequential loop 
could be written in the outer program, so that the intermediate results of 
each invocation of the Sisal program could be displayed as soon as they were 
returned from the Sisal portion of the program.

We strongly believe that using multiple languages, at various levels within a 
program's structure, to do work appropriate for the language and level 
represents a strong approach to the problem of parallel programming in 
general. It also adds to the programmer's burden the need for fluency in 
multiple languages as well as the interfaces among them. But this is grist 
for the research mill, and we believe that acceptable solutions are in the 
offing.

20. Exercise Set 8

These exercises are more extensive and less well defined than those given 
previously, to offer the reader more freedom in adapting them to her or his 
immediate purposes in learning the Sisal language. Also, no suggested answers 
are provided for them. However, the Sisal project staff stands ready to 
assist sincere efforts. To reach us, see the section titled "23. For More 
Information..."

Exercise 8.1: Computing Pi
Assuming that you already have a good machine-precision value for Pi, write a 
program that will determine what value of N will provide equivalent precision 
on your computer, using the classical expression

             n  (-1)^(i+1)
     pi/4 = sum ----------
            i=1   (2i-1)

Exercise 8.2 Computing the Mandelbrot Set
Given the Mandelbrot formula
        z(n+1) = z(n)**2 + c,
write a program that accepts coordinates of a rectangular region of the 
complex plane, a size for each dimension in points, and a value for c as 
inputs, and then iterates every point in your region until |z| > 2, counting 
the iterations needed for each point. The output should be the array of 
iteration counts.

Exercise 8.3 Bubble Sort
Write a program that sorts a set of values via the bubble sort method. This 
method iteratively compares adjacent pairs of values, swapping them is the 
smaller value has the higher index value; this causes the smaller (or larger) 
values to bubble up (or down) depending on which direction through the index 
range the search is started at. This iteration occurs N-1 times, for a set of 
N values being sorted. A standard refinement has the final index position 
treated removed from the next phase, since it has received its correct value, 
already; so, each phase is one position shorter than the previous one.

Exercise 8.4 Compound Interest with Increasing Principal
The formula for the value of a savings account with a principal p, which is 
invested at compound interest, with an annual rate r, for m years, compounded 
n times per year is
     c * (1+r/n)**(m/n).
Assuming a savings plan which takes the principal p by amount s very year, 
write a program that takes p, r, m, n, s , and the number of years, y, as 
inputs, and computes the total value of the savings program at the end of 
each year.

Exercise 8.5 Prime Factorizations
Every integer has a unique factorization into primes. Given that the largest 
prime factor of any integer will be one half its value, and given also a 
large list of primes which you supply as input, write a program that, given 
an argument number, generates and array of counts corresponding to the input 
list of factors, where each count is the number or times the corresponding 
prime occurs in the prime factorization of the argument number. For instance, 
the input list of primes might be [1: 2 3 5 7 11 13 17 19 23 ], and the 
argument number and their factorizations might be 
        46 and [1: 1 0 0 0 0 0 0 0 1 ]
        42 and [1: 1 1 0 1 0 0 0 0 0 ]
        24 and [1: 3 1 0 0 0 0 0 0 0 ]

You might want to take an array of numbers as input and return an array of 
factorizations. You might also want to calculate the array or primes, to get 
values that reach or exceed half the argument number.

Exercise 8.6 Triangular System Solution
Given an upper triangular coefficient matrix for a linear system, along with 
a vector of right-hand side values for the system, write a program that 
solves for the values of the unknowns using the back-substitution method. For 
instance, for a 4 by 4 system the last
unknown is calculated as
     x(4 ) = b(4)/ a(4,4); and
     x(3) = 1/a(3,3)*(b(3)-x(4*a(4,4).
Your program should be general enough for any size system, even one that is 
over-specified (has more equations than unknowns), and the output should the 
be array of values for the unknowns.

21. Advanced Features

Sisal has a number of other features which deserve mention here, even though 
detailed description of them is beyond the scope of this tutorial. We will 
briefly describe several of them, and give a reference to a document or 
person to use as a source for further information.

21.1 Union Types

In order to produce data structures that can take on a recursive structure, 
such as lists and trees, it is necessary that Sisal have a data structuring 
ability equivalent to Pascal's variant records. Sisal's equivalent of this is 
the Union Type; it allows multiple forms of data to occur under a single 
type, identified by a tag that can be tested by user code. This allows, for 
instance, a tree to be constructed of a union type consisting of internal 
nodes or leaf nodes. For more information on Union Types, see the Sisal 
Language Manual [33].

21.2 Streams

Sisal includes a data type that allows producers and consumers of information 
to execute simultaneously on a shared structure, called a Stream. Streams are 
homogeneous aggregates, like arrays, except that they can be accessed only 
for their first element or all but their first element. This allows the 
construction of programs that operate on continuous values to or from other 
programs or outside environments. For more information on Streams, see the 
Sisal Language Manual [33].

21.3 Debugging

Debugging functional programs, despite the tendency to make fewer mistakes in 
composing them, tends to be somewhat difficult, mainly because functional 
language semantics makes input and output difficult, at best. However, we 
have developed a state-of-the-art symbolic debugging system, incorporating 
its own compiler and graph interpreter. This debugging system is called 
Twine, and it is supported as an experimental tool. For more information on 
Twine, see [36].

21.4 Mixed Language Programming

It is possible to merge programs in Fortran or C with Sisal code, in order to 
best use the features of each language, via a facility called the Foreign 
Language Interface. For instance, Sisal can be used to exploit parallelism in 
an algorithm, while the imperative languages can be used for I/O or access to 
special purpose libraries. For more information on mixed language programming 
and the Foreign Language Interface, see [10].

21.5 Distributed Memory Sisal

The currently available version of the Optimizing Sisal Compiler (OSC) and 
Runtime System support parallelism only on computer systems having a shared 
memory programming model. These include such common computers as 
multiprocessor Sparcs, SGI's, Crays, and even a few machines with physically 
distributed memory that can be used in a shared address space mode, such as 
the BBN TC-2000, nCube, and Encore Multimax. However, prototype versions of 
Sisal exist for machines with only distributed memory programming models, 
such as the Thinking Machines CM-5 and Cray T3D. For more information on 
Distributed Memory Sisal, see [1].

21.6 Newer Versions of Sisal

Since the currently available Sisal language, Sisal 1.2, is based on a design 
that is approaching ten years of age, we are continuously exploring 
improvements and enhancements to it as an exercise in language design. 
However, our intentions are to produce a working programming language with a 
high performance compiler and support software. To this end, we produced a 
paper language, Sisal 2.0, to foster debate as to what features were 
appropriate and useful for a real language. For more information on Sisal 
2.0, see [4]. We are also working on a much expanded version of Sisal, called 
Sisal-90, that will arrive on the scene sometime in the next year, with a 
compiler and other support software. This new version will not only contain 
more modern features, but will be much more expandable that the current 
version, while retaining enough syntactic similarity with the Sisal 1.2 to 
allow some automatic translation assistance. It will continue to use our 
intermediate form languages, IF1 and IF2, and will continue to utilize the 
powerful optimization back-end of our current compiler. The Sisal-90 manual 
is in preparation; for more information on Sisal-90, contact the author of 
this tutorial, or see the section titled "23. For More Information...".

21.7 Performance Monitoring Tools

Developing parallel programs seems to be only the first part of the parallel 
programming problem. After the program is running correctly, it must be tuned 
for efficient utilization of machine resources, such as memory and 
communications bandwidth. To support this, we are working on experimental 
tools designed to provide the programmer feedback as to how a program is 
running on a parallel system. To learn more about this effort, contact the 
author of this tutorial, or see the section titled "23. For More 
Information...".

22. How to Get the Sisal Software

The currently released versions of the Sisal compiler and runtime system 
support software is available from the anonymous ftp server operated by the 
Computer Research Group at Lawrence Livermore National Laboratory. This 
server contains directories and compressed tar files of Sisal software, 
source programs, and documentation, and is located at 
"ftp://sisal.llnl.gov/pub/sisal".

In addition, we maintain a number of mailing lists for interested parties and 
Sisal developers:

sisal-info@sisal.llnl.gov    : general sisal-related information (BROADCAST)
sisal-request@sisal.llnl.gov : requests to added or deleted from sisal-info
sisal-bugs@sisal.llnl.gov    : to report sisal software bugs
ssci-help@sisal.llnl.gov     : SSCI consulting help

23. For More Information...

The Sisal Project Home page is located at "http://www.llnl.gov/sisal".

The following is a partial bibliography of our publications on Sisal and 
related topics. Papers may be ordered directly from

Judy Michels
P. O. Box 808, L-417
Lawrence Livermore National Laboratory
Livermore, CA 94551
(510) 422-4236
judy@diego.llnl.gov

or from either of

John Feo at (510) 422-6389 or feo@diego.llnl.gov, or
Tom DeBoni at (510) 423-3793 or deboni@llnl.gov.

23.1 Sisal Publication Bibliography

[1] Beard, Patrick; An Implementation of SISAL for Distributed-Memory 
Architectures; Jun-95;

[2] Cann, D.; Feo, J.; Sisal versus FORTRAN: A Comparison Using the Livermore 
Loops; Apr-90; UCRL-102263 (2);

[3] Cann, D.; Feo, J.; Bohm, W.; Oldehoeft, R.; SISAL: Toward Resolving the 
Parallel Programming Crisis; May-92; UCRL-JC-109774;

[4] Cann, D.; Feo, J.; Bohm, W.; Oldehoeft, R.; The SISAL 2.0 Reference 
Manual; Dec-91; UCRL-MA-109098;

[5] Cann, D.; Feo, J.; Bohm, W.; Oldehoeft, R.; A Guide to the Optimizing 
SISAL Compiler; Sep-91; UCRL-MA-108369;

[6] Cann, D.; Feo, J.; DeBoni, T.; SISAL 1.2: High Performance Applicative 
Computing; May-90; UCRL-JC-103980;

[7] Cann, D.; Wolski, R.; Feo, J.; Parallel Functional Computation: Current 
Results and Observations; Jun-91; UCRL-JC-107022;

[8] Cann, David; SISAL: 1.2 A Brief Introduction and Tutorial; May-92 UCRL-
MA-110620;

[9] Cann, David; Framework Preconstruction and Aggregate Storage Subsumption 
Optimizations for Sisal 1.2; May-92; UCRL-MA-110701;

[10] Cann, David; The Optimizing SISAL Compiler: Version 12.0; Apr-92; UCRL-
MA-110080;

[11] Cann, David; Retire Fortran? A Debate Rekindled; Mar-92; UCRL-JC-107018 
(2);

[12] Cann, David; Retire Fortran? A Debate Rekindled; Jul-91; UCRL-JC-
107018(1);

[13] Cann, David; Retire Fortran? A Debate Rekindled; Apr-91; UCRL-JC-107018;

[14] Cann, David; Vectorization of an Applicative Language: Current Results 
and Future Directions; Nov-90; UCRL-JC-105654;

[15] Cann, David; Compilation Techniques for High Performance Applicative 
Computation; May-89; CS-89-108;

[16] Cann, David; Sisal Multiprocessing Support (July 1987); Jul-87; UCID-
21115;

[17] Cann, David; Feo, John; Sisal versus FORTRAN: A Comparison Using the 
Livermore Loops; Nov-89; UCRL-102263 (1);
[18] Denton, Scott; Optimizing Parallel Reduction Operations;

[19] Feo, J.; Cann, D.; Oldehoeft, R.; A Report on the Sisal Language 
Project; UCRL-102440 (1);

[20] Feo, J.; DeBoni, T.; Caffey, H.; Hausheer, F.; Developing Molecular 
Dynamics Simulation Codes Using Mixed Language Programming; May-94; UCRL-JC-
116641;

[21] Feo, J.; DeBoni, T.; Rodrigue, G.; Muller, J.; Implementation and 
Performance of Domain Decomposition Algorithm in Sisal; Sep-93; UCRL-JC-
115029;

[22] Feo, John; Sisal; The Salishan Problems in Sisal; Jul-92; UCRL-JC-
110915;

[23] Feo, John; Arrays in Sisal; Sep-90 UCRL-JC-106081;

[24] Feo, John; An Analysis of the Computational and Parallel Complexity of 
the Livermore Loops; Parallel Computing; Jan-88;

[25] Feo, John; The Livermore Loops in Sisal; Aug-87; UCID-21159;

[26] Feo, John; DeBoni, Thomas; A Sisal Workshop (Ohio Supercomputer Center);

[27] Grit, D; McGraw, J.; Programming Divide and Conquer on a Multiprocessor; 
May-83; UCRL-88710;

[28] Hendrickson, Chris; Programming A Real Code In A Functional Language 
(Part 1); Sep-91; UCRL-JC-108326;

[29] Lee, C.; Skedzielewski, Steve; On the Implementation of Applicative 
Languages on Shared-Memory Mimd Multiprocessors; Apr-88; UCRL-97980;

[30] Lee, Ching-Cheng; Experience of Implementing Applicative Parallelism on 
Cray X-MP; May-88; UCRL-98303;

[31] (1); McGraw, James; The Val Language - Description and Analysis; Dec-80; 
UCRL-83251;

[32] McGraw, Jim; Parallel Functional Programming in Sisal: Fictions, Facts 
and Future; Jul-93; UCRL-JC-114360;

[33] McGraw, Jim; Skedzielewski, Steve; Sisal Streams and Iteration in A 
single-Assignment Language Version 1.2; Mar-85; M-146 (1);

[34] McGraw, Jim; Skedzielewski, Steve; Sisal Streams and Iteration in A 
single-Assignment Language Version 1.1; Jul-83; M-146;

[35] McGraw, Jim; Skedzielewski, Steve; Streams and Iteration in Val; 
Additions to a Data Flow Language; Mar-82; UCRL-87414;

[36] Miller, Patrick; Cedeno, W.; A User's Guide to Twine; Jul-93;

[37] Miller, Patrick; The Efficient Implementation of Error Values in 
Functional Languages; Nov-91; UCRL-JC-105794;

[38] Miller, Patrick; Implementing Error Values in Applicative Languages; 
Aug-91; UCRL-JC 106539;

[39] Miller, Patrick; Twineman - Man Pages for the Twine Debugging System;

[40] Miller, Patrick; Implementing Error-valued Semantics for Dataflow 
Languages on Imperative Machines; UCRL-JC-111550;

[41] Oldehoeft, Rod; Mixed Applicative and Imperative Programs; Feb-87; UCRL-
96244;

[42] Ranelletti, John; Graph Transformation Algorithms for Array Memory 
Optimization in Applicative Languages (Ph.D. Thesis); Nov-87; UCRL-53832;

[43] Sarkar, Vivek; Partitioning and Scheduling Parallel Programs for 
execution on Multiprocessors; Apr-87; CSL-TR-87-328;

[44] Sarkar, Vivek; Compile-time Partitioning and Scheduling of Parallel 
Programs;

[45] Sarkar, Vivek; Cann, David; POSC - a Partitioning and Optimizing Sisal 
Compiler; Apr-90; UCRL-102737 (1);

[46] Sarkar, Vivek; Skedzielewski, Steve; An Automatically Partitioning 
Compiler for Sisal; Dec-88; UCRL-98289;

[47] Skedzielewski, Steve; Yates, Robert; Fibre: An External Format for SISAL 
and IF1 Data Objects, Version 1.1; Apr-88; M-154, Rev. 1;

[48] Skedzielewski, Steve; Data Flow Graph Optimization in IF1 (Dec. 2, 
1987); Dec-87; UCRL-92122 (1);

[49] Skedzielewski, Steve; DI: An Interactive Debugging Interpreter for 
Applicative Languages; Mar-87; UCRL-95709;

[50] Skedzielewski, Steve; IF1 An Intermediate Form for Applicative 
Languages, Version 1.0; Jul-85; M-170;

[51] Skedzielewski, Steve; A Simple Method to Remove Reference Counting in 
Applicative Programs; UCRL-100156;

[52] Welcome, Mike; IF2 An Applicative Language Intermediate Form with 
Explicit Memory Management; Dec-86; M-195;

[53] Wolski, R.; Feo, J.; Cann, D.; A Prototype Functional Langauge 
Implementation for Hierarchical-Memory Architectures; Jun-91; UCRL-JC-107437 
(1);

[54] Wolski, R.; Feo, J.; Cann, D.; Implementing Functional Languages to 
Exploit Locality; Jun-91; UCRL-JC-107491;

[55] Wolski, Richard; Feo, John; Program Partitioning for NUMA Multiprocessor 
Computer Systems; Oct-92; UCRL-JC-112183;

[56] Yates, Robert; Di Tutorial; Apr-88; UCID-21393;

[57] Proceedings of the Third Sisal User's Conference; Oct-93; CONF-9310206;

[58] Proceedings of the Second Sisal User's Conference; Oct-92; CONF-9210270;

[59] LLNL SISAL Compiler User's Manual, Version 1.1; Aug-86; M-191;