Sisal for Science
David J. Raymond
Physics Department and Geophysical Research Center
New Mexico Tech
Chapter 1
Introduction
Sisal is a general purpose functional computer language. It is particularly useful for scientific applications for the following reasons:
- It uses a syntax that looks at least somewhat familiar to users of procedural languages such as C and Python. The syntax is particularly close to that of Pascal, reflecting the era in which it was developed.
- Properly written code is fast, with execution speeds comparable to or faster than the best hand-written C code in many situations, especially those involving looping.
- The language is very expressive in the sense that “what you see is what you mean”. The functional nature of Sisal forces one to clarify one's intent in writing a program.
- The language is “safe” in that many of the potential pitfalls of procedural languages such as dangling “if” clauses, uninitialized variables, pointers to nowhere, and exceeded array bounds are impossible. (Bounds checking may be turned off for speed after a program is debugged.)
- Sisal can call functions from the huge selection of C and Fortran scientific libraries.
- The functional nature of the language means that automatic parallelization is possible, though the full extent of this potential has yet to be exploited on linux.
In spite of its familiar look, much like that of Pascal in many ways, Sisal requires a radical rethink of how to write a computer program. However, scientific programmers generally have a much easier time becoming fluent in Sisal than in many other functional languages. The purpose of this guide is make the transition to Sisal as easy as possible.
1.1 A quick look at Sisal
Let's take a look at a personalized version of the traditional “Hello world!” program in Sisal:
% hello.sis -- first Sisal program
define main
function main(name: array[character] returns array[character])
"Hello " || name || "!"
end function
Compiling and running this program proceeds as follows:
loro$ sisalc -o hello hello.sis
LL Parse, using binary files
* Reading file: hello.sis...
version 1.8 (Mar 28, 1989)
accepted
5 lines in program
0 errors ( calls to corrector)
0 tokens inserted; 0 tokens deleted.
0 semantic errors
if2mem: W - || ON LINE 4 OF main IN hello.sis INTRODUCES COPYING
loro$ hello
x86_64-unknown-linux-gnu SISAL 1.2 (PThreads) ?.?
"Dave"
"Hello Dave!"
loro$
The “loro$” indicates the shell prompt on my laptop, which is named “loro”. The first line invokes the Sisal compiler on the above program source file, named “hello.sis”. The “-o” flag tells Sisal to name the executable file “hello”. If this option is omitted, the default name is “s.out”. The compiler is a little verbose, but we can ignore most of this for now, noting only that zero errors were found.
In the source code, the line “function main(name: array[character] returns array[character])” defines the input, an array of characters (or string) named “name”, and the output, another array of characters. Input comes from the standard input and output is written to the standard output. Comments can start anywhere on a line and terminate at the end of the line. They are prefaced by “%”.
Running “hello” results in a line of throat-clearing “x86...”, at which point the program waits for input. Typing your double-quoted name and hitting “enter” causes the program to spring into action. The line '"Hello " || name || "!"' concatenates the string “Hello “ with the value of “name” followed by “!”. (“||” is called the concatenation operator and it glues together two or more arrays of the same type.) This result is then sent to the standard output. Standard input and output can be redirected as with any Unix/Linux program.
Functions in Sisal are unlike functions in procedural languages in that they are much more like purely mathematical functions; output is determined solely by the input plus the code within the function. Functions don't remember anything from one invocation to the next and they don't receive information from “back doors” such as global variables, only from the input parameters. Thus, the output of “hello” comes only from the typed input and constant strings defined inside the function.
Function bodies consist of one top-level expression, possibly with nested lower level definitions. The result of the top-level expression is what the function returns. The input parameters can be used anywhere in the top-level expression or any of its subsidiaries. In the above program '"Hello " || name || "!"' is the top-level expression and there are no lower level definitions.
Consider the following program which introduces looping and arrays in Sisal:
% makearray.sis -- array and parallel loop demo
define main
function main(num: integer returns integer, array[integer], array[integer])
for j in 0, num - 1
k := j*j
returns
value of sum k
array of j
array of k
end for
end function
This program makes two arrays, “j”, which consists of a list of integers from “0” to “num - 1”, and “k”, which is an integer array of the same size containing the squares of the elements of “j”. The top-level expression is the “for” loop, containing a “returns” clause that organizes the successive values of “j” and “k” produced by the loop into arrays which are passed to the main function, and one definition, “k := j*j”. Think of this line of code as “k is defined as j*j”, not as “the value of j*j is assigned to k”. The latter interpretation is characteristic of procedural languages and the principal obstacle that programmers in these languages must overcome to learn functional languages is to reorient their thinking in this regard. The difference between these two interpretations is that a definition of a particular symbol (“k”) can occur only once, whereas the assignment of a value to a symbol can occur many times. This is why Sisal is called a “single assignment” language.
But wait, one might say, “k” is redefined many times as the loop progresses! This is not true; definitions inside a “for” loop are local, so even though the name is the same, “k” is actually a different symbol for each iteration. What is not allowed within a single loop iteration is an entire class of common procedural language statements like “k := k + 1”. Note that we are avoiding the term “variable”. “k” is not a variable because once defined in a particular scope, it cannot be redefined. It is simply a symbol representing the expression “j*j”.
Let's now look at the execution of this program, which we call “makearray.sis”:
loro$ makearray
x86_64-unknown-linux-gnu SISAL 1.2 (PThreads) ?.?
5
30 [ 0,4: 0 1 2 3 4 ]
[ 0,4: 0 1 4 9 16 ]
loro$
Entering “5” results in the return of the sum of the “k” values and the two arrays, written in a particular format called “Fibre”. (More on this later.) Each array is enclosed in square brackets with the first two numbers being the lower and upper indices of the array. The array values are then listed in sequence. Unlike some languages in which array indices start at particular values such as “0” or “1”, Sisal array indices can start with any value. In this case the arrays start with “0” because the “for” loop that generated them started with this value. Later we will discuss the interface between Sisal and the C language, which starts arrays at “0”. When interfacing to C, it is convenient to follow this convention. On the other hand, Fortran starts arrays at “1”, so it is easiest to follow that convention when linking to Fortran.
The line “value of sum k” is called a “reduction” since it collapses the values of “k” into a single value, in this case by summing them. Other reduction operators that work on scalar values are “product”, “least”, and “greatest”. “Catenate” glues arrays together.
Loops of the type presented in “makearray.sis” are easy to parallelize, as loop iterations don't depend on previous iterations – this is designed in. Such loops can be divided between processors with no interaction between them, eliminating interprocess communication. A different form of loop in which each iteration can depend on results from the previous iteration is now considered. As an example, we find the roots of a function using Newton's method. For the function
, an iteration
is made where
is a first guess for
. The program proceeds until either the iteration converges or it exceeds a specified number of loops without converging. Here is the program:
% newton.sis -- newton's method iteration
define main
% this is a cubic polynomial with roots 0, +1, and -1
function f(x: real returns real)
x*x*x - x
end function
% Input --
% x0: initial guess
% eps: convergence tolerance
% dx: step in the differentiation of the function
% maxloops: maximum loops before giving up
% Output --
% #1: number of loops actually run
% #2: computed root
% #3: T ==> converged, F ==> not converged
function main(x0, eps, dx: real; maxloops: integer returns integer, real, boolean)
for initial
x := x0;
loops := 0;
converged := false
while ~converged & loops < maxloops repeat
loops := old loops + 1;
oldx := old x;
f1 := f(oldx);
f2 := f(oldx + dx);
x := oldx - f1*dx/(f2 - f1);
converged := abs(x - oldx) < eps
returns
value of loops
value of x
value of converged
end for
end function
Here is the output for several sets of input parameters:
loro$ newton
x86_64-unknown-linux-gnu SISAL 1.2 (PThreads) ?.?
3.0 1.e-10 0.1 30
13 1.000000e+00 T
loro$ newton
x86_64-unknown-linux-gnu SISAL 1.2 (PThreads) ?.?
0.2 1.e-10 0.1 30
6 6.324247e-13 T
loro$ newton
x86_64-unknown-linux-gnu SISAL 1.2 (PThreads) ?.?
-20. 1.e-10 0.1 30
17 -1.000000e+00 T
loro$ newton
x86_64-unknown-linux-gnu SISAL 1.2 (PThreads) ?.?
3.0 1.e-10 0.1 5
5 1.013046e+00 F
loro$
The first three runs found each of the three roots. The fourth run failed due to the fact that the maximum number of allowed iterations was too small.
The “newton” program introduces several new concepts.
- First of all, the work is split between two functions, the “main” function that carries out the Newton's method calculation and the “f” function that defines the polynomial for which roots are to be found. The latter precedes the former so that the compiler knows about all inputs into “main” before it reaches it. This ordering can be changed at the price of some minor additional boiler plate.
- Semicolons separate input parameters of different types in the “function” line.
- Two new primitive data types are introduced, “real” and “boolean”, in addition to “integer” and “character”, which we have already seen. The final primitive data type is “double_real”. (Sisal doesn't do Unicode.)
- The “for initial” loop allows each iteration to depend on variables defined in the previous iteration, e.g., “old x” gives the value of “x” in the previous iteration.
- The first section after the “for initial” specifies what happens in the first iteration of the loop.
- Note that definitions in the “for initial” and “repeat” blocks, definitions are terminated by semicolons, though the semicolon is optional for the last definition, and by extension, when there is only one definition. No semicolons are needed (or tolerated) for the statements following “returns”.
Our final example introduces a slightly more complex non-looping application:
% divisor.sis -- checks to see if one number divides equally into another
define main
function main(number, divisor: integer returns boolean, array[character])
let
ratio := number/divisor;
isdivisor := ratio*divisor = number;
message := if isdivisor then
"Hooray!!"
else
"Sorry!!"
end if
in
isdivisor, message
end let
end function
There are two new features here:
- The “let/in/end let” construct allows a series of definitions to be strung to gether in the “let” block to come up with the results of a calculation, in this case a boolean symbol indicating whether one integer divides equally into another integer. Along with this boolean is a congratulatory or consolatory text message. The results of the calculation are reported in the “in” block.
- The message is selected by an “if” statement that tests the boolean result to decide which message to deliver. Note that an “if” statement is a definition in Sisal, as are the “for” and “let” statements. Also, there is no possibility of a “dangling if”; the “if” statement always returns a result in Sisal.
Here is some output from this program:
loro$ divisor
x86_64-unknown-linux-gnu SISAL 1.2 (PThreads) ?.?
4 2
T "Hooray!!"
loro$ divisor
x86_64-unknown-linux-gnu SISAL 1.2 (PThreads) ?.?
5 2
F "Sorry!!"
loro$
1.2 Installing Sisal on Linux
1.3 Using the Sisal compiler
The current (version 14.1) of the Sisal compiler is named “sisalc”. See the “sisalc” and “s.out” manual pages for detailed information. Documentation of an older version of the Sisal compiler is available
here. Some but not all of this manual is applicable to the current version of Sisal. The information on the various optimization stages of “sisalc” is still valid and is informative.
We now describe basic usage.
- To compile a Sisal program contained in a single file, run
$ sisalc -o example example.sis
The entry function should be named “main” and the first statement in “example.sis” should be “define main”; the “define” statement makes the named function visible globally. The “-o” option defines the name of the executable file. Without “-o”, the default name is “s.out”.
- For a program contained in multiple files, run
$ sisalc -o example example.sis example2.sis example3.sis ...
where “example.sis” contains the “main” function.
- More commonly, individual files are compiled to an intermediate form before combining them. This is done, for instance, by the command
$ sisalc -IF1 example.sis
which produces a file named “example.if1”. The compilation process is then finished by running
$ sisalc -o example example.if1 example2.if1 example3.if1 ...
This two-step process is important if one or more of the “.if1” files reside in a system directory that is not writable by the user. In this case the Sisal compiler may try to build the executable in this system directory, resulting in failure. This is a bug.
- If “.sis” or “.if1” files are stashed someplace to make them generally accessible, their paths must be included explicitly during compilation.
- External C and Fortran libraries of functions called by Sisal programs must be indicated in the compilation by the “-l” flag. For instance, for a C library named “libmatrix.a”, one might have
$ sisalc -o example example.sis -lmatrix
Use the “-L” flag to indicate any non-standard library locations, e.g., “-L/usr/local/math/lib”. The C math library is assumed by default and the “-lm” flag is not needed to access it. More on using foreign language libraries is given in section 4.
- The Sisal compiler translates Sisal into C and then compiles and links the C code. To see the C code generated by sisalc, run
$ sisalc -C example1.if1 ...
To compile this C program by hand (assumed to be named “example.c”), run
$ gcc -o example /usr/local/lib/sisal/srt0.o example.c -L/usr/local/etc/sisal -lsisal -lpthread
- In the current version of the compiler, array bounds checking is turned on by default. Adding the flag “-no-bounds” in the compilation phase turns off bounds checking. This, of course, should only be done after the program has been checked for array bounds violations.
The Sisal executable has a number of options that control the execution. Here are two possibly useful ones.
Recall that default input and output in Sisal occur via the standard input and the standard output using the Fibre format. Other I/O is possible via the foreign language interface.
Chapter 2
Language definition
Sisal programs contain of three types of statements, declarations, expressions, and definitions. Definitions use expressions to define values. For instance
x := expression(y, z)
computes the result “x” based on the input values “y” and “z”. This is an example of an expression that returns only a single value. A multi-valued expression might look like
a, b, c := expression(x, y)
Note that the construct “:=” means “is defined as”.
2.1 Building blocks
Sisal uses a subset of the the ASCII (not UTF) character set including upper and lower case letters, the underscore, numbers, space, tab, and newline characters, single and double quotes, parantheses and square brackets, the percent, pound, and dollar sign characters, and various mathematics operators. These are used to construct constants, symbols representing values, keywords, and math operations. There are no “variables” in Sisal, as symbols cannot be redefined once defined. Symbols and keywords are made up of letter and number characters and the underscore and are not case-sensitive, so, for instance, “if” is the same as “IF”.
The primitive values are of type “boolean”, “character”, “integer”, “real”, “double_real”, and “null”. They are represented as follows:
|
|
|
|
|
|
'a', '2', '\t', '\n', '\\'
|
'\' indicates “escape”, \́t' tab, '\n' newline
|
|
|
|
|
decimal point not needed in exponential form
|
|
|
the 'd' distinguishes from type “real”
|
|
|
|
Operators and punctuation symbols are
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
logical less than or equal
|
|
logical greater than or equal
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
array & stream indices, record syntax
|
|
|
|
|
|
|
|
|
Here are the Sisal keywords:
In addtion to keywords, there are pre-defined functions:
The meanings of keywords and pre-defined functions will become clear below.
Sisal treats run time errors such as divide by zero or array bounds exceeded different from most languages. Instead of terminating execution, the value “error” is assigned to the result and execution continues from there. Things get complicated when arrays are involved, but generally the occurrence of an error value is a strong message to fix the problem that caused it.
2.2 Compound data types
Compound data types allow one to combine primitive types into organized structures. There are several different compound types, arrays, streams, records, and unions. These are now discussed.
2.2.1 “Array” type
Arrays are indexed lists of arbitrary data types. However, all elements of a list must have the same data type. Thus, one can have arrays of primitive data types (“integer”, “boolean”, “real”, etc.) or of compound data types. For instance, arrays of arrays produce two-dimensional arrays, arrays of arrays of arrays produce three-dimensional arrays, etc. In this respect Sisal is similar to C and related languages.
Array elements may be accessed with the syntax “a[i]”, where “a” is the array name and “i” is the index value. In Sisal, multi-dimensional arrays are represented as arrays of arrays. For an array of arrays of (say) reals “a”, element 3 of “a[2]” can be written “a[2][3]. However, Sisal (as with C) has a convenient shortcut: “a[2,3]”.
If array elements that themselves are arrays are of different lengths, then we have what is called a “ragged array”. These are legal. One can have also arrays of records (see below), etc. The index of an array is an integer and array indices can can start and end at arbitrary values. However, if the ending value is less than the starting value, an empty array is indicated.
An array element may be replaced, yielding a new array with the syntax “a[i:V]”, where “V” is the new element value at the index value “i”. It may seem inefficient to create a new array when replacing a single element, but Sisal's optimization routines remove this inefficiency.
Arrays may be created in a number of ways, most of which are discussed later. Here is an example of creation of an array by specifying a list of constant element values:
define main
type oner = array[real]
function main(returns oner)
array oner [2: 4.5, 5.3, 8.2*3.14]
end function
The first component inside the square brackets is the starting index value. Running this program returns
[ 2,4: 4.500000e+00 5.300000e+00 2.574800e+01 ]
The first two integers in the output give the inclusive lower and upper bounds of the array, followed by the element values. If the element values in the generating program are of different types, compilation fails. At least one value must be specified. Notice the use of a type declaration “oner” as a one-dimensional real array. The use of the abbreviated type declaration “oner” in the array definition inside the main function is optional, since the compiler can infer the type from the elements of the array.
A new empty array may be created as follows:
type onei = array[integer]
...
array onei []
...
The type definition is not optional in the array statement in this case, as there are no elements from which the type may be inferred.
Strings in Sisal are really arrays of characters. Constant values of these arrays have a convenient syntax as well: array[1: 'H', 'i', '!'] can also be written “Hi!”. Strings written this way have an implicit initial index of 1.
The only operation on arrays is to glue together two arrays with the same element type with the “catenate” operator “||”:
define main
function main(returns array[integer])
array [0: 1, 2, 3] || array [0: 4, 5]
end function
This returns
[ 0,4: 1 2 3 4 5 ]
Notice that the starting index of the combined array is that of the first array.
There are a number of built-in functions that act on arrays:
|
|
|
New array over index range[lo,hi] with element values V
|
|
|
|
|
|
Number of elements of array A
|
|
Array A with element V added to high end
|
|
Array A with element V added to low end
|
|
Array A with one element removed from high end
|
|
Array A with one element removed from low end
|
|
Same array A with low index set to lo
|
A few other less useful array functions are not mentioned.
2.2.2 “Stream” type
Streams are ordered but unindexed sequences of values of the same type. They are useful for receiving input or marshalling output, which are basically sequential serial operations. (Since Sisal has no intrinsic input/output operations except for the standard input and output, I/O operations must actually be performed by calls to C or Fortran routines.)
To create an empty stream, first define the type of the stream and then create it:
define main
type istream = stream[integer]
function main(returns istream)
stream istream []
end function
Streams can be concatenated using the “||” operator, as with arrays. There are also a number of functions that can be applied to streams:
|
|
|
Stream G with element V appended
|
|
The first element of stream G
|
|
Stream G with first element removed
|
|
True if stream G has zero elements
|
|
Integer equal to number of elements
|
Think of a stream as a pipe with elements added to the inlet (tail of the stream) and removed from the outlet (head of the stream).
2.2.3 “Record” type
Records in Sisal are like C structs. Here is a program that defines a record type “client” and uses it to create an array of “client” records:
define main
type onec = array[character];
type client = record[fname: onec; lname: onec; age: integer];
type client_array = array[client]
function main(returns array[client])
array client_array [0: record[fname: "George"; lname: "Strait"; age: 47],
record[fname: "Willie"; lname: "Nelson"; age: 63]]
end function
The “client” record contains three elements, a client's first name, last name, and age. The elements may be of arbitrary type and different elements can be of different types including arrays, streams, or other records. The array contains as its elements two “client” records.
An element of a record can be recovered with the “.” notation familiar to structs in C:
x := record[fname: "Willie"; lname: "Nelson"; age: 63]
...
x.fname % yields "Willie"
However, unlike C, there are no pointers to records.
Record elements can also be replaced as in the following example:
y := x replace [fname: "William"] % yields [fname: "William"; lname: "Nelson"; age: 63]
Multiple replacements may be accomplished in one step:
y := x replace [fname: "William"; age ̈75]
2.2.4 “Union” type
2.3 Declarations
Declarations provide the program with needed auxiliary information ranging from the definition of new types to information required for the program to connect with the outside world.
2.3.1 “Define” declaration
The “define” statement indicates functions defined in a file that are accessible from outside that file:
define a, b, c
...
function a(...)
...
function b(...)
...
function c(...)
...
The main entry point function of a program is generally named “main” and it must be defined as such: “define main”.
2.3.2 “Type” declarations
Sisal allows new types or abbreviations for existing types to be created and used. For instance, we may define
type c = character;
type r = real;
type d = double_real;
type x = array[real];
type z = record[name:array[c]; age:integer]
Subsequently, “c” may be used where the type specification “character” is needed. Such type declarations come into their own when using compound data types. Type definitions are generally made near the beginning of a Sisal program before any functions are defined and are separated by colons. (A semicolon on the last declaration is optional.)
2.3.3 “Global” declaration
Global declarations let functions defined within a particular file know about functions defined in other files. They have the same format as function declarations. For instance:
global a(i, j: integer; returns real)
2.3.4 “Forward” declaration
If a Sisal function is used before it is defined within a file, then a “forward” declaration is needed before first use. For example
forward function a(b: real returns real)
...
code that calls a
...
function a(b: real returns real)
body of function
end function
Generally it is best to order things with functions defined after the functions they call, so that the “forward” declaration doesn't need to be used. This can't be done in the case of co-routines (A calls B which then calls A), in which case the “forward” declaration is needed.
2.3.5 “Value type” declaration
Most of the time the types of simple values are implicit in their definitions. Occasionally that is not true, or it may be desirable to declare the type of a value explicitly for clarity. Some examples:”Value type” declarations should occur in the same scope in which a value is defined, but before the definition, as shown.
a : real; % declare a to be of type real
i : integer; % declare i to be of type integer
i := 2;
a := real(i)*3.1;
2.4 Expressions
Expressions are either simple or compound. A number of examples of simple expressions are given below.
Compound expressions contain lower level definitions involving simple or compound expressions that simplify the construction of the final expression. The compound expressions available in Sisal are the “function”, which organizes the task of a program into pieces, “let”, in which multiple definitions leading to a final result are allowed, “if” which institutes branching, two kinds of “for” expressions, which implement loops, and the “tagcase” expression, which is used (in a rather obscure way) with unions.
2.4.1 Simple expressions
Simple expressions involving the primitive types are similar to those in other languages, with some caveats. One important caveat is that primitive types cannot be mixed; explicit conversion between integers, reals, and double_reals must be made:
a := real(4.2);
i := 5;
b := double_real(12.4);
c := real(b) + a + real(i);
Conversion from real to integer can be done in three ways:
i := floor(a); % largest integer <= a
i := trunc(a); % a positive: largest integer <= a; a negative: smallest integer >= a
i := integer(a); % rounds toward nearest integer
One oddity of Sisal is that exponentiation is performed using the function “exp”, which resembles the C language “pow” function:
a := exp(3e0,3);
a := exp(3d0,3);
a := exp(3e0,3.5e0);
a := exp(3d0,3.5d0);
The first argument needs to be “real” or “double_real”. The second argument (the power) can be either integer or “real/double_real”. In the second case, the type must match the type of the first argument. Unlike the “pow” function, “exp” treats small integer powers intelligently, i.e, by direct multiplication.
The pre-defined functions “abs”, “mod”, “min”, and “max” functions do what you would expect; the first takes one integer/real/double_real argument and returns the same; the second takes two integer arguments and returns integer; the third and fourth take two integer/real/double_real arguments and return the same.
Sisal is like Pascal in that the semicolon is used to separate expressions rather then terminate them. However, unlike Pascal, the language is forgiving in that a semicolon may be used to terminate the last in a series of expressions without complaint.
2.4.2 “Function” expression
The function is the basic unit that organizes code in Sisal. A function is an expression because it returns one or more values.
In most cases a function has a very simple structure:
function example(input_list returns output_list)
expression
end function
The input list is of the form of elements separated by semicolons, each containing one or more input variables of the same type separated by commas and terminated by a colon. Following the colon is the type declaration. The output list is a comma-separated list of types. For instance,
function example(a, b: real; i integer returns real, integer)
The return values of the expression must match the output list. Also, any input variable must be used or the Sisal compiler will complain.
Functions may also have sub-functions defined within the function body. For instance:
define main
function main(real returns real)
function square(a: real; returns real)
a*a
end function
5.0 + square(3.0)
end function
The expression “5.0 + square(3.0)” uses the “square” function defined within the main function in this thoroughly artificial example. The “square” function is not accessible outside of the enveloping function in which it is defined and is defined before it is used. This avoids the need for a “forward” declaration.
Functions can be called recursively in Sisal.
2.4.3 “Let” expression
The purpose of the “let” expression is to support the creation of a final expression with one or more definitions. By this means a complex expression may be built up from simple components. The syntax of the “let” expression is
let
definition1
definition2
...
in
expression
end let
The expression may have multiple elements separated by commas.
Here is an example:
define main
function main(returns real, real)
let
b := 5.0;
a := 7.2
in
a + b, a*b
end let
end function
Any definition not used in the final expression will be optimized out of the program so that it is not executed needlessly.
2.4.4 “If” expression
The format of an “if” expression is
if test1 then
expression1
elseif test2 then
expression2
...
else
default_expression
end if
where “test1” and “test2” are boolean expressions and the expressions in all branches return the same type and number of elements. There can be any number of “elseif” clauses, including zero. However, the “else” clause is required – as with other expressions, the “if” expression must always return a value.
Here is an example:
define main
function main(a: real returns array[character])
if a > 10.0 then
"big"
elseif a >= 0.0 then
"small"
else
"negative"
end if
end function
2.4.5 Parallel “for” expression
The parallel or product “for” expression generates loops, each iteration of which is independent from other iterations. For this reason, the loop is easily parallelizable, which explains the name. As with all Sisal expressions, the parallel “for” expression returns a value which may consist of one or more elements. Rather than present the general syntax of a parallel “for” loop, we instead show a series of examples.
2.4.5.1 “Index” form
Here is a simple parallel “for” loop that creates an array of squared integers as well as the sum of all of the elements of the array:
define main
function main(n: integer returns array[integer], integer)
for i in 0,n
j := i*i;
returns
array of j
value of sum j
end for
end function
In the “for” statement, the index “i” is given successive integer values starting at the first element after “in” and ending at the second element. The indexing of the array follows the values of “i”. Note that each iteration of the loop is independent of other iterations, meaning that this loop is easily parallelized. All “for” loops of this type have this characteristic.
The section between “for” and “returns” contains one or more definitions that are used after the “returns” clause. The definitions are separated or terminated by a semicolon. Definitions that are not used disappear during compilation.
The “returns” clause contains one or more statements defining values returned by the “for” loop. It is convenient to put these on separate lines, but this isn't necessary, given the “array” and “value” keywords at the beginning of each statement. Semicolons do not separate these statements.
The first line of the “returns” clause in this example returns an array of the values of “j”. If “stream” appears in place of “array”, a stream is returned.
The second line of the “returns” clause is called a reduction operation, because it returns a single value that is a function of all loop iterations, in this case, the sum of all of the “j” values. Possible reduction operators are “sum”, “product”, “least”, “greatest”, and “catenate”. The first four apply to numbers, the last applies to arrays and streams, and the first two also apply to booleans (sum –> or; product –> and).
Optional terms “left”, “right”, and “tree” can precede the reduction operator. These affect the order in which elements are combined. “Tree” successively combines neighboring elements together until the final result is reached, whereas “left” and “right” combine elements sequentially from the left and the right. These options only differ significantly when large numbers of elements are combined. Generally speaking, for large numbers of elements of randomly distributed size, “tree” gives the best result, though it takes the longest. If elements start small and increase in size more or less sequentially, start at the small end.
“Returns” in “for” loops can have the results of selected loop iterations omitted based on a logical test:
define main
function main(returns array[integer], array[integer], array[integer], integer)
for i in 0,5
a := i*i - 2*i;
returns
array of i
array of a
array of a unless a < 0
value of sum a when i > 0
end for
end function
When the array and the sum are being constructed, “a” is omitted if the condition following “unless” is true or the condition following “when” is false. If no element satisfies the specified condition, the resulting array is empty, the result of the sum reduction is 0, and the result of a product reduction is 1.
2.4.5.2 “Array” form
The “array” form of a “for” loop allows a simple way of dealing with pre-existing arrays:
define main
type iarray = array[integer];
function main(returns iarray, iarray, iarray)
let
a := for i in 0,4
returns
array of i*i
end for;
k, b := for x in a at k
returns
array of k
array of k*x
end for;
in
a, k, b
end let
end function
The second “for” loop doesn't need to specify the index range, as it picks it up from the previously defined array “a”. The “at k” part of the “for” statement may be omitted if access to the array index values is not required.
2.4.5.3 “Cross” form
Loops over multi-dimensional arrays are handled with the “cross” construct in Sisal. For instance,
define main
function main(returns array[array[real]])
for i in 0,3 cross j in 0,4
a := real(i - j);
returns
array of a
end for
end function
creates a two-dimensional
array “a”. For more than two dimensions, one would write something like “i in 0,ni cross j in 0,nj cross k in 0,nk”.
2.4.5.4 “Array” form with multi-dimensional arrays
The “array” form of the “for” loop works with multi-dimensional arrays as well:
define main
function main(returns array[array[real]], array[array[real]], array[real])
let
a := for i in 0,3 cross j in 0,4
a := real(i - j);
returns
array of a
end for;
b := for x in a at i, j
returns
array of x*real(i + j)
end for;
c := for x in a
returns
value of catenate x
end for;
in
a, b, c
end let
end function
The second loop provides scalar elements “x” of the two-dimensional array “a”. If the “at i, j” is omitted, as in the third loop, “x” takes on the values of the one-dimensional arrays making up “a”. The “catenate” operation glues these together in the form of a long one-dimensional array, thus flattening the original two-dimensional array. An alternate form for the third loop would have “at i” appended to the “for” statement.
The general rule is that “at” can be followed by from 1 to “n” indices where “n” is the dimensionality of the array in the “for” statement. If “m” indices are specified, then the returned element is a sub-array of the original array with dimensionality “n - m” with the first “m” dimensions stripped off. (A zero-dimensional array is just a scalar.) If the “at” part is omitted, this corresponds to “m = 1”.
2.4.5.5 “Dot” form
The “dot” form of a “for” loop provides two or more indices that march in lockstep:
define main
function main(returns array[real], array[real])
let
a := for i in 0,4
x := real(i*i);
returns
array of x
end for;
b := for i in 0,3 dot j in 1,4
y := a[j] - a[i];
returns
array of y
end for;
in
a, b
end let
end function
If the number of indices doesn't match, an error is generated. It is not clear why this form can't be replaced by a single loop with a bit of index arithmetic. At any rate, “dot” and “cross” cannot be mixed in a “for” statement.
2.4.6 Serial “for” expression
The serial “for” loop can use instances of loop variables from the previous iteration of the loop. It is therefore not parallelizable. Here is an example that computes the factorial of an integer:
define main
function main(n: integer; returns integer, integer)
for initial
i := n;
factorial := n;
repeat
i := old i - 1;
factorial := (old factorial)*i;
until i <= 1
returns
value of n
value of factorial
end for
end function
The section of the loop following “for initial” is an initialization section. Think of it as the first iteration of the loop. The “repeat” section is repeated until the boolean expression following “until” evaluates to “true”. The “returns” section works as for the “parallel” for loop except that any values appearing there must also appear in the initialization section.
An alternative form of the termination condition appears before the “repeat” statement:
define main
function main(n: integer; returns integer, integer)
for initial
i := n;
factorial := n;
while i > 1
repeat
i := old i - 1;
factorial := (old factorial)*i;
returns
value of n
value of factorial
end for
end function
The value of “i” in this case is actually that of “old i” in the repeat section. Note that “while i > 1” is equivalent to “until i <= 1” and either form can be used either before or after the “repeat” section.
The difference between having the termination test before or after the “repeat” section is that at least one iteration of this section always occurs. Because of this, the first form of the factorial calculation shown above actually fails by returning 0 if “n” equals 1; the second form correctly returns 1. (Both fail if “n” is zero, since 0! = 1 by definition!)
2.4.7 “Tagcase” expression
2.5 Putting it all together
Code for a Sisal program is organized into one or more files with the extension “.sis”. In each file, declarations come first, in the following order,
- Define statement
- Type declarations
- Global declarations
followed by function expressions. The entry point into a Sisal program is a file with a function defined as “main”. This function is responsible for calling other functions in the main file or in other files. In the latter case, the called function must be advertised by the define statement in the other file and a global declaration for it must exist in the main file. Functions in any file can call functions in other files via this mechanism.
“Forward” declarations are placed appropriately in the function section after global declarations and “value type” declarations are placed just before a value is defined.
Sisal reads information from the standard input and writes it to the standard output in a format called “Fibre”, which is described later. Data read on the standard input supplies the parameter values for “main” and the information returned by “main” is sent to the standard output. These are the only input/output functions of native Sisal code, though input and output may also be performed via calls to C or Fortran code.
Chapter 3
The Fibre input/out
Fibre format is a structured ASCII representation of Sisal data. The following program shows how scalars, arrays, and records are represented in Fibre:
define main
function main(returns real, integer, array[integer], record[x:integer; y:array[character]])
6.0, 4, array[0: 1,2,3], record[x:2; y:"my string"]
end function
Produces the following Fibre output:
6.000000e+00 4 [ 0,2: 1 2 3 ]
< # PRC=12
2 "my string"
>
Thus, successive elements are separated by white space, reals (and double_reals) are represented by exponential notation, arrays are enclosed in square brackets, and records are enclosed in angle brackets. Array elements are separated by white space with the index range preceeding the data followed by a colon.
Arrays and records can nest other arrays and records arbitrarily. Recall that a two-dimensional array is an array of arrays, etc. The “# PRC=12” is a comment and can be ignored.
To produce Fibre input for Sisal, simply follow this pattern. Text following a “#” is considered to be a comment that terminates at the end of the line. Comments are ignored by Sisal. For instance, the following Fibre file
<
5.7 # float variable
2 # integer
>
may be read by the following Sisal program
define main
function main(a: record[x:real; y:integer]; returns real, integer)
a.x, a.y
end function
producing the output
5.700000e+00 2
Chapter 4
Foreign language interface
4.1 Known external functions
Sisal can access certain functions that are not built in but are known to the compiler, including math functions from the C library. To use these functions, a “global” statement is required. For instance:
define main
global sin(a: real returns real)
function main(returns real)
sin(0.1)
end function
The “global” statement is needed to specify the argument type (“real”, “double_real”, or “integer”). The type of the return is the same as the type of the argument. The syntax of the “global” statement is the same as that of a “function” statement. (It is hard to imagine using trig functions on integers, but it is possible in Sisal!)
Here is a (possibly not exhaustive) list of available math functions:
|
|
|
angles in radians for all trig functions
|
|
|
|
|
|
|
two arguments: opposite, adjacent sides of triangle
|
|
|
|
|
|
|
''exp'' renamed to avoid shadowing pre-defined function
|
In addition, Sisal provides bitwise logical functions on integers that are distinct from the logical operations on boolean values:
|
|
|
bitwise ''and'' of two arguments
|
|
bitwise ''or'' of two arguments
|
|
bitwise ''exclusive or'' of two arguments
|
|
flips bits of one argument
|
|
right shift of first argument by second argument
|
|
left shift of first argument by second argument
|
4.2 C and Fortran functions