Saturday, November 24, 2012

Rev 3.7 release with map/reduce; Python-like syntax

Revision 3.7 of the alpha release 0.5 of the ParaSail compiler and virtual machine is now available at the same URL we have been using recently:

This release has a number of new features, including a special syntax for map/reduce style computations, filters added to iterators, and support for an optional Python-like syntax.  For a description and some examples of the new map/reduce syntax, see:

Here is a simple example using the map/reduce syntax (and a filter) to compute the square of N by summing the first N odd integers:
    func Square(N : Univ_Integer {N >= 0}) -> Univ_Integer is
        return (for I in 1 ..< 2*N {I mod 2 == 1} => <0> + I);
    end func Square;
Filters are mentioned in the above blog entry, and illustrated in the above example. Here is another example where a "functional" version of QSort uses filters with a triplet of container comprehensions.  The filters are immediately before the "=>", enclosed in {...}:
    func QSort(Vec : Vector<Element>) -> Vector<Element> is
        if |Vec| <= 1 then
            return Vec;  // The easy case
            const Mid := Vec[ |Vec|/2 ];  // Pick a pivot value
                QSort( [for each E of Vec {E < Mid} => E] )  // Recurse
              | [for each E of Vec {E == Mid} => E]
              | QSort( [for each E of Vec {E > Mid} => E] ); // Recurse
        end if;
    end func QSort; 
Note that we have added a "magnitude" operator "|...|" which can be defined as appropriate for a given interface; for vectors |V| is equivalent to Length(V).

The ParaSail compiler now recognizes a Python-like variant, in addition to the "normal" ParaSail syntax. In the Python-like syntax variant, "is", "then", "of", and "loop" are replaced by ":"; semicolons are optional; "end if", "end case", etc. are not used. "end class", "end interface, "end func" are optional. Indenting is significant. There are also some Python-inspired synonyms: "def" may be used instead of "func"; "elif" may be used instead of "elsif"; "# " may be used instead of "//" to introduce a comment (notice the required space).

The Python-like syntax is best illustrated by example. Here is the same QSort function given above using the Python-like syntax:
    def QSort(Vec : Vector<Element>) -> Vector<Element>:
        if |Vec| <= 1:
            return Vec  # The easy case
            const Mid := Vec[ |Vec|/2 ]  # Pick a pivot value
                QSort( [for each E of Vec {E < Mid} => E] )  # Recurse
              | [for each E of Vec {E == Mid} => E]
              | QSort( [for each E of Vec {E > Mid} => E] )  # Recurse
Note that the above is strongly typed (unlike Python), but because of type inference on declarations, types only show up in the parameter and result types.

Currently the two syntax variants can be mixed and matched. At some point we may have some restrictions on mixing and matching. In any case semicolons will remain optional in both variants, and probably will start disappearing from examples given in future blog entries.

Both syntax variants are being supported at the moment as we are still ambivalent about whether the Python-like syntax is preferable. We have tentatively named this ParaSail-with-Python-like-syntax variant "PARython" or perhaps "Parython." We may be adding more scripting-oriented libraries so PARython might be usable in more contexts where Python is chosen today.

Sunday, November 4, 2012

Special syntax for Map/Reduce in ParaSail?

The Map/Reduce operation has become widely known recently.  A traditional way of implementing a Map/Reduce operation is to pass to the map_reduce operation, a container such as a set, vector, or list, a mapping function which is applied to each element, and a reducing function which is used to combine two mapped elements into one result.  The mapping step is relatively straightforward, but the reducing step can come in various forms, as it depends on whether the reducing function is associative, commutative, symmetric, etc.  Most functional languages have at least two standard reducing mechanisms, often called foldl and foldr.  These differ in the way the elements of the sequence are combined, either starting at the left or starting at the right.  Typically there is also an initial value which most often corresponds to the identity element for the reducing function, and this becomes the result if the sequence is empty.  If there is no convenient identity element (such as for maximum), versions of foldl and foldr are provided that use the use the first or last element as the initial value (called foldl1 and foldr1 in Haskell), but then only work on non-empty sequences.  Here is the wikipedia entry on the various forms of fold/reduce:

Because ParaSail supports passing functions as parameters to other functions, Map/Reduce in all its forms can be supported in the "traditional" way.  The question is whether some alternative syntax might be useful as well, analogous to the special syntax provided for quantified expressions (which essentially use and or or as the reducing function over a set of boolean values), and for container comprehensions (which essentially use "|" as a reducing operation to combine values into a new container).  As examples, here is a ParaSail quantified expression that asserts an array is sorted:

  (for all I in 1 ..< Length(Vec) => Vec[I] <= Vec[I+1])

and here is a container comprehension that creates a vector of squares:

   const Squares : Vector<Integer> := [for I in 1..10 => I**2]

Quantified expressions and container comprehensions could also be implemented by passing functions as parameters, but clearly the special syntax makes it somewhat more convenient, and arguably easier to understand.  So the question is: is there an analogous situation for the general map/reduce operation?  Would a special syntax make it more convenient and/or easier to understand?

Here is an example using possible syntax for a general map/reduce:

   (for I in 1 .. Length(Vec) => <0> + Vec[I]**2)

This is to be interpreted as a map/reduce operation which produces the sum of the squares of the values of the Vec.  The initial result (in the case of an empty vector) is given inside the <...>, and then after each element is combined with the ongoing result, the new result replaces the <...> for the next element.  Here would be a dot product of two vectors (first we assert they are of the same length):

  {Length(Vec1) == Length(Vec2)}
 (for I in 1 .. Length(Vec1) => <0> + (Vec1[I] * Vec2[I]))

Here is a computation of factorial(N):

   (for I in 1 .. N => <1> * I)

If we wanted to compute the maximum of the elements of some function evaluated over a range of inputs, and we wanted to use the first value as the initial result (analogous to foldl1 in Haskell) we could write (after first asserting we have a non-empty set of inputs):

    {N > 0}
   (for I in 1 <.. N => Max(<F(1)>, F(I)))

This general syntax could be used with any ParaSail iterator, so if we wanted to count the number of symbols in a symbol table whose names are shorter than 3 letters, we could write:

   (for each S of Sym_Table => <0> + (Length(S.Name) < 3? 1: 0))

Here we might better use a filter annotation on the iterator (unfortunately, iterator filters are not yet implemented in the ParaSail compiler):

   (for each S of Sym_Table {Length(S.Name) < 3} => <0> + 1)
An explicit reverse or forward could be used with the iterator if we want to distinguish between foldr and foldl for reducing ordered sequences of values.  Otherwise, the default is an unordered parallel reduction.

Note that a quantified expression can be re-expressed in this more general map/reduce syntax.  For example, the above assertion of Vec being sorted could be re-expressed as:

   (for I in 1 ..< Length(Vec) => <#true> and then Vec[I] <= Vec[I+1])

Similarly, a container comprehension could be re-expressed using this general syntax.  For example, the table of squares could be re-expressed as:

   (for I in 1 .. 10 forward => <[]> | [I**2])

Overall this special syntax seems to provide a nice alternative for map/reduce.  In the traditional approach, we would have to "bundle" up the mapping and reducing operations into functions, whereas with this special syntax, we can write the operations directly in a relatively "normal" looking expression syntax.

We would appreciate comments on this proposed special map/reduce syntax.