Tuesday, July 19, 2005

Transformations for Abstractions

I finally finished my keynote paper for SCAM 2005, the worshop on Source Code Analysis and Manipulation, which will be held at the end of September in Budapest. This is a good occasion for finally making a start with my blog as well, as the title of the paper `Transformations for Abstractions' suggests that it is related to the topic of my blog. As the introduction states:
What is common in the transformations that we are exploring is that they are for supporting abstractions. That is, to enable programming at a higher-level of abstraction. Our aim is to integrate such transformations in the software development process and make them available to programmers. That is, create programming environments with full meta-programming support that let the (meta-) programmer extend the language, the compiler, and other aspects of the environment such as documentation generators. This requires the transformations provided by the programming environment to be extensible. The base programming language should be extensible with new constructs and/or with new (primitive) APIs implementing new abstractions. Likewise the compiler should be extensible with new transformations, and existing transformations should be extensible to new language constructs. New constructs may be implemented by reduction to the base language, or by extension of the back-end of the compiler. APIs may require domain-specific optimizations to ensure efficient execution.
The paper presents a small case study in the definition of extensible programming environments, where a programming environment is understood as consisting of a syntax definition of a language and a bunch of transformations on programs in that language. The programming environment in case is called TFA and is a tiny imperative language with a number of extensions.

A couple of highlights:

The language in the paper is strongly modularized. Each aspect such as built-in data types and language features are defined as a separate component. The framework provides a syntax definition and a number of transformations to be applied to programs in the language. Both the syntax definition and the implementation of the transformations in Stratego are modularized per extension. The transformations include desugarings, bound-variable renaming, evaluation (an interpreter), data-flow transformations, and function specialization.

Of course, the implementation of all the transformations use concrete syntax. For example, desugaring of for-loops is defined by the following rewrite rule:

ForToWhile :
  |[ for x := e1 to e2 do st* end ]| ->
  |[ begin
       var x : int; var y : int;
       x := e1; y := e2;
       while x <= y do
           st* x := x + 1;
       end
     end ]|
  where new => y
A particularly cool extension is the embedding of the domain-specific language of regular expressions. In the regexps extension one can write /re/e to match the string resulting from evaluating the expression e to the regular expression re. This extension is implemented by means of a bunch of desugaring rewrite rules that use dynamic rules to create new function definitions that should be added to the program at top-level. For example, the following rule defines the assimilation of the Kleene star operator:
ReKle : |[ /re*;f/ x ]| -> |[ g(x) ]|
where new => g
    ; add-def(||[
        function g(a : string) : int
        begin
          return /re;g/a | f(a);
        end ]|)
The sources for TFA can be found at the Transformations For Abstractions page of the Stratego wiki (although they are currently in very alpha state).

So what is my blog going to be about then? (First of all, I'm getting really annoyed with this editor as I haven't seem to find the right way to include code snippets, so I may well give up and go back to good old wiki editing). Well, in the coming months at least, I intend to further explore the TFA framework to see how far I can get with extensibile transformations in Stratego. In the course of this project I intend to add new extensions, both in the area of traditional programming language features as in the area of domain-specific language embeddings, andincorporate more types of transformations. The end result should be an extensible programming environment that might actually become useful in itself, a catalogue of transformations, and a better insight in the problems of extensibility.

No comments: