Thursday, September 15, 2005

MetaBorg in Action

This morning Rene de Groot presented his thesis work in our Software Technology Colloquium. He has worked on two case studies of embedding domain-specific languages in Java --- applying and refining our MetaBorg approach to `concrete syntax for objects'.

In the first case study, Rene elaborated JavaSwul, the extension of Java with the Swing User-interface Language (Swul). Rather than the usual spaghetti code of Swing method invocations, the language lets you create user interface objects by following the hierarchical structure of the Swing classes. For example, the following GUI

is produced by this Java program

import javax.swing.*;
import java.awt.*;
  
public class Test3 {
  public static void main(String[] ps) {
    JFrame frame = frame {
      title = "Welcome!"
      content = panel of border layout {
        center = label { text = "Hello World" }
        south = panel of grid layout {
          row = {
            button { text = "cancel" }
            button { text = "ok"}
          }
        }
      }
    };
    frame.pack();
    frame.setVisible(true);
  }
}
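For comparison, here is a rough sketch of the same window written directly against the plain Swing API (the code below is mine, not taken from the thesis): the nesting of the user interface dissolves into a flat sequence of constructor calls, add invocations, and setters.

import javax.swing.*;
import java.awt.*;

public class Test3Plain {
  public static void main(String[] ps) {
    JFrame frame = new JFrame();
    frame.setTitle("Welcome!");
    // content panel with a border layout
    JPanel content = new JPanel(new BorderLayout());
    content.add(new JLabel("Hello World"), BorderLayout.CENTER);
    // button row with a grid layout
    JPanel buttons = new JPanel(new GridLayout(1, 2));
    buttons.add(new JButton("cancel"));
    buttons.add(new JButton("ok"));
    content.add(buttons, BorderLayout.SOUTH);
    frame.setContentPane(content);
    frame.pack();
    frame.setVisible(true);
  }
}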
The JavaSwul Examples page contains a number of other cool examples, including the creation of menus with event handlers and gridbags with a semi-visual layout syntax.

The second case study is the extension of Java with concrete syntax for regular expressions. First, it provides a syntax for regular expressions and checks the regular expressions in a program against that syntax at compile time. Second, because of the explicit syntax extension there is no need for escaping special characters, as has to be done when regular expressions are encoded as strings. Finally, the extension provides syntax for defining string rewrite rules and for combining those into compound string rewrite operations (analogous to rewriting strategy combinators in Stratego). Here is an example, with some wiki-like rewrite rules:

public String publish(String page) {
  regex body = [/ <body[^>]*?> .* </body> /]

  regex amp = [/ & /] -> [/ &amp; /] ;
  regex lt  = [/ < /] -> [/ &lt;  /] ;
  regex gt  = [/ > /] -> [/ &gt;  /] ;

  regex escape = amp <+ lt <+ gt

  regex noattach  = [/ <a[^>]*?> \s* Attach \s* </a> /]
                 -> [/ <strike> Attach </strike> /] ;
  regex edittopic = [/ %EDITTOPIC% /]
                 -> [/ <a href="${editAction}"><b> Edit </b></a> /] ;
  input ~= one( body
                <~>
                all( edittopic <+ noattach <+ escape  )
              )
  return input ;
}
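For comparison, here is roughly what the noattach rule looks like when encoded with the standard java.util.regex API (this sketch, including the class name and the test input, is mine): the backslashes have to be doubled inside the string literal, and a typo in the pattern is only detected at run time, when Pattern.compile throws a PatternSyntaxException.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NoAttachPlain {
  public static void main(String[] args) {
    // \s becomes \\s in a Java string literal; pattern errors surface only at run time
    Pattern noattach = Pattern.compile("<a[^>]*?>\\s*Attach\\s*</a>");
    Matcher m = noattach.matcher("... <a href=\"attach\"> Attach </a> ...");
    System.out.println(m.replaceAll("<strike> Attach </strike>"));
  }
}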

In addition to the syntactic extensions of Java and their definition using transformations, the implementations include an extension of Martin Bravenboer's typechecker for Java, so that Java programs using the extensions are typechecked before transformation.

A preview of the work is presented in MetaBorg in Action, a paper for the Summer School on Generative and Transformational Techniques in Software Engineering (GTTSE'05) that was held in Braga, Portugal, last summer. Rene's thesis is due soon.

Monday, September 05, 2005

Service Configuration Management

Today our sysadmin asked me to set up a new wiki for use in some course. (Due to early wiki enthusiasm I became wiki master in our department. In fact, it took quite some time to convince people that not having a single web master was a good idea.) When I set up my first wikis, one for our Software Technology research group and the other for Program-Transformation.Org, it took me quite some time to get everything installed in the right place, customize the configuration file, and such misery. For instance, on the web server of our department perl was not installed in /usr/bin, as expected by the CGI scripts of TWiki, but rather in /sw/bin. This required patching all scripts each time I upgraded the installation. Also, the directories containing the site were mounted at a different path on the web server than on the user server. Thus, scripts that were invoked via a cron job (for example to update statistics) needed a different path configuration than scripts running on the server. As a result, updating the site was not something to look forward to, let alone adding new installations.

Today I didn't flinch for a second when I received the request to create a new wiki. I just copied the 60-line Nix file of another wiki, changed the name of the wiki, the directories in which to store the wiki data, and the port at which to reach the wiki on the server, and customized a few other options. Then I called the install script, and there I had a new wiki set up and running. Well, actually I had to ask the sysadmin to open the new port, but then it was just there.

This is all thanks to service configuration management with Nix. In this paper, which Eelco Dolstra will present tomorrow morning at SCM-12 in Lisbon (Portugal), we show that the management of services requires the integrated management of software and configuration. By treating software configurations just like any other software components, rather than as state managed in global directories, one can uniquely and reproducibly describe a specific instance of a service. We have used this to successfully deploy quite a few services already, including a Subversion version management server, a Jira issue tracking server, and a build farm with automatic release management. The description of the machine-specific parts of the deployment of these services is factored out into a small Nix expression, such that I could create a combined Subversion/wiki server on my little home machine with the push of a button (well, almost).

So if you don't have a chance to attend Eelco's talk tomorrow morning, at least have a look at the paper.

Friday, September 02, 2005

Stratego/XT 0.16M1

The new translation scheme turned out quite well. On Linux it gives a slight performance improvement; on Macs it gives a major one, with speedups of a factor of three observed. Good reason for a new release:

Stratego/XT 0.16M1 is now available from

http://www.stratego-language.org/Stratego/StrategoRelease016M1
This release is the first milestone release (M1) towards the new major release 0.16. (The zeroth milestone was called 0.15.)

The purpose of the 0.16 development is a major overhaul of the language and compiler internals. Release 0.15 introduced the Stratego Core language and the corresponding refactoring and cleanup of the compiler. This release features a lot of fixes to the Stratego Core compiler and lots of other small improvements. Noteworthy features include:

  • call(f|s*|t*) is a new language construct that supports calling strategies by name, i.e., f is a term that is interpreted as the name of the strategy to call. It can be used for callbacks in libraries.
  • The checksum strategy gives the MD5 checksum of a term.
Furthermore, this release already contains some features planned for future milestones:

For 0.16M2: A new translation scheme for implementing choice. Rather than using setjmp/longjmp, the scheme now uses NULL pointers to indicate the failure of a strategy. This produces a slight performance improvement on Linux and a great performance improvement on Macs, where the setjmp/longjmp feature is quite expensive to use.

For 0.16M3: A good number of improvements to the implementation of dynamic rules. In particular, the compiler now detects overlapping dynamic rules and forbids their use.

While the 0.15 release was still rather experimental, the 0.16M1 release is much improved and fairly reliable. We are interested in your experiences. Please report any problems so that we can solve them as soon as possible. We plan to have a stable 0.16 release by the end of September.

Sunday, August 21, 2005

Translating Failing Transformations

After deliberating about it for a long time (several years even), I finally implemented a new translation scheme for the Stratego Compiler. The old scheme used the C feature of setjmp/longjmp to deal with failing transformations. This provided the opportunity to go from using C as an assembly language, where an entire Stratego program was compiled to a single C function using gotos for control flow, to a more idiomatic style of C programs in which each strategy definition was compiled to a C function. The setjmp/longjmp feature elegantly dealt with the notion of failure by declaring a choice point (with setjmp) and jumping to it from anywhere (with longjmp). However, since choice points are the control-flow mechanism in Stratego, the speed of programs depends heavily on the cost of this feature. On Intel machinery (running Linux) this is not a big issue, but on Apples and Suns (RISC machines) saving the larger number of registers at each choice point is quite expensive; at least, that is the theory about where the performance of Stratego programs can be improved.

Eelco Dolstra suggested a long time ago to return NULL to indicate the failure of a strategy. Indeed, this representation closely matches the formal operational semantics of the language, in which the set of terms is extended with a failure value; exactly the ATerm data type extended with an extra value (NULL). While conceptually simple, the idea seemed too disruptive to the run-time system, compiler, and native parts of the library to start work on, and there were always plenty of other things to do. (Especially considering that the change does not add one bit of functionality.)
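
To make the idea concrete, here is a little sketch of failure-as-null, written in Java rather than in the C code the compiler actually generates, and with names of my own invention: a strategy maps a term to a term or to null, and the left choice s1 <+ s2 simply tests for null instead of setting up a choice point with setjmp.

// Sketch only: a strategy maps a term to a term, or to null to signal failure.
interface Strategy {
  Object apply(Object term);
}

public class FailureAsNull {
  // Left choice s1 <+ s2: try s1; if it fails (returns null), fall back to s2.
  static Strategy leftChoice(Strategy s1, Strategy s2) {
    return term -> {
      Object result = s1.apply(term);
      return result != null ? result : s2.apply(term);
    };
  }

  public static void main(String[] args) {
    Strategy fail = term -> null;   // a strategy that always fails
    Strategy id   = term -> term;   // the identity strategy, which always succeeds
    System.out.println(leftChoice(fail, id).apply("some term"));   // prints: some term
  }
}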

After the recent Stratego Core refactoring, the cleanup of the intermediate representation and the compiler, and the resolution of lots of outstanding issues, this translation scheme refactoring came into view again. Interestingly, it took less than a week of real time to achieve, which included a camping trip, a visit to Philips Research, and reading a couple of theses and articles. So either the problem was never that problematic to start with, or the recent drastic refactorings of the Stratego/XT source tree, configuration, and build system have paid off. Also, the change to baseline-based bootstrapping (from self-based bootstrapping) has enormously simplified the process of changing the foundations of the compiler. Finally, the reliance on a solid continuous build and release system gives one much more confidence in committing to a new version to bootstrap against. If there is a general lesson here: refactoring and continuous integration pay off.

As for the new translation scheme itself: it is pretty standard fare. Have a look at s2c-ng.str and compare it to the classic s2c.str. The only flaw is that I had to resort to producing code with gotos. Noteworthy about the new version is the use of concrete syntax for Stratego and C almost everywhere, which makes a night-and-day difference in maintaining the translation scheme. For example, the following rule defines the translation of the crucial guarded choice construct:

  TranslateStrat(|S,F) :
    |[ s1 < s2 + s3 ]| ->
    stm|[
      {
        ATerm ~id:x = t;
        ~stm:<translate-strat(|Next,F')>s1
        ~stm:<translate-strat(|S,F)>s2
        ~id:F' : t = ~id:x; 
        ~stm:<translate-strat(|S,F)>s3
      }
    ]|
    where <not(Next)> S; new => x; new => F'
Another interesting new feature is the collection of code fragments using dynamic rules, and the synthesis of the target program from these fragments afterwards; in contrast to the old method, in which the source program was traversed for each type of fragment, `driven by' the target program.

I'm writing this blog entry while waiting for the bootstrap build to go through, but judging from the regression tests for the compiler, the implementation seems to work fine. What is not clear yet is the performance improvement this will bring, if any. Another feature of the current translation scheme that seems to have a negative impact on the performance of Stratego programs, especially on Apples and Suns, is the use of nested functions in the target C code. This is a GCC-specific feature, and therefore fraught with portability and performance problems. The implementation using trampolines also does not sit well with the tendency to forbid executable code on the stack. So this aspect of the translation is next in line to be changed. The basic idea is again simple, and the use of fragment collection was partly motivated by this change. More later.

Thursday, August 11, 2005

Transforming Bibliographies

Among the useless things one can do in life, maintaining one or more publication lists ranks high. My tendency to waste time on my publication list probably dates back to my days as a PhD student when I badly needed publications to put on a list. On the other hand, as publications are the measure of achievement in research, more researchers may have this problem.

Anyway, maintaining a list of publications can be quite tedious, in particular if you want to provide multiple views on the publications. For example, a listing with the most recent publications first, one with the most important ones first, one organized by research topic, and finally a separate list for each project. Also, your department may require the regular submission of such lists. In the web version the entries should come with links to the pdf files and/or the webpage of the publisher, but these links should not be displayed in the version for printing, since they are quite useless there.

Being a computer scientist, I elevated the activity of maintaining content to maintaining a program for generating the various lists. This is still a waste of time, of course, but the excuse is that it will save me time in the future. Another excuse is that I developed my program as a case study for the transformation language Stratego.

In fact, the bibtex-tools package has emerged over a long time, starting with a syntax definition for BibTeX first written in 1999. It turns out that BibTeX has quite an intricate syntax that is not so easily formalized with a traditional approach based on a separate lexical analyzer and context-free parser. With the scannerless approach of SDF this poses no problems at all.

Also, the use of Stratego to perform transformations on a structured representation of a BibTeX file is a definite improvement over directly transforming its text representation. Moreover, these transformations can be expressed quite concisely. For example, the following strategy definitions define an inliner for BibTeX that replaces occurrences of string identifiers with their bodies. (BibTeX allows the definition of strings such as @string{LNCS={Lecture Notes in Computer Science}}, which can then be referenced in entry fields using the identifier, e.g., series = LNCS.)

  bib-inline = 
    bottomup(try(DeclareInlineString + InlineString + FoldWords))

  DeclareInlineString =
    ?String(_, StringField(key, value))
    ; rules( InlineString : Id(key) -> value )

  FoldWords :
    ConcValue(Words(ws1), Words(ws2)) -> Words((ws1, ws2))
After having developed my own set of BibTeX tools in the Stratego transformation language over the last couple of years, I decided to turn them into a proper software package that can be used by others, complete with a manual that explains the LaTeX/BibTeX/Hevea techniques used to get a publication list into HTML. The currently available version is a pre-release of the first official release, 0.2. I'm waiting for a new stable version of Stratego/XT and for some feedback from users before I make the release official.

So if you don't want to waste time editing your publication list webpage, but instead want to waste time learning to use my tools, you now know where to find them.

Thursday, July 21, 2005

Software Engineering News

Or niews rather. I'm warming up for another installment of our software engineering course, and browsing to see what is going on. The monthly seniews newsletter, written by members of CIBIT (formerly SERC), provides pointers to developments in software engineering and has an online archive. This month it has a pointer to the new book Pragmatic Version Control Using Subversion, which might be recommendable reading for our students.

Software Deployment in the News

``The deployment role is a role that is often overlooked, much to the pain of the users,'' writes Robert Bogue in a series about roles in software engineering. Unfortunately, he forgets to mention the Nix Deployment System. In Bogue's world (which is most of the world) software deployment is mostly a craft that has to be repeated for each and every application to be installed, and one has to pray that a carefully crafted installation program will not break on the target machine. Nix expressions encode all the dependencies of an application, and once tested successfully, deployment consists of transmitting the closure of those dependencies, which guarantees that everything is present at the target site.

Managed desktops, the next step for Nix? Chad Dickerson writes about outsourcing the management of desktop systems. We have been getting quite good at managing all aspects of services with Nix, that is, managing all software components and their configuration in one formalism. Examples are our Subversion and wiki servers. It would be great if this could be extended to an entire desktop environment.

By the way, I found these items on ACM's careernews.

Tuesday, July 19, 2005

Transformations for Abstractions

I finally finished my keynote paper for SCAM 2005, the Workshop on Source Code Analysis and Manipulation, which will be held at the end of September in Budapest. This is a good occasion for finally making a start with my blog as well, since the title of the paper, `Transformations for Abstractions', suggests that it is related to the topic of my blog. As the introduction states:
What is common in the transformations that we are exploring is that they are for supporting abstractions. That is, to enable programming at a higher-level of abstraction. Our aim is to integrate such transformations in the software development process and make them available to programmers. That is, create programming environments with full meta-programming support that let the (meta-) programmer extend the language, the compiler, and other aspects of the environment such as documentation generators. This requires the transformations provided by the programming environment to be extensible. The base programming language should be extensible with new constructs and/or with new (primitive) APIs implementing new abstractions. Likewise the compiler should be extensible with new transformations, and existing transformations should be extensible to new language constructs. New constructs may be implemented by reduction to the base language, or by extension of the back-end of the compiler. APIs may require domain-specific optimizations to ensure efficient execution.
The paper presents a small case study in the definition of extensible programming environments, where a programming environment is understood as consisting of a syntax definition of a language and a bunch of transformations on programs in that language. The programming environment in question is called TFA and is a tiny imperative language with a number of extensions.

A couple of highlights:

The language in the paper is strongly modularized. Each aspect, such as a built-in data type or a language feature, is defined as a separate component. The framework provides a syntax definition and a number of transformations to be applied to programs in the language. Both the syntax definition and the implementation of the transformations in Stratego are modularized per extension. The transformations include desugarings, bound-variable renaming, evaluation (an interpreter), data-flow transformations, and function specialization.

Of course, the implementations of all the transformations use concrete syntax. For example, the desugaring of for-loops is defined by the following rewrite rule:

ForToWhile :
  |[ for x := e1 to e2 do st* end ]| ->
  |[ begin
       var x : int; var y : int;
       x := e1; y := e2;
       while x <= y do
           st* x := x + 1;
       end
     end ]|
  where new => y
A particularly cool extension is the embedding of the domain-specific language of regular expressions. In the regexps extension one can write /re/e to match the string resulting from evaluating the expression e against the regular expression re. This extension is implemented by means of a bunch of desugaring rewrite rules that use dynamic rules to create new function definitions, which are then added to the program at the top level. For example, the following rule defines the assimilation of the Kleene star operator:
ReKle : |[ /re*;f/ x ]| -> |[ g(x) ]|
where new => g
    ; add-def(||[
        function g(a : string) : int
        begin
          return /re;g/a | f(a);
        end ]|)
The sources for TFA can be found at the Transformations For Abstractions page of the Stratego wiki (although they are currently in a very alpha state).

So what is my blog going to be about, then? (First of all, I'm getting really annoyed with this editor, as I can't seem to find the right way to include code snippets, so I may well give up and go back to good old wiki editing.) Well, in the coming months at least, I intend to further explore the TFA framework to see how far I can get with extensible transformations in Stratego. In the course of this project I intend to add new extensions, both in the area of traditional programming language features and in the area of domain-specific language embeddings, and to incorporate more types of transformations. The end result should be an extensible programming environment that might actually become useful in itself, a catalogue of transformations, and better insight into the problems of extensibility.

Monday, May 23, 2005

How does this blog?

An experimental post to see how this thing works and find out whether I'll be comfortable with putting content on some other server. (And is it possible to get the content out again?) I'd rather integrate a blog with my wiki homepage, but that hasn't worked so far.
OK, checking the configuration options, it turns out that everything is adaptable and that I can even store the blog on my own server (if I wanted to). So I'll leave things as they are and change them when I feel like it.