Saturday, December 30, 2006
Thursday, October 26, 2006
Declarative, Formal, and Extensible Syntax Definition for AspectJ
This afternoon Martin Bravenboer presented our OOPSLA'06 paper Declarative, Formal, and Extensible Syntax Definition for AspectJ. The photo shows the coauthors just before the presentation.
Aspect-Oriented Programming (AOP) is attracting attention from both research and industry, as illustrated by the ever-growing popularity of AspectJ, the de facto standard AOP extension of Java. From a compiler construction perspective, AspectJ is interesting as it is a typical example of a compositional language, i.e., a language composed of a number of separate languages with different syntactical styles: in addition to plain Java, AspectJ includes a language for defining pointcuts and one for defining advices. Language composition represents a non-trivial challenge for conventional parsing techniques. First, combining several languages with different lexical syntax leads to considerable complexity in the lexical states to be processed. Second, as new language features for AOP are being explored, many research proposals are concerned with further extending the AspectJ language, resulting in a need for an extensible syntax definition.
This paper shows how scannerless parsing elegantly addresses the issues encountered by conventional techniques when parsing AspectJ. We present the design of a modular, extensible, and formal definition of the lexical and context-free aspects of the AspectJ syntax in the Syntax Definition Formalism SDF, which is implemented by a scannerless, generalized-LR parser (SGLR). We introduce grammar mixins as a novel application of SDF's modularity features, which allows the declarative definition of different keyword policies and combination of extensions. We illustrate the modular extensibility of our definition with syntax extensions taken from current research on aspect languages. Finally, benchmarks show that the performance of scannerless generalized-LR parsing for this grammar is comparable to the parser of abc.
Aspect-Oriented Programming (AOP) is attracting attention from both research and industry, as illustrated by the ever-growing popularity of AspectJ, the de facto standard AOP extension of Java. From a compiler construction perspective, AspectJ is interesting as it is a typical example of a compositional language, i.e., a language composed of a number of separate languages with different syntactical styles: in addition to plain Java, AspectJ includes a language for defining pointcuts and one for defining advices. Language composition represents a non-trivial challenge for conventional parsing techniques. First, combining several languages with different lexical syntax leads to considerable complexity in the lexical states to be processed. Second, as new language features for AOP are being explored, many research proposals are concerned with further extending the AspectJ language, resulting in a need for an extensible syntax definition.
This paper shows how scannerless parsing elegantly addresses the issues encountered by conventional techniques when parsing AspectJ. We present the design of a modular, extensible, and formal definition of the lexical and context-free aspects of the AspectJ syntax in the Syntax Definition Formalism SDF, which is implemented by a scannerless, generalized-LR parser (SGLR). We introduce grammar mixins as a novel application of SDF's modularity features, which allows the declarative definition of different keyword policies and combination of extensions. We illustrate the modular extensibility of our definition with syntax extensions taken from current research on aspect languages. Finally, benchmarks show that the performance of scannerless generalized-LR parsing for this grammar is comparable to the parser of abc.
Thursday, August 03, 2006
From Vertical to Horizontal Compilation?
The Stratego compiler is a pipeline applying over 40 components (some of which are applied more than once) to transform a sugared and modular source program to an implementation in C. This is an example of vertical compilation in the sense that one stage of compilation is applied to the entire program, before the next stage is applied.
The alternative, horizontal compilation, is to apply the entire pipeline, or at least a considerable part of it to a single unit of the program, before considering the next unit.
This is of course done in separate compilation, where 'compilation units' are compiled separately
from other units. But it can be done to smaller units as long as the transformations are compositional, don't need information from other parts of the program.
After getting rid of nested functions, the next issue on my agenda is to refactor the Stratego compiler to provide a compilation library that offers translation strategies that can be applied to single definitions and that can be composed to form both horizontal and vertical compilers.
The first goal of the operation is to improve the performance of the Stratego compiler by eliminating intermediate ATerm files and by increasing the locality of the operations; instead of traversing the entire program for each stage, apply all stages to a single definition, before moving to the next.
In the longer term, this refactoring should enable the combination of interpreted, statically compiled, and run-time compiled code. Furthermore, it should facilitate the creation of an open compiler for Stratego that supports languages extension libraries.
The catch of course is in the compositionality of the compilation process. While many stages
of the compiler can easily be applied per definition, or are even based on local rewriting, there are several stages that require the whole program. One example, is the stage that combines definitions with the same name into a single definition. A compositional compilation will require a scheme that allows the union of such definitions at link time. Another example, is the 'lifting of dynamic rules', which requires awares of all dynamic rule definition sites in a program to prevent generation of duplicate definition. Turning these operations into 'horizontal' ones may require some new (intermediate) language constructs or meta-data that can be consulted when joining program fragments. Such considerations will no doubt guide the design of a better module system for Stratego. More later.
Getting rid of nested functions
Last summer I started with a renovation of the back-end of the Stratego compiler.
The implementation of failure handling was based on the C exception handling API (setjmp/longjmp).
However, this turned out to be costly.
In the
new compilation scheme
failure is now represented as a NULL ATerm.
The second part of the renovation was to get rid of nested functions in the generated C code.
After a year I have finally managed to finish this part of the renovation. In this blog I'll
explain the old and new compilation schemes.
At the same time as the setjmp/longjmp were introduced in the Stratego compiler (somewhere in 2001), I discovered that gcc supported nested functions in C. The following example illustrates why this was a useful feature.
The strategy
bottomup
is defined as follows:
bottomup(s) = all(bottomup(s)); sA recursive invocation of
bottomup(s)
is passed to the one-level traversal
operator all
.
In the generated C code, strategy operators are implemented as functions that take a function
pointer as argument. For instance, the all
operator is implemented by the function
SRTS_all
with signature
ATerm SRTS_all(ATerm f(ATerm), ATerm);However, a strategy expression such as
bottomup(s)
is clearly not something with
a function pointer. Therefore, the compiler lifts such expressions from argument positions into
local definitions. Thus, the definition of bottomup
is rewritten to:
bottomup_1_0(e_1 : ATerm() -> ATerm()|) = let a_0(|) = bottomup_1_0(e_1|) in all(a_0|) end ; e_1Now the argument of
all
is just a function name, which can be translated
as passing a function pointer. The definition of a_0
has to be nested in the definition of bottomup
since it refers to the argument e_1
of
that definition.
Nested functions in C provide a very convenient feature for translating such local definitions.
A nested function is an ordinary C function that is defined within the scope of another C function. Thus, it can refer to all arguments and local variables of the enclosing function. This
enables a straightforward implementation of local definitions in Stratego; a local definition is
replaced with a nested function. Using this approach the rewritten definition of bottomup
above is translated as follows:
ATerm bottomup_1_0 (ATerm e_1 (ATerm), ATerm t) { auto ATerm a_0 (ATerm t); ATerm a_0 (ATerm t) { t = bottomup_1_0(e_1, t); if((t == NULL)) goto fail_1 ; return(t); fail_1 : return(NULL); } t = SRTS_all(a_0, t); if((t == NULL)) goto fail_0 ; t = e_1(t); if((t == NULL)) goto fail_0 ; return(t); fail_0 : return(NULL); }While nested functions make the translation of nested definitions straightforward, they posed several problems. In the first place, nested functions are only supported by gcc, reducing the portability of code generated by the Stratego compiler. Secondly, nested functions in gcc are implemented by means of trampolines, a small piece of code that is generated at run-time and stored on the stack. This code essentially stores the closure of the nested function, that is, a pointer to the actual function and (pointers to) the variables in the enclosing stack-frame(s) to which it has access. The problem with this implementation is that it executes code stored on the stack, which is forbidden on more and more platforms, as it poses a security risk. Thus, we were confronted with users not being able to use Stratego/XT on some platforms. Finally, the implementation of nested functions in gcc does not have a high priority with gcc developers, which caused build failures for the Stratego/XT distribution with some versions of gcc, especially on MacOSX. While nested functions were quite convenient, they had to go and be replaced with an explicit mechanism for handling closures. The mechanism chosen was to use static links to reach stack frames of enclosing functions and an explicit closure data-structure to represent a function pointer with its environment. For example, in the new translation scheme, the signature of the
SRTS_all
operator becomes:
ATerm SRTS_all(StrSL sl, StrCL s, ATerm t)
StrSL
represents static links and StrCL
is the type of closures.
Locally defined strategies are now translated to top-level (static) functions and an explicit
closure is allocated in the enclosing function. Thus, the bottomup
strategy
is translated to the following pair of C functions:
ATerm bottomup_1_0 (StrSL sl, StrCL e_1, ATerm t) { sl_decl(sl); sl_funs(1); sl_init_fun(0, e_1); { struct str_closure a_0 = { &(lifted_0) , &(frame) }; StrCL lifted_0_cl = &(a_0); t = SRTS_all(sl, lifted_0_cl, t); if((t == NULL)) goto fail_1 ; t = cl_fun(e_1)(cl_sl(e_1), t); if((t == NULL)) goto fail_1 ; } return(t); fail_1 : return(NULL); } static ATerm lifted_0 (StrSL sl, ATerm t) { sl_decl(sl); t = bottomup_1_0(sl_up(sl), sl_fun_cl(0, sl), t); if((t == NULL)) goto fail_2 ; return(t); fail_2 : return(NULL); }The declaration
struct str_closure a_0 = { &(lifted_0) , &(frame) };allocates a closure consisting of a pointer to the function
lifted_0
and
a pointer to the current frame
, which is an explicit data-structure in
which escaping functions and variables are stored. The frame
variable is
declared by the macro sl_decl
at the start of the bottomup_1_0
function. A pointer to this closure (lifted_0_cl
) is passed to the invocation of SRTS_all
.
Note that a call to a function passed as a closure needs unwrapping of the closure
into a function pointer and a static link:
t = cl_fun(e_1)(cl_sl(e_1), t);While I started the implementation of this new translation scheme already in September 2005, I didn't get around to finishing it until yesterday. We spent a large effort last Fall to get the documentation of Stratego/XT in better shape, and after that I was busy teaching and doing god knows what, but it was not hacking the Stratego compiler. With this stumbling block out of the way, the door is open to a whole new set of innovations to the compiler and the Stratego language. A better module system is high on the wish list. Also we want to move from a separation of the compiler in phases to a compiler library that will allow larger parts of the compilation process to be applied to a single definition and elimination of intermediate ATerm files.
Wednesday, July 05, 2006
Learn Stratego/XT at OOPSLA/GPCE
When you fill in the registration form for OOPSLA and GPCE don't forget to check the G4 box for our
(Martin,
Karl and
me) tutorial on Building Java Transformations with Stratego/XT. The tutorial will be held on Monday afternoon.
In this tutorial we give an overview of techniques for program transformation, illustrated through the Stratego/XT program transformation system. We explain the general architecture of transformation systems, and how Stratego/XT is used to assemble such systems from components. We introduce a set of ready made components for Java transformation, and show how to program custom transformation components using Stratego. In particular, we show how to express local transformations using rewrite rules and strategies and how context-sensitive transformations can be expressed easily using dynamic rewrite rules. All techniques and language features are illustrated with implementations of transformations on Java programs, that show how to apply all introduced techniques in practice.
See also the page on the GPCE'06 site.
Partial Evaluation and (Semantics-based) Program Manipulation
Once you are done with your contribution for the
Software Transformation Systems workshop in Portland (co-located with
OOPSLA/GPCE),
you should start thinking about your contribution to
PEPM 2007, which I am chairing together with Ramalingam from IBM.
Last year PEPM broadened its mission from a narrow focus on partial evaluation techniques and applications, to all topics concerned with analysis and manipulation of programs.
In particular, the aim is to attract work that applies such techniques to real languages and large code bases. That is, not just design of clever techniques, but also validation of these techniques in practice. The website provides extensive advice for authors of research and tool presentation papers.
- Program and model manipulation techniques such as transformations driven by rules, patterns, or analyses, partial evaluation, specialization, slicing, symbolic execution, refactoring, aspect weaving, decompilation, and obfuscation.
- Program analysis techniques that are used to drive program/model manipulation such as abstract interpretation, static analysis, binding-time analysis, dynamic analysis, constraint solving, and type systems.
- Analysis and transformation for programs/models with advanced features such as objects, generics, ownership types, aspects, reflection, XML type systems, component frameworks, and middleware.
- Techniques that treat programs/models as data objects including meta-programming, generative programming, staged computation, and model-driven program generation and transformation.
- Application of the above techniques including experimental studies, engineering needed for scalability, and benchmarking. Examples of application domains include legacy program understanding and transformation, domain-specific language implementations, scientific computing, middleware frameworks and infrastructure needed for distributed and web-based applications, resource-limited computation, and security.
Friday, June 30, 2006
From Natural Semantics to Stratego
Yesterday, Brad Alexander from the University of Adelaide visited the Uithof and gave a talk in the Software Technology Colloquium with the title "From Natural Semantics to Stratego". He explained how he was in the process of converting an implementation of an optimizer for a second-order functional language in natural semantics (using the Centaur system) to a Stratego program using rewrite rules and strategies. As a result of using Stratego, transformations that took hundreds of lines of code could be reduced to only a few.
The exciting thing about his presentation was that the basic Stratego techniques he used were quite unexiting. That is, using the basic ideas of separate rewrite rules and parametric strategies he could greatly simplify his optimizer. I hope to see more of his work in the future.
Brad was on his way to Estonia to deliver a talk at the AMAST 2006 conference.
Career Evolution (Moving to Delft)
Last week I accepted a position as associate professor at the Software Engineering Group of Delft University of Technology. I will join forces there with Arie van Deursen who recently started at TUD as a full professor.
The ambition is to build a strong research group in the area of software engineering, based on our work in software evolution (Arie) and program transformation.
With NWO funding for a project on extensibility of transformation systems and one on model-driven software evolution my tenure at TUD can start full force.
Of course, I will continue work on Stratego/XT and applications together with Martin Bravenboer, who will join me at TUD. I also hope to convince Eelco Dolstra to make the move to Delft.
We are aiming at setting up a new buildfarm for supporting our own, but hopefully also other research projects.
=> |
Model-Driven Software Evolution
We just got word from NWO, the dutch national research funding organization, that our
project proposal on Model-Driven Software Evolution has been granted. This means
that we will have funding to hire two PhD students, a postdoc, and an assistant professor.
The project is a collaboration between
Arie van Deursen (principal investigator) and myself and is going to take place at Delft University of Technology, (where I'll be moving; see next blog).
If you have an interest in program transformation and domain-specific languages, and the ambition to contribute to the state-of-the-art in these areas, then
you might consider applying for one of these jobs. Official job adverts will be available later (we really just learned about acceptance), but in the mean time you can contact me if you have questions.
Here is the summary of the research proposal:
The promise of model-driven engineering (MDE) is that the development
and maintenance effort can be reduced by working at the model instead
of the code level. Models define what is variable in a system, and
code generators produce the functionality that is common in the
application domain.
The problem with model-driven engineering is that it can lead to a
lock-in in the abstractions and generator technology adopted at
project initiation. Software systems need to evolve, and systems
built using model-driven approaches are no exception. What complicates
model-driven engineering is that it requires multiple dimensions of
evolution. In regular evolution, the modeling language is used to
make the changes. In meta-model evolution, changes are required to
the modeling notation. In platform evolution, the code generators and
application framework change to reflect new requirements on the target
platform. Finally, in abstraction evolution, new modeling languages
are added to the set of (modeling) languages to reflect increased
understanding of a technical or business domain. While MDE has been
optimized for regular evolution, presently little or no support exists
for metamodel, platform and abstraction evolution. It is this gap
that this project proposes to address.
The first fundamental premise of this proposal is that evolution
should be a continuous process. Software development is a continuous
search for recurring patterns, which can be captured using
domain-specific modeling languages. After developing a number of
systems using a particular meta-model, new patterns may be recognized
that can be captured in a higher-level or richter meta-model. The
second premise is that reengineering of legacy systems to the
model-driven paradigm should be a special case of this continuous
evolution, and should be performed incrementally.
The goal of this project is to develop a systematic approach to
model-driven software evolution. This approach includes methods,
techniques, and underlying tool support. We will develop a prototype
programming environment that assists software engineers with the
introduction, development, and maintenance of models and
domain-specific languages.
Friday, May 12, 2006
Software Transformation Systems
Together with
Magne Haveraaen,
Jim Cordy, and
Jan Heering,
I am organizing the next
Workshop on Software Transformation Systems (STS'06), which will be co-located with GPCE and OOPSLA in Portland, Oregon in October.
If you are interested in the design, implementation, and use of transformation systems you should consider submitting an abstract. (Note that the webpage is being updated, so check back on details of submissions.)
If it will be like last time, it promisses to be a lively day full of interesting discussions.
Wednesday, May 10, 2006
Transformations for Abstractions (Looking for a Postdoc or PhD student)
Recently I got notice from NWO, the dutch research funding organization, that the project proposal `Transformations for Abstractions (TFA)' that I submitted last September was accepted. (In a very competitive round; only 15% of proposals was accepted this year.) As a consequence, I have funding for a three year postdoc or a four year PhD student position. The topic of the project is language extensions (abstractions), in general, and extension of transformations on/for language extensions, in particular. Here is the summary of the proposal:
This proposal is about techniques at the intersection of two areas of software engineering. (1) In order to automate software engineering we would like to automate the process of producing programs by means of automatic transformations, thereby computing with programs as we do with other data. (2) In order to improve the expressivity of programming languages to the concepts and notations of specific application domains, we would would like to extend general-purpose languages with domain-specific abstractions. Combining these desiderata leads to the need to extend transformations for new domain-specific abstractions.
The goal of this project is to develop a systematic approach to the extension of general purpose languages with domain-specific abstractions, integrating those abstractions in the syntax and transformations of the programming environment. This requires research into the following issues:
- strategies for the definition of domain abstractions
- mechanisms for open extensibility of transformations
- methods and patterns for design of open transformations
- constraints for independent extensibility of transformations
- derivation of transformation extensions from definitions of abstractions
Sunday, May 07, 2006
Image Transformation
After many years, I finally have a new picture for my home page. Here is the transformation from December 1998 to May 2006:
Clearly the main reason for a new picture was the quality of the old one; I don't look a day older, do I;-)
My renewed interest in photography was sparked by the discovery of autostitch, a tool that stitches together pictures from different parts of the same scene. It allows easy composition of panoramas such as the following picture of a canal in Amersfoort:
It is composed from a bunch (10?) pictures taken by hand (no tripod) with my not so fantastic Samsung digital camera. Autostitch discovers how to put the pieces together.
It does not only work for landscape panoramas, as illustrated by the following 180 degrees interior shot:
It can also be used for funny things such as the following composition of three pictures I took back in 2004 at OOPSLA/GPCE in Vancouver of Martin Bravenboer and Jurgen Vinju:
Or even more surrealistic compositions with multiple occurrences of the same people:
See my autostitch experiments for more examples.
That was a fun diversion in the past week. I've neglected this blog for a while. Much has happened in the meantime, and there are many plans for new (program) transformation work. Now that I've found time for messing around with picture transformation, I expect I'll find time for new blogs as well in the near future. Stay tuned. (I will mainly publish new photos at my flickr account.)
=> | => |
Subscribe to:
Posts (Atom)