PicoJava Example

Git repository: https://bitbucket.org/jastadd/examples
Download: PicoJava.zip
Documentation: API

PicoJava Description

This is an example JastAdd project that implements static-semantic checking for a tiny object-oriented language, "PicoJava". PicoJava illustrates essential parts of name resolution for OO languages, including inheritance, nesting, and qualified access. PicoJava also illustrates essential parts of OO type analysis, including dealing with type hierarchies in checking assignments.

In the following, the PicoJava language is described in a two-column format: informal description to the left and JastAdd implementation comments to the right.

Informal description of PicoJava JastAdd implementation of the PicoJava compiler

The PicoJava language is designed to be able to illustrate the essential parts of name resolution and type analysis for object-oriented languages.

PicoJava supports single inheritance of classes. Classes can also be nested inside each other (like inner classes in Java). Name resolution thus has to deal with a combination of inheritance and nesting.

A PicoJava class can contain variables and statements. These constructs suffice to be able to illustrate important type checking problems, e.g., dealing with type hierarchies in assignment.

PicoJava variables can be accessed using qualified access. This illustrates the interdependences between name resolution and type analysis.

Note that PicoJava is only intended as a language for illustrating typical static semantics problems. It would need to be complemented with many additional language features in order to be able to express meaningful executable programs. For example, there are no methods in PicoJava, and no "new" expressions. These can easily be added, but do not contribute to illustrating the fundamental issues in static semantics mentioned above.

The following example programs illustrate a legal and an illegal picojava program:

  • examples/legal.pj
  • examples/illegal.pj
The illegal program illustrates a number of static-semantic errors.

The Compiler tool

The tool Compiler is built using the AST classes generated by JastAdd. Run it with the following command:

   java -cp tools/beaver-rt.jar:. Compiler PicoJava-program

Implementation overview

Scanning and parsing is implemented using the third-party tools Flex and Beaver. The main program is written in Java. The rest of the tool (name resolution, type analysis, error checking, etc.) is implemented using JastAdd. Tests are implemented using the JUnit framework.

The implementation is structured into the following files:

Main program

  • Compiler.java: Simply parses the input file and prints error messages. All the static-semantics is performed implicitly as needed when the error-checking computation accesses the AST and its attributes.

Scanning and Parsing

  • picojava.flex: Input to flex. See tests/ScannerTests.java for examples of legal and illegal tokens.
  • picojava.parser: Input to beaver. Uses semantic actions to construct the AST. See tests/ParserTests.java for small examples of syntactially legal and illegal programs. Example programs can be found in the other test cases as well.

Abstract Syntax

  • picojava.ast: Defines the abstract syntax as classes.

Static Semantics

  • NameResolution.jrag: Defines name resolution for the PicoJava language. Uses reference attributes to bind accessed names to their declarations. Also illustrates parameterized attributes and broadcasting.
  • TypeAnalysis.jrag: Uses reference attributes to bind each expression to a type object. Also illustrates double dispatching, rewrites, and circular attributes.
  • PredefinedTypes.jrag: Uses nonterminal attributes to add predefined types like boolean and class Object to the AST.
  • NullObjects.jrag: Uses the Null design pattern to simplify type comparisons for undeclared accesses.

Checking errors

  • ErrorCheck.jadd: An imperative aspect that prints out error messages based on the information computed in the NameResolution and TypeAnalysis aspects.

Abstract syntax

A PicoJava program consists of a block with declarations and statements in it.

There are two main kinds of declarations: classes and variables.

A class can optionally inherit from another class.

A variable has a type which is either a class type or the primitive predefined type boolean.

A class has a body which is a block. Note that classes can be nested since a class can be declared inside the body of another class.

The statements in a block can be assignments or while statements.

Variables in other objects can be accessed using ordinary qualified access (dot-notation).

The abstract syntax is defined in picojava.ast.

Predefined types are handled by inserting explicit declarations for them into the AST, as children of the Program root node. This is implemented by a list nonterminal attribute, whose value is defined by an equation. The AST nodes for the predefined types are thus not created by the parser, they are instead created automatically by the attribute evaluator.

The nonterminal attribute is defined in PredefinedTypes.jrag as a synthesized attribute, and its defining equation creates a type declaration for "boolean, and also an object of type UnknownDecl which is used for missing declarations and types.

In order to be able to refer to the object representing "boolean", a reference attribute named "booleanType" is broadcast using an inherited attribute and exposed at various places where it is needed.

Name resolution

The goal of name resolution is to find out which declaration each accessing name refers to.

The name resolution is defined in the aspect NameResolution.jrag. For each Access node, an attribute decl() is defined which refers to the appropriate Decl node. If there is no matching declaration, the decl() attribute will be the UnknownDecl, applying the Null design pattern.

The decl() attribute constitutes the API of the name resolution module that is useful for other aspects to access. For example, the type checker can use the decl() attribute in order to find the type of an Access.

PicoJava has the following basic scope rules:
  • Usual nesting applies: Names declared in an outer block are visible also in inner blocks, unless shadowed by a declaration of the same name in the inner block.
  • Order of declaration is irrelevant within a block. This is slightly different from ordinary Java where the order is irrelevant only for methods and classes, but not for variables. In PicoJava, a variable declared inside a block is visible also to statements above it in the block body.

Basic name resolution is implemented by two attributes:

  • Synthesized attributes localLookup(String) which are defined for constructs that contain declarations, e.g., Block. This attribute searches through the local declarations to find one with a given name. The position of the access is not taken into account, so the order of declaration is irrelevant within a block.
  • Inherited attributes lookup(String) which return the visible declaration of a given name. IdUse calls lookup(String) to find its declaration. Constructs that control scope rules define the lookup attribute for their children, typically by delegating to localLookup attributes, and to their own lookup attribute defined by the context. A Block defines lookup(String) by delegating first to its own localLookup (searching through local declarations), and then to its own lookup. This way ordinary nesting is implemented.
Test cases for basic name resolution are found in tests/BasicNameResolutionTests.java

PicoJava supports object-oriented inheritance in combination with block nesting:
  • Inheritance: Names declared in a class are visible also in subclasses.
  • Combining inheritance and nesting: For a class nested inside a block, the declarations in the superclasses will shadow declarations in the outer block that have the same name.
  • To implement inheritance, each ClassDecl has a synthesized attribute remoteLookup(String) which looks through the declarations of the class and (if needed) declarations in the superclasses. This attribute is intended for handling lookup from accesses inside subclasses and from qualified accesses.
  • To combine inheritance with nesting, each ClassDecl defines lookup(String) to first look in the superclass chain (using the remoteLookup(String) attribute), before looking in the surrounding context.

Test cases for inheritance are found in tests/InheritanceNameResolutionTests.java.

PicoJava supports qualified access. I.e., an expression "a.b" means that "b" is an entity in the object referred to by "a".

To implement qualified access, a Dot redefines lookup(String) for its right-hand identifier. Instead of searching in the context, a remote lookup is done on the object's type.

Test cases for qualified access are found in tests/DotNameResolutionTests.java

Type Analysis

The main goal of type analysis is to find the type of each expression.

The type analysis is defined in the aspect TypeAnalysis.jrag.

The API of the type analysis module consists of the following attributes

  • Expressions and declarations have a type() attribute that is a reference to the appropriate type declaration. If there is no such appropriate declaration, the type() attribute will be the UnknownDecl, applying the Null design pattern.
  • Type declarations have a parameterized attribute subtypeOf(TypeDecl) which compares the two types.
  • Expressions have a boolean attribute isValue() which is true if the expression denotes a value i.e., something that can be stored in a variable.
  • Classes have a boolean attribute hasCycleOnSuperclassChain() that is true if there is a cycle somewhere on the superclass chain.
  • Classes have a ClassDecl attribute superClass that refers to the superclass declaration if it exists and is appropriate, e.g., without any cycles on the superclass chain, otherwise this attribute is null.

Test cases for the type analysis are found in tests/TypeAnalysisTests.java.

A class is a subtype of another class if it is a direct or indirect subclass. The subtype relation is inclusive, i.g., a class is considered to be a subtype of itself.

The subtype relation is implemented using the parameterized attribute subtypeOf(TypeDecl). The implementation is done via double dispatching. I.e., for each subclass X of TypeDecl, another attribute supertypeOfX(TypeDecl) is introduced which is called in X's implementation of subtypeOf, reversing the argument and the receiver.

The double dispatching provides an open way of implementing the comparison: New kinds of TypeDecls can be introduced without having to change the TypeAnalysis.jrag module. For example, to add the Null object UnknownDecl, its type comparison with other TypeDecls could be added completely inside the module NullObjects.jrag.

Class hierarchies must not be cyclic. I.e., a class may not (directly or indirectly) extend itself. The definition of the attribute hasCycleOnSuperclassChain() makes use of the circular attribute feature, i.e., it is defined in terms of other hasCycleOnSuperclassChain() attributes, and ultimately, it may be defined in terms of itself (if there actually is a cycle in the class hierarchy). To guarantee termination of circular attributes, the possible values should be on a lattice of finite height, and the equations monotonic, i.e., change the values only upwards in the lattice. In this case, we have a boolean lattice with TRUE at the bottom (the start value), and FALSE at the top. We see that the equations are monotonic since the values can go up to FALSE, but there is no equation that can bring the value down to TRUE.
Depending on the declaration of an accessed identifier, it can be either an access to a variable or an access to a type.

When the parser constructs the initial AST, it cannot know the declarations of the accessed identifiers, so it constructs Use nodes for all accessed identifiers. After name resolution, however, the declaration is known. It is useful to then replace the Use node with more specialized nodes: VarUse or TypeUse, depending on if the declaration is a variable or a type declaration. This is specified by two context-dependent rewrite rules in the type analysis module.

The replacement of Use nodes by VarUse or TypeUse, simplifies other computations and supports extensibility. We can see this in the computation of the isValue() attribute. Without the rewrite, this attribute would have to be defined by testing the type of the decl() attribute (using instanceof). If we later on extend the language with more kinds of declarations, the definition of the isValue() attribute might need to be changed. But with the rewriting solution, we can simply add a rewrite for the corresponding new kind of use, and add an isValue() equation for that class.

Null objects

It is desirable that static-semantic errors in a PicoJava program do not lead to additional errors. For example, if a declaration is missing, it is sufficient to mark the access as having a missing declaration - additional messages about type errors are undesirable.

We apply the Null design pattern to missing declarations and other situations where there are static-semantic errors. I.e., instead of representing a missing declaration as null, it is represented by a singleton UnknownDecl object. UnknownDecl is also used for representing inappropriate accesses, e.g., accessing a variable name where a class name is expected.

The UnknownDecl should be compatible with everything in order to not generate meaningless error messages. So it is defined to be both a supertype and a subtype of other types. Thanks to the double dispatching discussed above, it is possible to define this completely within the NullObjects.jrag module, rather than having to change the Type Resolution module.

The singleton UnknownDecl object is created by an NTA equation and placed in the list of predefined declarations in the root of the AST. A reference to the object is broadcast through the AST, and exposed at the various places in the AST where it is needed.

Error checking

There are various kinds of possible static-semantic errors in PicoJava. E.g., use of unknown identifiers, incompatible types in assignments, cycles in the class hierarchy, and type-checking of conditions in while-loops.

Errorchecking is implemented by an imperative module ErrorCheck.jadd. The API contains a method errors() for the Program (root) node. This method simply traverses the AST and collects error messages in a Collection object that is returned, using the attributes in the API:s of the static-semantics modules.

Test cases for the error checking are found in tests/ErrorCheckTests.java.

Building the tool

This project also contains all tools needed to build the PicoJava compiler.

  • tests: package containing JUnit test cases.
  • testframework: package containing classes useful for testing
    • TestAll.java: Used for finding all testcases when running tests from a terminal window.
  • tools: directory containing all the tools you need to run this JastAdd project
  • build.xml: Ant build file to be used from a terminal window or from Eclipse
    • Use the targets build (default), test, and clean, when working from a terminal window. For example, type

      generate all java files and compile them

      ant test

      run all test cases

      ant clean

      remove all generated files and all the class files for both generated and handwritten java files

    • Use the targets gen and cleanGen when working from eclipse (see Running JastAdd Under Eclipse):
      ant gen

      generate all java files

      ant cleanGen

      remove all generated files and their class files.

  • AST: package containing files generated by the tools. This directory is not present when the project is "clean". The directory will contain parser files generated by parser generator and AST files generated by JastAdd.