SPL (an acronym for Simple Programming Language) is a C-like imperative programming language with classes, lists, and tuples. For example, the following code prints the Fibonacci sequence.
#include "stdlib/std.spl"
// Output the Fibonacci sequence
main ()
{
    var t1 = 0;
    var t2 = 1;
    while(True) {
        println(t2);
        var h = t1 + t2;
        t1 = t2;
        t2 = h;
    }
}
The language is formally defined with a context-free grammar and typing rules. This post discusses a compiler for SPL implemented in Haskell. The compiler translates source code into assembly for a Simple Stack Machine [SSM](http://www.staff.science.uu.nl/~dijks106/SSM/) in a four-phase process of lexing, parsing, typing, and code generation. The source code is available on GitHub.
Grammar
The SPL grammar defines the legal syntax of the language. The semantics are only partially defined through the grammar; they are more fully defined through the typing rules given in the Typing section.
The context-free grammar is as follows:
<SPL> = <Decl>+
<Decl> = <VarDecl>
| <FunDecl>
| <ClassDecl>
<VarDecl> = ('var' | <Type>) <Id> '=' <Exp> ';'
<FunDecl> = <Id> '(' [ <FArgs> ] ')'
[ '::' <FunType> ]
'{' <VarDecl>* <Stmt>+ '}'
<ClassDecl> = ['class'] <ClassId>
'{' <VarDecl>* <FunDecl>* '}'
<FunType> = <Type>* '->' <Type>
<Type> = <BasicType>
| '(' <Type> ',' <Type> ')'
| '[' <Type> ']'
| <ClassId>
| <Id>
<BasicType> = 'Int'
| 'Bool'
| 'Char'
| 'Void'
<FArgs> = <Id> [',' <FArgs>]
<Stmt> = <VarDecl>
| 'if' '(' <Exp> ')' <Stmt> [ 'else' <Stmt> ]
| 'while' '(' <Exp> ')' <Stmt>
| <Exp> '=' <Exp> ';'
| <FunCall> ';'
| 'return' [<Exp>] ';'
| '{' <Stmt>* '}'
<Exp> = <Exp> <Op2> <Exp> | <Exp-base>
<Exp-base> = <Id> [<Field>]
| <Op1> <Exp>
| <Int>
| '\'' <char> '\''
| 'False'
| 'True'
| '(' <Exp> ')'
| <FunCall>
| '[]'
| '(' <Exp> ',' <Exp> ')'
| <ClassId> '(' [ <ActArgs> ] ')'
| 'new' <ClassId> '(' [ <ActArgs> ] ')'
| 'delete' <Exp>
<Field> = ( '.' 'hd' | '.' 'tl' | '.' 'fst' | '.' 'snd')+
<FunCall> = <Id> '(' [ <ActArgs> ] ')'
<ActArgs> = <Exp> [ ',' <ActArgs> ]
<Op2> = '||' | '|' | '&&' | '&'
| '==' | '!=' | '<' | '>' | '<=' | '>='
| ':' | '&+' | '&-' | '&-&' | '<<' | '>>'
| '+' | '-'
| '*' | '/' | '%'
<Op1> = '!' | '-' | '(' <Type> ')' | '&' | '*'
<Int> = ['-'] digit+ | ['-'] '0x' digit+
<ClassId> = Alpha ( '_' | AlphaNum)*
<Id> = alpha ( '_' | alphaNum)*
Here alpha and alphaNum refer to alphabetic and alphanumeric characters, respectively; Alpha denotes an uppercase alphabetic character.
Operator precedence, associativity and semantics
The operator precedence, associativity and semantics of the binary operators are as follows:
Operator | Precedence | Associativity | Semantics |
---|---|---|---|
|| | 1 | Left | Boolean or |
| | 1 | Left | Bitwise or |
&& | 2 | Left | Boolean and |
& | 2 | Left | Bitwise and |
== | 3 | Left | Equals |
!= | 3 | Left | Not equals |
< | 4 | Left | Less than |
> | 4 | Left | Greater than |
<= | 4 | Left | Less than or equals |
>= | 4 | Left | Greater than or equals |
: | 5 | Right | List construction (cons) |
&+ | 5 | Left | Reference integer addition |
&- | 5 | Left | Reference integer subtraction |
&-& | 5 | Left | Reference reference subtraction |
<< | 5 | Left | Bit-shift left |
>> | 5 | Left | Bit-shift right |
+ | 6 | Left | Integer addition |
- | 6 | Left | Integer subtraction |
* | 7 | Left | Integer multiplication |
/ | 7 | Left | Integer division |
% | 7 | Left | Modulo |
The unary operators are defined as follows:
Operator | Semantics |
---|---|
! | Boolean negation |
- | Integer negation |
(<Type>) | Type casting |
& | Referencing |
* | Dereferencing |
Lexing
The lexing process is the initial phase of SPL compilation. Through lexing, raw source code is turned into a list of tokens. Some tokens represent single characters in the source code, while others represent more complex structures. For example, the source code string 49020 is represented as a single token TConstant (CInt 49020).
The lexer is implemented as a set of recognizers. Each recognizer takes as input the remaining source code, and either fails or outputs the recognized initial string and the corresponding token. If multiple recognizers produce output, the output of the recognizer matching the largest part of the input string is used. Tie-breaking is performed by a priority assigned to each recognizer. For example, the source code string if will be lexed as TKeyword KIf as opposed to TIdentifier "if", as the keyword recognizer has a greater priority.
The recognizer data types are as follows:
type Recognizer = String -> Maybe (String , Token)
data RecognizerPriority = RP { recognizer :: Recognizer ,
priority :: Int}
A single recognize step takes a list of such RecognizerPriority values and the source code string, and either fails or outputs the recognized initial string and the corresponding token:
recognize :: [RecognizerPriority]
-> String
-> Maybe (String , Token)
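A sketch of such a recognize step, selecting the longest match and breaking ties by priority (the actual implementation may differ in its details):
import Data.List (sortBy)
import Data.Ord (comparing, Down (..))

recognize :: [RecognizerPriority] -> String -> Maybe (String, Token)
recognize rps input =
    case sortBy (comparing rank) matches of
        ((_, res) : _) -> Just res
        []             -> Nothing
  where
    -- Pair each successful match with its recognizer's priority.
    matches = [ (priority rp, res)
              | rp <- rps
              , Just res <- [recognizer rp input] ]
    -- Sort longest match first; break ties with the higher priority.
    rank (p, (matched, _)) = (Down (length matched), Down p)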
Regular expression engine
Internally, the recognizers make use of regular expressions to match strings. The regular expression engine implementation is based on Thompson’s construction [thompson1968programming,xing2004minimized]. In this construction, the regular expression engine compiles a regular expression to a non-deterministic finite automaton (NFA). Here, the compilation is implemented similarly to the main compiler as separate processes of lexing, parsing, and “code” generation in the form of building the NFA. The compiled NFA can subsequently be simulated on an input string, returning all matching strings.
The NFAs are implemented as machines taking string input, and have a set of states, a set of transitions between states, an initial state, and a set of accepting states. Four transition types are implemented: transitions from one state to another predicated on an input character, a conditional transition with an arbitrary predicate on characters, an empty transition, and a transition if the end of the input string has been reached.
data NFA a = NFA [a] [Transition a] a [a]

data Transition a = Transition a Char a
                  | ConditionalTransition a (Char -> Bool) a
                  | EmptyTransition a a
                  | EOSTransition a a
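To sketch how such an NFA can be simulated (a simplified version assuming Eq on states; the actual engine returns all matching strings rather than a single Bool):
import Data.List (nub)

-- All states reachable from the given set via empty transitions.
closure :: Eq a => NFA a -> [a] -> [a]
closure nfa@(NFA _ ts _ _) states
    | all (`elem` states) new = states
    | otherwise               = closure nfa (nub (states ++ new))
  where
    new = [ to | EmptyTransition from to <- ts, from `elem` states ]

-- One consuming step on a single input character.
step :: Eq a => NFA a -> [a] -> Char -> [a]
step (NFA _ ts _ _) states c =
       [ to | Transition from c' to <- ts, from `elem` states, c == c' ]
    ++ [ to | ConditionalTransition from p to <- ts, from `elem` states, p c ]

-- Does the NFA accept the full input string?
accepts :: Eq a => NFA a -> String -> Bool
accepts nfa@(NFA _ ts start accepting) input =
    any (`elem` accepting) (go (closure nfa [start]) input)
  where
    go states (c : cs) = go (closure nfa (step nfa states c)) cs
    go states []       = -- At the end of the input, EOS transitions may fire.
        closure nfa (states ++ [ to | EOSTransition from to <- ts
                                    , from `elem` states ])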
The regular expression engine supports the following regular expression constructs:
- the basics: union (|), concatenation, literals, and the Kleene closure;
- the option operator ("0 or 1 times"): ?;
- character sets ("one of ..."): [a-zA-Z0-9_!];
- negative character sets ("none of ..."): [^a-z];
- escaping of control characters ("literal '?'"): \?; and
- the end-of-string marker: $.
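To make the construction concrete, here is a sketch of Thompson's construction for the basic constructs over a hypothetical simplified Regex type (the additional constructs listed above are omitted):
data Regex = Lit Char
           | Cat Regex Regex   -- concatenation
           | Alt Regex Regex   -- union
           | Star Regex        -- Kleene closure

-- Compile a regex to an NFA with Int states and a single accepting state.
-- The second component of each result threads a supply of fresh states.
construct :: Regex -> NFA Int
construct re = fst (go re 0)
  where
    go (Lit c) n =
        (NFA [n, n + 1] [Transition n c (n + 1)] n [n + 1], n + 2)
    go (Cat r1 r2) n =
        let (NFA qs1 ts1 s1 [a1], n1) = go r1 n
            (NFA qs2 ts2 s2 [a2], n2) = go r2 n1
        in (NFA (qs1 ++ qs2) (EmptyTransition a1 s2 : ts1 ++ ts2) s1 [a2], n2)
    go (Alt r1 r2) n =
        let (NFA qs1 ts1 s1 [a1], n1) = go r1 n
            (NFA qs2 ts2 s2 [a2], n2) = go r2 n1
            (s, a) = (n2, n2 + 1)
        in ( NFA (s : a : qs1 ++ qs2)
                 ([ EmptyTransition s s1, EmptyTransition s s2
                  , EmptyTransition a1 a, EmptyTransition a2 a ]
                  ++ ts1 ++ ts2)
                 s [a]
           , n2 + 2 )
    go (Star r) n =
        let (NFA qs ts s1 [a1], n1) = go r n
            (s, a) = (n1, n1 + 1)
        in ( NFA (s : a : qs)
                 ([ EmptyTransition s s1, EmptyTransition s a
                  , EmptyTransition a1 s1, EmptyTransition a1 a ]
                  ++ ts)
                 s [a]
           , n1 + 2 )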
Many contemporary regular expression engines are implemented using some form of complex recursion, e.g. through parser combinators [cox2007regular]. As such, they are built using a construct able to recognize more expressive types of languages, such as the context-free grammar, as opposed to pure regular expressions. In contrast, NFAs and regular expressions are equivalent regarding the types of languages they can recognize. Thus, Thompson’s construction generally results in a better-performing implementation of regular expressions. For example, compiling the following regular expression
a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a
?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
and matching it with
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
takes 58 seconds using Python’s regular expression implementation on my machine, and only 0.7 seconds using the implementation created for this compiler.
Constructing recognizers
The following illustrates some of the recognizers used and how to construct them:
recognizers = [ ...
-- Comments
constructShortestRecognizer 15 "/\\*.*\\*/"
(\s -> TComment $ length $ filter (== '\n') s),
constructRecognizer 15 "//[^\r\n]*" (\s -> TComment 0),
-- Line ends
constructRecognizer 14 "\r?\n"
(\s -> TWhitespace WNewline),
-- Whitespace
constructRecognizer 14 "[ \f\t\v]+"
(\s -> TWhitespace WOther),
-- Punctuators
constructRecognizer 13 ";"
(\s -> TPunctuator PSeparator),
constructRecognizer 13 "\\("
(\s -> TPunctuator PParenOpen),
constructRecognizer 13 "\\)"
(\s -> TPunctuator PParenClose),
...
-- Keywords
constructRecognizer 10 "var" (\s -> TKeyword KVar),
constructRecognizer 10 "if" (\s -> TKeyword KIf),
constructRecognizer 10 "else" (\s -> TKeyword KElse),
...
-- Types
constructRecognizer 9 "Int" (\s -> TType TypeInt),
constructRecognizer 9 "Bool" (\s -> TType TypeBool),
constructRecognizer 9 "Char" (\s -> TType TypeChar),
constructRecognizer 9 "Void" (\s -> TType TypeVoid),
-- Operators
constructRecognizer 8 "\\+" (\s -> TOperator OPlus),
...
-- Fields
constructRecognizer 7 "\\.hd" (\s -> TField FHd),
constructRecognizer 7 "\\.tl" (\s -> TField FTl),
...
-- Constants
constructRecognizer 1 "True"
(\s -> TConstant $ CBool True),
constructRecognizer 1 "False"
(\s -> TConstant $ CBool False),
constructRecognizer 1 "[0-9]+"
(\s -> TConstant $ CInt (read s :: Int)),
...
-- Identifiers
constructRecognizer 0 "(\\w|_)(\\w|[_0-9])*"
(\s -> if Char.isUpper $ s!!0
then TClassIdentifier s
else TIdentifier s)
]
Parsing
Parsing SPL proceeds after the lexing phase. The input to the parser is the list of tokens. The parsing phase is aimed at transforming the tokens into a more direct representation of the program structure. For example, an expression using a binary operator will have associated with it the specific operator and two sub-expressions, where each sub-expression can be arbitrarily large, resembling a tree structure. As such, the parser transforms the tokens into an abstract syntax tree. It is implemented as a recursive-descent parser using parser combinators, provided by the Haskell library Parsec.
Abstract syntax tree
At a high level, the SPL abstract syntax tree consists of declarations (auxiliary source code includes, class declarations, variable declarations and function declarations), statements (such as control structures), expressions (actual calculations) and constants. The AST is represented naturally as an algebraic data type in Haskell, e.g.:
type SPL = [Decl]

data Decl' = DeclI IncludeDecl
           | DeclC ClassDecl
           | DeclV VarDecl
           | DeclF FunDecl

newtype IncludeDecl' = IncludeDecl String

data ClassDecl' = ClassDecl
    ClassIdentifier
    [VarDecl]
    [FunDecl]

data VarDecl' = VarDeclTyped
    Type
    Identifier
    Expression
    ...

type VarDecl = (VarDecl', Meta)

data FunDecl' = FunDeclTyped
    Identifier
    [Identifier]
    Type
    [Statement]
    ...

data Expression' = ExprIdentifier Identifier
                 | ExprField Expression [Field]
                 | ExprFunCall Expression [Expression]
                 | ExprConstant Constant
                 | ExprTuple Expression Expression
                 ...
The types are implemented to keep track of AST meta-information; namely the original source code position for error messages, and the resulting data type of the underlying tree:
type Decl = (Decl', Meta)
type ClassDecl = (ClassDecl', Meta)
type VarDecl = (VarDecl', Meta)
type FunDecl = (FunDecl', Meta)
type Expression = (Expression', Meta)
...

data Meta = Meta { metaPos :: Pos,
                   metaType :: Maybe Type.Type }
Parser implementation
Using parser combinators, the parser closely mimics the defined grammar. For example, 'while' '(' <Exp> ')' <Stmt> is implemented as follows:
pStatementWhile :: Parser AST.Statement
pStatementWhile =
(do
TP _ p <- tok (TKeyword KWhile)
condition <- pCondition
statements <- pStatement
return (AST.StmtWhile condition statements, AST.metaFromPos p)
) <?> "a while statement"
pCondition :: Parser AST.Expression
pCondition =
(do
tok (TPunctuator PParenOpen)
condition <- pExpression
tok (TPunctuator PParenClose)
return condition
) <?> "a parenthesized condition"
However, not all elements of the original grammar are translated so readily. The defined grammar is ambiguous; e.g. it does not inherently make explicit whether 1 + 2 * 4 should be parsed as (1 + 2) * 4 or as 1 + (2 * 4). As such, on top of the AST data structure we further define operator precedence and associativity rules:
binaryOperatorPrecedence' :: BinaryOperator' -> Int
binaryOperatorPrecedence' BinaryOpOr = 1
binaryOperatorPrecedence' BinaryOpBitwiseOr = 1
binaryOperatorPrecedence' BinaryOpAnd = 2
binaryOperatorPrecedence' BinaryOpBitwiseAnd = 2
binaryOperatorPrecedence' BinaryOpEq = 3
...

binaryOperatorAssociativity' :: BinaryOperator' -> Associativity
binaryOperatorAssociativity' BinaryOpOr = ALeft
binaryOperatorAssociativity' BinaryOpBitwiseOr = ALeft
binaryOperatorAssociativity' BinaryOpAnd = ALeft
binaryOperatorAssociativity' BinaryOpBitwiseAnd = ALeft
binaryOperatorAssociativity' BinaryOpEq = ALeft
...
for which we can implement an expression parser as:
pExpression' :: Int -> Parser AST.Expression
pExpression' 8 = pExprBase -- Base case
pExpression' precedence = do
    (expr, m) <- pExpression' (precedence + 1)
    mBOp <- optionMaybe (try (lookAhead pBinaryOperator))
    case mBOp of
        Just bOp | AST.binaryOperatorPrecedence bOp == precedence -> do
            _ <- pBinaryOperator -- Consume the binary operator
            expr' <- pExpression' precedence
            return (AST.ExprBinaryOp bOp (expr, m) expr', m)
        _ -> return (expr, m)
Pretty printing
Based on the abstract syntax tree, SPL source code can automatically be pretty printed. For example,
f(n,b)::Int Bool->Int{if(b){return n+1;}else{return n;}}main()::->Void{f(5,True);}
becomes
f (n, b) :: Int Bool -> Int
{
    if (b) {
        return n + 1;
    } else {
        return n;
    }
}

main () :: -> Void
{
    f(5, True);
}
Additionally, unnecessary parentheses are omitted:
Int n1 = (((1 + 2) + 3) + 4) + 5;
Int n2 = 1 + (2 + (3 + (4 + 5)));
Int n3 = ((1 * 2) + (3 * 4)) + 5;
Int n4 = 1 * ((2 + 3) * (4 + 5));
becomes
Int n1 = 1 + 2 + 3 + 4 + 5;
Int n2 = 1 + (2 + (3 + (4 + 5)));
Int n3 = 1 * 2 + 3 * 4 + 5;
Int n4 = 1 * ((2 + 3) * (4 + 5));
Typing
SPL is a strongly-typed language. However, unlike strongly-typed languages such as C++, the programmer is not required to annotate the source code with variable and function type definitions. The SPL compiler has a powerful type inference system based on type unification. In many cases a programmer does not need to supply any type annotations in an SPL program at all.
Type system
SPL has well-defined typing rules. The type system is based on the Hindley-Milner system [milner1978theory], allowing for polymorphism and inference to find the most general type. The rules state what the resulting type is given the input types of an expression, statement, etc. Any expression, statement, etc., without a corresponding typing rule is illegal.
For example, given an element a of type A, the list [a] will be of type [A]. Typing SPL source code entails proving that the types of statements, expressions, etc., are consistent using these rules. The rules can be formalized in a natural deduction style:
-------- int Formation
int type
--------- bool Formation
bool type
--------- char Formation
char type
--------- void Formation
void type
A type
-------- [] Formation
[A] type
A type B type
--------------- (,) Formation
(A, B) type
A type
------- (*) Formation
A* type
A1 type A2 type ... B type
--------------------------- (->) Formation
A1 A2 ... -> B type
===========================
b in {True, False}
------------------ bool
G |- b : bool
i in {..., -1, 0, 1, ...}
------------------------- int
G |- i : int
...
G |- e1 : int G |- e2 : int
----------------------------- <=
G |- e1 <= e2 : bool
G |- e : A
------------ &
G |- &e : A*
G |- e : A*
----------- *
G |- *e : A
...
Type inference
The type system used allows for automatic inference to the most general type. Type inference is implemented using Algorithm M, which is a top-down algorithm using unification to solve type constraints. The type constraints come from the AST in combination with the defined type rules (see the Type system section above).
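The unification at the core of this can be sketched standalone as follows (simplified; the compiler's own mgu runs inside the inference monad and also covers function and class types):
import qualified Data.Map as Map

data Type = TVar String | TInt | TBool | TChar | TVoid
          | TList Type | TTuple Type Type | TPointer Type
          deriving (Eq, Show)

type Subst = Map.Map String Type

-- Apply a substitution to a type.
apply :: Subst -> Type -> Type
apply s t@(TVar v)     = Map.findWithDefault t v s
apply s (TList t)      = TList (apply s t)
apply s (TTuple t1 t2) = TTuple (apply s t1) (apply s t2)
apply s (TPointer t)   = TPointer (apply s t)
apply _ t              = t

-- Most general unifier of two types, if one exists.
mgu :: Type -> Type -> Maybe Subst
mgu (TVar v) t = bind v t
mgu t (TVar v) = bind v t
mgu (TList t1) (TList t2)       = mgu t1 t2
mgu (TPointer t1) (TPointer t2) = mgu t1 t2
mgu (TTuple a1 b1) (TTuple a2 b2) = do
    s1 <- mgu a1 a2
    s2 <- mgu (apply s1 b1) (apply s1 b2)
    return (Map.map (apply s2) s1 `Map.union` s2)
mgu t1 t2
    | t1 == t2  = Just Map.empty
    | otherwise = Nothing

-- Bind a type variable, guarding against infinite types (occurs check).
bind :: String -> Type -> Maybe Subst
bind v t
    | t == TVar v    = Just Map.empty
    | v `occursIn` t = Nothing
    | otherwise      = Just (Map.singleton v t)
  where
    occursIn x (TVar y)       = x == y
    occursIn x (TList u)      = occursIn x u
    occursIn x (TTuple u1 u2) = occursIn x u1 || occursIn x u2
    occursIn x (TPointer u)   = occursIn x u
    occursIn _ _              = False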
For example, the reference and dereference rules shown above are implemented as follows:
tInfUnaryOp t (AST.UnaryOpDereference, m) e = do
    metaMGU m t
    tInfExpr (TPointer t) e
tInfUnaryOp t (AST.UnaryOpReference, m) e =
    case valueType e of
        PersistentValue -> do
            t' <- newTypeVar "ref"
            tInfExpr t' e
            void $ mgu m t (TPointer t')
        _ -> throwError $
            TInfError
                TInfErrorPersistentValueRequired
                (AST.metaPos m)
Here, the metaMGU statements are used to enable annotating the AST with type information. They are not actually part of the type inference itself. We see that the type inference for the unary reference operator is based on an additional “meta type” of the expression, namely its value type. This is expanded upon in the next section.
Value types
SPL provides pointers, with referencing and dereferencing, and assignment to expressions. These constructs make use of the address of a value in memory. The address of a value only makes semantic sense if the value exists for longer than the expression it is used in; i.e., it should be persistent over multiple statements. As such, each expression has an associated value type that is either persistent or temporary.
Intuitively, an expression has a persistent value type if it is derived from a named entity, and has a temporary value type otherwise. Value types are, with some abuse of notation, inductively defined as follows:
-------------------------
<identifier> : Persistent
<expr> : V
-----------
*<expr> : V
<expr> : V
-----------
&<expr> : V
<expr> : V
-------------------
(<type>) <expr> : V
<expr> : V
-----------------------
<expr>.<identifier> : V
otherwise:
------------------
<expr> : Temporary
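In the compiler this amounts to a straightforward recursion over expressions. A sketch, where the ExprUnaryOp and unary-operator constructor names are assumptions extrapolated from the AST fragments shown earlier:
data ValueType = PersistentValue | TemporaryValue

valueType :: AST.Expression -> ValueType
valueType (AST.ExprIdentifier _, _)                          = PersistentValue
valueType (AST.ExprUnaryOp (AST.UnaryOpDereference, _) e, _) = valueType e
valueType (AST.ExprUnaryOp (AST.UnaryOpReference, _) e, _)   = valueType e
valueType (AST.ExprUnaryOp (AST.UnaryOpCast _, _) e, _)      = valueType e
valueType (AST.ExprField e _, _)                             = valueType e
valueType _                                                  = TemporaryValue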
Dependency analysis
The type system used allows for polymorphism, but this requires that AST types be inferred to their most general type. As such, it is undesirable to solve the entire program as a single set of constraints, as this would destroy the ability to have polymorphism.
Instead, type constraint unification is performed in blocks of mutually dependent declarations, after which the found types in those blocks are generalized. Generalization transforms the free type variables in the found most general types into bound type schemes, or “forall types.”
The dependencies of a declaration are found by traversing the AST and noting all other declarations it uses, thereby building a graph of dependencies. The strongly connected components are identified in this graph, thus finding all declarations that are mutually dependent. Then, the groups of mutually dependent declarations are topologically ordered, such that groups containing declarations are processed and generalized before groups that depend on those declarations.
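Haskell's containers library provides the heavy lifting here; a sketch, assuming each declaration is paired with the identifier it declares and the identifiers it uses:
import Data.Graph (SCC (..), stronglyConnComp)

-- Group declarations into strongly connected components of mutually
-- dependent declarations, ordered so that dependencies come first.
dependencyGroups :: [(AST.Decl, String, [String])] -> [[AST.Decl]]
dependencyGroups decls = map unpack (stronglyConnComp decls)
  where
    unpack (AcyclicSCC d) = [d]
    unpack (CyclicSCC ds) = ds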
AST annotation
The AST is annotated with the found types, such that the code generator has access to type information of any AST element.
Example
The following unannotated code
var x = 0;
eq (x, y)
{
return x == y;
}
sum (x, y)
{
return x + y;
}
getX ()
{
return x;
}
main()
{
var a = eq(1, 2);
var b = eq('m', 'n');
var c = eq(True : [], b : []);
}
is automatically typed as follows
Int x = 0;

eq (x, y) :: a a -> Bool
{
    return x == y;
}

sum (x, y) :: Int Int -> Int
{
    return x + y;
}

getX () :: -> Int
{
    return x;
}

main() :: -> Void
{
    Bool a = eq(1, 2);
    Bool b = eq('m', 'n');
    Bool c = eq(True : [], b : []);
}
Code generation
After typing has been completed successfully, the resulting AST makes semantic sense within the rules of the language, and can be transformed into machine code. Here, the AST is transformed into SSM assembly. Each line of SSM assembly contains an instruction and optionally a label and a comment. The comments are for annotation purposes only and otherwise ignored. The labels are resolved to numeric constants by the SSM assembler.
Intermediate SSM representation
It would be possible to output the code generated from the AST directly in the plaintext assembly format. However, this is prone to mistakes and makes post-processing difficult. Instead, the AST is first compiled to an intermediate SSM representation, which is then compiled to plaintext. The SSM code is represented as a list of SSM lines. Lines are defined as:
data SSMLine = SSMLine
(Maybe SSMLabel)
(Maybe SSMInstruction)
(Maybe SSMComment)
with:
type SSMLabel = String
type SSMComment = String
data SSMInstruction = ILoad SSMLoad
| IStore SSMStore
| ICompute SSMOperation
| IControl SSMControl
| IIO SSMIO
data SSMLoad = LConstant SSMArgument
| LStack SSMArgument
| LHeap SSMArgument
| LHeapMultiple SSMArgument SSMArgument
...
...
data SSMArgument = ALabel SSMLabel
| ARegister SSMRegister
| ANumber Int
| AChar Char
| AText String
| AColor String
Stack-based
The SSM is, as its name suggests, stack-based, and as such code generation proceeds in a simple stack-based manner; i.e., evaluation of expressions is implemented akin to reverse Polish notation.
For example, evaluating a binary operation expression is performed by evaluating the first expression, evaluating the second expression, and then executing the operator instruction.
genExpression (AST.ExprBinaryOp op e1@(_, m1) e2@(_, m2), _) = do
case AST.metaType m1 of
Just t1 ->
case AST.metaType m2 of
Just t2 -> do
genExpression e1
genExpression e2
genBinaryOp op t1 t2
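For instance, the expression 1 + 2 * 3 compiles to roughly the following sequence:
ldc 1 ; push constant 1
ldc 2 ; push constant 2
ldc 3 ; push constant 3
mul ; 2 * 3
add ; 1 + (2 * 3)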
However, the SSM also provides a heap and multiple registers. The registers are largely ignored in generated code, though some are used implicitly by instructions to control the flow through the program code and stack. The only exception is the return register, which is used explicitly to set and retrieve function results. The heap is used in dynamic memory allocation.
Caller / callee convention
It is important to clearly define the program flow when generating code. Perhaps most apparent is the need to set up a clear function caller / callee convention. When calling a function, the function frame has to be set up, and the frame should be cleared upon leaving the function. However, there are multiple ways this can be achieved. For example, either the calling or the called code could reserve stack space for the function locals, and either the calling or the called code could subsequently clear the stack space.
Here, the convention is taken that the calling code evaluates the argument expressions in reverse order, places the return address on the stack, and finally branches to the called code. Upon returning, the calling code cleans the evaluated argument expressions from the stack. Upon entering the called code, stack space is allocated for any additional locals. Prior to leaving the called code, any objects with a class type (including objects passed as arguments) are destructed, the reserved space for locals is cleared, and the return address is popped from the stack.
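In pseudo-SSM assembly, a call f(e1, e2) under this convention looks roughly as follows (a sketch; the destruction of class-typed objects is omitted):
; calling code
[code evaluating e2] ; arguments are evaluated in reverse order
[code evaluating e1]
bsr f ; push the return address, branch to f
ajs -2 ; clean the two arguments off the stack

; called code
f: link n ; save the frame pointer, reserve space for n locals
[function body]
str RR ; store the result in the return register
unlink ; clear the locals, restore the frame pointer
ret ; pop the return address and jump to it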
Inspecting the stack for the following program right before returning from f, we see:
var g1 = 0xBAAAAAAD;
var g2 = True;

f (a, b, c)
{
    var d = 0xCAFED00D;
    {
        var g2 = 0x99887766;
        var a = 0x11223344;
    }
}

main ()
{
    f(0xDEADBEEF, [], 0xCAFEBABE);
}
Address | Value | Register | Description |
---|---|---|---|
00000079 | ffffffff | | Global g2 |
0000007a | baaaaaad | | Global g1 |
0000007b | 00000006 | | Ret. addr. from main (to halt) |
0000007c | 00000078 | | Prev. frame pointer |
0000007d | cafebabe | | Arg. c |
0000007e | 000007d1 | | Arg. b |
0000007f | deadbeef | | Arg. a |
00000080 | 0000005f | | Ret. addr. from f (to main) |
00000081 | 0000007c | FP | Prev. frame pointer |
00000082 | cafed00d | | Local 1, d |
00000083 | 99887766 | | Local 2, g2 |
00000084 | 11223344 | SP | Local 3, a |
Global variables
Global variables are evaluated at the start of the program, and their results are stored sequentially on the stack. As they are evaluated first, their values remain at the bottom of the stack throughout the program. To access these variables, a label __end_pc is added to the end of the program code. This label is a constant offset 0x12 lower than the bottom of the stack. As such, a global can be loaded by first putting the label value on the stack, and then loading from the pointed-to address with a static offset based on which global should be accessed (0 for the first global variable, 1 for the second, etc.). In pseudo-SSM assembly:
ldc __end_pc
lda (0x12 + global_offset)
Order of global variables
As global variables can depend on other global variables (but mutual dependence between global variables is not supported), the code generator uses the reverse topologically ordered AST. This guarantees that all global variables are declared and initialized before the global variables depending on them.
Tuples
When evaluating a tuple, the tuple contents are placed on the heap, and the tuple address is placed on the stack. The tuple data looks as follows on the heap:
Address offset | Value |
---|---|
0 | Tuple element 2 |
1 | Tuple element 1 |
Lists
Lists are implemented as linked lists. Similar to tuples, a linked list stores its values on the heap:
Address offset | Value |
---|---|
0 | List element value |
1 | Address of list tail |
An empty list has the next address set to -1.
Value copying
Values of expressions can be copied on the stack, which means a new value is created from the value of the expression. For expressions of all types except a class type, copying entails simply evaluating the expression, thereby placing the value on the stack. For expressions of a class type, however, the “value” of the object does not make semantic sense. Should it be its first member’s value, or something else? Instead, a full copy of the object is placed on the stack. Expression values are copied in three cases:
- The value is assigned to an expression.
- The value is passed to a function.
- The value is returned from a function.
Note, though, that this last case is not yet supported in the code generation, as the return register used for returning values can only pass values of a single word.
Memory
SPL provides fine-grained control over memory usage. The heap can be managed explicitly using memory allocation functions:
malloc :: Int -> a*
free :: a* -> Void
The function malloc allocates a block of consecutive memory of at least the requested number of words (or returns a null pointer if no memory is available). Conversely, free frees the pointed-to memory that was previously allocated.
This allows for efficient implementation of arrays, without the overhead incurred by lists.
var length = 50;
var arr = malloc(length);
//...
var idx = 0;
while (idx < length) {
    println(*(arr &+ idx));
    idx = idx + 1;
}
free(arr);
As this will likely be a common use-case, subscript notation is provided. The previous code can be written more succinctly as:
var length = 50;
var arr = malloc(length);
//...
var idx = 0;
while (idx < length) {
    println(arr[idx]);
    idx = idx + 1;
}
free(arr);
Buddy block system
The memory allocator is based on the binary buddy block system (see e.g. [peterson1977buddy]). Here, a block of consecutive memory is divided into equal-sized fragments. Each fragment has a buddy, with which it forms a larger block of memory. The larger blocks of memory similarly have a buddy, recursively, until the root is reached. Each memory allocation request receives a block of memory that is at least as large as the requested memory, but will likely be larger, as it must be a power of two. Thus, if the smallest fragment size is, e.g., 2^5 words, and the total memory size is 2^20 words, a maximum of 2^15 objects can be held in memory.
For example, for 64 words of memory with fragments of 8 words, the memory tree would look as follows, where the header represents the memory offset, and the cells the block size:
0 | 8 | 16 | 24 | 32 | 40 | 48 | 56 |
---|---|---|---|---|---|---|---|
64 | | | | | | | |
32 | | | | 32 | | | |
16 | | 16 | | 16 | | 16 | |
8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
Implementation
The buddy block system keeps track of buddy block allocations, or splits.
When allocating memory, the algorithm checks whether a large enough consecutive block is available. If not, it returns a null pointer. Otherwise, it starts from the root node. If the node is larger than needed for the requested memory and is not yet split, the node is split. The algorithm travels to the child with the least consecutive memory left that still fits. Once the smallest fitting block is found, the system updates the block allocations and returns a pointer to the start of the block.
When freeing memory, the system first frees the block pointed to, and subsequently recursively travels up the tree, re-merging two child blocks if they are now both free.
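As a sketch of this bookkeeping, here is a strongly simplified Haskell version using a binary tree of blocks (the standard library version is written in SPL itself and descends into the best-fitting child, as described above):
data Buddy = Free Int Int        -- offset and size of a free block
           | Allocated Int Int   -- offset and size of an allocated block
           | Split Buddy Buddy   -- a block split into two buddies

-- Try to allocate req words, returning the block offset and updated tree.
allocate :: Int -> Buddy -> Maybe (Int, Buddy)
allocate req (Free off size)
    | size < req  = Nothing
    | half >= req = -- Still too large: split and retry within the halves.
        allocate req (Split (Free off half) (Free (off + half) half))
    | otherwise   = Just (off, Allocated off size)
  where half = size `div` 2
allocate req (Split l r) =
    case allocate req l of
        Just (off, l') -> Just (off, Split l' r)
        Nothing        -> case allocate req r of
            Just (off, r') -> Just (off, Split l r')
            Nothing        -> Nothing
allocate _ (Allocated _ _) = Nothing

-- Free the block starting at ptr, re-merging buddies where possible.
free :: Int -> Buddy -> Buddy
free ptr (Allocated off size) | ptr == off = Free off size
free ptr (Split l r) =
    case (free ptr l, free ptr r) of
        (Free off size, Free _ size') | size == size'
            -> Free off (size + size')  -- Both halves free: merge them.
        (l', r') -> Split l' r'
free _ b = b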
The algorithm is implemented in the SPL standard library.
Classes
A class in SPL is a collection of variable members and methods. An object can be instantiated from a class by using the class constructor. An object can either be instantiated dynamically on the heap, or in-place on the stack. An object instantiated on the stack is automatically destructed using its class destructor when the object falls out of scope. When a class object is copied, the new object is instantiated through the class copy constructor.
New and delete
A class object can be created dynamically on the heap by using the new keyword in front of the regular constructor expression. Note that, unlike in languages such as C++, the new keyword does not require some "default" constructor; the defined class constructor can be used with dynamic parameters.
A class object created dynamically through the new keyword must be manually destructed using the delete keyword. Note that delete is currently defined for any pointer, but using delete on a non-object pointer results in undefined behavior. An effective solution is already available in the type checker and code generator, where an object can be determined to be of a variable class type, e.g. Class (TVar "a") instead of Class (TClassIdentifier "Car"). However, no SPL syntax has been implemented for this yet.
Members and methods
Members of class objects are stored in the object space (either on the heap or in the stack). Currently object members cannot have default values; they must be initialized using the copy constructor. Methods of class objects are shared by all objects of that class, and are located statically in the program code. When a class method is called, the called code receives an argument self which is a pointer to the specific object.
Type inference
Type inference for classes poses a problem. The methods and members of a class can depend on each other, and further other functions or methods and members of other classes can depend on the class as well. However, without typing the AST it is not immediately clear on which methods and members a piece of code is dependent. For example, to know the code obj->get() is dependent on a method SomeClass.get as opposed to SomeOtherClass.get would require knowing obj is of type SomeClass. This could be solved by typing the AST in two passes, where in the second pass enough information should be available for full dependency analysis.
Instead, however, we opted for a simpler approach: in dependency analysis, once a class identifier is named explicitly, the code is deemed dependent on all methods and members of that class. Note that by nature of class method and member aliasing, or "namespacing," any code using a member or method specific to a class must have previously named that class explicitly.
Type frames
To ensure class copy constructors and destructors can be called without knowing the specific class in the code generator, each class object contains a pointer to a location in the program code. This pointer can be seen as the object’s class type frame. The program code location contains specific instructions to load basic information of the object to the stack, such as the object’s size, and the addresses of the copy constructor and destructor.
As such, when the code generator is required to generate code for deleting an object, but that specific area of code does not know the class type of the object, the type frame can be used to load the destructor address.
Address offset | Value |
---|---|
0 | Instruction to load object’s size to stack |
3 | Instruction to load address of object’s copy constructor to stack |
6 | Instruction to load address of object’s destructor to stack |
Standard library
To make SPL more useful, a standard library is provided. The library contains functionality such as the buddy block memory management system, smart pointers, and some utility functions. The current library is distributed over four files:
- std.spl
- stdbuddyallocator.spl
- stdsmartpointers.spl
- stdmath.spl
Linking
To facilitate a standard library, some rudimentary form of code linking must be available. In the current version of SPL and its compiler this is solved straightforwardly. The grammar is extended to allow for include statements; e.g. std.spl contains:
#include "stdlib/stdmath.spl"
#include "stdlib/stdbuddyallocator.spl"
#include "stdlib/stdsmartpointers.spl"
// ...
On compilation, the separate source files are parsed and type-checked independently. Code including another source receives the type-inferred type schemes of that source. On code generation, the generated ASTs are concatenated.
Smart pointers
For more automatic memory management and to prevent programmer mistakes resulting in memory leaks or dangling references, a reference counting smart pointer class is implemented in the standard library. This class wraps around a dynamically allocated object and leverages the copy constructor and the automatic destructor called when objects fall out of scope to keep track of the references to the wrapped object.
On constructing a smart pointer object it creates a dynamically allocated Count object. On copying a smart pointer object, the references to the count and wrapped object are shared, and the count is incremented. On destructing a smart pointer object, the shared count is decremented. Once the count reaches 0, the count and wrapped object are destructed. As a result, the programmer is no longer required to manually call delete on their objects.
For example, the following code will keep track of the references to the Vehicle object, even after the code returns from fun. Only once the last smart pointer wrapping the Vehicle object falls out of scope will the object be destructed.
fun()
{
    var p = SP(new Vehicle(4, 6, 250));
    otherFun(p);
    ((Vehicle*) p.p)->doSomething();
    // ...
}
Higher-order functions
SPL has rudimentary support for higher-order functions. Global functions can be assigned to variables and passed around as expressions. Support for higher-order functions was a natural precursor to implementing classes, due to the need to assign values to the addresses of persistent-value-typed expressions, and later the need to dynamically call functions based on expression results (i.e., calling an object method). However, no anonymous or nested functions are supported.
The following program outputs 1, 2, 3, 4, ..., 19, 20.
var g = 0;

add(n)
{
    g = g + 1;
    return n + g;
}

mapArray(f, array, length) {
    var idx = 0;
    while(idx < length) {
        array[idx] = f(array[idx]);
        idx = idx + 1;
    }
}

main() {
    var length = 20;
    var array = malloc(length);
    mapArray(add, array, length);
    var idx = 0;
    while(idx < length) {
        println(array[idx]);
        idx = idx + 1;
    }
}
SSM assembly post-processing
The SSM assembly resulting from code generation can contain unnecessary inefficiencies. Some of these inefficiencies stem from the recursive, and separated, nature of the code generation. For example, in the code generation of an if-else statement, a label is generated pointing to the end of the else code. We cannot simply emit a raw label without an operation, as the statement generated after the if-else statement may itself generate a label at its start. This would result in illegal SSM assembly: lbl1: lbl2: .... Instead, the if-else statement generates a label to a nop (a no-operation instruction), resulting in wasted cycles:
lbl1: nop
lbl2: ...
In post-processing of the SSM assembly some of these inefficiencies can be optimized out while maintaining semantics of the original program. The following optimizations are carried out in the given order:
- adjacent stack pointer adjustments, where the second adjustment does not contain a label, are merged;
- stack pointer adjustments set to adjust the stack by 0 are changed into a nop operation; and
- nop operations are removed; if a nop is labeled and the next operation is not, the label is moved to the next operation. If the next operation is also labeled, the label clash is resolved by renaming the nop label to the label of the second operation throughout the program code (a sketch of this pass follows the example below).
For example,
__lbl_0: nop
ldl 1; load local a
ldc 5
lt
brf __lbl_1
ldl 1; load local a
ldc 1
add
stl 1; assign local a
bra __lbl_0
__lbl_1: nop
__lbl_2: nop
ldl 2; load local b
ldc 5
lt
brf __lbl_3
ldl 2; load local b
ldc 1
add
stl 2; assign local b
bra __lbl_2
__lbl_3: nop
would become
__lbl_0: ldl 1; load local a
ldc 5
lt
brf __lbl_2
ldl 1; load local a
ldc 1
add
stl 1; assign local a
bra __lbl_0
__lbl_2: ldl 2; load local b
ldc 5
lt
brf __lbl_3
ldl 2; load local b
ldc 1
add
stl 2; assign local b
bra __lbl_2
__lbl_3: ...
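The third optimization pass can be sketched over the intermediate representation given earlier as follows, where isNop is a hypothetical predicate for the nop instruction (its exact constructor is not shown above) and the label-clash renaming case is omitted:
removeNops :: [SSMLine] -> [SSMLine]
removeNops (SSMLine mLbl (Just instr) cmt : rest)
    | isNop instr = case (mLbl, rest) of
        -- An unlabeled nop can simply be dropped.
        (Nothing, _) -> removeNops rest
        -- A labeled nop followed by an unlabeled line: move the label.
        (Just lbl, SSMLine Nothing instr' cmt' : rest') ->
            removeNops (SSMLine (Just lbl) instr' cmt' : rest')
        -- Label clash (or end of program): left to the renaming pass.
        _ -> SSMLine mLbl (Just instr) cmt : removeNops rest
removeNops (line : rest) = line : removeNops rest
removeNops []            = []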
Known issues
Currently, type inference with programmer-given annotations is bugged: if a programmer uses a type variable annotation, such as a, twice at different locations, the type inference process will attempt to unify those types. Additionally, tuples and lists are not yet implemented as classes. As such, they do not use dynamic memory management yet, and cannot be garbage collected.
Source code and statistics
The source code is available at https://github.com/tomcur/spl-compiler. Lines of code:
- Data structures: 961
- Regex: 515
- Lexer: 248
- Parser: 551
- Type checker: 1573
- Code generator: 1154
- [cox2007regular]: Cox, R. (2007). Regular expression matching can be simple and fast (but is slow in Java, Perl, PHP, Python, Ruby, ...).
- [SSM]: Dijkstra, A. (n.d.). Simple stack machine [Computer software manual]. Retrieved from http://www.staff.science.uu.nl/~dijks106/SSM/
- [milner1978theory]: Milner, R. (1978). A theory of type polymorphism in programming. Journal of Computer and System Sciences, 17(3), 348–375.
- [peterson1977buddy]: Peterson, J. L., & Norman, T. A. (1977). Buddy systems. Communications of the ACM, 20(6), 421–431.
- [thompson1968programming]: Thompson, K. (1968). Programming techniques: Regular expression search algorithm. Communications of the ACM, 11(6), 419–422.
- [xing2004minimized]: Xing, G. (2004). Minimized Thompson NFA. International Journal of Computer Mathematics, 81(9), 1097–1106.