\section{Alternative frontends/backends}
This section contains projects that do not modify the language features of \langname,
but change the implementation of a part of the \langname compiler frontend or backend.
\subsection{Code formatter (1)}
Build a code formatter for the Amy language.
A straightforward way to accomplish this would be to add a special mode
(e.g.\ \lstinline{--format}) that the user can start the compiler in.
It would then only run the existing pipeline up to, say, parsing and subsequently go
through a special pretty-printing phase that outputs the program according to code style
rules configurable by the user.
%
A more sophisticated version could instead work at the token level, allowing your
formatter to be aware of whitespace, e.g., respecting newlines that a user inserted.
%
In any case, you will have to maintain comments (which are not part of the AST).
You can look at \lstinline{scalafmt} for some inspiration.
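To make the token-level idea concrete, here is a small sketch in Python (the token shape \lstinline{(kind, text, blank_lines_before)} is purely hypothetical): the formatter re-emits tokens with its own spacing and indentation, but keeps blank lines the user inserted.

```python
# Token-level formatting sketch (hypothetical token shape):
# each token is (kind, text, blank_lines_before).
def format_tokens(tokens, indent_width=2):
    out, depth, line = [], 0, []

    def flush():
        if line:
            out.append(" " * (indent_width * depth) + " ".join(line))
            line.clear()

    for kind, text, blanks in tokens:
        if blanks > 0:          # respect user-inserted blank lines
            flush()
            out.append("")
        if text == "}":
            flush()
            depth -= 1
        line.append(text)
        if text in ("{", "}", ";"):
            flush()
            if text == "{":
                depth += 1
    flush()
    return "\n".join(out)
```

A real formatter would additionally decide spacing per token kind (no space before \lstinline{(}, for example) and re-attach comments, which this sketch ignores.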
\subsection{Language Server (2)}
Implement a language server for Amy. VSCode and similar IDEs use the \href{https://en.wikipedia.org/wiki/Language_Server_Protocol}{language server protocol} to communicate with compilers and thereby provide deep integration with a variety of programming languages. Among other things, this enables features like showing type-checking errors within the editor, jumping to definitions and looking up all usages of a given definition. Your goal is to provide such functionality for Amy by implementing an additional mode in your compiler, in which it acts as a server for the language server protocol. Your implementation should be demonstrable with VSCode and include the aforementioned features. You can use an existing library, such as \href{https://github.com/eclipse/lsp4j}{LSP4J}, to simplify your task.
\subsection{Formalization of Amy (1)}
Develop an operational semantics for Amy and use your definitions along with Amy's typing rules to prove type safety. Note that this might require you to do some additional reading on type systems.
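For instance, with a small-step semantics you would define reduction rules such as
\[
\texttt{if(true) \{ } e_1 \texttt{ \} else \{ } e_2 \texttt{ \}} \;\longrightarrow\; e_1
\]
(the notation here is only a suggestion), and then prove \emph{progress} (a well-typed expression is either a value or can take a step) and \emph{preservation} (if $e$ has type $T$ and $e \longrightarrow e'$, then $e'$ has type $T$); together these give type safety.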
\subsection{JVM backend (2)}
Implement an alternative backend for \langname which outputs JVM bytecode.
You can use \href{https://github.com/psuter/cafebabe}{this library}.
You first have to think about how to represent \langname values in a class-based environment,
and then generate the respective bytecode from \langname ASTs.
\subsection{C backend (3)}
Implement an alternative backend for \langname which outputs C code.
You have to think about how to represent \langname values in C,
and then generate respective C code from \langname ASTs.
\newcommand{\CEGIS}{\textsf{CEGIS}}
\newcommand{\TerminalRule}{\textsf{Terminal}}
\newcommand{\Search}{\textsf{Search}}
\newcommand{\Verify}{\textsf{Verify}}
\newcommand{\Enumerate}{\textsf{Enumerate}}
\newcommand{\from}{\mathbin{\leftarrow}}
\newcommand{\union}{\mathbin{\cup}}
\newcommand{\expt}{\mathcal{E}}
\newcommand{\Expansions}{\mathcal{E}}
\newcommand{\prob}[1]{\operatorname{Pr}[#1]}
\newcommand{\R}{\mathbb{R}}
\newcommand{\Land}{\bigwedge}
\newcommand{\cost}{\operatorname{cost}}
\newcommand{\horizon}{h}
\newcommand{\score}{score}
\newtheorem{thm}{Theorem}[section]
\newcommand{\smartparagraph}[1]{\noindent\textbf{#1}}
\newcommand{\sparagraph}[1]{\noindent\textbf{#1}}
\newcommand{\TODO}[1]{\marginpar{\color{red}TODO}{\color{red}#1}\xspace}
% Name calling
\newcommand{\leon}{Leon\xspace}
\newcommand{\leonsyn}{LeonSyn\xspace}
\newcommand{\ourcegis}{STE\xspace} % DONE : find a non-ridiculous name
\newcommand{\ourca}{CA\xspace}
\newcommand{\andor}{\textsc{and/or}\xspace}
\newcommand{\insynth}{InSynth\xspace}
% General math
\newcommand{\ALL}[2]{\ensuremath{\forall #1 :~ #2}}
\newcommand{\EX}[2]{\ensuremath{\exists #1 :~ #2}}
\newcommand{\seq}[1]{\ensuremath{\bar{#1}}}
\newcommand{\seqa}{\seq{a}\xspace}
\newcommand{\seqx}{\seq{x}\xspace}
\newcommand{\seqt}{\seq{T}}
\newcommand{\seqg}{\seq{G}}
\newcommand{\seqr}{\seq{r}}
\newcommand{\varsof}[1]{\ensuremath{\text{vars}(#1)}}
\newcommand{\splus}{\ensuremath{\mathop{,}}} % separator in sequences
% Synthesis framework
\newcommand{\br}[4]{\ensuremath{\left\llbracket #1 \ \left\langle #2 \rhd #3 \right\rangle \ #4\right\rrbracket}}
\newcommand{\pg}[2]{\langle {#1} \mid {#2} \rangle}
\newcommand{\similar}[1]{\ensuremath{G(\textsf{#1})}}
\newcommand{\similarr}[2]{\ensuremath{G_{#2}(\textsf{#1})}}
\newcommand{\prename}{\ensuremath{P}\xspace}
\newcommand{\pcname}{\ensuremath{\Pi}\xspace}
\newcommand{\pgname}{\seqt}
\newcommand{\inputs}{\mathcal{I}}
\newcommand{\pgite}[3]{\ensuremath{\text{\textsf{if(}}#1\text{\textsf{) \{}}#2\text{\textsf{\} else \{}}#3\text{\textsf{\}}}}}
\newcommand{\pglet}[3]{\ensuremath{\text{\textsf{val}} \ #1 \colonequals #2 \text{\textsf{;}} \ #3}}
\newcommand{\match}[2]{\ensuremath{\text{#1\textsf{ match \{ }}#2\text{\textsf{\}}}}}
\newcommand{\mcase}[2]{\ensuremath{\text{\textsf{ case }}#1 \Rightarrow #2}}
\newcommand{\code}[1]{\text{\textsf{#1}}}
\newcommand{\guide}[1]{\ensuremath{\odot\mkern-4mu\left[#1\right]}}
\newcommand{\terminates}[1]{\ensuremath{\Downarrow\mkern-4mu\left[#1\right]}}
% Listing-like things.
\newcommand{\cl}[1]{\lstinline[mathescape]@#1@}
\newcommand{\clnoat}[1]{\lstinline[mathescape]!#1!}
\newcommand{\mcl}[1]{\ensuremath{\mathsf{#1}}}
% \newcommand{\mcl}[1]{\ensuremath{\text{\lstinline{#1}}}}
\newcommand{\choosesym}{\cl{choose}}
% Hoare triples
\newcommand{\HoareTriple}[3]{
\begin{displaymath}
\left\{\begin{array}{l}#1\end{array}\right\}
\begin{array}{l}#2\end{array}
\left\{\begin{array}{l}#3\end{array}\right\}
\end{displaymath}
}
\newcommand{\hoareTriple}[3]{\{$#1$\} $#2$ \{$#3$\}}
\newcommand{\btrue}{\mcl{true}}
\newcommand{\gpo}{::=}
\newcommand{\gnt}[1]{~#1~}
\newcommand{\gt}[1]{\text{\tt \textbf{~#1~}}}
\newcommand{\gtns}[1]{\text{\tt \textbf{#1}}}
\newcommand{\FIXME}[1]{ {\color{red} FIXME: #1}}
\newcommand\langname{Amy\xspace}
\section{Execution}
This section suggests projects that change how \langname code is executed.
\subsection{Memory deallocation (3)}
Allow explicit memory deallocation by the user.
\begin{lstlisting}
val x: List = Cons(1, Nil());
length(x); // OK
free(x);
length(x) // Wrong, might return garbage
\end{lstlisting}
When an object in linear memory is freed, the space it used to occupy
is considered free and can be allocated again.
Any further reference to the freed object is undefined behavior.
You need to change how memory allocation works in code generation
to maintain a list of free blocks,
which will now not be a contiguous part at the end of the memory.
The list should not be external,
but rather implemented in the memory itself:
each free block needs to contain a pointer to the next one.
Each block will also need to record its size.
This means that free blocks have to be of size at least 2 words.
When you allocate an object, you need to look through the list
of blocks for one that fits and if none does,
the program should fail.
Make sure you always modify the free list in the simplest way possible,
i.e. the blocks in the list don't have to be in the same order as in memory.
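As a sanity check for the bookkeeping, the scheme can be prototyped outside WebAssembly. Below is a minimal first-fit sketch in Python over a word-addressed toy memory; as a simplification, the block size is passed to \lstinline{free_block} instead of being read back from a header, which a real backend would do.

```python
# First-fit free-list allocator sketch: memory is word-addressed,
# every free block stores [size, next_free] in its first two words.
class FreeListHeap:
    def __init__(self, size):
        self.mem = [0] * size
        self.mem[0] = size      # initially one big free block
        self.mem[1] = -1        # no next block
        self.free = 0           # head of the free list

    def alloc(self, words):
        words = max(words, 2)   # freed blocks must fit a 2-word header
        prev, cur = None, self.free
        while cur != -1:
            size, nxt = self.mem[cur], self.mem[cur + 1]
            if size >= words:
                rest = size - words
                if rest >= 2:   # split off the remainder as a free block
                    self.mem[cur + words] = rest
                    self.mem[cur + words + 1] = nxt
                    nxt = cur + words
                if prev is None:
                    self.free = nxt
                else:
                    self.mem[prev + 1] = nxt
                return cur
            prev, cur = cur, nxt
        raise MemoryError("out of memory")  # the program should fail

    def free_block(self, addr, words):
        words = max(words, 2)
        self.mem[addr] = words          # prepend to the free list: the
        self.mem[addr + 1] = self.free  # list order need not match
        self.free = addr                # the order of blocks in memory
```

Allocating, freeing, and allocating again then reuses the freed block, exactly the behavior the paragraph above asks for.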
\subsection{Lazy evaluation (1-2)}
Change the evaluation strategy of Amy to lazy evaluation.
Only input and output are evaluated strictly.
\begin{lstlisting}
val x: Int = (Std.printInt(42); 0); // Nothing happens
val y: Int = x + 1 // Still nothing...
Std.printInt(y); // 42 and 1 are printed
val l: List = Cons(1, Cons(2, Cons(error("lazy"), Nil())));
// No error is thrown
l match {
case Nil() => () // At this point, we evaluate l just enough
// to know it is a Cons
case Cons(h1, Cons(h2, Cons(h3, _))) =>
// Matching the spine still throws no error...
Std.printInt(h3)
// ...but this forces evaluation of the third list element
// and the error is thrown!
case Cons(h, t) => Std.printInt(h) // Would print 1 if reached
}
// We can do neat things like define infinite lists, i.e. streams
def countFrom(start: Int): List = Cons(start, countFrom(start + 1))
Std.printString(L.listToString(
take(countFrom(0), 5)
)) // Will terminate and print `List(0, 1, 2, 3, 4)'
\end{lstlisting}
Each value is not evaluated until it is required.
Things that are not evaluated have to live in the runtime state as \emph{thunks},
or suspensions to be evaluated later.
A thunk is essentially a closure (see Section~\ref{closures}) with memoization:
it is either an already calculated value,
or an expression to be evaluated and an evaluation environment.
In turn, an evaluation environment is a mapping from identifiers to other thunks.
You have to make sure that pattern matching only evaluates expressions as much as needed.
This \href{https://en.wikibooks.org/wiki/Haskell/Laziness#Thunks_and_Weak_head_normal_form}{explanation of thunks and weak head normal form}
may help you understand the concept.
For simplicity, you can implement lazy evaluation directly in the interpreter, i.e., as an extension of the first lab (1 person).
If you implement lazy evaluation for the WebAssembly backend, you can work on this project as a team of two. (Note that this variant might be significantly harder.)
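In the interpreter variant, the thunk mechanism can be sketched in a few lines of Python; the zero-argument closure below stands in for the real \lstinline{(expression, environment)} pair.

```python
# A thunk is either an unevaluated computation or a cached value.
class Thunk:
    def __init__(self, compute):
        self.compute = compute   # zero-argument closure: expr + env
        self.value = None
        self.forced = False

    def force(self):
        if not self.forced:      # memoization: evaluate at most once
            self.value = self.compute()
            self.forced = True
            self.compute = None  # drop the closure, release its env
        return self.value

effects = []
x = Thunk(lambda: (effects.append(42), 0)[1])  # printInt(42); 0
y = Thunk(lambda: x.force() + 1)
# nothing has run yet
y.force()   # forces x: the side effect happens now, exactly once
y.force()   # cached: no second side effect
```

Forcing \lstinline{y} forces \lstinline{x} transitively, mirroring the \lstinline{printInt(y)} line in the example above.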
\subsection{Final code optimizations (1+)}
Optimize the WebAssembly binary produced by your \langname compiler.
The simplest thing you can do is eliminate some obvious redundancies such as
\begin{minipage}{0.49\textwidth}
\begin{lstlisting}
i32.const 0
if (result i32)
e1
else
e2
end
// equivalent to e2
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.49\textwidth}
\begin{lstlisting}
if (result i32)
i32.const 1
else
i32.const 0
end
// completely redundant
\end{lstlisting}
\end{minipage}
Preferably, you would implement a control-flow analysis and some abstract
interpretation to enable more advanced optimizations,
also involving locals. This would involve a larger group.
You can have a look at \href{https://cs420.epfl.ch/archive/18/s/acc18_07_optimizations.pdf}{these slides}
for some ideas on optimization.
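The first redundancy above amounts to a peephole rule ``constant condition followed by \lstinline{if}''. On a toy instruction encoding (hypothetical, not real wasm binary), such a pass looks like:

```python
# Peephole sketch on a toy instruction list. Instructions are either
# opaque strings or tuples: ("const", c) or ("if", then_body, else_body).
def peephole(code):
    out = []
    for ins in code:
        if (isinstance(ins, tuple) and ins[0] == "if"
                and out and isinstance(out[-1], tuple)
                and out[-1][0] == "const"):
            c = out.pop()[1]              # condition statically known:
            out.extend(peephole(ins[1] if c != 0 else ins[2]))
        else:                             # keep only the branch taken
            out.append(ins)
    return out
```

The same skeleton extends to other local rules (e.g.\ dropping an \lstinline{if} whose branches push the constants 1 and 0 onto the stack).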
\subsection{Tail call optimization (1)}
Implement tail call optimization for \langname.
Tail-recursive functions should not create any additional stack frames,
i.e., the recursive call should not use the \lstinline{call} instruction.
A way to implement tail recursive functions is to do a source-to-source transformation
which transforms tail recursive functions to loops. You will need to define new ASTs.
When it comes to tail calls that are not tail recursion,
things are tougher. If you feel like also handling those cases,
look \href{https://cs420.epfl.ch/archive/18/s/acc18_10_tail-calls.pdf}{here} for ideas.
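The effect of the source-to-source transformation can be shown by hand in Python: parameters become mutable variables, and each tail call becomes ``update the variables and loop again''.

```python
def fact_tailrec(n, acc=1):            # before: tail-recursive
    if n <= 1:
        return acc
    return fact_tailrec(n - 1, acc * n)

def fact_loop(n, acc=1):               # after: parameters become
    while True:                        # mutable locals; the tail call
        if n <= 1:                     # becomes "update and continue"
            return acc
        n, acc = n - 1, acc * n
```

In your compiler the loop would be expressed with the new AST nodes you define (e.g.\ a while-like construct), not with Python's \lstinline{while}.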
\subsection{Foreign-function interface (FFI) to JavaScript (2)}
Design a cross-language interaction layer between \langname and JavaScript.
At a minimum you should support calling JavaScript functions with primitive parameter- and
result-types from Amy. You can also consider supporting calls from JavaScript into Amy.
You will have to decide how WebAssembly representations of Amy objects should map to
JavaScript objects.
To ensure that programs can be meaningfully type-checked, you should add syntax for
\lstinline{external} functions, e.g.
\begin{lstlisting}
object FS {
external def open(path: String): Int
external def read(fd: Int): String
// ...
}
object Example {
val f: Int = FS.open("/home/foo/hello.txt") // open file
Std.printString(FS.read(f)) // print contents of hello.txt
}
\end{lstlisting}
Conversely, Amy functions exposed to JavaScript could be annotated with an \lstinline{export}
keyword.
Ideally you will also demonstrate your FFI's capabilities by wrapping some NodeJS or browser
APIs and exposing them to Amy.
For instance, you might expose the file system API of NodeJS, thus allowing Amy programs to
read from and write to files.
Another idea is to adapt the HTML wrapper file that we provide with the compiler and use the
FFI to write an interactive browser application in Amy.
A more sophisticated version of this project (for three people) would also support foreign
functions involving case classes such as \lstinline{List}.
\subsection{REPL: Read-Eval-Print Loop (3)}
Implement a REPL for \langname.
It should support defining classes, functions and local variables, and evaluating expressions.
You don't have to support redefinitions. You can take a look at the Scala REPL for inspiration.
\subsection{Virtual machine (3)}
Develop your own VM to run WebAssembly code!
To simplify things, you will implement the VM in Scala.
Your VM should take as input
a wasm \lstinline{Module} from amyc's \lstinline{CodeGen} pipeline
(so you don't need to implement a parser from wasm text or binary)
and execute the code contained within.
Despite using Scala, you still need to follow the VM execution model as much as possible:
translate labels to addresses,
use an array for the memory, a stack for execution etc.
You can set the VM parameters, such as memory size, any way you choose,
and hard-code built-in functions that are not already implemented in WebAssembly.
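At a very small scale, the execution model looks like this (toy instruction set, with labels already translated to absolute instruction addresses, as suggested above):

```python
# Minimal stack-machine sketch. Toy instructions: ("const", n),
# ("add",), ("local.get", i), ("br_if", target_address), ("end",).
def run(code, locals_):
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op[0] == "const":
            stack.append(op[1])
        elif op[0] == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op[0] == "local.get":
            stack.append(locals_[op[1]])
        elif op[0] == "br_if":         # labels were already resolved
            if stack.pop() != 0:       # to instruction addresses
                pc = op[1]
                continue
        elif op[0] == "end":
            break
        pc += 1
    return stack[-1] if stack else None
```

Your real VM would dispatch on amyc's wasm instruction classes instead of tuples, and would add linear memory as a byte array.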
\section{Language features}
Projects in this section extend \langname by adding a new language feature.
To implement one of these projects, you will probably need to modify
every stage of the compiler, from lexer to code generation.
If the project is too hard, you might be allowed to skip the code generation
part and only implement the runtime of your project in the interpreter.
\subsection{Imperative features (2+)}
With the exception of input/output, \langname is a purely functional language:
none of its expressions allow side effects.
Your task for this project is to add imperative language features to Amy.
These should include:
\begin{itemize}
\item Mutable local variables.
\begin{lstlisting}
var i: Int;
var j: Int = 0;
i = j;
j = i + 1;
Std.printInt(i);
Std.printInt(j) // prints 0, 1
\end{lstlisting}
Make sure your name analysis disallows mutating plain \lstinline{val}s.
\item While loops.
\begin{lstlisting}
def fact(n: Int): Int = {
var res: Int = 1;
var j: Int = n;
while(1 < j) {
res = res * j;
j = j - 1
};
res
}
\end{lstlisting}
\item \emph{Bonus:} Arrays.
You should support at least array initialization,
indexing and extracting array length.
If you add this feature,
you can add an additional member to the group.
\end{itemize}
\subsection{Implicit parameters (1)}
Much like Scala, this feature allows functions to take implicit parameters.
\begin{lstlisting}
def foo(i: Int)(implicit b: Boolean): Int = {
if (i <= 0 && !b) { i }
else { foo(i - 1) + i } // good, implicit parameter in scope
}
foo(1)(true); // good, argument explicitly provided
foo(1); // bad, no implicit in scope
implicit val b: Boolean = true;
foo(1); // good, implicit in scope
// equivalent to foo(1)(b)
implicit val b2: Boolean = false;
foo(1) // Bad, two Boolean implicits in scope.
\end{lstlisting}
When a function that takes an implicit parameter is called
and the implicit parameter is not explicitly defined,
the compiler will look at the scope of the call for an implicit
variable/parameter definition of the same type.
If exactly one such definition is found, the compiler
will complete the call with the defined variable/parameter.
If more than one or no such definitions are found,
the compiler will fail the program with
``implicit parameter conflict'' or ``no implicit found''
errors respectively.
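The resolution step itself is a simple filter over the scope. A Python sketch, with a hypothetical scope representation mapping names to \lstinline{(type, is_implicit)} pairs:

```python
# Implicit-parameter resolution sketch.
def resolve_implicit(scope, wanted_type):
    candidates = [name for name, (tpe, implicit) in scope.items()
                  if implicit and tpe == wanted_type]
    if len(candidates) == 0:
        raise TypeError("no implicit found for " + wanted_type)
    if len(candidates) > 1:
        raise TypeError("implicit parameter conflict for " + wanted_type)
    return candidates[0]    # complete the call with this definition

resolve_implicit({"b": ("Boolean", True)}, "Boolean")  # returns "b"
```

The interesting part of the project is threading the set of implicit definitions through name analysis and type checking so that this lookup sees exactly the definitions in scope at the call site.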
\subsection{Implicit conversions (1)}
Much like Scala, this feature allows specified functions to act as implicit conversions.
\begin{lstlisting}
implicit def i2b(i: Int): Boolean = { !(i == 0) }
2 || false // Good, returns true
def foo(b: Boolean): List = { ... }
foo(42) // Also good
1 + true // Bad, no implicit in scope.
implicit def b2s(b: Boolean): String = { ... }
1 ++ "Hello" // Bad, we cannot apply two conversions
\end{lstlisting}
An implicit conversion is a function with the qualifier \lstinline{implicit}.
It must have a single parameter.
At any point in the program, when an expression \lstinline{e} of type \lstinline{T1} is found
but one of type \lstinline{T2} is expected,
the compiler searches the current module for an implicit conversion
of type \lstinline{(T1) => T2}.
If exactly one such conversion \lstinline{f} is found,
the compiler will replace \lstinline{e} by \lstinline{f(e)}
(and the program typechecks).
If multiple such conversions are found,
the compiler fails with an ambiguous implicit error.
If none is found, an ordinary type error is emitted.
Only a single conversion is allowed to apply to an expression.
In the example above, we cannot implicitly apply
\lstinline{i2b} and then \lstinline{b2s} to get a \lstinline{String}.
\subsection{Tuples (1)}
Add support for tuples in \langname. You should support
tuple types, literals, and patterns:
\begin{lstlisting}
def maybeNeg(v: (Int, Boolean)): (Int, Boolean) = { // Type
v match {
case (i, false) => // pattern
(i, false) // literal
case (i, true) =>
(-i, false)
}
}
\end{lstlisting}
There are two ways you could approach this problem:
\begin{itemize}
\item Treat tuples as built-in language features. In this case,
you need to support tuples of arbitrary size.
\item Desugar tuples into case classes. A phase after
parsing and before name analysis will transform all tuples
to specified library classes, e.g. \lstinline{Tuple2, Tuple3} etc.
In this case, you cannot support tuples of arbitrary size,
but you still need to support all sizes up to, say, 10.
With this approach, you don't have to modify any compiler phases
from the name analysis onwards,
except maybe to print error messages that make sense to the user.
\end{itemize}
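The desugaring approach can be sketched on a toy AST (node shapes entirely hypothetical): every tuple node becomes a constructor call to the matching library class.

```python
# Desugar tuple literals into TupleN constructor calls on a toy AST:
# ("tuple", [elems]) -> ("call", "TupleN", [elems]); other nodes recurse.
def desugar(tree):
    if isinstance(tree, tuple) and tree[0] == "tuple":
        elems = [desugar(e) for e in tree[1]]
        if len(elems) > 10:
            raise SyntaxError("tuples of size > 10 are not supported")
        return ("call", "Tuple%d" % len(elems), elems)
    if isinstance(tree, tuple) and tree[0] == "call":
        return ("call", tree[1], [desugar(e) for e in tree[2]])
    return tree   # literals, identifiers, ...
```

Tuple types and tuple patterns would be rewritten by the same traversal, into the \lstinline{TupleN} type and its case-class pattern respectively.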
\subsection{Improved string support (1+)}
Improve string support for \langname.
As a starting point, you can add functionality like substring, length, and replace,
which will require you to write auxiliary WebAssembly or JavaScript code.
To avoid adding additional trees,
you can represent these functions as built-in methods in \lstinline{Std}.
If you want a more elaborate project for a larger group,
you can add \lstinline{Char} as a built-in type,
which opens the door for additional functionality with Strings.
You can also implement
\href{https://docs.scala-lang.org/overviews/core/string-interpolation.html}{string interpolation}.
In general, look at Java/Scala strings for inspiration.
\subsection{Higher-order functions (2+, challenging)}
\label{closures}
Add support for higher-order functions to \langname.
You need to support function types and anonymous functions.
\begin{lstlisting}
def compose(f: Int => Int, g: Int => Int): Int => Int = {
(x: Int) => f(g(x))
}
compose((x: Int) => x + 1, (y: Int) => y * 2)(5) // returns 11
def map(f: Int => Int, l: List): List = {
l match {
case Nil() => Nil()
case Cons(h, t) => Cons(f(h), map(f, t))
}
}
map( (x: Int) => x + 1, Cons(1, Cons(2, Cons(3, Nil()))) )
// Returns List(2, 3, 4)
def foo(): Int => Int = {
val i: Int = 1;
val res: Int => Int = (x: Int) => x + i
// Problem! How do we access i from within res?
res
}
foo()(42) // Returns 43
\end{lstlisting}
You have to think about how to represent higher-order functions at runtime.
In a bytecode setting,
a first approach is to represent a higher-order function as a pointer to
a named function, which is then called indirectly.
You have to read about tables and indirect calls in WebAssembly.
This works fine for \lstinline{compose} or \lstinline{map} above,
but not for \lstinline{foo}.
The problem is that higher-order functions can refer to variables in their scope,
like \lstinline{res} above refers to \lstinline{i}.
The set of those variables is called the \emph{environment} of the function.
If its environment is empty, the function is called \emph{closed}.
Above, we have no way to refer to \lstinline{i} from within \lstinline{res}
at runtime:
\lstinline{i} is in the frame of \lstinline{foo} which is not accessible in \lstinline{res}.
In fact, by the time we need \lstinline{i},
\lstinline{foo} may have returned and its frame disappeared!
The way to solve this problem is a technique called \emph{closure conversion}.
The idea is the following:
at runtime, a function is represented as a \emph{closure},
i.e. a function pointer along with the environment it captures from its scope.
When we create a closure at runtime, we create a pair of values in memory,
one of which points to the code (which will be a function)
and the other to the environment,
which will be a list of the captured variables.
When we call the function,
we really call the function pointer in the closure.
We need to make sure to extract and somehow pass to the function pointer
its environment from the other pointer.
You can find a detailed explanation of closure conversion
\href{https://cs420.epfl.ch/s/acc17_05_closure-conversion.pdf}{here}.
In the interpreter, things are simpler in both cases:
you can define a new value type \lstinline{FunctionValue}
which contains all necessary information.
In fact, you should probably start here as an exercise.
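A minimal sketch of that interpreter-side \lstinline{FunctionValue}, on a tiny hypothetical expression language: the value captures the environment in which the lambda was evaluated, which is exactly what makes the \lstinline{foo} example work.

```python
# Toy expressions: int | ("var", x) | ("add", a, b)
#                | ("lam", param, body) | ("app", f, arg)
class FunctionValue:
    def __init__(self, param, body, env):
        self.param, self.body, self.env = param, body, env

def interp(e, env):
    if isinstance(e, int):
        return e
    tag = e[0]
    if tag == "var":
        return env[e[1]]
    if tag == "add":
        return interp(e[1], env) + interp(e[2], env)
    if tag == "lam":                       # capture the current env
        return FunctionValue(e[1], e[2], env)
    if tag == "app":                       # run the body in the
        f = interp(e[1], env)              # *captured* env, extended
        a = interp(e[2], env)              # with the argument
        return interp(f.body, dict(f.env, **{f.param: a}))

# (x) => x + i, with i = 1 captured from the enclosing frame
add_i = ("lam", "x", ("add", ("var", "x"), ("var", "i")))
interp(("app", add_i, 42), {"i": 1})   # 43
```

The closure-conversion backend makes the same idea explicit in memory: the \lstinline{env} dictionary becomes the heap-allocated environment, and \lstinline{body} becomes a function-table index.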
For your project, we recommend that you assume
all functions in the source code are closed,
but if you are motivated to implement closure conversion,
we will allow an additional group member.
\subsection{Custom operators (2)}
Allow the user to define operators.
\begin{lstlisting}
operator def :::(l1: List, l2: List): List = {
l1 match {
case Nil() => l2
case Cons(h, t) => Cons(h, t ::: l2)
}
}
Cons(1, Cons(2, Nil())) ::: Cons(3, Nil()) // returns List(1, 2, 3)
\end{lstlisting}
You can choose specific priorities for the operators based e.g. on their first character,
or you can allow the user to define it;
e.g. \lstinline{operator 55 def :::(...)} could signify
that \lstinline{:::} has a precedence between \lstinline{+} and \lstinline{*}
(with \lstinline{||} having 10, up to \lstinline{*} having 60).
You can also choose to have built-in binary operators of \langname
subsumed by this project. Of course, their implementation
will be left to be hard-coded by the compiler backend:
\begin{lstlisting}
operator 50 def +(i1: Int, i2: Int): Int = { error("+") }
\end{lstlisting}
In any case, your parser will be in no position to know
what operators are available in your program before actually parsing it.
Therefore, when you have more than one operator in a row,
your parser will just have to parse the tree as a flat sequence
of operand, operator, operand, \ldots,
and then fix the mess afterwards.
Of course other solutions are welcome.
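``Fixing the mess afterwards'' is a standard precedence-climbing pass over the flat \lstinline{operand, operator, operand, \ldots} sequence; a Python sketch, with a user-supplied precedence table:

```python
# Precedence climbing over a flat [operand, op, operand, op, ...] list.
# `prec` maps operator names to their (user-defined) precedence.
# The sequence is consumed destructively from the front.
def reparse(seq, prec, min_prec=0):
    lhs = seq.pop(0)
    while seq and prec[seq[0]] >= min_prec:
        op = seq.pop(0)
        # left-associative: the right side must bind strictly tighter
        rhs = reparse(seq, prec, prec[op] + 1)
        lhs = (op, lhs, rhs)
    return lhs
```

With \lstinline{prec = \{"+": 50, "*": 60, ":::": 55\}}, the flat sequence for \lstinline{a + b ::: c * d} re-nests as \lstinline{a + (b ::: (c * d))}.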
\subsection{Improved Parameters (2)}
Add support for named and default parameters for functions and classes.
If an argument for a parameter with a default value is not given,
the compiler fills in the default value.
One can choose to explicitly name parameters when calling a function/constructor,
which also allows reordering:
\begin{lstlisting}
def foo(i: Int, j: Int = 42): Int = { i + j }
foo(1) // OK, j has default value
foo(i = 5, j = 7) // OK
foo(j = 5, i = 7) // OK, can reorder named parameters
foo(i = 7) // OK
foo(j = 7) // Wrong, i has no default value
foo() // Wrong, i has no default value
def foo(i: Int = 5, j: Int): Int = { i + j }
// Wrong, default parameters have to be at the end
// Similarly for case classes
case class Foo(i: Int, j: Int = 42) extends Bar
\end{lstlisting}
Notice that names for case class parameters are currently not preserved in the AST,
which you will have to change.
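The call-site completion is a small algorithm in itself; a Python sketch, where a parameter list pairs names with defaults and \lstinline{None} marks a required parameter (a simplification, since a real default is an expression):

```python
# Complete a call site against a signature with defaults.
# params: list of (name, default); default=None means "required".
def complete_call(params, positional, named):
    args = {}
    for (name, _), value in zip(params, positional):
        args[name] = value                 # positional args first
    for name, value in named.items():
        if name in args:
            raise TypeError("parameter %s given twice" % name)
        args[name] = value                 # then named, any order
    for name, default in params:
        if name not in args:
            if default is None:
                raise TypeError("missing argument for %s" % name)
            args[name] = default           # fill in the default value
    return [args[name] for name, _ in params]
```

The result is an ordinary positional argument list, so every later compiler phase can stay unchanged.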
% \subsection{Regular expressions (1+)}
% Add support for regular expressions to \langname. You can use syntax similar to Java/Scala.
% The size of the group depends on the number of features you want to implement.
\subsection{List comprehensions (2)}
Extend Amy with list comprehensions, which allow programmers to succinctly express
transformations of \lstinline{List}s.
\begin{lstlisting}
val xs: L.List = L.Cons(1, L.Cons(2, L.Cons(3, L.Nil())));
val ys: L.List = [ 2*x for x in xs if x % 2 != 0 ];
Std.printString(L.toString(ys)) // [2, 6]
\end{lstlisting}
Your list-comprehension syntax should support enumerating elements from one or
multiple lists, filtering them with arbitrary predicates, and mapping them to arbitrary expressions.
It is up to you to decide whether to treat these comprehensions as primitives in your compiler.
If you do so, you will have a dedicated AST node for comprehensions in the entire compiler
pipeline and generate specific code or interpret them accordingly in the end.
Alternatively, you can \emph{desugar} list comprehensions earlier in the pipeline, e.g.\
right after (or during) parsing. You could, for instance, generate auxiliary functions that
compute the result of the list comprehension and are called in place of the comprehensions.
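What such a generated auxiliary function computes can be illustrated in Python over cons-lists (the name \lstinline{comprehension_0} and the representation are hypothetical):

```python
# The auxiliary function a compiler might generate for
#   [ 2*x for x in xs if x % 2 != 0 ]
# written out by hand over cons-lists.
NIL = None                              # Nil()
def cons(h, t): return (h, t)           # Cons(h, t)

def comprehension_0(xs):                # hypothetical generated name
    if xs is NIL:
        return NIL
    h, t = xs
    if h % 2 != 0:                      # the `if` filter
        return cons(2 * h, comprehension_0(t))  # the `2*x` mapping
    return comprehension_0(t)
```

Each comprehension in the source gets its own such function, and the comprehension expression is replaced by a call to it.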
\subsection{Inlining (1+)}
Implement inlining on the AST level, that is, allow users to force the compiler to inline certain
functions and perform optimizations on the resulting AST.
\begin{lstlisting}
inline def abs(n: Int): Int = { if (n < 0) -n else n }
abs(123); // inlined and constant-folded to `123'
abs(-456); // inlined and constant-folded to `456'
// inlined, not cf-ed; careful with side-effects!
abs(Std.readInt())
\end{lstlisting}
Inlining is effective when we can expect optimizations to make code significantly more
efficient given additional information on function arguments. At a minimum, you would add an
\lstinline{inline} qualifier for function definitions and perform \emph{constant folding} on
inlined function bodies.
Inlining is particularly useful when applied to auxiliary functions that only exist for
clarity. While inlining can lead to \emph{code explosion} when applied too liberally, note that
inlining a non-recursive function that is only called in a single location will strictly reduce
code size and potentially lead to more efficient code.
This makes it very attractive to \emph{automatically} apply inlining to such functions:
\begin{lstlisting}
def foo(n: Int): Int = {
def plus1(n: Int): Int = { n + 1 }
inline def times2(n: Int): Int = { 2 * n }
plus1(times2(times2(n))) // inlined and cf-ed to `4 * n + 1'
}
def bar(): Int = {
def fib(n: Int): Int = {
if (n <= 2) { 1 }
else { fib(n-2) + fib(n-1) }
}
fib(10) // should *not* be automatically inlined
}
\end{lstlisting}
To incentivize the user to break functions down into the composition of many auxiliary functions
we can introduce \emph{local function definitions}. That is, the user may define a function within
a function. For this project it is sufficient to enforce that local functions only have access
to their own parameters and locals, but not to the surrounding function's parameters or locals.
This project is for two people if you choose to also implement local function definitions,
and for one person otherwise.
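Inlining followed by constant folding fits in a short sketch on a toy AST (node shapes hypothetical): substitute the (already inlined) arguments into the callee's body, then fold constant subexpressions.

```python
# Toy AST: int | ("var", x) | ("add"/"mul", a, b) | ("call", f, [args])
# defs maps inlinable function names to (param_names, body).
def inline(e, defs):
    if isinstance(e, tuple) and e[0] == "call" and e[1] in defs:
        params, body = defs[e[1]]
        sub = dict(zip(params, [inline(a, defs) for a in e[2]]))
        return fold(substitute(inline(body, defs), sub))
    if isinstance(e, tuple) and e[0] in ("add", "mul"):
        return fold((e[0], inline(e[1], defs), inline(e[2], defs)))
    return e

def substitute(e, sub):                    # replace params by arguments
    if isinstance(e, tuple) and e[0] == "var":
        return sub.get(e[1], e)
    if isinstance(e, tuple):
        return (e[0],) + tuple(substitute(x, sub) for x in e[1:])
    return e

def fold(e):                               # constant folding, one level
    if (isinstance(e, tuple) and e[0] in ("add", "mul")
            and isinstance(e[1], int) and isinstance(e[2], int)):
        return e[1] + e[2] if e[0] == "add" else e[1] * e[2]
    return e
```

Note that substituting arguments textually, as done here, duplicates their side effects if a parameter occurs more than once in the body; the careful version binds each argument to a fresh \lstinline{val} first.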
\BOOKMARK [1][-]{section.4}{\376\377\000T\000y\000p\000e\000\040\000s\000y\000s\000t\000e\000m\000s}{}% 14
\BOOKMARK [2][-]{subsection.4.1}{\376\377\000P\000o\000l\000y\000m\000o\000r\000p\000h\000i\000c\000\040\000t\000y\000p\000e\000s\000\040\000\050\0002\000\051}{section.4}% 15
\BOOKMARK [2][-]{subsection.4.2}{\376\377\000C\000a\000s\000e\000\040\000c\000l\000a\000s\000s\000\040\000s\000u\000b\000t\000y\000p\000i\000n\000g\000\040\000\050\0002\000\051}{section.4}% 16
\BOOKMARK [2][-]{subsection.4.3}{\376\377\000A\000r\000r\000a\000y\000s\000\040\000a\000n\000d\000\040\000r\000a\000n\000g\000e\000\040\000t\000y\000p\000e\000s}{section.4}% 17
\BOOKMARK [3][-]{subsubsection.4.3.1}{\376\377\000D\000y\000n\000a\000m\000i\000c\000a\000l\000l\000y\000-\000c\000h\000e\000c\000k\000e\000d\000\040\000r\000a\000n\000g\000e\000\040\000t\000y\000p\000e\000s\000\040\000\050\0002\000\051}{subsection.4.3}% 18
\BOOKMARK [3][-]{subsubsection.4.3.2}{\376\377\000S\000t\000a\000t\000i\000c\000a\000l\000l\000y\000-\000c\000h\000e\000c\000k\000e\000d\000\040\000r\000a\000n\000g\000e\000\040\000t\000y\000p\000e\000s\000\040\000\050\0002\000+\000,\000\040\000c\000h\000a\000l\000l\000e\000n\000g\000i\000n\000g\000\051}{subsection.4.3}% 19
\BOOKMARK [1][-]{section.5}{\376\377\000A\000l\000t\000e\000r\000n\000a\000t\000i\000v\000e\000\040\000f\000r\000o\000n\000t\000e\000n\000d\000s\000/\000b\000a\000c\000k\000e\000n\000d\000s}{}% 20
\BOOKMARK [2][-]{subsection.5.1}{\376\377\000C\000o\000d\000e\000\040\000f\000o\000r\000m\000a\000t\000t\000e\000r\000\040\000\050\0001\000\051}{section.5}% 21
\BOOKMARK [2][-]{subsection.5.2}{\376\377\000L\000a\000n\000g\000u\000a\000g\000e\000\040\000S\000e\000r\000v\000e\000r\000\040\000\050\0002\000\051}{section.5}% 22
\BOOKMARK [2][-]{subsection.5.3}{\376\377\000F\000o\000r\000m\000a\000l\000i\000z\000a\000t\000i\000o\000n\000\040\000o\000f\000\040\000A\000m\000y\000\040\000\050\0001\000\051}{section.5}% 23
\BOOKMARK [2][-]{subsection.5.4}{\376\377\000J\000V\000M\000\040\000b\000a\000c\000k\000e\000n\000d\000\040\000\050\0002\000\051}{section.5}% 24
\BOOKMARK [2][-]{subsection.5.5}{\376\377\000C\000\040\000b\000a\000c\000k\000e\000n\000d\000\040\000\050\0003\000\051}{section.5}% 25
\BOOKMARK [1][-]{section.6}{\376\377\000E\000x\000e\000c\000u\000t\000i\000o\000n}{}% 26
\BOOKMARK [2][-]{subsection.6.1}{\376\377\000M\000e\000m\000o\000r\000y\000\040\000d\000e\000a\000l\000l\000o\000c\000a\000t\000i\000o\000n\000\040\000\050\0003\000\051}{section.6}% 27
\BOOKMARK [2][-]{subsection.6.2}{\376\377\000L\000a\000z\000y\000\040\000e\000v\000a\000l\000u\000a\000t\000i\000o\000n\000\040\000\050\0001\000-\0002\000\051}{section.6}% 28
\BOOKMARK [2][-]{subsection.6.3}{\376\377\000F\000i\000n\000a\000l\000\040\000c\000o\000d\000e\000\040\000o\000p\000t\000i\000m\000i\000z\000a\000t\000i\000o\000n\000s\000\040\000\050\0001\000+\000\051}{section.6}% 29
\BOOKMARK [2][-]{subsection.6.4}{\376\377\000T\000a\000i\000l\000\040\000c\000a\000l\000l\000\040\000o\000p\000t\000i\000m\000i\000z\000a\000t\000i\000o\000n\000\040\000\050\0001\000\051}{section.6}% 30
\BOOKMARK [2][-]{subsection.6.5}{\376\377\000F\000o\000r\000e\000i\000g\000n\000-\000f\000u\000n\000c\000t\000i\000o\000n\000\040\000i\000n\000t\000e\000r\000f\000a\000c\000e\000\040\000\050\000F\000F\000I\000\051\000\040\000t\000o\000\040\000J\000a\000v\000a\000S\000c\000r\000i\000p\000t\000\040\000\050\0002\000\051}{section.6}% 31
\BOOKMARK [2][-]{subsection.6.6}{\376\377\000R\000E\000P\000L\000:\000\040\000R\000e\000a\000d\000-\000E\000v\000a\000l\000-\000P\000r\000i\000n\000t\000\040\000L\000o\000o\000p\000\040\000\050\0003\000\051}{section.6}% 32
\BOOKMARK [2][-]{subsection.6.7}{\376\377\000V\000i\000r\000t\000u\000a\000l\000\040\000m\000a\000c\000h\000i\000n\000e\000\040\000\050\0003\000\051}{section.6}% 33
File deleted
\documentclass[]{article}
%\settopmatter{printfolios=true}
% For final camera-ready submission
% \documentclass[acmlarge]{acmart}
% \settopmatter{}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{defs}
\usepackage{listings}
\usepackage{stmaryrd}
\usepackage{xcolor}
\usepackage{xspace}
\usepackage[colorlinks]{hyperref}
\hypersetup{urlcolor=cyan}
\usepackage{caption} % Link to beginning of figures
\usepackage{mathpartir}
%\usepackage{subcaption}
\input{scalalistings}
\title{Compiler Extensions for \langname}
\date{Computer Language Processing\\~\\LARA\\~\\Autumn 2019}
\begin{document}
\maketitle
\section{Introduction}
In this document you will find some compiler extension ideas
for the last assignment of the semester.
The ideas are grouped in sections based on the broader subject they cover.
Next to its title, every extension indicates the maximum size of a
group that is allowed to take it up.
Some extensions suggest additional features, which allow the group to include
additional members.
\section{Your own idea!}
We will be very happy to discuss an idea you come up with yourselves.
\input{features.tex}
\input{types}
\input{alternatives}
\input{execution}
\end{document}
%% To import in the preamble
%\usepackage{listings}
\usepackage{letltxmacro}
\newcommand*{\SavedLstInline}{}
\LetLtxMacro\SavedLstInline\lstinline
\DeclareRobustCommand*{\lstinline}{%
\ifmmode
\let\SavedBGroup\bgroup
\def\bgroup{%
\let\bgroup\SavedBGroup
\hbox\bgroup
}%
\fi
\SavedLstInline
}
\lstdefinelanguage{ML}{
alsoletter={*},
morekeywords={datatype, of, if, *},
sensitive=true,
morecomment=[s]{/*}{*/},
morestring=[b]"
}
% "define" Scala
\lstdefinelanguage{scala}{
alsoletter={@,=,>},
morekeywords={abstract, Boolean, case, class, def,
else, error, extends, false, free, if, implicit, Int, match,
object, operator, String, true, Unit, val, var, while,
for, in, inline, array, external, export},
sensitive=true,
morecomment=[l]{//},
morecomment=[s]{/*}{*/},
morestring=[b]"
}
% \newcommand{\codestyle}{\tiny\sffamily}
\newcommand{\codestyle}{\ttfamily}
\newcommand{\SAND}{\mbox{\tt \&\&}\xspace}
\newcommand{\SOR}{\mbox{\tt ||}\xspace}
\newcommand{\MOD}{\mbox{\tt \%}\xspace}
\newcommand{\DIV}{\mbox{\tt /}\xspace}
\newcommand{\PP}{\mbox{\tt ++}\xspace}
\newcommand{\MM}{\mbox{\tt {-}{-}}\xspace}
\newcommand{\RA}{\Rightarrow}
\newcommand{\EQ}{\mbox{\tt ==}}
\newcommand{\NEQ}{\mbox{\tt !=}}
\newcommand{\SLE}{\ensuremath{\leq}}
\newcommand{\SGE}{\ensuremath{\geq}}
\newcommand{\SGT}{\mbox{\tt >}}
\newcommand{\SLT}{\mbox{\tt <}}
\newcommand{\rA}{\rightarrow}
\newcommand{\lA}{\leftarrow}
%============================
% To make it colorful uncomment \color in next 30 lines
%\makeatletter
%\newcommand*\idstyle{%
% \expandafter\id@style\the\lst@token\relax
%}
%\def\id@style#1#2\relax{%
% \ifcat#1\relax\else
% \ifnum`#1=\uccode`#1\color{blue!60!black}
% \fi
% \fi
%}
%\makeatother
% Default settings for code listings
\lstset{
language=scala,
showstringspaces=false,
columns=fullflexible,
mathescape=true,
numbers=none,
% numberstyle=\tiny,
basicstyle=\codestyle,
keywordstyle=\bfseries\color{blue!60!black}
,
commentstyle=\itshape\color{red!60!black}
,
%identifierstyle=\idstyle,
tabsize=2%,
%aboveskip=0pt,
%belowskip=0pt
}
\section{Type systems}
\subsection{Polymorphic types (2)}
Allow polymorphic types for functions and classes.
\begin{lstlisting}
abstract class List[A]
case class Nil[A]() extends List[A]
case class Cons[A](h: A, t: List[A]) extends List[A]
def length[A](l: List[A]): Int = {
l match {
case Nil() => 0
case Cons(_, t) => 1 + length(t)
}
}
case class Cons2[A, B](h1: A, h2: B, t: List[A]) extends List[A]
// Wrong, type parameters don't match
\end{lstlisting}
You can assume the sequence of type parameters of an extending class
is identical with the parent in the \lstinline{extends} clause
(see example).
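A design question you will face is how type parameters appear at call sites. One possible surface syntax (an assumption on our part, not prescribed by the Amy specification) is:
\begin{lstlisting}
length[Int](Cons(1, Nil())) // explicit instantiation
length(Cons(1, Nil()))      // or inferred from the argument
\end{lstlisting}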
\subsection{Case class subtyping (2)}
Add subtyping support to \langname.
Case classes are now types of their own:
\begin{lstlisting}
val y: Some = Some(0) // Correct, Some is a type
val x: Option = None() // Correct, because None <: Option
val z: Some = None() // Wrong
y match {
case Some(i) => () // Correct
case None() => () // Wrong
}
\end{lstlisting}
Since case classes are types, you can declare a variable, parameter,
ADT field or function return type to be of a case class type,
like any other type.
Case class types are subtypes of their parent (abstract class) type.
This means you can assign case class values to variables
declared with the parent type.
Since we have subtyping, you can now optionally support the \lstinline{Nothing}
type in source code, which is a subtype of every type
and the type of \lstinline{error} expressions.
For this project you will probably rewrite the type checking phase in its entirety.
Rather than dealing with explicit constraints, the resulting phase could perform
more classical type-checking based on the minimal type satisfying all the local
subtyping constraints (the so-called \emph{least-upper bound}).
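For instance, the branches of an if- or match-expression may have different
case class types; the expression as a whole is then typed with their
least-upper bound. The following sketch illustrates the intended behavior:
\begin{lstlisting}
// Some <: Option and None <: Option, so the least-upper
// bound of the two branches is Option:
val o: Option = if (Std.readInt() == 0) { Some(1) } else { None() }
\end{lstlisting}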
\subsection{Arrays and range types}
In both of the following two projects you would add fixed-size arrays of integers as
a primitive language feature along with a type system that allows users to specify
the range of integers.
The information about an integer's range can then be used to make array accesses safe
by ensuring that indices are in-bounds.
The difference between the two projects lies in \emph{when} integer bounds are checked,
i.e., at compile-time (\emph{statically}) or at runtime (\emph{dynamically}).
In either case you will add two kinds of types:
First, a family of primitive types \lstinline{array[$n$]} that represent integer arrays of
size $n$.
Second, \emph{range types} that represent subsets of \lstinline{Int}, taking the
following form:
\lstinline{[$i$ .. $j$]} where $i$ and $j$ are integer constants.
The intended semantics is for \lstinline{[$i$ .. $j$]} to represent a signed 32-bit integer
$n$ such that $i \le n \le j$.
\subsubsection{Dynamically-checked range types (2)}
Your type system should allow users to specify \emph{concrete} ranges, e.g.,
\lstinline{[0 .. 7]} to denote integers $0 \le n \le 7$. Values of \lstinline{Int} and
any range types will be compatible during type-checking, but your system will have to be
able to detect when an integer might not fall within a given range at runtime.
During code generation your task will then be to emit \emph{runtime checks} to ensure
that, e.g., an \lstinline{Int} in fact falls within the range \lstinline{[0 .. 7]}.
\begin{lstlisting}
// initialize an array of size 8:
val arr: array[8] = [10, 20, 30, 40, 50, 60, 70, 80];
arr[0]; // okay, should not emit any runtime check
arr[arr.length-1]; // okay, same as above
// also okay, but should emit a runtime bounds check:
arr[Std.readInt()];
\end{lstlisting}
In effect, your system will ensure that array accesses are always in-bounds, i.e., do not
over- or under-run an array's first, respectively last, element.
Note that the resulting system should only emit the minimal number of runtime checks to
ensure such safety. For instance, consider the following program:
\begin{lstlisting}
def printBoth(arr1: array[8], arr2: array[8], i: [0 .. 7]): Unit = {
  Std.printInt(arr1[i]);
  Std.printInt(arr2[i])
}
val someInt: Int = 4;
printBoth([1,2,3,4,5,6,7,8], [8,7,6,5,4,3,2,1], someInt)
\end{lstlisting}
Here it is not necessary to perform any checks in the body of \lstinline{printBoth}, since
whatever values are passed as arguments for parameter \lstinline{i} should have previously
been checked to lie between 0 and 7.
In this concrete case, a runtime check should occur when \lstinline{someInt} is passed to
\lstinline{printBoth}.
\subsubsection{Statically-checked range types (2+, challenging)}
Your type system should be strict and detect potential out-of-bounds array accesses
early. In particular, when your type checker cannot prove that an integer lies in the
required range it should produce a type error (and stop compilation as usual).
\begin{lstlisting}
val arr: array[8] = [10, 20, 30, 40, 50, 60, 70, 80];
arr[0]; // okay
arr[arr.length-1]; // okay
arr[arr.length]; // not okay, type error "Idx 8 is out-of-bounds"
val i: Int = Std.readInt();
arr[i]; // not okay, type error "Int may be out-of-bounds"
if (i >= 0 && i < 8) {
arr[i] // okay, branch is only taken when i is in bounds
}
\end{lstlisting}
To allow as many programs as possible to be accepted your type-checker will have to
employ precise typing rules for arithmetic expressions and if-expressions.
What you will implement are simple forms of \emph{path sensitivity} and
\emph{abstract interpretation}.
\paragraph{Constant-bounds version (2)}
Implement statically-checked range types for arrays of fixed and statically-known sizes.
Range types will only involve constant bounds and in effect your type-checker will only
have to accept programs that operate on arrays whose sizes are known to you as concrete
integer constants.
The typing rules that you come up with should be sufficiently strong to prove safety of
simple array manipulations such as the following:
\begin{lstlisting}
def printArray(arr: array[4], i: [0 .. 4]): Unit = {
if (i < arr.length) {
Std.printInt(arr[i]);
printArray(arr, i+1)
}
}
printArray([1,2,3,4], 0)
\end{lstlisting}
\paragraph{Dependently-typed version (3)}
Rather than relying on the user to provide the exact sizes of arrays, also allow arrays
to be of a fixed, but not statically-known size. To enable your type system to accept
more programs, you should also extend the notion of range types to allow bounds
\emph{relative to} a given array's size.
The resulting types will extend the above ones in at least two ways:
In addition to \lstinline{array[$n$]} there is a special form \lstinline{array[*]}
which represents an array of arbitrary (but fixed) size.
For a range type \lstinline{[$i$ .. $j$]}, $i$ and $j$ may not only be integer constants,
but may also be expressions of the form \lstinline{arr.length + $k$} where
\lstinline{arr} is an Array-typed variable in scope and $k$ is an integer constant.
Your system should then be able to abstract over concrete array sizes by referring to
some Array-typed binding's length like in the following example:
\begin{lstlisting}
def printArray(arr: array[*], i: [0 .. arr.length]): Unit = {
if (i < arr.length) {
Std.printInt(arr[i]);
printArray(arr, i+1)
}
}
printArray([1,2,3,4], 0)
printArray([1,2,3,4,5,6,7,8], 0)
\end{lstlisting}
Note that the resulting language will be \emph{dependently-typed}, meaning that
types can depend on terms. In the above example, for instance, the type of parameter
\lstinline{i} of function \lstinline{printArray} depends on parameter \lstinline{arr}.
% TODO: (2) Simple ownership system / affine types + in-place updates for ADTs?
% TODO: (1) Region-based ADT allocation + static tracking of provenance
## Compiler Extension Presentation Instructions
Background presentations will take place in week 14.
**The presentation should be 10 minutes long.**
A **Q&A session of 5-10 minutes** will follow right after the
presentation.
Shortly after, you will receive feedback from us regarding the content
of your presentation, as well as some general feedback on the form.
### Presentation content
Your presentation should summarize your project. In particular, we'd
expect to see
- a basic overview of the features you added to the compiler/language
- some (short) programs highlighting the use of these features, with a
description of how your extended compiler behaves on them
- possibly some theoretical background you had to learn about to
implement the extension
- an overview of the changes you made to each compiler phase and/or
which phases you added
### Presentation style
Here are some useful resources on how to prepare and give talks:
- [How To Speak by Patrick
Winston](https://www.youtube.com/watch?v=Unzc731iCUY)
- [How to give a great research talk by Simon Peyton
Jones](https://www.microsoft.com/en-us/research/academic-program/give-great-research-talk/)
Please do not use Viktor's videos as a model for the presentation; instead,
incorporate as many points from the talk by [Patrick
Winston](https://en.wikipedia.org/wiki/Patrick_Winston) as you believe
apply to your presentation. It is an amazing and entertaining talk,
despite (or because of) the fact that it is meta-circular: he does as he says. Note:
breaking physical objects or referring to supernatural beings in your
video is not required. Use your own judgment and strike a balance between
being comfortable with what and how you are saying things and trying out
these pieces of advice.
Below you will find the instructions for the first lab assignment in which you will get to know and implement an interpreter for the Amy language. If you haven't looked at the [Labs Setup](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/labs_setup.md) page yet, please do so before starting out with the assignment.
# Part 1: Your first Amy programs
Write two example Amy programs each and make sure you can compile them using the [Amy Reference Compiler](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/amy_reference_compiler.md). Put them under `/examples`. Please be creative when writing your programs: they should be nontrivial and not reproduce the functionality of the examples in the `/library` and `/examples` directories of the repository. Of course you are welcome to browse these directories for inspiration.
Remember that you will use these programs for the remainder of the semester to test your compiler, so don't make them too trivial! Try to test many features of the language.
If you have questions about how a feature of Amy works, you can always look at the [Amy Specification](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/amy_specification.md). It's a good idea to keep a local copy of this document handy -- it will be your reference for whenever you are asked to implement an aspect of the Amy language throughout this semester.
# Part 2: An Interpreter for Amy ([Slides](slides/lab01.pdf))
The main task of the first lab is to write an interpreter for Amy.
(If you haven't been assigned your repository yet, you can download a packaged version of the interpreter lab's skeleton [here](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/labs01_material/clp-lab01.zip). If you already have your repository assigned, you can simply check out the `lab01` branch. Note that future labs will only be distributed through the repository, so be sure to familiarize yourself with the setup.)
## Interpreters
The way to execute programs you have mostly seen so far is compilation to some kind of low-level code (bytecode for a virtual machine such as Java's; native binary code in case of languages such as C). An alternative way to execute programs is interpretation. According to Wikipedia, "an interpreter is a computer program that directly executes, i.e. performs, instructions written in a programming or scripting language, without previously compiling them into a machine language program". In other words, your interpreter is supposed to directly look at the code and *interpret* its meaning. For example, when encountering a call to the 'printString' function, your interpreter should print its argument on the standard output.
## The general structure of the Interpreter
The skeleton of the assignment is provided by us in three files:
- The `Main.scala` source file
- The `Interpreter.scala` source file, and
- the `amyc-frontend-1.7.jar` bytecode file, which is located under `lib/` .
Now let's look into the code in a little more detail.
In `Main.scala`, take a look at the main method, which is the entry point to your program. After processing the command line arguments of the interpreter, the main method creates a Pipeline, which contains the different stages of the compiler (more on it in later assignments). The Pipeline will first call the Amy frontend, which will parse the source program into an abstract syntax tree (AST) and check it for correctness according to the [Amy specification](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/amy_specification.md), and then passes the result to the Interpreter.
The implementation of the frontend is given to you in compiled form, because you will need to write your own version in the next assignments. **Note**: You are only allowed to use this binary code to link against your interpreter.
So what is this AST we've mentioned? For the computer to "understand" the meaning of a program, it first has to transform it from source (text) form into a more convenient form, which we call an abstract syntax tree. The AST abstracts away uninteresting details of the program (e.g. parentheses, whitespace, operator precedence...) and keeps the essential structure of the program.
In Scala, we represent the AST as a tree-form object. The tree has different types of nodes, each one representing a different programming structure. The types of nodes are of course represented as different classes, which all inherit from a class called Tree. Conveniently enough, the classes correspond pretty much one-to-one to the rules of the BNF grammar given in the language specification. E.g. in the language spec we read that a module looks as follows:
Module ::= **object** Id Definition* Expr? **end** Id
and indeed in the implementation we find a class
`case class ModuleDef(name: Identifier, defs: List[ClassOrFunDef], optExpr: Option[Expr]) extends Definition`
You can find the source code of the AST [here](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/labs01_material/SymbolicTreeModule.scala).
Note: This is not exactly the code we will use in later assignments, but it's good enough to serve as a reference while implementing this first assignment.
## The Interpreter class
Now let's delve into `Interpreter.scala`. This file currently contains only a partial implementation, and it is your task to complete it! The entry point into the interpreter is `interpret`, which takes an expression as input and evaluates it. The main loop at the end of the class simply takes the modules in order and interprets their expressions, if present.
`interpret` returns a `Value`, a type that represents the values an Amy expression can produce. `Value` is extended by classes representing the different types of values present in Amy (`Int(32)`, `Boolean`, `Unit`, `String` and ADT values). `Value` has convenience methods (`as*`) for casting to `Int(32)`, `Boolean` and `String`. Remember that we can always call these methods safely when we know the type of an expression (e.g. the operands of an addition), since we know that the program type-checks.
`interpret` takes an additional implicit parameter as an argument, which is a mapping from variables to values (in the interpreted language). In Scala, when an implicit parameter is expected, the compiler will look in the scope for some binding of the correct type and pass it automatically. This way we do not have to pass the same mapping over and over to all recursive calls to `interpret`. Be aware, however, that there are some cases when you need to change the `locals` parameter! Think carefully about when you have to do so.
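To make this concrete, here is a minimal, self-contained sketch in plain Scala (using a toy AST with invented names like `Num` and `Let`, not the actual Amy trees) of how an interpreter can thread its environment through an implicit parameter, extending it only where a new binding is introduced:

```scala
sealed trait Expr
case class Num(i: Int) extends Expr
case class Var(name: String) extends Expr
case class Plus(lhs: Expr, rhs: Expr) extends Expr
case class Let(name: String, value: Expr, body: Expr) extends Expr

def interpret(e: Expr)(implicit locals: Map[String, Int]): Int = e match {
  case Num(i)       => i
  case Var(n)       => locals(n)                   // look up a binding
  case Plus(l, r)   => interpret(l) + interpret(r) // implicit env reused as-is
  case Let(n, v, b) =>
    // The only place the environment changes: extend it for the body.
    interpret(b)(locals + (n -> interpret(v)))
}

implicit val empty: Map[String, Int] = Map.empty
println(interpret(Let("x", Num(1), Plus(Var("x"), Num(41))))) // prints 42
```

Note how the `Plus` case does not mention `locals` at all: the Scala compiler passes the implicit environment along to the recursive calls automatically.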
A few final notes:
* You can print program output straight to the console.
* You can assume the input programs are correct. This is guaranteed by the Amy frontend.
* To find constructors and functions in the program, you have to search in the `SymbolTable` passed along with the program. To do this, use the three helper methods provided in the interpreter:
* `isConstructor` will return whether the `Identifier` argument is a type constructor in the program
* `findFunctionOwner` will return the module which contains the given `Identifier`, which has to be a function in the program. E.g. if you give it the `printInt` function of the `Std` module, you will get the string `"Std"`.
* `findFunction` will return the function definition given a pair of Strings representing the module containing the function, and the function name. The return value is of type `FunDef` (see [the AST definitions](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/labs01_material/SymbolicTreeModule.scala)).
* When comparing Strings by reference, compare the two `StringValue`s directly and not the underlying Strings. The reason is that the JVM may return true when comparing Strings by equality when it is not expected (it has to do with JVM constant pools).
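To see why, consider this small illustration in plain Scala (not Amy) of how JVM string interning interacts with reference comparison:

```scala
// The JVM interns string literals, so reference equality on the underlying
// Strings can "accidentally" succeed even for independently written values.
val a = "hello"
val b = "hello"              // same interned literal: the very same object
val c = new String("hello")  // a fresh object with equal contents
println(a eq b)  // true  -- both refer to the interned literal
println(a eq c)  // false -- distinct objects
println(a == c)  // true  -- structural equality on Strings
```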
* Some functions contained in the `Std` module are built-in in the language, i.e. they are hard-coded in the interpreter because they cannot be implemented in Amy otherwise. An example of a built-in function is `printString`. When you implement the interpreter for function calls, you should first check if the function is built-in, and if so, use the implementation provided in the `builtIns` map in the interpreter.
* When a program fails (e.g. due to a call to `error` or a match fail), you should call the dedicated method in the Context: `ctx.reporter.fatal`.
## Implementation skeleton
If you have followed [Labs Setup](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/labs_setup.md) for Lab 01, you should have a working project with a stub implementation, containing the following files:
* `src/amyc/interpreter/Interpreter.scala` contains a partially implemented interpreter
* `src/amyc/Main.scala` contains the `main` method which runs the interpreter on the input files
* The `library` directory contains library definitions you can call from your programs.
* The `examples` directory contains some example programs on which you can try your implementation. Remember that most of them also use library files from `/library`.
* `lib/amy-frontend-1.7.jar` contains the frontend of the compiler as a library, allowing you to work directly with type-checked ASTs of input programs.
You will have to complete the interpreter by implementing the missing methods (marked with the placeholder `???`).
## Testing
When you are done, use sbt to try some of your programs from Part 1:
```
$ sbt
> run library/Std.scala examples/Hello.scala
Hello world!
```
There is also testing infrastructure under `/test`. To add your own tests, you have to add your testcases under `/test/resources/interpreter/passing`
and the expected output under
`/test/resources/interpreter/outputs`.
Then, you have to add the name of the new test in `InterpreterTests`, similarly to the examples given.
To allow a test to also use the standard library (e.g., `Std.printString`), you can copy `Std.scala` from `library/Std.scala` to `/test/resources/interpreter/passing`.
For example, to add a test that expects only "Hello world" to be printed, you can add "/test/resources/interpreter/passing/Hello.scala" containing `object Hello Std.printString("Hello world") end Hello` and `/test/resources/interpreter/outputs/Hello.txt` containing `Hello world` (with a newline at the end!). You will also have to add a line to `/test/scala/amyc/test/InterpreterTests.scala`: `@Test def testHello = shouldOutput(List("Std", "Hello"), "Hello")`. This will pass both files `Std.scala` and `Hello.scala` as inputs of the test. When you now run `test` from sbt, you should see the additional test case (called `testHello`).
## Deliverables
You are given **2 weeks** for this assignment.
Deadline: **Friday October 8 at 11 pm**.
Submission: one team member submits a zip file submission-groupNumber.zip to the [moodle submission page](https://moodle.epfl.ch/mod/assign/view.php?id=1169243).
## Related documentation
* End of Chapter 1 in the Tiger Book presents a similar problem for another mini-language. A comparison of the implementation of ASTs in Java (as shown in the book) and Scala is instructive.
# Lab 04: Type Checker ([Slides](slides/lab04.pdf))
Parsing concludes the syntactical analysis of Amy programs. Having
successfully constructed an abstract syntax tree for an input program,
compilers typically run one or more phases performing checks of a
more semantic nature. Virtually all high-level programming languages
enjoy some form of name analysis, whose purpose is to disambiguate
symbol references throughout the program. Some languages go further and
perform a series of additional checks whose goal is to rule out runtime
errors statically (i.e., during compilation, or in other words, without
executing the program). While the exact rules for those checks vary from
language to language, this part of compilation is typically summarized
as "type checking". Amy, being a statically-typed language, requires
both name and type analysis.
## Prelude: From Nominal to Symbolic Trees
Recall that during parsing we created (abstract syntax) trees of the
*nominal* sort: Names of variables, functions and data types were simply
stored as strings. However, two names used in the program could be the
same, but not refer to one and the same "thing" at runtime. During
name analysis we translate from nominal trees to symbolic ones, to make
it clear whether two names refer to one and the same underlying entity.
That is, we explicitly replace strings by fresh identifiers which will
prevent us from mixing up definitions of the same name, or referring to
things that have not been defined. Amy's name analyzer is provided to
you as part of this lab's skeleton, but you should read the [dedicated
name analyzer page](labs04_material/NameAnalysis.md) to understand how it works.
## Introduction to Type Checking
The purpose of this lab is to implement a type checker for Amy. Our type
checking rules will prevent certain errors based on the kind or shape of
values that the program is manipulating. For instance, we should prevent
an integer from being added to a boolean value.
Type checking is the last stage of the compiler frontend. Every program
that reaches the end of this stage without an error is correct (as far
as the compiler is concerned), and every program that does not is wrong.
After type checking we are finally ready to interpret the program or
compile it to binary code!
Typing rules for Amy are presented in detail in the
[Amy specification](amy_specification.md). Make sure to check correct
typing for all expressions and patterns.
## Implementation
The current assignment focuses on the file `TypeChecker.scala`. As
usual, the skeleton and helper methods are given to you, and you will
have to complete the missing parts. In particular, you will write a
compiler phase that checks whether the expressions in a given program
are well-typed and report errors otherwise.
To this end you will implement a simplified form of the Hindley-Milner
(HM) type-inference algorithm that you'll hear about during the
lectures. Note that while not advertised as a feature to users of Amy,
behind the scenes we will perform type inference. It is usually
straightforward to adapt an algorithm for type inference to type
checking, since one can add the user-provided type annotations to the
set of constraints. This is what you will do with HM in this lab.
Compared to the presentation of HM type inference in class, your type
checker can be simplified in another way: Since Amy does not feature
higher-order functions or polymorphic data types, types in Amy are
always *simple* in the sense that they are not composed of arbitrary
other types. That is, a type is either a base type (one of `Int`, `Bool`
and `String`) or it is an ADT, which has a proper name (e.g. `List` or
`Option` from the standard library). In the latter case, all the types
in the constructor of the ADT are immediately known. For instance, the
standard library's `List` is really a list of integers, so we know that
the `Cons` constructor takes an `Int` and another `List`.
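As an illustration, the list ADT can be written out as follows (a simplified rendering; the standard library's actual definition may differ in details such as field names):

```scala
// A monomorphic list of integers: every constructor argument type is
// immediately known, so no type parameters ever need to be inferred.
abstract class List
case class Nil() extends List
case class Cons(h: Int, t: List) extends List
```

Whenever the type checker sees `Cons(e1, e2)`, it can therefore directly constrain `e1` to `Int` and `e2` to `List`.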
As a result, your algorithm will never have to deal with complex
constraints over type constructors (such as the function arrow
`A => B`). Instead, your constraints will always be of the form
`T1 = T2` where `T1` and `T2` are either *simple* types or type
variables. This is most important during unification, which otherwise
would have to deal with complex types separately.
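A minimal sketch of such a unification procedure over simple types might look like this (all names are illustrative, not the skeleton's actual API):

```scala
// Unification over Amy-style simple types: a constraint is a pair
// (found, expected); solving substitutes type variables and fails only
// when two distinct simple types are equated.
object Unify:
  sealed trait Type
  case class Simple(name: String) extends Type // Int, Bool, String, or an ADT name
  case class TVar(id: Int) extends Type

  def solve(cs: List[(Type, Type)]): Boolean = cs match
    case Nil => true
    case (t1, t2) :: rest =>
      if t1 == t2 then solve(rest)
      else (t1, t2) match
        case (tv: TVar, t) => solve(subst(rest, tv, t))
        case (t, tv: TVar) => solve(subst(rest, tv, t))
        case _             => false // e.g. Int = Bool: a type error

  private def subst(cs: List[(Type, Type)], tv: TVar, t: Type): List[(Type, Type)] =
    def s(x: Type): Type = if x == tv then t else x
    cs.map((a, b) => (s(a), s(b)))
```

Because no type is ever nested inside another, there is no recursive decomposition step and no occurs check to worry about.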
Your task now is to a) complete the `genConstraints` method which will
traverse a given expression and collect all the necessary typing
constraints, and b) implement the *unification* algorithm as
`solveConstraints`.
Familiarize yourself with the `Constraint` and `TypeVariable` data
structures in `TypeChecker.scala` and then start by implementing
`genConstraints`. The structure of this method will in many cases be
analogous to the AST traversal you wrote for the name analyzer. Note
that `genConstraints` also takes an *expected type*. For instance, in
case of addition the expected type of both operands should be `Int`. For
other constructs, such as pattern `match`es it is not inherently clear
what should be the type of each `case` body. In this case you can create
and pass a fresh type variable.
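The idea can be sketched on a toy expression language (the lab's `genConstraints` has the same shape, but works on Amy's full AST and its `Constraint`/`TypeVariable` classes):

```scala
// Constraint generation against an expected type: each case emits
// "found = expected" pairs for itself and recurses into subexpressions.
object Gen:
  sealed trait Expr
  case class IntLit(v: Int) extends Expr
  case class BoolLit(b: Boolean) extends Expr
  case class Plus(lhs: Expr, rhs: Expr) extends Expr
  case class Ite(cond: Expr, thn: Expr, els: Expr) extends Expr

  sealed trait Type
  case object IntT extends Type
  case object BoolT extends Type
  case class TVar(id: Int) extends Type

  private var counter = 0
  def freshTVar(): TVar = { counter += 1; TVar(counter) }

  def genConstraints(e: Expr, expected: Type): List[(Type, Type)] = e match
    case IntLit(_)  => List(IntT -> expected)
    case BoolLit(_) => List(BoolT -> expected)
    case Plus(l, r) =>
      genConstraints(l, IntT) ++ genConstraints(r, IntT) ++ List(IntT -> expected)
    case Ite(c, t, f) =>
      // No inherent type for the branches: both must match a fresh
      // variable, which in turn must match the expected type.
      val tv = freshTVar()
      genConstraints(c, BoolT) ++ genConstraints(t, tv) ++
        genConstraints(f, tv) ++ List(tv -> expected)
```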
Once you have a working implementation of both `genConstraints` and
`solveConstraints` you can copy over your previous work on the
interpreter and run the programs produced by your frontend! Don't
forget that to debug your compiler\'s behavior you can also use the
reference compiler with the `--interpret` flag and then compare the
output.
## Skeleton
As usual, you can find the skeleton for this lab in a new branch of your
group's repository. After merging it with your existing work, the
structure of your project `src` directory should be as follows:
src/amyc
├── Main.scala (updated)
├── analyzer (new)
│ ├── SymbolTable.scala
│ ├── NameAnalyzer.scala
│ └── TypeChecker.scala
├── ast
│ ├── Identifier.scala
│ ├── Printer.scala
│ └── TreeModule.scala
├── interpreter
│ └── Interpreter.scala
├── lib
│ ├── scallion_3.0.6.jar
│ └── silex_3.0.6.jar
├── parsing
│ ├── Parser.scala
│ ├── Lexer.scala
│ └── Tokens.scala
└── utils
├── AmycFatalError.scala
├── Context.scala
├── Document.scala
├── Pipeline.scala
├── Position.scala
├── Reporter.scala
└── UniqueCounter.scala
## Deliverables
You are given **1 week** for this assignment.
Deadline: **Wednesday November 17 at 11 pm**.
Submission: one team member submits a zip file submission-groupNumber.zip to the [moodle submission page]().
Your submission only needs to contain your `src` directory.
You can use the following command (from the root of your repository) to generate the archive:
```
zip -r submission-<groupNumber>.zip src/
```
You can then verify the content of the archive using `unzip -l submission-<groupNumber>.zip`
# Lab 05: Code Generation (Slides: [Markdown](slides/lab05.md)/[HTML](slides/lab05.html))
## Introduction
Welcome to the last common assignment for the Amy compiler. At this
point, we are finally done with the frontend: we have translated source
programs to ASTs and have checked that all correctness conditions hold
for our program. We are ready to generate code for our program. In our
case the target language will be *WebAssembly*.
WebAssembly is "a new portable, size- and load-time-efficient format
suitable for compilation to the web" (<http://webassembly.org>).
WebAssembly is designed to be called from JavaScript in browsers and
lends itself to highly-performant execution.
For simplicity, we will not use a browser, but execute the resulting
WebAssembly bytecode directly using `nodejs` which is essentially a
standalone distribution of the Chrome browser's JavaScript engine. When
you run your complete compiler (or the reference compiler) with no
options on program `p`, it will generate three files under the
`wasmout` directory:
- `p.wat` is the wasm output of the compiler in text format. You can
use this representation to debug your generated code.
- `p.wasm` is the binary output of the compiler. This is what `nodejs`
will use. To translate to the binary format, we use the `wat2wasm`
tool provided by the WebAssembly developers. For your convenience we
have included it in the `bin` directory of the skeleton. Note that
this tool performs a purely mechanical translation and thus its
output (for instance, `p.wasm`) corresponds to a binary
representation of `p.wat`.
- `p.js` is a JavaScript wrapper which we will run with nodejs and
which serves as the entry point into your generated binary.
To run the program, simply type `nodejs wasmout/p.js`
### Installing nodejs
- You can find directions for your favorite operating system
[here](https://nodejs.org/en/). You should have nodejs 12 or later
(run `nodejs --version` to make sure).
- Once you have installed nodejs, run `npm install deasync` from the
directory you plan to run `amyc` in, i.e. the toplevel directory of
the compiler.
- Make sure the `wat2wasm` executable is visible, i.e. it is in the
system path or you are at the toplevel of the `amyc` directory.
## WebAssembly and Amy
The slides for this year's presentation are in the files called lab05-slides. See [here](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/slides/lab05.md) and [here](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/slides/lab05.html).
Look at [this
presentation](http://lara.epfl.ch/~gschmid/clp20/codegen.pdf) for the
main concepts of how to translate Amy programs to WebAssembly.
You can find the annotated compiler output to the concat example
[here](http://lara.epfl.ch/~gschmid/clp20/concat.wat).
## The assignment code
### Overview
The code for the assignment is divided into two directories: `wasm` for
the modeling of the WebAssembly framework, and `codegen` for
Amy-specific code generation. There is a lot of code here, but your task
is only to implement code generation for Amy expressions within
`codegen/CodeGen.scala`.
- `wasm/Instructions.scala` provides types that describe a subset of
WebAssembly instructions. It also provides a type `Code` to describe
sequences of instructions. You can chain multiple instructions or
`Code` objects together to generate a longer `Code` with the `<:>`
operator.
- `wasm/Function.scala` describes a wasm function.
- `LocalsHandler` is an object which will create fresh indexes for
local variables as needed.
- A `Function` contains a field called `isMain` which is used to
denote a main function without a return value, which will be
handled differently when printing, and will be exported to
JavaScript.
- The only way to create a `Function` is using `Function.apply`.
Its last argument is a function from a `LocalsHandler` to
`Code`. The reason for this unusual choice is to ensure that the
`Function` object is only instantiated once its body has been
generated, and hence the number of locals requested from the
`LocalsHandler` is known. To see how it is used, you can look in
`codegen/Utils.scala` (but you won't have to use it directly).
- `wasm/Module.scala` and `wasm/ModulePrinter.scala` describe a wasm
module, which you can think of as a set of functions and the
corresponding module headers.
- `codegen/Utils.scala` contains a few utility functions (which you
should use!) and implementations of the built-in functions of Amy.
Use the built-ins as examples.
- `codegen/CodeGen.scala` is the focus of the assignment. It contains
code to translate Amy modules, functions and expressions to wasm
code. It is a pipeline and returns a wasm Module.
- `codegen/CodePrinter.scala` is a Pipeline which will print output
files from the wasm module.
### The cgExpr function
The focus of this assignment is the `cgExpr` function, which takes an
expression and generates a `Code` object. It also takes two additional
arguments: (1) a `LocalsHandler` which you can use to get a new slot for
a local when you encounter a local variable or you need a temporary
variable for your computation. (2) a map `locals` from `Identifier`s to
local slots, i.e. indices, in the wasm world. For example, if `locals`
contains a pair `i -> 4`, we know that `get_local 4` in wasm will push
the value of `i` onto the stack. Notice how `locals` is instantiated with
the function parameters in `cgFunction`.
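The overall shape of `cgExpr` can be sketched on a toy AST that emits wasm text instructions as plain strings (the real version produces `Code` objects chained with `<:>` and uses the skeleton's `LocalsHandler`; all names here are illustrative):

```scala
// Stack-machine code generation: each case pushes its result on the
// wasm stack; Let reserves a fresh local slot and extends `locals`.
object Cg:
  sealed trait Expr
  case class IntLit(v: Int) extends Expr
  case class Variable(name: String) extends Expr
  case class Plus(lhs: Expr, rhs: Expr) extends Expr
  case class Let(name: String, value: Expr, body: Expr) extends Expr

  final class LocalsHandler:
    private var next = 0
    def getFreshLocal(): Int = { val i = next; next += 1; i }

  def cgExpr(e: Expr, lh: LocalsHandler, locals: Map[String, Int]): List[String] =
    e match
      case IntLit(v)   => List(s"i32.const $v")
      case Variable(n) => List(s"get_local ${locals(n)}") // push the slot's value
      case Plus(l, r)  =>
        cgExpr(l, lh, locals) ++ cgExpr(r, lh, locals) ++ List("i32.add")
      case Let(n, v, b) =>
        val slot = lh.getFreshLocal()                 // reserve a fresh slot
        cgExpr(v, lh, locals) ++ List(s"set_local $slot") ++
          cgExpr(b, lh, locals + (n -> slot))         // body sees n at that slot
```

For example, `val x = 1; x + 2` compiles to `i32.const 1`, `set_local 0`, `get_local 0`, `i32.const 2`, `i32.add`.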
## Skeleton
As usual, you can find the skeleton for this lab in a new branch of your
group's repository. After merging it with your existing work, the
structure of your project `src` directory should be as follows:
src/amyc
├── Main.scala (updated)
├── analyzer
│ ├── SymbolTable.scala
│ ├── NameAnalyzer.scala
│ └── TypeChecker.scala
├── ast
│ ├── Identifier.scala
│ ├── Printer.scala
│ └── TreeModule.scala
├── bin
│ └── ...
├── codegen (new)
│ ├── CodeGen.scala
│ ├── CodePrinter.scala
│ └── Utils.scala
├── interpreter
│ └── Interpreter.scala
├── lib
│ ├── scallion_3.0.6.jar
│ └── silex_3.0.6.jar
├── parsing
│ ├── Parser.scala
│ ├── Lexer.scala
│ └── Tokens.scala
├── utils
│ ├── AmycFatalError.scala
│ ├── Context.scala
│ ├── Document.scala
│ ├── Pipeline.scala
│ ├── Position.scala
│ ├── Reporter.scala
│ └── UniqueCounter.scala
└── wasm (new)
├── Function.scala
├── Instructions.scala
├── ModulePrinter.scala
└── Module.scala
## Deliverables
You are given **2 weeks** for this assignment.
Deadline: **Thursday December 2 at 11 pm**.
Submission: one team member submits a zip file submission-groupNumber.zip to the [moodle submission page](https://moodle.epfl.ch/mod/assign/view.php?id=1181848).
Your submission only needs to contain your `src` directory.
You can use the following command (from the root of your repository) to generate the archive:
```
zip -r submission-<groupNumber>.zip src/
```
You can then verify the content of the archive using `unzip -l submission-<groupNumber>.zip`
# Lab 06: Compiler extension project
You have now written a compiler for Amy, a simple functional language.
The final lab project is to design and implement a new functionality of
your own choice on top of the compiler you built so far. In preparation
for this, you should aim to learn about the problem domain by searching
the appropriate literature. The project includes:
- designing and implementing the new functionality
- documenting the results in a written report document
This project has several deadlines, detailed below. Please note that the
first of them (choosing the topic) is already coming up this Friday!
## Selecting a Project Topic
**Deadline: Friday November 26th**
In the following document, we list several project ideas, but you should
also feel free to submit your own. All groups will rank the
projects in order of preference, and we will then do our best to assign
the preferred projects to as many groups as possible. Because not all
projects are equally difficult, we annotated each of them with the
expected workload. The suggested projects cover a wide range of
complexity, and we will evaluate your submissions with that complexity
in mind. For instance, for a project marked with `(1)` (relatively low
complexity) we will be expecting a polished, well-tested and
well-documented extension, whereas projects on the other end (`(3)`) may
be more prototypical. For all submissions, however, we require that you
deliver code that compiles and a set of example input files that
demonstrate the new functionality.
[Project ideas](labs06_material/extensions.pdf)
To announce your preferences, [please fill out this form before the deadline](https://docs.google.com/forms/d/1EqRwNb61ndyTW31bmn_VellCHHTMmaaOPYSiPGbgaKw/edit). You'll have to
provide the names of **exactly 5** projects you would like to
work on, in descending order of preference. We will do our best to
assign you the project you are most interested in.
## Project Orientation
**Deadline: Thursday December 9th**
We will try to inform you about the project assignment as soon as possible. We ask you to be **proactive** and validate with the assistants your understanding of the project goals and the expectations of the end product. Think about the following questions and feel free to ask the assistants about them during the exercise sessions:
- What are the features you will add to the compiler/language?
- What would be some (short) programs highlighting the use of these features?
- What changes might be required in each compiler phase and/or what new phases would you add? (Very roughly)
## Project Presentation
You will present your idea during the lab sessions on the last regular
week of the semester (Dec 16th/22nd/23rd). We'll announce the concrete
schedule of presentations at a later point. [Instructions on what and
how to present your project can be found here.](labs06_material/presentation.md)
## Project Implementation and Report
You will develop your project on top of your implementation of Amy. Please push all development on a new branch `lab06`, ideally building on top of the codegen lab (branch `lab05`). We will refer to this branch in case of problems with your submission.
Deadline: **Friday January 7th at 11 pm**.
Submission: one team member submits a zip file submission-groupNumber.zip to the [moodle submission page](https://moodle.epfl.ch/mod/assign/view.php?id=1189120).
Your zip file should contain:
- Your implementation, which must compile and be able to run non-trivial examples in order to be graded at all.
- A subdirectory `extension-examples/` which includes some examples that demonstrate your compiler extension in action.
- A subdirectory `report/` which includes a PDF summarizing your extension.
- A subdirectory `slides/` which includes the PDF of the project presentation.
- A README file indicating how we should run and test the implemented functionality, with examples.
**If you did not manage to complete your planned features, or they are
partially implemented, make this clear in your report!**
You are encouraged to use the following (LaTeX) template for your
report:
- [LaTeX sources](labs06_material/report-template.tar.gz)
A PDF version of the template with the required section is available
here:
- [PDF Example](labs06_material/report-template.pdf)
Although you are not required to use the above template, your report
must contain at least the sections described in it with the appropriate
information. Note that writing this report will take some time, and you
should not leave it to the last minute. The final report is an important
part of the compiler project. If you have questions about the template
or the contents of the report, make sure you ask them early.
A common question is "how long should the report be?". There's no
definitive answer to that. Considering that the report will contain code
examples and a technical description of your implementation, it would be
surprising if it were shorter than 3 pages. Please try to stay within 6
pages. A concise but well-written report is preferable to a long but
poorly-written one.