%% To import in the preamble
%\usepackage{listings}
\usepackage{letltxmacro}
\newcommand*{\SavedLstInline}{}
\LetLtxMacro\SavedLstInline\lstinline
\DeclareRobustCommand*{\lstinline}{%
\ifmmode
\let\SavedBGroup\bgroup
\def\bgroup{%
\let\bgroup\SavedBGroup
\hbox\bgroup
}%
\fi
\SavedLstInline
}
\lstdefinelanguage{ML}{
alsoletter={*},
morekeywords={datatype, of, if, *},
sensitive=true,
morecomment=[s]{/*}{*/},
morestring=[b]"
}
% "define" Scala
\lstdefinelanguage{scala}{
alsoletter={@,=,>},
morekeywords={abstract, Boolean, case, class, def,
else, error, extends, false, free, if, implicit, Int, match,
object, operator, String, true, Unit, val, var, while,
for, in, inline, array, external, export},
sensitive=true,
morecomment=[l]{//},
morecomment=[s]{/*}{*/},
morestring=[b]"
}
% \newcommand{\codestyle}{\tiny\sffamily}
\newcommand{\codestyle}{\ttfamily}
\newcommand{\SAND}{\mbox{\tt \&\&}\xspace}
\newcommand{\SOR}{\mbox{\tt ||}\xspace}
\newcommand{\MOD}{\mbox{\tt \%}\xspace}
\newcommand{\DIV}{\mbox{\tt /}\xspace}
\newcommand{\PP}{\mbox{\tt ++}\xspace}
\newcommand{\MM}{\mbox{\tt {-}{-}}\xspace}
\newcommand{\RA}{\Rightarrow}
\newcommand{\EQ}{\mbox{\tt ==}}
\newcommand{\NEQ}{\mbox{\tt !=}}
\newcommand{\SLE}{\ensuremath{\leq}}
\newcommand{\SGE}{\ensuremath{\geq}}
\newcommand{\SGT}{\mbox{\tt >}}
\newcommand{\SLT}{\mbox{\tt <}}
\newcommand{\rA}{\rightarrow}
\newcommand{\lA}{\leftarrow}
%============================
% To make it colorful uncomment \color in next 30 lines
%\makeatletter
%\newcommand*\idstyle{%
% \expandafter\id@style\the\lst@token\relax
%}
%\def\id@style#1#2\relax{%
% \ifcat#1\relax\else
% \ifnum`#1=\uccode`#1\color{blue!60!black}
% \fi
% \fi
%}
%\makeatother
% Default settings for code listings
\lstset{
language=scala,
showstringspaces=false,
columns=fullflexible,
mathescape=true,
numbers=none,
% numberstyle=\tiny,
basicstyle=\codestyle,
keywordstyle=\bfseries\color{blue!60!black}
,
commentstyle=\itshape\color{red!60!black}
,
%identifierstyle=\idstyle,
tabsize=2%,
%aboveskip=0pt,
%belowskip=0pt
}
\section{Type systems}
\subsection{Polymorphic types (2)}
Allow polymorphic types for functions and classes.
\begin{lstlisting}
abstract class List[A]
case class Nil[A]() extends List[A]
case class Cons[A](h: A, t: List[A]) extends List[A]
def length[A](l: List[A]): Int = {
l match {
case Nil() => 0
case Cons(_, t) => 1 + length(t)
}
}
case class Cons2[A, B](h1: A, h2: B, t: List[A]) extends List[A]
// Wrong, type parameters don't match
\end{lstlisting}
You can assume that the sequence of type parameters of an extending class
is identical to that of the parent in the \lstinline{extends} clause
(see example).
\subsection{Case class subtyping (2)}
Add subtyping support to \langname.
Case classes are now types of their own:
\begin{lstlisting}
val y: Some = Some(0) // Correct, Some is a type
val x: Option = None() // Correct, because None <: Option
val z: Some = None() // Wrong
y match {
case Some(i) => () // Correct
case None() => () // Wrong
}
\end{lstlisting}
Since case classes are types, you can declare a variable, parameter,
ADT field or function return type to be of a case class type,
like any other type.
Case class types are subtypes of their parent (abstract class) type.
This means you can assign case class values to variables
declared with the parent type.
Since we have subtyping, you can now optionally support the \lstinline{Nothing}
type in source code, which is a subtype of every type
and the type of \lstinline{error} expressions.
For this project you will probably rewrite the type checking phase in its entirety.
Rather than dealing with explicit constraints, the resulting phase could perform
more classical type-checking based on the minimal type satisfying all the local
subtyping constraints (the so-called \emph{least upper bound}).
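For example (a sketch, assuming the usual \lstinline{Option} declarations), the least upper bound determines the type of an if-expression whose branches have different case class types:
\begin{lstlisting}
abstract class Option
case class None() extends Option
case class Some(v: Int) extends Option

// The branches have types Some and None, respectively;
// their least upper bound is the common parent type Option.
val o: Option = if (Std.readInt() == 0) { Some(1) } else { None() }
\end{lstlisting}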
\subsection{Arrays and range types}
In both of the following two projects you would add fixed-size arrays of integers as
a primitive language feature along with a type system that allows users to specify
the range of integers.
The information about an integer's range can then be used to make array accesses safe
by ensuring that indices are in-bounds.
The difference between the two projects lies in \emph{when} integer bounds are checked,
i.e., at compile-time (\emph{statically}) or at runtime (\emph{dynamically}).
In either case you will add two kinds of types:
First, a family of primitive types \lstinline{array[$n$]} that represent integer arrays of
size $n$.
Second, \emph{range types}, which represent subsets of \lstinline{Int} and take the
following form:
\lstinline{[$i$ .. $j$]} where $i$ and $j$ are integer constants.
The intended semantics is for \lstinline{[$i$ .. $j$]} to represent a signed 32-bit integer
$n$ such that $i \le n \le j$.
\subsubsection{Dynamically-checked range types (2)}
Your type system should allow users to specify \emph{concrete} ranges, e.g.,
\lstinline{[0 .. 7]} to denote integers $0 \le n \le 7$. Values of \lstinline{Int} and
any range types will be compatible during type-checking, but your system will have to be
able to detect when an integer might not fall within a given range at runtime.
During code generation your task will then be to emit \emph{runtime checks} to ensure
that, e.g., an \lstinline{Int} in fact falls within the range \lstinline{[0 .. 7]}.
\begin{lstlisting}
// initialize an array of size 8:
val arr: array[8] = [10, 20, 30, 40, 50, 60, 70, 80];
arr[0]; // okay, should not emit any runtime check
arr[arr.length-1]; // okay, same as above
// also okay, but should emit a runtime bounds check:
arr[Std.readInt()];
\end{lstlisting}
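Conceptually (a sketch of the intended semantics, not prescribed generated code), the check emitted for the last access behaves like the following source-level guard:
\begin{lstlisting}
val i: Int = Std.readInt();
// inserted by the compiler: abort unless the index is in range
(if (0 <= i && i <= 7) { () } else { error("index out of bounds") });
arr[i]
\end{lstlisting}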
In effect, your system will ensure that array accesses are always in-bounds, i.e.,
never access an element before an array's first element or past its last.
Note that the resulting system should only emit the minimal number of runtime checks to
ensure such safety. For instance, consider the following program:
\begin{lstlisting}
def printBoth(arr1: array[8], arr2: array[8], i: [0 .. 7]): Unit = {
Std.printInt(arr1[i], arr2[i])
}
val someInt: Int = 4;
printBoth([1,2,3,4,5,6,7,8], [8,7,6,5,4,3,2,1], someInt)
\end{lstlisting}
Here it is not necessary to perform any checks in the body of \lstinline{printBoth}, since
whatever values are passed as arguments for parameter \lstinline{i} should have previously
been checked to lie between 0 and 7.
In this concrete case, a runtime check should occur when \lstinline{someInt} is passed to
\lstinline{printBoth}.
\subsubsection{Statically-checked range types (2+, challenging)}
Your type system should be strict and detect potential out-of-bounds array accesses
early. In particular, when your type checker cannot prove that an integer lies in the
required range it should produce a type error (and stop compilation as usual).
\begin{lstlisting}
val arr: array[8] = [10, 20, 30, 40, 50, 60, 70, 80];
arr[0]; // okay
arr[arr.length-1]; // okay
arr[arr.length]; // not okay, type error "Idx 8 is out-of-bounds"
val i: Int = Std.readInt();
arr[i]; // not okay, type error "Int may be out-of-bounds"
if (i >= 0 && i < 8) {
arr[i] // okay, branch is only taken when i is in bounds
}
\end{lstlisting}
To allow as many programs as possible to be accepted your type-checker will have to
employ precise typing rules for arithmetic expressions and if-expressions.
What you will implement are simple forms of \emph{path sensitivity} and
\emph{abstract interpretation}.
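As a hint (one possible formulation, not the required one), arithmetic on range types can follow interval arithmetic:
\[
\frac{e_1 : \lstinline{[$a$ .. $b$]} \qquad e_2 : \lstinline{[$c$ .. $d$]}}
     {e_1 + e_2 : \lstinline{[$a{+}c$ .. $b{+}d$]}}
\]
and a comparison can refine a type within a branch: if \lstinline{x} has type \lstinline{[$a$ .. $b$]}, then in the then-branch of \lstinline{if (x < $n$) ...} it may be refined to \lstinline{[$a$ .. $\min(b, n-1)$]}.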
\paragraph{Constant-bounds version (2)}
Implement statically-checked range types for arrays of fixed and statically-known sizes.
Range types will only involve constant bounds and in effect your type-checker will only
have to accept programs that operate on arrays whose sizes are known to you as concrete
integer constants.
The typing rules that you come up with should be sufficiently strong to prove safety of
simple array manipulations such as the following:
\begin{lstlisting}
def printArray(arr: array[4], i: [0 .. 4]): Unit = {
if (i < arr.length) {
Std.printInt(arr[i]);
printArray(arr, i+1)
}
}
printArray([1,2,3,4], 0)
\end{lstlisting}
\paragraph{Dependently-typed version (3)}
Rather than relying on the user to provide the exact sizes of arrays, also allow arrays
to be of a fixed, but not statically-known size. To enable your type system to accept
more programs, you should also extend the notion of range types to allow bounds
\emph{relative to} a given array's size.
The resulting types will extend the above ones in at least two ways:
In addition to \lstinline{array[$n$]} there is a special form \lstinline{array[*]}
which represents an array of arbitrary (but fixed) size.
For a range type \lstinline{[$i$ .. $j$]} $i$ and $j$ may not only be integer constants,
but may also be expressions of the form \lstinline{arr.length + $k$} where
\lstinline{arr} is an Array-typed variable in scope and $k$ is an integer constant.
Your system should then be able to abstract over concrete array sizes by referring to
some Array-typed binding's length like in the following example:
\begin{lstlisting}
def printArray(arr: array[*], i: [0 .. arr.length]): Unit = {
if (i < arr.length) {
Std.printInt(arr[i]);
printArray(arr, i+1)
}
}
printArray([1,2,3,4], 0)
printArray([1,2,3,4,5,6,7,8], 0)
\end{lstlisting}
Note that the resulting language will be \emph{dependently-typed}, meaning that
types can depend on terms. In the above example, for instance, the type of parameter
\lstinline{i} of function \lstinline{printArray} depends on parameter \lstinline{arr}.
% TODO: (2) Simple ownership system / affine types + in-place updates for ADTs?
% TODO: (1) Region-based ADT allocation + static tracking of provenance
Below you will find the instructions for the first lab assignment in which you will get to know and implement an interpreter for the Amy language. If you haven't looked at the [Labs Setup](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/labs-setup.md) page yet, please do so before starting out with the assignment.
# Part 1: Your first Amy programs
Write two example Amy programs each and make sure you can compile them using the [Amy Reference Compiler](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/amy_reference_compiler.md). Put them under `/examples`. Please be creative when writing your programs: they should be nontrivial and not reproduce the functionality of the examples in the `/library` and `/examples` directories of the repository. Of course you are welcome to browse these directories for inspiration.
Remember that you will use these programs for the remainder of the semester to test your compiler, so don't make them too trivial! Try to test many features of the language.
If you have questions about how a feature of Amy works, you can always look at the [Amy Specification](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/amy-specification/amy-specification.pdf). It's a good idea to keep a local copy of this document handy -- it will be your reference for whenever you are asked to implement an aspect of the Amy language throughout this semester.
# Part 2: An Interpreter for Amy ([Slides](lab01-slides.pdf))
The main task of the first lab is to write an interpreter for Amy.
(If you haven't been assigned your repository yet, you can download a packaged version of the interpreter lab's skeleton [here](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/lab01/lab01.zip). If you already have your repository assigned, you can simply check out the `clplab1` branch. Note that future labs will only be distributed through the repository, so be sure to familiarize yourself with the setup.)
## Interpreters
The approach to executing programs that you have mostly seen so far is compilation to some kind of low-level code (bytecode for a virtual machine such as Java's; native binary code in the case of languages such as C). An alternative way to execute programs is interpretation. According to Wikipedia, "an interpreter is a computer program that directly executes, i.e. performs, instructions written in a programming or scripting language, without previously compiling them into a machine language program". In other words, your interpreter is supposed to directly look at the code and *interpret* its meaning. For example, when encountering a call to the `printString` function, your interpreter should print its argument on the standard output.
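To make this concrete, here is a minimal sketch of an interpreter for a toy expression language in Scala (illustrative only; the actual Amy AST has different node names and many more cases):

```scala
// Toy AST: integer literals, addition, and printing.
sealed trait Expr
case class Lit(value: Int) extends Expr
case class Plus(lhs: Expr, rhs: Expr) extends Expr
case class Print(arg: Expr) extends Expr

// The interpreter walks the tree and directly performs the meaning of each node.
def interpret(e: Expr): Int = e match {
  case Lit(v)     => v
  case Plus(l, r) => interpret(l) + interpret(r)
  case Print(a)   =>
    val v = interpret(a)
    println(v) // perform the side effect right away
    v
}
```

For instance, `interpret(Plus(Lit(1), Lit(2)))` evaluates to `3`.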
## The general structure of the Interpreter
The skeleton of the assignment is provided by us in three files:
- The `Main.scala` source file
- The `Interpreter.scala` source file, and
- the `amyc-frontend-1.7.jar` bytecode file, which is located under `lib/` .
Now let's look into the code in a little more detail.
In `Main.scala`, take a look at the main method, which is the entry point to your program. After processing the command line arguments of the interpreter, the main method creates a Pipeline, which contains the different stages of the compiler (more on it in later assignments). The Pipeline will first call the Amy frontend, which will parse the source program into an abstract syntax tree (AST) and check it for correctness according to the [Amy specification](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/amy-specification/amy-specification.pdf), and then passes the result to the Interpreter.
The implementation of the frontend is given to you in compiled form, because you will need to write your own version in the next assignments. **Note**: You are only allowed to use this binary code to link against your interpreter.
So what is this AST we've mentioned? For the computer to "understand" the meaning of a program, it first has to transform it from source (text) form to a more convenient form, which we call an abstract syntax tree. The AST abstracts away uninteresting details of the program (e.g. parentheses, whitespace, operator precedence...) and keeps the essential structure of the program.
In Scala, we represent the AST as a tree-form object. The tree has different types of nodes, each one representing a different programming structure. The types of nodes are of course represented as different classes, which all inherit from a class called Tree. Conveniently enough, the classes correspond pretty much one-to-one to the rules of the BNF grammar given in the language specification. E.g. in the language spec we read that a module looks as follows:
Module ::= **object** Id Definition* Expr? **end** Id
and indeed in the implementation we find a class
`case class ModuleDef(name: Identifier, defs: List[ClassOrFunDef], optExpr: Option[Expr]) extends Definition`
You can find the source code of the AST [here](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/lab01/material/SymbolicTreeModule.scala).
Note: This is not exactly the code we will use in later assignments, but it's good enough to serve as a reference while implementing this first assignment.
## The Interpreter class
Now let's delve into `Interpreter.scala`. This file currently only contains a partial implementation, and it is your task to complete it! The entrypoint into the interpreter is `interpret`, which takes an expression as input and executes its meaning. The main loop at the end of the class will just take the modules in order and interpret their expression, if present.
`interpret` returns a `Value`, a type that represents a value an Amy expression can produce. `Value` is extended by classes representing the different types of values present in Amy (`Int(32)`, `Boolean`, `Unit`, `String` and ADT values). `Value` has convenience methods (`as*`) to cast to `Int(32)`, `Boolean` and `String`. Remember that we can always call these methods safely when we know the type of an expression (e.g. the operands of an addition), since we know that the program type-checks.
`interpret` takes an additional implicit parameter as an argument, which is a mapping from variables to values (in the interpreted language). In Scala, when an implicit parameter is expected, the compiler will look in the scope for some binding of the correct type and pass it automatically. This way we do not have to pass the same mapping over and over to all recursive calls to `interpret`. Be aware, however, that there are some cases when you need to change the `locals` parameter! Think carefully about when you have to do so.
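As a sketch of this idea (with hypothetical node and value names, not the skeleton's actual API), interpreting a local binding is the typical case where the environment must be extended:

```scala
sealed trait Value
case class IntValue(i: Int) extends Value

sealed trait Expr
case class IntLit(i: Int) extends Expr
case class Variable(name: String) extends Expr
case class Let(name: String, value: Expr, body: Expr) extends Expr

def interpret(e: Expr)(implicit locals: Map[String, Value]): Value = e match {
  case IntLit(i)   => IntValue(i)
  case Variable(n) => locals(n) // look the name up in the current environment
  case Let(n, v, body) =>
    val bound = interpret(v)
    // extend the environment, but only for the body of the let
    interpret(body)(locals + (n -> bound))
}
```

Here `interpret(Let("x", IntLit(41), Variable("x")))(Map.empty)` yields `IntValue(41)`; all other recursive calls pick up `locals` implicitly.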
A few final notes:
* You can print program output straight to the console.
* You can assume the input programs are correct. This is guaranteed by the Amy frontend.
* To find constructors and functions in the program, you have to search in the `SymbolTable` passed along with the program. To do this, use the three helper methods provided in the interpreter:
* `isConstructor` will return whether the `Identifier` argument is a type constructor in the program
* `findFunctionOwner` will return the module which contains the given `Identifier`, which has to be a function in the program. E.g. if you give it the `printInt` function of the `Std` module, you will get the string `"Std"`.
* `findFunction` will return the function definition given a pair of Strings representing the module containing the function, and the function name. The return value is of type `FunDef` (see [the AST definitions](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/lab01/material/SymbolicTreeModule.scala)).
* When comparing Strings by reference, compare the two `StringValue`s directly and not the underlying Strings. The reason is that the JVM interns String constants, so comparing the underlying Strings by reference may unexpectedly return true.
* Some functions contained in the `Std` module are built-in in the language, i.e. they are hard-coded in the interpreter because they cannot be implemented in Amy otherwise. An example of a built-in function is `printString`. When you implement the interpreter for function calls, you should first check if the function is built-in, and if so, use the implementation provided in the `builtIns` map in the interpreter.
* When a program fails (e.g. due to a call to `error` or a match failure), you should call the dedicated method in the `Context`: `ctx.reporter.fatal`.
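The String comparison pitfall from the notes above can be reproduced in plain Scala (`StringValue` here is a stand-in for the interpreter's own value class): equal string literals may be the very same interned object, while two separately constructed wrappers never are:

```scala
case class StringValue(s: String)

// Equal string literals can be the same interned object on the JVM:
val internedSame = "hello" eq "hello"                      // true

// Two separately constructed wrappers are distinct objects:
val refSame = StringValue("hello") eq StringValue("hello") // false
```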
## Implementation skeleton
If you have followed [Labs Setup](https://gitlab.epfl.ch/lara/cs320/-/blob/main/labs/labs-setup.md) for Lab 01, you should have a working project with a stub implementation, containing the following files:
* `src/amyc/interpreter/Interpreter.scala` contains a partially implemented interpreter
* `src/amyc/Main.scala` contains the `main` method which runs the interpreter on the input files
* The `library` directory contains library definitions you can call from your programs.
* The `examples` directory contains some example programs on which you can try your implementation. Remember that most of them also use library files from `/library`.
* `lib/amy-frontend-1.7.jar` contains the frontend of the compiler as a library, allowing you to work directly with type-checked ASTs of input programs.
You will have to complete the interpreter by implementing the missing methods (marked with the placeholder `???`).
## Testing
When you are done, use sbt to try some of your programs from Part 1:
```
$ sbt
> run library/Std.scala examples/Hello.scala
Hello world!
```
There is also testing infrastructure under `/test`. To add your own tests, you have to add your testcases under `/test/resources/interpreter/passing`
and the expected output under
`/test/resources/interpreter/outputs`.
Then, you have to add the name of the new test in `InterpreterTests`, similarly to the examples given.
To allow a test to also use the standard library (e.g., `Std.printString`), you can copy `Std.scala` from `library/Std.scala` to `/test/resources/interpreter/passing`.
For example, to add a test that expects only "Hello world" to be printed, you can add "/test/resources/interpreter/passing/Hello.scala" containing `object Hello Std.printString("Hello world") end Hello` and `/test/resources/interpreter/outputs/Hello.txt` containing `Hello world` (with a newline at the end!). You will also have to add a line to `/test/scala/amyc/test/InterpreterTests.scala`: `@Test def testHello = shouldOutput(List("Std", "Hello"), "Hello")`. This will pass both files `Std.scala` and `Hello.scala` as inputs of the test. When you now run `test` from sbt, you should see the additional test case (called `testHello`).
## Deliverables
Deadline: **Friday October 7 at 11 pm**.
## Related documentation
* End of Chapter 1 in the Tiger Book presents a similar problem for another mini-language. A comparison of the implementation of ASTs in Java (as shown in the book) and Scala is instructive.
# Labs Setup
This page contains instructions on how to set up your computer to work on assignments.
## Step 0: Find a group
All work in this year's edition of CS-320 will be done in groups of two or three students.
If you are looking for teammates, please consider using the course's [Moodle](https://moodle.epfl.ch/course/view.php?id=4241) forums.
You should submit the names and EPFL e-mail addresses of all teammates through [this form](https://forms.gle/v1gHDFdPZyv8LwzB6). Note that you need to access the form with your EPFL e-mail account.
## Step 1: Version control
Since you will be working with others throughout the course, we provide all labs via Git repositories.
Once you have registered as a group, you will be assigned a repository on [EPFL's GitLab](https://gitlab.epfl.ch/).
For every new lab we are going to push the starter code ("skeleton") on a separate branch of that repository.
Since all labs starting from the second are cumulative, you should ideally merge your work resulting from the preceding, completed lab into the starter code of the new lab.
Once the deadline for a lab comes around, we will simply consider your most recent commit on the respective branch as your submission.
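A typical sequence for starting a new lab might look like this (the branch names are examples; adapt them to your repository):

```shell
# fetch the newly pushed skeleton branch from your repository
git fetch origin

# switch to the new lab's branch
git checkout clplab2

# merge your completed work from the previous lab's branch
git merge clplab1
```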
## Step 2: Java / Scala / SBT
We will be using Scala 3 to implement our compiler project.
Please make sure your system is running at least Java 8 and has [SBT](http://www.scala-sbt.org/) (the default build tool for Scala) installed.
## Step 3: Working on the labs
Once you have formed a group and installed SBT, you are ready to start the labs. The workflow for each assignment is roughly as follows:
* Check out the branch of the current lab from your repository. Merge with your work from prior labs (when starting work on lab 3, 4 or 5).
* Run `compile` from inside SBT to make sure your build still succeeds. If you run `test` at this point it should fail on some of the new tests provided by us.
* Implement the assignment according to the specification. Throughout the semester we will be providing you with details on each specific assignment.