diff --git a/labs/labs06_material/presentation.md b/labs/labs06_material/presentation.md new file mode 100644 index 0000000000000000000000000000000000000000..c0cc05acaa11b1798493e523a127cbc29ba68719 --- /dev/null +++ b/labs/labs06_material/presentation.md @@ -0,0 +1,109 @@ +## Compiler Extension Presentation Instructions + +Background presentations will take place in week 14. We strongly +recommend that you pre-record your presentation. **[You should upload +your talk on SwitchTube](https://tube.switch.ch/channels/c1d660a4)** +(the precise channel will be linked here soon). However, if you prefer, +you can also live stream your presentation, but in that case you are +responsible if the presentation does not reach your audience due to +network quality issues. + +**The presentation should be 10 minutes long.** + +**Q&A session of 5-10 minutes** will follow right after the +presentation. Please make sure at least one of you is available for the +entire 20 minute slot. + +**We would like each member of the group to be part of the +presentation.** + +Shortly after, you will receive feedback from us regarding the content +of your presentation, as well as some general feedback on the form. + +### Presentation content + +Your presentation should summarize your project. 
In particular, we\'d +expect to see + +- a basic overview of the features you added to the compiler/language +- some (short) programs highlighting the use of these features, with a + description of how your extended compiler behaves on them +- possibly some theoretical background you had to learn about to + implement the extension +- an overview of the changes you made to each compiler phase and/or + which phases you added + +### Presentation style + +Here are some useful resources on how to prepare and give talks: + +- [How To Speak by Patrick + Winston](https://www.youtube.com/watch?v=Unzc731iCUY) +- [How to give a great research talk by Simon Peyton + Jones](https://www.microsoft.com/en-us/research/academic-program/give-great-research-talk/) + +Please do not use Viktor\'s videos as a model for the presentation, but +instead incorporate as many points of the talk of [Patrick +Winston](https://en.wikipedia.org/wiki/Patrick_Winston) as you believe +apply to your presentation. It is an amazing and entertaining talk, +despite (or because) it is meta-circular: he does as he says. Note: +breaking physical objects or referring to supernatural beings in your +video is not required. Use your own judgement and strike a balance in +being comfortable with what and how you are saying things and trying out +these pieces of advice. + +### Instructions for video (recording or streaming) + +We suggest that the speaker\'s video shows up when the speaker starts to +speak, so that the audience can relate and identify the speaker. +Afterwards, the video can be turned off and should come back on for +questions and answers. Optionally, a small video can stay on throughout +the presentation. The main content of the presentation should be a +window showing the material being presented, for example as a PDF to +which you can point to and/or annotate it. 
If your hardware allows it, +you can also use a tablet to simulate a blackboard presentation where +you write down everything you present, or use a combination of simple +slides and a strategy of what you will write on them. + +**Video upload:** [please upload your video to this +channel](https://tube.switch.ch/channels/c1d660a4) (login with EPFL +credentials) + +### Viktor\'s recording setup + +For your information and not as a requirement, Viktor\'s lectures are +prepared using this hardware and software setup on Ubuntu 20 OS: + +- slides prepared using the \`beamer\` latex package +- slides annotated using \`xournal\` PDF annotator in full screen mode + on display size 1920x1080 +- recording using Zoom, with the following options: + - screen sharing PDF annotator (\`xournal\`), **without** option + to optimize for full-screen viewing + - local recording, with option **Optimize for 3rd party video + editor** +- wacom cintiq pro display as external monitor for annotating PDFs + using pen +- video segments are cut and assembled using ffmpeg, which works very + fast: + - cut like this: + +```{=html} +<!-- --> +``` + ffmpeg -i zoom_0.mp4 -ss 00:00:00 -to 00:02:03.00 -c copy mysegment01.mp4 + + * concatenate like this: + + ffmpeg -f concat -i segmentlist.txt -c copy mycombinedvideo.mp4 + +where segmentlist.txt is a file containing one line for each file to +include: + + file 'mysegment01.mp4' + file 'mysegment02.mp4' + file 'mysegment03.mp4' + +Alternatively, you can also use \`obs\` open source software. For +recording, under advanced options, you may wish to choose a 1 second key +frame interval to make cutting the video with ffmpeg work well. 
diff --git a/labs/labs06_material/report-template.pdf b/labs/labs06_material/report-template.pdf new file mode 100644 index 0000000000000000000000000000000000000000..c4d859b01e0dd78f2ac48af34659a62896ebbcd4 Binary files /dev/null and b/labs/labs06_material/report-template.pdf differ diff --git a/labs/labs06_material/report-template.tar.gz b/labs/labs06_material/report-template.tar.gz new file mode 100644 index 0000000000000000000000000000000000000000..4585d7888e45ec24cbac6cc7020e7954bf3fa6b7 Binary files /dev/null and b/labs/labs06_material/report-template.tar.gz differ diff --git a/labs/labs_04.md b/labs/labs_04.md new file mode 100644 index 0000000000000000000000000000000000000000..f71596cb8bb56f361735366358a154272b6e0b83 --- /dev/null +++ b/labs/labs_04.md @@ -0,0 +1,149 @@ +# Lab 04: Type Checker + +Parsing concludes the syntactical analysis of Amy programs. Having +successfully constructed an abstract syntax tree for an input program, +compilers typically run one or multiple phases containing checks of a +more semantical nature. Virtually all high-level programming languages +enjoy some form of name analysis, whose purpose is to disambiguate +symbol references throughout the program. Some languages go further and +perform a series of additional checks whose goal is to rule out runtime +errors statically (i.e., during compilation, or in other words, without +executing the program). While the exact rules for those checks vary from +language to language, this part of compilation is typically summarized +as \"type checking\". Amy, being a statically-typed language, requires +both name and type analysis. + +## Prelude: From Nominal to Symbolic Trees + +Recall that during parsing we created (abstract syntax) trees of the +*nominal* sort: Names of variables, functions and data types were simply +stored as strings. However, two names used in the program could be the +same, but not refer to one and the same \"thing\" at runtime. 
During +name analysis we translate from nominal trees to symbolic ones, to make +it clear whether two names refer to one and the same underlying entity. +That is, we explicitly replace strings by fresh identifiers which will +prevent us from mixing up definitions of the same name, or referring to +things that have not been defined. Amy\'s name analyzer is provided to +you as part of this lab\'s skeleton, but you should read the [dedicated +name analyzer page](name analyzer) to understand how it works. + +## Introduction to Type Checking + +The purpose of this lab is to implement a type checker for Amy. Our type +checking rules will prevent certain errors based on the kind or shape of +values that the program is manipulating. For instance, we should prevent +an integer from being added to a boolean value. + +Type checking is the last stage of the compiler frontend. Every program +that reaches the end of this stage without an error is correct (as far +as the compiler is concerned), and every program that does not is wrong. +After type checking we are finally ready to interpret the program or +compile it to binary code! + +Typing rules for Amy are presented in detail in the +[Amy specification](amy_specification.md). Make sure to check correct +typing for all expressions and patterns. + +## Implementation + +The current assignment focuses on the file `TypeChecker.scala`. As +usual, the skeleton and helper methods are given to you, and you will +have to complete the missing parts. In particular, you will write a +compiler phase that checks whether the expressions in a given program +are well-typed and report errors otherwise. + +To this end you will implement a simplified form of the Hindley-Milner +(HM) type-inference algorithm that you\'ll hear about during the +lectures. Note that while not advertised as a feature to users of Amy, +behind the scenes we will perform type inference. 
It is usually +straightforward to adapt an algorithm for type inference to type +checking, since one can add the user-provided type annotations to the +set of constraints. This is what you will do with HM in this lab. + +Compared to the presentation of HM type inference in class your type +checker can be simplified in another way: Since Amy does not feature +higher-order functions or polymorphic data types, types in Amy are +always *simple* in the sense that they are not composed of arbitrary +other types. That is, a type is either a base type (one of `Int`, `Bool` +and `String`) or it is an ADT, which has a proper name (e.g. `List` or +`Option` from the standard library). In the latter case, all the types +in the constructor of the ADT are immediately known. For instance, the +standard library\'s `List` is really a list of integers, so we know that +the `Cons` constructor takes an `Int` and another `List`. + +As a result, your algorithm will never have to deal with complex +constraints over type constructors (such as the function arrow +`A => B`). Instead, your constraints will always be of the form +`T1 = T2` where `T1` and `T2` are either *simple* types or type +variables. This is most important during unification, which otherwise +would have to deal with complex types separately. + +Your task now is to a) complete the `genConstraints` method which will +traverse a given expression and collect all the necessary typing +constraints, and b) implement the *unification* algorithm as +`solveConstraints`. + +Familiarize yourself with the `Constraint` and `TypeVariable` data +structures in `TypeChecker.scala` and then start by implementing +`genConstraints`. The structure of this method will in many cases be +analogous to the AST traversal you wrote for the name analyzer. Note +that `genConstraints` also takes an *expected type*. For instance, in +case of addition the expected type of both operands should be `Int`. 
For +other constructs, such as pattern `match`es it is not inherently clear +what should be the type of each `case` body. In this case you can create +and pass a fresh type variable. + +Once you have a working implementation of both `genConstraints` and +`solveConstraints` you can copy over your previous work on the +interpreter and run the programs produced by your frontend! Don\'t +forget that to debug your compiler\'s behavior you can also use the +reference compiler with the `--interpret` flag and then compare the +output. + +## Skeleton + +As usual, you can find the skeleton for this lab in a new branch of your +group\'s repository. After merging it with your existing work, the +structure of your project `src` directory should be as follows: + + src/amyc + ├── Main.scala (updated) + │ + ├── analyzer (new) + │ ├── SymbolTable.scala + │ ├── NameAnalyzer.scala + │ └── TypeChecker.scala + │ + ├── ast + │ ├── Identifier.scala + │ ├── Printer.scala + │ └── TreeModule.scala + │ + ├── interpreter + │ └── Interpreter.scala + │ + ├── lib + │ ├── scallion_3.0.6.jar + │ └── silex_3.0.6.jar + │ + ├── parsing + │ ├── Parser.scala + │ ├── Lexer.scala + │ └── Tokens.scala + │ + └── utils + ├── AmycFatalError.scala + ├── Context.scala + ├── Document.scala + ├── Pipeline.scala + ├── Position.scala + ├── Reporter.scala + └── UniqueCounter.scala + +## Deliverables + +You are given **3 weeks** for this assignment. + +Deadline: **TBD**. + +Submission: one team member submits a zip file submission-groupNumber.zip to the [moodle submission page](). diff --git a/labs/labs_05.md b/labs/labs_05.md new file mode 100644 index 0000000000000000000000000000000000000000..6da60e8a51a5abb73eae0f5fa20bf353e61fbed3 --- /dev/null +++ b/labs/labs_05.md @@ -0,0 +1,170 @@ +# Lab 05: Code Generation + +## Introduction + +Welcome to the last common assignment for the Amy compiler. 
At this +point, we are finally done with the frontend: we have translated source +programs to ASTs and have checked that all correctness conditions hold +for our program. We are ready to generate code for our program. In our +case the target language will be *WebAssembly*. + +WebAssembly is \"a new portable, size- and load-time-efficient format +suitable for compilation to the web\" (<http://webassembly.org>). +WebAssembly is designed to be called from JavaScript in browsers and +lends itself to highly-performant execution. + +For simplicity, we will not use a browser, but execute the resulting +WebAssembly bytecode directly using `nodejs`, which is essentially a +standalone distribution of the Chrome browser\'s JavaScript engine. When +you run your complete compiler (or the reference compiler) with no +options on program `p`, it will generate three different files under the +`wasmout` directory: + +- `p.wat` is the wasm output of the compiler in text format. You can + use this representation to debug your generated code. +- `p.wasm` is the binary output of the compiler. This is what `nodejs` + will use. To translate to the binary format, we use the `wat2wasm` + tool provided by the WebAssembly developers. For your convenience we + have included it in the `bin` directory of the skeleton. Note that + this tool performs a purely mechanical translation and thus its + output (for instance, `p.wasm`) corresponds to a binary + representation of `p.wat`. +- `p.js` is a JavaScript wrapper which we will run with nodejs and + which serves as an entrypoint into your generated binary. + +To run the program, simply type `nodejs wasmout/p.js` + +### Installing nodejs + +- You can find directions for your favorite operating system + [here](https://nodejs.org/en/). You should have nodejs 12 or later + (run `nodejs --version` to make sure). +- Once you have installed nodejs, run `npm install deasync` from the + directory you plan to run `amyc` in, i.e. 
the toplevel directory of + the compiler. +- Make sure the `wat2wasm` executable is visible, i.e. it is in the + system path or you are at the toplevel of the `amyc` directory. + +## WebAssembly and Amy + +Look at [this +presentation](http://lara.epfl.ch/~gschmid/clp20/codegen.pdf) for the +main concepts of how to translate Amy programs to WebAssembly. + +You can find the annotated compiler output to the concat example +[here](http://lara.epfl.ch/~gschmid/clp20/concat.wat). + +## The assignment code + +### Overview + +The code for the assignment is divided into two directories: `wasm` for +the modeling of the WebAssembly framework, and `codegen` for +Amy-specific code generation. There is a lot of code here, but your task +is only to implement code generation for Amy expressions within +`codegen/CodeGen.scala`. + +- `wasm/Instructions.scala` provides types that describe a subset of + WebAssembly instructions. It also provides a type `Code` to describe + sequences of instructions. You can chain multiple instructions or + `Code` objects together to generate a longer `Code` with the `<:>` + operator. +- `wasm/Function.scala` describes a wasm function. + - `LocalsHandler` is an object which will create fresh indexes for + local variables as needed. + - A `Function` contains a field called `isMain` which is used to + denote a main function without a return value, which will be + handled differently when printing, and will be exported to + JavaScript. + - The only way to create a `Function` is using `Function.apply`. + Its last argument is a function from a `LocalsHandler` to + `Code`. The reason for this unusual choice is to make sure the + Function object is instantiated with the number of local + variables that will be requested from the LocalsHandler. To see + how it is used, you can look in `codegen/Utils.scala` (but you + won\'t have to use it directly). 
+- `wasm/Module.scala` and `wasm/ModulePrinter.scala` describe a wasm + module, which you can think of as a set of functions and the + corresponding module headers. +- `codegen/Utils.scala` contains a few utility functions (which you + should use!) and implementations of the built-in functions of Amy. + Use the built-ins as examples. +- `codegen/CodeGen.scala` is the focus of the assignment. It contains + code to translate Amy modules, functions and expressions to wasm + code. It is a pipeline and returns a wasm Module. +- `codegen/CodePrinter.scala` is a Pipeline which will print output + files from the wasm module. + +### The cgExpr function + +The focus of this assignment is the `cgExpr` function, which takes an +expression and generates a `Code` object. It also takes two additional +arguments: (1) a `LocalsHandler` which you can use to get a new slot for +a local when you encounter a local variable or you need a temporary +variable for your computation. (2) a map `locals` from `Identifiers` to +locals slots, i.e. indices, in the wasm world. For example, if `locals` +contains a pair `i -> 4`, we know that `get_local 4` in wasm will push +the value of i to the stack. Notice how `locals` is instantiated with +the function parameters in `cgFunction`. + +## Skeleton + +As usual, you can find the skeleton for this lab in a new branch of your +group\'s repository. After merging it with your existing work, the +structure of your project `src` directory should be as follows: + + src/amyc + ├── Main.scala (updated) + │ + ├── analyzer + │ ├── SymbolTable.scala + │ ├── NameAnalyzer.scala + │ └── TypeChecker.scala + │ + ├── ast + │ ├── Identifier.scala + │ ├── Printer.scala + │ └── TreeModule.scala + │ + ├── bin + │ └── ... 
+ │ + ├── codegen (new) + │ ├── CodeGen.scala + │ ├── CodePrinter.scala + │ └── Utils.scala + │ + ├── interpreter + │ └── Interpreter.scala + │ + ├── lib + │ ├── scallion_3.0.6.jar + │ └── silex_3.0.6.jar + │ + ├── parsing + │ ├── Parser.scala + │ ├── Lexer.scala + │ └── Tokens.scala + │ + ├── utils + │ ├── AmycFatalError.scala + │ ├── Context.scala + │ ├── Document.scala + │ ├── Pipeline.scala + │ ├── Position.scala + │ ├── Reporter.scala + │ └── UniqueCounter.scala + │ + └── wasm (new) + ├── Function.scala + ├── Instructions.scala + ├── ModulePrinter.scala + └── Module.scala + +## Deliverables + +You are given **4 weeks** for this assignment. + +Deadline: **TBD**. + +Submission: one team member submits a zip file submission-groupNumber.zip to the [moodle submission page](). diff --git a/labs/labs_06.md b/labs/labs_06.md new file mode 100644 index 0000000000000000000000000000000000000000..edd768fda8a79df166f88dc3e93c8e8ea32e4fac --- /dev/null +++ b/labs/labs_06.md @@ -0,0 +1,110 @@ +# Labs 06: Compiler extension project + +You have now written a compiler for Amy, a simple functional language. +The final lab project is to design and implement a new functionality of +your own choice on top of the compiler you built so far. In preparation +for this, you should aim to learn about the problem domain by searching +the appropriate literature. The project includes: + +- designing and implementing the new functionality +- documenting the results in a written report document + +This project has several deadlines, detailed below. Please note that the +first of them (choosing the topic) is already coming up on Sunday! + +## Selecting a Project Topic + +**Deadline: TBD** + +In the following document, we list several project ideas, but you should +also feel free to submit your own by email. All groups will rank the +projects in order of preference, and we will then do our best to assign +the preferred projects to as many groups as possible. 
Because not all +projects are equally difficult, we annotated each of them with the +expected workload. The suggested projects cover a wide range of +complexity, and we will evaluate your submissions with that complexity +in mind. For instance, for a project marked with `(1)` (relatively low +complexity) we will be expecting a polished, well-tested and +well-documented extension, whereas projects on the other end (`(3)`) may +be more prototypical. For all submissions, however, we require that you +deliver code that compiles and a set of example input files that +demonstrate the new functionality. + +[Project ideas](labs06_material/extensions.pdf) + +To announce your preferences, [please fill out this form by Sunday at +the latest](). You\'ll have to +provide **the names of exactly your top 5** projects you would like to +work on, in order of descending preference. We will do our best to +assign you the project you are most interested in. + +## Project Orientation + +**Deadline: TBD** + +We will try to inform you about the project assignment as soon as +possible. To give you a chance to validate your understanding of the +project and what\'s expected of you, we will offer dedicated slots +during the project sessions next week. Before you join, you should think +about the following questions: + +- What are the features you will add to the compiler/language? +- What would be some (short) programs highlighting the use of these + features? +- What changes might be required in each compiler phase and/or what + new phases would you add? (Very roughly) + +**TODO: define slots** + +## Project Presentation + +You will present your idea during the lab sessions on the last regular +week of the semester (Dec 16th/22nd/23rd). We\'ll announce the concrete +schedule of presentations at a later point. 
[Instructions on what and +how to present your project can be found here.](labs06_material/presentation.md) + +## Project Implementation and Report + +**Deadline: Jan 7th 2021 23h00** + +Your implementation and a report are due on this date, and both will be +delivered using Git. You will develop your project on top of your +implementation of Amy. Please push all development to a new branch +`lab06`, ideally building on top of the codegen lab (branch `lab05`). +**TODO: define submission method** + +Your repository should contain: + +- Your implementation, which must, to be graded at all, compile and be + able to run non-trivial examples. +- A subdirectory `extension-examples/` which includes some examples + that demonstrate your compiler extension in action. +- A subdirectory `report/` which includes a PDF summarizing your + extension. + +**If you did not manage to complete your planned features, or they are +partially implemented, make this clear in your report!** + +You are encouraged to use the following (LaTeX) template for your +report: + +- [LaTeX sources](labs06_material/report-template.tar.gz) + +A PDF version of the template with the required sections is available +here: + +- [PDF Example](labs06_material/report-template.pdf) + +Although you are not required to use the above template, your report +must contain at least the sections described in it with the appropriate +information. Note that writing this report will take some time, and you +should not do it at the last minute. The final report is an important +part of the compiler project. If you have questions about the template +or the contents of the report, make sure you ask them early. + +A common question is \"how long should the report be?\". There\'s no +definitive answer to that. Considering that the report will contain code +examples and a technical description of your implementation, it would be +surprising if it were shorter than 3 pages. Please try to stay within 6 +pages. 
A concise, but well-written report is preferable to a long, but +poorly-written one.