diff --git a/README.md b/README.md index e9ee0ff25d3fa865579d3e18a73e6e40bdf0ec0f..856f5b5cfc0090ff80d3a4fd54c19081aba61b4f 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ Links: [Moodle](https://moodle.epfl.ch/course/view.php?id=4241) and [Course Desc Important information: - * Midterm exam will take place Friday 4 April within the time block 13:00-17:00 in two rooms: [ELA 2](https://plan.epfl.ch/?room==ELA%202) and [CM 1 120](https://plan.epfl.ch/?room==CM%201%20120). One reminder sheet (2 sided) will be allowed. + * Midterm exam will take place Friday 4 April within the time block 13:00-17:00 in two rooms: [ELA 2](https://plan.epfl.ch/?room==ELA%202) and [CM 1 120](https://plan.epfl.ch/?room==CM%201%20120). One reminder sheet (2 sided) will be allowed. * Please register for project groups on Moodle as soon as this is possible ([Registration link](https://moodle.epfl.ch/mod/choicegroup/view.php?id=1282182)) @@ -19,21 +19,21 @@ The grade is based on a midterm (30%) as well as team project work (70%). Please | 1 | | Wed | 19.02.2025 | 13:15 | BC 01 | Lecture 1 | [Intro to CLP](https://mediaspace.epfl.ch/media/01-01%2C+Intro+to+Computer+Language+Processing/0_okro5h0v) [(PDF)](info/lectures/lec01a.pdf), [Formal languages](https://mediaspace.epfl.ch/media/01-02%2C+Formal+Languages/0_segfj94w) [(PDF)](info/lectures/lec01b.pdf) | | | | Fri | 21.02.2025 | 13:15 | ELA 2 | Lecture 2 | [Operations on Formal Languages](https://mediaspace.epfl.ch/media/02-01%2C+Operations+on+Formal+Languages/0_otyeghg6), [Regular Expressions and Idea of a Lexer](https://mediaspace.epfl.ch/media/02-02%2C+Regular+Expressions+and+Lexer+Idea/0_th59v9kx) [(PDF)](info/lectures/lec02.pdf) | | | 1.... | Fri | 21.02.2025 | 15:15 | ELA 2 | Lab 1 | [Interpreter lab released (due in 2 weeks)](./info/labs/lab01/) | +| 2 | 1.... | Wed | 26.02.2025 | 13:15 | BC 01 | Lecture 3 | [First Symbols. 
Constructing a Lexer](https://mediaspace.epfl.ch/media/03-01%2C+First+Symbols.+Constructing+a+Lexer/0_a943fw0n) [(PDF)](info/lectures/lec03a.pdf), [From Regular Expressions to Automata](https://mediaspace.epfl.ch/media/03-02%2C+From+Regular+Expressions+to+Automata/0_icjqhfj0) [(PDF)](info/lectures/lec03b.pdf) | +| | 1..... | Fri | 28.02.2025 | 13:15 | ELA 2 | Exercise 1 | [Languages, Automata and Lexers](info/exercises/ex-01.pdf) | +| | 12.... | Fri | 28.02.2025 | 15:15 | ELA 2 | Lab 2 | Lexer lab release | ## Schedule and Materials - Current | Week | Labs | Day | Date | Time | Room | Topic | Materials | | | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | -| 2 | 1.... | Wed | 26.02.2025 | 13:15 | BC 01 | Lecture 3 | [First Symbols. Constructing a Lexer](https://mediaspace.epfl.ch/media/03-01%2C+First+Symbols.+Constructing+a+Lexer/0_a943fw0n) [(PDF)](info/lectures/lec03a.pdf), [From Regular Expressions to Automata](https://mediaspace.epfl.ch/media/03-02%2C+From+Regular+Expressions+to+Automata/0_icjqhfj0) [(PDF)](info/lectures/lec03b.pdf) | -| | 1..... | Fri | 28.02.2025 | 13:15 | ELA 2 | Exercise 1 | [Languages, Automata and Lexers](info/exercises/ex-01.pdf) | -| | 12.... | Fri | 28.02.2025 | 15:15 | ELA 2 | Lab 2 | Lexer lab release | -| 3 | 12.... | Wed | 05.03.2025 | 13:15 | BC 01 | Lecture 4 | Grammars and Trees | -| | 12.... | Fri | 07.03.2025 | 13:15 | ELA 2 | Exercises 2 | Grammars (not LL(1)) | +| 3 | 12.... | Wed | 05.03.2025 | 13:15 | BC 01 | Lecture 4 | [Introduction to Grammars](https://mediaspace.epfl.ch/media/04-01%2C+Introduction+to+Grammars/0_krhjbo09) [(PDF)](info/lectures/lec04-grammars-intro.pdf), [Syntax Trees](https://mediaspace.epfl.ch/media/04-02%2C+Syntax+Trees/0_9h4g5k1c) [(PDF)](info/lectures/lec04-trees.pdf) +| | 12.... | Fri | 07.03.2025 | 13:15 | ELA 2 | Exercises 2 | Grammar Concepts | | | 123... | Fri | 07.03.2025 | 15:15 | ELA 2 | Lab 3 | Parser lab release | -| 4 | .23... 
| Wed | 12.03.2025 | 13:15 | BC 01 | Lecture 5 | LL(1) Parsing | +| 4 | .23... | Wed | 12.03.2025 | 13:15 | BC 01 | Lecture 5 | [LL(1) Parsing](https://mediaspace.epfl.ch/media/04-03%2C+LL%281%29+Parsing/0_se2zd8kt) | | | .23... | Fri | 14.03.2025 | 13:15 | ELA 2 | Lecture 6 | Name Analysis. Operational Semantics | | | .23... | Fri | 14.03.2025 | 15:15 | ELA 2 | Lab 3 | Parser lab | -| 5 | ..3... | Wed | 19.03.2025 | 13:15 | BC 01 | Exercises 3 | Grammars | +| 5 | ..3... | Wed | 19.03.2025 | 13:15 | BC 01 | Exercises 3 | LL(1) Grammars | | | ..3... | Fri | 21.03.2025 | 13:15 | ELA 2 | Lecture 7 | Type Checking | | | ..34.. | Fri | 21.03.2025 | 15:15 | ELA 2 | Lab 4 | Typer lab release | | 6 | ..34.. | Wed | 26.03.2025 | 13:15 | BC 01 | Exercises 4 | Parsing. Type checking | diff --git a/info/books.md b/info/books.md index ce7f263181f4eb646d6e09d99c86805ee8e4e765..2af5f41d087c3b62a4d534fab96541916d41370e 100644 --- a/info/books.md +++ b/info/books.md @@ -2,7 +2,8 @@ The following books contain overlapping material with some recommendations for most relevant parts: - * [Modern compiler implementation in ML](http://library.epfl.ch/en/beast?isbn=9781107266391). Read Sections 2.1-2.4 for Lexical analysis, Sections 3.1-3.2 for parsing, and 5.3-5.4 as well as 16.1-16.3 for type checking - * [Discrete Mathematics and Its Applications by Kenneth H. Rosen (8th edition)](https://epfl.swisscovery.slsp.ch/discovery/fulldisplay?docid=alma99116968862405516&context=L&vid=41SLSP_EPF:prod&lang=en&search_scope=MyInst_and_CI&adaptor=Local%20Search%20Engine&tab=41SLSP_EPF_MyInst_and_CI&query=any,contains,Discrete%20Mathematics%20and%20Its%20Applications&sortby=date_d&facet=frbrgroupid,include,9018235242682604086&offset=0), available in the library and you may have it already. Useful backround is in sections 1.7, 1.8, 2.1, 2.2, 5.1, 5.3, 9.1, 9.2, 13.1, 13.3, 13.4 + * [Modern compiler implementation in ML](http://library.epfl.ch/en/beast?isbn=9781107266391). 
Read Sections 2.1-2.4 for Lexical analysis, Sections 3.1-3.2 for parsing, and 5.3-5.4 as well as 16.1-16.3 for type checking + * [Introduction to automata theory, languages, and computation](https://epfl.swisscovery.slsp.ch/permalink/41SLSP_EPF/1g1fbol/alma990053993040205516) (3rd Ed.) by Hopcroft, Motwani, Ullman, 2007. The first seven chapters are an excellent way to gain an in-depth understanding of finite automata and context-free grammars. If you need to pick, read sections 2.2, 2.3, 2.5, 3.1, 3.2.3, 4.1, 5.1, 5.2 + * [Discrete Mathematics and Its Applications by Kenneth H. Rosen (8th edition)](https://epfl.swisscovery.slsp.ch/discovery/fulldisplay?docid=alma99116968862405516&context=L&vid=41SLSP_EPF:prod&lang=en&search_scope=MyInst_and_CI&adaptor=Local%20Search%20Engine&tab=41SLSP_EPF_MyInst_and_CI&query=any,contains,Discrete%20Mathematics%20and%20Its%20Applications&sortby=date_d&facet=frbrgroupid,include,9018235242682604086&offset=0), available in the library and you may have it already. Useful background is in sections 1.7, 1.8, 2.1, 2.2, 5.1, 5.3, 9.1, 9.2, 13.1, 13.3, 13.4 * [Basics of Compiler Design](http://hjemmesider.diku.dk/~torbenm/Basics/). Online! Read pages 9-88 (omit Section 2.8) for lexical analysis and parsing * [Compilers, principle, techniques and tools](http://library.epfl.ch/en/beast?isbn=9781292024349) diff --git a/info/exercises/Makefile b/info/exercises/Makefile index 47f52641ac6fe074292f38a574019220aa4e2a2f..f36de7314ef655dab4176376e2938509d607812f 100644 --- a/info/exercises/Makefile +++ b/info/exercises/Makefile @@ -7,20 +7,22 @@ DIRS := $(wildcard src/ex-??) 
EXPDFS := $(patsubst src/ex-%,ex-%.pdf,$(DIRS)) SOLPDFS := $(patsubst src/ex-%,ex-%-sol.pdf,$(DIRS)) +TEXARGS := -shell-escape -interaction=batchmode + all: $(EXPDFS) $(SOLPDFS) ex-%.pdf: src/ex-%/main.tex cd src/ex-$* && \ - lualatex -jobname=ex-$* "\def\ANSWERS{0}\input{main.tex}" && \ + lualatex $(TEXARGS) -jobname=ex-$* "\def\ANSWERS{0}\input{main.tex}" && \ cp ex-$*.pdf $(OUT_DIR)/ex-$*.pdf ex-%-sol.pdf: src/ex-%/main.tex cd src/ex-$* && \ - lualatex -jobname=ex-$*-sol "\def\ANSWERS{1}\input{main.tex}" && \ + lualatex $(TEXARGS) -jobname=ex-$*-sol "\def\ANSWERS{1}\input{main.tex}" && \ cp ex-$*-sol.pdf $(OUT_DIR)/ex-$*-sol.pdf clean: rm -f $(EXPDFS) $(SOLPDFS) for d in $(DIRS); do \ - cd $$d && rm -f *.aux *.log *.out main.pdf; \ + pushd $$d && rm -f *.aux *.log *.out main.pdf; popd; \ done diff --git a/info/exercises/ex-01-sol.pdf b/info/exercises/ex-01-sol.pdf index e272c57ea28219fcdf5bd2b32e2dcde0057d408f..f44876473069f634e8481a8ecaf348073f1d790c 100644 Binary files a/info/exercises/ex-01-sol.pdf and b/info/exercises/ex-01-sol.pdf differ diff --git a/info/exercises/ex-01.pdf b/info/exercises/ex-01.pdf index 66648029c286b7711553bb60c056a1192a59fa97..bea9f40a05dfe49f34e1e0d7852628c1e738789d 100644 Binary files a/info/exercises/ex-01.pdf and b/info/exercises/ex-01.pdf differ diff --git a/info/exercises/ex-02-sol.pdf b/info/exercises/ex-02-sol.pdf new file mode 100644 index 0000000000000000000000000000000000000000..47bb3a8cece402275ad716d843f80b826cb19d13 Binary files /dev/null and b/info/exercises/ex-02-sol.pdf differ diff --git a/info/exercises/ex-02.pdf b/info/exercises/ex-02.pdf new file mode 100644 index 0000000000000000000000000000000000000000..4af57628d82238c937075f44b6b6cfac0eb42b4b Binary files /dev/null and b/info/exercises/ex-02.pdf differ diff --git a/info/exercises/src/ex-01/ex/languages.tex b/info/exercises/src/ex-01/ex/languages.tex index ff0a1fa65a890930f337cb503cac878ec1fb20d2..7cdbda25d4a1be1194601a68fb66adeaae59878b 100644 --- 
a/info/exercises/src/ex-01/ex/languages.tex +++ b/info/exercises/src/ex-01/ex/languages.tex @@ -134,7 +134,7 @@ such that \(w_1 = a\), \(w_2 = a\), and \(w_{i + 2} = v_i\) for \(1 \le i \le m\). Since \(m < |v|\) and \(|v| = |w| - 2\), \(m + 2 < |w|\). QED. - \item \(w = aab\): by the same argument as the previous case, \(v\) has a decomposition + \item \(w = abv\): by the same argument as the previous case, \(v\) has a decomposition into words in \(\{a, ab\}\), \(v = v_1\ldots v_m\) for some \(m < |v|\) and \(v_i \in \{a, ab\}\). diff --git a/info/exercises/src/ex-01/ex/lexer.tex b/info/exercises/src/ex-01/ex/lexer.tex index 935d80edf48120ad2199dab881d509fdf05a56dc..72217ae30f5f0b89b74257f7aeae67f85d7ae435 100644 --- a/info/exercises/src/ex-01/ex/lexer.tex +++ b/info/exercises/src/ex-01/ex/lexer.tex @@ -63,7 +63,7 @@ digits \(\{0 - 9\}\). \node[state,accepting] (ql_3) [right=of ql_2] {$q_{let}$}; % \node[state] (qin_1) [right=of q_0] {$q_{i1}$}; - \node[state] (qin_2) [right=of qin_1] {$q_{in}$}; + \node[state,accepting] (qin_2) [right=of qin_1] {$q_{in}$}; % \node[state] (qite_1) [below right=of q_0] {$q_{i2}$}; \node[state] (qite_2) [right=of qite_1] {$q_t$}; @@ -128,8 +128,8 @@ lexer drops any \texttt{skip} tokens. 
\item \texttt{[keyword("let"), id("x"), equal("="), number("5"), keyword("in"), id("x"), op("+"), number("3")]} \item \texttt{[keyword("let"), number("5"), id("x2")]} \item \texttt{[id("xin")]} - \item \texttt{[comp("=="), op(">")]} - \item \texttt{[comp("<="), comp("=="), op(">"), comp("<="), equal("=")]} + \item \texttt{[comp("=="), comp(">")]} + \item \texttt{[comp("<="), comp("=="), comp(">"), comp("<="), equal("=")]} \end{enumerate} \end{solution} diff --git a/info/exercises/src/ex-02/ex/cfg.tex b/info/exercises/src/ex-02/ex/cfg.tex new file mode 100644 index 0000000000000000000000000000000000000000..f200e3f7e51439c9567ca6204de544ee50ddc22b --- /dev/null +++ b/info/exercises/src/ex-02/ex/cfg.tex @@ -0,0 +1,343 @@ + +\begin{exercise}{} + + For each of the following languages, give a context-free grammar that + generates it: + + \begin{enumerate} + \item \(L_1 = \{a^nb^m \mid n, m \in \naturals \land n \geq 0 \land m \geq n\}\) + \item \(L_2 = \{a^nb^mc^{n+m} \mid n, m \in \naturals\}\) + \item \(L_3 = \{w \in \{a, b\}^* \mid \exists m \in \naturals.\; |w| = 2m + + 1 \land w_{(m+1)} = a \}\) (\(w\) is of odd length, has \(a\) in the middle) + \end{enumerate} + + \begin{solution} + \begin{enumerate} + \item \(L_1 = \{a^nb^m \mid n, m \in \naturals \land n \geq 0 \land m \geq n\}\) + \begin{align*} + S &::= aSb \mid B\\ + B &::= bB \mid \epsilon + \end{align*} + \item \(L_2 = \{a^nb^mc^{n+m} \mid n, m \in \naturals\}\) + \begin{align*} + S &::= aSc \mid B\\ + B &::= bBc \mid \epsilon + \end{align*} + + A small tweak to \(L_1\)'s grammar allows us to keep track of addition + precisely here. Could we do something similar for \(\{a^nb^nc^n \mid n \in + \naturals\}\)? 
(open-ended discussion) + + \item \(L_3 = \{w \in \{a, b\}^* \mid \exists m \in \naturals.\; |w| = 2m + + 1 \land w_{(m+1)} = a \}\) + \begin{align*} + S &::= aSb \mid bSa \mid aSa \mid bSb \mid a + \end{align*} + + Note that after each recursive step, the length of the inner string has + the same parity (i.e. odd). + \end{enumerate} + \end{solution} + +\end{exercise} + +\begin{exercise}{} + + Consider the following context-free grammar \(G\): + + \begin{align*} + A &::= -A \\ + A &::= A - \textit{id} \\ + A &::= \textit{id} \\ + \end{align*} + + \begin{enumerate} + \item Show that \(G\) is ambiguous, i.e., there is a string that has two + different possible parse trees with respect to \(G\). + \item Make two different unambiguous grammars recognizing the same words, + \(G_p\), where prefix-minus binds more tightly, and \(G_i\), where + infix-minus binds more tightly. + \item Show the parse trees for the string you produced in (1) with respect + to \(G_p\) and \(G_i\). + \item Produce a regular expression that recognizes the same language as + \(G\). + \end{enumerate} + + \begin{solution} + \begin{enumerate} + \item An example string is \(- \textit{id} - \textit{id}\). It can be + parsed as either \((- \textit{id}) - \textit{id}\) or + \(-(\textit{id} - \textit{id})\). The corresponding parse trees are: + + \begin{center} + \begin{forest} + [\(A\) + [\(A\) + [\(-\)] + [\(A\) [\(\textit{id}\)]] + ] + [\(-\)] + [\(\textit{id}\)] + ] + \end{forest} + \hspace{10ex} + \begin{forest} + [\(A\) + [\(-\)] + [\(A\) + [\(A\) + [\(\textit{id}\)] + ] + [\(-\)] + [\(\textit{id}\)] + ] + ] + \end{forest} + \end{center} + + Left: prefix binds tighter, right: infix binds tighter. 
+ + \item \(G_p\): + \begin{align*} + A &::= B \mid A - \textit{id} \\ + B &::= -B \mid \textit{id} + \end{align*} + + \(G_i\): + \begin{align*} + A &::= C \mid -A \\ + C &::= \textit{id} \mid C - \textit{id} + \end{align*} + + \item Parse trees for \(- \textit{id} - \textit{id}\) with respect to \(G_p\) (left) + and \(G_i\) (right): + + \begin{center} + \begin{forest} + [\(A\) + [\(A\) + [\(B\) + [\(-\)] + [\(B\) + [\(\textit{id}\)] + ] + ] + ] + [\(-\)] + [\(\textit{id}\)] + ] + \end{forest} + \hspace{10ex} + \begin{forest} + [\(A\) + [\(-\)] + [\(A\) + [\(C\) + [\(C\) [\(\textit{id}\)]] + [\(-\)] + [\(\textit{id}\)] + ] + ] + ] + \end{forest} + \end{center} + + \item \(L(G) = L(-^*\textit{id} (-\textit{id})^*)\). Note: \(()\) are part + of the regular expression syntax, not parentheses in the string. + + \end{enumerate} + \end{solution} + +\end{exercise} + + +\begin{exercise}{} + + Consider the following two grammars \(G_1\) and \(G_2\): + + \begin{align*} + G_1: & \\ + S &::= S(S)S \mid \epsilon \\ + G_2: & \\ + R &::= RR \mid (R) \mid \epsilon + \end{align*} + + \noindent + Prove that: + \begin{enumerate} + \item \(L(G_1) \subseteq L(G_2)\), by showing that for every parse tree in + \(G_1\), there exists a parse tree yielding the same word in \(G_2\). + \item (Bonus) \(L(G_2) \subseteq L(G_1)\), by showing that there exist + equivalent parse trees or derivations. + \end{enumerate} + + \begin{solution} + + \begin{enumerate} + \item \(L(G_1) \subseteq L(G_2)\). + + We give a recursive transformation of parse trees in \(G_1\) producing + parse trees in \(G_2\). + + \begin{enumerate} + \item \textbf{Base case:} The smallest parse tree is the \(\epsilon\) + production, which can be transformed as (left to right): + \begin{center} + \begin{forest} + [\(S\) + [\(\epsilon\)] + ] + \end{forest} + \hspace{8ex} + \begin{forest} + [\(R\) + [\(\epsilon\)] + ] + \end{forest} + \end{center} + \item \textbf{Recursive case:} Rule \(S ::= S(S)S\). 
The parse tree transformation is: + \begin{center} + \begin{forest} + [\(S\) + [\(S_1\)] + [\((_2\)] + [\(S_3\)] + [\()_4\)] + [\(S_5\)] + ] + \end{forest} + \hspace{10ex} + \begin{forest} + [\(R\) + [\(R_1\)] + [\(R\) + [\(R\) + [\((_2\)] + [\(R_3\)] + [\()_4\)] + ] + [\(R_5\)] + ] + ] + \end{forest} + \end{center} + + The nodes are numbered to check that the order of children (left to + right) does not change. This ensures that the word yielded by the tree + is the same. The transformation is applied recursively to the children + \(S_1, S_3, S_5\) to obtain \(R_1, R_3, R_5\). + + Verify that the tree on the right is indeed a parse tree in \(G_2\). + \end{enumerate} + + \item \(L(G_2) \subseteq L(G_1)\). + + Straightforward induction on parse trees does not work easily. The rule + \(R ::= RR\) in \(G_2\) is not directly expressible in \(G_1\) by a simple + transformation of parse trees. However, we can note that, in fact, adding + this rule to \(G_1\) does not change the language! + + Consider the grammar \(G_1'\) defined by \(S ::= SS \mid S(S)S \mid + \epsilon\). We must show that for every two words \(v\) and \(w\) in + \(L(G_1)\), \(vw\) is in \(L(G_1)\), and so adding the rule \(S ::= SS\) + does not change the language. + + We use strong induction on the length \(|v| + |w|\). + + \begin{enumerate} + \item \textbf{Base case:} \(|v| + |w| = 0\). \(v = w = vw = \epsilon \in + L(G_1)\). QED. + \item \textbf{Inductive case:} \(|v| + |w| = n + 1\). The induction + hypothesis is that for every \(v', w' \in L(G_1)\) with \(|v'| + |w'| \leq n\), + \(v'w' \in L(G_1)\). + + From the grammar, we know that either \(v = \epsilon\) or \(v = x(y)z\) + for \(x, y, z \in L(G_1)\). If \(v = \epsilon\), then \(w = vw \in + L(G_1)\). In the second case, \(vw = x(y)zw\). However, \(zw \in L(G_1)\) + by the inductive hypothesis, as \(|z| + |w| \leq |v| - 2 + |w| < n\). + + Thus, \(vw = x(y)z'\) for \(z' = zw \in L(G_1)\). 
Finally, since \(x, y, z' + \in L(G_1)\), it follows from the grammar rules that \(vw = x(y)z' \in + L(G_1)\). 
+ \end{enumerate} + + Thus, \(L(G_1) = L(G_1')\). It can now be shown just as in the first part, + that \(L(G_2) \subseteq L(G_1')\). + \end{enumerate} + + \end{solution} + +\end{exercise} + +\begin{exercise}{} + + Consider a context-free grammar \(G = (A, N, S, R)\). Define the reversed + grammar \(rev(G) = (A, N, S, rev(R))\), where \(rev(R)\) is the set of rules + produced from \(R\) by reversing the right-hand side of each rule, i.e., + for each rule \(n ::= p_1 \ldots p_n\) in \(R\), there is a rule \(n ::= + p_n \ldots p_1\) in \(rev(R)\), and vice versa. The terminals, + non-terminals, and start symbol of the grammar remain the same. + + For example, \(S ::= abS \mid \epsilon\) becomes \(S ::= Sba \mid \epsilon\). + + Is it the case that for every context-free grammar \(G\) defining a language + \(L\), the language defined by \(rev(G)\) is the same as the language of + reversed strings of \(L\), \(rev(L) = \{rev(w) \mid w \in L\}\)? Give a proof + or a counterexample. + + \begin{solution} + + Consider any word \(w\) in the original language. Looking at the definition + of a language \(L(G)\) defined by a grammar \(G\): + \begin{equation*} + w \in L(G) \iff \exists T.\; w = yield(T) \land isParseTree(G, T) + \end{equation*} + + There must exist a parse tree \(T\) for \(w\) with respect to \(G\). We must + show that there exists a parse tree for \(rev(w)\) with respect to the + reversed grammar \(G_r = rev(G)\) as well. + + We propose that this is precisely the tree \(T_r = mirror(T)\). Thus, we + need to show that \(rev(w) = yield(T_r)\) and that \(isParseTree(G_r, + T_r)\). + + \begin{enumerate} + \item \(rev(w) = yield(T_r)\): \(yield(\cdot)\) of a tree is the word + obtained by reading its leaves from left to right. Thus, the yield of the + mirror of a tree \(yield(mirror(\cdot))\) is the word obtained by reading + the leaves of the original tree from right to left. 
Thus, \(yield(T_r) = + yield(mirror(T)) = rev(yield(T)) = rev(w)\). 
+ + \item \(isParseTree(G_r, T_r)\): We need to show that \(T_r\) is a parse + tree with respect to \(G_r\). Consider the definition of a parse tree: + \begin{enumerate} + \item The root of \(T_r\) is the start symbol of \(G_r\): the root of + \(T_r = mirror(T)\) is the same as that of \(T\). Since \(T\)'s root + node must be the start symbol of \(G\), it is also the root symbol of + \(T_r\). \(G\) and \(G_r\) share the same start symbol in our + transformation. + \item The leaves are labelled by the elements of \(A\): the mirror + transformation does not alter the set or the label of leaves, only their + order. This property transfers from \(T\) to \(T_r\) as well. + \item Each non-leaf node is labelled by a non-terminal symbol: the + mirror transformation does not alter the label of non-leaf nodes either, + so this property transfers from \(T\) to \(T_r\) as well. + \item If a non-leaf node has children that are labelled \(p_1, \ldots, + p_n\) left-to-right, then there is a rule \((n ::= p_1 \ldots p_n)\) in + the grammar: consider any non-leaf node in \(T_r\), labelled \(n\), with + children labelled left-to-right \(p_1, \ldots, p_n\). By the definition + of \(mirror\), the original tree \(T\) must have the same node labelled + \(n\), with the reversed list of children left-to-right, \(p_n, \ldots, + p_1\). Since \(T\) is a parse tree for \(G\), \(n ::= p_n \ldots p_1\) + is a valid rule in \(G\), and by the reverse transformation, \(n ::= p_1 + \ldots p_n\) must be a rule in \(G_r\). Thus, the property is satisfied. + \end{enumerate} + \end{enumerate} + + Thus, both properties are satisfied. Therefore, the language defined by the + reversed grammar is the reversed language of the original grammar. 
+ + \end{solution} + +\end{exercise} + diff --git a/info/exercises/src/ex-02/ex/pumping.tex b/info/exercises/src/ex-02/ex/pumping.tex new file mode 100644 index 0000000000000000000000000000000000000000..51bdf0636b18404a3ea8cefbad882213d984d036 --- /dev/null +++ b/info/exercises/src/ex-02/ex/pumping.tex @@ -0,0 +1,42 @@ + +\begin{exercise}{} + + Recall the pumping lemma for regular languages: + + For any language \(L \subseteq \Sigma^*\), if \(L\) is regular, there exists a + strictly positive constant \(p \in \naturals\) such that every word \(w \in + L\) with \(|w| \geq p\) can be written as \(w = xyz\) such that: + + \begin{itemize} + \item \(x, y, z \in \Sigma^*\) + \item \(|y| > 0\) + \item \(|xy| \leq p\), and + \item \(\forall i \in \naturals.\; xy^iz \in L\) + \end{itemize} + + Consider the language \(L = \{w \in \{a\}^* \mid |w| \text{ is prime}\}\). + Show that \(L\) is not regular by using the pumping lemma. + + \begin{solution} + \(L = \{w \in \{a\}^* \mid |w| \text{ is prime}\}\) is not a regular + language. + + To the contrary, assume it is regular, and so there exists a constant + \(p\) such that the pumping conditions hold for this language. + + Consider the word \(w = a^{n} \in L\), for some prime \(n \geq p\). By the + pumping lemma, we can write \(w = xyz\) such that \(|y| > 0\), \(|xy| \leq + p\), and \(xy^iz \in L\) for all \(i \geq 0\). + + Let \(|xz| = m\) and \(|y| = k\) for some natural numbers \(m\) and \(k\), + so that \(k > 0\) and \(m + k = n\). Thus, \(|xy^iz| = m + ik\) for all \(i\), + and by the pumping lemma the length \(m + ik\) is prime for every \(i\). + However, for \(i = n + 1\) we get \(m + (n + 1)k = (m + k) + nk = n(1 + k)\). + Since \(n \geq 2\) and \(1 + k \geq 2\), this length is not prime, and we + have a contradiction. + + Thus, this language is not regular. 
+ + \end{solution} + +\end{exercise} diff --git a/info/exercises/src/ex-02/main.tex b/info/exercises/src/ex-02/main.tex new file mode 100644 index 0000000000000000000000000000000000000000..ad9d334d324b410c8b7457dc9955f63eb29a978b --- /dev/null +++ b/info/exercises/src/ex-02/main.tex @@ -0,0 +1,22 @@ +\documentclass[a4paper]{article} + +\input{../macro} + +\ifdefined\ANSWERS + \if\ANSWERS1 + \printanswers + \fi +\fi + +\title{CS 320 \\ Computer Language Processing\\Exercises: Week 3} +\author{} +\date{March 7, 2025} + +\begin{document} +\maketitle + + \input{ex/pumping} + + \input{ex/cfg} + +\end{document} diff --git a/info/lectures/lec04-grammars-intro.pdf b/info/lectures/lec04-grammars-intro.pdf new file mode 100644 index 0000000000000000000000000000000000000000..6159571856f582bd15f1e33e37a644ef114307dc Binary files /dev/null and b/info/lectures/lec04-grammars-intro.pdf differ diff --git a/info/lectures/lec04-trees.pdf b/info/lectures/lec04-trees.pdf new file mode 100644 index 0000000000000000000000000000000000000000..cb2d12d1e52d9d7717af58c8f3388446e185e5a1 Binary files /dev/null and b/info/lectures/lec04-trees.pdf differ
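A possible sanity check for part 4 of the second grammar exercise in `info/exercises/src/ex-02/ex/cfg.tex`, which states that \(L(G) = L(-^*\textit{id}(-\textit{id})^*)\). The Python sketch below is not part of the patch. It assumes the token `id` is abbreviated to the single character `i`, and it only compares the two languages up to a bounded word length, so it can catch mistakes but proves nothing. The names `grammar_words`, `regex_words`, and `MAX_LEN` are illustrative.

```python
import re
from itertools import product


def grammar_words(max_len):
    """Words of L(G) up to max_len, with 'i' standing for the token id.

    G has the rules  A ::= -A | A - id | id,  so L(G) is the smallest set
    that contains "i" and is closed under prefixing "-" and appending "-i".
    """
    words = {"i"}                          # A ::= id
    frontier = ["i"]
    while frontier:
        w = frontier.pop()
        for nxt in ("-" + w, w + "-i"):    # A ::= -A   and   A ::= A - id
            if len(nxt) <= max_len and nxt not in words:
                words.add(nxt)
                frontier.append(nxt)
    return words


def regex_words(max_len):
    """All words over {'-', 'i'} up to max_len that match -*i(-i)*."""
    pattern = re.compile(r"-*i(-i)*")
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product("-i", repeat=n)
        if pattern.fullmatch("".join(chars))
    }


MAX_LEN = 8  # assumed bound; larger values only make the check slower
assert grammar_words(MAX_LEN) == regex_words(MAX_LEN)
```

Both rules of \(G\) only lengthen a word, so pruning at `max_len` during the closure does not lose any shorter word.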