%% Cell type:code id:0b9af790 tags:
``` python
# Initialize Otter
import otter
grader = otter.Notebook("Projections and signal restoration.ipynb")
```
%% Cell type:markdown id:4133723e-952b-4b1c-bb3d-a7f78a957024 tags:
# Matrix Analysis 2025 - EE312
## Week 3 - Signal restoration using projections
[LTS2](https://lts2.epfl.ch)
%% Cell type:markdown id:fb42154f-8832-4960-9fee-3372b473ef69 tags:
Let us consider a signal with $N$ elements, i.e. a vector in $\mathbb{R}^N$.
Under our observation conditions, we can only partially recover the signal's values; the others remain unknown. That is, we know that
$x[k] = x_k$ for $k\in E = \{e_0, e_1, \dots,e_{M-1}\}$, with $E\in\mathbb{N}^M$ and $e_i\neq e_j\ \forall i\neq j$ (i.e. each known index is unique).
We also assume that the signal is contained within the **lower frequencies of the spectrum**. This can be expressed using the (normalized) Fourier matrix $\hat{W}$ you constructed last week.
In this notebook, we will try to reconstruct the signal by successively projecting its observation onto the Fourier subspace defined above, then back onto its original domain (with the constraint regarding its values), then onto the Fourier domain again, etc. This is a simplified version of a more general method called ["Projection onto convex sets" (POCS)](https://en.wikipedia.org/wiki/Projections_onto_convex_sets). The method is illustrated by the figure below (of course you do not project onto lines but onto spaces of larger dimension).
![POCS](images/pocs.png "POCS")
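As a generic illustration of the alternating-projection idea (not the notebook's solution), here is a minimal sketch in which `p1` and `p2` are hypothetical projectors onto two affine subspaces of $\mathbb{R}^3$:

``` python
import numpy as np

def pocs(x0, projections, niter=50):
    # Alternately apply each projection until (hopefully) convergence
    x = x0
    for _ in range(niter):
        for proj in projections:
            x = proj(x)
    return x

p1 = lambda x: np.array([1.0, x[1], x[2]])                             # enforce x[0] = 1
p2 = lambda x: np.array([x[0], (x[1] + x[2]) / 2, (x[1] + x[2]) / 2])  # enforce x[1] = x[2]
x = pocs(np.array([5.0, 2.0, -4.0]), [p1, p2])
# x converges to a point in the intersection of the two sets
```

In this toy case the intersection is the set $\{(1, t, t)\}$, and the iteration lands on it after a few steps.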
### 1. Low-pass filter
Let us first create example signals to validate our implementation of the filtering operation.
%% Cell type:code id:d238cbc9-055f-4d55-a6ae-29d83881e2f5 tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
N = 100
k = np.arange(0, N)
w1 = 3
w2 = 7
w3 = 12
# generate simple signals
x1 = np.cos(2*w1*np.pi*k/N) + np.sin(2*w2*np.pi*k/N)
x2 = np.sin(2*w1*np.pi*k/N) + np.cos(2*w3*np.pi*k/N)
```
%% Cell type:code id:59a703e8-f351-4a02-8278-12560a67b023 tags:
``` python
# visualize the base signals
plt.plot(x1)
plt.plot(x2, 'r')
```
%% Cell type:markdown id:09249ca9-e441-4537-9bc5-018be67bc099 tags:
1. Implement a function that returns the normalized Fourier matrix of size N (use your implementation from week 2)
%% Cell type:code id:f886d8a6-4cb7-4187-ae4b-4615a06caa2d tags:otter_answer_cell
``` python
def fourier_matrix(N):
...
```
%% Cell type:code id:f367f4e1 tags:
``` python
grader.check("q1")
```
%% Cell type:code id:62fae5a8-f3c4-477b-a53f-a0106730b842 tags:
``` python
W_hat = fourier_matrix(N)
```
%% Cell type:markdown id:1e0a2337-446d-49d3-8521-55d07a1539bc tags:
Let $X = \hat{W}x$, $X$ is the *discrete Fourier transform* of the input signal $x$. The frequency condition can then be rewritten as
$X[k] = 0$ for $w_c < k\leq N-w_c$.
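For intuition, here is a sketch of this frequency condition using `numpy.fft` instead of the matrix $\hat{W}$ you will build; the parameter values are illustrative only:

``` python
import numpy as np

N, w_c = 16, 3
x = np.cos(2 * np.pi * 2 * np.arange(N) / N)  # signal at frequency 2 < w_c
X = np.fft.fft(x, norm="ortho")               # normalized DFT
mask = np.zeros(N)
mask[: w_c + 1] = 1                           # keep k <= w_c (low positive frequencies)
mask[N - w_c:] = 1                            # keep k > N - w_c (low negative frequencies)
x_f = np.fft.ifft(mask * X, norm="ortho")
# x only contains frequency 2, so the filter leaves it (numerically) unchanged
```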
2. Implement the function that returns a $N\times N$ matrix $P$, s.t. $PX$ satisfies the above condition for a given $w_c$. Make sure the input values are valid, if not raise a `ValueError` exception.
%% Cell type:code id:aeed4067-1130-4e6c-9fbd-ab8d0a3a41c0 tags:otter_answer_cell
``` python
def lowpass_matrix(N, w_c):
"""
Computes the P matrix that will remove high-frequency coefficients in a DFT transformed signal
Parameters
----------
N : length of the input signal
w_c : cutoff frequency
Returns
-------
The P matrix
"""
...
```
%% Cell type:code id:38529611 tags:
``` python
grader.check("q2")
```
%% Cell type:markdown id:fef68750-d97f-4625-bd5e-5f13e7894f7d tags:
3. Now let us try the filtering on the test signals and make sure it behaves appropriately. Using the matrix $P$ defined above, choose the parameter $w_c$ appropriately s.t. the filtered signals retain $w_1$ and $w_2$ but discard $w_3$.
%% Cell type:code id:bc1370d5-c7b2-4598-b573-d57bec56aa1b tags:otter_answer_cell
``` python
w_c = ...
```
%% Cell type:code id:85787e70-1427-4860-8a7b-94d20432bd76 tags:otter_answer_cell
``` python
# Compute the DFT of x1 and x2
X1 = W_hat@x1
X2 = W_hat@x2
# Get the lowpass filter matrix
P = lowpass_matrix(N, w_c)
# Filter X1 and X2 and apply inverse DFT
x1_f = ...
x2_f = ...
```
%% Cell type:markdown id:700eb2e2-3137-4916-a829-9e8e2ba8c6e5 tags:
NB: Make sure the filtered output is **real** (i.e. its imaginary part should be smaller than $10^{-12}$).
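A small sketch of that check on a hypothetical filtered output: verify the imaginary part is numerical noise before discarding it.

``` python
import numpy as np

z = np.array([1.0 + 1e-15j, 2.0 - 3e-16j])  # hypothetical filtered output
assert np.max(np.abs(z.imag)) < 1e-12       # imaginary part is only round-off noise
z_real = np.real(z)                         # safe to keep the real part only
```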
%% Cell type:code id:e960728d-c2c0-4b9d-ab73-3db971f559a1 tags:
``` python
# Plot the filtered signals
# x1_f should still contain both frequencies, x2_f only one
plt.plot(x1_f)
plt.plot(x2_f, 'r')
```
%% Cell type:code id:b6f5b443 tags:
``` python
grader.check("q3")
```
%% Cell type:markdown id:c34b8fc7-8a51-4fe5-a6ab-1cc27da2cd1e tags:
<!-- BEGIN QUESTION -->
4. Prove that $P$ is a projection.
%% Cell type:markdown id:0cd9f0a9 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:7f17f3ae-6a39-4e86-a29c-879c96b4e5fb tags:
<!-- END QUESTION -->
### 2. Signal extension
In order to express the condition on $x[k]$ as a pure matrix operation, we need an extension of the input signal and a slightly different Fourier transform matrix designed to handle those extended signals properly.
Let us denote by $x_E$ the vector from $\mathbb{R}^M$ containing the known values of $x$, i.e. the $x_k$ for $k\in E$.
For each vector $y$ in $\mathbb{R}^N$ we define its extension as $\tilde{y} = \begin{pmatrix}y \\ x_E\end{pmatrix}$.
**This notation will remain throughout the notebook**, i.e. a vector with a tilde denotes its extension made by adding the $x_E$ values at the end.
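On a toy example (illustrative values), building the extension amounts to concatenating the known values at the end:

``` python
import numpy as np

N = 5
x = np.arange(N, dtype=float)        # example signal
E = [1, 3]                           # indices of the known values
x_E = x[E]                           # known values x_k, k in E
x_tilde = np.concatenate([x, x_E])   # extended vector in R^(N+M)
```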
%% Cell type:markdown id:f9434f5f-ceb9-42bf-9b5e-bd7d7a94dae8 tags:
5. Let us define the matrix $B_E$ and $y=B_E x$, s.t. $y[k] = 0$ if $k\in E$ and $y[k] = x[k]$ otherwise. Write a function that returns its extended version $\tilde{B_E}$ s.t. $\tilde{y}=\tilde{B_E}\tilde{x}$, for any $x\in\mathbb{R}^N$.
- $\tilde{B_E}$ is a square matrix of size $N+M$.
- Check the validity of parameters and raise a `ValueError` in case of invalid inputs.
%% Cell type:code id:dae1c4aa-c6ce-4f35-b309-aa6dd09e9abf tags:otter_answer_cell
``` python
def Bt_E(N, E):
"""
Computes the $\tilde{B}_E$ matrix
Parameters
----------
N : length of the input signal
E : list of suitable indices. e.g. for N=5 a valid E could be [1, 3]
Returns
-------
The $\tilde{B}_E$ matrix
"""
...
```
%% Cell type:code id:751ff499 tags:
``` python
grader.check("q5")
```
%% Cell type:markdown id:853c1a3b-9ed9-461c-86f9-12992e07337d tags:
Let us now design $C_E$ as an operator from $\mathbb{R}^{N+M} \rightarrow \mathbb{R}^{N+M}$ such that, when applied to any extended vector $\tilde{z}$ s.t. $\tilde{z}[k] = 0, \forall k\in E$ (for instance the output of $\tilde{B}_E$), it generates a vector $\tilde{z}_R$ s.t.:
$\tilde{z}_R[k] = \begin{cases} x_k & \text{if } k\in E \\ \tilde{z}[k] & \text{otherwise} \end{cases}$
6. Write a function that generates $C_E$.
%% Cell type:code id:932ae153-f336-4b0f-b7c0-774c02ccaad1 tags:otter_answer_cell
``` python
def C_E(N, E):
"""
Computes the $C_E$ matrix
Parameters
----------
N : length of the input signal
E : list of suitable indices. e.g. for N=5 a valid E could be [1, 3]
Returns
-------
The $C_E$ matrix
"""
...
```
%% Cell type:code id:473d2fb2 tags:
``` python
grader.check("q6")
```
%% Cell type:markdown id:0e57be3f-1c44-445a-86ad-07e20fbcd1d2 tags:
<!-- BEGIN QUESTION -->
7. What is the effect of $D_E = C_E \tilde{B}_E$ on an extended signal $\tilde{x}$ ?
%% Cell type:markdown id:786cd4ed tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:3b2caf03-41cb-41f9-93eb-778047b9b944 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
8. Verify (numerically on an example) that $D_E$ is a projector. You can use the function `numpy.testing.assert_array_almost_equal` to check that arrays are almost equal (with a good precision)
%% Cell type:code id:94dbc7f8-c2fc-4337-9828-dd2e3311830d tags:otter_answer_cell
``` python
# Set some parameters
N=5
E=[1,3]
# Compute D_E
D_E = ...
# Now check that D_E is a projector
...
```
%% Cell type:markdown id:7fb30993-5350-4e9b-aa45-b657b450a3a1 tags:
<!-- END QUESTION -->
### 3. Extended signals in the Fourier domain
Let us now define an operator that works almost like the normalized Fourier transform, except that it is applied to extended signals and preserves the $x_E$ part. This can easily be done using the following block matrix:
$\tilde{W} = \begin{pmatrix}\hat{W} & 0 \\0 & I_M\end{pmatrix}$.
Using this definition, multiplying an extended signal $\tilde{x}$ by $\tilde{W}$ will result in a vector containing the Fourier transform of $x$ followed by $x_E$, preserving the known initial values.
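A block matrix of this shape can be assembled with `numpy.block`; the sketch below uses a placeholder 2×2 orthonormal matrix in place of $\hat{W}$:

``` python
import numpy as np

W_hat = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # placeholder orthonormal matrix
M = 3
W_tilde = np.block([
    [W_hat,            np.zeros((2, M))],
    [np.zeros((M, 2)), np.eye(M)],
])
# W_tilde is orthonormal whenever W_hat is
```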
%% Cell type:markdown id:3157e22e-f7d7-44b8-a4fd-9ddec1d7aa8a tags:
<!-- BEGIN QUESTION -->
9. Prove that $\tilde{W}$ is orthonormal.
%% Cell type:markdown id:7abf655c tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:f140c720-eaed-4fd2-8b9f-6b99a1a5b0ff tags:
<!-- END QUESTION -->
10. Write a function that returns $\tilde{W}$.
%% Cell type:code id:b744a698-40d5-41a6-9623-76a19f50bcd2 tags:otter_answer_cell
``` python
def Wt_E(N, E):
"""
Computes the $\tilde{W}_E$ matrix
Parameters
----------
N : length of the input signal
E : list of suitable indices. e.g. for N=5 a valid E could be [1, 3]
Returns
-------
The $\tilde{W}_E$ matrix
"""
...
```
%% Cell type:code id:60244622 tags:
``` python
grader.check("q10")
```
%% Cell type:markdown id:1afee45d-78e6-498b-905b-cbc97eeaf2d5 tags:
11. Recalling the low-pass matrix $P$ defined previously, build $\tilde{P}$ s.t. when applied to $\tilde{W}\tilde{x}$ it results in a vector containing the filtered values (still in the Fourier domain) followed by $x_E$.
%% Cell type:code id:4576bfbc-a249-4d89-919d-47763c8a60f9 tags:otter_answer_cell
``` python
def Pt_E(N, M, w_c):
...
```
%% Cell type:markdown id:781d7752-74ae-4ec2-8ea2-4279b54ee5e7 tags:
<!-- BEGIN QUESTION -->
12. A signal $\tilde{x}$ will be filtered by doing $\tilde{W}^H \tilde{P}\tilde{W}\tilde{x}$.
Prove that this is a projection.
%% Cell type:markdown id:d38cefa0 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:1c8c4469-7264-4968-b700-897be635a193 tags:
<!-- END QUESTION -->
### 4. Signal restoration
%% Cell type:markdown id:ac24806d-105d-4fbf-bc98-788dbcc7c0ba tags:
<!-- BEGIN QUESTION -->
13. We can now use everything defined above to implement a function that performs one iteration. Taking the extension of $x$ defined above as input, it does the following:
- compute the filtered version of the signal using $\tilde{W}^H \tilde{P}\tilde{W}$ (i.e. project onto the space of "smooth signals")
- restore the known values in the signal using $D_E = C_E\tilde{B}_E$ (i.e. project back onto the space of signals having the known initial values we have set)
Make sure to force the result to real values by using `np.real` before returning.
%% Cell type:code id:155e3277-bcef-44ed-b53f-a1b2e5ad7363 tags:otter_answer_cell
``` python
def restoration_iter(Wt, Pt, Dt, xt):
"""
Performs a full restoration iteration by
- projecting the signal into Fourier, performing low-pass filtering followed by inverse DFT
- restoring the known initial values into the filtered signal
Parameters
----------
Wt : \tilde{W} matrix
Pt : \tilde{P} matrix
Dt : \tilde{D} matrix
xt : \tilde{x} vector
Returns
-------
The new $\tilde{x} vector after the iteration
"""
...
```
%% Cell type:markdown id:758aa9ce-1ebc-49e1-b2f6-ca8e1035b31c tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
14. Finally, we are ready to apply all of this to an example.
%% Cell type:code id:b467649e-2a46-40f2-bc98-73c354fb0484 tags:
``` python
# Setup an example
N = 100
w_c = 10 # cut-off
w1 = 3
w2 = 7
E = np.array([0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95])
# E = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90]) # try with less known points
M = len(E)
x = np.cos(2*w1*np.pi*np.arange(0, N)/N) + np.sin(2*w2*np.pi*np.arange(0,N)/N) # original signal
# Create a signal that is only noise, except at known initial values
y = np.random.rand(N) #
y[E] = x[E] # restore known values
xe = x[E]
```
%% Cell type:code id:510b6f8f-b1c2-48b7-af00-eee8e55b5868 tags:
``` python
plt.plot(y) # plot the "acquired" signal
plt.plot(x, 'r+-', alpha=0.5) # plot the ground truth signal
```
%% Cell type:markdown id:b71f1ff5-73d6-4bc1-ba06-1452c8bf8adb tags:
Generate $\tilde{y}$ (this only needs to be done once)
%% Cell type:code id:4e0943bf-0f35-4837-b6db-1615ba466ae5 tags:
``` python
yt = np.append(y, xe)
```
%% Cell type:markdown id:c25239d9-3015-4a33-b0c0-0c906ee4129e tags:
Run a number of iterations of the method and plot the result:
%% Cell type:code id:6aaf1a71-6c2a-4f43-a51e-82b40cc6d6d0 tags:
``` python
def signal_restore(Wt, Pt, Dt, yt, niter=20):
yr0 = yt # initialize
for k in range(niter):
yr1 = restoration_iter(Wt, Pt, Dt, yr0)
yr0 = yr1
return yr1
```
%% Cell type:code id:951418c2-ee12-4e8d-8f58-cba29d26dd7e tags:
``` python
Wt = Wt_E(N, E)
Dt = C_E(N, E)@Bt_E(N, E)
Pt = Pt_E(N, M, w_c)
yr = signal_restore(Wt, Pt, Dt, yt)
```
%% Cell type:code id:43b3d939-eeb2-4171-9293-0076f1c06c83 tags:
``` python
plt.plot(yr[:N]) # plot reconstructed signal
plt.plot(x, 'r+', alpha=0.7) # plot original for comparison
```
%% Cell type:markdown id:bf876405-b60c-4f7c-9162-56d58cafd19f tags:
Depending on $M$, you will find that the method can either reconstruct the signal perfectly or not.
%% Cell type:markdown id:ccf83efe tags:
<!-- END QUESTION -->
%% Cell type:code id:faef2260 tags:
``` python
# Initialize Otter
import otter
grader = otter.Notebook("Deblurring.ipynb")
```
%% Cell type:markdown id:b8b107c5-c56c-4e5f-8cee-b970f8e10f8f tags:
# Matrix Analysis 2025 - EE312
## Week 4 - Image deblurring using projections
[LTS2](https://lts2.epfl.ch)
### Objectives
In this week's exercises you will need to use the properties of the four fundamental subspaces, as well as the projection operator on those subspaces. You will apply those to deblur an image.
## I. Exercises
%% Cell type:markdown id:3acf015e-a0ae-48a6-925f-7ef5e5b18c01 tags:
<!-- BEGIN QUESTION -->
### 1.
Let $A=\begin{pmatrix}1 & 1 & 1 &0\\1 & 2 & 2 & 1\\2 & 5 & 5 & 3\end{pmatrix}$
Compute a basis for each of its four fundamental subspaces.
%% Cell type:markdown id:c5974723 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:5504f856-3edd-436f-b14f-4e660a7faf7c tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
### 2.
Let $A \in \mathbb{R}^{m \times n}$ and suppose it is onto. Prove that $AA^T$ is nonsingular.
%% Cell type:markdown id:658270cd tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:44bf7882-513e-4c29-9619-96bd357670b0 tags:
<!-- END QUESTION -->
## II. Image deblurring
### Introduction
Since we will need to visualize images, just a brief reminder on how to use matplotlib for this task
%% Cell type:code id:a05d6e95-711e-41ce-bf8a-5b983d8999fd tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
import skimage
import skimage.io
import os
```
%% Cell type:markdown id:f387c338-7c82-46cf-9950-def341903021 tags:
We will use the `imread` function from the `scikit-image` package (pre-installed on Noto; make sure to install it if you are using a local environment)
%% Cell type:code id:97017b6f-fc95-4d94-b1f6-f7da6621444f tags:
``` python
filename = os.path.join(skimage.data_dir, 'camera.png')
camera = skimage.io.imread(filename)
```
%% Cell type:markdown id:6b4b81f7-6867-43b7-9fd5-9478d421db86 tags:
Some basic information regarding the contents of `camera`:
%% Cell type:code id:03e539bd-2d68-4240-9d72-ee4ef02b9081 tags:
``` python
camera.shape
```
%% Cell type:code id:3b723f81-ebf9-45e5-83a9-6e3d7048586b tags:
``` python
camera.dtype
```
%% Cell type:markdown id:f2c5f73f-bfee-47e7-b68d-d364f6478663 tags:
The type `uint8` means each value in `camera` is represented using a single byte, and is within $[0, 255]$.
Displaying the image is fairly straightforward:
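Since later steps multiply the image by matrices, it is often safer to first convert from `uint8` to floats; a small sketch on a hypothetical 1×3 image:

``` python
import numpy as np

img = np.array([[0, 128, 255]], dtype=np.uint8)  # hypothetical tiny image
img_f = img.astype(np.float64) / 255.0           # float values in [0, 1], no uint8 overflow
```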
%% Cell type:code id:04ce2927-48e4-41d6-961f-f1a5f0791cc3 tags:
``` python
plt.imshow(camera, cmap='gray')
```
%% Cell type:markdown id:18174a60-7b66-4ccf-a5f4-43ec56299c0e tags:
The image here is widely used in many image processing scientific publications.
The `cmap` parameter specifies the colormap used to display the image. Since Matplotlib's default colormap is not grayscale, passing `cmap='gray'` is needed. You can find [several predefined colormaps](https://matplotlib.org/stable/tutorials/colors/colormaps.html) in Matplotlib.
You can also check the images distributed with `scikit-image` on their [github repository](https://github.com/scikit-image/scikit-image/tree/main/skimage/data). Feel free to use another image, or even one of your own.
%% Cell type:markdown id:30dbc6b0-be1f-43b6-a575-ffdedac904b8 tags:
## 1. Blurring operator
In order to efficiently deblur an image, let us design our own blurring operator. Since it will be entirely known, it will be possible to compute its inverse.
Let us consider an input vector $x=\begin{pmatrix}x_0 \\ x_1 \\ ... \\ x_{N-1}\end{pmatrix} \in \mathbb{R}^N$.
Our blurring operator $B$ will do two things:
- replace each value $x_k$ by the average of $2p+1$ values surrounding $x_k$, i.e. between $x_{k-p}$ and $x_{k+p}$.
- compute this average not for every $x_k$ but only every $s$ values, with $s$ being a subsampling factor, $s>0$.
Formally, if we denote by $y$ the output of the blurring operator on $x$, this means that
- $y\in\mathbb{R}^M$, with $M=\frac{N}{s}$ (NOTE: we will only consider the case where $N$ is a multiple of $s$)
- $y_m = \frac{1}{2p+1}\sum_{k=sm-p}^{sm+p}x_k$
As you may have noticed, our summation indices in the definition of $y_m$ can be negative in some cases and bigger than $N-1$ in others. In those cases the missing value of $x_k$ is replaced by 0. This way of handling borders is often referred to as **zero-padding**, as it can be achieved by creating an extended version of $x$ with a number of leading and trailing zeros. There are other ways of handling borders, e.g. by using $x_0$ for negative indices and $x_{N-1}$ for indices greater than $N-1$, or even by mirroring the input vector.
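The operation described above can be sketched directly (matrix-free) with explicit zero-padding; such a reference implementation can be useful to test a matrix version against. The function name and values are illustrative:

``` python
import numpy as np

def blur_zero_pad(x, p, s):
    # Zero-pad the borders, then average 2p+1 neighbours around every s-th sample
    xp = np.concatenate([np.zeros(p), x, np.zeros(p)])
    return np.array([xp[s * m: s * m + 2 * p + 1].mean() for m in range(len(x) // s)])

y = blur_zero_pad(np.ones(6), p=1, s=2)
# the first output is 2/3 because one of its three neighbours is padding
```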
%% Cell type:markdown id:63fecde4-7ee7-4692-b21e-dfc33e4a4de7 tags:
1. Write a function that creates the matrix $B$ that performs the operations described above.
Remember that in python the `//` operator performs the integer division.
**!!Warning!!** The automated tests are here to help you; however, they might be incomplete and do not fully guarantee that your implementation is correct even if they all pass!
%% Cell type:code id:e40d4cba-1187-448d-96fb-d0033c129aaf tags:otter_answer_cell
``` python
def blur_matrix(N, p, s):
"""
Computes the blurring matrix
Parameters
----------
N : length of the input signal
p : half-length of the blurring filter
s: subsampling factor
Returns
-------
The blurring matrix
"""
...
```
%% Cell type:code id:73ed3a03 tags:
``` python
grader.check("q2p1")
```
%% Cell type:markdown id:af393d26-6c86-4ff0-b8fb-2f78fae94c57 tags:
<!-- BEGIN QUESTION -->
2. What is the rank of $B$ ?
Hint:
- row-rank might be easier to consider
- you might also need to add constraints on $s$ and $p$
%% Cell type:markdown id:d4849ba8 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:67659a9b-5206-40da-a8ab-0f190884d901 tags:
<!-- END QUESTION -->
## 2. Blurring the image
We now have our blurring matrix ready, let us use it to blur the `camera` image.
As $B$ is designed to operate on vectors, we will need two steps to blur the image
- apply $B$ on each column of the input image
- apply $B$ on each row of the column-blurred image computed above
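These two steps can be written as matrix products: applying $B$ to every column is `B @ img`, and applying it to every row of the result is `(B @ result.T).T`, i.e. `result @ B.T`. A shape-only sketch with a toy 2×4 selection matrix standing in for $B$:

``` python
import numpy as np

B = np.eye(4)[::2]           # toy 2x4 matrix, for shape illustration only
img = np.arange(16.0).reshape(4, 4)
cols = B @ img               # B applied to each column -> shape (2, 4)
both = cols @ B.T            # B applied to each row    -> shape (2, 2)
```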
%% Cell type:markdown id:596b49e7-438a-42c6-b5e6-5ea1a7f338f2 tags:
3. Implement a function that blurs an image using the matrix defined in the previous question
%% Cell type:code id:0a8714b7-1dfa-4aa9-b09a-a5a9caa317bc tags:otter_answer_cell
``` python
def blur_image(img, B):
"""
Blurs the input image using the blur matrix
Parameters
----------
img : input image
B : blur matrix
Returns
-------
The blurred image
"""
...
```
%% Cell type:code id:6498302c tags:
``` python
grader.check("q2p3")
```
%% Cell type:code id:c167c207-66da-41a7-b536-5fc4577420e3 tags:
``` python
# build the blur matrix
B = blur_matrix(512, 3, 2)
```
%% Cell type:code id:42231cfd-b223-4f90-97c5-dec703aee5af tags:
``` python
# Blur on rows
camera_blurred = blur_image(camera, B)
```
%% Cell type:code id:78aa4a09-ff28-4e21-83f9-2b14c6d6ad83 tags:
``` python
plt.imshow(camera_blurred, cmap='gray') # check the result
```
%% Cell type:code id:492b6b43-abb8-49a1-a8cd-4a34c9b1a4eb tags:
``` python
# Display blurred and original side by side
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15, 7), dpi=80, sharey=True, sharex=True)
ax[1].imshow(camera_blurred, cmap='gray')
ax[0].imshow(camera, cmap='gray')
```
%% Cell type:markdown id:1d64733c-35eb-440b-a572-60bbdfd446e5 tags:
## 3. Deblurring
Now that everything is set up, we can proceed with the image restoration.
%% Cell type:markdown id:c962b694-45b2-4c2b-b7f5-ee69ffd8e342 tags:
<!-- BEGIN QUESTION -->
4. Let us denote the result of a blur operation by $y=Bx$. Using the properties of the four fundamental subspaces, write $x^*$ as a sum, making use of projection operators on those subspaces. Hint: you might need results from question 2.
%% Cell type:markdown id:c35a1228 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:8f024ccc-04b6-4437-aa6a-daba627ed4d4 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
5. Using this result, find a possible solution for the complete blurring operation (on both rows and columns).
%% Cell type:markdown id:ab1c7611 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:cf20cfff-0c77-4367-8b4e-92f41c41acad tags:
<!-- END QUESTION -->
6. Implement a function that computes the right inverse of the blur matrix, and uses it to perform image restoration. Do not forget to consider both row and column blurring.
You can use `np.linalg.inv` to compute the inverse of a matrix.
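As a reminder (on a small hypothetical matrix, not the blur matrix itself), a full row-rank matrix $B$ admits the right inverse $B^T(BB^T)^{-1}$:

``` python
import numpy as np

B = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])          # full row rank, 2x3
B_right = B.T @ np.linalg.inv(B @ B.T)   # right inverse: B @ B_right = I
```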
%% Cell type:code id:32426413-a8ea-46ac-ae3e-968f480ff150 tags:otter_answer_cell
``` python
def inverse_blur(blurred_img, B):
...
```
%% Cell type:code id:f82aa4d6 tags:
``` python
grader.check("q2p6")
```
%% Cell type:markdown id:812c7443-3492-46d2-ad99-b2c83826163b tags:
Using this function, compute the reconstructed image from the blurred one
%% Cell type:code id:f270a921-f7c3-40fd-a969-d37010ad2954 tags:
``` python
camera_unblur = inverse_blur(camera_blurred, B)
```
%% Cell type:code id:75296a85-053c-407c-915f-b562ba9c8acd tags:
``` python
plt.imshow(camera_unblur, cmap='gray') # check the result
```
%% Cell type:code id:0dfd3731-b40a-4425-b78d-e0b65590c9f3 tags:
``` python
# Let us compare the original image, the restored image and an upscaled version of the blurred image
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(15, 7), dpi=80, sharey=True, sharex=True)
ax[1].imshow(camera_unblur, cmap='gray')
ax[0].imshow(camera, cmap='gray')
ax[2].imshow(skimage.transform.resize(camera_blurred, (512,512)), cmap='gray')
```
%% Cell type:markdown id:ca9e1169-772a-402a-be66-8fd166826e54 tags:
<!-- BEGIN QUESTION -->
7. Evaluate the performance of the restoration method. What are its limitations for real-world usage? Check that the computed solution is valid by blurring the restored image.
%% Cell type:markdown id:2d3adde9 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:code id:3658f662-3832-4098-84f4-5d618b95fd02 tags:otter_answer_cell
``` python
...
```
%% Cell type:code id:54f26333 tags:
``` python
grader.check("q2p6")
```
%% Cell type:markdown id:650659df tags:
<!-- END QUESTION -->
%% Cell type:markdown id:0bd3e058 tags:
## Submission
Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**
%% Cell type:code id:d10563fe tags:
``` python
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False, run_tests=True)
```
%% Cell type:markdown id:de374d0d tags:
%% Cell type:code id:044b8034 tags:
``` python
# Initialize Otter
import otter
grader = otter.Notebook("Linear systems - Poisson equation.ipynb")
```
%% Cell type:markdown id:d133bce9-d159-4b1d-a25f-69bed8da493a tags:
# Matrix Analysis 2025 - EE312
## Week 5 - Discrete Poisson equation
[LTS2](https://lts2.epfl.ch)
### Objectives
Apply your knowledge about linear systems to Poisson equation resolution.
%% Cell type:markdown id:654ad4ee-c060-4bc3-b218-3a6709952750 tags:
## Poisson equation
Let $u,v \in \mathbb{R}^n$ represent two physical quantities $f, g: \mathbb{R} \rightarrow \mathbb{R}$ sampled at $n$ equi-spaced locations $x_i$, i.e. $u_i = f(x_i)$, $v_i = g(x_i)$.
Let us assume that the underlying continuous object satisfies the Poisson equation: $\frac{d^2}{dx^2} f (x)= g(x)$ with constraints $f(x_j) = y_j$ for a subset of $m$ locations $j \in \{i_1, \ldots i_m \}$.
We will assume that **all the $i_k$ are distinct**.
This equation governs a number of physical principles, e.g. gravity, fluid mechanics and electrostatics. In the latter we have $\Delta \varphi = -\frac{\rho}{\varepsilon}$, where $\Delta$ is the Laplacian operator $(\frac{d^2}{dx^2} + \frac{d^2}{dy^2} + \frac{d^2}{dz^2})$, $\varphi$ is the electric potential, $\rho$ the density of electric charges and $\varepsilon$ the electric permittivity.
---
%% Cell type:markdown id:06ac5cfa-62ef-4477-a957-e240ddeb85d8 tags:
### 1D case
%% Cell type:code id:7f3cb74d-72f0-460b-a978-a47f87de8557 tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
```
%% Cell type:markdown id:9fd89d0b-50df-4711-a32b-ec1026e6b122 tags:
<!-- BEGIN QUESTION -->
1. Write down a matrix equation for the discrete version of $\frac{d^2}{dx^2} f (x)= g(x)$, using the finite-difference approximation of the second derivative $\frac{d^2}{dx^2} f(x_k) \approx f(x_{k+1}) - 2f(x_k) + f(x_{k-1})$. For the boundary conditions we will make the assumption that $f(x_{-1})=f(x_0)$ and $f(x_n)=f(x_{n-1})$.
What is the rank of the corresponding matrix $D$?
%% Cell type:markdown id:d1f8bd46 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:aed76fe2-8cb5-4b28-bd3b-fa0bea113a46 tags:
<!-- END QUESTION -->
%% Cell type:markdown id:cceac118-c43d-4782-9f7e-4392e5ecc451 tags:
2. Implement a function that creates the $D$ matrix (also called Laplacian). The [diag](https://numpy.org/doc/stable/reference/generated/numpy.diag.html) function in numpy might be useful.
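As a reminder of how the `k` parameter of `np.diag` places values on off-diagonals (shown on a generic 3×3 tridiagonal matrix, not the answer itself):

``` python
import numpy as np

# -2 on the main diagonal, 1 just above (k=1) and just below (k=-1) it
A = -2 * np.diag([1, 1, 1]) + np.diag([1, 1], k=1) + np.diag([1, 1], k=-1)
```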
%% Cell type:code id:d41105d8-6933-4ef8-b2f0-c6ac053e1f55 tags:otter_answer_cell
``` python
def lapl_matrix(N):
...
```
%% Cell type:code id:756d9370 tags:
``` python
grader.check("q2")
```
%% Cell type:markdown id:4cea41dc-6f12-42f2-8644-b51e4918504d tags:
<!-- BEGIN QUESTION -->
3. Write down a matrix equation for the discrete version of $f(x_j) = y_j$. What is the rank of the corresponding matrix $B$ ?
%% Cell type:markdown id:fc67bfcb tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:418bc83f-7d46-41c2-8a7e-df892857320c tags:
<!-- END QUESTION -->
4. Implement a function that creates matrix $B$
%% Cell type:code id:1d294eda-7edc-42c2-9640-13bec779bd48 tags:otter_answer_cell
``` python
def b_matrix(N, idx_list): # idx_list is a list of valid indices, e.g. N=5, idx_list=[1,3]
...
```
%% Cell type:code id:d592616d tags:
``` python
grader.check("q4")
```
%% Cell type:markdown id:b77f215e-e4d0-4976-9ed6-7ee210c689a6 tags:
<!-- BEGIN QUESTION -->
3. Write down a matrix equation for the full problem (Hint: Consider the matrix $C=\begin{pmatrix}D\\B\end{pmatrix}$). Discuss the rank of the matrix and deduce a way to numerically solve the linear system. Implement a function that returns matrix $C$.
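Stacking two operators of compatible widths into one system can be done with `np.vstack`; a shape-only sketch with toy stand-ins:

``` python
import numpy as np

D = np.zeros((3, 4))    # stands in for the equation block
B = np.eye(4)[:2]       # stands in for the constraint block
C = np.vstack([D, B])   # one combined system matrix
```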
%% Cell type:markdown id:70f5637a tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:code id:2eaf5a8c-b83a-4f53-9c4e-6192bf4e6fa4 tags:otter_answer_cell
``` python
def c_matrix(D, B):
...
```
%% Cell type:code id:d77116d4 tags:
``` python
grader.check("q4")
```
%% Cell type:markdown id:10a3d136-f5f3-4c1b-9c0a-ece58f891dfe tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
5. What explicit formula can you use to compute the pseudo-inverse of $C$ (justify)?
%% Cell type:markdown id:ce8aa19a tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:3726eb03-0c82-4aed-b179-77ff947583c3 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
6. Implement a function that returns the solution of the problem. You can either use the formula above (you may then use [linalg.inv](https://numpy.org/doc/stable/reference/generated/numpy.linalg.inv.html)) or compute the pseudo-inverse using [linalg.pinv](https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html).
%% Cell type:code id:79f8c6d1-7b92-48b6-bb90-e888aaa41243 tags:otter_answer_cell
``` python
# v is a vector of size N
# u is a vector of size len(idx_list)
def solve_poisson(N, v, idx_list, u):
...
```
%% Cell type:markdown id:ff6184e1-e308-47d1-8b72-ab4559b48adb tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
7. Let us now consider the solutions of the particular case $g(x) = 0, \forall x\in\mathbb{R}$. What are the analytical solutions of the Poisson equation in this case?
%% Cell type:markdown id:53e63141 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:4a33db39-077d-4775-a55d-ae5bd7400b23 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
Let us verify that our implementation of the solution works with a small example. Fill the values for $u$ s.t. they belong to the solution space.
%% Cell type:code id:96381701-e737-41c1-8642-fa1a5c13924c tags:otter_answer_cell
``` python
N = 50
v = np.zeros(N)
idx_list = [10, 20, 30, 40]
u = ...
sol = solve_poisson(N, v, idx_list, u)
plt.plot(sol) # plot the result
```
%% Cell type:markdown id:04e6fc55-6c8a-4dfa-8ca0-9b45ed2f157c tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
9. Do the results conform to what we expected? What happens near the boundaries?
%% Cell type:markdown id:e1c9c95f tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:ecf46b87-accd-4043-8ab6-06835830699b tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
10. Let us use a step function for $f$ and try to solve the system
%% Cell type:code id:d81b4501-b2a5-4412-a266-d04ffa8dd2c7 tags:
``` python
idx_list = [10, 20, 30, 40]
u2 = [1, 1, -1, -1]
sol2 = solve_poisson(N, v, idx_list, u2)
plt.scatter(idx_list, u2, marker='+', color='r')
plt.plot(sol2)
```
%% Cell type:markdown id:4918d31c-40a2-4df3-88ec-d18a80e879e7 tags:
What do you observe and what is the reason for that ?
%% Cell type:markdown id:1482c041 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:b520b2c6-7c84-4c3e-be16-8eb866648a97 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
11. Let us now consider the application of the Poisson equation to electrostatics. Knowing a distribution of electric charges, we can solve it to compute the electric potential. The distribution of electric charges will be very simple, consisting of two point charges.
%% Cell type:code id:96932aa1-74aa-4f4e-bbf5-df0f0d85c47c tags:
``` python
N = 50
v3 = np.zeros(N)
v3[10] = 1
v3[40] = 1
plt.plot(v3)
```
%% Cell type:markdown id:42861911-53e1-4cb3-8dc7-aa617c01a7e8 tags:
- What are the analytical solutions for this problem ? (remember that the discrete derivative of a step function is a discrete Dirac delta, i.e. 0 everywhere except at one point where it is 1).
- Plot the analytical solution
- Compute and plot the numerical solution
- What do you observe ?
%% Cell type:markdown id:1b19e6d2 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:code id:f42fc23b-c412-4791-b5b6-bef54fa3bab3 tags:otter_answer_cell
``` python
...
```
%% Cell type:code id:f2f9bd66-067e-4590-8088-87b7969621bb tags:otter_answer_cell
``` python
# Plot analytical solution
...
```
%% Cell type:code id:22eae7ab-4c91-4000-b123-7105cc4c023a tags:otter_answer_cell
``` python
idx_list = [10, 20, 30, 40]
u3 = [1, 1, 1, 1]
sol3 = solve_poisson(N, v3, idx_list, u3)
plt.scatter(idx_list, u3, marker='+', color='r')
plt.plot(sol3)
```
%% Cell type:code id:b0310f39-9bbf-4518-be7d-af0e3f6724a7 tags:otter_answer_cell
``` python
# check the 2nd derivative of the solution computed
...
```
%% Cell type:markdown id:c0f0755f tags:
<!-- END QUESTION -->
%% Cell type:code id:e3f2b37e tags:
``` python
# Initialize Otter
import otter
grader = otter.Notebook("Recursive least squares.ipynb")
```
%% Cell type:markdown id:eb88859c-9077-43c0-868b-e5f45289ff06 tags:
# Matrix Analysis 2025 - EE312
## Week 6 - Recursive least squares
[N. Aspert](https://people.epfl.ch/nicolas.aspert) - [LTS2](https://lts2.epfl.ch)
## Important
- The unit tests are here to help you, however they might pass even if your implementation is faulty (and conversely fail even if your implementation is correct), due to platform/environment variability. Feel free to ignore them (after carefully reviewing your answers/results)
## Objective
The goal of this notebook is to find the parameters of a system using least squares, and more specifically use a recursive formulation to update those parameters with a reduced amount of computations.
%% Cell type:code id:f4ab0d05-aa5a-4dc7-b2b5-b04ad6dd426e tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
```
%% Cell type:markdown id:d1a159ae-f3e6-4629-bf5b-6d93af1b0b68 tags:
## Part I - Least-square parameter estimation
In this notebook, we will estimate physical parameters of a moving system. Its $(x,y)$ coordinates are sampled at regular time intervals, but are noisy (e.g. due to measurement artefacts). You can load them from a file using the code below.
%% Cell type:code id:68883575-5b43-4ae7-8e23-13cdab9b2fa4 tags:
``` python
data = np.load('data.npz')
```
%% Cell type:markdown id:b82107e6-9019-4598-9e9a-9b123fba4e58 tags:
Once loaded, the `data` variable is a dictionary containing 3 arrays:
- `time` for the time values corresponding to each sample
- `tr1` for the first system trajectory
- `tr2` for the second system trajectory
The trajectories are stored as complex numbers so that a single value represents both the $x$ and $y$ components. In the first part of the problem we will only use the data from `tr1`.
%% Cell type:code id:141016a9-b347-4c7b-b76e-13af8ade65e3 tags:
``` python
plt.scatter(np.real(data['tr1']), np.imag(data['tr1']), marker='+')
plt.xlabel('x')
plt.ylabel('y')
```
%% Cell type:markdown id:575ce09b-fcbd-4442-a8e7-dc2d5e587226 tags:
<!-- BEGIN QUESTION -->
Let us now build the model of this system. In this part we will assume that the system is defined by 3 parameters:
- its initial position $p_0$
- its initial velocity $v_0$
- its acceleration $a$ which is constant
Those parameters will be represented as complex values (in order to take into account their $x$ and $y$ components)
1. Express the position of the system $p(t)$ as a function of $t$ and of the 3 above parameters.
%% Cell type:markdown id:2857a8f4 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:5e93d1f2-82a7-4e1b-b543-d89dd38c1c69 tags:
<!-- END QUESTION -->
Since we have the whole trajectory of the system and given the relatively small number of samples, we can now estimate directly the 3 parameters using a least-square approximation.
2. Using the previous question, write a function that generates the matrix $A$ s.t.
$$
A\begin{pmatrix}p_0 \\ v_0 \\ a\end{pmatrix} = \begin{pmatrix}p(t_0) \\ p(t_1) \\ \vdots \\ p(t_{N-1}) \end{pmatrix}
$$
%% Cell type:code id:fbeb5b6b-77ee-4b03-8871-81be21c5cf01 tags:otter_answer_cell
``` python
def computeA(t):
"""
Computes the A matrix
Parameters
----------
t : vector containing values of time samples
Returns
-------
The A matrix
"""
...
```
%% Cell type:code id:f62efb3c tags:
``` python
grader.check("q2")
```
%% Cell type:markdown id:9903f3fe-bd77-4e83-880f-5fdddbd450a6 tags:
3. Estimate the parameters of the system using a least square approximation
%% Cell type:code id:a8623196-a2cd-47ef-b2c6-30804172868d tags:otter_answer_cell
``` python
def solve_lsq(t, pos):
"""
Solve the system Ax = pos, where x is the vector (p_0, v_0, a) and pos contains the position values at time t
Parameters
----------
t : vector containing values of time samples
pos: vector containing the positions of the system corresponding to the values of t (complex)
Returns
-------
The vector x containing the approximation of the system's parameters in the least-square sense.
"""
...
```
%% Cell type:code id:8182465f tags:
``` python
grader.check("q3")
```
%% Cell type:markdown id:f7edef23-f5d4-4c59-b4c8-ef57ad9d6b8b tags:
<!-- BEGIN QUESTION -->
4. Display the measured trajectory `tr1` and the one reconstructed from the parameters you estimated. If your implementation is correct, they should match. Also plot the Euclidean distance between the measured and reconstructed points for each timestep.
%% Cell type:markdown id:c1867853 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:code id:18e792a6-6104-422d-a7b3-c7b53a56b32b tags:otter_answer_cell
``` python
# Plot the observed and predicted data
...
```
%% Cell type:code id:fa59fa59-f036-4906-87d3-52e558e7685c tags:otter_answer_cell
``` python
# Plot the distance between each point
...
```
%% Cell type:markdown id:832f6e06-46c1-4ac9-a01f-6acb138560df tags:
<!-- END QUESTION -->
## Part II - Recursive least-square estimation
The problem addressed in the first part was fairly easy: we had all measurements available at once (which is fine in an "offline" setting), and computing the least-square solution did not require inverting a huge matrix. In a more realistic setting, the measurements would arrive gradually, and it would be nice to update our parameters without recomputing the whole solution at each step. This is where recursive least squares come into play.
%% Cell type:markdown id:f51903c3-e84d-4352-a4ea-e568bdc3aead tags:
<!-- BEGIN QUESTION -->
We first need to prove a classical result. Let us consider four matrices $A,B,C$ and $D$ such that:
- $A$ is an invertible $N\times N$ matrix
- $B$ is an $N\times M$ matrix
- $C$ is an $M\times N$ matrix
- $D$ is an invertible $M\times M$ matrix
5. Prove that $(A + BDC)^{-1} = A^{-1} - A^{-1}B(D^{-1}+CA^{-1}B)^{-1}CA^{-1}$
%% Cell type:markdown id:be98a834 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:6768f75a-e59d-42f1-99b7-9c16150ec2a0 tags:
<!-- END QUESTION -->
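%% Cell type:markdown id:woodbury-check-md tags:
Before attempting the proof, the identity can be checked numerically on random matrices (an illustrative sketch; the diagonal boosts are only there to keep everything safely invertible):
%% Cell type:code id:woodbury-check-code tags:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5, 3
A = rng.standard_normal((N, N)) + 10 * np.eye(N)  # diagonal boost keeps A invertible
B = rng.standard_normal((N, M))
C = rng.standard_normal((M, N))
D = rng.standard_normal((M, M)) + 5 * np.eye(M)   # same trick for D

Ainv = np.linalg.inv(A)
Dinv = np.linalg.inv(D)
lhs = np.linalg.inv(A + B @ D @ C)
rhs = Ainv - Ainv @ B @ np.linalg.inv(Dinv + C @ Ainv @ B) @ C @ Ainv
print(np.allclose(lhs, rhs))
```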
<!-- BEGIN QUESTION -->
Let us consider the least-square problem $Ax=y$, and suppose we have already received a small number of measurements $y_0$ (small, but still keeping the system over-determined) and computed $\hat{x}_0$, the least-square solution of $A_0\hat{x}_0=y_0$. Let us assume we have also computed $Q_0=(A_0^TA_0)^{-1}$.
Now another set of measurements $y_1$ arrives, and the system you want to solve looks like
$$
\begin{pmatrix}A_0\\A_1\end{pmatrix}x = \begin{pmatrix}y_0\\y_1\end{pmatrix}
$$
6. Using this block matrix formulation, rewrite the normal equation and express the updated solution $\hat{x}_1$ using $A_0$, $A_1$, $Q_0^{-1}$, $\hat{x}_0$, and $y_1$.
Let us denote by $Q_1 = (A_0^TA_0 + A_1^TA_1)^{-1}$. Express $\hat{x}_1$ as a function of $Q_1$, $A_1$, $y_1$ and $\hat{x}_0$.
%% Cell type:markdown id:c3516e6e tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:58d88136-aa54-4839-af16-672638af3087 tags:
<!-- END QUESTION -->
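%% Cell type:markdown id:block-ne-check-md tags:
A useful cross-check for this kind of derivation (with illustrative random data): the least-square solution of the stacked system must coincide with the one obtained from the combined normal equations, using $Q_1$ as defined above.
%% Cell type:code id:block-ne-check-code tags:

```python
import numpy as np

rng = np.random.default_rng(1)
A0 = rng.standard_normal((8, 3))
A1 = rng.standard_normal((6, 3))
y0 = rng.standard_normal(8)
y1 = rng.standard_normal(6)

# least-square solution of the stacked system
x_full, *_ = np.linalg.lstsq(np.vstack([A0, A1]), np.concatenate([y0, y1]), rcond=None)

# same solution from the combined normal equations
Q1 = np.linalg.inv(A0.T @ A0 + A1.T @ A1)
x_block = Q1 @ (A0.T @ y0 + A1.T @ y1)
print(np.allclose(x_full, x_block))
```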
<!-- BEGIN QUESTION -->
7. Using the result shown in question 5, express $Q_1$ as a function of $Q_0$, $A_1$ and $U=Q_0A_1^T$.
Hints:
- you might consider $Q_1^{-1}$
- what about $Q_0$ symmetry ?
Why is it interesting to update the system's parameters by computing $Q_1$ this way ?
%% Cell type:markdown id:4368082f tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:29107f5e-92d0-4790-b235-8e1362cd8250 tags:
<!-- END QUESTION -->
8. Using the results of questions 6 and 7, implement a function that computes the updated parameters for a new set of measurements, and complete the `recursive_lsq` implementation.
%% Cell type:code id:bab14427-11c5-47b9-99b1-04e99fb92571 tags:otter_answer_cell
``` python
def update_lsq(Q0, xhat0, meas_time, y1):
"""
Compute the updated parameters xhat1 from Q0, y1, xhat0
Parameters
----------
Q0 : matrix Q0
xhat0: previously computed parameters of the system
meas_time: vector containing the times at which samples were acquired
y1: vector containing the new positions of the system corresponding to the values of meas_time
Returns
-------
xhat1 and Q1
"""
...
```
%% Cell type:code id:371b7525-b48b-43b8-a636-180cbd2e2566 tags:otter_answer_cell
``` python
def recursive_lsq(t, pos, batch_size):
"""
Perform a recursive LSQ fit on input data split into batches
Parameters
----------
t: vector containing values of time samples
pos: vector containing the positions of the system corresponding to the values of t (complex)
batch_size: size of each batch of measurements
Returns
-------
array containing model parameters estimated after each batch
"""
if batch_size < 2:
raise ValueError("Invalid batch size")
nsteps = len(t)//batch_size
# initialization step
A0 = computeA(t[:batch_size])
Q0 = np.linalg.inv(A0.T@A0)
xhat0 = ...
xh_cur = xhat0
Q_cur = Q0
x_h = [xhat0] # store the history of parameters in a list
for k in np.arange(1, nsteps):
cur_time = t[k*batch_size:(k+1)*batch_size]
cur_pos = pos[k*batch_size:(k+1)*batch_size]
xhat1, Q1 = ...
x_h.append(xhat1)
xh_cur = xhat1
Q_cur = Q1
return np.array(x_h)
```
%% Cell type:code id:ebdfb081 tags:
``` python
grader.check("q8")
```
%% Cell type:markdown id:a9f907da-74a3-4d1c-9892-8d2b9fa9c69f tags:
<!-- BEGIN QUESTION -->
9. Complete the `predict_trajectory` function and, using the `tr1` data:
- display the predicted trajectory of the system when using recursive least-squares
- display the distance between the predicted and observed trajectory
Make sure to choose a suitable batch size.
%% Cell type:code id:7cbfe794-5aa1-4e12-a13c-3af377dfa828 tags:otter_answer_cell
``` python
def predict_trajectory(x_h, t, pos, batch_size):
pos_pred = np.zeros(pos.shape, dtype='complex')
# Initialization
pos_pred[:batch_size] = pos[:batch_size]
for k in np.arange(1, len(x_h)):
A1 = computeA(t[k*batch_size:(k+1)*batch_size])
pos_pred[k*batch_size:(k + 1)*batch_size] = ...
return pos_pred
```
%% Cell type:code id:97513eaf-ef5a-43c5-86f9-29cbc9bbd4de tags:otter_answer_cell
``` python
batch_size = ...
```
%% Cell type:code id:2cfd4110-2ee6-4e5e-bfdc-e6155745e19e tags:otter_answer_cell
``` python
x_h = ...
```
%% Cell type:code id:5daa05c1-23bf-4f00-a7d4-9849b6ef27f6 tags:otter_answer_cell
``` python
pred_traj = ...
```
%% Cell type:code id:e1961a97-4a16-4d47-8f0e-df2e106a68a6 tags:otter_answer_cell
``` python
# Plot the predicted and observed data
...
```
%% Cell type:code id:11ec5c11-9ee2-49e0-b4d3-507629f2f71d tags:otter_answer_cell
``` python
# Plot the error between prediction and observation
...
```
%% Cell type:markdown id:31ef60cf-1277-401a-b8d3-32b8e50e5797 tags:
<!-- END QUESTION -->
## Part III - Recursive least square with forgetting factor
%% Cell type:markdown id:528dc1bb-b87b-4af9-8247-c718a74fbf7a tags:
<!-- BEGIN QUESTION -->
10. Use recursive least squares on the `tr2` trajectory, whose acceleration varies across time. Display the predicted trajectory and the approximation error. Is that what you expected ?
%% Cell type:markdown id:31abd7d1 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:code id:e39bedb4-7c88-4398-a41b-26c8b815640c tags:otter_answer_cell
``` python
x2_h = ...
```
%% Cell type:code id:341e47da-1516-4a69-a90f-58e227297e10 tags:otter_answer_cell
``` python
pred_traj2 = ...
```
%% Cell type:code id:ebe4b99f-9f32-47b4-9af6-3086608358f4 tags:otter_answer_cell
``` python
# Plot the predicted and observed data
...
```
%% Cell type:code id:d564e0b0-73f5-4315-b5ea-0986822e1388 tags:otter_answer_cell
``` python
# Plot the error between prediction and observation
...
```
%% Cell type:markdown id:70180a33-f723-4ccd-8f09-6c772941865a tags:
<!-- END QUESTION -->
In order to solve this issue, we could build a more complex model. However, under the assumption that the acceleration changes infrequently, we can instead update the recursive least squares in a way that gives more importance to recent measurements than to older ones. To do so, we introduce a **forgetting factor** $\alpha$, with $0<\alpha<1$.
We will use this as follows:
When another set of measurements $y_1$ arrives, the system defined in question 6 is rewritten as
$$
\begin{pmatrix}\alpha A_0\\A_1\end{pmatrix}x = \begin{pmatrix}\alpha y_0\\y_1\end{pmatrix}
$$
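%% Cell type:markdown id:ff-check-md tags:
The role of $\alpha$ can be verified numerically: solving the scaled stacked system in the least-square sense is the same as solving weighted normal equations in which the old block is attenuated by $\alpha^2$ (a sketch with illustrative random data):
%% Cell type:code id:ff-check-code tags:

```python
import numpy as np

rng = np.random.default_rng(2)
A0 = rng.standard_normal((8, 3))
A1 = rng.standard_normal((6, 3))
y0 = rng.standard_normal(8)
y1 = rng.standard_normal(6)
alpha = 0.9

A = np.vstack([alpha * A0, A1])
y = np.concatenate([alpha * y0, y1])
x_stacked, *_ = np.linalg.lstsq(A, y, rcond=None)

# weighted normal equations: the old block enters with weight alpha**2
x_weighted = np.linalg.solve(alpha**2 * A0.T @ A0 + A1.T @ A1,
                             alpha**2 * A0.T @ y0 + A1.T @ y1)
print(np.allclose(x_stacked, x_weighted))
```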
%% Cell type:markdown id:6275538d-4edd-42d1-aa15-312936c217d9 tags:
<!-- BEGIN QUESTION -->
11. Using the new "forgetful" system defined above, update the expressions $Q_1$ and $\hat{x}_1$ derived in questions 6 and 7
Hint:
- You might want to introduce $U_\alpha = \frac{1}{\alpha} Q_0A_1^T$,
%% Cell type:markdown id:4d5f71ee tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:bc7bc015-18a8-4059-a2cb-cc680f199b1e tags:
<!-- END QUESTION -->
12. Using the results found at the previous question, implement recursive least squares with a forgetting factor.
%% Cell type:code id:6476b120-1e67-43b7-80fa-42eff433fb46 tags:otter_answer_cell
``` python
def update_lsq_ff(Q0, xhat0, meas_time, y1, ff):
"""
Compute the updated parameters xhat1 from Q0, y1, xhat0, ff
Parameters
----------
Q0 : matrix Q0
xhat0: previously computed parameters of the system
meas_time: vector containing the times at which samples were acquired
y1: vector containing the new positions of the system corresponding to the values of meas_time
ff: forgetting factor
Returns
-------
xhat1 and Q1
"""
...
```
%% Cell type:code id:70c04c51-59d8-4742-8bba-6c5ebe639c5f tags:otter_answer_cell
``` python
def recursive_lsq_ff(t, pos, batch_size, ff):
"""
Perform a recursive LSQ fit on input data split into batches
Parameters
----------
t: vector containing values of time samples
pos: vector containing the positions of the system corresponding to the values of t (complex)
batch_size: size of each batch of measurements
ff: forgetting factor (in practice, keep it close to 1)
Returns
-------
array containing model parameters estimated after each batch
"""
if batch_size < 2:
raise ValueError("Invalid batch size")
nsteps = len(t)//batch_size
# initialization step
A0 = computeA(t[:batch_size])
Q0 = np.linalg.inv(A0.T@A0)
xhat0 = ...
xh_cur = xhat0
Q_cur = Q0
x_h = [xhat0] # store the history of parameters in a list
for k in np.arange(1, nsteps):
cur_time = t[k*batch_size:(k+1)*batch_size]
cur_pos = pos[k*batch_size:(k+1)*batch_size]
xhat1, Q1 = ...
x_h.append(xhat1)
xh_cur = xhat1
Q_cur = Q1
return np.array(x_h)
```
%% Cell type:code id:cf21410e tags:
``` python
grader.check("q12")
```
%% Cell type:markdown id:fb61e06c-cf61-41c0-88c1-2d49c58870ca tags:
<!-- BEGIN QUESTION -->
13. Use the functions defined in question 12 to predict the `tr2` trajectory. Adjust the batch size and the forgetting factor (hint: keep it close to 1) to improve the results. Display the predicted and observed trajectories, as well as the evolution of the prediction error across time.
%% Cell type:markdown id:d01eed91 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:code id:0de3bc00-2da1-4758-9158-829d3724322a tags:otter_answer_cell
``` python
ff = ...
```
%% Cell type:code id:6adab39d-b558-4b7f-a4d2-9ef44350f312 tags:otter_answer_cell
``` python
batch_size = ...
```
%% Cell type:code id:b421fd4d-d7ca-4b7b-8aa5-09280ed01322 tags:otter_answer_cell
``` python
x2_h_ff = ...
```
%% Cell type:code id:7f4204d6-6da3-4652-a491-aaf95bc6e455 tags:otter_answer_cell
``` python
pred_traj2_ff = ...
```
%% Cell type:code id:31cd1cdc-0b4e-4563-a2cd-4a0b1874d4cb tags:otter_answer_cell
``` python
# Plot the predicted and observed data
...
```
%% Cell type:code id:e8414ea3-f39e-4ede-a690-4e3d94ce6b29 tags:otter_answer_cell
``` python
# Plot the error between prediction and observation
...
```
%% Cell type:markdown id:a84ca578 tags:
<!-- END QUESTION -->
%% Cell type:code id:bd39160c tags:
``` python
# Initialize Otter
import otter
grader = otter.Notebook("RBF networks.ipynb")
```
%% Cell type:markdown id:b62d1e09-41c3-40d0-a763-e8231d437183 tags:
# Matrix Analysis 2025 - EE312
## Week 8 - Image classification with Radial Basis Function (RBF) networks
[LTS2](https://lts2.epfl.ch)
%% Cell type:code id:nasty-access tags:
``` python
import numpy as np
from matplotlib import pyplot as plt
from sklearn.metrics import pairwise_distances
from scipy.spatial import distance_matrix
from scipy.special import softmax
import requests
from pathlib import Path
```
%% Cell type:markdown id:vietnamese-basin tags:
## 1. Image Classification
In this exercise, we will be doing image classification with a simple neural network. For simplicity, let's assume we are working with grayscale images.
Given an input image $i$ represented as a vector of pixel intensities $ \mathbf{x}_i \in [0,1]^d$, we want to predict its correct label $\mathbf{y}_i$, which is represented as a one-hot vector in $\{0,1\}^K$, where $K$ is the number of possible categories (classes) that the image may belong to. For example, we may have pictures of cats and dogs, and our goal would be to correctly tag those images as either cat or dog. In that case we would have $K=2$, and the vectors $\begin{pmatrix}0 \\ 1\end{pmatrix}$ and $\begin{pmatrix}1 \\ 0\end{pmatrix}$ to represent the classes of cat and dog.
In today's example we will be using the MNIST handwritten digit dataset. It contains images of handwritten numbers from 0 to 9 and our goal is to create a model that can accurately tag each image with its number. Let's load the data first.
%% Cell type:code id:exact-cycling tags:
``` python
### Load the data
# Download data if needed
if not Path("./mnist_data.npz").is_file():
r = requests.get('https://os.unil.cloud.switch.ch/swift/v1/lts2-ee312/mnist_data.npz', allow_redirects=True)
with open('mnist_data.npz', 'wb') as f: # save locally
f.write(r.content)
mnist = np.load('mnist_data.npz')
```
%% Cell type:markdown id:varying-taylor tags:
In the context of classification, neural networks are models that, given one (or multiple) input data points, produce as output a set of corresponding labels for each input. The model itself consists of parametric functions $g_i$ which can be applied sequentially to the input data, resulting in a set of labels which are the model's prediction for the data. For example, in a model that consists of two parametric functions $g_1$ and $g_2$, for a given $\mathbf{x}_i$, we have the predicted label $ \hat{\mathbf{y}}_i = g_1(g_2(\mathbf{x}_i))$. The parametric functions are commonly called "layers".
In a standard image classification setup, we are given some training data which we can use to tune the parameters of the parametric functions $g_i$ in order to improve its ability to predict the labels correctly. The parameters are generally tuned with respect to some objective (commonly called a loss function). We want to find the parameters of the model that minimize this loss function. Various loss functions can be used, but in general they tend to encode how "wrong" the model is. For
example, on a given image $i$ one can use the loss $\mathcal{L}(\hat{\mathbf{y}}_i, \mathbf{y}_i)= \sum_{j=1}^{K}(\hat{{y}}_{ij} -{y}_{ij})^2 $, which is the sum of squared differences between the coordinates of the predicted label vector and those of the actual label $\mathbf{y}_i$.
Minimizing the loss over the whole training set is referred to as "training the model". Furthermore, the goal is that given new data we have not seen before and we have not trained our model with, the model will still be able to classify accurately.
Before we go into the details of the model and how we will train it, let's prepare the data.
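%% Cell type:markdown id:loss-toy-md tags:
First, a concrete toy instance of the squared-error loss defined above (the numbers are purely illustrative):
%% Cell type:code id:loss-toy-code tags:

```python
import numpy as np

y_hat = np.array([0.8, 0.1, 0.1])  # predicted scores for K = 3 classes
y = np.array([1.0, 0.0, 0.0])      # one-hot ground truth
loss = np.sum((y_hat - y) ** 2)
print(loss)  # close to 0.06
```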
%% Cell type:code id:e2140162-4414-47c2-ae99-d616300a1d65 tags:
``` python
# Preprocess the data
images = mnist['data']
num_images = images.shape[0]
train_set_size = 60000
test_set_size = 10000
train_images = images[:train_set_size]
train_images = train_images/255.
train_images = train_images
test_images = images[-test_set_size:]
test_images = test_images/255.
test_images = test_images
#create one-hot encodings of labels
mnist_target = mnist['target']
num_classes = mnist_target.max()+1
labels = []
for k in range(num_images):
one_hot = np.zeros(num_classes)
one_hot[int(mnist_target[k])]=1
labels += [one_hot]
labels = np.array(labels)
#labels in one-hot
train_labels = labels[:train_set_size]
test_labels = labels[-test_set_size:]
#labels in integer form
int_labels = np.array(mnist_target, dtype=int)
int_labels_train = int_labels[:train_set_size]
int_labels_test = int_labels[-test_set_size:]
```
%% Cell type:code id:00b3e353-a254-4cb5-975d-0741a50576c7 tags:
``` python
# View an image to make sure everything went well
which_one = 5
plt.imshow(train_images[which_one].reshape((28,28)));
```
%% Cell type:markdown id:textile-tower tags:
## 2. Radial Basis Function (RBF) networks
%% Cell type:markdown id:acquired-malaysia tags:
For our task, we will be using Radial Basis Function (RBF) Networks as our neural network model.
The pipeline, which is presented in the image below, consists of two layers. The first employs non-linear functions $g_1(\mathbf{x};\boldsymbol{\mu}): \mathbb{R}^{n \times d} \rightarrow \mathbb{R}^{n \times c}$.
The second is a linear layer, represented by a matrix of weights $\mathbf{W} \in \mathbb{R}^{c \times K}$, which maps the output of the previous layer to class scores; its role is to predict labels.
The pipeline proceeds in the following steps:
i) Choose a set of $c$ points $\boldsymbol{\mu}_j\in [0,1]^d$.
ii) Compute $g_1(\mathbf{x}_i;\boldsymbol{\mu}_j) = \exp\left(-\frac{\|\mathbf{x}_i-\boldsymbol{\mu}_j\|^2}{\sigma^2}\right)=a_{ij}$ for all possible pairs of $i$ and $j$. Here $\sigma$ is a hyperparameter that controls the width of the Gaussian.
iii) Compute the predicted labels $g_2(\mathbf{a}_i)= \mathbf{a}_i^{\top}\mathbf{W}= \hat{\mathbf{y}}_i$. Here $\mathbf{a}_i \in \mathbb{R}^c$ are the outputs of the layer $g_1$ for an input image $i$. $\hat{\mathbf{y}}_i$ is a row vector and $\hat{y}_{ij} = \sum_{m=1}^{c}a_{im}w_{mj}$, $j\in\{1,...,K\}$.
![RBF_NN.png](images/RBF_NN.png)
%% Cell type:markdown id:f762f16e-4a88-43a8-813f-c93c62407c72 tags:
Intuitively, the first layer of the RBF network can be viewed as matching the input data with a set of prototypes (templates) through a gaussian whose width is determined by $\sigma$. The second layer performs a weighted combination of the matching scores of the previous layer to determine the predicted label for a given point.
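%% Cell type:markdown id:rbf-toy-md tags:
A minimal sketch of this first-layer computation on toy 2-D points (illustrative only; the points, the prototypes and $\sigma$ below are arbitrary choices, not the graded implementation):
%% Cell type:code id:rbf-toy-code tags:

```python
import numpy as np
from scipy.spatial import distance_matrix

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # three toy "images" in 2-D
mu = np.array([[0.0, 0.0], [1.0, 1.0]])             # c = 2 prototypes
sigma = 1.0

d = distance_matrix(X, mu)       # pairwise Euclidean distances, shape (3, 2)
A = np.exp(-d**2 / sigma**2)     # Gaussian matching scores a_ij
print(A.shape)
```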
%% Cell type:markdown id:126fdb8e-9607-4f93-9726-692e5ed8bb91 tags:
**1.** For hyperparameters $c$ and $\sigma$ of your choice, select $c$ prototypes and obtain the output of the first layer of the RBF network. The prototypes can simply be random images from your training set.
The following functions might be helpful:
- [pairwise_distances (from scikit-learn)](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise_distances.html)
- [random.choice (from numpy)](https://numpy.org/doc/stable/reference/random/generated/numpy.random.choice.html)
You can (optionally) perform an additional normalization step on the activations using the [softmax](https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.softmax.html) function.
%% Cell type:code id:0e4ab127-352e-49c3-9b72-9e69bfc8b4ba tags:otter_answer_cell
``` python
def get_rand_centers(num_centers, imgs):
"""
Sample num_centers (randomly) from imgs
Parameters
----------
num_centers : number of samples
imgs : matrix to sample rows from
Returns
-------
The samples matrix
"""
...
def get_activations(imgs, rand_centers, sigma, softmax_norm=False):
"""
Computes the activations of the images vs. the sample centers
Parameters
----------
imgs: images matrix to compute activations for
rand_centers: matrix of centers points
sigma: parameter of the gaussian kernel
softmax_norm: if true, perform softmax activation on the activations
Returns
-------
The activation matrix A
"""
...
```
%% Cell type:code id:19e1cca2-bb2c-40c7-b64a-e14c3b42e54d tags:otter_answer_cell
``` python
#pick random centers
number_of_centers = 200
rand_centers = get_rand_centers(number_of_centers, train_images)
sigma = 10
activations = get_activations(train_images, rand_centers, sigma)
```
%% Cell type:code id:6bec41d9 tags:
``` python
grader.check("q1")
```
%% Cell type:markdown id:institutional-thompson tags:
## 3. Training the network
%% Cell type:markdown id:disciplinary-present tags:
To make things easier, we will fix the parameters $\boldsymbol{\mu}$ and $\sigma$ of the network, i.e., we decide their values beforehand and they remain constant throughout training and testing of the model. Therefore, the only trainable parameters are going to be the weights of the second layer.
To train the model, we are going to use the mean squared loss function that we mentioned earlier. For a training dataset with $n$ images we have
$$ \mathcal{L}(\text{training data}, \text{training labels}) = \frac{1}{2n}\sum_{i=1}^n\mathcal{L}(\hat{\mathbf{y}}_i,\mathbf{y}_i) = \frac{1}{2n}\sum_{i=1}^n ||(\hat{\mathbf{y}}_{i} - \mathbf{y}_{i})||^2.$$
There are two ways of tuning those:
i) Backpropagation.
ii) Solve a linear system.
%% Cell type:markdown id:f0ef20c5-0cc0-4a7d-a80b-8acfa207f82e tags:
### 3.1 Training with backpropagation
Backpropagation depends on [gradient descent](https://en.wikipedia.org/wiki/Gradient_descent#Description). The goal is to update the trainable parameters of the network by "moving them" in the direction that will decrease the loss function.
In our case, the weights $w_{kl}$ are updated in the following manner
$$ w_{kl}' = w_{kl}- \gamma \frac{\partial\mathcal{L}(\text{training data}, \text{training labels})}{\partial w_{kl}}, $$
where $\gamma$ is a hyper-parameter called the learning rate. The gradient of the Loss points towards the direction of steepest descent, hence we update the weights of the network towards that direction.
---
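%% Cell type:markdown id:sgd-toy-md tags:
The update rule itself can be illustrated on a scalar toy problem, independent of the network: minimizing $f(w)=(w-3)^2$ by repeated gradient steps (the function and learning rate are illustrative choices).
%% Cell type:code id:sgd-toy-code tags:

```python
w = 0.0
gamma = 0.1  # learning rate
for _ in range(100):
    grad = 2 * (w - 3)   # derivative of (w - 3)**2
    w = w - gamma * grad  # gradient descent update
print(w)  # very close to the minimizer w = 3
```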
%% Cell type:markdown id:189e8b8e-a909-4f9e-950a-e0f65a48d70c tags:
<!-- BEGIN QUESTION -->
2. For the mean squared error loss, what is the gradient of the loss with respect to the weights $w_{kl}$ of the network?
%% Cell type:markdown id:0df8dee2 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:injured-mother tags:
<!-- END QUESTION -->
3. Train the weights of the linear layer using stochastic gradient descent. For $p$ iterations (called epochs), you have to update each weight $w_{kl}$ of the network once for each image, by computing the gradient of the loss with respect to that weight.
NB: if you implement gradient computation naively, it might be very slow. Consider using [numpy.outer](https://numpy.org/doc/stable/reference/generated/numpy.outer.html) to speed up computation.
%% Cell type:code id:ddb1a6cc-c9d5-4561-8c0d-0f0c5b396ae1 tags:
``` python
# Helper function to compute the loss
def get_predictions_loss(activations, weights, labels, int_labels):
predictions = activations@weights
num_correct_predictions = ((predictions.argmax(1) - int_labels)==0).sum()
loss = ((predictions - labels)*(predictions - labels)).sum(1).mean()
return loss, num_correct_predictions
```
%% Cell type:code id:e365cb99-d2c3-4b98-a814-e5c5485f3967 tags:otter_answer_cell
``` python
# compute the gradient for a single input
def compute_gradient(activation, weights, train_label):
"""
Computes gradients of the weight for a single activation
Parameters
----------
activation : vector containing the activation of the current image
weights: current weights
train_label: label of the current image
Returns
-------
The gradient to update the weights
"""
...
```
%% Cell type:code id:46e31c54-3682-4124-8a08-f675fba974f7 tags:otter_answer_cell
``` python
# Initial values for hyperparams. Feel free to experiment with them.
weights = (1/28)*np.random.randn(number_of_centers, num_classes)
epochs = 5 # you should train for more epochs !
learning_rate = 0.1
#Backpropagation with SGD
for k in range(epochs):
for counter, activation in enumerate(activations):
gradient = ...
weights = ...
loss_train, num_correct_predictions_train = get_predictions_loss(activations, weights, train_labels, int_labels_train)
print("Loss:", loss_train)
print("Number of correct predictions:", num_correct_predictions_train)
```
%% Cell type:code id:957d63fb tags:
``` python
grader.check("q3")
```
%% Cell type:markdown id:criminal-imperial tags:
We can now check how well your network does on the test set and print its accuracy.
%% Cell type:code id:civil-riding tags:
``` python
def get_accuracy(predictions, int_labels, set_size):
num_correct_predictions = ((predictions.argmax(1) - int_labels)==0).sum()
return num_correct_predictions/set_size
test_activations = get_activations(test_images, rand_centers, sigma)
test_predictions = test_activations@weights
print(f"The accuracy on the test set is: {get_accuracy(test_predictions, int_labels_test, test_set_size)*100} %")
```
%% Cell type:markdown id:promising-funds tags:
### 3.2 Solving the linear system
%% Cell type:markdown id:f8234141-0eeb-49da-8d37-892a6416e41d tags:
Since we only have one weight matrix to tune, we can avoid learning with backpropagation entirely. Consider the mean squared error for the whole dataset and a one-dimensional binary label $y_i$ for each data point for simplicity. The mean squared loss for the dataset is
$$ \sum_{i=1}^n (\hat{{y}}_{i} - {y}_{i})^2= ||(\mathbf{A}\mathbf{w} - \mathbf{y})||^2.$$ Here $\mathbf{A} \in \mathbb{R}^{n \times c}$ is the matrix that contains the outputs (activations) of the first layer. From a linear algebra perspective, we are looking for a matrix $\mathbf{w}$ that solves the linear system $ \mathbf{A}\mathbf{w} = \mathbf{y}.$
---
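%% Cell type:markdown id:lsq-toy-md tags:
For reference, a generic over-determined toy system solved in the least-square sense with NumPy (whether and why this applies to our system is precisely what the questions below ask):
%% Cell type:code id:lsq-toy-code tags:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((10, 3))  # more equations than unknowns
y = rng.standard_normal(10)

# w minimizing ||A w - y||^2; equivalently pinv(A) @ y
w, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(np.allclose(w, np.linalg.pinv(A) @ y))
```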
%% Cell type:markdown id:4e691348-afe4-4021-a42b-3985a4e9999e tags:
<!-- BEGIN QUESTION -->
4. Can we find solutions to this system (justify) and how ?
%% Cell type:markdown id:cf9e153e tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:subsequent-exercise tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
5. Based on your answer above, compute the weights of the neural network that best classify the data points of the training set.
%% Cell type:markdown id:6e7fb43a tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:code id:collected-renaissance tags:otter_answer_cell
``` python
#calculate the weights of the linear layer
weights_lsq = ...
```
%% Cell type:code id:chinese-foster tags:otter_answer_cell
``` python
#predict the labels of each image in the training set and compute the accuracy
train_prediction_lsq = ...
print(f"The accuracy on the training set is: {get_accuracy(train_prediction_lsq, int_labels_train, train_set_size)*100} %")
```
%% Cell type:code id:young-invitation tags:otter_answer_cell
``` python
#calculate the activations of the test set
test_activations = get_activations(test_images, rand_centers, sigma)
```
%% Cell type:code id:naked-dover tags:otter_answer_cell
``` python
#predict the accuracy on the test set
test_predictions_lsq = ...
print(f"The accuracy on the test set is: {get_accuracy(test_predictions_lsq, int_labels_test, test_set_size)*100} %")
```
%% Cell type:markdown id:handmade-warrant tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
### 6. **Open ended**: On the choice of templates.
Suggest a different or more refined way to select templates for the RBF network and implement it. Check how it compares with your original approach.
Check how it works with the backpropagation and linear system solutions.
%% Cell type:markdown id:4b05dc5e tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:8734c27e tags:
<!-- END QUESTION -->
%% Cell type:code id:60552e92 tags:
``` python
# Initialize Otter
import otter
grader = otter.Notebook("Linear transforms.ipynb")
```
%% Cell type:markdown id:73bbe3f2-6e0a-463e-abc6-01213d4e5811 tags:
# Matrix Analysis 2025 - EE312
## Week 2 - Linear transforms
[N. Aspert](https://people.epfl.ch/nicolas.aspert) - [LTS2](https://lts2.epfl.ch)
The first week notebook (introduction to Python, Numpy and Matplotlib) can be used as a help.
## Objective
The end goal is to understand purely algebraic, matrix based, view of a few linear transforms. You will use those linear transform to perform some basic time-frequency analysis of signals.
%% Cell type:code id:9e522683-69ce-458f-a78c-9780e07a865e tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
```
%% Cell type:markdown id:5901baaa-4eb1-446a-8c0b-88369a9e213e tags:
## Part I - Fourier
%% Cell type:markdown id:7e81811f-e9c9-4d57-bbb9-bee71b7df722 tags:
<!-- BEGIN QUESTION -->
1. Prove that any set of orthogonal vectors $v_i \in \mathbb{C}^N, \, i=1, \ldots , M \leq N$ such that $v_i^H v_j = C \delta_{i,j}$ is linearly independent (where $C$ is some constant).
%% Cell type:markdown id:b6087a12 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:5d9c0799-a145-4b31-b8a7-3dc6d9fcf6da tags:otter_assign_solution_cell
We have a set of orthogonal vectors $v_i \in \mathbb{C}^N$, with $i=1, \ldots , M \leq N$, i.e.
$v_i^H v_j = C \delta_{i,j} =\left\{\begin{matrix}
C,& \text{if} & i=j\\
0,& \text{if} & i\neq j
\end{matrix}\right.$, with $C\neq 0$.<br/>
We want to prove that the set is linearly independent, i.e. that for scalars $c_j$,
$\sum_{j = 1}^{M} c_{j}v_{j} = 0 \Rightarrow c_{j} = 0$ for all $j$.<br/><br/>
Suppose $\sum_{j = 1}^{M} c_{j}v_{j} = 0$ and multiply on the left by $v_i^H$:<br/>
$0 = v_{i}^H \Big(\sum_{j = 1}^{M}c_{j}v_{j}\Big) = \sum_{j = 1}^{M}c_{j}\,v_{i}^H v_{j}$.<br/><br/>
Since the set of vectors $v_i$ is orthogonal, $v_i^H v_j = C \delta_{i,j}$, so all the terms of the sum vanish except the one where $j=i$:<br/>
$0 = c_{i}\,v_{i}^H v_{i} = c_{i}C$.<br/><br/>
Since $v_{i}$ is a non-zero vector, $C = v_{i}^H v_{i} \neq 0$. Hence $c_{i} = 0$.<br/><br/>
This reasoning holds for every $i=1, \ldots , M$. We can conclude that
$\sum_{j = 1}^{M} c_{j}v_{j} = 0 \Leftrightarrow c_{j} = 0$ for all $j$.<br/>
**The set of orthogonal vectors is linearly independent.**
%% Cell type:markdown id:accea561-acef-429f-aa68-cc002cba19b8 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
2. Compute $a_r = \sum_{n=0}^{N-1}e^{j2\pi r\frac{n}{N}}$, where $r$ is an integer (discuss the result depending on the value of $r$).
%% Cell type:markdown id:3f0ff7a0 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:6da31060-083f-4873-9ad3-bf33c4a62b84 tags:otter_assign_solution_cell
If $r\equiv 0 \mod N$, then $e^{j2\pi r\frac{n}{N}}=1$, and $a_r = \sum_{n = 0}^{N-1} 1 = N$.
If $r\not\equiv 0 \mod N$, this is the sum of a geometric series $a_r = \sum_{n = 0}^{N-1} b^{n} = \frac{1 - b^N}{1 - b}$, with $b=e^{\frac{2jr\pi}{N}}$.
Since $b^N = e^{2jr\pi} = 1$, we then have $a_r = 0$.
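A quick numerical check of this result (the helper `a_r` is just an illustration of the sum above):

```python
import numpy as np

def a_r(r, N):
    # numerically evaluate the sum of exp(j 2*pi*r*n/N) for n = 0..N-1
    n = np.arange(N)
    return np.sum(np.exp(2j * np.pi * r * n / N))

N = 8
print(a_r(0, N))    # N when r is congruent to 0 mod N
print(a_r(16, N))   # also N, since 16 = 0 mod 8
print(a_r(3, N))    # ~0 otherwise
```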
%% Cell type:markdown id:b591574c-ba6d-4a15-8b42-0cb62508da73 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
3. Let $v_k \in \mathbb{C}^N$ be such that $v_k[n] = e^{-j 2 \pi \frac{kn}{N}}$, for $k,n = 0, \ldots , N-1$.
- Prove that these vectors are mutually orthogonal, hence linearly independent.
- Compute the norm of $v_k$.
%% Cell type:markdown id:d8031ee7 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:1cd4d461-705e-405d-a433-ee0d4caeb522 tags:otter_assign_solution_cell
$v_{k}$ has the form $\begin{pmatrix}
e^{0}\\
e^{-j 2 \pi \frac{k}{N}}\\
\vdots\\
e^{-j 2 \pi \frac{k(N-1)}{N}}
\end{pmatrix}$, for $k = 0, \ldots , N-1$.<br/><br/>
As mentioned in the previous question, to prove that the vectors are mutually orthogonal, we need to show that $v_k^H v_l = C \delta_{k,l}$, with $C$ some constant.<br/><br/>
$v_k^H v_l = \sum_{n = 0}^{N-1} \overline{v_{k}[n]}\, v_l[n]$,&nbsp; with $k,l = 0, \ldots , N-1$ &nbsp;&nbsp;&nbsp; (simple rule of matrix multiplication)<br/><br/>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; $= \sum_{n = 0}^{N-1} e^{j2\pi k\frac{n}{N}}e^{-j2\pi l\frac{n}{N}} $<br/>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; $= \sum_{n = 0}^{N-1} e^{j2\pi \frac{n}{N}(k-l)}$<br/>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; $= \sum_{n = 0}^{N-1} e^{j2\pi \frac{n}{N}r}$, &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; with $r=k-l$.<br/><br/>
Using the result of question 2, $v_k^H v_l = 0$ for $k\neq l$.
For $r = 0$ (i.e. $k = l$), question 2 shows that $v_k^H v_k = N$.<br/><br/>
We can conclude:<br/>
$v_k \in \mathbb{C}^N$ such that $v_k[n] = e^{-j 2 \pi \frac{kn}{N}}$, for $k,n = 0, \ldots , N-1$, gives us<br/>
$v_k^H v_l = N \delta_{k,l}$.
**The vectors are mutually orthogonal, hence linearly independent.**
From the above result we have $v_k^Hv_k = N = \|v_k\|^2$, and therefore $\|v_k\| = \sqrt{N}$.
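This can be verified numerically by building the Gram matrix of the $v_k$ (a small illustrative check, independent of the graded functions):

```python
import numpy as np

N = 16
k = np.arange(N)
V = np.exp(-2j * np.pi * np.outer(k, k) / N)   # row k is the vector v_k
G = np.conj(V) @ V.T                           # G[k, l] = v_k^H v_l
print(np.allclose(G, N * np.eye(N)))           # True: v_k^H v_l = N delta_{k,l}
```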
%% Cell type:markdown id:efc7a6a3-1602-4100-8ec8-d82f9e01ad86 tags:
<!-- END QUESTION -->
4. Implement the function `get_fourier_matrix` that returns a normalized Fourier matrix of size $N\times N$. Do not make use of the builtin DFT/FFT functions in `numpy` or `scipy`. Raise a `ValueError` exception if a negative or zero $N$ value is supplied.
%% Cell type:code id:f58a90c7-d4e3-48d9-a21b-42a4d79e2344 tags:otter_assign_solution_cell
``` python
def get_fourier_matrix(N):
# BEGIN SOLUTION
if N <= 0:
raise ValueError("Invalid size")
k = np.arange(0, N)
Ws = np.exp(-2j*np.pi*np.outer(k, k)/N)
return Ws/np.sqrt(N)
# END SOLUTION
```
%% Cell type:code id:2283adf4 tags:
``` python
grader.check("q4")
```
%% Output
q4 results: All test cases passed!
q4 - 1 message: Good, your implementation returns correct results
q4 - 2 message: Good, you did not use the scipy.linalg.dft function
q4 - 3 message: Good, you did not use the scipy.fft.fft function
q4 - 4 message: Good, you did not use the numpy.fft.fft function
q4 - 5 message: Good, your implementation returns an orthogonal matrix
q4 - 6 message: Good, you properly validated size before computing the result
%% Cell type:markdown id:a5f6b819-1e53-456a-bf67-b76258922d6e tags:
Let us now generate two test signals.
The first one $x_1(t)$ is made of four piecewise sinusoids, of different frequencies:
$$
x_1(t) = \begin{cases}
\cos(2\pi 5t), & 0\le t < 2\\
\cos(2\pi 10t), & 2\le t < 4\\
\cos(2\pi 25t), & 4\le t < 6\\
\cos(2\pi 50t), & 6\le t < 8
\end{cases}
$$
%% Cell type:code id:081e39c8-1874-4274-a7dc-59c8bdab84e4 tags:
``` python
Fs = 256 # sampling frequency
t = np.arange(0, Fs*8)/Fs
x1 = np.zeros(Fs*8)
x1[0:Fs*2] = np.cos(2*np.pi*5*t[0:Fs*2])
x1[Fs*2:Fs*4] = np.cos(2*np.pi*10*t[Fs*2:Fs*4])
x1[Fs*4:Fs*6] = np.cos(2*np.pi*25*t[Fs*4:Fs*6])
x1[Fs*6:Fs*8] = np.cos(2*np.pi*50*t[Fs*6:Fs*8])
```
%% Cell type:markdown id:b7f7475c-bfb0-4e9f-9dd6-ac9eeb5abef1 tags:
The second signal $x_2(t)$ is the sum of the same sinusoids over the complete time interval, with a scaling term s.t. the amplitude of both signals is identical.
%% Cell type:code id:8e0199b4-baf4-4376-a61c-b778aabe59f7 tags:
``` python
x2 = 0.25*(np.cos(2*np.pi*5*t) + np.cos(2*np.pi*10*t) + np.cos(2*np.pi*25*t) + np.cos(2*np.pi*50*t))
```
%% Cell type:markdown id:c52f3daf-0017-4252-b297-b5d6bb6c8151 tags:
<!-- BEGIN QUESTION -->
5.
- Display the generated signals using `plt.plot`.
- Compute their Fourier transforms using the Fourier matrix generated in the previous question.
- Display the amplitude of their Fourier spectrum. What do you observe ?
%% Cell type:markdown id:cbde7de1 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:code id:7d4aee96-f61e-4e51-84e8-56b4e357aa0f tags:otter_assign_solution_cell
``` python
# plot x1
# BEGIN SOLUTION
plt.plot(x1)
# END SOLUTION
```
%% Cell type:code id:7a504349-03a3-4efd-bc3b-8249c09453fb tags:otter_assign_solution_cell
``` python
# plot x2
# BEGIN SOLUTION
plt.plot(x2)
# END SOLUTION
```
%% Cell type:code id:7bf1bf4a-04b8-43f6-8128-29647e1f964a tags:otter_assign_solution_cell
``` python
# Compute the Fourier transform of x1 and x2
# BEGIN SOLUTION
W = get_fourier_matrix(len(x1))
xf1 = W@x1
xf2 = W@x2
# END SOLUTION
```
%% Cell type:code id:159e693c-4427-4a8b-9f53-42dc63bafbc9 tags:otter_assign_solution_cell
``` python
# Plot the spectrum of x1
# BEGIN SOLUTION
plt.plot(np.abs(xf1))
# END SOLUTION
```
%% Cell type:code id:f2c3061b-900e-44d9-bb73-02c98334a818 tags:otter_assign_solution_cell
``` python
# Plot the spectrum of x2
# BEGIN SOLUTION
plt.plot(np.abs(xf2))
# END SOLUTION
```
%% Cell type:markdown id:61c226cc-df74-4a71-9e18-8b2071e65a91 tags:otter_assign_solution_cell
Although $x_1$ and $x_2$ are quite different signals, we notice that the amplitudes of their Fourier transforms are very similar. The Fourier transform of $x_1$ exhibits more ripples around the peaks due to the discontinuities, but both show four symmetrical peaks at the frequencies of the 4 sinusoids composing the signals. While the input signals can be reconstructed from their Fourier transforms, these representations are not well suited here: they give precise information about the frequencies present in both signals, but no information about their location in time.
%% Cell type:markdown id:19060be9-6494-4b4e-9803-1a4cb6fc36ac tags:
<!-- END QUESTION -->
In order to have a better compromise between time and frequency, the input signal will be split in smaller non-overlapping blocks of length $p$, and we will perform the DFT of each block.
6. Using the `get_fourier_matrix` implemented previously, fill the `get_block_dft_matrix` function below s.t. it returns a $N\times N$ matrix that will perform the block Fourier transform when applied to an input vector. Raise a `ValueError` if $p$ does not divide $N$.
Hint: [numpy.pad](https://numpy.org/doc/stable/reference/generated/numpy.pad.html#numpy.pad) and/or [numpy.kron](https://numpy.org/doc/stable/reference/generated/numpy.kron.html) might prove useful.
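To see why `numpy.kron` is relevant here, note that the Kronecker product of an identity matrix with a small matrix produces a block-diagonal matrix repeating that block (a toy illustration with a $2\times 2$ block, not the graded solution):

```python
import numpy as np

F = np.array([[1, 1], [1, -1]])          # small transform applied to each block
Wb = np.kron(np.eye(2, dtype=int), F)    # block-diagonal: one copy of F per block
print(Wb)
# [[ 1  1  0  0]
#  [ 1 -1  0  0]
#  [ 0  0  1  1]
#  [ 0  0  1 -1]]
```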
%% Cell type:code id:903005b2-e249-48a0-b28b-24daef16cdb7 tags:otter_assign_solution_cell
``` python
def get_block_dft_matrix(N, p):
# BEGIN SOLUTION
if N % p != 0:
raise ValueError("Incorrect size")
return np.kron(np.eye(N//p), get_fourier_matrix(p))
# END SOLUTION
```
%% Cell type:code id:58978e43 tags:
``` python
grader.check("q6")
```
%% Cell type:markdown id:4e894db3-b5e6-4b44-a7d1-988c59542b74 tags:
<!-- BEGIN QUESTION -->
We will now use this block Fourier transform to see how the frequencies of the signal evolve through time.
7.
- Using the `reshape` and `plt.imshow` functions, display the amplitude of the result of the block Fourier transform of $x_1$ and $x_2$. Is the result improved when compared to the one observed in question 5 ?
- What is the influence of parameter $p$ ?
%% Cell type:markdown id:6239185e tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:code id:6da33694-7bd6-4257-9a5f-9bb300cd5c69 tags:otter_assign_solution_cell
``` python
# Compute the block DFT matrix Wb
# BEGIN SOLUTION
p = 64
Wb = get_block_dft_matrix(len(x1), p)
# END SOLUTION
```
%% Cell type:code id:e1eb36fe-3079-4feb-86df-10596293e550 tags:otter_assign_solution_cell
``` python
# Plot the block DFT of x1
# BEGIN SOLUTION
plt.imshow(np.rot90(np.abs(Wb@x1).reshape(len(x1)//p, p)))
# END SOLUTION
```
%% Cell type:code id:4de97281-75ba-49d0-bc02-2573cf9791ec tags:otter_assign_solution_cell
``` python
# Plot the block DFT of x2
# BEGIN SOLUTION
plt.imshow(np.rot90(np.abs(Wb@x2).reshape(len(x2)//p, p)))
# END SOLUTION
```
%% Cell type:markdown id:34644739-72f3-4570-b27f-579baeb2f33d tags:otter_assign_solution_cell
We now have a better idea of how the frequencies present in the signals evolve through time. It is more obvious that $x_1$ contains a sequence of sinusoids and $x_2$ a superposition of sinusoids.
However, the limitation is that the parameter $p$ controls the frequency resolution. If we choose a value of $p$ that is too small, the frequency resolution is poor; if we choose a $p$ that is too large, the time resolution becomes poor instead. So while the time-frequency display is improved, there is a tradeoff to be made between time and frequency resolutions, as both cannot be simultaneously high.
%% Cell type:markdown id:76cc3589-b7d9-44ec-98ea-07fc6550a53a tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
8. In a real-world application, would generating a $N\times N$ matrix to perform the block Fourier transform be a good way to implement it ? (Justify)
%% Cell type:markdown id:ff1a761a tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:21f8ce3e-1548-45fe-aa05-7b59aa052eca tags:otter_assign_solution_cell
In a real-world application, the matrix as we implemented it is a poor choice. As the blocks that compose it are identical, it would be much more efficient to apply the smaller block Fourier matrix directly to the chunks of the input signals.
%% Cell type:markdown id:87372bd1-4055-4742-ab6a-f7b6ba8750af tags:
<!-- END QUESTION -->
## Part II - Haar transform
In this part we will study another approach to study the time/frequency properties of signals.
Let us consider a vector $x\in\mathbb{R}^N$, with $N$ being even. The single-level Haar transform decomposes $x$ into two vectors $a^1$ and $d^1$ belonging to $\mathbb{R}^{\frac{N}{2}}$.
The coefficients of $a^1$ are defined as follows: $a^1_n = \frac{1}{\sqrt{2}}(x_{2n-1} + x_{2n}), n=1, ..., \frac{N}{2}$. $a^1$ is referred to as the *average coefficients vector*.
The coefficients of $d^1$ are defined as follows: $d^1_n = \frac{1}{\sqrt{2}}(x_{2n-1} - x_{2n}), n=1, ..., \frac{N}{2}$. $d^1$ is referred to as the *detail coefficients vector*.
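The definitions above can be computed directly by grouping the samples in consecutive pairs; a minimal sketch (the array `x` is arbitrary example data):

```python
import numpy as np

x = np.array([4., 6., 10., 12., 8., 6., 5., 5.])  # N = 8
pairs = x.reshape(-1, 2)                          # row n holds (x_{2n-1}, x_{2n}) in the 1-based notation above
a1 = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)     # average coefficients a^1
d1 = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)     # detail coefficients d^1
print(a1)
print(d1)
```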
%% Cell type:markdown id:170a3e62-8b25-482c-a8dd-91988b1d08ed tags:
<!-- BEGIN QUESTION -->
9. Let us represent the single-level Haar transform by a matrix $H_1$ s.t.
$$
H_1 x = \begin{pmatrix}a^1 \\ d^1\end{pmatrix}
$$
Prove that $H_1$ is orthonormal.
%% Cell type:markdown id:56fd9eda tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:e5906e37-9a3f-469c-9904-ba4fe551612c tags:otter_assign_solution_cell
In order to check that $H_1H_1^T=H_1^TH_1 = I$ we need to distinguish different cases.
First, we see that $H_1$ has two types of rows: the upper-half is made of shifted copies of $\frac{1}{\sqrt{2}}(1, 1, 0, ...., 0)$, the lower-half is made of shifted copies of
$\frac{1}{\sqrt{2}}(1, -1, 0, ...., 0)$. Both are shifted by multiples of 2 elements, and the only case where those vectors have a common support is when they are shifted by the same number of elements. Otherwise they have disjoint supports, and their scalar product is 0, which makes them orthogonal.
When they have a common support, there are two cases:
- we compute the scalar product of a row of $H_1$ with itself, which in this case is clearly equal to 1
- we compute the scalar product of a row of $H_1$ with its counterpart having the same support in the other half of the matrix, i.e. we compute the scalar product of $(..., 1, 1, 0, ... )$ with $(..., 1, -1, ...)$, which is clearly equal to 0.
This shows that $H_1$ is an orthonormal matrix.
%% Cell type:markdown id:92f1b78b-3017-4129-bc95-f132717bc7b9 tags:
<!-- END QUESTION -->
10. Write a function that returns the single-level Haar transform matrix $H_1$ for a given $N$. Raise a `ValueError` if $N$ is invalid.
%% Cell type:code id:5d123a1c-7f98-4edf-b896-7d773bf48e76 tags:otter_assign_solution_cell
``` python
def get_sl_haar_matrix(N):
# BEGIN SOLUTION
if N <= 0 or N%2 != 0:
raise ValueError("Invalid size")
Hu = np.kron(np.eye(N//2), [1, 1])
Hl = np.kron(np.eye(N//2), [1, -1])
return np.vstack([Hu, Hl])/np.sqrt(2)
# END SOLUTION
```
%% Cell type:code id:86b9b019 tags:
``` python
grader.check("q10")
```
%% Cell type:markdown id:e2e7a4af-0f4e-4972-a051-468a12967e79 tags:
The multi-level Haar transform is defined by recursively applying the single-level transform **to the average coefficient parts**.
For instance constructing 2-level Haar transform over $N$ points start with the previously defined $H_{1,N}$ matrix of size $N\times N$ and the corresponding $\frac{N}{2}\times\frac{N}{2}$ version denoted by $H_{1,\frac{N}{2}}$.
$H_{1,N}$ can be written as
$$
\begin{pmatrix} H_{1, N}^a \\ H_{1,N}^d \end{pmatrix},
$$
where $H_{1, N}^a$ and $H_{1,N}^d$ are respectively the average and detail coefficient matrices, both of size $\frac{N}{2}\times N$.
Following these notations, the 2-level Haar transform matrix $H_{2,N}$ can be written as:
$$
\begin{pmatrix} H_{1,\frac{N}{2}}H_{1, N}^a \\ H_{1,N}^d \end{pmatrix},
$$
11. Implement a function that returns the $H_{p,N}$ matrix of size $N\times N$ that performs a $p$-level haar transform. Raise a `ValueError` if the size and the level are incompatible, or if the level is smaller than 1.
%% Cell type:code id:d73eb83b-8d4b-4bf9-9c19-b3d15ee927c1 tags:otter_assign_solution_cell
``` python
def get_haar_matrix(N, level):
# BEGIN SOLUTION
if level < 1:
raise ValueError("Invalid level")
H = get_sl_haar_matrix(N)
cur_N = N
for k in range(level - 1):
cur_N = cur_N//2
if cur_N % 2 != 0:
raise ValueError("Incompatible size")
Hs = get_sl_haar_matrix(cur_N)
H[:cur_N, :] = Hs@H[:cur_N, :]
return H
# END SOLUTION
```
%% Cell type:code id:c3138843 tags:
``` python
grader.check("q11")
```
%% Cell type:markdown id:918653e3-22c1-4b25-abbc-2461a6736557 tags:
<!-- BEGIN QUESTION -->
12. Prove that $H_{p,N}$ is orthonormal.
%% Cell type:markdown id:02e21ab7 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:136913af-42f3-417a-a94d-2ae7fa7411c7 tags:otter_assign_solution_cell
We have shown that $H_{1,N}$ is orthonormal. Using the definition of $H_{2,N}$, we have
$$
H_{2,N}H_{2,N}^T=\begin{pmatrix} H_{1,\frac{N}{2}}H_{1, N}^a \\ H_{1,N}^d \end{pmatrix}\begin{pmatrix} {H_{1, N}^a}^T H_{1,\frac{N}{2}}^T & {H_{1,N}^d}^T \end{pmatrix},\\
= \begin{pmatrix} H_{1,\frac{N}{2}}(H_{1, N}^a{H_{1, N}^a}^T) H_{1,\frac{N}{2}}^T & H_{1,\frac{N}{2}}H_{1, N}^a {H_{1,N}^d}^T \\ H_{1,N}^d{H_{1, N}^a}^T H_{1,\frac{N}{2}}^T & H_{1,N}^d{H_{1,N}^d}^T\end{pmatrix}.
$$
Since the rows of $H_{1,N}$ are orthonormal, $H_{1,N}^a{H_{1,N}^a}^T = I$, $H_{1,N}^d{H_{1,N}^d}^T = I$ and $H_{1,N}^a{H_{1,N}^d}^T = 0$; as $H_{1,\frac{N}{2}}$ is also orthonormal, the expression simplifies to
$$
H_{2,N}H_{2,N}^T=\begin{pmatrix} I & 0 \\ 0 & I\end{pmatrix}.
$$
By induction, this is true for all levels.
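A small numerical check of the two-level case (the helper `sl_haar` rebuilds the single-level matrix as derived above, just for this illustration):

```python
import numpy as np

def sl_haar(N):
    # single-level Haar matrix (N even): average rows on top, detail rows below
    Hu = np.kron(np.eye(N // 2), [1, 1])
    Hl = np.kron(np.eye(N // 2), [1, -1])
    return np.vstack([Hu, Hl]) / np.sqrt(2)

N = 8
H2 = sl_haar(N)
H2[:N // 2, :] = sl_haar(N // 2) @ H2[:N // 2, :]  # apply a second level to the average part
print(np.allclose(H2 @ H2.T, np.eye(N)))           # True: H_{2,N} is orthonormal
```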
%% Cell type:markdown id:8224d668-7cdc-4175-a587-0647df59b2b4 tags:
<!-- END QUESTION -->
### Haar transform visualization
In order to make the visualization of the Haar decomposition easy, we provide you the `plot_haar_coeffs` function that will display the average and detail coefficients of the different levels.
The function takes 2 arguments:
- the input signal
- the number of levels
%% Cell type:markdown id:b6cac5b0-a173-499c-ad89-9e4b82166c7a tags:
<!-- BEGIN QUESTION -->
13. Display the Haar transform of $x1$ and $x2$.
%% Cell type:code id:5ca2ec21-222e-430c-a7bd-d2af54e2de47 tags:
``` python
from display_helper import plot_haar_coeffs
```
%% Cell type:code id:6a946638-18dc-44fa-8ffc-0b44a3f653e1 tags:otter_assign_solution_cell
``` python
# display the decomposition of x1
# BEGIN SOLUTION
plot_haar_coeffs(x1, 5)
# END SOLUTION
```
%% Cell type:code id:b01b9267-9609-474e-b5e0-8b7adb687164 tags:otter_assign_solution_cell
``` python
# display the decomposition of x2
# BEGIN SOLUTION
plot_haar_coeffs(x2, 5)
# END SOLUTION
```
%% Cell type:markdown id:81e23ac6-de1a-4cb4-98f2-89545052f9f3 tags:
<!-- END QUESTION -->
## Part III - Denoising
We will now use the different transforms defined in part I and II to perform denoising.
Let us create a noisy signal for this purpose.
%% Cell type:code id:e5583ca2-13ea-49f8-baf6-4d6a223f3b2e tags:
``` python
angle1 = np.linspace(0, 5*np.pi/2, 300)
wave1 = np.sin(angle1)
angle2 = np.linspace(0, 3*np.pi/2, 300)
wave2 = np.sin(angle2)
angle3 = np.linspace(np.pi/2, 2*np.pi, 424)
wave3 = np.sin(angle3)
wave = np.append(wave1, wave2)
wave = np.append(wave, wave3)
wave_noisy = wave + 0.2*np.random.normal(0, 1, 1024)
```
%% Cell type:code id:d3278fc1-e814-47b9-83f0-61f4fad8d4d7 tags:
``` python
plt.plot(wave_noisy, 'r')
plt.plot(wave)
```
%% Cell type:markdown id:33489d2d-b038-4896-87e7-0af6568ad702 tags:
The noise is usually located in the higher frequencies. However, the signal we created is a bit special as it has two discontinuities which also generate high frequencies components (remember the Fourier transform of a rectangle function is a sinc).
%% Cell type:markdown id:fbd9ce45-7214-4ca7-a5e3-cf0c340458a3 tags:
<!-- BEGIN QUESTION -->
14. Implement a function `denoise_signal` that performs denoising of the input signal using a supplied orthonormal transform matrix, by setting the transformed coefficients having an amplitude smaller than a given threshold to 0. You might want to use the [numpy.where](https://numpy.org/doc/stable/reference/generated/numpy.where.html) function for this. When denoising using the Haar transform, you can perform the thresholding only on the detail coefficients. Implement the function `denoise_signal_haar` that performs this operation.
NB: The result returned should be real, in order to be displayed.
%% Cell type:code id:cf7270a7-de1d-4c24-913c-916479ee8ab5 tags:otter_assign_solution_cell
``` python
def denoise_signal(W, x, threshold=0.1):
"""
W: NxN input orthonormal matrix (Fourier, block-Fourier or Haar)
x: input signal (of length N)
"""
# BEGIN SOLUTION
x_tf = W@x
Winv = np.conj(W.T)
return np.real(Winv@np.where(np.abs(x_tf)<threshold, 0, x_tf))
# END SOLUTION
```
%% Cell type:code id:c029ab19-11a5-4bc9-b348-31bf8ca07bca tags:otter_assign_solution_cell
``` python
def denoise_signal_haar(W, x, threshold=0.1, detail_start_index=256):
"""
W: NxN input orthonormal matrix (Fourier, block-Fourier or Haar)
x: input signal (of length N)
detail_start_index: thresholding is performed on x[detail_start_index:]
"""
# BEGIN SOLUTION
x_tf = W@x
Winv = W.T
x_dn = x_tf
x_dn[detail_start_index:] = np.where(np.abs(x_tf[detail_start_index:])<threshold, 0, x_tf[detail_start_index:])
return Winv@x_dn
# END SOLUTION
```
%% Cell type:code id:1b47fbc5-070b-49c4-a616-d8e753adede1 tags:otter_assign_solution_cell
``` python
# Perform denoising with the full Fourier transform and display the result.
# Make sure you choose a good threshold
# BEGIN SOLUTION
waverec_fourier = denoise_signal(get_fourier_matrix(len(wave_noisy)), wave_noisy, 0.55)
plt.plot(wave_noisy, 'r')
plt.plot(waverec_fourier)
# END SOLUTION
```
%% Cell type:code id:d06fa927-ca94-4387-b987-a4e3ffeabeb2 tags:otter_assign_solution_cell
``` python
# Perform denoising with the block Fourier transform and display the result
# Make sure you choose a good threshold and block size
# BEGIN SOLUTION
waverec_block_fourier = denoise_signal(get_block_dft_matrix(len(wave_noisy), 16), wave_noisy, 0.4)
plt.plot(wave_noisy, 'r')
plt.plot(waverec_block_fourier)
# END SOLUTION
```
%% Cell type:code id:98e6e4a6-a289-4552-b5a9-27f40c40cb40 tags:otter_assign_solution_cell
``` python
# Perform denoising with the Haar transform and display the result
# Make sure you choose a good threshold and an appropriate number of levels
# BEGIN SOLUTION
waverec_haar = denoise_signal_haar(get_haar_matrix(len(wave_noisy), 5), wave_noisy, 0.9, 256)
plt.plot(wave_noisy, 'r')
plt.plot(waverec_haar)
# END SOLUTION
```
%% Cell type:code id:d5cb5707 tags:
``` python
grader.check("q14")
```
%% Cell type:markdown id:e6b05f2f-34a1-4689-82ef-c3be92c842ba tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
15. Compare the three denoising methods (Fourier, block Fourier and Haar). Which one performs better (in terms of noise removal but also in terms of discontinuity preservation) ? Was that expected (justify) ?
%% Cell type:markdown id:a999fadd tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:eb6b1561-b5cb-4db0-87a0-ca9a4245b7f3 tags:otter_assign_solution_cell
We see quite different results in terms of output. The full Fourier transform generates a cleaner denoised signal, but it also smoothes out the discontinuities, which might not be desirable (if you think of the 2D version applied to an image with sharp contours, they would be blurred out). Block DFT and Haar preserve discontinuities better, but our very simple denoising method also introduces undesirable artifacts in the block DFT version, while Haar yields better results (good noise removal and discontinuity preservation). The Haar detail coefficients are computed using only a few neighboring samples, unlike high-frequency Fourier coefficients; this helps preserve the discontinuities when denoising.
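To compare the methods quantitatively rather than visually, one could measure the output SNR against the clean signal. The helper `snr_db` below is our own addition, not part of the assignment, and is shown on toy data:

```python
import numpy as np

def snr_db(clean, estimate):
    # output SNR in dB: higher means the estimate is closer to the clean signal
    return 10 * np.log10(np.sum(clean**2) / np.sum((clean - estimate)**2))

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 4 * np.pi, 1024))
noisy = clean + 0.2 * rng.normal(size=1024)
print(snr_db(clean, noisy))   # SNR of the raw observation, the baseline to beat after denoising
```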
%% Cell type:markdown id:e705a9ec tags:
<!-- END QUESTION -->
%% Cell type:code id:5410f063 tags:
``` python
# Initialize Otter
import otter
grader = otter.Notebook("Linear systems - Poisson equation.ipynb")
```
%% Cell type:markdown id:d133bce9-d159-4b1d-a25f-69bed8da493a tags:
# Matrix Analysis 2025 - EE312
## Week 5 - Discrete Poisson equation
[LTS2](https://lts2.epfl.ch)
### Objectives
Apply your knowledge about linear systems to Poisson equation resolution.
%% Cell type:markdown id:654ad4ee-c060-4bc3-b218-3a6709952750 tags:
## Poisson equation
Let $u,v \in \mathbb{R}^n$ represent a physical quantity $f$ and $g: \mathbb{R} \mapsto \mathbb{R}$ sampled at $n$ equi-spaced locations $x_i$, i.e $u_i = f(x_i)$, $v_i = g(x_i)$.
Let us assume that the underlying continuous object satisfies the Poisson equation: $\frac{d^2}{dx^2} f (x)= g(x)$ with constraints $f(x_j) = y_j$ for a subset of $m$ locations $j \in \{i_1, \ldots i_m \}$.
We will assume that **all the $i_k$ are distinct**.
This equation governs a number of physical principles, e.g. gravity, fluid mechanics and electrostatics. In the latter we have $\Delta \varphi = -\frac{\rho}{\varepsilon}$, where $\Delta$ is the Laplacian operator $(\frac{d^2}{dx^2} + \frac{d^2}{dy^2} + \frac{d^2}{dz^2})$, $\varphi$ is the electric potential, $\rho$ the density of electric charges and $\varepsilon$ the electric permittivity.
---
%% Cell type:markdown id:06ac5cfa-62ef-4477-a957-e240ddeb85d8 tags:
### 1D case
%% Cell type:code id:7f3cb74d-72f0-460b-a978-a47f87de8557 tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
```
%% Cell type:markdown id:9fd89d0b-50df-4711-a32b-ec1026e6b122 tags:
<!-- BEGIN QUESTION -->
1. Write down a matrix equation for the discrete version of $\frac{d^2}{dx^2} f (x)= g(x)$, using the finite-difference approximation of the derivative $\frac{d^2}{dx^2} f = f(x_{k+1}) - 2f(x_k) +f(x_{k-1})$. For the boundary conditions we will make the assumption that $x_{-1}=x_0$ and $x_{n-1}=x_n$ (also referred to as Dirichlet boundary conditions)
What is the rank of the corresponding matrix $D$ ?
%% Cell type:markdown id:ca746265 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:a21ed0d8-4b5d-42eb-b1bf-6955938ebe2d tags:otter_assign_solution_cell
If we use zero-padding instead of the constant extension, this equation becomes $Du = v$ with
$D=\begin{pmatrix}
-2 & 1 & 0 & 0& ... & 0\\
1 & -2 & 1 & 0 &... &0\\
0 & 1 & -2 & 1 &... &0\\
& & & ... & & \\
0 & 0 & ... & 1 & -2 & 1\\
0 & 0 & ... & 0 & 1 & -2\\
\end{pmatrix}$
This is the Laplacian matrix, and $\mbox{rank }D = n$.
If we use a Dirichlet boundary condition, i.e. constant extension, we have $u_{-1}=u_0$ and $u_n = u_{n+1}$, therefore
$D=\begin{pmatrix}
-1 & 1 & 0 & 0& ... & 0\\
1 & -2 & 1 & 0 &... &0\\
0 & 1 & -2 & 1 &... &0\\
& & & ... & & \\
0 & 0 & ... & 1 & -2 & 1\\
0 & 0 & ... & 0 & 1 & -1\\
\end{pmatrix}$
In this case the rank is $n-1$: the rows (and columns) sum to zero, so the rank is at most $n-1$, and removing one row/column leaves a linearly independent set.
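A quick numerical check of the rank of the constant-extension matrix for a small $n$ (toy-sized, independent of the graded `lapl_matrix`):

```python
import numpy as np

N = 6
O = np.ones(N - 1)
D = -2 * np.eye(N) + np.diag(O, -1) + np.diag(O, 1)
D[0, 0] = D[-1, -1] = -1              # constant-extension boundary rows
print(np.linalg.matrix_rank(D))       # n - 1 = 5
```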
%% Cell type:markdown id:aed76fe2-8cb5-4b28-bd3b-fa0bea113a46 tags:
<!-- END QUESTION -->
%% Cell type:markdown id:cceac118-c43d-4782-9f7e-4392e5ecc451 tags:
2. Implement a function that creates the $D$ matrix (also called Laplacian). The [diag](https://numpy.org/doc/stable/reference/generated/numpy.diag.html) function in numpy might be useful.
%% Cell type:code id:d41105d8-6933-4ef8-b2f0-c6ac053e1f55 tags:otter_assign_solution_cell
``` python
def lapl_matrix(N):
# BEGIN SOLUTION
O = np.ones(N-1)
D = -2*np.eye(N) + np.diag(O, -1) + np.diag(O, 1)
D[0,0] = -1
D[N-1, N-1] = -1
return D
# END SOLUTION
```
%% Cell type:code id:1abc2b2c tags:
``` python
grader.check("q2")
```
%% Cell type:markdown id:4cea41dc-6f12-42f2-8644-b51e4918504d tags:
<!-- BEGIN QUESTION -->
3. Write down a matrix equation for the discrete version of $f(x_j) = y_j$. What is the rank of the corresponding matrix $B$ ?
%% Cell type:markdown id:63a9357b tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:d47029a2-bd30-4da8-ae9a-f41240154377 tags:otter_assign_solution_cell
This operation can be represented by a matrix $B\in\mathbb{R}^{m\times n}$ with coefficients $b_{kp} = \delta_{p\,i_k}$ (row $k$ picks out sample $i_k$).
The rank of this matrix is $m$, under the assumption that all $i_k$ are distinct, since every row has a single non-zero value and no two rows share the same column.
Using this matrix we can rewrite the condition $f(x_j) = y_j$ as $Bu=\begin{pmatrix}y_0\\ \vdots \\y_{m-1}\end{pmatrix}$
%% Cell type:markdown id:418bc83f-7d46-41c2-8a7e-df892857320c tags:
<!-- END QUESTION -->
4. Implement a function that creates matrix $B$
%% Cell type:code id:1d294eda-7edc-42c2-9640-13bec779bd48 tags:otter_assign_solution_cell
``` python
def b_matrix(N, idx_list): # idx_list is a list of valid indices, e.g. N=5, idx_list=[1,3]
# BEGIN SOLUTION
m = len(idx_list)
if m<1:
raise ValueError("Supply at least one index")
if not np.all(np.array(idx_list)<N):
raise ValueError("Invalid list of indices supplied")
if not np.all(np.array(idx_list)>=0):
raise ValueError("Negative indices supplied")
if N<1 or m >= N:
raise ValueError("Invalid dimensions")
B = np.zeros((m, N))
for p in zip(range(m), idx_list):
B[p[0], p[1]] = 1.
return B
# END SOLUTION
```
%% Cell type:code id:58dbd947 tags:
``` python
grader.check("q4")
```
%% Cell type:markdown id:b77f215e-e4d0-4976-9ed6-7ee210c689a6 tags:
<!-- BEGIN QUESTION -->
5. Write down a matrix equation for the full problem (Hint: Consider the matrix $C=\begin{pmatrix}D\\B\end{pmatrix}$). Discuss the rank of the matrix and deduce a way to numerically solve the linear system. Implement a function that returns matrix $C$.
%% Cell type:markdown id:19ecce3a tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:c9ea268f-a134-43d5-a74f-7853bd8d3547 tags:otter_assign_solution_cell
Let us use the matrix $C=\begin{pmatrix}D\\B\end{pmatrix}$.
The following holds: $Cu = z$, with $z=\begin{pmatrix}v_0\\ \vdots \\v_{n-1}\\y_0\\ \vdots \\y_{m-1}\end{pmatrix}$.
Suppose there exist $\alpha_i$, not all zero, s.t.
$\sum_{i=0}^{n-1} \alpha_i C_i = 0$, where $C_i$ is the $i$-th column of $C$.
Looking at the bottom part of $C$ (the rows coming from $B$), we must have $\alpha_i=0$ for $i\in\{i_1,\dots,i_m\}$; the $\alpha_i$ can only be non-zero where the lower part of the column vanishes. For the remaining columns of $D$, at least one row has a single non-zero coefficient among them, forcing the associated $\alpha_i = 0$, and repeating the argument propagates this to the rest. Hence only the trivial combination exists and the rank of $C$ is $n$.
We can then get the solution using the pseudo-inverse $C^+$.
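A quick numerical check of this rank argument (a self-contained sketch: the exact boundary rows of `lapl_matrix` are defined earlier in the notebook, so a Neumann-style second-difference matrix is built inline here as an assumption for illustration):

``` python
import numpy as np

# Small 1D second-difference (Laplacian-like) matrix with Neumann-style
# boundary rows, whose kernel contains the constant vector.
n = 6
D = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
D[0, 0] = 1.0
D[-1, -1] = 1.0
print(np.linalg.matrix_rank(D))  # 5: D alone is rank-deficient

# Adding selection rows (here for indices 1 and 4) removes the kernel
B = np.zeros((2, n))
B[0, 1] = 1.0
B[1, 4] = 1.0
C = np.vstack((D, B))
print(np.linalg.matrix_rank(C))  # 6: C has full column rank
```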
%% Cell type:code id:2eaf5a8c-b83a-4f53-9c4e-6192bf4e6fa4 tags:otter_assign_solution_cell
``` python
def c_matrix(D, B):
# BEGIN SOLUTION
sd = D.shape
    sb = B.shape
C = np.zeros((sd[0]+sb[0], sd[1]))
C[:sd[0], :] = D
C[sd[0]:, :] = B
return C
# END SOLUTION
```
%% Cell type:code id:c0c17d8b tags:
``` python
grader.check("q4")
```
%% Cell type:markdown id:10a3d136-f5f3-4c1b-9c0a-ece58f891dfe tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
5. What explicit formula can you use to compute the pseudo-inverse of $C$ (justify)?
%% Cell type:markdown id:fd6f7561 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:4daa620b-d617-49e8-9504-41151b02ab00 tags:otter_assign_solution_cell
The rank of $C$ is $n$, so the columns of $C$ are linearly independent and the matrix is injective (1-1). In that case $C^TC$ is invertible and we can use the left pseudo-inverse
$C^+=(C^TC)^{-1}C^T$
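As a sanity check (on a hypothetical random tall matrix, which has full column rank almost surely), the explicit formula agrees with NumPy's `pinv`:

``` python
import numpy as np

rng = np.random.default_rng(0)
C = rng.standard_normal((8, 5))  # tall matrix, full column rank almost surely

left_inv = np.linalg.inv(C.T @ C) @ C.T     # explicit left pseudo-inverse
print(np.allclose(left_inv, np.linalg.pinv(C)))  # True
print(np.allclose(left_inv @ C, np.eye(5)))      # True: left-inverse property
```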
%% Cell type:markdown id:3726eb03-0c82-4aed-b179-77ff947583c3 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
6. Implement a function that returns the solution of the problem. You can either use the formula above (with [linalg.inv](https://numpy.org/doc/stable/reference/generated/numpy.linalg.inv.html)) or compute the pseudo-inverse using [linalg.pinv](https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html).
%% Cell type:code id:79f8c6d1-7b92-48b6-bb90-e888aaa41243 tags:otter_assign_solution_cell
``` python
# v is a vector of size N
# u is a vector of size len(idx_list)
def solve_poisson(N, v, idx_list, u):
    # BEGIN SOLUTION
    D = lapl_matrix(N)
    B = b_matrix(N, idx_list)
    C = c_matrix(D, B)
    z = np.zeros(C.shape[0])
    z[:N] = v   # upper part: the source term
    z[N:] = u   # lower part: the constrained values
    Cinv = np.linalg.pinv(C)
    # or, using the explicit left pseudo-inverse:
    # Cinv = np.linalg.inv(C.T@C)@C.T
    return Cinv@z
    # END SOLUTION
```
%% Cell type:markdown id:ff6184e1-e308-47d1-8b72-ab4559b48adb tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
7. Let us now consider the solutions of the particular case $g(x) = 0, \forall x\in\mathbb{R}$. What are the analytical solutions that would solve Poisson equation ?
%% Cell type:markdown id:d47ff697 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:fc80136a-8da3-4fa2-a9c1-c77962761e38 tags:otter_assign_solution_cell
The analytical solutions are polynomials of degree at most 1 (so that their second derivative vanishes), i.e. affine functions $f(x)=ax+b$.
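This can be checked discretely (a small sketch, not part of the graded code): the second difference of any affine sequence vanishes.

``` python
import numpy as np

x = np.arange(10.0)
f = 3.0 * x + 2.0            # an affine function f(x) = ax + b
d2 = np.diff(f, 2)           # discrete second derivative f[k+2] - 2 f[k+1] + f[k]
print(np.allclose(d2, 0.0))  # True: affine functions are in the kernel
```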
%% Cell type:markdown id:4a33db39-077d-4775-a55d-ae5bd7400b23 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
8. Let us verify that our implementation of the solution works with a small example. Fill in the values for $u$ s.t. they belong to the solution space.
%% Cell type:code id:96381701-e737-41c1-8642-fa1a5c13924c tags:otter_assign_solution_cell
``` python
N = 50
v = np.zeros(N)
idx_list = [10, 20, 30, 40]
u = [10, 20, 30, 40] # SOLUTION
sol = solve_poisson(N, v, idx_list, u)
plt.plot(sol) # plot the result
```
%% Output
[<matplotlib.lines.Line2D at 0x10af67a10>]
%% Cell type:markdown id:04e6fc55-6c8a-4dfa-8ca0-9b45ed2f157c tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
9. Are the results conform to what we expected ? What happens near the boundaries ?
%% Cell type:markdown id:41b1af96 tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:5710b3b8-a7be-4263-86a5-81d735062bad tags:otter_assign_solution_cell
We obtain a solution that is linear except near its boundaries. The behavior near the boundaries is due to the Dirichlet conditions we impose (i.e. the solution should be constant outside the displayed interval). The values we set as constraints are matched closely.
%% Cell type:markdown id:ecf46b87-accd-4043-8ab6-06835830699b tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
10. Let us use a step function for $f$ and try to solve the system
%% Cell type:code id:d81b4501-b2a5-4412-a266-d04ffa8dd2c7 tags:
``` python
idx_list = [10, 20, 30, 40]
u2 = [1, 1, -1, -1]
sol2 = solve_poisson(N, v, idx_list, u2)
plt.scatter(idx_list, u2, marker='+', color='r')
plt.plot(sol2)
```
%% Output
[<matplotlib.lines.Line2D at 0x10fe128d0>]
%% Cell type:markdown id:4918d31c-40a2-4df3-88ec-d18a80e879e7 tags:
What do you observe and what is the reason for that ?
%% Cell type:markdown id:e0d2ec3d tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:6f86a2fe-3fd2-45ed-871f-8850cbe198a4 tags:otter_assign_solution_cell
Since we deliberately chose $f$ outside of the solution space, we obtain a poor approximation and our constraints are not met (which is expected).
We can check that the Laplacian of the computed solution and the Laplacian of the step function (from which the constraints are sampled) do not match:
%% Cell type:code id:21d1f686-af17-4660-a29e-656e6eee670b tags:otter_assign_solution_cell
``` python
D = lapl_matrix(N)
z = D@sol2 # Laplacian of the computed solution
step=np.zeros(50)
step[:25]=1
step[25:]=-1
plt.plot(z)
plt.plot(D@step) # plot the Laplacian of the step
plt.legend(['Lapl. solution', 'Lapl. step'])
```
%% Output
<matplotlib.legend.Legend at 0x1187e2610>
%% Cell type:markdown id:b520b2c6-7c84-4c3e-be16-8eb866648a97 tags:
<!-- END QUESTION -->
<!-- BEGIN QUESTION -->
11. Let us now consider the application of the Poisson equation to electrostatic. Knowing a distribution of electric charges, we can solve it to compute the electric potential. The distribution of electric charges will be very simple, consisting of two punctual charges
%% Cell type:code id:96932aa1-74aa-4f4e-bbf5-df0f0d85c47c tags:
``` python
N = 50
v3 = np.zeros(N)
v3[10] = 1
v3[40] = 1
plt.plot(v3)
```
%% Output
[<matplotlib.lines.Line2D at 0x10feeef50>]
%% Cell type:markdown id:42861911-53e1-4cb3-8dc7-aa617c01a7e8 tags:
- What are the analytical solutions for this problem ? (remember the discrete derivative of a step function is a Dirac function, i.e. 0 everywhere except at one point where it is 1).
- Plot the analytical solution
- Compute and plot the numerical solution
- What do you observe ?
%% Cell type:markdown id:ddc8c11f tags:otter_answer_cell
_Type your answer here, replacing this text._
%% Cell type:markdown id:bc86b8ba-89c7-42a4-bae3-4279b8314fee tags:otter_assign_solution_cell
The analytical solution is piecewise linear. Integrating the two-peaked distribution once leads to a piecewise constant function, and integrating a second time to a piecewise linear one.
If you use the constraints imposed in the numerical example, you can compute the theoretical solution of the problem satisfying them (this was not asked in the question, so the answer above is sufficient). If we impose the solution to be constant between 10 and 40, its derivative is 0 in this interval; below 10 the derivative is then $-1$ and above 40 it is $1$.
%% Cell type:code id:f42fc23b-c412-4791-b5b6-bef54fa3bab3 tags:otter_assign_solution_cell
``` python
# BEGIN SOLUTION
d=np.zeros(50)
d[:10]=-1
d[40:]=1
plt.plot(d)
# END SOLUTION
```
%% Output
[<matplotlib.lines.Line2D at 0x10ff78890>]
%% Cell type:markdown id:6aba7c71-0e5b-40de-ae37-bce24524da06 tags:otter_assign_solution_cell
If we solve $-x+a = 1$ at $x=10$ (giving $a=11$) and $x+b=1$ at $x = 40$ (giving $b=-39$), the analytical solution that satisfies the constraints is:
- $f(x) = -x+11, x<10$
- $f(x) = 1, 10\le x\le 40$
- $f(x) = x - 39, x>40$
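We can verify discretely (a self-contained sketch; the piecewise formula above is rebuilt inline) that the second difference of this solution reproduces the two unit charges:

``` python
import numpy as np

x = np.arange(50)
f = np.where(x < 10, -x + 11.0, np.where(x <= 40, 1.0, x - 39.0))

d2 = np.diff(f, 2)          # d2[k] = f[k+2] - 2 f[k+1] + f[k]
peaks = np.nonzero(d2)[0]
print(peaks + 1)            # [10 40]: non-zero exactly at the charge locations
print(d2[peaks])            # [1. 1.]: unit charges recovered
```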
%% Cell type:code id:f2f9bd66-067e-4590-8088-87b7969621bb tags:otter_assign_solution_cell
``` python
# Plot analytical solution
# BEGIN SOLUTION
x1 = np.arange(0, 11)
x2 = np.arange(10, 41)
x3 = np.arange(40, 51)
plt.plot(x1, -x1+11)
plt.plot(x2, np.ones(x2.shape))
plt.plot(x3, x3-39)
# END SOLUTION
```
%% Output
[<matplotlib.lines.Line2D at 0x10fff5990>]
%% Cell type:code id:22eae7ab-4c91-4000-b123-7105cc4c023a tags:otter_assign_solution_cell
``` python
idx_list = [10, 20, 30, 40]
u3 = [1, 1, 1, 1]
sol3 = solve_poisson(N, v3, idx_list, u3)
plt.scatter(idx_list, u3, marker='+', color='r')
plt.plot(sol3)
```
%% Output
[<matplotlib.lines.Line2D at 0x10ff7ad90>]
%% Cell type:code id:b0310f39-9bbf-4518-be7d-af0e3f6724a7 tags:otter_assign_solution_cell
``` python
# check the 2nd derivative of the solution computed
plt.plot(lapl_matrix(50)@sol3[:50]) # SOLUTION
```
%% Output
[<matplotlib.lines.Line2D at 0x1180751d0>]
%% Cell type:markdown id:6e71baf6-6e79-4b39-b420-0b29d47444b5 tags:otter_assign_solution_cell
We can see that the computed solution is acceptable, as its second derivative matches the original distribution (it is not fully constant between the peaks, but close enough). The Dirichlet boundary conditions are once again interfering at the domain boundaries ($x=0$ and $x=N-1$), and the derivative not being continuous also has an impact on the constant part.
%% Cell type:markdown id:daf06368 tags:
<!-- END QUESTION -->