**ECSE 343: Numerical Methods in Engineering** *Assignment 1: _Matrix Factorizations and Solving Linear Systems_* Due: Thursday, February 3rd, 2022 at 11:59pm EST on [myCourses](https://mycourses2.mcgill.ca/) Final weight: __15%__ This assignment follows the _same_ policies and processes as Assignment 0 (refer to section 1 in the A0 handout). The assignment deliverables are designed to be completed in the order they are presented. That being said, we suggest you take time to read the entire handout _before_ starting your work. $\newcommand{\norm}[1]{\left\| #1 \right\|}$ $\newcommand{\normsq}[1]{\norm{#1}^2}$ $\newcommand{\bsalpha}{\boldsymbol{\alpha}}$ $\newcommand{\bfa}{\mathbf{a}}$ $\newcommand{\bfb}{\mathbf{b}}$ $\newcommand{\bfe}{\mathbf{e}}$ $\newcommand{\bfr}{\mathbf{r}}$ $\newcommand{\bfv}{\mathbf{v}}$ $\newcommand{\bfx}{\mathbf{x}}$ $\newcommand{\bfy}{\mathbf{y}}$ $\newcommand{\cond}{\operatorname{cond}}$ $\newcommand{\argmin}[1]{\underset{#1}{\operatorname{argmin}}}$ $\newcommand{\ihat}{\mathbf {\hat i}}$ $\newcommand{\jhat}{\mathbf {\hat j}}$ $\newcommand{\khat}{\mathbf {\hat k}}$ # LU Solver You will implement a simplified LU linear system solver. This will comprise a bare bones LU decomposition, as well as the forward and backward substitution algorithms. Later in the assignment, you'll test your solver on polynomial regression problems. ## LU Decomposition Your first task is to decompose a matrix $\mathbf{A}$ as the product of a lower triangular matrix $\mathbf{L}$ and an upper triangular matrix $\mathbf{U}$. Your decomposition will not treat pivoting. In this setting, the elements of $\mathbf{L}$ and $\mathbf{U}$ can be expressed algebraically as: \begin{equation} \label{eq3} \begin{split} {U}_{ij} = \begin{cases} \displaystyle {A}_{ij} & \quad \text{for } i = 1, \\ \displaystyle {A}_{ij} - \sum_{k=1}^{i-1}{{L}_{ik}{U}_{kj}} & \quad \text{ i > 1 } \end{cases} \end{split} \end{equation} \begin{equation} \label{eq4} \begin{split} {L}_{ij} = \begin{cases} \displaystyle \frac{{A}_{ij}}{{U}_{jj}} & \quad \text{for } j = 1, \\ \displaystyle \frac{{A}_{ij} - \sum_{k=1}^{j-1}{{L}_{ik}{U}_{kj}}}{{U}_{jj}} & \quad \text{ j > 1 } \end{cases} \end{split} \end{equation} !!! warning: **Deliverable 1 [10%]**
Perform a simplified LU decomposition: Use equations (1) and (2) to complete `LU_Decomposition` in the base code. The routine takes an $(n, n)$ `numpy.array` $\mathbf{A}$ and returns the lower and upper triangular matrices $\mathbf{L}$ and $\mathbf{U}$. ## Forward and Backward Substitution Given a lower triangular linear system $\mathbf{L}\mathbf{y} = \mathbf{b}$, forward substitution solves for $\mathbf{y}$ as: \begin{equation} \label{eq5} \begin{split} {y}_{i} = \begin{cases} \displaystyle\frac{{b}_{1}}{{L}_{11}} & \quad \text{for } i = 1, \\ \displaystyle\frac{1}{{L}_{ii}}\left({b}_{i} - \sum\limits_{j = 1}^{i-1} {L}_{ij} {y}_{j}\right) & \quad \text{ otherwise. } \end{cases} \end{split} \end{equation} !!! warning: **Deliverable 2 [10%]**
Perform forward substitution: Use equation (3) to complete `ForwardSubstitution` in the base code. The routine takes a lower triangular $(n, n)$ `numpy.array` $\mathbf{L}$ and a $(n,)$ `numpy.array` $\mathbf{b}$; it returns an $(n,)$ `numpy.array` $\bfy$. Given an upper triangular linear system $\mathbf{U}\mathbf{x} = \mathbf{y}$, backward substitution solves for $\mathbf{x}$ as: \begin{equation} \label{eq6} \begin{split} {x}_{i} = \begin{cases} \displaystyle\frac{{y}_{n}}{{U}_{nn}} & \quad \text{for } i = n, \\ \displaystyle\frac{1}{{U}_{ii}}\left( {y}_{i} - \displaystyle\sum\limits_{j = i+1}^{n} {U}_{ij} {x}_{j} \right) & \quad \text{otherwise. } \end{cases} \end{split} \end{equation} !!! warning: **Deliverable 3 [10%]**
Perform backward substitution: Use equation (4) to complete `BackwardSubstitution` in the base code. The routine takes an upper triangular $(n, n)$ `numpy.array` $\mathbf{U}$ and a $(n,)$ `numpy.array` $\mathbf{y}$; it returns an $(n,)$ `numpy.array` $\bfx$. You now have all the pieces needed to piece together a linear system solver. A solution to the general system $\mathbf{Ax} = \mathbf{LUx} = \bfb$ can be obtained by solving the two simplified systems: \begin{equation} \label{eq1} \mathbf{Ly} = \mathbf{b} \text{ and } \mathbf{Ux} = \mathbf{y}~. \end{equation} !!! warning: **Deliverable 4 [10%]**
Put together a simple solver: Complete the routine `LU_solver` to solve the linear system $\mathbf{A}\bfx = \mathbf{b}$ using your simplified LU decomposition, forward and backward substitution. # Solving Polynomial Regression Let's test out your linear solver on a few polynomial regression problems. Here, you'll formulate polynomial regression problems as solutions to linear systems of equations, before applying your solver to them. ## Polynomial Regression System Given $m$ data points $({t}_i, {b}_i)$ we want to find the $n - 1$ degree polynomial $$ p_{n-1}(t) = \sum_{j=1}^n {x}_j \, t^{j-1} $$ that best fits the data points and where the coefficient vector $\bfx = [{x}_1,...,{x}_n]$ fully defines the polynomial. We can formulate this problem as the solution to a linear system of equations $$ \begin{equation*} \mathbf{Ax} = \begin{bmatrix} 1 & t_1 & \cdots & t_1^{n-1} \\ 1 & t_2 & \cdots & t_2^{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & t_m & \cdots & t_m^{n-1} \\ \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \\ \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \\ \end{bmatrix} = \bfb \end{equation*} $$ where the Vandermonde matrix $\mathbf{A}$ can be formed using the independent variables of our data points, $t_i$. !!! warning: **Deliverable 5 [10%]**
Given an $(m,)$ `numpy.array` $\mathbf{t} = [t_1,...,t_m]$ and positive integer $n$, complete the implementation of `GetVandermonde` to construct and return an $(m, n)$ `numpy.array` Vandermonde matrix $\mathbf{A}$ for the degree $n-1$ polynomial. ## Fully-constrained and Overdetermined Polynomial Fitting If the number of data points $m$ matches the number of unknowns $n$, we can perfectly solve $\mathbf{Ax}=\bfb$. Consequently, the Vandermonde matrix $\mathbf{A}$ here would be square. If, on the other hand, we have more data points $m$ than degrees of freedom for our polynomial, we have an overdetermined problem. A perfect fit will not generally exist, but we can solve for the fit that minimizes the squared residual $2$-norm $||\mathbf{Ax} - \bfb||_2^2$. The _normal equations_ allow us to express the least-squares solution using the modified system $\mathbf{A^T A x} = \mathbf{A^T} \bfb$. !!! warning: **Deliverable 6 [20%]**
Solve the polynomial regression problem: Complete the implementation of `PolynomialFit`. It takes as input an $(m, 2)$ `numpy.array` of the $m$ data point $(t_i, b_i)$ and a positive integer $n$ to denote the polynomial degree $(n-1)$. You should use your `GetVandermonde` and `LU_solver` routines. # Image Reconstruction using Deconvolution Most real-world problems are sufficiently complex to require more robust solvers than what we developed above: without pivoting and careful performance-minded vectorization/implementation, your LU solver won't likely scale to larger problems. Setting up a problem and _using_ a solver is just as important as know how to _write_ one. The final set of deliverables will focus on understanding a more complex problem, formulating its solution as a linear system, and solving it with an industry-caliber LU solver. ## Image Filtering -- a motivating example Imagine taking a photo with your smartphone only to realize, after the fact, that the photo came out blurry. Luckily, your phone's accelerometer was able to register a horizontal motion at the time of capture. We could model the _forward process_ that polluted the image as a linear operator. Specifically, by _convolving_ the discretized (1D) horizontal blur _kernel_ (that the accelerometer registered) along the horizontal axis of the _uncorrupted_ image, we can arrive at the blurry image. The blur kernel, which we'll visualize as a few dots ●●●, has an odd number of elements, each with a numerical weight. You can imagine centering the kernel atop every pixel in the original uncorrupted image, weighting each pixel it overlaps with by the corresponding kernel value, and summing accross the weighted pixel values in order to obtain a corrupted image pixel value (⬣, below). When any part of the kernel falls outside the bounds of the image, it "loops over" to the opposing end of the image. For example below, to compute the ⬣ pixel value in the corrupted image on the right, we weight the intensities of solid pixels from the uncorrupted image on the left by the one overlapping element (of three elements, in this example) of the blur kernel. ************************************************************************************************* * * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * o * * * o -----> ⬚ ⬚ ⬣ ⬚ ⬚ * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ ************************************************************************************************* By "sliding" the kernel, and repeating this weighted sum, along each pixel of each row of the uncorrupted image, we construct the final blurred image: ************************************************************************************************* * * * * o o * ⬣ ⬚ ⬚ ⬚ ⬚ * * * o o ⬚ ⬣ ⬚ ⬚ ⬚ o * * * o ⬚ ⬚ ⬣ ⬚ ⬚ * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * o o o o o -----> ⬚ ⬚ ⬚ ⬚ ⬚ , o o o o o -----> ⬚ ⬚ ⬚ ⬚ ⬚ , o o o o o -----> ⬚ ⬚ ⬚ ⬚ ⬚ , * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * * * o o * * ⬚ ⬚ ⬚ ⬚ ⬣ o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * . . . , o o o o o -----> ⬚ ⬚ ⬚ ⬚ ⬚ , . . . , o o o o o -----> ⬚ ⬚ ⬚ ⬚ ⬚ * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * o o o o o ⬚ ⬚ ⬚ ⬚ ⬚ * o o * * ⬚ ⬚ ⬚ ⬚ ⬣ * ************************************************************************************************* ## Blurring an Image If we _flatten_ the uncorrupted input image row-wise into a single $n \times 1$ column vector $\mathbf{x}$, where $n$ is the number of image pixels, then we can model this linear corruption process with a matrix $\mathbf{A}$ that relies only on the $k \times 1$ blur kernel elements. !!! warning: **Deliverable 7 [20%]**
Construct the blur matrix:Complete the implementation of `CreateBlurMatrix`. It takes as input a $(k, 1)$ `numpy.array` with the 1D horizontal blur kernel elements, as well as two positive integers denoting the `width` and `height` of the images -- we assume that the corrupted and uncorrupted images have the same dimensions. Once you form $\mathbf{A}$, you can obtain the linearized corrupted image $\mathbf{b}$ (of size $n \times 1$) as $\mathbf{A\,x}$. !!! warning: **Deliverable 8 [5%]**
Blur an image:Complete the implementation of `BlurImage`. It takes as input a $(n,n)$ `numpy.array` of the blur matrix computed with `CreateBlurMatrix` and a $($`width`,`height`$)$ `numpy.array` of the uncorrupted grayscale image. It should output a $($`width`, `height`$)$ `numpy.array` of the __corrupted__ grayscale image. ## Deblurring an Image We're lucky that, in our setting, the accelorometer was able to provide us with an estimate of the blurring kernel. Given this, and the _forward model_ of how the uncorrupted image was corrupted with the kernel, we can solve the _inverse_ problem: given only the corrupted image and the blur kernel, we aim to __retrieve__ the uncorrupted image. This problem is referred to a _non-blind deconvolution_; if we were __not__ given the blur kernel, the (much harder) problem is referred to as _blind deconvolution_. Here, we can retriev uncorrupted image $\mathbf{x}$ given the blur matrix $\mathbf{A}$ derived from the 1D horizontal blur kernel and the corrupted (blurred) image $\mathbf{b}$ by solving for $\mathbf{x}$ in $\mathbf{Ax} = \mathbf{b}$. !!! warning: **Deliverable 9 [5%]**
Deblur an image:Complete the implementation of `DeblurImage`. It takes as input a $(n,n)$ `numpy.array` of the blur matrix computed with `CreateBlurMatrix` and a $($`width`,`height`$)$ `numpy.array` of the __corrupted__ grayscale image. It should output a $($`width`, `height`$)$ `numpy.array` of the __uncorrupted__ grayscale image. Note: Your LU solver will not scale to the sizes of images we will be using. You should use `scipy`'s LU solving routines, which we have imported for you. ![The test images we provide you for deliverables seven through nine. Your code will generate the two missing images, which should match those of the upper row.][q3.png width="50px" border="1"] # You're Done! Congratulations, you've completed the 1st assignment. Review the submission procedures and guidelines at the start of the Assignment 0 handout before submitting your `Python` script file assignment solution. [q3.png]: