Linear Algebra Notes

Notes on some linear algebra topics. Sometimes I find things are just stated as being obvious, for example, that the dot product of two orthogonal vectors is zero. Well... why is this? The result was pretty simple, but nonetheless it took a little while to think it through properly. Hence, these notes... they're my musings on the "why", even if the "why" might be obvious to the rest of the world!

Orthogonal Vectors and Linear Independence

Image showing 3 perpendicular vectors
Figure 1

Orthogonal vectors are vectors that are perpendicular to each other. In a standard 2D or 3D graph, this means that they are at right angles to each other and we can visualise them as seen in Figure 1: $\vec x$ is at 90 degrees to both $\vec y$ and $\vec z$, $\vec y$ is at 90 degrees to $\vec x$ and $\vec z$, and $\vec z$ is at 90 degrees to $\vec x$ and $\vec y$.

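In any set of non-zero vectors, $[\vec{a_1}, ..., \vec{a_n}]$, if every vector in the set is orthogonal to every other vector in the set, then the vectors are linearly independent. Note that the reverse isn't true: vectors can be linearly independent without being orthogonal, for example $[1, 0]$ and $[1, 1]$.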

So, how do we tell if one vector is orthogonal to another? The answer is the dot product, which is defined as follows. $$x \cdot y = \sum_{i=1}^n {x_i y_i}$$

We know that two vectors are orthogonal when their dot product is zero: $x \cdot y = 0 \implies x$ and $y$ are orthogonal.
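As a quick sanity check, here is a minimal Python sketch of the dot product formula and the zero test. The function names `dot` and `is_orthogonal`, and the tolerance `tol`, are my own choices for illustration (the tolerance is only there because floating point sums rarely land on exactly zero).

```python
def dot(x, y):
    """Dot product: the sum of x_i * y_i over all components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(xi * yi for xi, yi in zip(x, y))


def is_orthogonal(x, y, tol=1e-12):
    """True when the dot product is zero, up to a small tolerance (an assumption)."""
    return abs(dot(x, y)) <= tol


print(dot([1, 2, 3], [4, 5, 6]))      # 1*4 + 2*5 + 3*6 = 32
print(is_orthogonal([1, 0], [0, 1]))  # True: the x and y axes
print(is_orthogonal([1, 1], [1, 0]))  # False: these are 45 degrees apart
```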

But why is this the case? Let's imagine any two arbitrary vectors, each on the circumference of a circle of radius $r$ (so we know that they have the same length and that one is therefore a proper rotation of the other around the centre of the circle). This is shown in Figure 2. From the figure, we know the following:

Picture showing a vector being rotated
Figure 2

$$ \begin{align} x_1 &= r \cos{\theta_1} & y_1 &= r \sin{\theta_1} \\ x_2 &= r \cos(\theta_1 + \theta_n) & y_2 &= r \sin(\theta_1 + \theta_n) \end{align} $$

The vector $[x_2, y_2]$ is the vector $[x_1, y_1]$ rotated by the angle $\theta_n$. We can use the following trig identities:

$$ \begin{align} \sin(a \pm b) &= \sin(a)\cos(b) \pm \cos(a)\sin(b) \\ \cos(a \pm b) &= \cos(a)\cos(b) \mp \sin(a)\sin(b) \end{align} $$

Substituting these into the formulas above, we get the following.

$$ \begin{align} x_2 &= r\cos\theta_1\cos\theta_n - r\sin\theta_1\sin\theta_n \\ y_2 &= r\sin\theta_1\cos\theta_n + r\cos\theta_1\sin\theta_n \end{align} $$

Which means that...

$$ \begin{align} x_2 &= x_1\cos\theta_n - y_1\sin\theta_n \\ y_2 &= y_1\cos\theta_n + x_1\sin\theta_n \end{align} $$

For a 90 degree rotation, $\theta_n = 90^\circ$, we know that $\cos\theta_n = 0$ and $\sin\theta_n = 1$. Substituting these values into the above equations we can clearly see that...

$$ \begin{align} x_2 &= -y_1 \\ y_2 &= x_1 \end{align} $$

Therefore, any vector in 2D space that is $[a, b]$ will become $[-b, a]$ when rotated by 90 degrees, and therefore the dot product of the two becomes $-ab + ab = 0$. (Rotating the other way gives $[b, -a]$, and scaling either vector only scales the dot product, so the result is still zero.) And voilà, we now know why the dot product of two orthogonal 2D vectors is zero. I'm happy to take it on faith that this extrapolates into n dimensions :)
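The rotation formulas are easy to try out numerically. The following is just a sketch in Python using the standard `math` module. The helper names `rotate` and `dot` are made up for this example, and the results are only approximately $[-b, a]$ and $0$ because `math.cos` of 90 degrees is a tiny non-zero number in floating point.

```python
import math


def rotate(v, theta_deg):
    """Rotate the 2D vector v anticlockwise by theta_deg degrees."""
    x1, y1 = v
    t = math.radians(theta_deg)
    # These are the x_2 and y_2 formulas from the derivation.
    x2 = x1 * math.cos(t) - y1 * math.sin(t)
    y2 = y1 * math.cos(t) + x1 * math.sin(t)
    return (x2, y2)


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


a = (3.0, 4.0)
b = rotate(a, 90)
print(b)          # approximately (-4.0, 3.0), i.e. [a, b] -> [-b, a]
print(dot(a, b))  # approximately 0
```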

Orthogonal Matrices

So, onto orthogonal matrices. A matrix is orthogonal if $AA^T = A^TA = I$. If we take a general 2x2 matrix and multiply it by its transpose, then for the matrix to be orthogonal we need...

$$ AA^T = \begin{pmatrix} a & c \\ b & d \\ \end{pmatrix} \begin{pmatrix} a & b \\ c & d \\ \end{pmatrix} = \begin{pmatrix} aa + cc & ab + cd \\ ab + cd & bb + dd \\ \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ \end{pmatrix} $$

The pattern $ab + cd$ looks pretty familiar, right?! It looks a lot like $x_1x_2 + y_1y_2$, our formula for the dot product of two vectors. So, if we set $a=x_1$, $b=x_2$, $c=y_1$ and $d=y_2$, we get the following: a matrix of two row vectors $\vec v_1 = [x_1, y_1]$, $\vec v_2 = [x_2, y_2]$.

$$ A = \begin{pmatrix} \vec v_1 \\ \vec v_2 \\ \end{pmatrix} = \begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \end{pmatrix} $$

$$ \therefore AA^T = \begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \end{pmatrix} \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \\ \end{pmatrix} = \begin{pmatrix} x_1^2 + y_1^2 & x_1x_2 + y_1y_2 \\ x_1x_2 + y_1y_2 & x_2^2 + y_2^2 \\ \end{pmatrix} $$

If our vectors $\vec v_1$ and $\vec v_2$ are orthogonal, then the component $x_1x_2 + y_1y_2$ must be zero. This would give us the first part of the identity matrix pattern we're looking for.

The other part of the identity matrix would imply that we would have to have $x_1^2 + y_1^2 = 1$ and $x_2^2 + y_2^2 = 1$... which are just the formulas for the square of the length of a vector. Therefore, if our two vectors are also normalised (i.e., have a length of 1), we have our identity matrix.

Does the same hold true for $A^TA$? Let's try it with our original matrix $A$...

$$ A^TA = \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \\ \end{pmatrix} \begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \end{pmatrix} = \begin{pmatrix} x_1^2 + x_2^2 & x_1y_1 + x_2y_2 \\ x_1y_1 + x_2y_2 & y_1^2 + y_2^2 \\ \end{pmatrix} $$

Hmm, at first glance this doesn't look like the pattern we had before!! But, perhaps we can see why. $A$ is a matrix of row vectors, so $A^T$ is a matrix of column vectors. For $AA^T$ we were multiplying a matrix of row vectors with a matrix of column vectors, which gave us the dot products of the rows of $A$ with each other. For $A^TA$ it is the other way around: the entries are the dot products of the columns of $A$, $[x_1, x_2]$ and $[y_1, y_2]$, with each other. So $A^TA = I$ exactly when the columns of $A$ are orthogonal and have a length of 1. (For a square matrix the two conditions actually come as a pair: if $AA^T = I$ then $A^T$ is the inverse of $A$, and so $A^TA = I$ as well.)

We can check this by swapping roles. If $A$ is instead a matrix whose columns are $\vec v_1$ and $\vec v_2$ (so that $A^T$ becomes a matrix of row vectors), we get back to our original pattern:

$$ A^TA = \begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \end{pmatrix} \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \\ \end{pmatrix} = \begin{pmatrix} x_1^2 + y_1^2 & x_1x_2 + y_1y_2 \\ x_1x_2 + y_1y_2 & x_2^2 + y_2^2 \\ \end{pmatrix} $$

So we can say that if we have a matrix whose rows are orthogonal vectors of length 1 and whose columns are also orthogonal vectors of length 1, then we have an orthogonal matrix!
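To play with this, here is a small Python sketch that tests $AA^T = A^TA = I$ directly for a square matrix stored as a list of rows. The helpers `transpose`, `matmul` and `is_orthogonal_matrix`, and the tolerance `tol`, are names and choices I've made up for the example. A 2D rotation matrix is used as the test case, on the assumption that its rows are unit vectors at right angles to each other.

```python
import math


def transpose(m):
    """Swap the rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Matrix product: entry (i, j) is the dot product of row i of a and column j of b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def is_orthogonal_matrix(a, tol=1e-12):
    """True when both A A^T and A^T A equal the identity, up to tol (square A assumed)."""
    n = len(a)
    at = transpose(a)

    def is_identity(m):
        return all(
            abs(m[i][j] - (1.0 if i == j else 0.0)) <= tol
            for i in range(n)
            for j in range(n)
        )

    return is_identity(matmul(a, at)) and is_identity(matmul(at, a))


theta = math.radians(30)
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
shear = [[1.0, 1.0],
         [0.0, 1.0]]

print(is_orthogonal_matrix(rotation))  # True
print(is_orthogonal_matrix(shear))     # False: rows are not orthogonal
```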

Okay, that's great n' all, but why should we care? Why are orthogonal matrices useful? It turns out that orthogonal matrices preserve the angles and lengths of vectors. This can be useful in graphics to rotate vectors but keep the shape they construct, or in numerical analysis because they do not amplify errors.
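One way to see why lengths and angles survive is a short derivation, sketched here under the assumption that $\vec x$ and $\vec y$ are column vectors, so that the dot product can be written as $\vec x \cdot \vec y = \vec x^T \vec y$, and that $A^TA = I$:

$$ \begin{align} (A\vec x) \cdot (A\vec y) &= (A\vec x)^T (A\vec y) = \vec x^T A^T A \vec y = \vec x^T I \vec y = \vec x \cdot \vec y \\ \|A\vec x\|^2 &= (A\vec x) \cdot (A\vec x) = \vec x \cdot \vec x = \|\vec x\|^2 \end{align} $$

So multiplying by $A$ leaves every dot product unchanged, and the length of a vector is just the square root of its dot product with itself. The angle between two vectors depends only on their dot product and their lengths (via the standard formula $\cos\theta = \frac{\vec x \cdot \vec y}{\|\vec x\| \|\vec y\|}$, which isn't derived in these notes), so angles are preserved too.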


Eigenvectors and Eigenvalues

Stackoverflow thread: What is the importance of eigenvalues and eigenvectors.


Eigenvectors and values exist in pairs: every eigenvector has a corresponding eigenvalue. An eigenvector is a direction, ... An eigenvalue is a number, telling you how much variance there is in the data in that direction ...