Linear Algebra Notes

Notes on some linear algebra topics. Sometimes I find things are just stated as being obvious, for example, the dot product of two orthogonal vectors is zero. Well... why is this? The result was pretty simple, but nonetheless it took a little while to think it through properly. Hence, these notes... they started off as my musings on the "why", even if the "why" might be obvious to the rest of the world!

A lot of these notes are now just based on the amazing Khan Academy linear algebra tutorials. In fact, I'd say the vast majority are... they are literally just notes on Sal's lectures!

Another resource that is really worth watching is 3Blue1Brown's series "The essence of linear algebra".


Vector Stuff

Vector Spaces


A vector is an element of a vector space. A vector space, also known as a linear space, V , is a set which has addition and scalar multiplication defined for it and satisfies the following for any vectors \vec u , \vec v , and \vec w and scalars c and d :

1 Closed under addition: \vec u + \vec v \in V
2 Commutative: \vec u + \vec v = \vec v + \vec u
3 Associative: (\vec u + \vec v) + \vec w = \vec u + (\vec v + \vec w)
4 Additive identity: \vec 0 \in V, \text{i.e.,} \vec u + \vec 0 = \vec u
5 Inverse: \forall \vec u \in V, \exists -\vec u | \vec u + (-\vec u) = \vec 0
6 Closed under scalar multiply: c\vec u \in V
7 Distributive: c(\vec u + \vec v) = c\vec u + c\vec v
8 Distributive: (c + d)\vec u = c\vec u + d\vec u
9 Associative (scalars): c(d\vec u) = (cd)\vec u
10 Multiplicative identity: 1 \vec u = \vec u
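
To make the axioms a bit more concrete, here is a minimal sketch that spot-checks them for a few hand-picked vectors in \mathbb{R}^2 . The helpers add and scale are hypothetical names of my own, vectors are modelled as plain tuples of integers (so the equality checks are exact), and passing for these particular values is only an illustration, not a proof. Closure (rules 1 and 6) holds here simply because add and scale always return another 2-tuple.

```python
# Hypothetical helpers: vectors in R^2 modelled as plain tuples of ints.
def add(u, v):
    """Component-wise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    """Multiply every component of u by the scalar c."""
    return tuple(c * a for a in u)

u, v, w = (1, 2), (3, -4), (0, 5)
c, d = 2, -3
zero = (0, 0)

# One entry per algebraic rule, checked for these specific values only.
checks = {
    "commutative": add(u, v) == add(v, u),
    "associative": add(add(u, v), w) == add(u, add(v, w)),
    "additive identity": add(u, zero) == u,
    "inverse": add(u, scale(-1, u)) == zero,
    "distributive over vectors": scale(c, add(u, v)) == add(scale(c, u), scale(c, v)),
    "distributive over scalars": scale(c + d, u) == add(scale(c, u), scale(d, u)),
    "scalar associativity": scale(c, scale(d, u)) == scale(c * d, u),
    "multiplicative identity": scale(1, u) == u,
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```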

Function Spaces

First let's properly define a function. If X and Y are sets and we have x \in X and y \in Y , we can form a set F that consists of ordered pairs (x_i, y_j) . If it is the case that (x_1, y_1) \in F \land (x_1, y_2) \in F \implies y_1 = y_2 , then F is a function and we write y = F(x) . This is called being "single valued", i.e., each input maps to exactly one output, so the value of the function is unambiguous. Note, F(x) is not a function, it is the value returned by the function: F is the function that defines the mapping from elements in X (the domain) to the elements in Y (the range).
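
The "single valued" condition can be sketched in a few lines. This is just an illustration under my own assumptions: is_function is a hypothetical helper name, and the relation is given as a finite set of (x, y) pairs.

```python
def is_function(pairs):
    """Return True if no input x is paired with two different outputs."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False  # the same x is mapped to two different y values
        seen[x] = y
    return True

F = {(1, "a"), (2, "b"), (3, "a")}  # single valued: a function
G = {(1, "a"), (1, "b"), (2, "c")}  # 1 maps to both "a" and "b": not a function

print(is_function(F))  # True
print(is_function(G))  # False
```

Note that F sends both 1 and 3 to "a". That is fine: two inputs may share an output, but one input may not have two outputs.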

One example of a vector space is the set of real-valued functions of a real variable, defined on the domain a \le x \le b . I.e., we're talking about a set \{f_1, f_2, \ldots\} where each f_i is a function. In other words, f_i : [a, b] \rightarrow \mathbb{R} .

We've said that a vector is any element in a vector space, but usually we think of this in terms of, say, [x, y] \in \mathbb{R}^2 . We can also, however, given the above, think of functions as vectors: the function-as-a-vector being the "vector" of ordered pairs that make up the mapping between domain and range. So in this case the set of functions is still like a set of vectors, and with scalar multiplication and vector addition defined appropriately the rules for a vector space still hold!
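
"Defined appropriately" usually means pointwise: (f + g)(x) = f(x) + g(x) and (cf)(x) = c f(x) . Here is a small sketch of that idea, assuming the pointwise definitions; f_add, f_scale and zero are hypothetical names of my own.

```python
import math

def f_add(f, g):
    """Vector addition for functions: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def f_scale(c, f):
    """Scalar multiplication for functions: (c f)(x) = c * f(x)."""
    return lambda x: c * f(x)

# The additive identity is the function that is zero everywhere.
zero = lambda x: 0.0

# h is itself a function, i.e. another "vector" in the same space.
h = f_add(f_scale(2.0, math.sin), math.cos)  # h = 2 sin + cos

x = 0.5
print(h(x), 2.0 * math.sin(x) + math.cos(x))  # the two values agree
print(f_add(h, zero)(x) == h(x))              # adding the zero function changes nothing
```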

What Isn't A Vector Space

It sounded a lot like everything was a vector space to me! So I did a quick bit of googling and found the paper "Things that are/aren't vector spaces". It does a really good job of showing how in maths we might build up a model of something and expect it to behave "normally" (i.e., obey the vector space axioms), when in fact it doesn't, in which case it becomes a lot harder to reason about. This is why vector spaces are nice... we can reason about them using logic and operations that we are very familiar with and that feel intuitive.

Linear Combinations

A linear combination of vectors is one that only uses the addition and scalar multiplication of those vectors. So, given two vectors, \vec a and \vec b , the following are linear combinations: 0\vec a + 0\vec b \\ 5\vec a - 2.5\vec b \\ -123\vec a + \pi\vec b This can be generalised to say that for any set of vectors \{\vec{v_1}, \vec{v_2}, \ldots, \vec{v_n}\} \text{ for } \vec{v_1}, \vec{v_2}, \ldots, \vec{v_n} \in \mathbb{R}^m a linear combination of those vectors is c_1\vec{v_1} + c_2\vec{v_2} + \cdots + c_n\vec{v_n} \text{ for } c_i \in \mathbb{R} .
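
As a quick sketch of the general formula, the following computes c_1\vec{v_1} + \cdots + c_n\vec{v_n} for vectors stored as Python lists. The function name linear_combination and the example vectors are my own choices for illustration.

```python
def linear_combination(coeffs, vectors):
    """Return c_1*v_1 + ... + c_n*v_n for equal-length vectors."""
    if len(coeffs) != len(vectors):
        raise ValueError("need one coefficient per vector")
    dim = len(vectors[0])
    result = [0.0] * dim
    for c, v in zip(coeffs, vectors):
        for i in range(dim):
            result[i] += c * v[i]  # accumulate the scaled component
    return result

a = [1.0, 2.0]
b = [3.0, -1.0]
print(linear_combination([0, 0], [a, b]))     # [0.0, 0.0]
print(linear_combination([5, -2.5], [a, b]))  # [-2.5, 12.5]
```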


Graph showing span of two vectors in 2D space

The set of all linear combinations of a set of vectors \{\vec{v_1}, \vec{v_2}, \ldots, \vec{v_n}\} is called the span of those vectors: \mathrm{span}(\vec{v_1}, \vec{v_2}, \ldots, \vec{v_n}) = \{c_1\vec{v_1} + c_2\vec{v_2} + \cdots + c_n\vec{v_n} \;|\; c_i \in \mathbb{R}\} The vectors themselves can also be referred to as a generating set for the subspace that they span.

In the graph to the right there are two vectors \vec a and \vec b . The dotted blue lines show all the possible scalings of the vectors. The vectors span those lines. The green lines show two particular scalings of \vec a added to all the possible scalings of \vec b . One can use this to imagine that if we added all the possible scalings of \vec a to all the possible scalings of \vec b we would reach every coordinate in \mathbb{R}^2 . Thus, we can say that the set of vectors \{\vec a, \vec b\} spans \mathbb{R}^2 , or \mathrm{span}(\vec a, \vec b) = \mathbb{R}^2 .
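
One way to convince yourself that two such vectors reach every point is to solve for the scalings directly. This is a minimal sketch using Cramer's rule for the 2D case; the function name coefficients and the example numbers are my own, not taken from the graph.

```python
def coefficients(a, b, target):
    """Solve c1*a + c2*b = target for 2D vectors using Cramer's rule."""
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        # No unique solution: a and b lie on the same line through the origin.
        raise ValueError("a and b are co-linear, so they do not span R^2")
    c1 = (target[0] * b[1] - target[1] * b[0]) / det
    c2 = (a[0] * target[1] - a[1] * target[0]) / det
    return c1, c2

a = (2, 1)
b = (-1, 3)
print(coefficients(a, b, (4, 9)))  # (3.0, 2.0), i.e. 3a + 2b = (4, 9)
```

Whenever the determinant is non-zero, a solution exists for any target, which is exactly what "spans \mathbb{R}^2 " means.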

From this we may be tempted to say that any set of two 2D vectors spans \mathbb{R}^2 , but we would be wrong. Take for example \vec a and \vec b = -\alpha\vec a . These two vectors are co-linear. Because they span the same line, no combination of them can ever move off that line. Therefore they do not span \mathbb{R}^2 .

Let's write the above out a little more thoroughly... \vec a = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} , \vec b = \begin{bmatrix} -\alpha a_1 \\ -\alpha a_2 \end{bmatrix} The span of these vectors is, by our above definition: \mathrm{span}(\vec a, \vec b) = \left\{ c_1 \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} + c_2 \begin{bmatrix} -\alpha a_1 \\ -\alpha a_2 \end{bmatrix} \mid c_i \in \mathbb{R} \right\} Which we can re-write as: \begin{align} \mathrm{span}(\vec a, \vec b) &= \left\{ c_1 \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} -\alpha c_2 \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \mid c_i \in \mathbb{R} \right\} \\ &= \left\{ (c_1 -\alpha c_2) \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \mid c_i \in \mathbb{R} \right\} \\ \end{align} ... which we can clearly see is just the set of all scaled vectors of \vec a . Plainly, any co-linear set of 2D vectors will not span \mathbb{R}^2 .
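
The same collapse can be checked numerically. In this sketch the values of \vec a and \alpha are arbitrary picks of mine, and cross is a hypothetical helper; integers are used so the comparisons are exact. Every combination tried equals (c_1 - \alpha c_2)\vec a and therefore stays on the line through \vec a .

```python
a = (2, 1)
alpha = 3
b = (-alpha * a[0], -alpha * a[1])  # b = -alpha * a

def cross(u, v):
    """2D cross product (a determinant); zero exactly when u and v are co-linear."""
    return u[0] * v[1] - u[1] * v[0]

on_line = []
for c1 in range(-3, 4):
    for c2 in range(-3, 4):
        # The combination c1*a + c2*b ...
        p = (c1 * a[0] + c2 * b[0], c1 * a[1] + c2 * b[1])
        # ... should equal (c1 - alpha*c2) * a, a plain scaling of a.
        k = c1 - alpha * c2
        expected = (k * a[0], k * a[1])
        on_line.append(p == expected and cross(p, a) == 0)

print(all(on_line))  # True
```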

Linear Independence

We saw above that the vectors in the set \{\vec a, -\alpha\vec a\} are co-linear and thus do not span \mathbb{R}^2 . The reason for this is that they are linearly dependent, which means that one or more vectors in the set are just linear combinations of other vectors in the set.

For example, the set of vectors \{[1, 0], [0, 1], [2, 3]\} spans \mathbb{R}^2 , but [2, 3] is redundant because we can remove it from the set of vectors and still have a set that spans \mathbb{R}^2 . In other words, it adds no new information or directionality to our set.

A set of vectors is said to be linearly dependent when there exist scalars c_1, \ldots, c_n , not all zero, such that: c_1\vec{v_1} + \cdots + c_n\vec{v_n} = \vec 0 Thus a set of vectors is linearly independent when the only way to reach the zero vector is with every scalar equal to zero: c_1\vec{v_1} + \cdots + c_n\vec{v_n} = \vec 0 \implies c_i = 0 \;\; \forall i Sure, this is a definition, but why does it capture whether the set is linearly dependent or independent?

Picture of linearly independent vectors

Think of vectors in \mathbb{R}^2 . For any two vectors that are not co-linear, there is no combination of those two vectors that can get you back to the origin unless they are both scaled by zero. The image to the left is trying to demonstrate that. If we scale \vec a by any non-zero amount, no matter how small we make that scaling, we cannot find a scaling of \vec b that when added to the scaled \vec a will get us back to the origin. Thus we cannot find a linear combination c_a\vec a + c_b\vec b = \vec 0 where c_a and/or c_b does not equal zero.

This shows that neither vector is a linear combination of the other, because if it were, we could use a non-zero scaling of one, added to the other, to get us back to the origin.

We can also show that two non-zero vectors are linearly independent when their dot product is zero. This is covered in a later section.

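Another way of thinking about linear independence (LI) is that, if A is the matrix whose columns are the vectors in the set, there is only one solution to the equation A\vec x = \vec 0 : the trivial solution \vec x = \vec 0 .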
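
A practical way to test this is to row-reduce and count pivots: the vectors are independent exactly when the rank equals the number of vectors. The sketch below is my own illustration, not a library routine. The names rank and linearly_independent are hypothetical, the vectors are stored as rows rather than columns (the rank is the same either way), and Fraction is used so that the arithmetic is exact for integer inputs.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of the matrix whose rows are the given vectors (Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # Find a row at or below r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate this column from every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """Independent exactly when the rank equals the number of vectors."""
    return rank(vectors) == len(vectors)

print(linearly_independent([[1, 0], [0, 1]]))          # True
print(linearly_independent([[1, 0], [0, 1], [2, 3]]))  # False: [2, 3] is redundant
print(linearly_independent([[2, 1], [-6, -3]]))        # False: co-linear
```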

Linear Subspace

If V is a set of vectors in \mathbb{R}^n , then V is a subspace of \mathbb{R}^n if and only if:

  1. V contains the zero vector: \vec 0 \in V
  2. V is closed under scalar multiplication: \vec x \in V, \alpha \in \mathbb{R} \implies \alpha\vec x \in V
  3. V is closed under addition: \vec a \in V, \vec b \in V \implies \vec a + \vec b \in V

What these rules mean is that, using linear operations, you can't do anything to "break" out of the subspace using vectors from that subspace. Hence "closed".
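
Here is a rough sketch that spot-checks the three rules on a finite sample of points. It assumes two example sets of my own choosing: the line y = 2x through the origin and the shifted line y = 2x + 1 . All function names are hypothetical, and a pass on a finite sample is only evidence, not a proof.

```python
import itertools

def on_line_through_origin(p):
    """Membership test for V = {(x, y) : y = 2x}."""
    return p[1] == 2 * p[0]

def on_shifted_line(p):
    """Membership test for W = {(x, y) : y = 2x + 1}, which misses the origin."""
    return p[1] == 2 * p[0] + 1

def spot_check_subspace(member, samples, scalars):
    """Check the three subspace rules on a finite sample of vectors and scalars."""
    dim = len(samples[0])
    has_zero = member((0,) * dim)
    closed_scale = all(
        member(tuple(c * x for x in v)) for v in samples for c in scalars
    )
    closed_add = all(
        member(tuple(x + y for x, y in zip(u, v)))
        for u, v in itertools.product(samples, repeat=2)
    )
    return has_zero, closed_scale, closed_add

V_samples = [(x, 2 * x) for x in range(-3, 4)]
W_samples = [(x, 2 * x + 1) for x in range(-3, 4)]
scalars = [-2, 0, 1, 3]

print(spot_check_subspace(on_line_through_origin, V_samples, scalars))  # (True, True, True)
print(spot_check_subspace(on_shifted_line, W_samples, scalars))         # (False, False, False)
```

The shifted line fails all three rules: it misses the origin, scaling by zero lands on the origin, and adding two of its points gives y = 2x + 2 , which is off the line.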

Surprising, at least for me, was that the set containing only the zero vector, \{\vec 0\} , is actually a subspace of \mathbb{R}^n because it obeys all of the above rules.

A very useful thing is that a span is always a valid subspace. Take, for example, V = \mathrm{span}(\vec{v_1}, \vec{v_2}, \vec{v_3}) . We can show that it is a valid subspace by checking that it obeys the above 3 rules.

First, does it contain the zero vector? Recall that a span is the set of all linear combinations of the span vectors. So we know that the following linear combination is valid, and hence the span contains the zero vector. 0\vec v_1 + 0\vec v_2 + 0\vec v_3 = \vec 0 Second, is it closed under scalar multiplication? Because the span is the set of all linear combinations, for any set of scalars