What’s the difference between a plane and a grid? Both contain infinitely many points (or vectors), but a plane (through the origin) is a vector space, while a grid is not. That is, a plane satisfies the closure conditions: 1. u in V -> ku in V; 2. u, v in V -> u+v in V. This simple pair of conditions appears in many models, and it signals “linearity”. From them we can directly show that if u, v in V then x1*u + x2*v is also in V. The same holds for more vectors, and the resulting expression has its own name: linear combination.
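As a small sketch (numpy assumed), we can check the two closure conditions numerically: a plane through the origin passes both, while an integer grid is closed under addition but fails the scaling condition.

```python
import numpy as np

# A plane through the origin in R^3: all vectors orthogonal to a normal n.
n = np.array([1.0, -2.0, 1.0])
in_plane = lambda v: np.isclose(n @ v, 0.0)

u = np.array([2.0, 1.0, 0.0])   # n @ u = 2 - 2 + 0 = 0
v = np.array([1.0, 1.0, 1.0])   # n @ v = 1 - 2 + 1 = 0

# Closure: scalar multiples and sums stay in the plane...
assert in_plane(3.5 * u) and in_plane(u + v)
assert in_plane(2.0 * u + (-1.0) * v)   # ...and so does any linear combination

# An integer grid in R^2 is closed under addition, but not under scaling:
on_grid = lambda v: np.allclose(v, np.round(v))
w = np.array([1.0, 1.0])
assert on_grid(w + w) and not on_grid(0.5 * w)
```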
A plane can be “spanned” by 2 vectors; a 3-D space can be spanned by 3 vectors. A smallest set of vectors such that every vector in the space is a linear combination of them is called a basis. The number of vectors in a basis is the dimension of the vector space.
Note that the idea of linear combination is not new. Consider a linear system of equations, and align the coefficients column by column. If the right-hand side is viewed as a column vector, then it is a linear combination of the column vectors on the LHS, with coefficients (x1, x2, ..., xn). A short way of writing the whole thing is Ax = b, where x is the list of coefficients, b is a vector and A is a list of column vectors. In fact, A is a rectangular array of numbers, which is called a matrix.
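Here is a small check of that column view (numpy assumed): solving Ax = b and confirming that b really is the combination x1*(column 1) + x2*(column 2).

```python
import numpy as np

# The system  x1 + 2*x2 = 5,  3*x1 + 4*x2 = 11  written as Ax = b:
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 11.0])

x = np.linalg.solve(A, b)

# b is exactly the combination x1*(col 1) + x2*(col 2) of A's columns:
assert np.allclose(x[0] * A[:, 0] + x[1] * A[:, 1], b)
```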
Take another “linear” example. A linear transformation from Rn -> Rm also satisfies the “closure conditions”: f(kv) = kf(v) and f(u+v) = f(u)+f(v). Since it shares this structure with vector spaces, matrices apply here too. The matrix A = (c1, c2, ..., cn), in which ci is the image of the standard basis vector ei, is called the standard matrix of the linear transformation. With the standard matrix, things become clearer: f(v) = Av for any vector v.
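A hypothetical example (numpy assumed): take a 90-degree rotation in R^2, build its standard matrix from the images of e1 and e2, and check that f(v) = Av.

```python
import numpy as np

# A 90-degree counterclockwise rotation f in R^2.
def f(v):
    return np.array([-v[1], v[0]])

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# Standard matrix: its columns are the images of the basis vectors e1, e2.
A = np.column_stack([f(e1), f(e2)])

v = np.array([3.0, 7.0])
assert np.allclose(A @ v, f(v))   # f(v) = Av for any v
```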
The composition of transformations turns out to be great fun. If f(v) = Av and g(u) = Bu, then g(f(v)) = BAv. This reveals some of the nature of matrix multiplication. With Av, we turn v into some combination of A’s columns. Now, with B*A, column i of the product is B times the i-th column of A (B*Ai), so the multiplication follows exactly the same rule as in linear systems.
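Both claims can be verified on random matrices (numpy assumed): composition equals the product BA, and each column of BA is B applied to the corresponding column of A.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))   # f: R^2 -> R^3
B = rng.standard_normal((4, 3))   # g: R^3 -> R^4
v = rng.standard_normal(2)

# g(f(v)) = B(Av) = (BA)v -- composition is matrix multiplication.
assert np.allclose(B @ (A @ v), (B @ A) @ v)

# Column i of BA is B times column i of A.
for i in range(A.shape[1]):
    assert np.allclose((B @ A)[:, i], B @ A[:, i])
```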
Note that a transformation is invertible if and only if it is a bijection. That is, a basis must be turned into a basis, so the columns of A must be independent.
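A quick sketch of that criterion (numpy assumed): independent columns mean full rank, and only then does inv(A) exist.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])          # independent columns -> invertible
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])          # second column = 2 * first -> singular

assert np.linalg.matrix_rank(A) == 2   # columns independent
assert np.linalg.matrix_rank(S) == 1   # columns dependent

# The invertible one has an inverse that undoes the transformation:
assert np.allclose(np.linalg.inv(A) @ A, np.eye(2))
```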
Fortunately (for mathematicians), or unfortunately (for students), a matrix is more than some linear-system stuff or some linear-transformation stuff. It’s a rectangular array of numbers, which can be viewed row by row or column by column. There are (at least) four vector spaces associated with an m-by-n matrix A: the column space C(A), the row space C(A’), the nullspace N(A) and the left nullspace N(A’). C(A) has the same dimension as C(A’), called the rank. N(A) is orthogonal to C(A’), while N(A’) is orthogonal to C(A). That’s why dim(C(A’)) + dim(N(A)) = n, and dim(C(A)) + dim(N(A’)) = m.
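A concrete illustration of the four subspaces (numpy assumed), on a 2x3 matrix of rank 1: the dimensions pair up as stated, and a nullspace vector is orthogonal to every row.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])     # a 2x3 matrix; second row = 2 * first
m, n = A.shape
r = np.linalg.matrix_rank(A)
assert r == 1

# dim C(A) = dim C(A') = r;  dim N(A) = n - r = 2;  dim N(A') = m - r = 1.
assert n - r == 2 and m - r == 1

# A vector in N(A) is orthogonal to every row of A:
x = np.array([2.0, -1.0, 0.0])      # 1*2 + 2*(-1) + 3*0 = 0
assert np.allclose(A @ x, 0)
assert np.isclose(A[0] @ x, 0) and np.isclose(A[1] @ x, 0)
```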
Oh, I forgot to discuss orthogonality. Actually, it’s just perpendicularity. It’s more comfortable to deal with orthogonal vectors when we do projections. A matrix whose columns are orthonormal vectors is an orthogonal matrix: A’ = inv(A).
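A small check of both claims (numpy assumed): for an orthogonal matrix the transpose is the inverse, and projecting onto a unit vector leaves an error orthogonal to it.

```python
import numpy as np

# An orthogonal matrix: its columns are orthonormal (here, a rotation).
t = 0.3
Q = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

# Transpose equals inverse:
assert np.allclose(Q.T @ Q, np.eye(2))
assert np.allclose(Q.T, np.linalg.inv(Q))

# Projection of b onto the line through a unit vector q is (q @ b) * q:
q = Q[:, 0]
b = np.array([2.0, 1.0])
p = (q @ b) * q
assert np.isclose((b - p) @ q, 0)    # the error b - p is orthogonal to q
```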
And here comes the determinant of a matrix. It gives the scaling factor of volume under the transformation. If that scaling factor is 0, for example when a plane is transformed into a line, then the transformation is not invertible. Checking whether a matrix is invertible by computing its determinant seems troublesome, but sometimes it’s the only way.
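Two quick determinant examples (numpy assumed): a diagonal matrix scales areas by the product of its entries, and a matrix that squashes the plane onto a line has determinant 0.

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])          # scales areas by 2 * 3 = 6
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])          # squashes the plane onto a line

assert np.isclose(np.linalg.det(A), 6.0)   # unit square -> area 6
assert np.isclose(np.linalg.det(S), 0.0)   # zero area -> not invertible
```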
For example, when the matrix is not filled entirely with numbers, but still has variables inside. That’s the case when we find eigenvectors. Given a square matrix A, eigenvectors are vectors that keep the same direction after the transformation: Ax = Lx. By finding the eigenvalues L, we can find x through the linear system (A - LI)x = 0. L is found by solving the characteristic polynomial det(A - LI) = 0.
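A worked example (numpy assumed): for this matrix the characteristic polynomial is (2-L)^2 - 1 = 0, giving L = 1 and L = 3, and each eigenvector satisfies Ax = Lx.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# det(A - L*I) = (2-L)^2 - 1 = 0  ->  L = 1 or L = 3.
eigenvalues, eigenvectors = np.linalg.eig(A)
assert np.allclose(sorted(eigenvalues), [1.0, 3.0])

# Each eigenvector keeps its direction: A x = L x.
for L, x in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ x, L * x)
```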
Eigenvectors are used to construct a matrix P that diagonalizes A: inv(P)*A*P = D (D: a diagonal matrix). In other words, we try to find some “simple” matrix D (in this case diagonal, since diagonal matrices only do scaling) which is similar to A. The problems of eigenvectors and similarity lead to deeper studies in linear algebra.
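Continuing the same example (numpy assumed): stacking the eigenvectors as columns of P gives inv(P)*A*P = D, and the similarity makes powers of A cheap to compute.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, P = np.linalg.eig(A)    # P's columns are eigenvectors of A

D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag(eigenvalues))   # inv(P) A P is diagonal

# Similarity makes powers cheap: A^k = P D^k inv(P).
assert np.allclose(np.linalg.matrix_power(A, 3),
                   P @ np.diag(eigenvalues ** 3) @ np.linalg.inv(P))
```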