     Egwald Mathematics: Linear Algebra

Systems of Linear Equations

by

Elmer G. Wiens

Egwald's popular web pages are provided without cost to users.
Please show your support by joining Egwald Web Services as a Facebook Fan: Follow Elmer Wiens on Twitter: Systems of Linear Equations.

A system of linear equations can be expressed as:

 a1,1 x1 + a1,2 x2   + . . . +   a1,n xn = b1 a2,1 x1 + a2,2 x2   + . . . +   a2,n xn = b2                   . . . . . . . . . . . am,1 x1 + am,2 x2   + . . . +   am,n xn = bm

Example.

 6 * x1 + 2 * x2 + 2 * x3 = 6 4 * x1 + -3 * x2 + 5 * x3 = -4 1 * x1 + 6 * x2 + 4 * x3 = 3

Matrix Representation.

The same problem expressed in matrix and vector form is:

 A *x   =   b

where A = [ai, j] is the matrix of coefficients, xT = (x1, x2, . . . . xn) is the vector of unknowns, and bT = (b1, b1, . . . . bm) is the vector of RHS (right hand side) coefficients.

If all the coefficients of the vector b are zero, the system A * x = b is called a homogeneous system.

Otherwise, the system is called nonhomogeneous.

Example: Nonhomogeneous System.

A =
 6 2 2 4 -3 5 1 6 4
x =
 x1 x2 x3
b =
 6 -4 3

The columns (and rows) of the matrix A are linearly independent, as confirmed by the following diagram, where I use the standard (x, y, z) coordinate system.

Please note that I distinguish between the axis x, and the vector xT = (x1, x2, x3). Also, note that while the vector xT is a row vector, and the vector x is a column vector, the diagrams below do not distinguish between column and row vectors.

Graphically, the matrix A, its column vectors, and the column vector b can be represented Since the columns of A span R3, one can express any three dimensional vector, such as b in the example, as a linear combination of the columns of A. The solution to the linear equation problem, the vector

xT = (x1, x2, x3) = ( 1, 1, -1)

provides the coefficients of the linear combination of the columns of A that yield b.

A*x=b
 6 2 2 4 -3 5 1 6 4
*
 1 1 -1
=
 6 -4 3

Example: Homogeneous System.

Since the columns of A span R3, with the RHS vector bT = (0, 0, 0) the only solution is xT = (0, 0, 0), called the zero solution.

Any homogeneous system, (ie a homogeneous system with arbitrary matrix A and bT = oT the zero vector), always has at least one solution, the zero or trivial solution where xT = oT.

However, if the columns of A are not linearly independent, its homogeneous system has nontrivial solutions.

General Solution.

If xp is a particular solution of the nonhomogeneous system, and xh is a solution of the homogeneous system, then:

A * (xp + xh) = b + o = b,

proving that (xp + xh) is also a solution of the nonhomogeneous system.

The Image of a Matrix.

Let A be an m by n matrix. Write A = {a1, a2, ... an} as the set of m by 1 column vectors of A, ie A = [a1, a2, ... an].

The image of A, im(A), is the span of the columns of A. Formally:

im(A) = span(A) = {y in Rm | y = A * x,  for some x in Rn.

Sometimes this is referred to as the column space of the matrix A, denoted by R(A).

The Kernel of a Matrix A.

The kernel of A, ker(A), is the set of all vectors x in Rn for which A * x = o, the zero vector in Rm.  Formally:

ker(A) = {x in Rn | A * x = 0  in Rm}

Sometimes this is referred to as the null space of the matrix A, denoted by N(A).

The matrix A, and its transpose matrix AT (represented by A' in the diagrams below) generate four important subspaces: im(A), ker(A), im(AT), and ker(AT).

Example: General Solution with Linearly Dependent Columns (and Rows).

The matrix of the next linear equation problem has linearly dependent columns (and rows).

A*x=b
 8 2 3 2 4 6 1 2 3
*
 x1 x2 x3
=
 4 -6 -3

The following diagram displays this linear equation problem. The image of the matrix A, Im(A), is the two dimensional yellow subspace of R3 generated by the columns of A. Because the vector bT = (4, -6, -3) lies within the Im(A), it can be expressed as a linear combination of the columns of A. The particular solution to the linear equation problems is xpT = (1, -2, 0), since:

A*xp=b
 8 2 3 2 4 6 1 2 3
*
 1 -2 0
=
 4 -6 -3

The ker(A) is the line (also a subspace) through the vector xhT = (0, -6, 4), since xh is a solution of the homogeneous system for the matrix A:

A*xh=o
 8 2 3 2 4 6 1 2 3
*
 0 -6 4
=
 0 0 0

Now let us look at AT, the transpose matrix of the matrix A.

The image of the matrix AT, Im(AT), is the two dimensional yellow subspace of R3 generated by the rows of A (the columns of AT). As suggested by the diagram, ker(A) — the line through the vector xh — is perpendicular to the Im(AT) subspace.

Similarly, the kernel of the matrix AT, ker(AT) is the line through the vector yhT = (0, -3, 6), since yh is a solution of the homogeneous system for the matrix AT:

AT*yh=o
 8 2 1 2 4 2 3 6 3
*
 0 -3 6
=
 0 0 0

The subspace ker(AT) is perpendicular to the subspace Im(A), as one can see two diagrams above.

Example: General Solution.

The general solution to the linear equation problem A * x = b is x = xp + s * xh for any scalar s. If x is defined by:

x=xp+ s * xh
 x1 x2 x3
=
 1 -2 0
+ s *
 0 -6 4

then

A * x = A * xp + s * A * xh = b + s * o = b.

Example: Inconsistent System.

Consider the system with the same matrix A as in the example above, but with a RHS vector b that lies outside of the Im(A).

A*x=b
 8 2 3 2 4 6 1 2 3
*
 x1 x2 x3
=
 4 -6 3

The following diagram displays this linear equation problem. The image of the matrix A, Im(A), is the same two dimensional yellow subspace of R3 generated by the columns of A. However, the vector bT = (4, -6, 3) lies outside the Im(A) subspace. Therefore, it cannot be expressed as a linear combination of the columns of A. Thus, this linear equation problem has no particular solution, although its homogeneous system has solutions consisting of each vector on the line through the vector xhT = (0, -6, 4).

Solutions: Inconsistent System.

In the diagram above, I projected the vector bT = (4, -6, 3) onto the Im(A) subspace, yielding the vector pT = (4, -3.6, -1.8). Recall that the point of the vector p is the closest point on Im(A) to the point of the vector b. Consequently, the vector (b - p) is perpendicular to the vector p. To verify this, note that the dot product pT * (b - p) = (4, -3.6, -1.8)T * (0, -2.4, 4.8) = (0, 0, 0)T.

A case can be made that the vector pT = (4, -3.6, -1.8) provides a "solution" to the inconsistent system A * x = b, if one solves the system A * x = p, instead.

In least-squares problems, with more equations than unknowns, the solution provided by the algorithms is the vector x such that A * x = p. That is A * x equals the projection of the vector b onto Im(A).

Writing the matrix A = [a1, a2, a3], one can see that a3 = 1.5 * a2, ie a2 and a3 are linearly dependent. Dropping the vector a2 yields the least-squares problem B * z = b given by:

B*z=b
 8 3 2 6 1 3
*
 z1 z2
=
 4 -6 3

The least-squares solution is zT = (0.8286, -0.8762) since

B*z=p
 8 3 2 6 1 3
*
 0.82857 -0.87619
=
 4 -3.6 -1.8

A least-squares problem always has a solution for B * z = p, since p is in the Im(B). Furthermore, if the columns of B are linearly independent, the solution vector z is unique.

To get back to the inconsistent system, write wT = (0.82857, 0, -0.87619). Then, w is a solution of A * x = p since

A*w=p
 8 2 3 2 4 6 1 2 3
*
 0.82857 0 -0.87619
=
 4 -3.6 -1.8

However, the vector x = w + s * xh, where s is a scalar and xh is a solution of A * x = o, given by:

x=w+ s * xh
 x1 x2 x3
=
 0.8286 0 -0.8762
+ s *
 0 -6 4

is also a solution to A * x = p,  because  A * (w + s * xh) = A * w + s * A * xh = p + s * o = p.

If small changes in the coefficients of the matrix A of a least-squares problem produce a matrix Â with linearly dependent columns, the columns of A are "almost" linearly dependent. The estimates of the coefficients of the solution vector are tentative, since small changes in the data matrix A result in a system with a matrix Â having many solutions.

Solution: Inconsistent System - Minimum Length.

One might choose as "optimal" the solution vector xT = (x1, x2, x3)T = w + s * xh with minimum length. The length (norm) of x = norm(w + s * xh) is:

norm(x) = sqrt(x21 + x22 + x23) = sqrt[841/1225+36*s^2+(-92/105+4*s)^2],

having a minimum for s+ = 0.0674. The solution of minimum length to the inconsistent system is:

x+ = w + s+ * xh = (0.8286, -0.4044, -0.6066).

The singular value decomposition algorithm provides x+ = A+ * b = (0.8286, -0.4044, -0.6066), where A+ is the pseudoinverse of A.

Echelon Algorithm.

The Echelon Algorithm is similar to the Gauss-Jordan method and the Gaussian method. The Echelon Algorithm permits one to calculate the general solution of a system of linear equations having more columns than rows. Begin by writing the matrix A and the vector b side by side as shown in Tableau0 below. This matrix arrangement is called the augmented matrix.

Using the following row operations on the tableaux of the augmented matrix, convert the left hand matrix to echelon form.

 1. exchange two rows to get the largest entry (in absolute value) onto the diagonal. 2. divide a row by the diagonal entry. 3. add a multiple of one row to another row.

Begin in the upper left hand corner of the left hand matrix. Proceed along the diagonal converting each column into the appropriate column of the identity matrix as shown in the sequence of tableaux. Exchange rows with the lower row that has the largest entry (in absolute value) for the column. If all entries on and below the diagonal of the column are zeros, proceed to the next column.

Tableau0
Ab
 R01 R02 R03
 6 12 1 -5 4 8 10 6 1 2 3 2
 7 14 4

Tableau1
Ãß
 R11 = R01 / 6 R12 = R02 - (4) * R11 R13 = R03 - (1) * R11
 1 2 0.1667 -0.8333 0 0 9.3333 9.3333 0 0 2.8333 2.8333
 1.1667 9.3333 2.8333

Tableau2
Ãß
 R21 = R11 - (0.1667) * R22 R22 = R12 / 9.3333 R23 = R13 - (2.8333) * R22
 1 2 0 -1 0 0 1 1 0 0 0 0
 1 1 0

The last tableau contains the information one needs to obtain the general solution to the original linear equation problem A * x = b.

Echelon Tableau
Ã*x=ß
 1 2 0 -1 0 0 1 1 0 0 0 0
*
 x1 x2 x3 x4
=
 1 1 0

The solution to the Echelon Tableau yields the particular solution, xp, and the entire set of solutions of the associated homogeneous system, ie all vectors in the Ker(A).

Vectors in the Ker(A) are scalar multiples of the basis vectors, xh1, xh2, of the Ker(A) subspace.

x=xp + s1 * xh1 + s2 * xh2
 x1 x2 x3 x4
=
 1 0 1 0
+ s1 *
 -2 1 0 0
+ s2 *
 1 0 -1 1

Use the form below to solve some other linear equation problem.

Specify the dimensions of A (m <= n, 2 <= m,n) and fill in / change the numbers in the upper left hand corner, and then click "Solve".

Linear Equations Echelon Algorithm.

Matrix Ab
 6 12 1 -5 4 8 10 6 1 2 3 2 0 0 0 0 0 0 0 0
 7 14 4 0 0
 Row dimension of A: m:   3 Column dimension of A: n:   4

If you add a constant to an entry of the b vector, the system might be inconsistent. If you add a constant to an entry of a basis vector of Ker(A), you can create a matrix with more linearly independent columns.

Singular Value Decomposition Algorithm.

Consider the same linear equation problem A*x = b solved with the Echelon Algorithm.

A*x=b
 6 12 1 -5 4 8 10 6 1 2 3 2
*
 x1 x2 x3 x4
=
 7 14 4

Using singular value decomposition yields the orthogonal matrices U and V, and the diagonal matrix S of singular values of A:

A * x = U * S * VT * x = b,

USVTx=b
 -0.658 0.7523 0.0342 -0.7268 -0.6225 -0.2903 -0.1971 -0.2158 0.9563
 17.9819 0 0 0 0 10.8005 0 0 0 0 0 0
 -0.3922 -0.7844 -0.4737 -0.0815 0.1674 0.3348 -0.5667 -0.734 -0.9045 0.402 0.1005 -0.1005 0 -0.3333 0.6667 -0.6667
 x1 x2 x3 x4
 = = = =
 7 14 4

Since the matrix S has 2 positive singular values, the rank of the matrix is 2. Furthermore, the columns of U and V provide orthonormal bases for the following subspaces:

SubspaceOrthonormal Bases
Im(A)first 2 columns of U
Ker(AT)last 1 columns of U
Im(AT)first 2 columns of V
Ker(A)last 2 columns of V

The Pseudoinverse is A+ = V * S+ * UT

A+ = V*S+*UT
 0.026 0.0062 0.001 0.052 0.0124 0.0019 -0.0221 0.0518 0.0165 -0.0481 0.0456 0.0156
=
 -0.3922 0.1674 -0.9045 0 -0.7844 0.3348 0.402 -0.3333 -0.4737 -0.5667 0.1005 0.6667 -0.0815 -0.734 -0.1005 -0.6667
*
 0.0556 0 0 0 0.0926 0 0 0 0 0 0 0
*
 -0.658 -0.7268 -0.1971 0.7523 -0.6225 -0.2158 0.0342 -0.2903 0.9563

Both the columns and rows of A are linearly dependent. Setting x+ = A+ * b yields p = A * x+, the projection of b onto the Im(A) subspace.

x+ = A+ * b

x+=A+* b=
 x1 x2 x3 x4
 = = = =
 0.026 0.0062 0.001 0.052 0.0124 0.0019 -0.0221 0.0518 0.0165 -0.0481 0.0456 0.0156
 7 14 4
 = = = =
 0.2727 0.5455 0.6364 0.3636

p = A * x+

p=A* x+=
 p1 p2 p3
 = = =
 6 12 1 -5 4 8 10 6 1 2 3 2
 0.2727 0.5455 0.6364 0.3636
 = = =
 7 14 4

The RHS vector, b, lies inside the Im(A). Consequently, the system A * x = b is consistent. Therefore, p = b.

Overdetermined Systems
With # Rows m >= # Columns n

Suppose one wants to derive a linear equation model of the form A * x = b from a data set with a design (data) matrix A having more rows than columns. The columns of A and the column vector b represent measurable factors of the phenomena of interest. Each row of A represents one observation of the measured factors of the phenomena.

Since the matrix A has more rows than columns, one expects that A * x = b will be inconsistent, ie that b will lie outside the Im(A). Consequently, b will not be represented exactly as a linear combination of the columns of A.

I discuss least squares methods, also called multiple regression methods, on the Statistical web pages. Here I derive methods that statisticians and econometricians use to evaluate the least square (multiple regression) estimates for overdetermined systems of linear equations.

Least Squares (Multiple Regression) Solution.

With an inconsistent system, the least squares solution obtains a vector x+ from the data such that the vector p given by:

p = A * x+

is the projection of the vector b onto Im(A).

To be the projection of the vector b onto Im(A), p must be the vector in Im(A) that minimizes the distance between p and b, ie:

minimize sqrt{ (b1 - p1)2 + (b2 - p2)2 + .   .   . + (bn - pn)2 }

giving the name "least squares" to this solution.

As the number of observations increases, one hopes that ones understanding of the phenomena improves. This could be the case if the least squares estimated values of each parameter of the vector x+ converge as the number of observations increases.

Furthermore, one might deduce that a model of the form:

z = a1* x+ 1 + a2* x+ 2 +   .   .   + an*x+ n,

for a given instance of factors represented by the vector

aT = (a1, a2, .   .   .   an)

could be used to predict the measured value of the factor represented in the b column, if the a vector was observed.

Normal Equations.

The normal equations provide a direct method for obtaining the least squares parameters:

x+T = (x+1, x+2, .   .   .   x+n)

to the linear equations system A * x = b.

To reduce the dimensions of the system, multiply the equation A * x = b by the matrix AT, the transpose of A. The resulting equations are called the normal equations:

AT * A * x = AT * b.

Example.

As an experimenter, I want to summarize the data set with an equation in R3 space with standard coordinates (x, y, z):

z = x+1 * x + x+2 * y

described by the linear equations system A * x = b. The matrix A has two columns representing the independent variables (factors) of my model, x and y. The RHS vector b represents the dependent variable, z. I want to estimate the coefficients x+1 and x+2 using least squares. Perhaps, I want to use the equation to predict values of the dependent variable z for as yet unknown values of the independent variables x and y.

A*x=b
 4.4 4 2.9 -5.2 -2.8 -5.2 3.9 -0.1 7 -2.5 -2.5 -4.9 5.8 -4.1 -2.5 7.8
*
 x1 x2
=
 4.1 0.75 -5.3 -0.3 4.75 -5.9 -0.55 2.55

The normal equations system AT * A * x = AT * b is:

AT*A*x=AT*b
 4.4 2.9 -2.8 3.9 7 -2.5 5.8 -2.5 4 -5.2 -5.2 -0.1 -2.5 -4.9 -4.1 7.8
*
 4.4 4 2.9 -5.2 -2.8 -5.2 3.9 -0.1 7 -2.5 -2.5 -4.9 5.8 -4.1 -2.5 7.8
*
 x1 x2
=
 4.4 2.9 -2.8 3.9 7 -2.5 5.8 -2.5 4 -5.2 -5.2 -0.1 -2.5 -4.9 -4.1 7.8
*
 4.1 0.75 -5.3 -0.3 4.75 -5.9 -0.55 2.55

Multiplying by the matrix AT yields the reduced system with 2 equations and 2 unknowns:

(AT*A)*x=(AT*b)
 145.96 -31.84 -31.84 178
*
 x1 x2
=
 72.32 79.27

The matrix (AT*A) is symmetric. If the data matrix A has linearly independent columns, (AT*A) has an inverse. The normal equations provide a system with a square coefficient matrix, solvable by standard methods.

Proceeding with the inverse of (AT*A):

x+=(AT*A)-1*(AT*b)=
 x+1 x+2
=
 0.0071 0.0013 0.0013 0.0058
*
 72.32 79.27
=
 0.6167 0.5556

The least squares solution using the normal equations is x+ = ( 0.6167, 0.5556 ).

The projection of b onto Im(A) is p = A * x+. Comparing p and b and computing b - p:

a1a2 b-p=(b-p)
 1 2 3 4 5 6 7 8
 4.4 2.9 -2.8 3.9 7 -2.5 5.8 -2.5
 4 -5.2 -5.2 -0.1 -2.5 -4.9 -4.1 7.8

 4.1 0.75 -5.3 -0.3 4.75 -5.9 -0.55 2.55
-
 4.936 -1.101 -4.6161 2.3495 2.9277 -4.2644 1.2986 2.7923
=
 -0.836 1.851 -0.6839 -2.6495 1.8223 -1.6356 -1.8486 -0.2423

The distance from Im(A) to b — the length of the vector (b-p) — is 4.592, found by computing the square root of the sums of the squares of the (b-p) column. (Im(A) is a subspace of R8, while b and p are vectors in R8)

For each row i of the data matrix A = [a1, a2], the triples (ai,1, ai,2, bi) and (ai,1, ai,2, pi) are points in R3 with standard coordinates (x, y, z). Furthermore, the least squares solution can be represented by the plane:

z = 0.6167 * x + 0.5556 * y

The plane is displayed below. If the point (ai,1, ai,2, bi) lies above the plane, a blue line joins it to the point (ai,1, ai,2, pi) on the plane. If (ai,1, ai,2, bi) lies below the plane, a red line joins it to (ai,1, ai,2, pi). By clicking the refresh button on your browser, you can generate a new system of linear equations, with the new solution with least squares parameters x+ = (x+1, x+2) displayed above.

The functional form,

z = x+1 * x + x+2 * y,

ensures that the solution plane passes through the origin.

In the example of the next section, the estimated solution plane is not restricted to pass through the origin.

Singular Value Decomposition and Least Squares.

The "normal equations" least squares method of solving systems of linear equations computes the matrix (AT * A). For systems with many observations, the data matrix A has many rows. Multiplying two matrices using a computer incurs the risk of roundoff errors distorting the components of the matrix (AT * A) from their true values. Furthermore, the matrix (AT * A) may be ill-conditioned if the columns of A are nearly linearly dependent. Experimenters often include as many factors as possible in their design matrix A, hoping to improve the "explanatory power" of their model. The probability that subtle inter dependencies among the columns of A will render A nearly linearly dependent increases with the number of included factors.

The singular value decomposition method controls for such errors. Firstly, it controls roundoff errors by decomposing the data matrix A using Householder and Jacobi transformations, known for their stability. This decomposition is obtained without computing (AT * A). Secondly, singular value decomposition controls for linear dependence (and roundoff error) by setting very small singular values to zero.

Example.

The matrix A has two columns representing the independent variables (factors) of my model, x and y. The RHS vector b represents the dependent variable, z. As an experimenter, I want to summarize the data set with an equation in R3 space with standard coordinates (x, y, z):

z = x+1 + x+2 * x + x+3 * y

described by the linear equations system A * x = b. I need to estimate three coefficients x+1, x+2, and x+3 since I do not require the solution plane to pass through the origin. To obtain a z-intercept coefficient given by x+1, I append a column of ones to the front of the matrix A

A*x=b
 1 -2.9 4.7 1 -3.8 6.8 1 2 -7.9 1 -2.4 5.5 1 -4 5.5 1 -7.6 -0.8 1 7.8 -6.7 1 -6.6 -2
*
 x1 x2 x3
=
 -0.3 3 -1.75 3.75 2.15 -4.7 2.95 -1.4

Singular value decomposition yields the orthogonal matrices U and V, and the diagonal matrix S of singular values of A:

A * x = U * S * VT * x = b,

USVTx=b
 -0.2925 -0.0864 0.3372 -0.4081 -0.1584 0.3278 0.3834 0.3823 0.3727 -0.3067 -0.1755 0.369 -0.3634 -0.0584 0.3018 -0.2398 0.6187 0.0654 0.5411 -0.1209 0.6326 -0.1566 0.6243 0.0912
 18.7303 0 0 0 10.1194 0 0 0 2.513
 -0.045 0.6637 -0.7466 0.1014 -0.7405 -0.6644 0.9938 0.1056 0.034
 x1 x2 x3
 = = =
 -0.3 3 -1.75 3.75 2.15 -4.7 2.95 -1.4

Since the matrix S has 3 positive singular values, the rank of the matrix A is 3. Computing A+ = V * S+ * UT, the least squares solution to A * x = b is:

x+ = A+ * b.

x+=A+* b=
 x1 x2 x3
 = = =
 0.1332 0.129 0.1503 0.1449 0.1196 0.0326 0.2477 0.0427 0.0101 0.0109 0.0013 0.0175 0.0041 -0.051 0.0546 -0.0474 0.0219 0.0311 -0.0353 0.0287 0.0224 -0.0302 -0.0051 -0.0335
 -0.3 3 -1.75 3.75 2.15 -4.7 2.95 -1.4
 = = =
 1.4022 0.5689 0.4782

The least squares solution using singular value decomposition is x+ = ( 1.4022, 0.5689, 0.4782 ).

The projection of b onto Im(A) is p = A * x+. Comparing p and b and computing b - p:

a1a2a3 b-p=(b-p)
 1 2 3 4 5 6 7 8
 1 1 1 1 1 1 1 1
 -2.9 -3.8 2 -2.4 -4 -7.6 7.8 -6.6
 4.7 6.8 -7.9 5.5 5.5 -0.8 -6.7 -2

 -0.3 3 -1.75 3.75 2.15 -4.7 2.95 -1.4
-
 2 2.4922 -1.238 2.667 1.7567 -3.3044 2.6358 -3.3093
=
 -2.3 0.5078 -0.512 1.083 0.3933 -1.3956 0.3142 1.9093

The distance from Im(A) to b — the length of the vector (b-p) — is 3.582, found by computing the square root of the sums of the squares of the (b-p) column. (Im(A) is a subspace of R8, while b and p are vectors in R8)

For each row i of the data matrix A = [a1, a2, a3], the triples (ai,2, ai,3, bi) and (ai,2, ai,3, pi) are points in R3 with standard coordinates (x, y, z). Furthermore, the least squares solution can be represented by the plane:

z = 1.4022 + 0.5689 * x + 0.4782 * y

The plane is displayed below. If the point (ai,2, ai,3, bi) lies above the plane, a blue line joins it to the point (ai,2, ai,3, pi) on the plane. If (ai,2, ai,3, bi) lies below the plane, a red line joins it to (ai,2, ai,3, pi). By clicking the refresh button on your browser, you can generate a new system of linear equations, with the new solution with least squares parameters x+ = (x+1, x+2, x+3) displayed above.

A Square Matrix.

If A is a square matrix of dimension n, the system of linear equations problem has of n equations in n unknowns. If the rank of A, rank(A), equals n, its columns and rows are linearly independent and the problem has a unique solution.

The variables x = [xi] that solve the system of linear equations A * x = b can be obtained by various methods.

You can adjust the matrix A and the RHS vector b in the form below to see how the different methods solve the problem. The available methods are based on the techniques for decomposing a matrix that I discussed on the matrices web page. These methods are the A inverse solution using the Gauss-Jordan method,   P*A = L*U solution using the Gaussian method, A = Q*R solution using Householder transformations, and the svd solution using singular value decomposition.

Solve A * x = b
*x1 + *x2 + *x3 + *x4 + *x5=
*x1 + *x2 + *x3 + *x4 + *x5=
*x1 + *x2 + *x3 + *x4 + *x5=
*x1 + *x2 + *x3 + *x4 + *x5=
*x1 + *x2 + *x3 + *x4 + *x5=
 Square dimension of A: 5

The Gauss-Jordan Inverse Method

The Gausss-Jordan inverse method uses row operations on the initial tableau [ A  |  I ] to convert the RHS matrix to the identity matrix I and the LHS matrix to the inverse matrix A-1.

Solve A * x = b with A-1.

x=A-1* b=
 x1 x2 x3 x4 x5
 = = = = =
 2 -3 1 0 -0 -3 6 -4 1 0 1 -4 6 -4 1 0 1 -4 6 -2.8 -0 0 1 -2.8 1.64
 310 325 356 405 475
 = = = = =
 1 1 1 1 1

Solution vector xT
11111

The Gaussian Decomposition Method

The Gaussian decomposition method factors the matrix A into the product of a lower triangular matrix L and an upper triangular matrix U, keeping track of row switches to construct a permutation matrix P so that P * A = L * U.

Solve A * x = b using P * A * x = L * U * x = P * b.

LUx=Pb
 1 0 0 0 0 0.7333 1 0 0 0 0.7467 0.65 1 0 0 0.7867 0.35 0.5385 1 0 0.8667 0.125 0.1923 0.3571 1
 75 80 90 105 125 0 -2.6667 -7 -12 -16.6667 0 0 -0.65 -1.6 -2.5 0 0 0 -0.5385 -1.1538 0 0 0 0 -0.3571
 x1 x2 x3 x4 x5
 = = = = =
 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0
 310 325 356 405 475

The solution vector x is obtained in two steps. First, solve for y in L * y = P * b by forward substitution:

Ly=P * b
 1 0 0 0 0 0.7333 1 0 0 0 0.7467 0.65 1 0 0 0.7867 0.35 0.5385 1 0 0.8667 0.125 0.1923 0.3571 1
 y1 y2 y3 y4 y5
 = = = = =
 475 310 325 356 405

Solution vector yT
475-38.33333-4.75-1.69231-0.35714

Next, solve for x in U * x =y by backward substitution:

Ux=y
 75 80 90 105 125 0 -2.6667 -7 -12 -16.6667 0 0 -0.65 -1.6 -2.5 0 0 0 -0.5385 -1.1538 0 0 0 0 -0.3571
 x1 x2 x3 x4 x5
 = = = = =
 475 -38.3333 -4.75 -1.6923 -0.3571

Solution vector xT
11111

The Householder QR Method

The Householder decomposition method factors the matrix A into the product of an orthonormal matrix Q and an upper triangular matrix R, such that A = Q * R.

Solve A * x = b using A * x = Q * R * x = b.

QRx=b
 -0.3939 -0.73 0.5585 -0 0 -0.4011 -0.3366 -0.7228 -0.4509 -0 -0.4226 0.0152 -0.2782 0.8106 0.2945 -0.4656 0.3046 0.0697 0.0751 -0.8246 -0.5372 0.5107 0.2886 -0.3661 0.483
 -139.614 -146.626 -161.044 -183.664 -215.702 0 2.4143 6.513 11.2295 15.5124 0 0 0.5585 1.5112 2.4155 0 0 0 -0.4509 -0.993 0 -0 0 0 0.2945
 x1 x2 x3 x4 x5
 = = = = =
 310 325 356 405 475

Multiplying by QT gives R * x = QT * b.

Rx=QT * b
 -139.614 -146.626 -161.044 -183.664 -215.702 0 2.4143 6.513 11.2295 15.5124 0 0 0.5585 1.5112 2.4155 0 0 0 -0.4509 -0.993 0 -0 0 0 0.2945
 x1 x2 x3 x4 x5
 = = = = =
 -846.65 35.6692 4.4853 -1.444 0.2945

These equations are solved for x by back substitution yielding:

Solution vector xT
11111

The Singular Value Decomposition Method

The singular value decomposition method factors the matrix A into the product A = U * S* VT, where U consists of the eigenvectors of A * AT, V consists of the eigenvectors of AT * A, and S consists of the square roots of the eigenvalues of A * AT (which equal the eigenvalues of AT * A).

Solve A * x = b using A * x = U * S * VT * x = b.

USVTx=b
 -0.3631 0.642 -0.5234 0.3778 -0.1982 -0.3816 0.4404 0.1746 -0.6017 0.5176 -0.4196 0.1006 0.6398 -0.0213 -0.6356 -0.4791 -0.2708 0.2518 0.6143 0.5063 -0.5629 -0.5572 -0.472 -0.3427 -0.1801
 384.067 0 0 0 0 0 10.1448 0 0 0 0 0 0.5623 0 0 0 0 0 0.1488 0 0 0 0 0 0.0767
 -0.3631 -0.3816 -0.4196 -0.4791 -0.5629 0.642 0.4404 0.1006 -0.2708 -0.5572 -0.5234 0.1746 0.6398 0.2518 -0.472 0.3778 -0.6017 -0.0213 0.6143 -0.3427 -0.1982 0.5176 -0.6356 0.5063 -0.1801
 x1 x2 x3 x4 x5
 = = = = =
 310 325 356 405 475

Since the matrix S has 5 positive singular values, the rank of the matrix A is 5. Computing A+ = V * S+ * UT, the solution to A * x = b is:

x=A+* b=
 x1 x2 x3 x4 x5
 = = = = =
 2 -3 1 0 -0 -3 6 -4 1 0 1 -4 6 -4 1 -0 1 -4 6 -2.8 -0 0 1 -2.8 1.64
 310 325 356 405 475
 = = = = =
 1 1 1 1 1

Solution vector xT
11111

References.

• Ayres, Frank Jr. Matrices. New York: Schaum McGraw-Hill, 1962.
• Ayres, Frank Jr. Modern Algebra. New York: Schaum McGraw-Hill 1965.
• Bretscher, Otto. Linear Algebra with Applications. Upper Saddle River: Prentice Hall, 1997.
• Burden, Richard L. and J. Douglas Faires. Numerical Analysis. 6th ed. Pacific Grove: Brooks/Cole, 1997.
• Cohn, P. M. Linear Equations. London: Routledge, 1964.
• Demmel, James W. Applied Numerical Linear Algebra. Philadelphia: Siam, 1997.
• Dowling, Edward T. Mathematics for Economists. New York: Schaum McGraw-Hill, 1980.
• Lipschutz, Seymour. Linear Algebra. New York: Schaum McGraw-Hill, 1968.
• Mathews, John H. and Kurtis D. Fink. Numerical Methods Using MATLAB. 3rd ed. Upper Saddle River: Prentice Hall, 1999.
• Press, William H., Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling. Numerical Recipes: The Art of Scientific Computing. Cambridge: Cambridge UP, 1989.
• Strang, Gilbert. Linear Algebra and Its Applications. 3d ed. San Diego: Harcourt, 1976.
• Varah, James. Numerical Linear Algebra: Computer Science 402 Lecture Notes. Vancouver: University of B.C., 2000.
• Watkins, David S. Fundamentals of Matrix Computations. New York: John Wiley, 1991. Copyright © Elmer G. Wiens:   Egwald Web Services All Rights Reserved.    Inquiries 