After solving a few systems of equations, you will recognize that it doesn't matter so much what we call our variables as which numbers act as their coefficients. A system in the variables $x_1,\,x_2,\,x_3$ would behave the same if we changed the names of the variables to $a,\,b,\,c$ and kept all the constants the same and in the same places. In this section, we will isolate the key bits of information about a system of equations into something called a matrix, and then use this matrix to systematically solve the equations. Along the way we will obtain one of our most important and useful computational tools.

## Matrix and Vector Notation for Systems of Equations

Definition M (Matrix) An $m\times n$ matrix is a rectangular layout of numbers from $\complex{\null}$ having $m$ rows and $n$ columns. We will use upper-case Latin letters from the start of the alphabet ($A,\,B,\,C,\dotsc$) to denote matrices and squared-off brackets to delimit the layout. Many use large parentheses instead of brackets --- the distinction is not important. Rows of a matrix will be referenced starting at the top and working down (i.e. row 1 is at the top) and columns will be referenced starting from the left (i.e. column 1 is at the left). For a matrix $A$, the notation $\matrixentry{A}{ij}$ will refer to the complex number in row $i$ and column $j$ of $A$.
(This definition contains Notation M.)

Be careful with this notation for individual entries, since it is easy to think that $\matrixentry{A}{ij}$ refers to the whole matrix. It does not. It is just a number, but is a convenient way to talk about the individual entries simultaneously. This notation will get a heavy workout once we get to Chapter M:Matrices.
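If you experiment with matrices in software, keep in mind that most libraries count rows and columns from 0 rather than 1, so the text's $\matrixentry{A}{ij}$ requires an index shift. A small illustration with NumPy (the particular matrix is our own example, not one from the text):

```python
import numpy as np

# A 2x3 matrix; NumPy indexes rows and columns from 0,
# while the text's notation [A]_{ij} counts from 1.
A = np.array([[1, 2, 3],
              [4, 5, 6]])

# The entry the text would call [A]_{23} (row 2, column 3):
entry = A[1, 2]   # i=2, j=3 become 0-based indices 1 and 2
```

The off-by-one shift is a perennial source of bugs, which is one reason the text's careful notation is worth internalizing first.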

Example AM: A matrix.

When we do equation operations on a system of equations, the names of the variables really aren't very important. Whether we use $x_1$, $x_2$, $x_3$, or $a$, $b$, $c$, or $x$, $y$, $z$, it really doesn't matter. In this subsection we will describe some notation that will make it easier to describe linear systems, solve the systems and describe the solution sets. Here is a list of definitions, laden with notation.

Definition CV (Column Vector) A column vector of size $m$ is an ordered list of $m$ numbers, which is written in order vertically, starting at the top and proceeding to the bottom. At times, we will refer to a column vector as simply a vector. Column vectors will be written in bold, usually with lower case Latin letters from the end of the alphabet such as $\vect{u}$, $\vect{v}$, $\vect{w}$, $\vect{x}$, $\vect{y}$, $\vect{z}$. Some books like to write vectors with arrows, such as $\vec{u}$. Writing by hand, some like to put arrows on top of the symbol, or a tilde underneath the symbol, as in $\underset{\sim}{\textstyle u}$. To refer to the entry or component that is number $i$ in the list that is the vector $\vect{v}$ we write $\vectorentry{\vect{v}}{i}$.

Be careful with this notation. While the symbols $\vectorentry{\vect{v}}{i}$ might look somewhat substantial, as an object this represents just one component of a vector, which is just a single complex number.

Definition ZCV (Zero Column Vector) The zero vector of size $m$ is the column vector of size $m$ where each entry is the number zero,

\begin{align*} \zerovector= \colvector{0\\0\\0\\\vdots\\0} \end{align*}

or defined much more compactly, $\vectorentry{\zerovector}{i}=0$ for $1\leq i\leq m$.
(This definition contains Notation ZCV.)

Definition CM (Coefficient Matrix) For a system of linear equations,

\begin{align*} a_{11}x_1+a_{12}x_2+a_{13}x_3+\cdots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+a_{23}x_3+\cdots+a_{2n}x_n&=b_2\\ a_{31}x_1+a_{32}x_2+a_{33}x_3+\cdots+a_{3n}x_n&=b_3\\ \vdots&\\ a_{m1}x_1+a_{m2}x_2+a_{m3}x_3+\cdots+a_{mn}x_n&=b_m \end{align*}

the coefficient matrix is the $m\times n$ matrix \begin{equation*} A= \begin{bmatrix} a_{11}&a_{12}&a_{13}&\cdots&a_{1n}\\ a_{21}&a_{22}&a_{23}&\cdots&a_{2n}\\ a_{31}&a_{32}&a_{33}&\cdots&a_{3n}\\ \vdots&\vdots&\vdots&&\vdots\\ a_{m1}&a_{m2}&a_{m3}&\cdots&a_{mn}\\ \end{bmatrix} \end{equation*}

Definition VOC (Vector of Constants) For a system of linear equations,

\begin{align*} a_{11}x_1+a_{12}x_2+a_{13}x_3+\cdots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+a_{23}x_3+\cdots+a_{2n}x_n&=b_2\\ a_{31}x_1+a_{32}x_2+a_{33}x_3+\cdots+a_{3n}x_n&=b_3\\ \vdots&\\ a_{m1}x_1+a_{m2}x_2+a_{m3}x_3+\cdots+a_{mn}x_n&=b_m \end{align*}

the vector of constants is the column vector of size $m$ \begin{equation*} \vect{b}= \begin{bmatrix} b_1\\ b_2\\ b_3\\ \vdots\\ b_m\\ \end{bmatrix} \end{equation*}

Definition SOLV (Solution Vector) For a system of linear equations,

\begin{align*} a_{11}x_1+a_{12}x_2+a_{13}x_3+\cdots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+a_{23}x_3+\cdots+a_{2n}x_n&=b_2\\ a_{31}x_1+a_{32}x_2+a_{33}x_3+\cdots+a_{3n}x_n&=b_3\\ \vdots&\\ a_{m1}x_1+a_{m2}x_2+a_{m3}x_3+\cdots+a_{mn}x_n&=b_m \end{align*}

the solution vector is the column vector of size $n$ \begin{equation*} \vect{x}= \begin{bmatrix} x_1\\ x_2\\ x_3\\ \vdots\\ x_n\\ \end{bmatrix} \end{equation*}

The solution vector may do double duty on occasion. It might refer to a list of variable quantities at one point, and subsequently refer to values of those variables that actually form a particular solution to that system.

Definition MRLS (Matrix Representation of a Linear System) If $A$ is the coefficient matrix of a system of linear equations and $\vect{b}$ is the vector of constants, then we will write $\linearsystem{A}{\vect{b}}$ as a shorthand expression for the system of linear equations, which we will refer to as the matrix representation of the linear system.

Example NSLE: Notation for systems of linear equations.

Definition AM (Augmented Matrix) Suppose we have a system of $m$ equations in $n$ variables, with coefficient matrix $A$ and vector of constants $\vect{b}$. Then the augmented matrix of the system of equations is the $m\times(n+1)$ matrix whose first $n$ columns are the columns of $A$ and whose last column (number $n+1$) is the column vector $\vect{b}$. This matrix will be written as $\augmented{A}{\vect{b}}$.

The augmented matrix represents all the important information in the system of equations, since the names of the variables have been ignored, and the only connection with the variables is the location of their coefficients in the matrix. It is important to realize that the augmented matrix is just that, a matrix, and not a system of equations. In particular, the augmented matrix does not have any "solutions," though it will be useful for finding solutions to the system of equations that it is associated with. (Think about your objects, and review technique L.) However, notice that an augmented matrix always belongs to some system of equations, and vice versa, so it is tempting to try and blur the distinction between the two. Here's a quick example.
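As a quick computational aside, an augmented matrix is easy to build from $A$ and $\vect{b}$. A minimal sketch with NumPy (the particular numbers are illustrative only, not from one of the book's archetypes):

```python
import numpy as np

# Coefficient matrix A and vector of constants b for a hypothetical
# 3x3 system; these numbers are made up for illustration.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
b = np.array([[6.0], [3.0], [3.0]])

# The augmented matrix [A | b] is m x (n+1): the n columns of A
# followed by the column vector b.
aug = np.hstack((A, b))
```

Note that `aug` is simply a $3\times 4$ array of numbers; just as the text warns, it is a matrix, not a system of equations.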

Example AMAA: Augmented matrix for Archetype A.

## Row Operations

An augmented matrix for a system of equations will save us the tedium of continually writing down the names of the variables as we solve the system. It will also release us from any dependence on the actual names of the variables. We have seen how certain operations we can perform on equations (Definition EO) will preserve their solutions (Theorem EOPSS). The next two definitions and the following theorem carry over these ideas to augmented matrices.

Definition RO (Row Operations) The following three operations will transform an $m\times n$ matrix into a different matrix of the same size, and each is known as a row operation.

1. Swap the locations of two rows.
2. Multiply each entry of a single row by a nonzero quantity.
3. Multiply each entry of one row by some quantity, and add these values to the entries in the same columns of a second row. Leave the first row the same after this operation, but replace the second row by the new values.

We will use a symbolic shorthand to describe these row operations:

1. $\rowopswap{i}{j}$:    Swap the location of rows $i$ and $j$.
2. $\rowopmult{\alpha}{i}$:    Multiply row $i$ by the nonzero scalar $\alpha$.
3. $\rowopadd{\alpha}{i}{j}$:    Multiply row $i$ by the scalar $\alpha$ and add to row $j$.
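The three operations can be sketched in Python with NumPy. The helper names below are our own (not standard library functions), and the 1-based row numbers of the shorthand are converted to 0-based array indices:

```python
import numpy as np

def row_swap(A, i, j):
    """R_i <-> R_j: swap rows i and j (1-based, to match the text)."""
    B = A.astype(float)
    B[[i - 1, j - 1]] = B[[j - 1, i - 1]]
    return B

def row_scale(A, alpha, i):
    """alpha R_i: multiply row i by the nonzero scalar alpha."""
    assert alpha != 0, "the scalar must be nonzero"
    B = A.astype(float)
    B[i - 1] *= alpha
    return B

def row_add(A, alpha, i, j):
    """alpha R_i + R_j: add alpha times row i to row j; row i is unchanged."""
    B = A.astype(float)
    B[j - 1] += alpha * B[i - 1]
    return B
```

Each helper returns a new matrix rather than modifying its input, which makes it easy to chain a sequence of row operations while keeping the original matrix intact.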

Definition REM (Row-Equivalent Matrices) Two matrices, $A$ and $B$, are row-equivalent if one can be obtained from the other by a sequence of row operations.

Example TREM: Two row-equivalent matrices.

Notice that each of the three row operations is reversible (exercise RREF.T10), so we do not have to be careful about the distinction between "$A$ is row-equivalent to $B$" and "$B$ is row-equivalent to $A$" (exercise RREF.T11). The preceding definitions are designed to make the following theorem possible. It says that row-equivalent matrices represent systems of linear equations that have identical solution sets.
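Reversibility is easy to see concretely: each row operation is undone by another row operation of the same type. A small numerical check with NumPy (the matrix is illustrative):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

# Swapping two rows twice returns the original matrix.
B = A.copy(); B[[0, 1]] = B[[1, 0]]; B[[0, 1]] = B[[1, 0]]

# Scaling a row by 5 is undone by scaling it by 1/5.
C = A.copy(); C[0] *= 5.0; C[0] *= 1 / 5.0

# Adding 2 times row 1 to row 2 is undone by adding -2 times row 1.
D = A.copy(); D[1] += 2 * D[0]; D[1] += -2 * D[0]
```

In each case the final matrix equals `A` again, which is the content of exercise RREF.T10 in miniature.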

Theorem REMES (Row-Equivalent Matrices represent Equivalent Systems) Suppose that $A$ and $B$ are row-equivalent augmented matrices. Then the systems of linear equations that they represent are equivalent systems.

So at this point, our strategy is to begin with a system of equations, represent it by an augmented matrix, perform row operations (which will preserve solutions for the corresponding systems) to get a "simpler" augmented matrix, convert back to a "simpler" system of equations and then solve that system, knowing that its solutions are those of the original system. Here's a rehash of Example US as an exercise in using our new tools.

Example USR: Three equations, one solution, reprised.

## Reduced Row-Echelon Form

The preceding example amply illustrates the definitions and theorems we have seen so far. But it still leaves two questions unanswered. Exactly what is this "simpler" form for a matrix, and just how do we get it? Here's the answer to the first question, a definition of reduced row-echelon form.

Definition RREF (Reduced Row-Echelon Form) A matrix is in reduced row-echelon form if it meets all of the following conditions:

1. If there is a row where every entry is zero, then this row lies below any other row that contains a nonzero entry.
2. The leftmost nonzero entry of a row is equal to 1.
3. The leftmost nonzero entry of a row is the only nonzero entry in its column.
4. Consider any two different leftmost nonzero entries, one located in row $i$, column $j$ and the other located in row $s$, column $t$. If $s>i$, then $t>j$.

A row of only zero entries will be called a zero row and the leftmost nonzero entry of a nonzero row will be called a leading 1. The number of nonzero rows will be denoted by $r$.

A column containing a leading 1 will be called a pivot column. The set of column indices for all of the pivot columns will be denoted by $D=\set{d_1,\,d_2,\,d_3,\,\ldots,\,d_r}$ where $d_1 < d_2 < d_3 < \cdots < d_r$, while the columns that are not pivot columns will be denoted as $F=\set{f_1,\,f_2,\,f_3,\,\ldots,\,f_{n-r}}$ where $f_1 < f_2 < f_3 < \cdots < f_{n-r}$.
(This definition contains Notation RREFA.)

The principal feature of reduced row-echelon form is the pattern of leading 1's guaranteed by conditions (2) and (4), reminiscent of a flight of geese, or steps in a staircase, or water cascading down a mountain stream.

There are a number of new terms and notation introduced in this definition, which should make you suspect that this is an important definition. Given all there is to digest here, we will mostly save the use of $D$ and $F$ until Section TSS:Types of Solution Sets. However, one important point to make here is that all of these terms and notation apply to a matrix. Sometimes we will employ these terms and sets for an augmented matrix, and other times it might be a coefficient matrix. So always give some thought to exactly which type of matrix you are analyzing.
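The four conditions can also be checked mechanically. Here is a hypothetical helper, written with NumPy, that tests each condition in turn; it uses a small tolerance in place of exact comparison with zero, as is prudent in floating-point arithmetic:

```python
import numpy as np

def is_rref(A, tol=1e-12):
    """Check the four conditions of Definition RREF (not a NumPy built-in)."""
    m, n = A.shape
    last_pivot_col = -1      # column index of the previous leading 1
    seen_zero_row = False
    for i in range(m):
        nz = np.flatnonzero(np.abs(A[i]) > tol)
        if nz.size == 0:
            seen_zero_row = True              # a zero row; condition 1 says
            continue                          # no nonzero row may follow
        if seen_zero_row:
            return False                      # condition 1 violated
        j = nz[0]                             # leftmost nonzero entry of row i
        if abs(A[i, j] - 1) > tol:
            return False                      # condition 2: must be a leading 1
        if np.count_nonzero(np.abs(A[:, j]) > tol) != 1:
            return False                      # condition 3: alone in its column
        if j <= last_pivot_col:
            return False                      # condition 4: pivots move right
        last_pivot_col = j
    return True
```

Tracking `last_pivot_col` as we scan down the rows is exactly condition (4): each leading 1 must lie strictly to the right of the one in the row above.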

Example RREF: A matrix in reduced row-echelon form.

Example NRREF: A matrix not in reduced row-echelon form.

Theorem REMEF (Row-Equivalent Matrix in Echelon Form) Suppose $A$ is a matrix. Then there is a matrix $B$ so that

1. $A$ and $B$ are row-equivalent.
2. $B$ is in reduced row-echelon form.

The procedure given in the proof of Theorem REMEF can be more precisely described using a pseudo-code version of a computer program, as follows:


input $m$, $n$ and $A$
$r\leftarrow 0$
for $j\leftarrow 1$ to $n$
    $i\leftarrow r+1$
    while $i\leq m$ and $\matrixentry{A}{ij}=0$
        $i \leftarrow i+1$
    if $i\neq m+1$
        $r\leftarrow r+1$
        swap rows $i$ and $r$ of $A$ (row op 1)
        scale entry in row $r$, column $j$ of $A$ to a leading 1 (row op 2)
        for $k\leftarrow 1\text{ to }m$, $k\neq r$
            zero out entry in row $k$, column $j$ of $A$ (row op 3 using row $r$)
output $r$ and $A$


Notice that as a practical matter the "and" used in the conditional statement of the while statement should be of the "short-circuit" variety so that the array access that follows is not out-of-bounds.
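The pseudo-code translates almost line for line into Python. A sketch with NumPy follows; the 1-based indices of the pseudo-code become 0-based array indices, Python's `and` is already short-circuiting, and a small tolerance stands in for the exact test $\matrixentry{A}{ij}=0$ to guard against floating-point round-off:

```python
import numpy as np

def rref(A):
    """Row-reduce A per the pseudo-code: return (r, B) with B in RREF."""
    B = A.astype(float)
    m, n = B.shape
    r = 0
    for j in range(n):                         # for j <- 1 to n
        i = r                                  # i <- r+1 (0-based: r)
        while i < m and abs(B[i, j]) < 1e-12:  # short-circuit 'and'
            i += 1
        if i != m:                             # a nonzero entry was found
            r += 1
            B[[i, r - 1]] = B[[r - 1, i]]      # row op 1: swap rows
            B[r - 1] /= B[r - 1, j]            # row op 2: scale to a leading 1
            for k in range(m):
                if k != r - 1:
                    B[k] -= B[k, j] * B[r - 1]  # row op 3: clear column j
    return r, B
```

Because the `while` test `i < m` comes before the array access `B[i, j]`, the access is never out of bounds, exactly as the remark above requires.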

So now we can put it all together. Begin with a system of linear equations (Definition SLE), and represent the system by its augmented matrix (Definition AM). Use row operations (Definition RO) to convert this matrix into reduced row-echelon form (Definition RREF), using the procedure outlined in the proof of Theorem REMEF. Theorem REMEF also tells us we can always accomplish this, and that the result is row-equivalent (Definition REM) to the original augmented matrix. Since the matrix in reduced row-echelon form represents a system with the same solution set, we can analyze the row-reduced version instead of the original matrix, viewing it as the augmented matrix of a different system of equations. The beauty of augmented matrices in reduced row-echelon form is that the solution sets to their corresponding systems can be easily determined, as we will see in the next few examples and in the next section.
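If all we want is the final reduced row-echelon form, a computer algebra system can carry out the whole conversion in one call. A minimal sketch with SymPy (the small $2\times 2$ system here is illustrative, not one of the book's archetypes):

```python
from sympy import Matrix

# Augmented matrix for the hypothetical system
#   x1 + 2 x2 = 3
#   2 x1 + 5 x2 = 8
aug = Matrix([[1, 2, 3],
              [2, 5, 8]])

# SymPy's rref() returns the reduced row-echelon form together with
# the tuple of pivot-column indices (0-based).
B, pivots = aug.rref()
# Reading B back as a system of equations gives x1 = -1, x2 = 2.
```

Because SymPy works in exact rational arithmetic, there is no round-off to worry about, and Theorem RREFU below guarantees the answer is the same matrix any careful hand computation would produce.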

We will see through the course that almost every interesting property of a matrix can be discerned by looking at a row-equivalent matrix in reduced row-echelon form. For this reason it is important to know that the matrix $B$ guaranteed to exist by Theorem REMEF is also unique.

Theorem RREFU (Reduced Row-Echelon Form is Unique) Suppose that $A$ is an $m\times n$ matrix and that $B$ and $C$ are $m\times n$ matrices that are row-equivalent to $A$ and in reduced row-echelon form. Then $B=C$.

We will now run through some examples of using these definitions and theorems to solve some systems of equations. From now on, when we have a matrix in reduced row-echelon form, we will mark the leading 1's with a small box. In your work, you can box 'em, circle 'em or write 'em in a different color --- just identify 'em somehow. This device will prove very useful later and is a very good habit to start developing right now.

Example SAB: Solutions for Archetype B.

Archetypes A and B are meant to contrast each other in many respects. So let's solve Archetype A now.

Example SAA: Solutions for Archetype A.

Example SAE: Solutions for Archetype E.

These three examples (Example SAB, Example SAA, Example SAE) illustrate the full range of possibilities for a system of linear equations --- no solutions, one solution, or infinitely many solutions. In the next section we'll examine these three scenarios more closely.

Definition RR (Row-Reducing) To row-reduce the matrix $A$ means to apply row operations to $A$ and arrive at a row-equivalent matrix $B$ in reduced row-echelon form.

So the term row-reduce is used as a verb. Theorem REMEF tells us that this process will always be successful and Theorem RREFU tells us that the result will be unambiguous. Typically, the analysis of $A$ will proceed by analyzing $B$ and applying theorems whose hypotheses include the row-equivalence of $A$ and $B$.