Linear vs. Nonlinear Transformations
In the last post we started to explore how we can use coordinate transformations to manipulate points in space, and set ourselves the task of finding the transformations that would let us represent a robot that moves and rotates around in its environment.
An important step in this is to understand that we can separate all transformations into two distinct categories: linear and nonlinear. We'll talk in more detail about what linearity means later, but for now all we need to know is that, for our purposes$^{1}$, linear functions are functions of the form:
$f(\mathbf{p}) = \mathbf{A} \mathbf{p}$
where $\mathbf{A}$ is an $n\times n$ matrix and $n$ is the number of dimensions in $\mathbf{p}$. So the $\mathbf{A}$ matrix for the 1D, 2D, and 3D cases respectively would look like:
$\begin{bmatrix} a \end{bmatrix}, \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$
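As a minimal sketch of this definition in MATLAB/Octave, a 2D linear function is nothing more than a matrix-vector product. The values in `A` and `p` below are arbitrary and chosen only for illustration:

```matlab
% Hypothetical 2x2 matrix and 2D point, chosen only for illustration
A = [2, 1; ...
     0, 3];
p = [1; 2];

% f(p) = A * p
p_new = A * p;   % [2*1 + 1*2; 0*1 + 3*2] = [4; 6]
```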
If the function is anything other than a matrix multiplied by the input, then it is nonlinear. This seems like a pretty heavy limitation, but (as we'll see in future posts) we can still do some impressive transformations with linear functions, and understanding how to use them will set us up well for understanding nonlinearities further down the track.
Just a quick warning: don't be fooled into thinking a function like $y = mx + b$ is linear just because its graph is a line. Being linear in the polynomial sense is not the same as being a linear function/transformation, and that particular function is nonlinear whenever $b \neq 0$, because of the added $b$ term.
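One way to see this numerically is the standard additivity test, $f(u + v) = f(u) + f(v)$, which every linear function passes (this test is not introduced in the post itself, so treat it as an outside assumption). The sketch below uses made-up values for `m`, `b`, `u` and `v`:

```matlab
% A hypothetical line y = m*x + b with a nonzero offset b
m = 2;  b = 1;
f = @(x) m*x + b;

% A linear function must satisfy f(u + v) = f(u) + f(v) and f(0) = 0
u = 3;  v = 4;
lhs     = f(u + v);      % 2*7 + 1 = 15
rhs     = f(u) + f(v);   % 7 + 9   = 16, so additivity fails
at_zero = f(0);          % 1, so the origin has moved

% Dropping the offset restores linearity
g = @(x) m*x;
g_lhs = g(u + v);        % 14
g_rhs = g(u) + g(v);     % 6 + 8 = 14
```

With the offset removed, `g` is just the 1D case of $f(\mathbf{p}) = \mathbf{A}\mathbf{p}$ with $\mathbf{A} = \begin{bmatrix} m \end{bmatrix}$.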
Exploring Linear Transformations
Let's explore this idea a bit further using the example of a 2D linear transformation. We'll have a look at how each of these four matrix elements ($a$, $b$, $c$, and $d$) will influence the behaviour of our transformation.
$\begin{align*} \mathbf{p}_2 &= \mathbf{A} \mathbf{p}_1 \\ \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} &= \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} \\ &= \begin{bmatrix} ax_1 + by_1 \\ cx_1 + dy_1 \end{bmatrix}\end{align*}$
The Identity Matrix
The first and simplest matrix we will look at is the identity matrix (i.e. $a$ and $d$ are $1$, $b$ and $c$ are $0$). When we multiply any vector by the identity matrix, we get the same vector out, so this transformation does nothing! This will be our starting point for looking at the other transformations.
$\begin{align*} \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} \\ &= \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}\end{align*}$
Scaling
The $a$ element will determine how much $x_1$ affects $x_2$, and the $d$ element will determine how much $y_1$ affects $y_2$. Changing only these values scales the points along the $x$ axis, the $y$ axis, or both. Reducing both of these to zero is a special case where every single point is shrunk down into the origin!
$\begin{align*} \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} &= \begin{bmatrix} a & 0 \\ 0 & d \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} \\ &= \begin{bmatrix} a x_1 \\ d y_1 \end{bmatrix}\end{align*}$
Shearing
The two off-diagonal elements control the shear transform. The $b$ element determines how much the old $y$ affects the new $x$, and the $c$ element determines how much the old $x$ affects the new $y$. Keeping $a$ and $d$ at 1 and adjusting these one at a time produces a shear/skew effect.
$\begin{align*} \begin{bmatrix} x_2 \\ y_2 \end{bmatrix} &= \begin{bmatrix} 1 & b \\ c & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} \\ &= \begin{bmatrix} x_1 + b y_1 \\ y_1 + c x_1 \end{bmatrix}\end{align*}$
Combinations
Combining these lets us produce some other interesting effects. For example, if we use the matrix $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ we are saying that the new $x$ is only influenced by the old $y$ and vice versa. This results in a mirror of all the points across the line $y=x$.
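To make the cases above concrete, here is a small sketch that applies each kind of matrix to a single test point. The point and the particular values of $a$, $b$ and $d$ are arbitrary choices for illustration, not values taken from the examples further down:

```matlab
p1 = [2; 1];                      % arbitrary test point

identity_mat = [1, 0; 0, 1];
scale_mat    = [3, 0; 0, 0.5];    % a = 3, d = 0.5
shear_mat    = [1, 2; 0, 1];      % b = 2, c = 0
swap_mat     = [0, 1; 1, 0];      % mirror across the line y = x

p_identity = identity_mat * p1;   % [2; 1], unchanged
p_scale    = scale_mat * p1;      % [3*2; 0.5*1] = [6; 0.5]
p_shear    = shear_mat * p1;      % [2 + 2*1; 1] = [4; 1]
p_swap     = swap_mat * p1;       % [1; 2], x and y exchanged
```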
Have a play around with the Geogebra example at the bottom of the page and see if you can find a combination that will rotate all the points around the origin by 90 degrees.
Properties of linear transformations
So why do we care so much about linear transformations? Well, all linear transformations have a bunch of useful common properties that make them much more pleasant to work with than nonlinear functions. Here are a few:
 $0$ always maps to $0$. There is no way to move the origin.
 Linear transformations are always odd ($f(-\mathbf{p}) = -f(\mathbf{p})$). This results in a sort of mirroring effect. If you pick any point and see how it moves, the point exactly opposite (through the origin) will move the opposite way, and will continue to be its mirror. You can imagine there is a pin through the origin and everything is stretching and mirroring around it.
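 Linear transformations chain through multiplication. If we want to scale some points, then shear them, then rotate them, we just need to multiply all the matrices together, with the first transformation to be applied placed closest to the point: $\mathbf{p}_2 = \mathbf{A}_{\text{rotate}} \mathbf{A}_{\text{shear}} \mathbf{A}_{\text{scale}} \mathbf{p}_1$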
This last property is extremely helpful. It doesn't just make the equations simpler on page, it also improves computation speed as we can premultiply the matrices as necessary, and then transform whatever points we need with a single final multiplication.
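The three properties can be checked numerically with a short sketch. All matrices and the test point here are made up for illustration, and a mirror matrix stands in for the rotation as the third step of the chain:

```matlab
% Hypothetical matrices and point, chosen only for illustration
scale_mat  = [2, 0; 0, 0.5];
shear_mat  = [1, 1; 0, 1];
mirror_mat = [0, 1; 1, 0];
p = [3; 4];

% Property 1: the origin never moves
origin_out = shear_mat * [0; 0];     % [0; 0]

% Property 2: oddness, f(-p) = -f(p)
odd_lhs = shear_mat * (-p);          % [-7; -4]
odd_rhs = -(shear_mat * p);          % [-7; -4]

% Property 3: chaining. Scale first, then shear, then mirror.
step_by_step = mirror_mat * (shear_mat * (scale_mat * p));   % [2; 8]

% Premultiply once, then reuse the result for any number of points
combined_mat = mirror_mat * shear_mat * scale_mat;
all_at_once  = combined_mat * p;                             % [2; 8]
```

Note the order in `combined_mat`: the transformation applied first (`scale_mat`) sits furthest to the right, next to the point.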
Conclusion
It might seem like we've strayed a little from our task: we were looking for practical transformations, ones that translate and rotate our points. However, even though real robots don't scale or shear, understanding the structure of linear transformations is important, because if we can find a way to translate and rotate within this framework, it will make our calculations far simpler.
Examples
Geogebra
Start by adjusting the sliders one at a time, returning them to their start values after each adjustment. Then try adjusting two or more at a time to see what effect that has. Can you create a rotation? What about a translation?
MATLAB/Octave
The MATLAB script below generates a small "house" shape, offset from the origin. It then shows what the house would look like with the following transformations:
 A scale change (both X and Y)
 A shear (both X and Y)
 A randomly generated transformation (note that if you run the code you'll get a different shape for this one)
Try changing the numbers for one of them and see what you get!
% Set up an array of points
x_points = [2, 2, 0.5, 1, 1, 2];
y_points = [1, 2, 3, 2, 1, 1];
points = [x_points; y_points;];
% Scale
scale_mat = [1.5, 0; ...
             0,   0.5];
% Shear
shear_mat = [1,   0.7; ...
             0.2, 1];
% Random
rand_mat = 3*rand(2) - 1.5;   % entries uniformly distributed in (-1.5, 1.5)
% Transform the points
for p = 1:size(points,2)
    scale_pts(:,p) = scale_mat * points(:,p);
    shear_pts(:,p) = shear_mat * points(:,p);
    rand_pts(:,p) = rand_mat * points(:,p);
end
% Plot everything
clf;
plot(0,0,'+k', 'DisplayName', 'Origin');
hold on;
plot(points(1,:), points(2,:), 'xk', 'DisplayName', 'Original Points');
plot(scale_pts(1,:), scale_pts(2,:), 'xg', 'DisplayName', 'Scale');
plot(shear_pts(1,:), shear_pts(2,:), 'xr', 'DisplayName', 'Shear');
plot(rand_pts(1,:), rand_pts(2,:), 'xb', 'DisplayName', 'Random');
legend show; grid on; axis equal;
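Because `points` stores one point per column, the per-point loop in the script could also be written as a single matrix product per transformation. The sketch below reuses the script's `points` and `scale_mat` values and compares the two approaches. The names `scale_loop` and `scale_vec` are introduced here only for the comparison:

```matlab
% Same points and scale matrix as in the script above
points = [2, 2, 0.5, 1, 1, 2; ...
          1, 2, 3,   2, 1, 1];
scale_mat = [1.5, 0; ...
             0,   0.5];

% Loop version: transform one column (point) at a time
scale_loop = zeros(size(points));
for p = 1:size(points, 2)
    scale_loop(:, p) = scale_mat * points(:, p);
end

% Vectorised version: transform every column in one product
scale_vec = scale_mat * points;

% Largest difference between the two results (expected to be zero)
max_diff = max(abs(scale_loop(:) - scale_vec(:)));
```

The loop in the original script makes the per-point nature of the transformation explicit, so which form to use is mostly a matter of taste for a handful of points.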
Extra Resources
 Wikipedia has an article on the broader mathematical definition of linear functions/maps/transformations (of which we are looking at a subset: coordinate transformations).
Footnotes

1. We are assuming that the problem to solve is a single-input, single-output problem where the input space and output space are of the same dimension.