Infinity and beyond!: Moore-Penrose Pseudoinverse

Generalization of the inverse of a matrix.

I believe we pay too much attention to implementation, and too little
attention to the study of the concept being implemented. I have been
on the teams of many Data Science and Machine Learning projects, and
I always come back to one simple idea: “If you do not
know the math, you don’t know it at all.”

This is a piece of philosophy I deeply believe in. With the advent of packages like numpy, matplotlib, scikit-learn, etc., implementing a machine learning model for a moderately difficult data set and problem is fairly simple.
The magic then lies in being able to tweak the algorithm and get something new (or weird) out of the model. And to be capable of doing so, you have to know the mechanism behind it.
The Moore-Penrose pseudoinverse sits close to the soul of PCA (Principal Component Analysis), one of the most popular dimensionality reduction techniques: both are usually computed through the singular value decomposition of a matrix.

How do we define the inverse of a matrix?
Provided that the matrix is square and non-singular, we simply divide the adjugate (classical adjoint) of the matrix by its determinant.
Mathematically, for $A_{m \times m}$ with $|A| \neq 0$, the inverse of $A$ is defined as,
$A^{-1} = \frac{ \text{adj.} A}{|A|} \tag*{(1)}$

Of course, the above method is computationally very expensive. Alternatively, we can get the inverse of the matrix recursively using the Faddeev-LeVerrier algorithm (read about that in this blog of mine).
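To make formula $(1)$ concrete, here is a minimal NumPy sketch of the adjugate-over-determinant inverse. The helper name `adjugate` and the example matrix are my own choices for illustration; in practice you would simply call `np.linalg.inv`.

```python
import numpy as np

def adjugate(A):
    """Adjugate of a square matrix: the transpose of its cofactor matrix."""
    n = A.shape[0]
    cof = np.empty((n, n), dtype=float)
    for i in range(n):
        for j in range(n):
            # Minor: delete row i and column j, then take the determinant.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T

# Illustrative non-singular matrix (determinant is 1).
A = np.array([[2.0, 1.0],
              [5.0, 3.0]])

A_inv = adjugate(A) / np.linalg.det(A)   # formula (1)
print(A_inv)
print(np.allclose(A_inv, np.linalg.inv(A)))
```

The double loop computes one determinant per entry, which is exactly why this route gets expensive as the matrix grows.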

Now, how do we deal with matrices that are non-square? How do you find the inverse of a matrix that looks like this,
$B = \begin{bmatrix} x_{11} & x_{12} \\ x_{21} & x_{22}\\ x_{31} & x_{32} \end{bmatrix}_{3\times 2}$
This is where the generalization of the matrix inverse comes in, named the Moore-Penrose pseudoinverse.

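For every $A_{m \times n}$, there exists a pseudoinverse $A^{\dagger}_{n \times m}$ ($A^\dagger$ is read as “A dagger”).
When $A$ has full column rank, so that $A^{T}A$ is invertible, $A^{\dagger}$ is given by,
$A^{\dagger}=(A^{T}A)^{-1}A^{T} \tag*{(2)}$
This is dimensionally consistent: $A^{T}A$ is $n \times n$, so is its inverse, and multiplying by $A^{T}_{n \times m}$ gives an $n \times m$ matrix. When $A$ instead has full row rank, the analogous formula is $A^{\dagger}=A^{T}(AA^{T})^{-1}$. For rank-deficient matrices neither product is invertible, and $A^{\dagger}$ is obtained from the singular value decomposition instead.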
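As a quick sanity check of $(2)$, the sketch below builds the pseudoinverse of a tall matrix with full column rank and compares it against NumPy's `np.linalg.pinv`. The example matrix is an assumption chosen only so that the numbers stay small.

```python
import numpy as np

# Illustrative 3x2 matrix with linearly independent columns.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Formula (2): valid here because A.T @ A is invertible.
A_dagger = np.linalg.inv(A.T @ A) @ A.T

print(A.shape, "->", A_dagger.shape)   # an m x n input gives an n x m pseudoinverse
print(np.allclose(A_dagger, np.linalg.pinv(A)))
```

`np.linalg.pinv` works through the SVD, so it also handles matrices where $A^{T}A$ is singular and $(2)$ cannot be applied.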

Now, say we have,
$Q = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$
It is impossible to find $Q^{-1}$ by the conventional method $(1)$, since $Q$ is not square. So, we use the generalized inverse $(2)$.
First,
$Q^{T} = \begin{bmatrix} 1 & 2 \end{bmatrix}$
Then $Q^{T}Q$ comes out to be $[5]_{1 \times 1}$, and $(Q^{T}Q)^{-1}$ comes out to be $\frac{1}{5}$.
Hence,
$Q^{\dagger}=(Q^{T}Q)^{-1}Q^{T} =\frac{1}{5} \begin{bmatrix} 1 & 2 \end{bmatrix} = \begin{bmatrix} \frac{1}{5} & \frac{2}{5} \end{bmatrix}$ which is the pseudoinverse or the generalized inverse.
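The worked example for $Q$ can be reproduced in a few lines of NumPy. This is only a sketch of the hand calculation, with variable names of my own choosing.

```python
import numpy as np

Q = np.array([[1.0],
              [2.0]])                      # 2x1 column vector

QtQ = Q.T @ Q                              # 1x1 matrix holding 1^2 + 2^2
Q_dagger = np.linalg.inv(QtQ) @ Q.T        # formula (2)

print(Q_dagger)
print(Q_dagger @ Q)                        # left inverse: gives the 1x1 identity
print(Q @ Q_dagger)                        # 2x2, and not the identity
```

Note that $Q^{\dagger}Q$ is the identity while $QQ^{\dagger}$ is not: for a tall matrix the pseudoinverse behaves as a left inverse only.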

For a square, non-singular matrix (i.e., $m \times m$ with $|A| \neq 0$),
$A^{\dagger}=A^{-1}$
In detail,
$(A^{T}A)^{-1}A^{T}=A^{-1}(A^{T})^{-1}A^{T}=A^{-1}=\frac{\text{adj.} A}{|A|}$
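A small numerical check of this claim, assuming an invertible matrix picked purely for illustration:

```python
import numpy as np

# Illustrative square matrix with non-zero determinant (4*6 - 7*2 = 10).
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

via_formula = np.linalg.inv(A.T @ A) @ A.T   # formula (2)
ordinary = np.linalg.inv(A)                  # the usual inverse

print(np.allclose(via_formula, ordinary))
print(np.allclose(np.linalg.pinv(A), ordinary))
```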

Some properties of the generalized inverse are,
1. $AA^{\dagger}A=A$
2. $A^{\dagger}AA^{\dagger}=A^{\dagger}$
3. $(AA^{\dagger})^{T}=AA^{\dagger}$
4. $(A^{\dagger}A)^{T}=A^{\dagger}A$
One important point to remember is that $A^{\dagger}$ always exists and is unique.
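These identities are easy to test numerically. The sketch below uses a hypothetical helper, `penrose_conditions`, on a singular matrix to illustrate that $A^{\dagger}$ exists even when $A^{-1}$ does not. It checks the reconstruction identities and the symmetry of both $AA^{\dagger}$ and $A^{\dagger}A$, which together are known as the Penrose conditions.

```python
import numpy as np

def penrose_conditions(A, A_dagger):
    """Check the Penrose conditions for a candidate pseudoinverse (real matrices)."""
    return {
        "A A+ A = A":      np.allclose(A @ A_dagger @ A, A),
        "A+ A A+ = A+":    np.allclose(A_dagger @ A @ A_dagger, A_dagger),
        "(A A+)^T = A A+": np.allclose((A @ A_dagger).T, A @ A_dagger),
        "(A+ A)^T = A+ A": np.allclose((A_dagger @ A).T, A_dagger @ A),
    }

# Illustrative singular matrix: the second row is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

A_dagger = np.linalg.pinv(A)   # SVD-based, so it works for singular input

for name, holds in penrose_conditions(A, A_dagger).items():
    print(name, holds)
```

Formula $(2)$ cannot be used for this matrix because $A^{T}A$ is singular, which is why the sketch relies on `np.linalg.pinv` instead.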

Cheers!

Friday, 24 November 2017