Infinity and beyond!

Sunday, 9 September 2018

Catalan's Conjecture - A learning exercise for the bored mind - Contd.

Before we move to Mihailescu’s proof, let us cover some more mathematical concepts (so that we feel superior to the people around us, just kidding :smile:).

There is an interesting theorem that Mihailescu uses in his proof, called Stickelberger’s theorem.

Stickelberger’s theorem:

This is a result of algebraic number theory, which gives information about the Galois module structure of class groups of cyclotomic fields.
The theorem is built on two objects: the Stickelberger element and the Stickelberger ideal.
I will now state their definitions and then visit the corners as we move along.

Let $K_m$ denote the $m$-th cyclotomic field ${[1]}$. It is a Galois extension of $\mathbb{Q}$ with Galois group $G_m$ isomorphic to the multiplicative group of integers modulo $m$, $(\mathbb{Z}/m\mathbb{Z})^{\times}$.

The Stickelberger element (of level $m$ or of $K_m$) is an element in the group ring $\mathbb{Q}[G_m]$ and the Stickelberger ideal is an ideal in the group ring $\mathbb{Z}[G_m]$. (Note: $\mathbb{Z} \subset \mathbb{Q}$).

To define both the Stickelberger element and the Stickelberger ideal, let $\zeta_m$ denote a primitive $m$-th root of unity ${[2]}$. The isomorphism from $(\mathbb{Z}/m\mathbb{Z})^{\times}$ to $G_m$ is given by sending $a$ to $\sigma_a$, defined by the relation $\sigma_a(\zeta_m)=\zeta^{a}_m.$
The Stickelberger element of level $m$ is given by,
$\theta(K_m)=\frac{1}{m}\sum_{\substack{a=1 \\ (a,m)=1}}^{m} a\cdot \sigma_a^{-1} \in \mathbb{Q}[G_m].$
The Stickelberger ideal of level $m$ is given by,
$I(K_m)=\theta(K_m) \mathbb{Z}[G_m] \cap \mathbb{Z}[G_m].$
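
To get a feel for the Stickelberger element, here is a minimal Python sketch that lists its coefficients for a small $m$. The function name `stickelberger_coefficients` is my own (hypothetical) choice; it only uses the fact that $\sigma_a^{-1}=\sigma_b$ where $ab \equiv 1 \pmod m$.

```python
from fractions import Fraction
from math import gcd

def stickelberger_coefficients(m):
    """Return {b: coefficient of sigma_b} in theta(K_m), where b = a^{-1} mod m."""
    coeffs = {}
    for a in range(1, m + 1):
        if gcd(a, m) == 1:
            b = pow(a, -1, m)           # sigma_a^{-1} = sigma_b with a*b = 1 (mod m)
            coeffs[b] = Fraction(a, m)  # the term (a/m) * sigma_a^{-1}
    return coeffs

theta_5 = stickelberger_coefficients(5)
for b in sorted(theta_5):
    print(f"sigma_{b}: {theta_5[b]}")
# sigma_1: 1/5
# sigma_2: 3/5
# sigma_3: 2/5
# sigma_4: 4/5
```

So for $m=5$ we get $\theta(K_5)=\frac{1}{5}(\sigma_1+3\sigma_2+2\sigma_3+4\sigma_4)$.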

Inkeri $[3]$ used the concept of a Wieferich pair (explained in the previous blogpost of this series) in the context of Catalan’s equation as follows:

A Wieferich pair is a pair $(p,q)$ of primes such that $p^{q-1} \equiv 1( \text{mod} \ q^2)$ and $q^{p-1} \equiv 1(\text{mod} \ p^2).$
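
The two congruences are cheap to test with modular exponentiation. The following is a small sketch (the function name `is_wieferich_pair` is mine, and it assumes the caller passes primes, since primality is not checked):

```python
def is_wieferich_pair(p, q):
    """True if p^(q-1) = 1 (mod q^2) and q^(p-1) = 1 (mod p^2)."""
    return pow(p, q - 1, q * q) == 1 and pow(q, p - 1, p * p) == 1

print(is_wieferich_pair(2, 1093))  # True
print(is_wieferich_pair(3, 5))     # False, since 3^4 = 81 = 6 (mod 25)
```

The pair $(2, 1093)$ works because $1093$ is a Wieferich prime and $1093 \equiv 1 \pmod 4$.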

He showed that if Catalan’s equation holds, then either $(p,q)$ is a Wieferich pair, or $q$ divides $h_p$, the class number $[4]$ of the cyclotomic field $\mathbb{Q}(\zeta_p)$, or $p$ divides $h_q$, the class number of the cyclotomic field $\mathbb{Q}(\zeta_q)$. There were other developments in this direction too.

Bugeaud and Hanrot $[5]$ proved a class number criterion concerning Catalan’s equation, which implies that Catalan’s equation ($x^p - y^q =1$) has no solutions in non-zero integers $x$ and $y$ if $p$ and $q$ are primes such that one of them is smaller than 43. This was a huge achievement; I recommend that you have a look at the paper at [5].

Mihailescu proved that the Catalan equation has no solutions if $p$ and $q$ are odd and $q$ does not divide $p - 1$. By this result the Catalan conjecture became a theorem. Later, Mihailescu succeeded in finding a more elegant proof of Catalan’s conjecture in the case where $q$ does divide $p-1$. Thus, Catalan’s conjecture is a theorem with an algebraic proof in which no computer calculations are needed.

In this section we will discuss some breakthrough results by Mihailescu. The most important one is that $q^2$ divides $x$.

The following lemma will be used for that. Recall that an element $a$ of a ring $R$ is called nilpotent if there exists a positive integer $n$ such that $a^n=0$.

Lemma 1: The ring $\mathcal{O}_k/(q)$ does not contain non-zero nilpotent elements. Moreover, if $\alpha$, $\beta$ $\in$ $\mathcal{O}_k = \mathbb{Z}[\zeta]$ satisfy the congruence $\alpha^q \equiv \beta^q \pmod{q}$, then $\alpha^q \equiv \beta^q \pmod{q^2}$.

Theorem 1: For $\theta \in {I_s}^{-}$, the element $(x-\zeta)^\theta$ is a $q$-th power in $K=\mathbb{Q}(\zeta)$. We also have that $q^2$ divides $x$ and $p^2$ divides $y$.

The proofs of the above Lemma and Theorem are beyond the scope of this blog; needless to say, the pre-requisites are already covered in detail. For the interested, you can refer to Catalan’s Conjecture - A cyclotomic field.

Cheers!

Tuesday, 10 April 2018

Catalan's Conjecture - A learning exercise for the bored mind.

I was going through a video by Numberphile where they were talking about Catalan’s conjecture.
This is another conjecture which is simple to state, yet extremely difficult to prove, just like the Collatz conjecture, about which I already have a blog in place.
At the very outset, let me bring it to your knowledge that this blog is just for learning and understanding and contains “very little”-to-“null” original work on the subject. However, this blog will be exhaustive and will contain a lot of references that will help a newbie (like myself) get into the depths of the conjecture and fully understand the concepts used in its proof.
Let us now define the conjecture.
This statement was conjectured by Eugène Catalan (1814–1894) and was sent to the editor of Journal für die reine und angewandte Mathematik $^{[1]-\text{Introduction}}$.
The conjecture is as follows,
$2^3$ and $3^2$ are two powers of natural numbers whose values are consecutive (i.e., 8 and 9); the conjecture concerns the equation,
$\boxed{x^a-y^b=1} \tag*{(1)}$
The only solution of $(1)$ with $a$, $b$ $>$ $1$ and $x$, $y$ $>$ $0$ is $x=3$, $a=2$, $y=2$, $b=3$.
Catalan’s conjecture was proven true by Preda Mihăilescu in 2002; the proof involves the theory of cyclotomic fields and Galois Modules $^{[2]-\text{History}}$ .
So, as you see now, the breadth of the subject blew up! One second we were talking about squares and cubes and now we are talking of cyclotomic fields and Galois modules!
Nevertheless, I will try and cover each topic in brief and quench our mathematical thirst!
Let us consider $(1)$ (with $a=p$ and $b=q$, for the sake of eliminating any confusion between $a$ and an English “a”). Unless otherwise stated, $x$ and $y$ can be negative integers as well. Now, we re-write $(1)$ as follows,
$(x-1)\frac{x^p-1}{x-1}=y^q \tag*{(2)}$
The GCD of the two factors on the left hand side of the equation (after considering $x^p = ((x-1)+1)^p$ ) is either $p$ or $1$ (How did this happen?).
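
One way to see it: expanding $x^p=((x-1)+1)^p$ shows that $\frac{x^p-1}{x-1} \equiv p \pmod{x-1}$, so any common divisor of the two factors must divide $p$. Here is a quick numerical sanity check of that claim, as a sketch only (the helper name `factor_gcd` and the tested ranges are my own choices):

```python
from math import gcd

def factor_gcd(x, p):
    """gcd of (x - 1) and (x^p - 1)/(x - 1) for an integer x != 1."""
    second = (x**p - 1) // (x - 1)  # exact, because x - 1 always divides x^p - 1
    return gcd(x - 1, second)

for p in (3, 5, 7):
    for x in range(-30, 31):
        if x == 1:
            continue
        g = factor_gcd(x, p)
        # The gcd is p exactly when p divides x - 1, and 1 otherwise.
        assert g == (p if (x - 1) % p == 0 else 1)

print(factor_gcd(4, 3), factor_gcd(3, 3))  # 3 1
```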

Some concepts before we move ahead:

The Wieferich pairs
$^{[3]-\text{Wieferich pair}}$ In mathematics, a Wieferich pair is a pair of prime numbers $p$ and $q$ that satisfy,
$p^{q-1} \equiv 1 \pmod{q^2}$ and $q^{p-1} \equiv 1 \pmod{p^2}$
When the GCD of the two factors in $(2)$ is $p$, the second factor can be written as $\frac{x^p-1}{x-1}=pb^q$ with $p \nmid b$.
This suggests a traditional approach of factorizing the left hand side in $\mathbb{Z}[\zeta]$, the ring of integers in the $p$-th cyclotomic field $\mathbb{Q}(\zeta)$ ($\zeta=e^{2\pi i/p}$).

Ring $^{[4]-\text{Ring-Wolfram}}$ :
A ring in the mathematical sense is a set $S$ together with two binary
operators $+$ and $.$ (Addition and multiplication), satisfying the
following conditions:
1. Additive associativity: For all $a,b,c \in S$, $(a+b)+c=a+(b+c)$,
2. Additive commutativity: For all $a,b\in S$ , $a+b=b+a$ ,
3. Additive identity: There exists an element $0$ in $S$ such that for all a in $S$ , $0+a=a+0=a$ .
4. Additive inverse: For every a in $S$ there exists $-a \in S$ such that $a+(-a)=(-a)+a=0$ ,
5. Left and right distributivity: For all $a,b,c \in S$ , $a.(b+c)=(a.b)+(a.c)$ and $(b+c).a=(b.a)+(c.a)$ ,
6. Multiplicative associativity: For all $a,b,c \in S$, $(a.b).c=a.(b.c)$ (a ring satisfying this property is sometimes explicitly termed an associative ring). Conditions 1-5 are always required. Though non-associative rings exist, virtually all texts also require condition 6.
7. Multiplicative commutativity: For all $a,b \in S$, $a.b=b.a$ (a ring satisfying this property is termed a commutative ring),
8. Multiplicative identity: There exists an element $1 \in S$ such that for all $a \neq 0 \in S$, $1.a=a.1=a$ (a ring satisfying this property is termed a unit ring, or sometimes a “ring with identity”),
9. Multiplicative inverse: For each $a \neq 0 \in S$, there exists an element $a^{-1} \in S$ such that $a.a^{-1}=a^{-1}.a=1$, where $1$ is the identity element.

A brief history of the past developments on this conjecture is a must-read; its significance comes from the 150 years for which it remained an open problem $^{[5]-\text{Catalan's Conjecture: Are 8 and 9 the Only Consecutive Powers?}}$.

Only six years after Catalan formally stated the conjecture, a result was proved by the French mathematician Victor Lebesgue. He showed that the equation $x^p-y^2=1$, where $p$ is a prime, has no solutions in positive integers $x$ and $y$. A proof of the same will be discussed in brief in the later part of the blog.
After Lebesgue’s work, all developments dealt solely with small exponents, and then Nagell showed in 1921 that the difference between a third power and another perfect power is never equal to 1.
In 1932, Selberg proved that $x^4-y^n=1$ has no solution in positive integers when $n>1$. A stronger result was proved by Ko Chao in 1965, who showed that the equation $x^2-y^q=1$ has no solutions in positive integers when $q \geq 5$.
Cassels made some observations for $x^p-y^q=1$ where $p$ and $q$ are odd primes. He proved that if this equality holds for positive integers $x$ and $y$, then $p$ divides $y$ and $q$ divides $x$. For the case $p=2$, this had already been shown by Nagell.
Inkeri used the concept of a Wieferich pair [the definition and explanation of the same is given above] in the context of the Catalan equation as follows:
If the Catalan equation $(1)$ holds, then either $(p,q)$ is a Wieferich pair, or $q$ divides $h_p$, the class number of the cyclotomic field $\mathbb{Q}(\zeta_p)$, or $p$ divides $h_q$, the class number of the cyclotomic field $\mathbb{Q}(\zeta_q)$.

Cyclotomic Field $^{[5]-\text{Cyclotomic field}}$ :
In number theory, a cyclotomic field is a number field obtained by adjoining a complex primitive root of unity to $\mathbb{Q}$, the field of rational numbers. The $n$-th cyclotomic field $\mathbb{Q}(\zeta_n)$ (where $n > 2$) is obtained by adjoining a primitive $n$-th root of unity to the rational numbers.
Primitive root of unity $^{[6]-\text{Root of unity}}$
In mathematics, a root of unity, occasionally called a de Moivre number, is any complex number that gives $1$ when raised to some positive integer power $n$ .

Some time later, Mihailescu proved that the Catalan equation has no solutions if $p$ and $q$ are odd and $q$ does not divide $p-1$. By this result the Catalan conjecture became a theorem.

More on Cyclotomic Fields:
Let $p$ be an odd prime number. Let $\Phi_p$ be the $p$-th cyclotomic polynomial in $\mathbb{Q}[X]$, i.e., $\Phi_p = \frac{X^p-1}{X-1}$. Consider the field extension $\mathbb{Q}[X]/(\Phi_p) \cong \mathbb{Q}(\zeta)$ of $\mathbb{Q}$, where $\zeta$ denotes a primitive $p$-th root of unity. This is a field extension of degree $p-1$, since $\Phi_p$ is irreducible in $\mathbb{Q}[X]$. We denote $\mathbb{Q}(\zeta)$ by $k$ from now on.
This field extension is Galois with Galois Group,
$G=\text{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q}) \cong (\mathbb{Z}/p\mathbb{Z})^{*},$
Since the map,
$(\mathbb{Z}/p\mathbb{Z})^{*} {{\sim} \atop {\to}}\text{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$
$a (\mod p) \mapsto (\sigma_a : \zeta \mapsto \zeta^a)$
is an isomorphism.
The automorphism $\sigma_{p-1}$ acts in all embeddings as complex conjugation. Therefore, we call $\sigma_{p-1}$ complex conjugation.
The fixed field of complex conjugation is $\mathbb{Q}(\zeta+\zeta^{-1})$, which is called the maximal real subfield of $\mathbb{Q}(\zeta)$. We denote $\mathbb{Q}(\zeta + \zeta^{-1})$ by $K^{+}$. The field extension $\mathbb{Q}(\zeta+\zeta^{-1})$ of $\mathbb{Q}$ has degree $\frac{p-1}{2}$ and it is Galois with Galois group,
$G^{+}=\text{Gal}(\mathbb{Q}(\zeta+\zeta^{-1})/\mathbb{Q}) \cong (\mathbb{Z}/p\mathbb{Z})^{*}/(\pm 1)$

Another important concept that is

Mihailescu’s proof

To be contd…in the next blogpost!

Friday, 24 November 2017

Moore-Penrose Pseudoinverse

Generalization of the inverse of a matrix.

I believe we pay too much attention to implementation, and too little attention to the study of the concept that is implemented. I have been on the teams of many Data Science and Machine Learning projects, and I would always reiterate one simple idea; that is, “If you do not know the math, you don’t know it at all.”

This is a piece of philosophy I deeply believe in. With the advent of packages like numpy, matplotlib, scikit-learn etc., implementing a machine learning model with a moderately difficult data set and problem is fairly simple.
The magic then stays in being able to tweak the algorithm and getting something new (or weird) out of the model. And, for you to be capable of doing so, you will have to know the mechanism behind it.
The Moore-Penrose pseudoinverse is the soul of PCA (Principal Component Analysis), one of the most popularly used dimensionality reduction techniques.

How do we define the inverse of a matrix?
Provided that the matrix is square and non-singular, we simply divide the adjugate (classical adjoint) of the matrix by its determinant.
Mathematically, for $A_{m \times m}$ and $|A| \neq 0$ , the inverse of $A$ is defined as,
$A^{-1} = \frac{ \text{adj.} A}{|A|} \tag*{(1)}$

Of course, the above method is computationally very expensive. Hence, we can get the inverse of the matrix recursively using the Faddeev-LeVerrier algorithm (read about that in this blog of mine).

Now, how do we deal with matrices that are non-square? How do you find the inverse of a matrix that looks like this,
$B = \begin{bmatrix} x_{11} & x_{12} \\ x_{21} & x_{22}\\ x_{31} & x_{32} \end{bmatrix}_{3\times 2}$
This is where the generalization of the inverse of a matrix comes in, named the Moore-Penrose Pseudoinverse.

For every $A_{m \times n}$, there exists a pseudoinverse $A^{\dagger}_{n \times m}$. ( $A^\dagger$ is read as “A dagger”).
When $A$ has full column rank (so that $A^{T}A$ is invertible), $A^{\dagger}$ is given by,
$A^{\dagger}=(A^{T}A)^{-1}A^{T} \tag*{(2)}$
This is dimensionally consistent. Please check and verify.

Now, say we have,
$Q = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$
It is impossible to find $Q^{-1}$ by the conventional method $(1)$ . So, we use the Generalized Inverse at $(2)$ .
So,
$Q^{T} = \begin{bmatrix} 1 & 2 \end{bmatrix}$
So, $Q^{T}Q$ comes out to be $[5]_{1 \times 1}$. So, $(Q^{T}Q)^{-1}$ comes out to be $\frac{1}{5}$.
Hence,
$Q^{\dagger}=(Q^{T}Q)^{-1}Q^{T} =\frac{1}{5} \begin{bmatrix} 1 & 2 \end{bmatrix} = \begin{bmatrix} \frac{1}{5} & \frac{2}{5} \end{bmatrix}$ which is the pseudoinverse or the generalized inverse.

For a square, non-singular matrix (i.e., $m \times m$ with $|A| \neq 0$),
$A^{\dagger}=A^{-1}$
In detail,
$(A^{T}A)^{-1}A^{T}=\frac{\text{adj. } A}{|A|}$

Some properties of the generalized inverse are,
1. $AA^{\dagger}A=A$
2. $A^{\dagger}AA^{\dagger}=A^{\dagger}$
3. $(AA^{\dagger})^{T}=AA^{\dagger}$
4. $(A^{\dagger}A)^{T}=A^{\dagger}A$
One important point to remember is that $A^{\dagger}$ always exists and is unique.
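
If you want to verify the worked example and the properties by hand, here is a minimal pure-Python sketch using exact fractions. The helper names (`transpose`, `matmul`, `inverse`, `pseudoinverse`) are mine, and the formula used is the full-column-rank one from $(2)$, so this is not a general-purpose pseudoinverse.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Gauss-Jordan inverse of a small non-singular square matrix of Fractions."""
    n = len(M)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pivot_value = aug[c][c]
        aug[c] = [v / pivot_value for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def pseudoinverse(A):
    """(A^T A)^{-1} A^T, valid only when A has full column rank."""
    At = transpose(A)
    return matmul(inverse(matmul(At, A)), At)

Q = [[Fraction(1)], [Fraction(2)]]
Q_dag = pseudoinverse(Q)
print(Q_dag)  # [[Fraction(1, 5), Fraction(2, 5)]]

# Check the listed properties on this example.
assert matmul(matmul(Q, Q_dag), Q) == Q
assert matmul(matmul(Q_dag, Q), Q_dag) == Q_dag
P = matmul(Q, Q_dag)
assert P == transpose(P)
```

Exact fractions are used here so that the equalities can be checked without floating-point tolerances.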

Cheers!

Friday, 17 November 2017

The Linear Quadratic Regulator

Optimal Control and Linear-Quadratic-Regulator (LQR)

Today, I will not write an introductory passage to kick off my blog, because writing an introduction to Optimal Control in itself would require a blog. However, I will add in small tidbits as and when needed.
To understand the topic, we need some basic definitions with us.

1.
A control system can be represented in terms of State Space, as follows,
$\boxed{\begin{aligned} \dot{x}(t) &= A(t)x(t)+B(t)u(t) \\ y(t)&=C(t)x(t)+D(t)u(t) \end{aligned}}$
In the above formulation,
$x(.)$ is the state vector; $x(t) \in \mathbb{R}^{n}$.
$y(.)$ is the output vector; $y(t) \in \mathbb{R}^{q}$.
$u(.)$ is the input vector; $u(t) \in \mathbb{R}^{p}$.
$A(.)$ is the System Matrix; $\mathrm{dim}[A(.)]=n\times n$.
$B(.)$ is the Input Matrix; $\mathrm{dim}[B(.)]=n\times p$.
$C(.)$ is the Output Matrix; $\mathrm{dim}[C(.)]=q\times n$.
$D(.)$ is the Feed-forward Matrix; $\mathrm{dim}[D(.)]=q\times p$.
Now, for a system (with constant $A$ and $B$) to be controllable, we first define a matrix $\mathcal{Q}_c$, called the controllability matrix, such that,
$\boxed{\mathcal{Q}_c=\begin{bmatrix} B & AB & A^{2}B & \ldots & A^{n-1}B \\ \end{bmatrix}}$
The system is controllable if $\mathcal{Q}_c$ has full row rank (i.e. rank( $\mathcal{Q}_c$ ) $=n$ ).

We will assume that we deal with Controllable systems only.
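
The rank test above is easy to try on a toy system. Below is a small sketch in pure Python with exact fractions; the helper names and the two example systems (a double integrator and a deliberately uncontrollable diagonal system) are my own illustrative assumptions, not something from a particular toolbox.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def controllability_matrix(A, B):
    """Stack [B, AB, A^2 B, ..., A^(n-1) B] side by side."""
    n = len(A)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(matmul(A, blocks[-1]))
    return [sum((block[i] for block in blocks), []) for i in range(n)]

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

# Double integrator: controllable.
A1, B1 = [[0, 1], [0, 0]], [[0], [1]]
print(rank(controllability_matrix(A1, B1)))  # 2

# Input never reaches the second state: not controllable.
A2, B2 = [[1, 0], [0, 2]], [[1], [0]]
print(rank(controllability_matrix(A2, B2)))  # 1
```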

Usually, a single input system’s state feedback controller is designed using the Eigen-value method, or Pole Placement method.

2.
Pole placement method is the methodology of finding the control vector $u$ in the form $u=-kx$.
So, the state space representation changes to,
$\dot{x}=(A-Bk)x$
$k$ is found by matching the closed-loop characteristic polynomial to the desired one,
$|sI-(A-Bk)|=(s-\mu_1)(s-\mu_2)\ldots(s-\mu_n)$
Here, $\mu_1, \ldots, \mu_n$ are the desired pole locations. Note that $k$ is defined as $k=\begin{bmatrix} k_1 & k_2 & k_3 & \ldots & k_n \end{bmatrix}$
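
As a tiny worked case, assume a double integrator with $A=\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $B=\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ (my own example, not from a reference). Then $|sI-(A-Bk)|=s^2+k_2s+k_1$, and matching coefficients with $(s-\mu_1)(s-\mu_2)$ gives $k_1=\mu_1\mu_2$ and $k_2=-(\mu_1+\mu_2)$. A sketch of that calculation:

```python
import cmath

def double_integrator_gain(mu1, mu2):
    """Gains (k1, k2) that place the closed-loop poles at mu1 and mu2."""
    k1 = mu1 * mu2
    k2 = -(mu1 + mu2)
    return k1, k2

def closed_loop_poles(k1, k2):
    """Roots of the closed-loop characteristic polynomial s^2 + k2*s + k1."""
    disc = cmath.sqrt(k2 * k2 - 4 * k1)
    return (-k2 + disc) / 2, (-k2 - disc) / 2

k1, k2 = double_integrator_gain(-1, -2)
print(k1, k2)                     # 2 3
print(closed_loop_poles(k1, k2))  # ((-1+0j), (-2+0j))
```

The second function is only there to confirm that the computed gain really puts the poles where we asked.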

However, for a multi-input system the feedback gain i.e. $k$ is not unique.
Linear Quadratic Control strategy is used to deal with this issue.

Now, we dive into the Linear Quadratic Regulator (LQR) formulation, for an $m$ -input and $n$ -state system with $x \in \mathcal{R}^n$ , $u \in \mathcal{R}^m$ . Consider a system,
$\dot{x}=A(t)x(t)+B(t)u(t) \text{ provided }x(t_0)=x_0 \tag*{(1)}$
Our aim is to find an open loop control $u(\tau)$ , for $\tau \in [t_0, t_f]$ such that we minimize:
$\boxed{J(u, x_0, t_0, t_f) = \int_{t_0}^{t_f}[x^{T}(t)Q(t)x(t)+u^{T}(t)R(t)u(t)]dt+x(t_f)^{T}Sx(t_f) } \tag*{(2)}$
where $Q(t)$ and $S$ are symmetric positive semi-definite $n \times n$ matrices.
$R(t)$ is a symmetric positive definite $m \times m$ matrix. Note that $x_0$ , $t_0$ and $t_f$ are fixed and given data.
The aim of the controller is basically to keep $x(t)$ close to 0, especially at the final time $t_f$.
In $(2)$,

$x^T(t)Q(t)x(t)$ penalizes the transient state deviation.
$x^T(t_f)Sx(t_f)$ penalizes the final state.
$u^T(t)R(t)u(t)$ penalizes the control effort.

The above formulation can regulate the output $y(t) = C(t)x(t)$ near $0$ .
Note that we can define $S$ and $Q(t)$ as $C^T(t)W(t)C(t)$, where $W(t) \in \mathbb{R}^{r \times r}$.
We can now have a theorem as follows,

Consider a system $\dot{x}=f(x,u,t)$ with a fixed initial condition $x(t_0)=x_0$ and a fixed final time $t_f$. We define our time horizon as $[t_0,t_f]$ such that $t \in [t_0,t_f]$. We find $u(t)$ such that our cost function $J$ is minimized. $J$ is defined as,
$J(u(.),x_0)= \phi(x(t_f))+\int_{t_0}^{t_f}L(x(t), u(t), t)dt$
Here, the first term of $J$ is the final cost and the second term is the running cost.

Now, we will formulate some important functions that will convert the minimization of $J$, which is a constrained optimal control problem, into an unconstrained optimal control problem. [THIS MAY NOT MAKE SENSE TO YOU, WHICH IS NATURAL. HOLD ON].
$\boxed{\dot{\lambda}^T=-H_x=-\frac{\partial{L}}{\partial{x}}-\lambda^T\frac{\partial{f}}{\partial{x}}} \tag*{(3)}$
Note that $\lambda(t)$ ($\in \mathbb{R}^n$) is called the Lagrange multiplier (or costate) vector.
$H$ is the Hamiltonian function. It is defined in terms of $L$, $f$ and $\lambda$ as,
$\boxed{H(x,u,t) := L(x,u,t)+\lambda^T(t)f(x,u,t)} \tag*{(4)}$
The above definition is in terms of $L$ as defined in the theorem. So, we define the boundary condition on $\lambda$ along the same lines, just for convenience of computation.
$\boxed{\lambda^T(t_f)=\frac{\partial \phi}{\partial x}(x(t_f))} \tag*{(5)}$
$(5)$ can be written as $\lambda^T(t_f)=\phi_x(x(t_f))$
The state equation $\dot{x}=f(x,u,t)$ and the costate equation $(3)$ together form a set of $2n$ differential equations (in $x$ and $\lambda$, obviously) with split boundary conditions: $x(t_0)=x_0$ at $t_0$ and $(5)$ at $t_f$. Now, we can easily define $u(t)$ in terms of $x$ or/and $\lambda$.
As mentioned earlier, the solution is found by converting $J$ from a constrained optimal problem to an unconstrained optimal problem using a Lagrange multiplier function $\lambda(t)$:
$\boxed{\bar{J}(u,x_0)=J(u(.),x_0)+\int_{t_0}^{t_f} \lambda^T [f(x,u,t)-\dot{x}]dt} \tag*{(6)}$
Notice that,
$\frac{d}{dt}(\lambda^T(t)x(t))=\dot{\lambda}^T(t)x(t)+\lambda^T(t)\dot{x}(t) \tag*{(7)}$
Therefore,
$\int_{t_0}^{t_f}\lambda^T\dot{x}\ dt=\lambda^T(t_f)x(t_f)-\lambda^T(t_0)x(t_0)-\int_{t_0}^{t_f}\dot{\lambda}^Tx\ dt \tag*{(8)}$
Using the Hamiltonian function defined in $(4)$, we get,
$\boxed{\bar{J}=\phi(x(t_f))-\lambda^T(t_f)x(t_f) + \lambda^T(t_0)x(t_0)+\int_{t_0}^{t_f}[H(x(t),u(t),t)+\dot{\lambda}^T(t)x(t)] \ dt} \tag*{(9)}$
The necessary condition for an optimal solution is that the first variation $\delta \bar{J}$ of the modified cost vanishes for all admissible variations of $x$ and $u$ on $[t_0, t_f]$.
We will define $\delta \bar{J}$ analytically in the next post and formulate the Riccati Equation that will lay the foundation to some amazing control strategies.
Cheers!

Sunday, 5 November 2017

i!

Define the Factorial of a Complex number.

In the usual sense, the factorial is defined as,
$n! = \prod_{k=1}^{n} k=1 \cdot2\cdot3\ldots(n-2)\cdot(n-1)\cdot n \tag*{(1)}$
Now, the not so usual definition is based on the famous Gamma Function,
$\Gamma(t) = \int_0^\infty e^{-x}x^{t-1}\mathrm{d}x \tag*{(2)}$
There is a unique and very useful property,
$\Gamma(n+1)=n! \tag*{(3)}$
To extend $(3)$ into the complex domain, we will first have to go through Analytic Continuation; please read about Analytic Continuation here.
Therefore, after analytic continuation, we can write it as,
$z! \overset{\text{def}}{=} \Gamma(z+1) \tag*{(4)}$
for $z \in \mathbb{C} \setminus \{-1,-2,\ldots\}$.
So, now, clearly,
$i!= \Gamma(i+1) \tag*{(5)}$
By $(2)$ (the integral converges because the real part of $i+1$ is positive),
$\Gamma(i+1)=\int_0^\infty e^{-x} x^{(i+1)-1}\mathrm{d}x \tag*{(6)}$
Clearly,
$\Gamma(i+1)=\int_0^\infty e^{-x} x^{i}\mathrm{d}x \tag*{(7)}$
For easier computation, please note that,
$x^i = e^{i \ln x} = \cos ( \ln x)+i \sin( \ln x)$
Let’s break it down,
$i! = \int_0^\infty e^{-x} (\cos ( \ln x)+i \sin( \ln x))\mathrm{d}x$
$\implies i! = \int_0^\infty e^{-x} \cos ( \ln x)\mathrm{d}x+i\int_0^\infty e^{-x} \sin ( \ln x)\mathrm{d}x$
If you have reached this far, you obviously know how to solve the above integral.
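
For the lazy (like myself), the integral can also be estimated numerically. The sketch below substitutes $x=e^s$, which turns it into $\int_{-\infty}^{\infty} e^{-e^s} e^{(1+i)s}\,\mathrm{d}s$, and then applies the plain trapezoidal rule on a finite window. The window $[-40, 6]$ and the step size are my own assumptions, chosen so that the neglected tails are negligible.

```python
import cmath
import math

def gamma_one_plus_i(lo=-40.0, hi=6.0, h=0.005):
    """Trapezoidal estimate of Gamma(1+i) after substituting x = e^s."""
    n = int(round((hi - lo) / h))
    total = 0j
    for j in range(n + 1):
        s = lo + j * h
        value = cmath.exp(-math.exp(s) + (1 + 1j) * s)
        total += value if 0 < j < n else value / 2  # half weight at both ends
    return total * h

i_factorial = gamma_one_plus_i()
print(i_factorial)  # roughly 0.498 - 0.155i

# Sanity check with the known identity |Gamma(1+i)|^2 = pi / sinh(pi).
print(abs(i_factorial) ** 2, math.pi / math.sinh(math.pi))
```

The real part is the cosine integral and the imaginary part is the sine integral from the last equation above.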
Cheers!

Saturday, 26 August 2017

A new Kilogram? What? How?

To understand this, let us see how time is defined. We all know the SI unit of time is the second, and it is defined by measuring the electronic transition frequency of caesium atoms. Similarly, length (SI unit metre) is defined as the distance traveled by light in $\frac{1}{3 \times 10^8}$ seconds or, to be more precise, $\frac{1}{299792458}$ seconds.
The inspiration to write this blog was derived from Veritasium|How We’re Redefining the kg.
We see that all these fundamental quantities are defined on a given/fixed standard.
However, the kilogram is set as the mass of a metal cylinder in Paris. That is not that ideal, huh!
So, the NIST is trying to define the unit of mass, i.e., the kilogram, using some already standardized quantities like Planck’s constant and Avogadro’s number. Planck’s constant takes a dive into advanced applied physics (no, not Quantum physics :p ); therefore, it demands my blog’s space and time.
The engineering behind it is pretty simple, to put it in the simplest of the terms;

There is a balancing device, which has a mass unit and a coil unit. The mass unit is balanced with the magnetic field from the coil using a motor fixed to it, until $F_{magnetic} = F_{gravitational}$. To know more, read Watt Balance or Kibble Balance.

Let’s look into the working now; in the process we will generate some cool equations and learn some new concepts, as and when required.
First, the principle of operation of the watt balance itself says that,
$F_{magnetic} = F_{gravitational}$
Taking into account all the usual scientific nomenclature, we can write,
$\boxed{m \times g = B \times i \times l} \tag*{(1)}$
Here, $m$ is the mass, $g$ is the acceleration due to gravity, $B$ is the magnetic flux density (in Tesla ($T$)), $i$ is the current in the coil and $l$ is the length of the conductor in the field.
Equation $(1)$ is for the weighing mode of operation of the Watt Balance, in which the forces are matched on both sides.
There is another mode, called the velocity mode, in which the mass ($m$) is lifted off and the coil is moved through the magnetic field at a velocity $v$. This motion induces a voltage, $V$, in the coil. By Faraday’s motional emf expression,
$\boxed{V = B \times l \times v} \tag*{(2)}$
Here, $v$ is the velocity of the conductor in the magnetic field.
Now, $(1)$ can be written as,
$B \times l = \frac{m \times g}{i} \tag*{(3)}$
and $(2)$ can be written as,
$B \times l = \frac{V}{v} \tag*{(4)}$
Equating $(3)$ and $(4)$ , we have,
$\frac{m \times g}{i} = \frac{V}{v}$
Which can be written as,
$\boxed{m \times g \times v = V \times i} \tag*{(5)}$
Interestingly, $m \times g \times v$ is the mechanical power (refer to Encyclopedia of Electrochemical Power Sources for more info) and $V \times i$ is the electrical power.
You must all be thinking, how does Planck’s constant come into play? Well, please hold your horses. We are almost there.
To measure $V$ in $(5)$, we go into a concept from superconductivity called the Josephson effect. Please read up on its working here. It is also significant because it is the standard of voltage.
When a DC voltage is applied to a Josephson junction, the junction experiences an oscillation of frequency (read more),
$f = \frac{2eV}{h}$
Here, $f$ is the frequency, $e$ is the elementary charge, and $h$ is the Planck’s constant( $6.62607004 \times 10^{-34} m^2 kg \cdot s^{-1}$ ).
The above equation can be written as,
$\boxed{V = \frac{h\times f}{2e}} \tag*{(6)}$
For many junctions (say $n$) it is,
$V = n\frac{h\times f}{2e}$
The voltage measured here is accurate to 1 part in $10^{10}$; refer to this.
Now, if we write $i$ as $\frac{V}{R}$ (where $R$ is the resistance offered by the junction), then $(5)$ changes to,
$\boxed{\frac{h^2\times f^2}{4e^2R} = m \times g \times v} \tag*{(7)}$
Another question now is, how do we measure $R$? For that we will use the idea of the Quantum Hall Effect. The Quantum Hall effect is the standard for resistance; please refer to the paper for more information. But suffice it to say, the resistance is defined as,
$\boxed{R = \frac{1}{k}\frac{h}{e^2}}$
Here, $k$ is $1, 2, 3, ...$
Please note that the value without the integer fraction, i.e., $R_k = \frac{h}{e^2}$, is called the von Klitzing constant; this guy got a Nobel for this. Please read more here.

Using the above equation in $(7)$ we get,
$\frac{h^2\times f^2}{4e^2 \frac{1}{k}\frac{h}{e^2}} = m \times g \times v$
Which comes to,
$h^2\times f^2 = m \times g \times v \times 4e^2 \frac{1}{k}\frac{h}{e^2}$
$\implies h\times f^2 \times k = m \times g \times v \times 4$
$\implies h = \frac{m \times g \times v \times 4}{f^2 \times k} \tag*{(8)}$
This is for a single Josephson junction; for $n$ junctions, it becomes,
$\implies h = \frac{m \times g \times v \times 4}{f^2 \times k \times n^2}$
Which can look more elegant if we write it as,
$\boxed{h = \frac{4}{k \times n^2}\frac{g \times v}{f^2}m}$
We have seen that $V$ and $R$ are measured very accurately. Similarly, we need to measure the other factors very accurately as well. The $v$ is measured using a laser interferometer. The $g$ is measured using a gravimeter.
So, the scientists at NIST are just putting in some mass $m$ to get $h$, and keep tuning the value of $m$ till they get a very accurate $h$.
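
To convince ourselves that the algebra hangs together, here is a small round-trip sketch: start from a value of $h$, build $V$, $R$ and $i$ from the Josephson and Quantum Hall relations, compute the mass that balances the power equation $(5)$, and then recover $h$ from the final formula for $n$ junctions. All operating values below ($f$, $n$, $k$, $g$, $v$) are made-up illustrative numbers, not actual NIST settings, and the value of $e$ is the CODATA 2014 figure that goes with the $h$ quoted above.

```python
import math

def planck_from_kibble(m, g, v, f, k, n):
    """h = 4 m g v / (k n^2 f^2), the n-junction form of equation (8)."""
    return 4 * m * g * v / (k * n**2 * f**2)

h = 6.62607004e-34   # Planck's constant as quoted in this post (J s)
e = 1.6021766208e-19 # elementary charge, CODATA 2014 (C)

# Hypothetical operating point, for illustration only.
f, n, k = 75e9, 1000, 2
g, v = 9.80665, 2e-3

V = n * h * f / (2 * e)  # Josephson voltage for n junctions
R = h / (k * e**2)       # quantum Hall resistance
i = V / R                # current through the coil
m = V * i / (g * v)      # mass that satisfies m g v = V i

h_recovered = planck_from_kibble(m, g, v, f, k, n)
print(math.isclose(h_recovered, h, rel_tol=1e-12))  # True
```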
Cheers!

Monday, 7 August 2017

Study of Brachistochrone Problem.

This blog will deal with one of the most elegant topics in mathematics: the famous Brachistochrone curve. Vsauce did a great job at explaining it in this video. This looks good to me, but it doesn’t feel good; because of course the video does not contain tons of fancy mathematical vocabulary and millions of lines of $\LaTeX$ .

To get started, we will just put forward a small fact that, this topic is from a larger field of mathematics called the Calculus of Variations.

Before we get into the Analysis of Brachistochrone Curve (which happens to be the heart of this post), let us brush-up on some concepts.

Partial Differentiation (advanced preliminary):

Let us look at how we define the derivative of a given function, say we have $g(x)$ and we have,
$\boxed{\frac{\mathrm{d}}{\mathrm{d}x}g(x)=g'(x)=\lim_{c \to 0}\frac{g(x+ c)-g(x)}{c}}$
Provided this limit exists.

Now, for a function of many variables it is not easy to compute the total derivative (the usual derivative). Therefore, we calculate the Partial Derivative. You might have already guessed it by now; we define the partial derivative as follows. Say we have a function $h(x,y,z)$, then,
$\boxed{\frac {\partial}{\partial x}h(x,y,z) = \lim_{c \to 0} \frac{h(x+c, y,z)-h(x,y,z)}{c}}$
Let us look into a small example now, say we have,
$f(x,y,z) = x^2y^3+x+y+z+\sin(x) \cos(y)$
Now,
$\frac{\partial}{\partial x}f(x,y,z) = 2xy^3+1+0+0+\cos(x) \cos(y)$
similarly,
$\frac{\partial}{\partial y}f(x,y,z) = 3x^2y^2+0+1+0-\sin(x) \sin(y)$
and,
$\frac{\partial}{\partial z}f(x,y,z) = 0 +0+0+1+0$
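
A quick way to double-check partial derivatives like these is a finite-difference estimate. The sketch below uses a central difference (a slightly more accurate cousin of the one-sided limit in the definition above); the helper name `partial`, the step size and the test point are my own arbitrary choices.

```python
import math

def f(x, y, z):
    return x**2 * y**3 + x + y + z + math.sin(x) * math.cos(y)

def partial(func, point, index, c=1e-6):
    """Central-difference estimate of the partial derivative along `index`."""
    up, down = list(point), list(point)
    up[index] += c
    down[index] -= c
    return (func(*up) - func(*down)) / (2 * c)

x, y, z = 1.2, 0.7, -0.4
analytic = (
    2 * x * y**3 + 1 + math.cos(x) * math.cos(y),     # d/dx
    3 * x**2 * y**2 + 1 - math.sin(x) * math.sin(y),  # d/dy
    1.0,                                              # d/dz
)
numeric = tuple(partial(f, (x, y, z), idx) for idx in range(3))

for a, b in zip(analytic, numeric):
    print(f"{a:.6f}  {b:.6f}")
```

Both columns should agree to about six decimal places.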

Idea of Speed and Time (basic preliminary):

Let us consider, the following scenario, (Made using Geogebra)

In the figure, we have denoted the path ACB as $\pi_1$ and the path ADB as $\pi_2$.
Consider that the time taken for a body to move from A to B using path $\pi_1$ is $t_{AB_{\pi_1}}$; similarly, the time taken for a body to move from A to B using path $\pi_2$ is $t_{AB_{\pi_2}}$.
Also, let the distance along $\pi_1$ be $l_1$ and the distance along $\pi_2$ be $l_2$.

Then,
Speed ( $\pi_1$ ) = $v_1$ = $\frac{l_1}{t_{AB_{\pi_1}}}$ and Speed ( $\pi_2$ ) = $v_2$ = $\frac{l_2}{t_{AB_{\pi_2}}}$ .

Idea of Distance (basic preliminary):

Distance between two points $(a,b)$ and $(c,d)$ is given by,
$l = \sqrt{(a-c)^2+(b-d)^2}$
In case, the distance is measured from origin ( $O(0,0)$ ) to some point $A(x,y)$ , we get,
$l = \sqrt{(x-0)^2+(y-0)^2} = \sqrt{x^2 + y^2}$

Now that the notion is clear, we shall proceed into the analysis.
[NOTE : Some topics which are new to me (or you) will be explained as and when it is required].

Time needed for a body to travel from $A$ to $B$ is given by,
$t_{AB}=\frac{l}{v}$ for a linear path.

For a curve or a non-linear path, we consider piece-wise linear distance ( $\mathrm{d}l$ ) and speed ( $v$ ). We can define, $t_{AB}$ as,
$\boxed{t_{AB} = \int_{A}^{B}\frac{\mathrm{d}l}{v}} \tag*{(1)}$
Now, we must understand a fact that, all distance is composed of $x$ and $y$ coordinates. We consider the point $A$ as the origin ( $O(0,0)$ ) and $B$ as $(x,y)$ . Hence,
$l = \sqrt{x^2 + y^2}\\ \implies \mathrm{d}l = \sqrt{\mathrm{d}x^2 + \mathrm{d}y^2}\\ \implies \mathrm{d}l = \sqrt{\left(1 + \left(\frac{\mathrm{d}y}{\mathrm{d}x}\right)^2\right)\mathrm{d}x^2}$
$\boxed{\implies \mathrm{d}l = \sqrt{1 + \left(\frac{\mathrm{d}y}{\mathrm{d}x}\right)^2}\,\mathrm{d}x} \tag*{(2)}$

Also, since the body starts from rest at $A$ and falls under gravity, with $y$ measured downwards from $A$, the speed follows from equating the kinetic energy gained to the potential energy lost,
$KE = PE\\ \implies \frac{1}{2}mv^2 = mgy \\$
$\implies \boxed{v = \sqrt{2gy}} \tag*{(3)}$

Use $(2)$ and $(3)$ in $(1)$ , we have,
$\boxed{t_{AB} = \int_{A}^{B}\frac{\sqrt{1 + \left(\frac{\mathrm{d}y}{\mathrm{d}x}\right)^2}\,\mathrm{d}x}{\sqrt{2gy}}} \tag*{(4)}$
We write
$\frac{\mathrm{d}y}{\mathrm{d}x}=y'$
Now, $(4)$ becomes,
$t_{AB} = \int_{A}^{B}\frac{\sqrt{(1 +y'^2 )}}{\sqrt{2gy}}\mathrm{d}x \tag*{(4.1)}$
Therefore, the function is,
$\theta = \frac{\sqrt{(1 +y'^2 )}}{\sqrt{2gy}}\\ \boxed{\theta = (1 +y'^2 )^{1/2}(2gy)^{-1/2}} \tag*{(5)}$

Please notice that, the integral $t_{AB}$ can be written as,
$t_{AB} = \int_{A}^{B} f(y, y', x) \mathrm{d}x \tag*{(6)}$
In our case, $f(y, y', x)$ is $\theta$ .

Equation $(6)$ has exactly the form to which the Euler-Lagrange equation applies.

It is pretty simple to define,

The Euler-Lagrange equation

For any $I$ such that,
$I = \int_{a}^{b}f(y,y',x)\mathrm{d}x$
where $y' = \frac{\mathrm{d}y}{\mathrm{d}x}$, then $I$ has a stationary value if the Euler-Lagrange Equation, given by,
$\frac{\partial f}{\partial y}-\frac{\mathrm{d}}{\mathrm{d}x}\frac{\partial f}{\partial y'}=0$
is satisfied.

Therefore, for our analysis,
$\boxed{\frac{\partial \theta}{\partial y} - \frac{\mathrm{d}}{\mathrm{d}x}\frac{\partial \theta}{\partial y'}=0} \tag*{(7)}$

Stationary value

This is a value at the stationary point.
A stationary point $a$ is the point at which the first derivative of a function becomes zero,
$f'(x)|_{x=a} = 0 ; a = \text{Stationary point}$
Side note :
Find the stationary points of $f(x) = x^3-6x^2+9x$ .
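
If you want to check your answer to the side note, here is a small sketch that solves $f'(x)=0$ for a cubic with the quadratic formula. The helper name `stationary_points_cubic` is my own; it only handles real roots and assumes $a \neq 0$.

```python
import math

def stationary_points_cubic(a, b, c):
    """Real roots of f'(x) = 3a x^2 + 2b x + c for f(x) = a x^3 + b x^2 + c x + d."""
    A, B, C = 3 * a, 2 * b, c
    disc = B * B - 4 * A * C
    if disc < 0:
        return []  # no real stationary points
    root = math.sqrt(disc)
    return sorted({(-B - root) / (2 * A), (-B + root) / (2 * A)})

print(stationary_points_cubic(1, -6, 9))  # [1.0, 3.0]
```

Indeed, $f'(x)=3x^2-12x+9=3(x-1)(x-3)$.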

Beltrami Identity

Our $\theta$ (from $(5)$) is such a cool expression, because $x$ does not appear in it explicitly; therefore, of course $\boxed{\frac{\partial \theta}{\partial x} = 0}$; this leads us to another beautiful form, the Beltrami Identity, which is given by,
$\boxed{\theta - y' \frac{\partial \theta}{\partial y'}=C ; C \text{ is a constant}}.\tag*{(8)}$

We now find, $\frac{\partial \theta}{\partial y'}$ ,
$\frac{\partial \theta}{\partial y'}=\frac{\partial}{\partial y'}[(1 +y'^2 )^{1/2}(2gy)^{-1/2}] \\ \implies \frac{\partial \theta}{\partial y'} =(2gy)^{-1/2} \frac{1}{2}(1 +y'^2 )^{-1/2}\cdot 2y'$
$\therefore \boxed{\frac{\partial \theta}{\partial y'} = (2gy)^{-1/2} (1 +y'^2 )^{-1/2}\cdot y'} \tag*{(9)}$

Use $\theta$ and $\frac{\partial \theta}{\partial y'}$ in equation $(8)$ , we get,
$(1 +y'^2 )^{1/2}(2gy)^{-1/2}-y' \cdot (2gy)^{-1/2} (1 +y'^2 )^{-1/2}\cdot y' = C$
$\implies \frac{\sqrt{(1 +y'^2 )}}{\sqrt{2gy}}-\frac{y'^2}{\sqrt{2gy}\sqrt{1+y'^2}} = C$
$\implies \frac{1+y'^2-y'^2}{\sqrt{2gy}\sqrt{1+y'^2}}=C$
$\implies \boxed{\frac{1}{\sqrt{2gy}\sqrt{1+y'^2}}=C} \tag*{(10)}$
Rearranging a little gives,
$\sqrt{1+y'^2}=\frac{1}{\sqrt{2gy}C}$
$\implies \boxed{(1+y'^2)y = \frac{1}{2gC^2} = M^2} \tag*{(11)}$
This is the equation of the cycloid as per [4]; I am yet to figure out how this happened.
The solution of $(11)$ can be found using a parametric equation,
$\boxed{x(t) = \frac{1}{2}M^2(t-\sin t)\\ y(t)=\frac{1}{2}M^2(1-\cos t)} \tag*{(12)}$
To derive the above equation, please refer Math-stackexchange|Derive the parametric equations of a cycloid.
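
Short of a full derivation, we can at least check numerically that the parametric curve $(12)$ satisfies the differential equation, taking $(11)$ in the form $(1+y'^2)\,y=M^2$ that follows from the rearranged line just before it. On the curve, $y'=\frac{\mathrm{d}y/\mathrm{d}t}{\mathrm{d}x/\mathrm{d}t}=\frac{\sin t}{1-\cos t}$. The function names and the sample value of $M^2$ below are my own.

```python
import math

def cycloid(t, M2):
    """Point on the cycloid of equation (12); M2 stands for M^2."""
    x = 0.5 * M2 * (t - math.sin(t))
    y = 0.5 * M2 * (1 - math.cos(t))
    return x, y

def lhs_of_eq_11(t, M2):
    """(1 + y'^2) * y evaluated on the cycloid, for 0 < t < 2*pi."""
    _, y = cycloid(t, M2)
    dy_dx = math.sin(t) / (1 - math.cos(t))  # (dy/dt) / (dx/dt)
    return (1 + dy_dx**2) * y

M2 = 2.5
for t in (0.3, 1.0, 2.0, 3.0, 5.0):
    print(t, lhs_of_eq_11(t, M2))  # stays at M2 = 2.5 up to rounding
```

The left-hand side stays constant along the curve, which is exactly what $(11)$ demands.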

I have even plotted the solution $(12)$; we can see that it is a cycloid.

If you wish to play with the visualisation please go to Cycloid | Parametric @ Desmos by Pragyaditya

Cheers!

References:
1. Stationary Points
2. Euler Lagrange Equation
3. Derivation of Beltrami Identity
4. Introduction to calculus of variations
5. Brachistochrone @ Wolfram