\documentclass[12pt]{article}
\usepackage[french]{babel}
\voffset -2.8 cm \textwidth 17 cm \textheight 24.7 cm
\oddsidemargin -.54 cm \evensidemargin -1.54 cm
\usepackage[latin1]{inputenc}
\pagestyle{plain}
\newcommand{\vc}[1]{\mathbf{#1}}
\begin{document}
\centerline {\Large \sc Information and coding theory}
\bigskip \bigskip
\hrule
\bigskip \bigskip
\centerline {\underline{\Large Solution of exercise sheet n$^\circ$ 6}:}
\bigskip \bigskip
\noindent
{\bf 6-1.}
\par
\medskip
{\bf (a)} The size of the matrix is given by $n=6$ and $m=4$. $n$ corresponds to the length of the codewords. The rank of $H$ is equal to $m=4$ and corresponds to the number of parity bits. We define a codeword vector $\vc{x}$ with components $x_i, \; i=1,2,\ldots,n$. The condition $H\vc{x}=\vc{0}$ can be written as the system of equations:
$$
\left\{
\begin{array}{c}
x_1+x_5+x_6=0 \\
x_1 + x_2 + x_6 =0 \\
x_2 + x_3 + x_6=0 \\
x_1 + x_4 + x_6 =0
\end{array}
\right.
$$
We find the following solutions:
$$
\vc{x}=
\left( \begin{array}{c} 1 \\ 0 \\ 1\\ 0 \\ 0 \\ 1 \end{array}\right),
\left( \begin{array}{c} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 0 \end{array}\right),
\left( \begin{array}{c} 0\\ 1 \\ 0 \\ 1 \\ 1 \\ 1 \end{array}\right),
\left( \begin{array}{c} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{array}\right)
$$
The minimal distance between the codewords is $3$, so the code can only correct single errors.

{\bf (b)} To correct single errors and to detect double errors, the minimal Hamming distance has to be $d=4$. This corresponds to requiring that every set of $3$ columns of $H$ be linearly independent (see exercise 6-3). In particular, the last column of $H$ can neither be equal to another column of $H$ nor to the sum of any two other columns. There are $5$ columns given, and the number of sums of two of them is $\frac{5\cdot(5-1)}{2}=10$. Remember that the all-zero column is also forbidden, so this gives us $5+10+1=16$ different forbidden columns.
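As a sanity check of this counting argument, one can enumerate the forbidden sixth columns exhaustively. A minimal Python sketch, with the five columns of $H$ read off the parity-check equations of part (a):

```python
from itertools import combinations

# Columns c1..c5 of H, read off the parity-check equations of 6-1 (a)
# (rows of H: x1+x5+x6, x1+x2+x6, x2+x3+x6, x1+x4+x6)
cols = [(1, 1, 0, 1), (0, 1, 1, 0), (0, 0, 1, 0), (0, 0, 0, 1), (1, 0, 0, 0)]

# A sixth column is forbidden if it equals the zero column, one of the
# five given columns, or the sum (mod 2) of two of them
forbidden = {(0, 0, 0, 0)} | set(cols)
for a, b in combinations(cols, 2):
    forbidden.add(tuple((x + y) % 2 for x, y in zip(a, b)))

print(len(forbidden))  # 16 = 2**4: every candidate column is forbidden
```

The $5+10+1$ forbidden columns are indeed pairwise distinct, so all $2^4=16$ possible columns are excluded.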
Because the $h_{i,6}$ are bits, only $2^4=16$ different columns are possible, and they are all excluded by the argument above. Therefore the Hamming distance $d$ cannot be $4$.
\bigskip
\noindent
{\bf 6-2.}
\par
\medskip
{\bf (a)} The first $3$ columns of $G_1$ are linearly independent and correspond to $k=3$ information bits, while the last two columns correspond to $m=2$ parity bits. There are thus $2^k=8$ codewords. The parity-check matrix $H$ has $2$ rows (the number of parity bits) and $5$ columns (the length of the codewords), and it contains two linearly independent columns. $H$ can be found by requiring $H\vc{w}=\vc{0}$ for every codeword $\vc{w}$. We can try a solution of the form:
$$
\left(
\begin{array}{ccccc}
h_{1,1} & h_{1,2} & h_{1,3} & 1 & 0\\
h_{2,1} & h_{2,2} & h_{2,3} & 0 & 1
\end{array}
\right)
$$
We find:
$$
\left(
\begin{array}{ccccc}
1 & 1 & 1 & 1& 0 \\
0 & 1 & 1 & 0& 1
\end{array}
\right)
$$
Since every nonzero codeword has weight at least $2$, and weight $2$ is attained, the minimal distance between the words is $d=2$. This code allows the detection of single errors without correcting them.

{\bf (b)} For $G_2$ we have $n=4$ and $k=1$. The number of codewords is $2^{k}=2$, and $m=n-k=3$ corresponds to $3$ parity bits. Therefore $3$ columns of $H$ can be chosen to form the identity matrix. We find:
$$
\left(
\begin{array}{cccc}
1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1
\end{array}
\right)
$$
The code has only two codewords:
$$
\vc{x}=
\left( \begin{array}{c} 0 \\ 0 \\ 0\\ 0 \end{array}\right),
\left( \begin{array}{c} 1 \\ 1 \\ 1 \\ 1 \end{array}\right)
$$
The Hamming distance is $d=4$. This is a repetition code, which corrects single errors and detects double errors.

Remark: the transmission rate is given by $R=k/n$. We observe that $R_{G_1}=3/5$, but this code cannot correct any errors (it can only detect single errors). In contrast, $R_{G_2}=1/4$, but this code corrects single errors and detects double errors.
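Both counts and both minimal distances above can be verified numerically by enumerating the kernels of the two parity-check matrices. A short sketch (assuming the matrices $H$ found for $G_1$ and $G_2$ as written):

```python
from itertools import product

def codewords(H, n):
    """All x in GF(2)^n with H x = 0 (mod 2)."""
    return [x for x in product((0, 1), repeat=n)
            if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)]

def min_distance(words):
    # for a linear code, the minimal distance equals the
    # minimal weight of a nonzero codeword
    return min(sum(x) for x in words if any(x))

H1 = [[1, 1, 1, 1, 0],
      [0, 1, 1, 0, 1]]          # parity-check matrix found for G_1
H2 = [[1, 1, 0, 0],
      [1, 0, 1, 0],
      [1, 0, 0, 1]]             # parity-check matrix found for G_2

C1, C2 = codewords(H1, 5), codewords(H2, 4)
print(len(C1), min_distance(C1))  # 8 codewords, d = 2
print(len(C2), min_distance(C2))  # 2 codewords, d = 4
```

The brute-force enumeration is only practical for such small $n$, but it confirms the rate/correction trade-off stated in the remark.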
There is a trade-off between the transmission rate and the ability to correct errors.
\bigskip
\noindent
{\bf 6-3.}
\par
\medskip
A Hamming code corrects up to $e-1$ errors and detects (but does not necessarily correct) up to $e$ errors iff the minimal Hamming distance is $d=2e$. It remains to show that $d=2e$ is equivalent to the requirement that all sets of $2e-1$ columns of the parity matrix $H$ are linearly independent.
\newline
\par
If $\vc{w}_i$ is a codeword, one has $H\vc{w}_i=\vc{0}$. Let $\vc{w}_j=\vc{w}_k+\vc{z}$; then the number of $1$s in $\vc{z}$ (the weight $W(\vc{z})$) is equal to the distance $d_{jk}$ between $\vc{w}_j$ and $\vc{w}_k$. One has $H\vc{w}_j=H\vc{w}_k+H\vc{z}=\vc{0}$ and $H\vc{w}_k=\vc{0}$ since $\vc{w}_k$ is a codeword. One obtains $H\vc{z}=\vc{0}$: the $d_{jk}$ columns of $H$ selected by the nonzero components of $\vc{z}$ sum to zero, i.e.\ they are linearly dependent.
\newline
\par
But the minimal Hamming distance is $d=\min_{i,j}\{d_{ij}\}$, which means that $d$ is the smallest number of linearly \textbf{dependent} columns of $H$. This in turn means that all sets of $d-1$ columns of $H$ are linearly independent. Thus, $d=2e$ is equivalent to requiring that all sets of $2e-1$ columns of $H$ are linearly independent.
\bigskip
\noindent
{\bf 6-4.}
\par
\medskip
Let $\vc{w}_i$ be a codeword and $W(\vc{w}_i)$ its weight. The weight can be written as $W(\vc{w}_i)=d(\vc{w}_i,\vc{0})$. We are going to use the translation invariance of the distance: $d(\vc{w}_i,\vc{w}_j)=d(\vc{w}_i-\vc{w}_k,\vc{w}_j-\vc{w}_k)$. Setting $\vc{w}_k=\vc{w}_j$ one obtains $d(\vc{w}_i,\vc{w}_j)=d(\vc{w}_i-\vc{w}_j,\vc{0})$, where $\vc{w}_i-\vc{w}_j$ is also a codeword because the codewords form a group.
\newline
\newline
{\bf Proof.}
\newline
The definition of the minimal Hamming distance $d=\min_{i,j}\{d(\vc{w}_i,\vc{w}_j)\}$ implies in particular (taking $\vc{w}_j=\vc{0}$, which is always a codeword) that for every nonzero codeword $\vc{w}_i$: $d(\vc{w}_i,\vc{0})=W(\vc{w}_i)\geq d$.
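The claim being proved (minimal distance equals minimal nonzero weight) can be illustrated numerically on the code of exercise 6-1; a short sketch, with $H$ taken from that exercise:

```python
from itertools import product, combinations

# Code from exercise 6-1: the kernel of its parity-check matrix H over GF(2)
H = [[1, 0, 0, 0, 1, 1],
     [1, 1, 0, 0, 0, 1],
     [0, 1, 1, 0, 0, 1],
     [1, 0, 0, 1, 0, 1]]
C = [x for x in product((0, 1), repeat=6)
     if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)]

# minimal pairwise Hamming distance ...
d_min = min(sum(a != b for a, b in zip(u, v)) for u, v in combinations(C, 2))
# ... equals the minimal weight of a nonzero codeword
w_min = min(sum(x) for x in C if any(x))
print(d_min, w_min)  # both equal 3
```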
But it also implies that there is at least one pair $(l,m)$ such that $d(\vc{w}_l,\vc{w}_m)=d$ (the minimum is attained by at least one pair of codewords), so that one can write $d(\vc{w}_l-\vc{w}_m,\vc{0})=d$. There is thus a $k$ with $\vc{w}_k=\vc{w}_l-\vc{w}_m$ satisfying $W(\vc{w}_k)=d$.
\newline
\newline
Conversely, if one assumes that $W(\vc{w})\geq d$ for every nonzero codeword $\vc{w}$, then for all $i\neq j$ there exists a $k$ with $\vc{w}_{k}=\vc{w}_i-\vc{w}_j$ such that $d(\vc{w}_i,\vc{w}_j)=d(\vc{w}_i-\vc{w}_j,\vc{0})=W(\vc{w}_k)\geq d$. As there is a $z$ such that $W(\vc{w}_z)= d$, there is also a pair $(a,b)$ such that $d(\vc{w}_a,\vc{w}_b)=d(\vc{w}_a-\vc{w}_b,\vc{0})=W(\vc{w}_z)=d$. One thus obtains $d=\min_{i,j}\{d(\vc{w}_i,\vc{w}_j)\}$.
\end{document}