PROF. DR. MAHMOOD 2018-01-06 20
Chapter Two
1- Channel:
In telecommunications and computer networking, a communication channel, or simply
a channel, refers either to a physical transmission medium such as a wire, or to
a logical connection over a multiplexed medium such as a radio channel. A channel is
used to convey an information signal, for example a digital bit stream, from one or
several senders (or transmitters) to one or several receivers. A channel has a certain
capacity for transmitting information, often measured by its bandwidth in Hz or its data
rate in bits per second.
2- Symmetric channel:
A symmetric channel satisfies the following conditions:
a- Equal number of symbols in X and Y, i.e. P(Y∣X) is a square matrix.
b- Any row in the P(Y∣X) matrix is a permutation of the other rows.
For example the following conditional probability of various channel types as shown:
a- 𝑃(π‘Œ ∣ 𝑋) = [0.9  0.1
             0.1  0.9]
is a BSC, because it is a square matrix and the 1st row is a permutation of the 2nd row.
b- 𝑃( π‘Œ ∣ 𝑋 ) = [0 00..05 05 .9 0 00..05 05 .9 0 00..05 05 .9 ] is TSC, because it is square matrix and each
row is a permutation of others.
c- 𝑃(π‘Œ ∣ 𝑋) = [0.8  0.1  0.1
             0.1  0.8  0.1]
is non-symmetric since it is not square, although each row is a permutation of the others.
d- 𝑃(π‘Œ ∣ 𝑋) = [0.8  0.1  0.1
             0.1  0.7  0.2
             0.1  0.1  0.8]
is non-symmetric although it is square, since the 2nd row is not a permutation of the other rows.
2.1- Binary symmetric channel (BSC)
It is a common communications channel model used in coding theory and information
theory. In this model, a transmitter wishes to send a bit (a zero or a one), and the receiver
receives a bit. It is assumed that the bit is usually transmitted correctly, but that it will
be "flipped" with a small probability (the "crossover probability").
A binary symmetric channel with crossover probability p denoted by BSCp, is a
channel with binary input and binary output and probability of error p; that is, if X is the
transmitted random variable and Y the received variable, then the channel is
characterized by the conditional probabilities:
Pr( π‘Œ = 0 ∣ 𝑋 = 0 ) = 1 - 𝑃
Pr( π‘Œ = 0 ∣ 𝑋 = 1 ) = 𝑃
Pr( π‘Œ = 1 ∣ 𝑋 = 0 ) = 𝑃
Pr( π‘Œ = 1 ∣ 𝑋 = 1 ) = 1 - 𝑃
2.2- Ternary symmetric channel (TSC):
The transitional probability of TSC is:
𝑃( π‘Œ ∣ 𝑋 ) =
π‘₯1
π‘₯2
π‘₯3 [

1𝑃𝑦-𝑒1 2𝑃𝑒 1-𝑃𝑦𝑒22𝑃𝑒 𝑦𝑃 𝑃3𝑒 𝑒
𝑃𝑒 𝑃𝑒 1- 2𝑃𝑒

]
The TSC is symmetric but not very practical, since in practice π‘₯1 and π‘₯3 are not affected
as much as π‘₯2. In fact the interference between π‘₯1 and π‘₯3 is much less than the
interference between π‘₯1 and π‘₯2, or between π‘₯2 and π‘₯3.
[Figure: BSC transition diagram: 0 β†’ 0 and 1 β†’ 1 with probability 1-P; 0 β†’ 1 and 1 β†’ 0 with probability P]
Hence the more practical but nonsymmetric channel has the transition probability:
𝑃( π‘Œ ∣ 𝑋 ) =
π‘₯1
π‘₯2
π‘₯3 [1𝑃𝑦0𝑒1- 𝑃𝑒 1𝑃𝑒-𝑃𝑦𝑒221𝑃𝑒- 𝑃𝑦 0𝑃𝑒3𝑒]
Where π‘₯1 interfere with π‘₯2 exactly the same as interference between π‘₯2 and π‘₯3, but π‘₯1
and π‘₯3 are not interfere.
3- Discrete Memoryless Channel:
The Discrete Memoryless Channel (DMC) has an input X and an output Y. At any given
time (t), the channel output Y= y only depends on the input X = x at that time (t) and it
does not depend on the past history of the input. The DMC is represented by the conditional
probability of the output Y = y given the input X = x, i.e. P(Y∣X).
[Figure: TSC transition diagram: each 𝑋𝑖 β†’ π‘Œπ‘– with probability 1-2𝑃𝑒, and every other transition with probability 𝑃𝑒]
[Figure: transition diagram of the practical nonsymmetric channel: 𝑋1 β†’ π‘Œ1 and 𝑋3 β†’ π‘Œ3 with probability 1-𝑃𝑒, 𝑋2 β†’ π‘Œ2 with probability 1-2𝑃𝑒, and crossovers to/from 𝑋2 with probability 𝑃𝑒]
[Figure: block diagram of the DMC: input 𝑋 β†’ Channel 𝑃(π‘Œ ∣ 𝑋) β†’ output π‘Œ]
4- Binary Erasure Channel (BEC):
The Binary Erasure Channel (BEC) model is widely used to represent channels or links
that "lose" data. Prime examples of such channels are Internet links and routes. A
BEC has a binary input X and a ternary output Y.
Note that for the BEC, the probability of β€œbit error” is zero. In other words, the
following conditional probabilities hold for any BEC model:
Pr( π‘Œ = "π‘’π‘Ÿπ‘Žπ‘ π‘’π‘Ÿπ‘’" ∣ 𝑋 = 0 ) = 𝑃
Pr( π‘Œ = "π‘’π‘Ÿπ‘Žπ‘ π‘’π‘Ÿπ‘’" ∣ 𝑋 = 1 ) = 𝑃
Pr( π‘Œ = 0 ∣ 𝑋 = 0 ) = 1 - 𝑃
Pr( π‘Œ = 1 ∣ 𝑋 = 1 ) = 1 - 𝑃
Pr( π‘Œ = 0 ∣ 𝑋 = 1 ) = 0
Pr( π‘Œ = 1 ∣ 𝑋 = 0 ) = 0
5- Special Channels:
a- Lossless channel: It has only one nonzero element in each column of the
transitional matrix P(Y∣X).
𝑃( π‘Œ ∣ 𝑋 ) =
π‘₯1
π‘₯2
π‘₯3 [30/04𝑦1 0 10𝑦/24 1/𝑦3030 𝑦240/03 𝑦5100 ]
This channel has H(X∣Y)=0 and I(X, Y)=H(X) with zero losses entropy.
[Figure: BEC transition diagram: 𝑋1 β†’ π‘Œ1 and 𝑋2 β†’ π‘Œ2 with probability 1-𝑃, and 𝑋1, 𝑋2 β†’ Erasure with probability 𝑃]
b- Deterministic channel: It has only one nonzero element in each row of the
transitional matrix P(Y∣X), as an example:
𝑃( π‘Œ ∣ 𝑋 ) =
π‘₯1
π‘₯2
π‘₯3
[
𝑦1 𝑦2 𝑦3
1 0 0
1 0 0
0 0 1
0 1 0
0 1 0 ]
This channel has H(Y∣X)=0 and I(X, Y)=H(Y) with zero noise entropy.
c- Ideal channel: It has only one nonzero element in each row and each column of the
transitional matrix P(Y∣X), i.e. it is an identity matrix, as an example:
𝑃(π‘Œ ∣ 𝑋) =
      𝑦1  𝑦2  𝑦3
π‘₯1 [  1   0   0
π‘₯2    0   1   0
π‘₯3    0   0   1 ]
This channel has H(Y∣X)= H(X∣Y)=0 and I(Y, X)=H(Y)=H(X).
d- Noisy channel: No relation between input and output:
H( X ∣ Y ) = 𝐻(𝑋), H( Y ∣ X ) = 𝐻(π‘Œ)
𝐼(𝑋, π‘Œ) = 0, 𝐢 = 0
𝐻(𝑋, π‘Œ) = 𝐻(𝑋) + 𝐻(π‘Œ)
6- Shannon’s theorem:
a- A given communication system has a maximum rate of information C known as
the channel capacity.
b- If the information rate R is less than C, then one can approach arbitrarily small
error probabilities by using intelligent coding techniques.
c- To get lower error probabilities, the encoder has to work on longer blocks of
signal data. This entails longer delays and higher computational requirements.
Thus, if R ≀ C then transmission may be accomplished without error in the presence
of noise. The negation of this theorem is also true: if R > C, then errors cannot be
avoided regardless of the coding technique used.
Consider a bandlimited Gaussian channel operating in the presence of additive
Gaussian noise:
The Shannon-Hartley theorem states that the channel capacity is given by:
𝐢 = π΅π‘™π‘œπ‘”2 (1 + 𝑁𝑆 )
Where C is the capacity in bits per second, B is the bandwidth of the channel in Hertz,
and S/N is the signal-to-noise ratio.
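The formula is a one-liner in code. A sketch (the 3 kHz bandwidth and 30 dB SNR are illustrative numbers, not from the text):

```python
from math import log2

def shannon_capacity(bandwidth_hz, snr):
    """Shannon-Hartley capacity C = B*log2(1 + S/N) in bits/second.
    snr is the linear (not dB) signal-to-noise ratio."""
    return bandwidth_hz * log2(1 + snr)

# A 3 kHz telephone-grade channel at S/N = 1000 (i.e. 30 dB):
c = shannon_capacity(3000, 1000)
```

Note that S/N here is a power ratio; an SNR quoted in dB must first be converted with S/N = 10^(dB/10).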
7- Channel Capacity (Discrete channel)
This is defined as the maximum of I(X,Y):
𝐢 = π‘β„Žπ‘Žπ‘›π‘›π‘’π‘™ π‘π‘Žπ‘π‘Žπ‘π‘–π‘‘π‘¦ = max[𝐼(𝑋, π‘Œ)] 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
Physically it is the maximum amount of information each symbol can carry to the
receiver. Sometimes this capacity is also expressed in bits/sec if related to the rate of
producing symbols r:
𝑅(𝑋, π‘Œ) = π‘Ÿ Γ— 𝐼(𝑋, π‘Œ) 𝑏𝑖𝑑𝑠/ sec π‘œπ‘Ÿ 𝑅(𝑋, π‘Œ) = 𝐼(𝑋, π‘Œ)/ πœΜ…
a- Channel capacity of Symmetric channels:
The channel capacity is defined as max[𝐼(𝑋, π‘Œ)]:
𝐼(𝑋, π‘Œ) = 𝐻(π‘Œ) - 𝐻( π‘Œ ∣ 𝑋 )
𝐼(𝑋, π‘Œ) = 𝐻(π‘Œ) + βˆ‘ βˆ‘ 𝑃(π‘₯𝑖, 𝑦𝑗)
𝑛
𝑖=1
π‘š
𝑗=1
log2𝑃(𝑦𝑗 ∣ π‘₯𝑖)
But we have
𝑃(π‘₯𝑖, 𝑦𝑗) = 𝑃(π‘₯𝑖)𝑃(𝑦𝑗 ∣ π‘₯𝑖) 𝑝𝑒𝑑 𝑖𝑛 π‘Žπ‘π‘œπ‘£π‘’ π‘’π‘žπ‘’π‘Žπ‘‘π‘–π‘œπ‘› 𝑦𝑖𝑒𝑙𝑑𝑒𝑠:
𝐼(𝑋, π‘Œ) = 𝐻(π‘Œ) + βˆ‘ βˆ‘ 𝑃(π‘₯𝑖)𝑃(𝑦𝑗 ∣ π‘₯𝑖)
𝑛
𝑖=1
π‘š
𝑗=1
log2𝑃(𝑦𝑗 ∣ π‘₯𝑖)
If the channel is symmetric the quantity:
βˆ‘_{𝑗=1}^{π‘š} 𝑃(𝑦𝑗 ∣ π‘₯𝑖) log2 𝑃(𝑦𝑗 ∣ π‘₯𝑖) = 𝐾
where K is a constant independent of the row number i, so that the equation
becomes:
𝐼(𝑋, π‘Œ) = 𝐻(π‘Œ) + 𝐾 βˆ‘ 𝑃(π‘₯𝑖)
𝑛
𝑖=1
Hence 𝐼(𝑋, π‘Œ) = 𝐻(π‘Œ) + 𝐾 for symmetric channels
Max of 𝐼(𝑋, π‘Œ) = max[𝐻(π‘Œ) + 𝐾] = max[𝐻(π‘Œ)] + 𝐾
When Y has equiprobable symbols then max[𝐻(π‘Œ)] = π‘™π‘œπ‘”2π‘š
Then
𝐼(𝑋, π‘Œ) = π‘™π‘œπ‘”2π‘š + 𝐾
Or
𝐢 = π‘™π‘œπ‘”2π‘š + 𝐾
Example 9:
For the BSC shown:
Find the channel capacity and efficiency if 𝐼(π‘₯1) = 2𝑏𝑖𝑑𝑠
Solution:
𝑃(π‘Œ ∣ 𝑋) = [0.7  0.3
           0.3  0.7]
[Figure: BSC transition diagram: 𝑋1 β†’ π‘Œ1 and 𝑋2 β†’ π‘Œ2 with probability 0.7, crossovers with probability 0.3]
Since the channel is symmetric then
𝐢 = π‘™π‘œπ‘”2π‘š + 𝐾 and 𝑛 = π‘š
π‘€β„Žπ‘’π‘Ÿπ‘’ 𝑛 π‘Žπ‘›π‘‘ π‘š π‘Žπ‘Ÿπ‘’ π‘›π‘’π‘šπ‘π‘’π‘Ÿ π‘Ÿπ‘œπ‘€ π‘Žπ‘›π‘‘ π‘π‘œπ‘™π‘’π‘šπ‘› π‘Ÿπ‘’π‘π‘’π‘ π‘‘π‘–π‘£π‘’π‘™π‘¦
𝐾 = 0.7π‘™π‘œπ‘”20.7 + 0.3π‘™π‘œπ‘”20.3 = -0.88129
𝐢 = 1 - 0.88129 = 0.1187 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
The channel efficiency πœ‚ = 𝐼(𝑋, π‘Œ)/𝐢
𝐼(π‘₯1) = -π‘™π‘œπ‘”2𝑃(π‘₯1) = 2
𝑃(π‘₯1) = 2-2 = 0.25 π‘‘β„Žπ‘’π‘› 𝑃(𝑋) = [0.25 0.75]𝑇
And we have 𝑃(π‘₯𝑖, 𝑦𝑗) = 𝑃(π‘₯𝑖)𝑃(𝑦𝑗 ∣ π‘₯𝑖) so that
𝑃(𝑋, π‘Œ) = [0.7 Γ— 0.25  0.3 Γ— 0.25] = [0.175  0.075]
          [0.3 Γ— 0.75  0.7 Γ— 0.75]   [0.225  0.525]
𝑃(π‘Œ) = [0.4 0.6] β†’ 𝐻(π‘Œ) = 0.97095 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
𝐼(𝑋, π‘Œ) = 𝐻(π‘Œ) + 𝐾 = 0.97095 - 0.88129 = 0.0896 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
Then πœ‚ = 0.0896
0.1187
= 75.6%
To find the channel redundancy:
𝑅 = 1 - πœ‚ = 1 - 0.756 = 0.244 π‘œπ‘Ÿ 24.4%
b- Channel capacity of nonsymmetric channels:
We can find the channel capacity of nonsymmetric channel by the following
steps:
a- Find I(X, Y) as a function of the input probabilities:
𝐼(𝑋, π‘Œ) = 𝑓(𝑃(π‘₯1), 𝑃(π‘₯2), … , 𝑃(π‘₯𝑛))
and use the constraint βˆ‘ 𝑃(π‘₯𝑖) = 1 to reduce the number of variables by one.
b- Partially differentiate I(X, Y) with respect to the (n-1) remaining input
probabilities, then equate these partial derivatives to zero.
c- Solve the (n-1) equations simultaneously then find
𝑃(π‘₯1), 𝑃(π‘₯2) … . . . . , 𝑃(π‘₯𝑛) that gives maximum I(X, Y).
d- Put resulted values of input prob. in the function given in step 1 to find 𝐢 =
max[𝐼(𝑋, π‘Œ)].
Example 10:
Find the channel capacity for the channel having the following transition:
𝑃(π‘Œ ∣ 𝑋) = [0.7  0.3
           0.1  0.9]
Assume that 𝑃(π‘₯1) = 𝑃 β‰… 0.862.
Solution: First note that the channel is not symmetric since the 1st row is not a
permutation of the 2nd row.
a- Let 𝑃(π‘₯1) = 0.862 , π‘‘β„Žπ‘’π‘› 𝑃(π‘₯2) = 1 - 𝑃 = 0.138.
𝑃(𝑋, π‘Œ) = 𝑃(𝑋) Γ— 𝑃( π‘Œ ∣ 𝑋 )
∴ 𝑃(𝑋, π‘Œ) = [0.7 Γ— 0.862  0.3 Γ— 0.862] = [0.6034  0.2586]
            [0.1 Γ— 0.138  0.9 Γ— 0.138]   [0.0138  0.1242]
From above results 𝑃(π‘Œ) = [0.6172 0.3828]
𝐻(π‘Œ) = - βˆ‘ 𝑃(𝑦𝑗)
π‘š
𝑗=1
log2𝑃(𝑦𝑗)
∴ 𝐻(π‘Œ) = 0.960 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
We have
𝐻(π‘Œ ∣ 𝑋) = - βˆ‘ βˆ‘ 𝑃(π‘₯𝑖, 𝑦𝑗)
𝑛
𝑖=1
π‘š
𝑗=1
log2𝑃(𝑦𝑗 ∣ π‘₯𝑖)
𝐻(π‘Œ ∣ 𝑋) = -[0.6034 𝑙𝑛0.7 + 0.2586𝑙𝑛0.3 + 0.0138𝑙𝑛0.1 + 0.1242𝑙𝑛0.9]/𝑙𝑛
H( Y ∣ X ) = 0.8244 bits/symbol
𝐼(𝑋, π‘Œ) = 𝐻(π‘Œ) - H( Y ∣ X ) = 0.96021 - 0.8244 = 0.1358 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
for the assumed input distribution.
Review questions:
A binary source sends π‘₯1 with a probability of 0.4 and π‘₯2 with a probability of 0.6
through a channel with error probabilities of 0.1 for π‘₯1 and 0.2 for π‘₯2. Determine:
1- Source entropy.
2- Marginal entropy.
3- Joint entropy.
4- Conditional entropy 𝐻(π‘Œ ∣ 𝑋).
5- Loss entropy 𝐻(𝑋 ∣ π‘Œ).
6- Transinformation.
Solution:
1- The channel diagram:
Or 𝑃(π‘Œ ∣ 𝑋) = [0.9  0.1
              0.2  0.8]
𝐻(𝑋) = - βˆ‘_{𝑖=1}^{𝑛} 𝑝(π‘₯𝑖) log2 𝑝(π‘₯𝑖)
𝐻(𝑋) = -[0.4 ln(0.4) + 0.6 ln(0.6)]/ln 2 = 0.971 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
2- 𝑃(𝑋, π‘Œ) = 𝑃(π‘Œοƒ΄X) Γ— 𝑃(𝑋)
∴ 𝑃(𝑋, π‘Œ) = [0 0..9 2 Γ— Γ— 0 0..4 6 0 0..1 8 Γ— Γ— 0 0..4 6] = [0 0..36 12 0 0..04 48]
∴ 𝑃(π‘Œ) = [0.48 0.52]
[Figure: channel diagram: 𝑃(π‘₯1) = 0.4, 𝑃(π‘₯2) = 0.6; π‘₯1 β†’ 𝑦1: 0.9, π‘₯1 β†’ 𝑦2: 0.1, π‘₯2 β†’ 𝑦2: 0.8, π‘₯2 β†’ 𝑦1: 0.2]
𝐻(π‘Œ) = - βˆ‘ 𝑝(𝑦𝑗)
π‘š
𝑗=1
π‘™π‘œπ‘”2𝑝(𝑦𝑗)
𝐻(π‘Œ) = -
[0.48 ln(0.48) + 0.52 ln(0.52)]
ln(2) = 0.999 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
3- 𝐻(𝑋, π‘Œ)
𝐻(𝑋, π‘Œ) = - βˆ‘ βˆ‘ 𝑃(π‘₯𝑖, 𝑦𝑗)
𝑛
𝑖=1
π‘š
𝑗=1
log2𝑃(π‘₯𝑖, 𝑦𝑗)
𝐻(𝑋, π‘Œ) = -
[0.36 ln(0.36) + 0.04 ln(0.04) + 0.12 ln(0.12) + 0.48 ln(0.48)]
ln(2)
= 1.592 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
4- 𝐻(π‘Œοƒ΄X)
𝐻(π‘Œοƒ΄X) == - βˆ‘ βˆ‘ 𝑃(π‘₯𝑖, 𝑦𝑗)
𝑛
𝑖=1
π‘š
𝑗=1
log2𝑃(𝑦𝑗π‘₯𝑖)
𝐻(π‘Œοƒ΄X) = - [0.36 ln(0.9) + 0.12 ln(0.2ln ) +(20).04 ln(0.1) + 0.48 ln(0.8)]
= 0.621
𝑏𝑖𝑑𝑠
π‘ π‘¦π‘šπ‘π‘œπ‘™
Or 𝐻(π‘Œοƒ΄X) = 𝐻(𝑋, π‘Œ) - 𝐻(𝑋) = 1.592 - 0.971 = 0.621 𝑏𝑖𝑑𝑠
π‘ π‘¦π‘šπ‘π‘œπ‘™
5- 𝐻(𝑋 ∣ π‘Œ) = 𝐻(𝑋, π‘Œ) - 𝐻(π‘Œ) = 1.592 - 0.999 = 0.593 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
6- 𝐼(𝑋, π‘Œ) = 𝐻(𝑋) - 𝐻(𝑋 ∣ π‘Œ) = 0.971 - 0.593 = 0.378 𝑏𝑖𝑑𝑠/π‘ π‘¦π‘šπ‘π‘œπ‘™
8- Cascading of Channels
If two channels are cascaded, then the overall transition matrix is the product of the two
transition matrices.
𝑃(𝑍 ∣ 𝑋) = 𝑃(π‘Œ ∣ 𝑋) Β· 𝑃(𝑍 ∣ π‘Œ)
where 𝑃(𝑍 ∣ 𝑋) is an (𝑛 Γ— π‘˜) matrix, 𝑃(π‘Œ ∣ 𝑋) is (𝑛 Γ— π‘š) and 𝑃(𝑍 ∣ π‘Œ) is (π‘š Γ— π‘˜).
[Figure: two cascaded channels: inputs 1…n β†’ Channel 1 β†’ intermediate symbols 1…m β†’ Channel 2 β†’ outputs 1…k]
For the series information channel, the overall channel capacity cannot exceed that of
any individual channel:
𝐼(𝑋, 𝑍) ≀ 𝐼(𝑋, π‘Œ) and 𝐼(𝑋, 𝑍) ≀ 𝐼(π‘Œ, 𝑍)
Example:
Find the transition matrix 𝑃(𝑍 ∣ 𝑋) for the cascaded channel shown.
𝑃(π‘Œ ∣ 𝑋) = [0.3  0    0.7
            0.8  0.2  0  ]
𝑃(𝑍 ∣ π‘Œ) = [0.7  0.3
            1    0
            1    0  ]
𝑃(𝑍 ∣ 𝑋) = 𝑃(π‘Œ ∣ 𝑋) Β· 𝑃(𝑍 ∣ π‘Œ) = [0.91  0.09
                                  0.76  0.24]
[Figure: cascaded channel diagram with Channel 1 branch probabilities 0.3, 0.7, 0.8, 0.2 and Channel 2 branch probabilities 0.7, 0.3, 1, 1]
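The matrix product of the cascade example can be checked with a few lines of pure Python (no libraries assumed):

```python
def matmul(A, B):
    """Transition-matrix product: (n x m) . (m x k) -> (n x k)."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

P_yx = [[0.3, 0.0, 0.7],   # Channel 1: 2 inputs, 3 intermediate symbols
        [0.8, 0.2, 0.0]]
P_zy = [[0.7, 0.3],        # Channel 2: 3 intermediate symbols, 2 outputs
        [1.0, 0.0],
        [1.0, 0.0]]

P_zx = matmul(P_yx, P_zy)               # overall transition matrix
row_sums = [sum(row) for row in P_zx]   # each row must still sum to 1
```

Since each factor is a stochastic matrix, the product is stochastic too, so the row-sum check is a quick sanity test on any cascade.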