Direct Video Broadcast (DVB) Systems
Slide: Courtesy, Hung Nguyen
Processing of The Streams in The Set-Top Box (STB)
Multimedia Communications
Standards and Applications
Video Coding Standards
• ITU H.261 for Video Teleconference (VTC)
• ITU H.263 for VTC over POTS
• ITU H.262 for VTC over ATM/broadband and digital TV networks
• ISO MPEG-1 for movies on CD-ROM (VCD)
– 1.2 Mbps for video coding and 256 kbps for audio coding
• ISO MPEG-2 for broadcast quality video on DVD
– 2-15 Mbps allocated for audio and video coding
• Low-bit-rate telephony over POTS
– 10 kbps for video and 5.3 kbps for audio
• Internet and mobile communication: MPEG-4
– Very Low Bit Rate (VLBR) code to be compatible with H.263
• Multimedia content description interface: MPEG-7
– Description schemes and description definition language for integrated multimedia search engine
History
• H.261:
– First video coding standard, targeted for video conferencing over ISDN
– Uses block-based hybrid coding framework with integer-pixel MC
• H.263:
– Improved quality at lower bit rate, to enable video conferencing/telephony below 54 kbps (modems, desktop conferencing)
– Half-pixel MC and other improvements
• MPEG-1 video
– Video on CD and video on the Internet (good quality at 1.5 Mbps)
– Half-pixel MC and bidirectional MC
• MPEG-2 video
– SDTV/HDTV/DVD (4-15 Mbps)
– Extended from MPEG-1, considering interlaced video
Video compression principles
• Video: moving pictures; the terms “frame” and “picture” are used interchangeably.
• One approach to compressing a video source is to apply the JPEG algorithm to each frame independently. This approach is known as moving JPEG or MJPEG.
• The compression ratios obtained with JPEG are typically between 10:1 and 20:1, neither of which is large enough on its own to produce the compression ratios needed.
• Redundancy is often present between a set of frames
• Example:
– movement of a person’s lips or eyes in a video telephony application
– a person or vehicle moving across the screen in a movie (only a small portion of each frame is involved with any motion that is taking place)
• Hence, send only the information relating to those segments of each frame that have movement associated with them (considerable additional savings in bandwidth can be made by exploiting the temporal differences that exist between many of the frames).
• Just a selection of frames is sent in individually compressed form and, for the remaining frames, only the differences between the actual frame contents and the predicted frame contents are sent.
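The idea of sending only what changed can be sketched as follows. This is a minimal illustration and not the MPEG algorithm itself: frames are tiny grayscale grids, the prediction is assumed to be simply the previous frame, and the function names `frame_difference` and `reconstruct` are hypothetical.

```python
def frame_difference(previous, current):
    """Per-pixel difference between the current frame and the previous one."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(previous, current)]

def reconstruct(previous, difference):
    """Rebuild the current frame from the previous frame plus the difference."""
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(previous, difference)]

# Two 4x4 grayscale frames; only a small region changes between them.
frame1 = [[10, 10, 10, 10],
          [10, 50, 50, 10],
          [10, 50, 50, 10],
          [10, 10, 10, 10]]
frame2 = [row[:] for row in frame1]
frame2[1][1] = 60
frame2[1][2] = 65

diff = frame_difference(frame1, frame2)
nonzero = sum(1 for row in diff for v in row if v != 0)
print(nonzero, "of 16 difference values are non-zero")
```

Most difference values are zero, which is what makes the difference frame cheaper to code than the frame itself.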
Sub-sampling of Chrominance Information
• Transforming (R,G,B)->(Y,Cb,Cr) provides two advantages:
• 1) The human visual system (HVS) is more sensitive to the Y component than to the Cb or Cr components.
• 2) Cb and Cr are far less correlated with Y than R with G, R with B, and B with G, thus reducing TV transmission bandwidths.
• Cb and Cr both require far less bandwidth and can be sampled more coarsely (Shannon).
• By doing so we can reduce data without affecting visual quality as perceived by the viewer.
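Coarser chrominance sampling can be sketched as below. The slides do not name a scheme or a filter, so this assumes 2:1 sub-sampling in both directions (4:2:0 style) with simple 2x2 averaging; `subsample_420` is a hypothetical helper name.

```python
def subsample_420(plane):
    """Average each 2x2 neighbourhood of a chroma plane (even dimensions assumed)."""
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4   # +2 rounds to nearest
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

cb = [[100, 102, 110, 112],
      [104, 106, 114, 116],
      [120, 120, 130, 130],
      [120, 120, 130, 130]]
cb_sub = subsample_420(cb)

# Sample count for a 4x4 region: full-resolution Y, Cb, Cr versus sub-sampled chroma.
full_samples = 3 * 16
sub_samples = 16 + 2 * (len(cb_sub) * len(cb_sub[0]))
print(cb_sub, full_samples, sub_samples)
```

Under these assumptions the sample count is halved before any transform coding takes place.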
Color Space Conversion
• In general, each pixel in a picture consists of three components: R (Red), G (Green), B (Blue). (R,G,B) must be converted to (Y,Cb,Cr) in MPEG-1 before processing
• We can view the color value of each pixel from RGB color space, or YCbCr color space
• Because (Y,Cb,Cr) is less correlated than (R,G,B), coding using (Y,Cb,Cr) components is more efficient.
• (Y,U,V) can also be used to denote (Y,Cb,Cr), however it most appropriately represents the analog TV equivalent
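A minimal sketch of the conversion follows. The slides give no coefficients, so the full-range ITU-R BT.601 matrix (as used in JPEG/JFIF) is assumed here; a real MPEG-1 encoder may use a studio-range variant with different offsets and scaling.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion (assumed variant), inputs in 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr under the same assumption."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b

gray = rgb_to_ycbcr(128, 128, 128)       # neutral gray: chroma sits at mid-scale
original = (200, 30, 90)
round_trip = ycbcr_to_rgb(*rgb_to_ycbcr(*original))
print(gray, round_trip)
```

For a neutral gray both chrominance components land on the mid-scale value, so all of the picture information is carried by Y.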
Macroblock structure
The basic coding unit is an 8 by 8 matrix block.
A macroblock consists of six blocks:
4 blocks of luminance (Y),
one block of Cb chrominance, and
one block of Cr chrominance
Macro Blocks & Color Sub-sampling Schemes
A macroblock covers four 8x8 pixel blocks (a 16x16 pixel area)
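The six-block layout can be sketched as follows, assuming a 16x16 luminance area with one already sub-sampled 8x8 block each for Cb and Cr. The function name `split_macroblock` and the block ordering (Y top-left, top-right, bottom-left, bottom-right, then Cb, Cr) are assumptions made for this illustration.

```python
def split_macroblock(y16, cb8, cr8):
    """Return the six 8x8 blocks of one macroblock: four Y blocks, then Cb, then Cr."""
    def block(top, left):
        return [row[left:left + 8] for row in y16[top:top + 8]]
    return [block(0, 0), block(0, 8), block(8, 0), block(8, 8), cb8, cr8]

# Synthetic data: luminance values encode their own position, chroma is flat.
y16 = [[16 * r + c for c in range(16)] for r in range(16)]
cb8 = [[128] * 8 for _ in range(8)]
cr8 = [[128] * 8 for _ in range(8)]

blocks = split_macroblock(y16, cb8, cr8)
print(len(blocks), [b[0][0] for b in blocks])
```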
Picture Frames - Overview
Three frame types:
• I-Picture (Intra-frame picture),
• P-Picture (Inter-frame predicted picture)
• B-Picture (Bi-directional predicted-interpolated picture)
I-frames
• Are encoded without reference to any other
frames.
• Each frame is treated as a separate (digitized) picture and the Y, Cb and Cr matrices are encoded independently using the JPEG algorithm (DCT, quantization, entropy encoding) except that the quantization threshold values that are used are the same for all DCT coefficients.
• Hence the level of compression obtained with I-frames is relatively small.
P-frames
• The encoding of a P-frame is relative to the contents of either a preceding I-frame or a preceding P-frame.
• P-frames are encoded using a combination of motion estimation and motion compensation
B-frames
• Their contents are predicted using search regions in both past and future frames.
• Allowing for occasional moving objects, this also provides better motion estimation.
Group of pictures or GOP:
The number of frames/pictures between successive I-frames
It is given the symbol N and typical values for N are from 3 through to 12.
Example Frame Sequences
I and P Frames Only
I, P and B Frames
The number of frames between a P-frame and the immediately preceding I- or P-frame is called the prediction span.
It is given the symbol M (1 & 3)
• A typical sequence of frames involving just I- and P-frames is shown in Figure 4.11(a) and a sequence involving all three frame types is shown in part (b) of the figure.
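The two kinds of sequence can be generated from N and M with a short sketch. It assumes N is the distance from one I-frame to the next and M the distance between successive I- or P-frames; `gop_pattern` is a hypothetical helper and the patterns are illustrative rather than copied from Figure 4.11.

```python
def gop_pattern(n, m):
    """Frame types, in display order, for one GOP of length n with prediction span m."""
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")        # each GOP starts with an intracoded frame
        elif i % m == 0:
            types.append("P")        # every m-th frame is a predicted frame
        else:
            types.append("B")        # frames in between are bidirectional
    return "".join(types)

i_and_p_only = gop_pattern(6, 1)
all_three = gop_pattern(12, 3)
print(i_and_p_only, all_three)
```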
• For P-frames, the contents are encoded by considering the contents of the current (uncoded) frame relative to the contents of the immediately preceding (uncoded) frame.
• For B-frames, however, three (uncoded) frame contents are involved: the immediately preceding I- or P-frame, the current frame being encoded, and the immediately succeeding I- or P-frame.
• This results in an increase in the encoding (and decoding) delay which is equal to the time to wait for the next I- or P-frame in the sequence.
Decoding P-frames
• With P-frames, the received information is first decoded and the resulting information is then used, together with the decoded contents of the preceding I- or P-frame, to derive the decoded frame contents.
Decoding B-frames
• In the case of B-frames, the received information is first decoded and the resulting information is then used, together with both the immediately preceding I- or P-frame contents and the immediately succeeding P- or I-frame contents, to derive the decoded frame contents.
• Hence in order to minimize the time required to decode each B-frame, the order of encoding (and transmission) of the (encoded) frames is changed so that both the preceding and succeeding I- or P-frames are available when the B-frame is received.
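The reordering described above can be sketched as follows. Frame labels such as "B1" and the helper name `transmission_order` are hypothetical; the sketch handles a single sequence and simply appends any trailing B-frames, which in a real stream would wait for the first frame of the next GOP.

```python
def transmission_order(display):
    """Reorder frame labels so each B-frame follows both of its reference frames."""
    out, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)      # hold until the succeeding I- or P-frame is sent
        else:
            out.append(frame)            # send the reference frame first
            out.extend(pending_b)        # then the B-frames that depend on it
            pending_b = []
    return out + pending_b               # trailing B-frames (would wait for the next GOP)

display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
sent = transmission_order(display)
print(sent)
```

The decoder receives P3 before B1 and B2, so both reference frames are already decoded when each B-frame arrives.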
Frame Types
• There are two basic types of compressed frame:
– those that are encoded independently
– those that are predicted.
• Intracoded Frames -> I-Frames
– Level of compression is relatively small: 10:1 to 20:1
– Present at regular intervals to limit extent of errors
– Number of frames between I-frames is known as the Group of pictures (GOP)
• Intercoded Frames (interpolation frames)
– Predicted Frames -> P-Frames
• Significant compression level achieved here
• Errors are propagated
• 20:1 to 30:1 compression ratio
– Bidirectional Frames -> B-Frames
• Highest levels of compression achieved
• B-frames are not used for prediction, thus errors are not propagated
• 30:1 to 50:1 compression ratio
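As a rough worked example, the per-frame-type ratios can be combined into an overall ratio for a sequence. The sketch assumes every raw frame has the same size and uses the low end of each quoted range; the frame pattern and the helper name `overall_ratio` are illustrative assumptions.

```python
def overall_ratio(pattern, ratios):
    """Overall compression ratio of a frame sequence with equal-sized raw frames."""
    # Each compressed frame occupies 1/ratio of a raw frame.
    compressed = sum(1.0 / ratios[t] for t in pattern)
    return len(pattern) / compressed

ratios = {"I": 10.0, "P": 20.0, "B": 30.0}      # low end of each quoted range
r_i_only = overall_ratio("IIIIIIIIIIII", ratios)
r_mixed = overall_ratio("IBBPBBPBBPBB", ratios)
print(round(r_i_only, 2), round(r_mixed, 2))
```

Under these assumptions, mixing in P- and B-frames more than doubles the overall ratio compared with coding every frame as an I-frame.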
Motion Compensation (MC) And Motion Estimation (ME)
• Motion Estimation predicts the values of a block of pixels in the next picture using a block in the current picture. The location difference between these blocks is called the Motion Vector, and the difference between the two blocks is called the prediction error.
• In MPEG-1, the encoder must calculate the motion vector and the prediction error. When the decoder obtains this information, it can use it together with the current picture to reconstruct the next picture.
• We usually call this process Motion Compensation. In general, motion compensation is the inverse process of motion estimation.
Motion Compensation
• Try to match each block in the actual picture to content in the previous picture. Matching is made by shifting each of the 8 x 8 blocks of the two successive pictures pixel by pixel in each direction -> Motion vector
• Subtract the two blocks -> Difference block
• Transmit the motion vector and the difference block
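The three steps can be sketched as an exhaustive block-matching search. This is an illustration under stated assumptions, not the procedure of any particular encoder: the match criterion is the sum of absolute differences (SAD), the search window and the 4x4 block size are chosen to keep the example small, and all function names are hypothetical.

```python
def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def estimate_motion(prev, cur, top, left, size, search):
    """Full search: displacement into prev that best matches the block in cur."""
    target = get_block(cur, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > len(prev) or l + size > len(prev[0]):
                continue                          # candidate falls outside the frame
            cost = sad(target, get_block(prev, t, l, size))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    dy, dx = best[1]
    ref = get_block(prev, top + dy, left + dx, size)
    diff = [[c - r for c, r in zip(crow, rrow)] for crow, rrow in zip(target, ref)]
    return (dy, dx), diff                         # motion vector and difference block

def compensate(prev, top, left, size, vector, diff):
    """Decoder side: rebuild the block from the reference block plus the difference."""
    dy, dx = vector
    ref = get_block(prev, top + dy, left + dx, size)
    return [[r + d for r, d in zip(rrow, drow)] for rrow, drow in zip(ref, diff)]

# A bright 2x2 object moves one pixel right and one pixel down between frames.
prev = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in range(2):
    for x in range(2):
        prev[2 + y][2 + x] = 200
        cur[3 + y][3 + x] = 200

vector, diff = estimate_motion(prev, cur, top=2, left=2, size=4, search=2)
rebuilt = compensate(prev, 2, 2, 4, vector, diff)
print(vector)
```

Here the vector points from the block's position in the current picture back to its best match in the previous picture, and the difference block is all zeros because the object moved without changing.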
Motion estimation
Estimating any movement between successive frames.
(The accuracy of the prediction operation?)
Motion compensation
Additional information must also be sent to indicate any small differences between the predicted and actual positions of the moving segments involved.
Motion Estimation (ME)
Motion Compensation (MC)
P-Frame Encoding: Macroblock Structure
P-Frame Encoding: Encoding Procedure
DCT (discrete cosine transform)
• DCT is used to convert data from the spatial domain to data in the frequency domain. The higher frequency coefficients can be more coarsely quantized without a perceived loss of image quality due to the fact that the HVS is less sensitive to the higher frequencies and they contain less energy.
• The DCT coefficient at location (0,0) is called the DC coefficient and the other values are called AC coefficients. In general, we use a large quantization step in quantizing the higher AC coefficients. Higher precision is required for the DC term in order to avoid blocking in the reconstructed image.
• In MPEG-1, we use an 8*8 DCT. By using this transform we can convert an 8 by 8 pixel block to another 8 by 8 block. In general most of the energy (value) is concentrated in the top-left corner.
• After quantizing the transformed matrix, most data in this matrix may be zero; using zig-zag order scan and run length coding can then achieve a high compression ratio.
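The energy concentration can be checked numerically with a direct (slow) 8x8 DCT. This is a sketch assuming the usual orthonormal type-II DCT; the function names and the smooth test block are made up for the illustration, and real coders use fast factorizations instead of the four nested loops.

```python
import math

def c(k):
    """Normalization factor: 1/sqrt(2) for the zero-frequency term, 1 otherwise."""
    return math.sqrt(0.5) if k == 0 else 1.0

def basis(i, k):
    return math.cos((2 * i + 1) * k * math.pi / 16)

def dct_2d(block):
    """Forward 8x8 DCT: spatial samples block[x][y] -> coefficients out[u][v]."""
    return [[0.25 * c(u) * c(v) *
             sum(block[x][y] * basis(x, u) * basis(y, v)
                 for x in range(8) for y in range(8))
             for v in range(8)] for u in range(8)]

def idct_2d(coeffs):
    """Inverse 8x8 DCT: coefficients coeffs[u][v] -> spatial samples out[x][y]."""
    return [[0.25 * sum(c(u) * c(v) * coeffs[u][v] * basis(x, u) * basis(y, v)
                        for u in range(8) for v in range(8))
             for y in range(8)] for x in range(8)]

# A smooth block: brightness rises gently towards the bottom-right corner.
ramp = [[100 + 2 * x + 2 * y for y in range(8)] for x in range(8)]
coeffs = dct_2d(ramp)
total = sum(v * v for row in coeffs for v in row)
dc_share = coeffs[0][0] ** 2 / total       # fraction of the energy in the DC term
restored = idct_2d(coeffs)
print(round(coeffs[0][0], 1), round(dc_share, 4))
```

For this smooth block nearly all of the energy sits in the DC coefficient, and the inverse transform recovers the original samples up to floating-point error.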
Transform Coding (TC)
• Pack the signal energy into as few transform coefficients as possible
• The DCT yields nearly optimal energy concentration
• A 2-dimensional DCT with block size of 8x8 pixels is commonly used in today’s image coders
• Transform is followed by quantization and entropy coding
2D DCT and IDCT
u, v, x, y = 0, 1, 2, ..., 7
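For reference, the 8x8 two-dimensional DCT and its inverse are commonly written as below. This is the standard textbook form, assumed here rather than taken from the slide; f(x, y) for a pixel value, F(u, v) for a transform coefficient and C(k) for the normalization factor are assumed symbol names.

```latex
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7}
         f(x,y)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7}
         C(u)\, C(v)\, F(u,v)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

C(k) = \begin{cases} \dfrac{1}{\sqrt{2}} & k = 0 \\ 1 & \text{otherwise} \end{cases}
```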
DCT Scan Modes
• The zigzag scan used in MPEG-1 is suitable for progressive images where frequency components have equal importance in each horizontal and vertical direction. (Frame pictures only)
• In MPEG-2, an alternate scan is introduced because interlaced images tend to have higher frequency components in the vertical direction. Thus, the scanning order weighs more on the higher vertical frequencies than the same horizontal frequencies. Selection between these two zigzag scan orders can be made on a picture basis. (Frame and field pictures allowed)
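The conventional zigzag order can be generated by walking the anti-diagonals of the block in alternating directions, as sketched below. Only the zigzag scan is shown; the MPEG-2 alternate scan is defined by a table in the standard and is not reproduced here. The function names are hypothetical.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):                     # s = row + col indexes each anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                  # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order

def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

order = zigzag_order()
small = zigzag_scan([[1, 2, 6],
                     [3, 5, 7],
                     [4, 8, 9]])
print(order[:6], small)
```

Low-frequency coefficients come first, so after quantization the zeros tend to collect at the end of the scanned list.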
Quantization
• In MPEG-1, a matrix called the quantizer ( Q[i,j] ) defines the quantization step. If ( X[i,j] ) is the DCT matrix with the same size as Q[i,j], X[i,j] is divided by Q[i,j]*QSF to obtain the quantized value matrix Xq[i,j]. QSF is the Quantization Scale Factor
– Quantization Equation:
• Xq[i,j] = Round( X[i,j]/(Q[i,j]*QSF) )
• Inverse Quantization (dequantize) reconstructs an approximation of the original value.
– Inverse Quantization Equation:
• X'[i,j] = QSF*Xq[i,j]*Q[i,j]
• The difference between the actual value and the value reconstructed from the quantized value is called the quantization error. In general, if we carefully design Q[i,j], visual quality will not be affected.
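The two equations can be applied directly, as in the sketch below. The 2x2 matrices and the QSF value are made-up numbers chosen only to keep the arithmetic visible; they are not the MPEG-1 default quantizer. Python's `round` rounds exact halves to the nearest even integer, which may differ from the rounding rule of a real codec.

```python
def quantize(x, q, qsf):
    """Xq[i,j] = Round(X[i,j] / (Q[i,j] * QSF))"""
    return [[round(x[i][j] / (q[i][j] * qsf)) for j in range(len(x[0]))]
            for i in range(len(x))]

def dequantize(xq, q, qsf):
    """X'[i,j] = QSF * Xq[i,j] * Q[i,j]"""
    return [[qsf * xq[i][j] * q[i][j] for j in range(len(xq[0]))]
            for i in range(len(xq))]

X = [[800.0, -45.0],
     [30.0, 6.0]]          # hypothetical DCT coefficients (top-left corner only)
Q = [[8, 16],
     [16, 32]]             # hypothetical quantizer: larger steps at higher frequencies
QSF = 2

Xq = quantize(X, Q, QSF)
Xr = dequantize(Xq, Q, QSF)
error = [[X[i][j] - Xr[i][j] for j in range(2)] for i in range(2)]
print(Xq, Xr, error)
```

The small high-frequency coefficient is quantized to zero, and the quantization error never exceeds half of the effective step Q[i,j]*QSF.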
Quantization (cont’d)
Intra-frame Encoding Process
• Decomposing image to three components in RGB space
• Converting RGB to YCbCr
• Dividing image into several macroblocks (each macroblock has 6 blocks, 4 for Y, 1 for Cb, 1 for Cr)
• DCT transformation for each block
• After DCT transform, quantizing each coefficient
• Then use zig-zag scan to gather AC values
• Use DPCM to encode the DC value, then use VLC to encode it
• Use RLE to encode the AC values, then use VLC to encode them
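The last two steps can be sketched as below, stopping short of the VLC stage because the code tables are not given in these slides. The initial DC predictor of 0, the (zero_run, value) pair format and the "EOB" end marker are assumptions made for the illustration.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous block's DC value."""
    prev, out = 0, []                 # initial predictor of 0 is an assumption
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out

def run_length_encode(ac_values):
    """Encode zigzag-ordered AC values as (zero_run, value) pairs plus an end marker."""
    pairs, run = [], 0
    for v in ac_values:
        if v == 0:
            run += 1                  # count zeros preceding the next non-zero value
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")               # trailing zeros are implied by the end-of-block marker
    return pairs

dc_diffs = dpcm_encode([50, 52, 51, 51])              # DC values of four successive blocks
ac_pairs = run_length_encode([-1, 1, 0, 0, 3, 0, 0, 0])
print(dc_diffs, ac_pairs)
```

Neighbouring blocks tend to have similar DC values, so the differences are small numbers, and the runs of zeros left by quantization collapse into a few pairs.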
Coding of P Pictures
• As in I pictures, the encoder needs to store the decoded P pictures since these may be used as the starting point for motion compensation. Therefore, the encoder will reconstruct the image from the quantized coefficients.
• In coding P pictures, the encoder has more decisions to make than in the case of I pictures
– Selection of Macroblock Type: There are 8 types of macroblock in P pictures.
– Motion Compensation Decision: The encoder has an option on whether to transmit motion vectors or not for predictive-coded macroblocks.
– Intra/Non-intra Coding Decision
– Coded/Not Coded Decision: After quantization, if all the coefficients in a block are zero then the block is not coded.
– Quantizer/No Quantizer Decision: Quantizer scale can be altered, which will affect the picture quality.
The Inter-frame Encoding Flow Chart
MPEG (Moving Picture Experts Group)
• Established in January 1988
• Operated in the framework of the Joint ISO/IEC Technical Committee
• ISO: International Organization for Standardization
• IEC: International Electro-technical Commission
• First meeting was in May 1988, with 25 experts participating
• Grown to 350 experts from 200 companies in some 20 countries
• As a rule, MPEG meets in March, July and November & could be more often as needed
Figure: RGB Image
Figure: Compressed Image (QSF=24)
Figure: Luminance Plane (Y)
Figure: Blue Chrominance Plane (Cb)
Figure: Red Chrominance Plane (Cr)
Figure: Red RGB Plane
Figure: Green RGB Plane
Figure: Blue RGB Plane