Introduction - Institut für Informatik

MPEG
(Moving Picture Experts Group)
An Introduction to MPEG Video and Audio Compression for the Internet, CD-ROM, and DVD
Universität Osnabrück
Computing Center (Rechenzentrum, RZ)
Dipl.-Math. Frank Elsner
12.12.1999
Version 1.0
http://www.rz.uni-osnabrueck.de/Multimedia
Contents
Introduction
A Crash Course
Introductory Examples
MPEG Video Compression
MPEG-1 Layer 3 Audio Compression
Profiles
MPEG Encoder
MPEG Player
Problems, Tips and Tricks
DVD Basics and Authoring
Further Documentation
Appendix
Introduction
MPEG has become a buzzword that has even found its way into non-computing literature. The main cause is the hysteria that MP3 (MPEG-1 Layer 3) has triggered on the Internet. Other important applications of MPEG (more precisely, MPEG-2) are DVD-Video and digital television; MPEG-2 is the underlying compression method for both.
These lecture notes give you an introduction to video and audio compression using the MPEG method.
The chapter “A Crash Course” covers the history of MPEG and its basic ideas.
The chapter “Introductory Examples” uses simple examples to demonstrate how videos in AVI format can be converted to MPEG and edited.
The chapter “MPEG Video Compression” explains in detail how MPEG achieves video compression ratios of up to 100:1.
The chapter “MPEG-1 Layer 3 Audio Compression” covers the new MP3 audio format.
The chapter “Profiles” covers the differences between MPEG-1 and MPEG-2 and presents various profiles.
The chapters “MPEG Encoder” and “MPEG Player” present encoders for creating MPEG files, tools for analyzing, cutting, and joining them, and several players.
The chapter “Problems, Tips and Tricks” points out problems that can occur in the production process, along with possible solutions.
The chapter “DVD Basics and Authoring” touches on the topic of DVD.
Finally, the chapters “Further Documentation” and “Appendix” provide links to manufacturers and reference documents, as well as excerpts from some of the documents mentioned.
A personal note:
Since (almost) all documents on MPEG are available in English and in excellent quality, I have decided to include the texts in their original version (rather than degrade them through translation :-)). I am nevertheless happy to translate parts of these notes, or all of them, if there are compelling reasons (?) to do so.
As a supplement to these notes, the RZ offers a CD-ROM titled “MPEG und DVD – Software, Dokumentation und Clips”.
A Crash Course
This chapter covers the development and standardization of MPEG.
Video Parameters
Video is simply an electronic sequence of still images displayed or projected (quickly) in succession to one
another. As a result, the human mind is fooled into believing that people or objects in the presented sequence
move. In terms of computers, there are three important characteristics of video:
1. How fast the pictures are displayed (frame rate).
2. How many elements make up each picture in the horizontal and vertical dimensions (frame size).
This is normally given in pixels (or pels).
3. How many different colors each pixel can take (color depth).
Frame rate
Frame rate is the number of frames that are displayed to a viewer each second. For example, motion picture film in the United States is commonly displayed at 24 frames each second. Color television for the US home (called NTSC) displays 29.97 frames a second. [German PAL is based on a
frame rate of 25 frames each second. – FE] Even though computers are not normally thought of in
terms of frame rates, most computers “refresh” the screen by repainting every element of the screen
as often as 72 times a second.
Frame size or number of picture elements
Frame size or number of picture elements is the next component of video. This is measured horizontally and vertically in pixels. “Pixels” are picture elements, the small dots which make up the displayed picture. Some common dimensions, or resolutions, in the computer world include:
640 horizontal x 480 vertical pixels, 800 horizontal x 600 vertical pixels, and 1024 horizontal x 768
vertical pixels.
Number of colors
The number of colors which make up each picture or frame is a third component of video. As is the
case with a painter’s palette, a color can be described in terms of several “primary” colors. For instance, when playing with paints as a child, mixing equal parts of red, yellow and blue created black. By
mixing these primary colors in different combinations, it is possible to produce any other color. Color
mixing works a bit differently with light than with paints, but we can still make any color from three
primaries. In the video world, however, we substitute green for yellow in our “primary” color palette.
Color Spaces
In mixing colors of light, we vary the amount of red, green and blue light that makes up the color of a pixel. To
make video practical, it is necessary to limit the number of differing shades of red, green, or blue that can be
generated. This puts an upper limit on the total number of colors that video can recreate. Here is an example
of a common digital color scheme.
Each primary color (red, green, or blue) may have 256 different levels or shades. Since a color may be composed of the three primaries, this means we can generate 16.8 million different colors, or 256 levels of red
times 256 levels of green times 256 levels of blue (16.8 million roughly equals 256x256x256). The color for a
pixel is normally written as follows:
pixel_color = (red_level, green_level, blue_level).
The previous example just described 24 bit video, without calling it that. The term 24 bit comes from the fact
that 256 shades of each primary may be represented as an 8 bit value. Since it takes three primaries to represent a single color, it takes 8+8+8 or 24 bits to represent the color of a single pixel:
8 bits red    8 bits green    8 bits blue
RRRRRRRR      GGGGGGGG        BBBBBBBB       = 24 bits
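The 24 bit layout above can be illustrated with a minimal sketch. The function names `pack_rgb` and `unpack_rgb` are hypothetical, and placing red in the most significant byte is an assumption that simply follows the order of the diagram.

```python
def pack_rgb(red, green, blue):
    """Pack three 8 bit levels into one 24 bit integer (red in the top byte)."""
    for level in (red, green, blue):
        if not 0 <= level <= 255:
            raise ValueError("each level must fit in 8 bits (0..255)")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(pixel):
    """Recover the (red, green, blue) levels from a packed 24 bit value."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


# 256 levels per primary give 256 * 256 * 256 distinct colors.
total_colors = 256 ** 3

orange = pack_rgb(255, 128, 0)
```

Unpacking `orange` returns the three original levels, and `total_colors` is the "16.8 million" quoted in the text.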
As a final note about color and video, it is possible to choose different primaries or entirely different color spaces (color systems). The way we described colors above is not the only way to identify colors. Different “colorspaces,” or methods of describing colors, have different uses.
For example, the common colorspace for printing is CMYK, or cyan, magenta, yellow, and black. Another colorspace is YCrCb, or luminance (shade intensity) and chrominance-red and chrominance-blue (the chrominance components define the hue and saturation of the color). This last colorspace is commonly used in video,
primarily because it more closely resembles the way human eyes perceive color, where rods detect luminance
components and cones detect the chrominance components of color.
The international standard CCIR Rec. 601-1 specifies eight-bit digital coding for component video. For Rec. 601-1
coding in eight bits per component (with Y, Rgamma and Bgamma in the range 0 to 1),
Y_8b  = 16 + 219 * Y
Cb_8b = 128 + 224 * (0.5/0.886) * (Bgamma - Y)
Cr_8b = 128 + 224 * (0.5/0.701) * (Rgamma - Y)
CCIR Rec. 601-1 calls for two-to-one horizontal subsampling of Cb and Cr, to achieve 2/3 the data rate of
RGB with virtually no perceptible penalty. This is denoted 4:2:2. JPEG and MPEG normally subsample Cb
and Cr two-to-one horizontally and also two-to-one vertically, to get 1/2 the data rate of RGB. This is denoted
4:2:0. To get good results using subsampling you should not just drop and replicate pixels, but implement
proper decimation and interpolation filters.
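As a rough illustration of eight-bit component coding, the sketch below converts gamma-corrected RGB values in the range 0 to 1 into 8 bit Y, Cb and Cr. It assumes the usual Rec. 601 luma weights (0.299, 0.587, 0.114) and scales the color differences so that Cb and Cr swing by at most 112 around 128, i.e. they span 16 to 240. The function name `rgb_to_ycbcr_8bit` is hypothetical.

```python
def rgb_to_ycbcr_8bit(r, g, b):
    """Convert gamma-corrected r, g, b in 0.0..1.0 to 8 bit (Y, Cb, Cr)."""
    # Luma as a weighted sum of the primaries (assumed Rec. 601 weights).
    y = 0.299 * r + 0.587 * g + 0.114 * b
    y_8b = 16 + 219 * y
    # B - Y lies in -0.886..0.886 and R - Y in -0.701..0.701, so the
    # factors below map both differences onto an excursion of +/-112.
    cb_8b = 128 + 224 * (0.5 / 0.886) * (b - y)
    cr_8b = 128 + 224 * (0.5 / 0.701) * (r - y)
    return tuple(int(round(v)) for v in (y_8b, cb_8b, cr_8b))


white = rgb_to_ycbcr_8bit(1.0, 1.0, 1.0)
black = rgb_to_ycbcr_8bit(0.0, 0.0, 0.0)
blue = rgb_to_ycbcr_8bit(0.0, 0.0, 1.0)
```

Neutral colors such as black and white land on Cb = Cr = 128, while saturated blue drives Cb to its upper limit.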
For the purposes of this discussion, let's assume our video source is a typical professional digital video format called ITU-R 601 (formerly known as CCIR 601). In this format, the video is represented in the
following fashion:
1. frame rate of 30 frames a second
2. picture size of one frame 720x480 (NTSC)
3. color and colorspace: YCrCb 4:2:2
Luminance (Y) is sampled at full resolution; each chrominance component (Cr and Cb) is sampled at half the
horizontal resolution. On average then, it takes 16 bits to represent each pel. Using these values, it is
easy to calculate the total disk space required to hold one second of uncompressed video in this format:
720 horizontal pixels x 480 vertical pixels = 345,600 pixels per frame
345,600 pixels per frame x 30 frames per second = 10,368,000 pixels per second
10,368,000 pixels per second x 2 bytes per pixel = 20,736,000 bytes per second
This means that a 20 Gigabyte hard drive could hold about 1000 seconds (roughly 16 minutes) of uncompressed video. Clearly
this is not practical for most applications.
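The same arithmetic can be written as a few lines of code, which makes it easy to try other frame sizes or rates. The variable names are illustrative, and the disk size assumes a decimal gigabyte (10^9 bytes).

```python
width, height = 720, 480        # frame size in pixels
frames_per_second = 30
bytes_per_pixel = 2             # 16 bits per pel on average for 4:2:2

pixels_per_frame = width * height
pixels_per_second = pixels_per_frame * frames_per_second
bytes_per_second = pixels_per_second * bytes_per_pixel

disk_bytes = 20 * 10**9         # assumed: "20 Gigabyte" in decimal units
seconds_on_disk = disk_bytes / bytes_per_second

print(bytes_per_second, round(seconds_on_disk))
```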
MPEG ISO Standard
The Moving Picture Experts Group (MPEG) was founded in the late 1980s to define a digital standard for the representation of moving pictures. Several methods were already available before the MPEG-1 standard was adopted. The most important of them at the time were Motion-JPEG
(M-JPEG) and Recommendation H.261 of the CCITT.
MPEG is an acronym for Moving Picture Experts Group, and the name commonly refers to the international standard for digital video and audio compression. The official name of the MPEG-1 standard is: “Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Megabits per second.” It is
sometimes referred to by its ISO/IEC project number, 11172 parts 1 through 5. However, this video standard
is usually just called “MPEG”.
MPEG-1 and MPEG-2 are motion video compression standards created by the Moving Picture Experts
Group. This group is a joint committee of the International Organization for Standardization (ISO) and the
International Electrotechnical Commission (IEC).
The MPEG-1 Standard, completed as a draft in 1992, defines a bit stream of compressed audio and video
data with a data rate of 1.5 Mbits/sec as being suitable for CD-ROMs and VideoCD applications. It is possible to generate MPEG-1 streams with other data rates. The MPEG-1 Standard is formally described in
ISO/IEC 11172.
The MPEG-2 Standard was designed later for digital transmission of broadcast quality video with data rates
from 2 to 10 Mbits/sec. It was written to be more “generic”, that is to address a broader range of applications, and is the compression standard for DVD and various digital television systems. The MPEG-2 Standard is described in ISO/IEC 13818 documents.
MPEG Compression Overview
(This is a short introduction. Please refer to the following chapters for detailed information!)
The basic idea behind MPEG video compression is to remove spatial redundancy within a video frame
and remove temporal redundancy between video frames. As in JPEG, the standard for still image compression, DCT-based (Discrete Cosine Transform) compression is used to reduce spatial redundancy. Motion compensation is used to exploit temporal redundancy. The images in a video stream usually do not
change much within small time intervals. The idea of motion-compensation is to encode a video frame based
on other video frames temporally close to it.
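To make the idea of motion compensation more concrete, here is a minimal sketch of exhaustive block matching: for one block of the current frame, it searches a small window in a reference frame for the best-matching position, scored by the sum of absolute differences (SAD). The function names, the 4x4 block size and the search range of 2 pixels are assumptions chosen for illustration; real encoders work on 16x16 macroblocks and use much faster search strategies.

```python
def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a reference and a current block."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(size) for i in range(size))


def best_motion_vector(ref, cur, cx, cy, size=4, search=2):
    """Return (cost, dx, dy) of the best match for the block at (cx, cy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidate blocks that would leave the reference frame.
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(ref, cur, rx, ry, cx, cy, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best


# Toy frames: the current frame is the reference shifted right by one pixel.
ref = [[x * x + 17 * y for x in range(8)] for y in range(8)]
cur = [[row[max(x - 1, 0)] for x in range(8)] for row in ref]
match = best_motion_vector(ref, cur, cx=2, cy=2)
```

For the toy frames, the block is found one pixel to the left in the reference frame with zero error, so only the motion vector would need to be stored for it.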
Intra-frame compression (compression within a picture)
1. Discrete Cosine Transform (DCT)
2. Quantization
3. Huffman / Arithmetic Encoding
The following figures outline the processes:
Inter-frame compression (compression relating to nearby pictures, to remove temporal redundancy)
1. Motion Compensation
An MPEG stream can have three types of frames:
• I-frames
• P-frames
• B-frames
Intra (I) frames are coded without any references to any other frames.
Predicted (P) frames reference previously encoded P or I frames, and encode only the changes. Predicted
frames provide significantly better compression than Intra coded (I) frames.
Bi-directional interpolated (B) frames contain references to both previous P or I frames and the next P or I
frame. Bi-directional frames provide the best compression.
The primary difference among these frames is how motion vectors are used in them. Intra frames (I frames)
do not use any form of motion vectors. Predictive frames (P frames) make use of predictive type motion vectors. Bi-directional frames (B frames) make use of both predictive and interpolative motion vectors.
The three frame types and their sequence (for instance, I BB P BB P BB P BB), represent a Group of Pictures (GOP).
Take for example, an AVI file of 30 frames per second. Each frame is a standalone picture, and does not reference any other frame. If we use the letter “I” to represent a frame, a single second could then be represented like this by 30 I-frames:
(I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I)
An MPEG encoder can gather most of the information for the group of similar pictures from the first frame in
a scene and encode that as an I frame. The encoder could continue to encode each of these frames as an I
frame, but it is much more efficient to only record data about the pixels that have changed. The encoder
might then look ahead to the fourth frame, note only the things that changed between #1 and #4, and record those
few changes as a P frame. Finally, the encoder can analyze the I and the P frame, and from that create very
small B frames for frames #2 and #3. Because video generally has 24-30 fps, GOP structures can get quite
complex. In addition, the encoded order of frames is not always the same as the display order, and many
GOPs will be required to accurately portray just a few seconds of video. This encoder analysis is a process
known as motion estimation.
Using the superior compression of GOP structures, therefore, the same second in MPEG might be represented like this:
([I BB P BB P BB P BB P BB] [I BB P BB P BB P BB P BB])
It consists of two GOPs. Both of the examples contain 30 frames, but the MPEG file will be significantly
smaller due to the use of P and B frames.
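The GOP structure and the difference between display order and encoded order can be sketched as follows. The parameter names `n` (frames per GOP) and `m` (distance between I or P frames) are assumptions, as is the simplified handling of B frames at the end of a GOP, which in a real stream would follow the I frame of the next GOP.

```python
def gop_pattern(n=15, m=3):
    """Frame types of one GOP in display order, e.g. I B B P B B P ..."""
    return ["I" if i == 0 else ("P" if i % m == 0 else "B") for i in range(n)]


def coded_order(display):
    """Reorder frames so every I or P frame precedes the B frames that need it."""
    out, pending_b = [], []
    for index, frame_type in enumerate(display):
        label = f"{frame_type}{index}"
        if frame_type == "B":
            pending_b.append(label)      # wait for the next I or P frame
        else:
            out.append(label)
            out.extend(pending_b)
            pending_b = []
    return out + pending_b               # simplification for trailing B frames


one_second = gop_pattern() + gop_pattern()     # two GOPs of 15 frames each
reordered = coded_order(list("IBBPBBP"))
```

Here `one_second` holds the 30 frame types of the example above, and `reordered` shows that frame 3 (a P frame) is sent before frames 1 and 2 (the B frames that depend on it).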
Introductory Examples
This chapter covers some typical application examples:
1. Converting an AVI file to MPEG-1 (Encode and Multiplex)
2. Separating the video and audio streams (De-Multiplex)
3. Joining two MPEG files (Join)
4. Editing MPEG files (Snapshot, ...)
Example 1 – Ligos Quick Start Tutorial
To get started immediately with the LSX-MPEG Encoder, we’ve provided this Quick Start Tutorial. Detailed
information is provided throughout this Help file for additional features such as Variable Bitrate control, skipping frames for lower bitrate encoding, and Batch Processing.
The sample clip, balloon_yuv.avi once it is unzipped, is a YUV compressed AVI: 3 seconds, 352x240, 2960 KB/sec video with 44.1 kHz 16 bit stereo audio. It will be used for the tutorial below.
Encoding with the Profile Manager
When using the LSX-MPEG Encoder, the parameters used to encode an MPEG file are kept in a Profile.
You can use the default Profile, use a predefined Profile selected from the Profile Manager, or you can create
your own custom Profile. In this Tutorial we’ll show you how to quickly encode a file using the Profile Manager, and then take you through the steps of creating a custom Profile.
Start LSX-MPEG Encoder by choosing the program from the Start menu or clicking on the icon for the application in the program group.
The Encoder will automatically load a set of default parameters that are displayed in the Main Window, such
as Frame Rate, Video Stream data rate, Frame Size, etc. This is the information that will control the encoding process and determine the characteristics of the output file we’ll generate. Together, this information is
known as a Profile. The top portion of the interface will remain blank until we actually start encoding some
video.
Balloon_yuv.avi is a 3 second, 8.67 MB AVI file, and we want to encode it to MPEG to reduce the file size, but
retain the quality. The easiest way to do this when first starting out with the LSX-MPEG Encoder is to use the
Profile Manager. The Profile Manager includes predefined recommended Profiles, and allows you to create
and store Profiles for later use. Open it from either the pulldown File menu item, or the Open Profile Manager
button on the toolbar.
When the Profile Manager is opened, you’ll see a two-part interface. The left side is a list box that displays
predefined and custom MPEG-1 and MPEG-2 Profiles with descriptive names. When a Profile is selected,
the right side displays the characteristics that make up that Profile (Format, Frame Rate, Data Rate,
etc.). You can select a Profile either based on the recommendation or based on the goal you are trying to
achieve.
For instance, if you want something that is a good choice for a SIF resolution (352x240) AVI source file, you
should get good results using the Profile “MPEG-1 (Recommended for SIF, 352x240 NTSC)”. If, however,
you have a specific goal of taking a large AVI file and making it small enough for quick download over the Internet, you’d probably want to select “MPEG-1 (low bitrate, 10 fps simulated)” due to its optimized low data
and frame rates. The choice is yours, and you can always choose one that’s close and customize it later.
We have a goal for Balloon_yuv.avi (reduce file size, keep the quality), so it is best to choose the recommended Profile. Select “MPEG-1 (Recommended for SIF, 352x240 NTSC)”, and click Load Profile.
Now we need to select our Input File to convert and encode to MPEG. Near the Input File box, click on the
Browse button. A standard Windows file dialog is displayed, prompting you to choose an AVI file for encoding. Select the file “Balloon_yuv.avi” from the “Media” sub-directory, and click the Open button.
An Information popup appears with an analysis of the Balloon_yuv.avi file. Click OK to continue. In addition to
the Input File box being filled in, the Output File is now automatically named "Balloon_yuv.mpg", saved in the
same folder. This can be changed, but we’ll leave it like this for now.
Let’s Preview the file… click on the Preview button to the left of the Input File box. A window appears and
plays back the clip. Close the movie window.
That’s all there is to it! As you can see, the parameters in the Main Window have been adjusted to match the
Input File and the Profile we chose. If we don’t have any other changes to make, we can encode the file. Click
on the Start MPEG Encoding button on the toolbar. The application will display and begin encoding the video
portion of the stream. A new section of the interface, MPEG Video Encoding in Progress, will appear. A meter
will display a frame by frame accounting as each is encoded, and show information on image quality, current
frame quality and elapsed encoding time. The application will then show a meter for the encoding progress of
audio, followed by a meter for the multiplexing of audio and video.
When finished, the Multiplexing Completed box is displayed, presenting a summary of information regarding
the process. Close the dialog.
Let’s Play the finished file… click on the Play button to the left of the Output File box. A media player should
open to play the clip. Click play. As you can see, it is the same quality as the AVI file we input. A quick check
on the file size (using Windows Explorer) shows that the resulting file is only about 500 kilobytes, roughly 6% of the size
of the original! Close the movie window.
Example 2 – Darim DVMPEG Multiplexer/Demultiplexer
The MPEG file balloon.mpg is separated into a video stream (*.mpv) and an audio stream (*.mp2).
Show video and audio stream parameters:
Example 3 – Camel MPEGJoin
Join the files balloon_yuv.mpg and dolphin_yuv.mpg into a single file, joined.mpg.
Example 4 – Womble MPEG-VCR 3.02
The Womble MPEG-VCR is application software for editing compressed digital movies that are compliant
with the MPEG international standards. This current release is for all 32-bit Windows platforms. It has been
tested for Windows 95, Windows 98, Windows NT workstation and Windows NT server.
The main features include
1. Frame-accurate editing for cut, copy, paste, and record.
2. Insert simple transitions with video special effect.
3. Still image overlay for text and logo insertion.
4. An audio editor component for separate MPEG audio editing.
The following is a list of the supported MPEG formats
1. MPEG Video: all MPEG-1 and MPEG-2 video formats, including VBR (variable rate).
2. MPEG Audio: MPEG layer-I, layer-II, and layer-III (layer-III is input only).
3. MPEG Systems: all MPEG-1 and MPEG-2 formats, including Program and Transport.
Other video data that the editor will read as input
1. Windows AVI DIB RGB video sequence.
2. Windows bitmap still images.
3. JPEG still images.
The following figure shows a snapshot of the user interface:
MPEG Video Compression
This chapter looks in detail at the individual steps of video compression.
MPEG Goals (from C-Cube)
This chapter presents an overview of the Moving Picture Experts Group (MPEG) standard that is implemented by the CL480. The standard is officially known as ISO/IEC Standard, Coded Representation of Picture,
Audio and Multimedia/hypermedia Information, ISO 11172. It is more commonly referred to as the MPEG-1
standard.
MPEG addresses the compression, decompression and synchronization of video and audio signals. The
MPEG video algorithm can compress video signals to an average of about 1/2 to 1 bit per coded pixel. At a
compressed data rate of 1.2 Mbits per second, a coded resolution of 352 x 240 at 30 Hz is often used, and
the resulting video quality is comparable to VHS. Image quality can be significantly improved by using a higher
compressed data rate (for example, 2 Mbits per second) without changing the coded resolution.
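The figure of "about 1/2 to 1 bit per coded pixel" can be checked against the data rates quoted above with a short calculation. The function name is hypothetical, and the sketch treats the whole bit rate as video data, ignoring audio and system overhead.

```python
def bits_per_coded_pixel(bit_rate, width, height, frame_rate):
    """Average number of compressed bits spent on each coded pixel."""
    return bit_rate / (width * height * frame_rate)


vhs_like = bits_per_coded_pixel(1_200_000, 352, 240, 30)
higher_rate = bits_per_coded_pixel(2_000_000, 352, 240, 30)

print(round(vhs_like, 2), round(higher_rate, 2))
```

Both values fall roughly in the quoted range: the 1.2 Mbits per second stream sits just under half a bit per pixel, the 2 Mbits per second stream at about 0.8 bits per pixel.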
MPEG System Stream Structure
In its most general form, an MPEG system stream is made up of two layers:
• The system layer contains timing and other information needed to demultiplex the audio and video
streams and to synchronize audio and video during playback.
• The compression layer includes the audio and video streams.
The system decoder extracts the timing information from the MPEG system stream and sends it to the other
system components. The system decoder also demultiplexes the video and audio streams from the system
stream, then sends each to the appropriate decoder.
The video decoder decompresses the video stream as specified in Part 2 of the MPEG standard. The audio
decoder decompresses the audio stream as specified in Part 3 of the MPEG standard.
Figure 2-1 shows a generalized decoding system for the audio and video streams.
Figure 2-1: General MPEG Decoding System
Video Stream Data Hierarchy
The MPEG standard defines a hierarchy of data structures in the video stream as shown schematically in Figure 2-2:
1. Video Sequence
2. Group of Pictures (GOP)
3. Picture
4. Slice
5. Macroblock
6. Block
Figure 2-2 MPEG Data Hierarchy
Video Sequence
Begins with a sequence header (may contain additional sequence headers), includes one or more groups of
pictures, and ends with an end-of-sequence code.
Group of Pictures (GOP)
A header and a series of one or more pictures intended to allow random access into the sequence.
Picture
The primary coding unit of a video sequence. A picture consists of three rectangular matrices representing
luminance (Y) and two chrominance (Cb and Cr) values. The Y matrix has an even number of rows and columns. The Cb and Cr matrices are one-half the size of the Y matrix in each direction (horizontal and vertical).
Figure 2-3 shows the relative x-y locations of the luminance and chrominance components. Note that for every four luminance values, there are two associated chrominance values: one Cb value and one Cr value. (The
location of the Cb and Cr values is the same, so only one circle is shown in the figure.)
Slice
One or more “contiguous” macroblocks. The order of the macroblocks within a slice is from left-to-right and
top-to-bottom. Slices are important in the handling of errors. If the bitstream contains an error, the decoder
can skip to the start of the next slice. Having more slices in the bitstream allows better error concealment, but
uses bits that could otherwise be used to improve picture quality.
Macroblock
A 16-pixel by 16-line section of luminance components and the corresponding 8-pixel by 8-line section of the
two chrominance components. See Figure 2-3 for the spatial location of luminance and chrominance components. A macroblock contains four Y blocks, one Cb block and one Cr block as shown in Figure 2-4. The
numbers correspond to the ordering of the blocks in the data stream, with block 1 first.
Figure 2-4 Macroblock Composition
Block
A block is an 8-pixel by 8-line set of values of a luminance or a chrominance component. Note that a luminance block corresponds to one-fourth as large a portion of the displayed image as does a chrominance
block.
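The relationship between picture, macroblock and block can be made concrete by counting them for a given picture size. The function name `picture_layout` is hypothetical, and rounding up to whole macroblocks for sizes that are not multiples of 16 is an assumption.

```python
def picture_layout(width, height):
    """Count macroblocks and 8x8 blocks in one picture."""
    mb_cols = (width + 15) // 16       # macroblocks per row (rounded up)
    mb_rows = (height + 15) // 16      # macroblock rows (rounded up)
    macroblocks = mb_cols * mb_rows
    return {
        "mb_cols": mb_cols,
        "mb_rows": mb_rows,
        "macroblocks": macroblocks,
        "y_blocks": 4 * macroblocks,   # four luminance blocks per macroblock
        "cb_blocks": macroblocks,      # one Cb block per macroblock
        "cr_blocks": macroblocks,      # one Cr block per macroblock
    }


sif = picture_layout(352, 240)
total_blocks = sif["y_blocks"] + sif["cb_blocks"] + sif["cr_blocks"]
```

For a 352 x 240 picture this gives 22 x 15 macroblocks, each carrying six 8 x 8 blocks.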
YCbCr Coding
The MPEG-1 algorithm operates on images represented in YUV color space (Y Cr Cb). If an image is stored in RGB format, it
must first be converted to YUV format. In YUV format, images are initially also represented in 24 bits per pixel (8 bits for the luminance information (Y) and 8 bits for each of the two chrominance components (U and V)). The YUV format is then subsampled. All
luminance information is retained. However, chrominance information is subsampled 2:1 in both the horizontal and vertical directions. Thus, there are on average 2 bits per pixel each of U and V information. This subsampling does not drastically affect quality because the eye is more sensitive to luminance than to chrominance information. Subsampling is a lossy step. The 24 bits of RGB
information are reduced to 12 bits of YUV information, which automatically gives 2:1 compression. Technically speaking, MPEG-1
is 4:2:0 YCrCb.
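The bit counts in this paragraph follow from how often each chrominance component is sampled. A small sketch, with a hypothetical function name and with the subsampling expressed as horizontal and vertical factors:

```python
def average_bits_per_pixel(h_factor, v_factor, bits_per_sample=8):
    """Average bits per pixel when both chroma components are subsampled."""
    luma_bits = bits_per_sample                          # Y at full resolution
    chroma_bits = 2 * bits_per_sample / (h_factor * v_factor)
    return luma_bits + chroma_bits


full = average_bits_per_pixel(1, 1)      # no subsampling (4:4:4)
s422 = average_bits_per_pixel(2, 1)      # 2:1 horizontal only (4:2:2)
s420 = average_bits_per_pixel(2, 2)      # 2:1 in both directions (4:2:0)
```

With 2:1 subsampling in both directions, each chrominance component contributes 2 bits per pixel, giving the 12 bits per pixel stated above.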
The following describes how a digitized video picture must be preprocessed for MPEG-1 in order to arrive at manageable
amounts of data (1.5 Mbit/s), which picture types are used, and how encoding and decoding are carried out while maintaining acceptable picture quality.
The usual resolutions of digitized video are 720x480 pixels at 60 Hz (NTSC) or 720x576 pixels at 50 Hz (PAL),
with the material in 4:2:2 form. This resolution is halved in both the horizontal and vertical directions.
The horizontal resolution is generally not reduced by simply dropping luminance or chrominance values.
A common approach is a weighted average of a pixel with its neighboring pixels, which are discarded afterwards.
In the vertical direction, similar filtering can be applied, or every second line is simply dropped, so that only one
field is used.
The chrominance resolution is halved once more in the vertical direction by filtering, producing a 4:2:0 sampling pattern (see figure). Four luminance samples then correspond to one chrominance sample.
As in JPEG, the individual samples are grouped into 8x8 matrices; four of the 8x8 blocks form a macroblock. The color information is represented by one 8x8 matrix each for Cb and Cr, so that a total of six 8x8 matrices
are used per macroblock. The resolution reduction is performed in so-called decimation filters, either
in the MPEG processor chip, such as the VRP from C-Cube described further below, or in video digitizing chips.
When compressed video material is displayed, the original resolution must conversely be restored. To do this, zero values are inserted between the luminance or chrominance values, and then a
weighted average is computed. The weights are the filter coefficients of a so-called interpolation filter. The effect is that, for example, the sequence 10, 11, 12 becomes the sequence 10, 10.5, 11, 11.5, 12. Simpler interpolation methods work by repeating values.
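A minimal one-dimensional sketch of decimation and interpolation filtering follows. The function names and the filter weights (0.25, 0.5, 0.25 for decimation and 0.5, 1, 0.5 for interpolation) are assumptions chosen for illustration; actual chips may use longer filters.

```python
def decimate_2x(samples, taps=(0.25, 0.5, 0.25)):
    """Halve the resolution: weighted average around every second sample."""
    out = []
    last = len(samples) - 1
    for i in range(0, len(samples), 2):
        left = samples[max(i - 1, 0)]        # repeat the edge sample
        right = samples[min(i + 1, last)]
        out.append(taps[0] * left + taps[1] * samples[i] + taps[2] * right)
    return out


def interpolate_2x(samples, taps=(0.5, 1.0, 0.5)):
    """Double the resolution: insert zeros, then apply a weighted average."""
    if not samples:
        return []
    stuffed = []
    for value in samples:
        stuffed.extend([value, 0.0])         # zero between neighbouring values
    stuffed.pop()                            # no zero after the last sample
    half = len(taps) // 2
    out = []
    for i in range(len(stuffed)):
        acc = 0.0
        for k, weight in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(stuffed):
                acc += weight * stuffed[j]
        out.append(acc)
    return out


restored = interpolate_2x([10, 11, 12])
reduced = decimate_2x([10, 10, 20, 20])
```

With these weights, `restored` reproduces the sequence 10, 10.5, 11, 11.5, 12 from the example in the text.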
Audio Stream Data Hierarchy
The MPEG standard defines a hierarchy of data structures that accept, decode and produce digital audio
output. The MPEG audio stream, like the MPEG video stream, consists of a series of packets. Each audio
packet contains an audio packet header and one or more audio frames as shown in Figure 2-5.
Figure 2-5 Audio Stream Structure
Each audio packet header contains the following information:
• Packet start code - Identifies the packet as being an audio packet
• Packet length - Indicates the number of bytes in the audio packet.
An audio frame contains the following information:
• Audio frame header - Contains synchronization, ID, bit rate, and sampling frequency information
• Error-checking code - Contains error-checking information
• Audio data - Contains information used to reconstruct the sampled audio data.
• Ancillary data - Contains user-defined data.
Step 1: Intra-picture (Transform) Coding
The MPEG transform coding algorithm includes these steps:
• Discrete cosine transform (DCT)
• Quantization
• Run-length encoding
Both image blocks and prediction-error blocks have high spatial redundancy. To reduce this redundancy, the
MPEG algorithm transforms 8 x 8 blocks of pixels or 8 x 8 blocks of error terms from the spatial domain to
the frequency domain with the Discrete Cosine Transform (DCT).
Next, the algorithm quantizes the frequency coefficients. Quantization is the process of approximating each
frequency coefficient as one of a limited number of allowed values. The encoder chooses a quantization matrix that determines how each frequency coefficient in the 8 x 8 block is quantized. Human perception of
quantization error is lower for high spatial frequencies, so high frequencies are typically quantized more coarsely (i.e., with fewer allowed values) than low frequencies.
The combination of DCT and quantization results in many of the frequency coefficients being zero, especially
the coefficients for high spatial frequencies. To take maximum advantage of this, the coefficients are organized in a zigzag order to produce long runs of zeros (see Figure 2-10). The coefficients are then converted
to a series of run-amplitude pairs, each pair indicating a number of zero coefficients and the amplitude of a
non-zero coefficient. These run-amplitude pairs are then coded with a variable-length code, which uses
shorter codes for commonly occurring pairs and longer codes for less common pairs.
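The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not an MPEG implementation: it assumes a single uniform quantizer step (the hypothetical parameter `step`) instead of a real per-frequency quantization matrix, it treats the DC coefficient like any other coefficient, and it stops before the variable-length coding stage. All function names are chosen for this example only.

```python
import math

N = 8  # block size used by the transform

def dct_2d(block):
    """Orthonormal 8x8 DCT-II of a block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def quantize(coeffs, step):
    """Uniform quantization (assumption: one step size for all frequencies)."""
    return [[int(round(value / step)) for value in row] for row in coeffs]

def zigzag_order():
    """Index pairs of an 8x8 block in zigzag order, low frequencies first."""
    order = []
    for s in range(2 * N - 1):
        diagonal = [(i, s - i) for i in range(N) if 0 <= s - i < N]
        # Odd diagonals are walked with the row index increasing,
        # even diagonals in the opposite direction.
        order.extend(diagonal if s % 2 else diagonal[::-1])
    return order

def run_amplitude_pairs(quantized):
    """(zero run, non-zero amplitude) pairs; trailing zeros become 'EOB'."""
    pairs, run = [], 0
    for i, j in zigzag_order():
        value = quantized[i][j]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # end-of-block marker replaces the final zero run
    return pairs

flat = [[100] * N for _ in range(N)]  # a block with no spatial detail
print(run_amplitude_pairs(quantize(dct_2d(flat), 16)))  # [(0, 50), 'EOB']
```

A completely flat block collapses to a single DC value followed by the end-of-block marker, which is the extreme case of the "long runs of zeros" the zigzag scan is designed to produce.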
Huffman Coding
For a given character distribution, Huffman's minimum-redundancy encoding assigns short codes to frequently occurring characters and longer codes to infrequently occurring characters. This minimizes the average number of bits required to
represent the characters in a text.
Static Huffman encoding uses a fixed set of codes, based on a representative sample of data, for processing texts. Although
encoding is achieved in a single pass, the data on which the compression is based may bear little resemblance to the actual
text being compressed.
Dynamic Huffman encoding, on the other hand, reads each text twice; once to determine the frequency
distribution of the characters in the text and once to encode the data. The codes used for compression are computed on the
basis of the statistics gathered during the first pass with compressed texts being prefixed by a copy of the Huffman encoding
table for use with the decoding process.
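The two-pass idea described above (count frequencies first, then build the codes) can be sketched as follows. This is a generic textbook Huffman construction, not the fixed code tables that MPEG itself uses; the function names are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Return {symbol: bit string} built from the symbol frequencies in text."""
    freq = Counter(text)  # first pass: gather statistics
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie breaker, {symbol: code so far}).
    # The tie breaker keeps Python from ever comparing the dictionaries.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)  # two least frequent subtrees
        w2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(text, code):
    """Second pass: replace every symbol by its code word."""
    return "".join(code[ch] for ch in text)

code = huffman_code("aaaabbc")
print(code, len(encode("aaaabbc", code)), "bits")
```

For the sample text the most frequent symbol receives a one-bit code and the two rarer symbols receive two-bit codes, so the seven characters fit into 10 bits instead of 56.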
Some blocks of pixels need to be coded more accurately than others. For example, blocks with smooth intensity gradients need accurate coding to avoid visible block boundaries. To deal with this inequality between
blocks, the MPEG algorithm allows the amount of quantization to be modified for each macroblock of pixels.
This mechanism can also be used to provide smooth adaptation to a particular bit rate.
Figure 2-10 Transform Coding Operations
The encoding scheme used is similar to JPEG compression. Each 8x8 block is encoded independently with one exception
explained below. The block is first transformed from the spatial domain into a frequency domain using the DCT (Discrete Cosine Transform), which separates the signal into independent frequency bands. Most frequency information is in the upper left
corner of the resulting 8x8 block. After this, the data is quantized. Quantization can be thought of as ignoring lower-order bits
(though this process is slightly more complicated). Quantization is the only lossy part of the whole compression process other
than subsampling. The resulting data is then run-length encoded in a zig-zag ordering to optimize compression. This zig-zag
ordering produces longer runs of 0's by taking advantage of the fact that there should be little high-frequency information (more 0's as one zig-zags from the upper left corner towards the lower right corner of the 8x8 block). The afore-mentioned exception to independence is that the coefficient in the upper left corner of the block, called the DC coefficient, is encoded relative
to the DC coefficient of the previous block (DPCM coding).
Since MPEG is targeted for a set of specific applications, there is only one color space (4:2:0 YCbCr), one sample precision
(8 bits), and one scanning mode (sequential). Luminance and chrominance share quantization tables. The range of sampling
dimensions is more limited as well. MPEG adds adaptive quantization at the macroblock (16 x 16 pixel area) layer. This
permits both smoother bit rate control and more perceptually uniform quantization throughout the picture and image sequence. Adaptive quantization is part of the JPEG-2 charter. MPEG variable length coding tables are non-downloadable, and
are therefore optimized for a limited range of compression ratios appropriate for the target applications.
The local spatial decorrelation methods in MPEG and JPEG are very similar. Picture data is block transform coded with the
two-dimensional orthonormal 8x8 DCT. The resulting 63 AC transform coefficients are mapped in a zig-zag pattern to statistically increase the runs of zeros. Coefficients of the vector are then uniformly scalar quantized, run-length coded, and finally
the run-length symbols are variable length coded using a canonical (JPEG) or modified Huffman (MPEG) scheme. Global
frame redundancy is reduced by 1-D DPCM, of the block DC coefficients, followed by quantization and variable length entropy
coding.
Frame -MCP-> 8x8 spatial block -DCT-> 8x8 frequency block -ZZ-> zig-zag scan -Q-> quantization -RLC-> run-length
coding -VLC-> variable length coding.
Step 2: Inter-Picture Coding
Much of the information in a picture within a video sequence is similar to information in a previous or subsequent picture. The MPEG standard takes advantage of this temporal redundancy by representing some pictures in terms of their differences from other (reference) pictures, or what is known as inter-picture coding. This
section describes the types of coded pictures and explains the techniques used in this process.
The MPEG standard specifically defines three types of pictures: intra, predicted, and bidirectional.
Intra Pictures
Intra pictures, or I-pictures, are coded using only information present in the picture itself. I-pictures provide
potential random access points into the compressed video data. I-pictures use only transform coding and
provide moderate compression. I-pictures typically use about two bits per coded pixel.
Predicted Pictures
Predicted pictures, or P-pictures, are coded with respect to the nearest previous I- or P-picture. This
technique is called forward prediction and is illustrated in Figure 2-6. Like I-pictures, P-pictures serve as a
prediction reference for B-pictures and future P-pictures. However, P-pictures use motion compensation to
provide more compression than is possible with I-pictures. Unlike I-pictures, P-pictures can propagate coding
errors because P-pictures are predicted from previous reference (I- or P-) pictures.
Figure 2-6 Forward Prediction
Bidirectional Pictures
Bidirectional pictures, or B-pictures, are pictures that use both a past and future picture as a reference. This
technique is called bidirectional prediction and is illustrated in Figure 2-7. B-pictures provide the most compression and do not propagate errors because they are never used as a reference. Bidirectional prediction
also decreases the effect of noise by averaging two pictures.
Figure 2-7 Bidirectional Prediction
Video Stream Composition
The MPEG algorithm allows the encoder to choose the frequency and location of I-pictures. This choice is
based on the application's need for random accessibility and the location of scene cuts in the video sequence. In applications where random access is important, I-pictures are typically used two times a second.
The encoder also chooses the number of B-pictures between any pair of reference (I- or P-) pictures. This
choice is based on factors such as the amount of memory in the encoder and the characteristics of the material being coded. For example, a large class of scenes have two bidirectional pictures separating successive
reference pictures. A typical arrangement of I-, P-, and B-pictures is shown in Figure 2-8 in the order in which
they are displayed.
Figure 2-8 Typical Display Order of Picture Types
The MPEG encoder reorders pictures in the video stream to present the pictures to the decoder in the most
efficient sequence. In particular, the reference pictures needed to reconstruct B-pictures are sent before the
associated B-pictures. Figure 2-9 demonstrates this ordering for the first section of the example shown above.
Figure 2-9 Video Stream versus Display Ordering
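The reordering rule described above can be sketched as a small function. The picture labels (for example "B2") and the function name are made up for this illustration; the only rule applied is that each reference picture is moved ahead of the B-pictures that precede it in display order.

```python
def display_to_stream_order(display):
    """Reorder pictures so every reference (I or P) precedes the
    B-pictures that depend on it."""
    stream, pending_b = [], []
    for picture in display:
        if picture.startswith("B"):
            pending_b.append(picture)   # wait for the next reference picture
        else:
            stream.append(picture)      # send the reference first ...
            stream.extend(pending_b)    # ... then the B-pictures before it
            pending_b = []
    stream.extend(pending_b)            # trailing B-pictures, if any
    return stream

display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7"]
print(display_to_stream_order(display))
# ['I1', 'P4', 'B2', 'B3', 'P7', 'B5', 'B6']
```

The decoder therefore has to hold two reference pictures in memory while it reconstructs the B-pictures between them.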
Motion Compensation
Motion compensation is a technique for enhancing the compression of P- and B-pictures by eliminating temporal redundancy. Motion compensation typically improves compression by about a factor of three compared to intra-picture coding. Motion compensation algorithms work at the macroblock level.
Motion compensation means that redundant picture information which arises from coordinate shifts within a picture sequence is coded only as a vector that refers to an original block. When motion compensation is computed, however, a picture detail will not always continue identically across a series of pictures. In real video, a pixel block will always differ more or less from the preceding one because of background noise. When a person moves through the picture, for example, the fit or the shading of the clothing changes.
If the picture differences are significant, an error image must be coded in addition to the motion vector. The decision as to where a piece of picture content has moved can only be made on the basis of objective criteria. A video encoder therefore searches the neighborhood of the earlier source block for the pixel block with the greatest similarity (see figure). One conceivable decision criterion is the mean squared distance between the values of the two 16x16 pixel blocks: the squares of the differences of all luminance and chrominance values of the original block and of the candidate block within the search area are computed and summed. This yields a measure of the similarity of two blocks. If a block has carried over to the next picture without changing, the difference is zero. A very computation-intensive method would be to form the sum of squared differences for every conceivable displacement within the search area.
The encoder then selects as the best motion vector the one belonging to the block with the smallest squared distance from the original. The search for the best motion vector can be carried out at a resolution of one pixel or of half a pixel.
The vectors used for coding have a resolution of up to half a pixel. For the motion vector search, values can be interpolated linearly between neighboring pixels. Because the computational cost is considerable, different search strategies are used. For example, the grid of 48x48 integer displacements can be searched first, followed by the 8 neighboring positions at a distance of half a pixel.
Another method first searches a coarse grid with a spacing of several pixels and then refines it step by step around the best position. This method needs even fewer steps, but the probability of finding the optimal motion vector is lower.
JPEG can usually compress and decompress pictures of good quality at a data reduction of 20 to 25 times. Through motion compensation, MPEG reaches three times that value. Taking into account that MPEG-1 video pictures are scaled down to CIF before the actual compression, the result is a data reduction by a factor of about 240. (CIF (Common Intermediate Format) corresponds to a resolution of 352x288 pixels (352x240 pixels for NTSC), which allows an integer division into 16x16 blocks.)
Figuratively speaking, this means that ten pixels with eight bits each for the red, green and blue values are represented by just one bit.
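The exhaustive search with a sum-of-squared-differences criterion can be sketched as follows. This is a simplified illustration under several assumptions: it uses luminance values only, integer-pixel displacements only, and a small search window; the names `ssd` and `full_search` and their parameters are hypothetical. The returned vector points from the block in the current picture to its best match in the reference picture.

```python
def ssd(ref, cur, rx, ry, cx, cy, size):
    """Sum of squared differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            d = ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx]
            total += d * d
    return total

def full_search(ref, cur, cx, cy, size=16, search=7):
    """Try every integer displacement in [-search, search] and return
    (dx, dy, cost) of the candidate with the smallest cost."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = ssd(ref, cur, rx, ry, cx, cy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]

# Toy pictures: one bright pixel that moves 3 pixels right and 2 pixels down.
ref = [[0] * 32 for _ in range(32)]
cur = [[0] * 32 for _ in range(32)]
ref[12][10] = 255
cur[14][13] = 255
print(full_search(ref, cur, 8, 8, size=8, search=4))  # (-3, -2, 0)
```

A cost of zero means the block carried over unchanged, so no error terms would need to be coded for it. The number of candidates grows with the square of the search range, which is why the coarse-to-fine strategies mentioned above are attractive.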
When a macroblock is compressed by motion compensation, the compressed file contains this information:
• The spatial vector between the reference macroblock(s) and the macroblock being coded (motion vectors)
• The content differences between the reference macroblock(s) and the macroblock being coded (error terms)
Not all information in a picture can be predicted from a previous picture. Consider a scene in which a door
opens: The visual details of the room behind the door cannot be predicted from a previous frame in which the
door was closed. When a case such as this arises--i.e., a macroblock in a P-picture cannot be efficiently represented by motion compensation--it is coded in the same way as a macroblock in an I-picture using transform coding techniques.
The difference between B- and P-picture motion compensation is that macroblocks in a P-picture use the
previous reference (I- or P-picture) only, while macroblocks in a B-picture are coded using any combination of
a previous or future reference picture.
Four codings are therefore possible for each macroblock in a B-picture:
• Intra coding: no motion compensation
• Forward prediction: the previous reference picture is used as a reference
• Backward prediction: the next reference picture is used as a reference
• Bidirectional prediction: two reference pictures are used, the previous reference picture and the next reference picture
Backward prediction can be used to predict uncovered areas that do not appear in previous pictures.
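The four macroblock codings can be illustrated with a small selector. This is only a sketch: blocks are flat lists of sample values, the motion-compensated blocks are assumed to have been fetched already, and the rounding used for the bidirectional average is an assumption made for this example rather than a quotation from the standard.

```python
def predict_b_block(mode, forward_block=None, backward_block=None):
    """Return the prediction for a B-picture macroblock, or None for intra."""
    if mode == "intra":
        return None  # no prediction, the block is transform coded directly
    if mode == "forward":
        return list(forward_block)
    if mode == "backward":
        return list(backward_block)
    if mode == "bidirectional":
        # Average of both references (rounding up on .5 is assumed here).
        return [(f + b + 1) // 2 for f, b in zip(forward_block, backward_block)]
    raise ValueError("unknown prediction mode: " + mode)

print(predict_b_block("bidirectional", [10, 20], [20, 21]))  # [15, 21]
```

The averaging in the bidirectional case is also what reduces the effect of noise, because uncorrelated noise in the two reference pictures partly cancels.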
The MPEG method exploits the fact that consecutive pictures in a sequence of moving pictures are very similar. Except at abrupt scene changes, picture details continue smoothly from one picture to the next, for example a vehicle moving from left to right or a white cloud drifting across the background of a blue sky. A central component of MPEG is therefore what is called motion compensation:
The movement of the vehicle is simply described by a vector, for example by stating that the vehicle has moved 12 pixels to the right and 10 pixels up from one picture to the next. Recognizing a coherent object would be far too expensive in practice, however. Instead, macroblocks with a size of 16x16 pixels are examined. Each macroblock corresponds to 4 blocks of the kind coded in JPEG. In the next step, the difference between the actual macroblock in picture 1 and the shifted macroblock from picture 2 is formed. This error image must be coded and stored along with the displacement vector so that error propagation can be kept under observation. The least storage is needed, of course, when the difference between the shifted macroblocks and the blocks actually displayed is so small that coding the difference can be dispensed with altogether. MPEG governs the representation of compressed video by defining a syntax. The rules for determining the motion compensation, in contrast, leave a great deal of freedom, so the quality of the MPEG end product also depends substantially on the quality of the coding algorithm used.
Synchronization
The MPEG standard provides a timing mechanism that ensures synchronization of audio and video. The
standard includes two parameters: the system clock reference (SCR) and the presentation timestamp (PTS).
The MPEG-specified ``system clock'' runs at 90 kHz. System clock reference and presentation timestamp
values are coded in MPEG bitstreams using 33 bits, which can represent any clock cycle in a 24-hour period.
An SCR is a snapshot of the encoder system clock which is placed into the system layer of the bitstream, as
shown in Figure 2-11. During decoding, these values are used to update the system clock counter in the
CL480.
Figure 2-11 SCR Flow in MPEG System
Presentation timestamps are samples of the encoder system clock that are associated with video or audio
presentation units. A presentation unit is a decoded video picture or a decoded audio time sequence. The
PTS represents the time at which the video picture is to be displayed or the starting playback time for the audio time sequence.
The decoder either skips or repeats picture displays to ensure that the PTS is within one picture's worth of 90
kHz clock ticks of the SCR when a picture is displayed. If the PTS is earlier (has a smaller value) than the current SCR, the decoder discards the picture. If the PTS is later (has a larger value) than the current SCR, the
decoder repeats the display of the picture.
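The timing rules above can be checked with a short sketch. It computes how long a 33-bit counter at 90 kHz runs before it wraps, and it applies the skip/repeat rule with a tolerance of one picture period. The function names are hypothetical, and counter wraparound is deliberately ignored to keep the example short.

```python
CLOCK_HZ = 90_000        # MPEG system clock
TIMESTAMP_BITS = 33      # width of SCR and PTS fields

def wrap_period_hours():
    """Time span a 33-bit timestamp can cover at 90 kHz, in hours."""
    return (2 ** TIMESTAMP_BITS) / CLOCK_HZ / 3600

def sync_action(pts, scr, frame_rate):
    """Decide what to do with a picture, given its PTS and the current SCR."""
    ticks_per_picture = CLOCK_HZ / frame_rate
    delta = pts - scr
    if delta < -ticks_per_picture:
        return "skip"      # picture is late: discard it
    if delta > ticks_per_picture:
        return "repeat"    # picture is early: show the previous one again
    return "display"

print(round(wrap_period_hours(), 1))   # 26.5
print(sync_action(1000, 0, 25))        # display
```

The wrap period of roughly 26.5 hours is what allows a 33-bit value to identify any clock cycle within a 24-hour period. At 25 pictures per second, one picture corresponds to 3600 clock ticks.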
MPEG-1 Layer 3 Audio Compression
This chapter gives a brief introduction to MP3 and pointers to encoders and players.
Overview
The ISO/MPEG Audio Coding Standard describes the compression of audio signals using high performance
perceptual coding schemes. It specifies a family of three audio coding schemes, simply called Layer 1, Layer
2 and Layer 3.
Compression gain (sound quality per bit) and encoder complexity increase from Layer 1 to Layer 3.
All Layers use the same basic structure. The coding scheme can be described as perceptual noise shaping
or perceptual subband/transform coding.
The encoder analyses the spectral components of the audio signal by calculating a filterbank or transform
and applies a psychoacoustic model to estimate the just noticeable noise-level. In its quantization and coding
stage, the encoder tries to allocate the available number of data bits in a way to meet both the bitrate and
masking requirements.
The decoder is much less complex. Its task is to synthesize an audio signal out of the encoded spectral components.
Compression rates:
You can achieve a compression rate of
1:4 with Layer 1 (or 192 kbps per audio channel),
1:6..8 with Layer 2 (or 128..96 kbps per audio channel), and
1:10..12 with Layer 3 (or 64..56 kbps per audio channel),
and the reconstructed audio signal will maintain a CD-like sound quality.
There is a lot of confusion surrounding the terms audio compression, audio encoding, and audio decoding.
This section will give you an overview of what audio coding (another one of these terms...) is all about.
The purpose of audio compression
Up to the advent of audio compression, high-quality digital audio data took a lot of hard disk space to store.
Let us go through a short example.
You want to, say, sample your favorite 1-minute song and store it on your harddisk. Because you want CD
quality, you sample at 44.1 kHz, stereo, with 16 bits per sample. 44100 Hz means that you have 44100 values per second coming in from your sound card (or input file). Multiply that by two because you have two
channels. Multiply by another factor of two because you have two bytes per value (that's what 16 bit means).
The song will take up
44 100 samples/sec * 2 channels * 2 bytes/sample * 60 sec/min ≈ 10 Mbyte/min
That means about 10 MB of storage space on your harddisk per minute. If you wanted to download that over the internet,
given an average 28.8 modem, it would take you (at least)
10 000 000 bytes * 8 bits/byte / (28 800 bits/sec * 60 sec/min) ≈ 46 min
More than ¾ h just to download one minute of music!
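The arithmetic of this example can be reproduced with a few lines of Python. The function names are made up for this sketch, and the modem is assumed to deliver its full nominal 28 800 bits per second, which is why the result is a lower bound.

```python
def pcm_bytes(seconds, sample_rate=44_100, channels=2, bytes_per_sample=2):
    """Storage needed for uncompressed PCM audio."""
    return seconds * sample_rate * channels * bytes_per_sample

def download_minutes(num_bytes, bits_per_second=28_800):
    """Transfer time in minutes at the given line rate."""
    return num_bytes * 8 / bits_per_second / 60

print(pcm_bytes(60))                           # 10584000 bytes per minute
print(round(download_minutes(10_000_000), 1))  # 46.3 minutes
```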
Digital audio coding, which - in this context - is synonymously called digital audio compression as well, is the
art of minimizing storage space (or channel bandwidth) requirements for audio data. Modern perceptual audio
coding techniques (like MPEG Layer-3) exploit the properties of the human ear (the perception of sound) to
achieve a size reduction by a factor of 12 with little or no perceptible loss of quality.
Therefore, such schemes are the key technology for high quality low bit-rate applications, like soundtracks for
CD-ROM games, solid-state sound memories, Internet audio, digital audio broadcasting systems, and the
like.
The two parts of audio compression
Audio compression really consists of two parts. The first part, called encoding, transforms the digital audio
data that resides, say, in a WAVE file, into a highly compressed form called bitstream. To play the bitstream on
your soundcard, you need the second part, called decoding. Decoding takes the bitstream and re-expands it
to a WAVE file. The program that effects the first part is called an audio encoder. MP3Enc is such an encoder;
there are others, see http://www.fhg.iis.de/audio/.
The program that does the second part is called an audio decoder. One well-known MPEG Layer-3 decoder
is WinPlay3, another is l3dec. Both can be found on http://www.fhg.iis.de/audio/.
Compression ratios, bitrate and quality
It has not been explicitly mentioned up to now: What you end up with after encoding and decoding is not the
same sound file anymore: All superfluous information has been squeezed out, so to say. It is not the same file,
but it will sound the same - more or less, depending on how much compression has been performed on it.
Generally speaking, the lower the compression ratio achieved, the better the sound quality will be in the end -
and vice versa. Table 1.1 gives you an overview of the quality achievable.
Because compression ratio is a somewhat unwieldy measure, experts use the term bitrate when speaking of
the strength of compression. Bitrate denotes the average number of bits that one second of audio data will
take up in your compressed bitstream. Usually the unit used will be kbps, which is kbits/s, or 1000 bits/s.
To calculate the number of bytes per second of audio data, simply divide the number of bits per second by
eight.
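The relation between bitrate, bytes per second and compression ratio can be sketched as follows. The sketch assumes 1 kbit = 1000 bits and CD-quality source audio (44.1 kHz, stereo, 16 bits); the function names are hypothetical.

```python
def bytes_per_second(kbps):
    """Bytes of compressed data per second of audio."""
    return kbps * 1000 / 8

def compression_ratio(kbps, sample_rate=44_100, channels=2, bits_per_sample=16):
    """Ratio of the uncompressed PCM bitrate to the compressed bitrate."""
    pcm_bits_per_second = sample_rate * channels * bits_per_sample
    return pcm_bits_per_second / (kbps * 1000)

print(bytes_per_second(128))              # 16000.0
print(round(compression_ratio(128), 1))   # 11.0
```

A stereo stream at 128 kbps therefore corresponds to a compression ratio of about 1:11, which matches the range quoted for Layer 3.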
Fraunhofer Homepage: http://www.iis.fhg.de/amm/
MP3 FAQ: http://www.iis.fhg.de/amm/techinf/layer3/layer3faq/index.html
Profiles
This chapter presents some of the profiles that are used when creating MPEG files.
What do MPEG-1 and MPEG-2 have in common, and what are the differences?
MPEG-1 is suitable for low and medium data rate applications producing image quality comparable to VHS
tape. Such applications include computer multimedia CD-ROM titles, computer games, video training materials, video databases, and networked video applications. It is generally used for video of resolutions up to
352x288 and data rates up to 2 Mbits/sec.
MPEG-2 is designed for higher quality video applications like DVD, video on demand, and digital
broadcasting. It is generally used for resolutions and data rates greater than those listed above for MPEG-1.
MPEG-1 and MPEG-2 are international standards. The MPEG files are operating system independent, unlike
formats such as AVI and QuickTime. MPEG-1 playback is now standard on most systems sold today. With
the proliferation of DVD players on PCs, the same is happening with MPEG-2. In most cases, MPEG provides better quality video in a smaller file size than AVI or QuickTime codecs. In the past few years, it has become very easy to capture analog input video as an AVI file with an inexpensive video capturing card on a
PC, optionally edit the AVI video file, and then convert the AVI file to MPEG.
Profiles and Layers
Although it is possible to create MPEG-1 files with frame sizes of up to 4095x4095 and data rates greater
than 1812 kbits/sec, not all decoders can play such MPEG-1 streams. There is a minimum required set of parameters of MPEG-1 streams that many low-end hardware and software MPEG-1 players usually support.
The minimum set of parameters is specified below as "constrained parameters".
A constrained parameter bitstream (CPB) is defined in the MPEG-1 Standard as follows:
1. Horizontal frame size less than or equal to 768 pixels.
2. Vertical frame size less than or equal to 576 pixels.
3. Picture area less than or equal to 396 macroblocks (101376 pixels).
4. Frame rate less than or equal to 30 frames/sec.
5. Motion vector range less than or equal to 64 pixels.
6. Bitrate less than or equal to 1.86 Mbits/sec.
7. VBV buffer size less than or equal to 40 Kbytes. (40 Kbytes for constrained files and 224 Kbytes for non-constrained files.)
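A check against the limits listed above can be sketched like this. It is an illustration only: the motion vector limit is left out, 1.86 Mbits/sec is taken as 1 860 000 bits per second, 1 Kbyte is taken as 1024 bytes, and all names are hypothetical.

```python
import math

# Upper limits taken from the list above (units as assumed in the lead-in).
LIMITS = {
    "width": 768,
    "height": 576,
    "macroblocks": 396,
    "frame_rate": 30,
    "bitrate_bps": 1_860_000,
    "vbv_bytes": 40 * 1024,
}

def constrained_violations(width, height, frame_rate, bitrate_bps, vbv_bytes):
    """Return the names of all limits that the given stream exceeds."""
    actual = {
        "width": width,
        "height": height,
        # A picture is covered by whole 16x16 macroblocks.
        "macroblocks": math.ceil(width / 16) * math.ceil(height / 16),
        "frame_rate": frame_rate,
        "bitrate_bps": bitrate_bps,
        "vbv_bytes": vbv_bytes,
    }
    return [name for name, limit in LIMITS.items() if actual[name] > limit]

print(constrained_violations(352, 288, 25, 1_150_000, 40 * 1024))     # []
print(constrained_violations(720, 480, 29.97, 1_150_000, 40 * 1024))  # ['macroblocks']
```

The second call shows why the picture area limit matters: 720x480 stays within the width and height limits, but it needs 1350 macroblocks and is therefore not a constrained stream.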
Constrained Parameters do not apply to MPEG-2 streams, as they have broader applications than CD-ROM
or VideoCD. MPEG-2 was designed to be a very generic standard in that it is to be used for a variety of applications, everything from DVD and computer video to digital satellite and HDTV systems. MPEG-2 does,
however, have defined Profiles and Levels of compatibility. Profiles specify syntax (i.e. algorithms), and
Levels specify coding parameters (sample rates, frame dimensions, coded bitrates, etc.). Defined together,
Profiles and Levels specify interchange standards for specific applications of MPEG-2.
As shown in the table below, there are 5 Profiles (Simple, Main, SNR, Spatial and High), each with a maximum of 4 possible Levels (Low, Main, High-1440, and High). Not all combinations have been defined in
the MPEG-2 specification. MPEG-2 Main Profile at Main Level (MP@ML) can be considered similar to
MPEG-1's constrained parameters, and supports up to 720 pixels x 480 lines x 30 frames/sec, at a total
sampling rate up to 10.4 Msamples/second (i.e., consistent with the CCIR-601 video format standard).
If compatibility with a specific MPEG-2 application or decoder is important in your work, be sure to specify the
correct parameters, and the correct Profile/Level combination. The following table shows the MPEG-2
Profile (horizontal) / Level (vertical) cross-reference structure, along with the values that define the upper limits of each combination. For a more in-depth explanation of this system, check the MPEG-2 specification
or a reference on the subject.
Level / Profile cross-reference (upper limits of each combination):
LOW
• SIMPLE: undefined
• MAIN (MP@LL): 352 pels/line, 288 lines/frame, 30 frames/sec, 3.04 Msamples/s, 4 Mbits/s
• SNR: 352 pels/line, 288 lines/frame, 30 frames/sec, 3.04 Msamples/s, 4 Mbits/s both layers, 3 Mbits/s base layer
• SPATIAL: undefined
• HIGH: undefined
MAIN
• SIMPLE: 720 pels/line, 576 lines/frame, 30 frames/sec, 10.4 Msamples/s, 15 Mbits/s
• MAIN (MP@ML): 720 pels/line, 576 lines/frame, 30 frames/sec, 10.4 Msamples/s, 15 Mbits/s
• SNR: 288 lines/frame, 30 frames/sec, 3.04 Msamples/s, 4 Mbits/s both layers, 3 Mbits/s base layer
• SPATIAL: undefined
• HIGH: 720 pels/line, 576 lines/frame, 30 frames/sec, 11.06 Msamples/s or 14.75 Msamples/s, 20 Mbits/s 3 layers, 15 Mbits/s base + middle, 4 Mbits/s base layer
HIGH 1440
• SIMPLE: undefined
• MAIN: 1440 pels/line, 1152 lines/frame, 60 frames/sec, 47 Msamples/s, 60 Mbits/s
• SNR: undefined
• SPATIAL: 1440 pels/line, 1152 lines/frame, 60 frames/sec, 47 Msamples/s, 60 Mbits/s 3 layers, 40 Mbits/s base + middle, 15 Mbits/s base layer
• HIGH: 1440 pels/line, 1152 lines/frame, 60 frames/sec, 47 Msamples/s or 62.7 Msamples/s, 80 Mbits/s 3 layers, 60 Mbits/s base + middle, 20 Mbits/s base layer
HIGH
• SIMPLE: undefined
• MAIN: 1920 pels/line, 1152 lines/frame, 60 frames/sec, 62.7 Msamples/s, 80 Mbits/s
• SNR: undefined
• SPATIAL: undefined
• HIGH: 1920 pels/line, 1152 lines/frame, 60 frames/sec, 62.7 Msamples/s or 83.5 Msamples/s, 100 Mbits/s 3 layers, 80 Mbits/s base + middle, 25 Mbits/s base layer
The following sections describe which of these combinations are supported by selected encoders.
Predefined Profiles (Ligos)
Note: In the following, the term Profile covers Profile/Level combinations (see above).
This first set of four are MPEG-1 Profiles very suitable for Internet use. Using a combination of low data rates and our special “Skip B frames” mode, these profiles produce very compact files suitable for e-mail or
web pages.
1. MPEG-1 (Low bitrate, 10 fps simulated)
MPEG-1, 30 fps (with special “Skip B frames” mode enabled to simulate 10 fps), 190 kbits/sec video data
rate, 96 kbits/sec data rate MPEG-1 Layer II audio
2. MPEG-1 (Low bitrate, 15 fps simulated)
MPEG-1, 29.97 fps (with special “Skip B frames” mode enabled to simulate 15 fps), 498 kbits/sec video
data rate, 96 kbits/sec data rate MPEG-1 Layer II audio
3. MPEG-1 (Low bitrate, 15 fps simulated, variable bitrate)
MPEG-1, 23.976 fps (with special “Skip B frames” mode enabled to simulate 15 fps), variable bitrate mode with a peak maximum of 600 kbits/sec video data rate, average bitrate of 350 kbits/sec, and 96
kbits/sec data rate MPEG-1 Layer II audio
4. MPEG-1 (Low bitrate, 5 fps simulated, variable bitrate)
MPEG-1, 30 fps (with special “Skip B frames” mode enabled to simulate 5 fps), variable bitrate mode with
a peak maximum of 125 kbits/sec video data rate, average maximum bitrate of 75 kbits/sec, average minimum bitrate of 20 kbits/sec, and 32 kbits/sec data rate MPEG-1 Layer II audio
This next set of MPEG-1 Profiles are much more general and popular Profiles based on “good” data and frame rates for different input videos (based on frame size and frame rate).
1. MPEG-1 (Recommended for QSIF, 176x120)
MPEG-1, 24 fps, 600 kbits/sec video data rate, 96 kbits/sec data rate MPEG-1 Layer II audio
2. MPEG-1 (Recommended for SIF, 352x240 NTSC)
MPEG-1, 29.97 fps, 1198 kbits/sec video data rate, 128 kbits/sec data rate MPEG-1 Layer II audio
3. MPEG-1 (Recommended for SIF, 352x288 PAL)
MPEG-1, 25 fps, 1198 kbits/sec video data rate, 128 kbits/sec data rate MPEG-1 Layer II audio
4. MPEG-1 NTSC (General)
MPEG-1, 30 fps, 1098 kbits/sec video data rate, 112 kbits/sec data rate MPEG-1 Layer II audio
5. MPEG-1 PAL (General)
MPEG-1, 25 fps, 1098 kbits/sec video data rate, 112 kbits/sec data rate MPEG-1 Layer II audio
This next set of files give you “one-click” access to the settings for VideoCD, both NTSC and PAL standard.
1. MPEG-1 VideoCD NTSC
MPEG-1 VideoCD, 29.97 fps, 1123 kbits/sec video data rate, 224 kbits/sec data rate MPEG-1 Layer II
audio
2. MPEG-1 VideoCD PAL
MPEG-1 VideoCD, 25 fps, 1123 kbits/sec video data rate, 224 kbits/sec data rate MPEG-1 Layer II audio
The remaining seven Profiles are best used for video larger than 352x288, and for higher data rates than listed above. They default to MPEG-2, and cover resolutions from Half-Horizontal Resolution (HHR, or Half
D1) up to fullscreen. Also included is an MPEG-2 Profile using the new Variable Bitrate mode that produces
excellent fullscreen quality in a compact file size.
1. MPEG-2 (Recommended for HHR, 352x480 NTSC)
MPEG-2, 30 fps, 2048 kbits/sec video data rate, 192 kbits/sec data rate MPEG-1 Layer II audio
2. MPEG-2 (Recommended for HHR, 352x576 PAL)
MPEG-2, 25 fps, 2048 kbits/sec video data rate, 192 kbits/sec data rate MPEG-1 Layer II audio
3. MPEG-2 (Recommended for NTSC Full Screen)
MPEG-2, 29.97 fps, 4096 kbits/sec video data rate, 224 kbits/sec data rate MPEG-1 Layer II audio
4. MPEG-2 (Recommended for PAL Full Screen)
MPEG-2, 25 fps, 4096 kbits/sec video data rate, 224 kbits/sec data rate MPEG-1 Layer II audio
5. MPEG-2 (Sample VBR profile for Full Screen)
MPEG-2, 29.97 fps, Variable Bitrate with maximum peak 6000 kbits/sec video data rate, maximum average of 3000 kbits/sec, minimum average of 1500 kbits/sec, 224 kbits/sec data rate MPEG-1 Layer II audio
6. MPEG-2 (Main Profile @ Low Level)
MPEG-2, 30 fps, 3906 kbits/sec video data rate, 112 kbits/sec data rate MPEG-1 Layer II audio
7. MPEG-2 (Main Profile @ Main Level)
MPEG-2, 30 fps, 14648 kbits/sec video data rate, 112 kbits/sec data rate MPEG-1 Layer II audio
Predefined Profiles (Xing)
The XingMPEG Encoder comes with a number of pre-defined Stream Profiles.
Audio Only Layer 3 Folder contains Stream Profiles for making audio only MPEG files for a wide range of target networks.
1. 128K Stream Profile creates CD quality layer 3 audio only MPEG-1 layer 3 files for broadband target networks. Great for music!
2. 112K Stream Profile creates CD quality layer 3 audio only MPEG-1 layer 3 files for broadband target networks. Great for music!
3. 64K Stream Profile creates radio quality audio only MPEG-2 layer 3 files for narrowband target networks.
4. 28.8 Modem Stereo Stream Profile creates telephone quality stereo audio only MPEG-2 layer 3 files for
narrowband target networks.
5. 28.8 Modem Mono Stream Profile creates telephone quality mono audio only MPEG-2 layer 3 files for
narrowband target networks.
MPEG 1 Folder contains Stream Profiles for making various kinds of MPEG files, from full size down to
smaller Internet based sizes.
1. Match Source Stream Profile creates an MPEG file that best matches the source file (near CD quality
audio). Use the Match Source Stream Profile when you are unsure which Stream Profile to use or to
avoid unnecessary re-encoding.
When you provide an MPEG system source file (.mpg), MPEG video source file (.mpv) or an MPEG audio
source file (.mpa), the XingMPEG Encoder tries to use that source file without re-encoding it. However, if your
Stream Profile does not match the source file's properties exactly (Data Rate, Resolution, Frame Rate, etc.),
the Encoder assumes you want the file re-encoded with the new properties. Use the Match Source Stream
Profile to prevent unnecessary re-encoding.
If you want to use an MPEG-2 or LBR Algorithm, you need to create a custom Match Source Stream Profile.
Use the MPEG 1 Match Source Stream Profile and change the Algorithm field to MPEG-2 or LBR.
1. NTSC Stream Profile creates a full-screen MPEG file following the US (NTSC) standard for color television broadcast signals (near CD quality audio).
2. PAL Stream Profile creates a full-screen MPEG file following the European standard (PAL) for color television broadcast signals (near CD quality audio).
3. FILM Stream Profile creates a full-screen MPEG file following the 35mm motion picture film standard
(near CD quality audio).
4. 600K Stream Profile creates a full-screen MPEG file with high radio quality audio suitable for broadband
target networks.
5. 384K Stream Profile creates a full-screen MPEG file with radio quality audio suitable for broadband target
networks.
6. 128K Stream Profile creates a quarter-screen MPEG file with radio quality audio suitable for narrowband
target networks.
VideoCD Folder contains Stream Profiles for making VideoCD MPEG files for the Whitebook Standard.
1. NTSC Stream Profile creates a VideoCD MPEG file following the US (NTSC) standard for color television
broadcast signals (CD quality audio).
2. PAL Stream Profile creates a VideoCD MPEG file following the European (PAL) standard for color television broadcast signals (CD quality audio).
3. FILM Stream Profile creates a VideoCD MPEG file following the 35mm motion picture film standard (CD
quality audio).
Audio/Video Layer 2 folder contains Stream Profiles for making audio and video MPEG files for a wide range
of target networks.
1. 1.5Mb Stream Profile creates a full-screen TV quality MPEG-1 file with near CD quality layer 2 audio for
broadband target networks.
2. 600K Stream Profile creates a full-screen TV quality MPEG-1 video file with high radio quality layer 2 audio for broadband target networks.
3. 384K Stream Profile creates a full-screen TV quality MPEG-1 video file with radio quality layer 2 audio for
broadband and narrowband target networks.
4. 128K ISDN to 28.8K Modem Stream Profile creates a quarter-screen MPEG-1 video file with radio quality
layer 2 audio for 128K ISDN down to 28.8K Modem target networks.
5. 128K ISDN to 14.4K Modem Stream Profile creates a quarter-screen MPEG-1 video file with radio quality
layer 2 audio for 128K ISDN down to 14.4K Modem target networks.
Audio/Video Layer 3 folder contains Stream Profiles for making audio and video MPEG files for a wide range
of target networks.
1. 1.5Mb Stream Profile creates a full-screen TV quality MPEG-1 video file with CD quality layer 3 audio for
broadband target networks.
2. 600K Stream Profile creates a full-screen TV quality MPEG-1 video file with near CD quality layer 3 audio
for broadband target networks.
3. 384K Stream Profile creates a full-screen TV quality MPEG-1 video file with radio quality layer 3 audio for
broadband and narrowband target networks.
4. 128K ISDN to 28.8K Modem Stream Profile creates a quarter-screen MPEG-1 video file with radio quality
layer 3 audio for 128K ISDN down to 28.8K Modem target networks.
Audio Only Layer 2 folder contains Stream Profiles for making audio only MPEG files for a wide range of target networks.
1. 384K Stream Profile creates a CD quality layer 2 audio only MPEG-1 file for broadband and narrowband
target networks.
2. 128K ISDN Stream Profile creates a near CD quality layer 2 audio only MPEG-1 file for broadband and
narrowband target networks.
3. 64K ISDN Stream Profile creates a high radio quality layer 2 audio only MPEG-2 file for narrowband target networks.
4. 14.4 Modem Stream Profile creates a radio quality LBR audio only file for narrowband target networks.
5. 28.8 Modem Stream Profile creates a radio quality layer 2 audio only MPEG-2 file for narrowband target
networks.
6. 9600 Modem Stream Profile creates a radio quality LBR audio only file for narrowband target networks.
Audio Only Layer 3 folder contains Stream Profiles for making audio only MPEG files for a wide range of target networks.
1. 192K Stream Profile creates the highest quality layer 3 audio only MPEG-1 file for broadband target networks. While Xing supports custom Stream Profiles up to 384K audio only layer 3 files, the quality gained
above 192K is negligible.
2. 128K ISDN Stream Profile creates a CD quality layer 3 audio only MPEG-1 file for broadband target networks. Great for music!
3. 64K ISDN Stream Profile creates a high radio quality layer 3 audio only MPEG-2 file for narrowband target networks.
4. 28.8 Modem Stereo Stream Profile creates a radio quality layer 3 audio only MPEG-2 file for narrowband
target networks.
5. 28.8 Modem Mono Stream Profile creates a radio quality layer 3 audio only MPEG-2 file for narrowband
target networks.
Profiles (Heuris)
There are several base templates in MPEG Power Professional. These are built in templates that optimize
MPEG encoding parameters for various types of applications. The base templates include:
1. CD 1X
Optimized for playback from single speed CD-ROM
2. CD 2X
Optimized for playback from double speed CD-ROM
3. CD-I
Optimized for playback from a CD-I disc. This template provides MPEG files that are properly “pinked” for
insertion into CD-I applications.
4. Internet
Low bit-rate encoding for transmission over the Internet.
5. Video CD
Optimized for Video CD – a format based on Philips’ White Book Standard. These files will work in Video CD 1.1 or 2.0 applications.
MPEG Encoder
The following sections describe several commercial and free MPEG encoders. The descriptions are taken largely from the online documentation supplied with each product.
Overview
The following MPEG video and audio encoders are described below:
1. Darim DVMPEG Encoder (http://www.darvision.com)
2. Ligos LSX MPEG Encoder (http://www.ligos.com)
3. Heuris MPEG Professional Encoder 2.0 (http://www.heuris.com)
4. Xing MPEG Encoder (http://www.xingtech.com/)
5. Panasonic MPEG-1 Encoder Plugin (http://www.pwi.co.jp/products/mpeg/)
6. AVI2MPG and BBMPEG (http://members.home.net/beyeler/bbmpeg.html) [Freeware!]
Further MPEG video and audio encoders are available from, among others:
1. Digami MEGAPEG MPEG Encoder (http://www.digami.com)
2. PixelTools MPEG Encoder (http://www.pixeltools.com/)
3. Berkeley MPEG_Encode (sequences of still images only, *.ppm!) [Freeware]
Further MPEG audio (MP3) encoders are available from, among others:
1. Fraunhofer MP3 Producer (max. 36 Kbit/sec, hmm)
2. Fraunhofer MP3 Encoder (command line)
3. BladeEncoder incl. tools (Feurio, ...)
4. Gogo MP3 Encoder
5. Real Jukebox
6. Real Producer G2 7.0 Beta
Video Encoding Parameters
To encode an MPEG file, many other parameters of the MPEG video stream must be specified. Among these Standard and Advanced parameters are (see definitions above):
1) the frame rate of the video
2) the resolution (frame size) of the video
3) the video and audio data rate, which is the average amount of data transferred in an MPEG stream per unit of time (usually kilobits or megabits per second)
4) the number of P frames that are to be stored between every pair of I frames
5) the number of B frames that are to be encoded between every pair of P frames
6) the maximum vertical and horizontal motion vector values for P and B frames, which are necessary to limit the area covered by the motion estimation process
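The P and B frame counts in the list above together determine the layout of a group of pictures (GOP). The following is a minimal sketch of that relationship, not code from any of the encoders described here. It assumes the B frame count applies between every pair of consecutive reference frames (I or P) and that frames are listed in display order; the function names `gop_pattern` and `gop_length` are hypothetical.

```python
def gop_pattern(p_frames: int, b_frames: int) -> str:
    """Return the frame types of one GOP in display order.

    p_frames: number of P frames between two consecutive I frames
    b_frames: number of B frames between consecutive reference frames
              (assumption: the same count is used before the next I frame)
    """
    if p_frames < 0 or b_frames < 0:
        raise ValueError("frame counts must be non-negative")
    pattern = "I"
    for _ in range(p_frames):
        # Each P frame is preceded by its run of B frames.
        pattern += "B" * b_frames + "P"
    # B frames that sit between the last P frame and the next I frame.
    pattern += "B" * b_frames
    return pattern


def gop_length(p_frames: int, b_frames: int) -> int:
    """Number of frames from one I frame up to (not including) the next."""
    return len(gop_pattern(p_frames, b_frames))


print(gop_pattern(4, 2))   # IBBPBBPBBPBBPBB
print(gop_length(4, 2))    # 15
```

With 4 P frames and 2 B frames, an I frame occurs every 15 frames, which is half a second of video at 30 frames per second.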
Darim DVMPEG Encoder 5.0
The Darim Vision MPEG compression software for Windows® 95, Windows® 98 and Windows NT™ (DVMPEG) is a versatile software-only tool that lets you create highly compressed MPEG video, audio and combined video/audio streams from existing movies or animation. Because DVMPEG is compatible with the Video for Windows™ industry standard for video and audio compression, the DVMPEG plug-in drivers can be used together with virtually any video editing or animation creation software. Examples include Adobe Premiere, Ulead Media Studio, Kinetix 3D Studio MAX, Asymetrix DVP, DPS VideoAction and many more.
In general, any application that can output AVI files compressed using the Microsoft Video for Windows™ interface will be able to produce MPEG files directly, thus saving a lot of storage space and time. We believe that you will be impressed with DVMPEG’s unparalleled features, quality, performance and ease of use.
The new DVMPEG plug-in drivers and applications for Windows 95/98/NT are native 32-bit software, specially designed for these platforms. This allows DVMPEG to take full advantage of modern CPU capabilities (such as MMX™ extensions) and operating system architecture.
The following are the most important features of DVMPEG plug-in drivers:
1. Easy to use add-on to your favorite video editing or computer graphics animation software
2. Creates MPEG1 and MPEG2 files ‘on the fly’, eliminating the need for any intermediate files; MPEG files do not have any limitations on their size other than the amount of available disk space
3. Any source files supported by the host video editing or CG animation software can be compressed; this includes AVI and QuickTime movies, and image sequences in various formats
4. High performance 32-bit video and audio encoding engine, optimized for Pentium™ or better CPUs with MMX™ extensions
5. Flexible video resolution settings from 32x32 pixels (for MPEG1) to 768x576 (for MPEG2)
6. Provides control over many advanced MPEG encoding parameters (video aspect ratio, GOP structure, relative size of I, P and B frames, etc.) (new!)
7. Several parameter presets are supplied for all kinds of input data and MPEG output
8. Can output interlaced or progressive MPEG2 video depending on source video and target playback platform (new!)
9. Batch MPEG compression: compatible with the batch processing commands of many video editing programs (new!)
10. User may see a preview of the resulting MPEG clip during encoding (new!)
Produces the following types of MPEG format:
1. MPEG1 layer II elementary audio (ISO/IEC 11172-3)
2. MPEG1 system stream (ISO/IEC 11172-1)
3. VideoCD (White Book) compatible
4. MPEG2 elementary video Main Profile @ Main Level (ISO/IEC 13818-2)
5. MPEG2 program stream (ISO/IEC 13818-1)
6. DVD compatible video track (Constant Bitrate only)
DVMPEG 5.0 also features the new AVI2MPEG front-end application, which can be extremely helpful for novice users and anyone who needs to convert existing video and audio files in AVI, TGA, BMP, JPEG and WAV formats into MPEG1 or MPEG2. The AVI2MPEG program helps the user choose optimal compression parameters depending on the type of the source video. See section 4.2 for more information on AVI2MPEG.
Finally, DVMPEG can be easily integrated into custom MPEG encoding applications. Naturally, it can be used via the standard Video for Windows™ API for generic video and audio compression. Alternatively, a set of custom COM objects and interfaces exported by the DVMPEG software can be used. Please contact Darim Vision for the description of these interfaces and to obtain a preliminary SDK.
Ligos LSX MPEG Encoder 3.0
LSX-MPEG Encoder is an application for transcoding AVI files into MPEG files. Specifically, the LSX-MPEG
Encoder can create multiplexed MPEG-1 and MPEG-2 video and audio streams. The LSX-MPEG Encoder is
optimized to achieve very fast encoding for most standard frame sizes and frame rates of video required in
multimedia applications.
Advantages:
LSX-MPEG Encoder utilizes Intel MMX™ technology for maximum MPEG encoding speeds. The LSX-MPEG Encoder automatically detects if it is running on an MMX compatible processor and will utilize those instructions for faster encoding, if available.
LSX-MPEG Encoder utilizes our revolutionary motion estimation algorithm for the fastest MPEG encoding available in software. Due to our LightSpeed algorithm, the LSX-MPEG Encoder works several times faster than other software encoding solutions, while providing virtually the best possible compression and quality. Our super fast motion estimation algorithm is available for licensing for software and hardware implementations of image compression and recognition.
The LSX-MPEG Encoder is flexible. Our simple and intuitive interface is easy for new users, but still allows
video professionals to control a number of MPEG encoding parameters. A special mode for creating very low
data rate MPEG video files makes the program great for creating low-bandwidth MPEG files for the Internet.
New features such as Variable Bitrate control allow the user to produce better MPEG files than ever before.
If you are new to MPEG encoding and the LSX-MPEG Encoder, you can get started immediately with our newest Quick Start Tutorial. Using an AVI clip available from our website at http://www.ligos.com/products/sample_clips.shtm, you’ll be able to quickly familiarize yourself with the powerful but simple interface of the LSX-MPEG Encoder, and immediately see the advantages MPEG has over AVI codecs for producing smaller, better looking video.
LSX-MPEG Encoder Features
1) Single pass Variable Bitrate control with three modes of control and operation
2) Support for 48 kHz audio input and output
3) Custom controls for adding MPEG Sequence Headers
4) Ability to specify “Closed GOPs” for increased compatibility with MPEG editing applications
5) Full implementation of Video Buffer Verifier (VBV), including protection from overflow and underflow errors
6) New "Edges" filter allows users to cover up edge garbage and noise that often accompanies source files captured from VHS
7) 18 new and revised encoding Profiles that are more flexible and easier to use. Encode anything from exceptionally low variable bitrate MPEG video for the Internet, to MPEG for VideoCD, to full-blown MP@ML MPEG-2, all with just a few clicks
8) Improved deinterlace function that creates higher quality progressive video from originally interlaced source
9) LSX-MPEG Encoder now includes the LSX-MPEG Player, a software filter that provides quality playback of MPEG-2 video directly through Microsoft's Windows Media Player
10) Improved, more efficient interface
11) New Quick Start Tutorial
Supported MPEG Output Formats:
1) MPEG-2 program streams (ISO/IEC 13818-1)
2) MPEG-2 video elementary streams (ISO/IEC 13818-2)
3) MPEG-1 system streams (ISO/IEC 11172-1), including White Book VideoCD
4) MPEG-1 video elementary streams (ISO/IEC 11172-2)
5) MPEG-1 audio layer II elementary streams (ISO/IEC 11172-3)
Specifications for MPEG-1 Encoding Support:
1) Supports user-defined frame sizes per the "constrained parameter bit stream" section of the MPEG-1 specification:
a) Horizontal frame sizes less than or equal to 768 pixels
b) Vertical frame sizes less than or equal to 576 pixels
c) Picture area less than or equal to 396 macroblocks/picture
d) Supports user-defined frame rates less than or equal to 30 frames/sec
e) Supports user-defined data rates less than or equal to 1812 Kbits/sec
f) Motion vectors less than or equal to 64
2) Supports creation of audio only streams (via *.mp2 temporary files)
3) Supports encoding of MPEG-1 files at either constant bitrate or variable bitrate
4) Supports creation of multiplexed MPEG-1 streams with audio stream (Layer 2 only) compression. The bit rate range follows the ISO/IEC 11172-3 standard. It also supports mono, stereo, dual channel, and joint stereo mode
5) Supports creation of “White Book” compliant MPEG-1 streams for VideoCD
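The frame size, frame rate and data rate limits quoted above can be checked mechanically before encoding. The following is a minimal sketch under stated assumptions: the limit values are copied from the list above rather than from the standard itself, a macroblock is taken to cover 16x16 luminance pixels, and the names `macroblocks_per_picture` and `check_limits` are hypothetical helpers, not part of the LSX-MPEG Encoder.

```python
import math

# Limits as quoted in the list above; copied from this text,
# not verified against ISO/IEC 11172-2.
MAX_WIDTH = 768          # pixels
MAX_HEIGHT = 576         # pixels
MAX_MACROBLOCKS = 396    # macroblocks per picture
MAX_FRAME_RATE = 30      # frames/sec
MAX_DATA_RATE = 1812     # Kbits/sec
MACROBLOCK_SIZE = 16     # a macroblock covers 16x16 luminance pixels


def macroblocks_per_picture(width, height):
    """Count the 16x16 macroblocks needed to cover a width x height picture."""
    return math.ceil(width / MACROBLOCK_SIZE) * math.ceil(height / MACROBLOCK_SIZE)


def check_limits(width, height, frame_rate, data_rate_kbits):
    """Return a list of limit violations; the list is empty if all limits hold."""
    problems = []
    if width > MAX_WIDTH:
        problems.append(f"width {width} > {MAX_WIDTH} pixels")
    if height > MAX_HEIGHT:
        problems.append(f"height {height} > {MAX_HEIGHT} pixels")
    blocks = macroblocks_per_picture(width, height)
    if blocks > MAX_MACROBLOCKS:
        problems.append(f"picture area {blocks} > {MAX_MACROBLOCKS} macroblocks")
    if frame_rate > MAX_FRAME_RATE:
        problems.append(f"frame rate {frame_rate} > {MAX_FRAME_RATE} frames/sec")
    if data_rate_kbits > MAX_DATA_RATE:
        problems.append(f"data rate {data_rate_kbits} > {MAX_DATA_RATE} Kbits/sec")
    return problems


print(check_limits(352, 240, 29.97, 1150))  # quarter-size picture
print(check_limits(720, 480, 29.97, 1150))  # full-size picture
```

The 352x240 picture needs 22 x 15 = 330 macroblocks and passes every check. The 720x480 picture stays within the width and height limits but needs 45 x 30 = 1350 macroblocks, so the picture area limit is the one that rules it out.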
Specifications for MPEG-2 Encoding Support:
1) Support for the following MPEG-2 Profiles and Levels:
a) Main Profile and Low Level
b) Main Profile and Main Level
c) High Profile and High-1440 Level
d) High Profile and High Level
2) Supports user-defined aspect ratios of 1:1, 4:3, 16:9, 2.21:1
3) Supports encoding of MPEG-2 files at either constant bitrate or variable bitrate
4) Supports creation of multiplexed MPEG-2 streams with audio streams (MPEG-1 Layer II) compression. The bit rate range follows the ISO/IEC 11172-3 standard. It also supports mono, stereo, dual channel, and joint stereo mode
Specifications for MPEG-2 Decoding (LSX-MPEG Player):
- Works directly with the Windows Media Player as a DirectShow filter, giving you a fully powered media player (DirectShow and ActiveMovie compatible)
- Support for real time decoding of MPEG-2 Program Streams (up to Main Profile@Main Level and 10 Mbps)
- Support for full screen playback of up to Full D-1 resolution files, NTSC (720x480, 30 fps) or PAL (720x576, 25 fps) on a Pentium II 450 MHz system with a DirectX compatible video card supporting YUV overlay mode, and Half D-1 on a 300 MHz Pentium II
- Optimized for most efficient use of processor
- Support for re-routing MPEG-1 System Stream video decode through our filter
Heuris MPEG Power Professional
MPEG Power Professional is the most widely used professional MPEG encoder. It provides the features and image quality demanded by professional video editors, without the expensive hardware. It is also the top-rated professional MPEG encoder, having won New Media magazine's prestigious Hyper Award in 1997 and 1998. MPEG Power Professional is available for Windows 95/98/NT, Compaq Alpha, and Power Macintosh systems.
This demo version of MPEG Power Professional is limited in the amount of video you can convert to MPEG1
or MPEG2. It will expire on December 1, 1999.
The ECL (Event Control List) stores information about all of the actions to be taken by the encoder. The information found in the ECL includes: filters (what type used, and when they are turned on and off), automatic
scene detection, I-frame injection points, search parameters, and telecine information. The ECL generated
by the analysis only feature is based on the best guess MPEG Power Professional can make based on the
information it has.
The Analysis Only Task Type allows you to manually review the suggested encoding events and accept or
reject any of them prior to encoding. This type of analysis produces a framework for reviewing suggested
encoding events. This is possible, because MPEG Power Professional takes the information it gathers during an Analysis Only pass over your video source and pumps it out into an editable file format called the Encoding Control List or .ECL file. Any encoding events which will be used when it actually encodes your video
source can be viewed with the corresponding time index or frame number.
Whenever you run an Analysis Only pass of your video source, MPEG Power Professional generates a corresponding .ECL file. Once the Analysis is complete, you can view the contents of the .ECL file, accept or
reject any of the encoding events it suggests and add or substitute your own encoding controls.
The ECL Editor by Timecode list shows you encoding events sequentially as they’re scheduled to occur. You
can scroll through all of the scheduled encoding events, adding and deleting events throughout the timeline.
You can also view the encoding events by frame. Highlight a timecode based encoding event from the list
and then click on the “By Frame” button. The ECL Editor by Frame dialog box appears, displaying the same
encoding event by frame, which you just viewed based on timecode. If you loaded your source video file, you
also see a thumbnail bitmap of the currently selected frame where the encoding event will occur. If you did
not load your video source file, the frame bitmap and the slider bar are disabled.
Xing MPEG Encoder 2.2
XingMPEG Encoder is a high performance software program that converts (encodes) new or existing audio
and/or video files into MPEG files.
For example you can:
1. Convert an existing .avi file into an MPEG video or audio file.
2. Convert an existing .wav file into an MPEG audio file.
3. Create fully compliant MPEG-1 system streams (video and audio streams combined).
4. Create MPEG-2 audio streams - including MPEG-2 layer 3 audio.
5. Create StreamWorks System (video and audio) streams for delivery using Xing's StreamWorks products.
6. Create audio and video files for VideoCDs, KaraokeCDs, and CD-i Movies.
7. Create VideoCD files that support Single Speed CD-ROM, Whitebook, and other popular MPEG formats; even still file formats.
8. Create MPEG files for quick downloading over the Internet or an intranet.
9. Create MPEG files from Apple's QuickTime .mov files.
10. Re-encode MPEG files: .mpa/.mp3 (audio), .mpv (video), and .mpg (system) files.
Some of the new features introduced in 2.20:
1. MPEG-2 layer 3 audio
Create MPEG-2 layer 3 files for the best audio quality at low bit rates (less than 112kbps).
2. MPEG-1 layer 3 audio
Create MPEG-1 layer 3 files for the best audio quality at moderate to high bit rates (112-320kbps).
3. Support for Apple's QuickTime .mov files
Create MPEG files from Apple's QuickTime .mov files.
4. Full system MPEG re-encoding
Re-encode MPEG files: .mpa/.mp3, (audio), .mpv (video), and .mpg (system) files.
5. Batch processing
Encode the same file types in an entire directory using the Make Batch button. A wildcard is added preceding the file's extension, and the similar file types are encoded as a batch job.
Panasonic MPEG-1 Encoder Plugin 2.0
This software is a plug-in for Adobe Premiere 5.x that covers the full MPEG1 encoding specification and features an original encoding engine developed by Matsushita Electric Industrial Co., Ltd.
[Features]
1. Encodes high resolution video data (up to 1024x1024; dimensions must be multiples of 16)
2. 3 choices of Quantizer Matrix for image types (Natural Image, CG/Cartoon, MPEG1 Standard)
3. Variety of filters to enhance image quality:
Noise Reduction (improves the input image quality)
Smoothing Video Filter (improves the MPEG image quality)
4. Wide range of encoding data rates, from 600 Kbps to 15 Mbps
5. Forced Intra Frame function can change any frame to an Intra frame.
6. Uses the same user interface as Premiere 5.x for parameter settings. The plug-in is installed completely into the "Export Movie Settings" panel in Premiere 5.x.
7. Gamma correction of MPEG data for PC and TV color characteristics
8. Can create a low frame rate MPEG1 movie that does not comply with the MPEG1 standard but plays on most MPEG1 players.
9. Creates an MPEG1 System Stream for the VideoCD V2.0 Specification.
[Specification]
Version of MPEG Video: MPEG1 (ISO/IEC 11172-2 compliance)
Output Stream Data: System/Video/Audio (Layer2)
Hardware Configuration: Pentium/Pentium-II based IBM-PC or compatibles; minimum system memory 64MB
Operating System: Windows95, Windows98 and WindowsNT4.0 or higher
Frame Resolution (WxH): 64x64 to 1024x1024 [pixel] (multiples of 16)
MPEG Video Stream Data Rate: 600K to 15M [bit/sec]
Video Frame Rate: 10/15/23.976/24/25/29.97/30 [frames/sec]
Audio Stream Data Rate: 64/96/128/192/224/384 [Kbit/sec]
Audio Mode: 32/44.1/48 [KHz]
Other Functions:
Noise Reduction Filter for Input Data
Smoothing Filter for Output MPEG Data
VBV Buffer Size Selectable
GOP Sequence Selectable
Forced Intra Frame function
3 choices of Quantizer Matrix
Gamma correction for MPEG Data (for PC and TV)
Motion compensation (half pel/full pel)
[How to install...]
1. Confirm that Premiere 5.x is installed correctly before installing this software.
2. Run "dmpegpie.exe".
3. The installer starts automatically; follow the dialogs.
4. After starting Premiere 5.x, check the "File Type" list box in "General Settings" of the "Export Movie Settings" panel. If "Panasonic MPEG/Trial" appears in the list, the installation finished correctly.
[Limitations of this trial version]
1. Movies for encoding are limited to 30 seconds.
2. A time expiry function is implemented (1 month).
3. "Panasonic MPEG1 Encoder" is printed on the MPEG encoded movie.
4. Encoding may fail when a high resolution video is encoded with a low data rate setting.
<Workaround for this problem>
Increase the video data rate:
320x240/30fps: data rate more than 800kbps
640x480/30fps: data rate more than 2000kbps
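One way to compare the two recommended minimum data rates is to express each as an average number of bits per pixel. The following is a minimal sketch, assuming 1 kbps means 1000 bits per second; the helper name `bits_per_pixel` is hypothetical, and the plug-in documentation itself does not state its recommendation in these terms.

```python
def bits_per_pixel(width, height, frames_per_sec, data_rate_kbps):
    """Average number of coded bits available per pixel of video.

    Assumption: 1 kbps = 1000 bits per second.
    """
    pixels_per_sec = width * height * frames_per_sec
    return data_rate_kbps * 1000 / pixels_per_sec


# The two minimum settings recommended above.
print(round(bits_per_pixel(320, 240, 30, 800), 3))    # 0.347
print(round(bits_per_pixel(640, 480, 30, 2000), 3))   # 0.217
```

The larger frame has four times as many pixels but only 2.5 times the recommended data rate, so by this measure it is given fewer bits per pixel than the smaller frame.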
The plug-in can only be used together with a video editing system such as Adobe Premiere.
bbMPEG 1.1 and AVI2MPG2 1.8
bbMPEG and AVI2MPG2 are Windows programs that convert AVI files to MPEG-2 or MPEG-1 (including VideoCD) files. They are freeware. The file bbMPEG.DLL is also a compiler/export plug-in for ADOBE Premiere
5.0 or higher (it will not work with version 4.2). The file AVI2MPG2.EXE is a front-end for bbMPEG.DLL so it
can be used without ADOBE Premiere.
This software was written with the goal in mind of creating MPEG-2 program streams from AVI files captured
by a MotionJPEG video capture board that could be played on the Creative Labs PC-DVD Encore Dxr2
hardware (which has since died, may have to upgrade to a Dxr3 or a Hollywood+ card). All testing was done
on this software with AVI files that had the following specs:
1. MPEG-2 - Video: 640x480 @ 29.97 (or 30) fps, Audio: 16-bit 44.1kHz stereo.
2. MPEG-1 (and VideoCD) - Video: 352x240 @ 29.97 fps, Audio 16-bit 44.1kHz stereo.
If you do encode other types of AVI files and run into problems, let me know and I will try to fix or help you fix
the problem.
The software generates MPEG-2 (ISO/IEC 13818-2) or MPEG-1 (ISO/IEC 11172-2) video streams, MPEG-1
(ISO/IEC 11172-3, layer 1 and 2 only) audio streams and MPEG-2 (ISO/IEC 13818-1) or MPEG-1 (ISO/IEC
11172-1) program streams (including VideoCD compliant streams) or almost any combination of the above.
You can just do multiplexing if you want to, you don't have to encode video or audio. It can also multiplex AC3
audio streams into an MPEG-2 program stream.
The video encoding was derived from MSSG (MPEG Software Simulation Group) MPEG-2 video codec, version 1.2. The audio encoding was derived from the MPEG/Audio Software Simulation Group's audio codec,
version 4.0. The multiplexing was derived from Christoph Moar's MPLEX, version 1.1. Visit www.mpeg.org for
links to all of the above software.
bbMPEG requires either Win95, Win98 or WinNT. It also requires a Pentium processor.
MPEG Player
This chapter gives brief pointers to available players.
Overview
The following programs are recommended for playing MPEG-1 video and audio files:
1. Microsoft Media Player 6.4
2. Ligos LSX MPEG Player
3. Xing MPEG Player
The following programs are recommended for playing MPEG-2 video and audio files (if possible with a hardware MPEG-2 decoder or at least a "helpful" graphics card, see the following section):
1. Creative DVD Player
2. WinDVD
3. PowerDVD
Helpful graphics cards are those that carry out the decoding steps Inverse DCT (iDCT) and/or Motion Compensation independently of the CPU. The minimum requirement is overlay capability, which unfortunately is also not available on all cards, especially in notebooks!
The following programs are recommended for playing MPEG-1 audio files (MP3):
1. WinAmp
2. Fraunhofer MP3 Player
3. Real Player G2 7.0
4. Sonique
What is the best video card to play DVDs ?
First of all, don't be fooled by adverts you may have read !
Voodoo3, TNT2, as well as G400 cards are NOT DVD accelerated at all !!!
All these amazing cards are just overlay compatible. They just support the min specifications
needed to play DVDs !
To have a smooth playback, you need a fast CPU and a video card that has fast colorspace
conversion (YUV to RGB). There's NO need to have a DVD accelerated card if your CPU is fast
enough (PentiumII 400+MHz).
If you have a mid-range CPU (K6-2 300MHz to PentiumII 350MHz) you may need some
hardware assistance (Motion Compensation or iDCT), if not a specific MPEG-2 card.
We won't list all true DVD accelerated video cards, but keep in mind that there are several
Motion Compensation or iDCT specifications (ATi MC, S3 MC, ATi iDCT etc.), and not every
software DVD decoder knows them all (PowerDVD v1.50+ knows S3 MC and ATi MC but
cannot use Rage128 iDCT; Cinemaster knows ATi's MC and iDCT but cannot use S3 MC).
Once again, the video card you have must match the DVD software specifications, if you
want to use some hardware acceleration your own video card supports.
Finally, to answer the question, it's better to invert its terms !
What video cards shouldn't you have to play DVDs ?
Older cards such as the Matrox Mystique/Mystique 220/Millennium/Millennium II, as well as S3 Trio
and ALL OLDER cards are NOT overlay/DVD compatible.
(If you use an MPEG-2 card, these old cards will be OK ! Overlay is done by the MPEG-2 card !)
All other new video cards are overlay/DVD compatible and can be used to play DVD.
You still want a brand and a model ?
Well, if you need a cheap DVD system, go for the well known ATi Rage Pro card ! It's
definitely a bad choice if you're a hard core 3D gamer, but the Rage Pro has really super
fast colorspace conversion, and its Motion Compensation is known by any recent and good
software DVD decoder !
The Rage128 is far from the best 3D cards, but it's got a powerful iDCT that may help
mid-range CPUs to render DVD playback the right way.
Whatever brand you have got, be sure to always install the latest drivers available !
Microsoft Media Player 6.4
Microsoft Windows Media Player is a universal media player you can use to receive audio, video, and mixed-media files in most popular formats. Use Windows Media Player to listen to or view live news updates or
broadcasts of your favorite sports team, to review a music video on a Web site, to "attend" a concert or seminar, or to preview clips from a new movie.
Media formats supported by Windows Media Player
The following types of media files can be played by Microsoft Windows Media Player. When you open a stored file that has one of the extensions listed below, either by double-clicking a file icon or a link in a Web page, Windows Media Player starts.
Microsoft Windows Media formats
File name extensions: .avi, .asf, .asx, .rmi, .wav, .wma, .wax
Moving Pictures Experts Group (MPEG)
File name extensions: .mpg, .mpeg, .m1v, .mp2, .mp3, .mpa, .mpe
Musical Instrument Digital Interface (MIDI)
File name extensions: .mid, .rmi
Apple QuickTime®, Macintosh® AIFF Resource
File name extensions: .qt, .aif, .aifc, .aiff, .mov
UNIX formats
File name extensions: .au, .snd
Problems, Tips and Tricks
This chapter gives advice on how best to record videos, digitize them and convert them to MPEG.
Trixter's Desktop MPEG-1 Authoring FAQ
This FAQ can always be found at: http://www.oldskool.org/mpeg/. The HTML version has embedded hypertext anchors to all of the software packages mentioned in this document.
This FAQ attempts to answer some of the more common questions about authoring MPEG files (including
Video CDs) that crop up on rec.video.desktop. While the questions and answers listed here are Windows/Premiere-centric, there are many concepts presented that apply to all OS platforms and editing packages.
Disclaimer: I am not a video editing professional; I don't do this for a living. But I have worked with digital video on the desktop for almost a decade and MPEG-1 for half a decade, and have come to several conclusions about creating MPEGs that make sense. Maybe you agree with me; maybe not. Write me at [email protected] and let me know if you find a glaring error in my conclusions (or if I'm leaving something
major out).
Disclaimer #2: To make this document easier to understand, I assume that you're using NTSC. To wit:
• Captured video is at 30 frames (60 fields) per second.
• A full capture has 480 lines of resolution (720x480, for example).
• A "half" capture has 240 lines of resolution (352x240, for example).
If your country's broadcast standard isn't NTSC, you'll have to substitute your country's numbers for what's listed in this document. For example, PAL is 576 full lines of res, 288 "half" lines of res, and a framerate of 25
("fieldrate" of 50).
What core ideas should I know about before I begin reading this FAQ?
Core Idea #1: The quality of most MPEG encoders is directly tied to the quality of the input you give them.
Remember the old adage, "Garbage in, Garbage out?" It's most evident when encoding MPEGs. If you give
an encoder a noisy signal with lots of weak broadcasting artifacts, the encoder will try to include all of that in
the output, which makes for a noisy bitstream. If your source is extremely clean (or live, like the live output of
a video camera), your end result will be clean. Some encoders are much better than others, but the primary
factor affecting the output is the quality of your input.
Core Idea #2: Frames vs. Fields. Video is 30 frames a second, right? Wrong. Video has a framerate of 30,
but each frame consists of two interlaced fields. A field is a completely new picture. Here's another way to
understand it: Each NTSC "image" is made up of 240 lines. A 480-line capture, therefore, has two "images" in
it--the odd scanlines (1, 3, 5, etc.) make up the first image, and the even scanlines (2, 4, 6, etc.) make up the
second image. The second image is displayed 1/60th of a second after the first image, then you move onto
the next frame. If you still have trouble understanding this, try playing a video with high motion in it in your
VCR and then hit "pause". Notice how the frozen frame tends to "flicker" or "jitter" quickly between two different images? That's because only one frame is being displayed, and it is quickly alternating between the two
fields 60 times a second.
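To make the frame/field split concrete, here is a minimal sketch. It assumes a frame is simply a Python list of scanlines; the function name `split_fields` is made up for illustration and does not come from any capture package.

```python
def split_fields(frame):
    """Split an interlaced frame (a list of scanlines) into its two fields.

    Scanlines are numbered from 1 as in the text, so the first field holds
    lines 1, 3, 5, ... and the second field holds lines 2, 4, 6, ...
    """
    first_field = frame[0::2]   # scanlines 1, 3, 5, ... (list indices 0, 2, 4, ...)
    second_field = frame[1::2]  # scanlines 2, 4, 6, ...
    return first_field, second_field


# A toy 480-line "frame": each scanline is represented only by its line number.
frame = list(range(1, 481))
first, second = split_fields(frame)
print(len(first), len(second))   # 240 240
print(first[:3], second[:3])     # [1, 3, 5] [2, 4, 6]
```

Each field ends up with 240 lines, which is why a 240-line capture only ever sees one of the two "images" per frame.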
Core Idea #3: Software MPEG encoding takes a really, really long time unless you have a 500MHz (or faster)
machine. Hardware encoders are either real-time (they encode the video as fast as it comes in) or faster than
real-time (they encode off of .AVI files at about 3:1 or faster--a minute of video gets encoded in 20 seconds).
The above information seems useless right now, but you may find it useful later.
What process should I follow to create the best possible MPEGs?
It depends on what your needs are, but what most people in rec.video.desktop want to do is create Video
CDs (MPEG-1, about 170Kbytes/second, up to 70 minutes of video+audio on a CDROM) that are as close to
the original video source as possible. Here's a generic overview of what to do:
1. Capture full-frame video (480 lines).
2. De-interlace the video frames. This properly combines the two captured fields into a single frame.
3. Smoothly resize the de-interlaced frames down to your output size, typically 352x240. (The "smooth
resize" process is sometimes called "resampling".)
4. Encode the resized frames with a software encoder.
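Steps 2 and 3 above can be sketched in a few lines. This is only an illustration under stated assumptions: frames are lists of rows of greyscale integers, deinterlacing is done by blending (averaging) each pair of adjacent scanlines, which also halves the height, and the horizontal resize is a crude box filter. It is not necessarily what Premiere or any other editing package does internally, and all function names are hypothetical.

```python
def blend_fields(frame):
    """Average each first-field/second-field scanline pair: 480 lines in, 240 out."""
    return [
        [(a + b) // 2 for a, b in zip(top, bottom)]
        for top, bottom in zip(frame[0::2], frame[1::2])
    ]


def shrink_row(row, new_width):
    """Crude box filter: each output pixel is the mean of the source pixels it covers."""
    old_width = len(row)
    out = []
    for i in range(new_width):
        start = i * old_width // new_width
        end = max(start + 1, (i + 1) * old_width // new_width)
        span = row[start:end]
        out.append(sum(span) // len(span))
    return out


def deinterlace_and_resize(frame, new_width):
    """Blend the two fields, then shrink every resulting row to new_width."""
    return [shrink_row(row, new_width) for row in blend_fields(frame)]


# Toy 720x480 frame: first-field lines have value 100, second-field lines 200.
frame = [[100 if y % 2 == 0 else 200] * 720 for y in range(480)]
small = deinterlace_and_resize(frame, 352)
print(len(small[0]), len(small))  # 352 240
print(small[0][0])                # 150
```

Because both fields contribute to every output line, motion from both halves of the frame survives into the 352x240 result.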
This will get you the best possible output quality. For a specific process using Premiere 4.2, here's what I do
to create my Video CDs:
1. Capture at 720x480 and bring the clip (or clips) into Premiere 4.2 and arrange them on the timeline in
the construction window.
2. Right-click each clip in the timeline, select "Field Options" from the menu that pops up, and then select "Always deinterlace" from the available options.
3. Once that's done, right-click each clip again, select "Filters" from the menu that pops up, and then
choose the filters you want to apply for general processing. (I usually apply the Crop filter to get rid of
a few noisy lines outside the frame that accidentally get captured by my capture device.) When you're
done, apply the Resize filter. Make sure it's listed last in the filter list.
4. Go to the Make menu, choose Output Options, and make sure that your output size is the size of your
final MPEG output. (For Video CD, I type in 352x240.) This setting, combined with the Resize filter,
ensures that your video will be resized properly before it gets to the encoder. Don't trust an encoder
to resize your input properly--most won't resize at all, or do it poorly.
5. At this point, I make a choice: If I am working with a short clip I will do a "Make Movie" to a completely
new .AVI file and then encode it with Ligos' LSX-MPEG encoder. If I have a particularly large project,
I will use Xing's MPEG encoder utilizing their plug-in for Premiere. (It shows up under the Make menu
as "XingMPEG Movie".)
There are probably some time-saving shortcuts you could apply to the above, like making a virtual clip
and applying all of the operations to that one clip, but I wanted to keep it simple for people who want to duplicate the process with other editing packages.
Does a hardware MPEG encoder produce better output than a software MPEG encoder?
It depends on the price, but the general answer is no. Consumer hardware encoders only encode the first
field of a video frame and completely ignore the second field, so you lose motion quality. And because they
have to encode in real time, there usually isn't enough processing time left over to do noise filtering, so the
output can be noisier than a software encoder if your input is noisy.
Of course, software encoding takes forever and a day, so there is still a valid reason to buy hardware encoders. If you have very clean source material, the output of a hardware encoder matches (and sometimes
exceeds, in special cases) the output of a software encoder.
Darim sells a product called the M-Filter, which greatly pre-processes video and assists MPEG compression
with any encoder. However, like the other professional-grade MPEG products they manufacture, it has a price that is beyond most consumers' budgets.
What's the best hardware MPEG encoder in a consumer price range?
General consensus points to the Broadway being the best, with all others trailing slightly in terms of output
quality. It's a bit pricy at $800, but it can deal with marginal source material much better than the others, and
can also output back to TV (the Dazzle DVC can also output to TV). I am unsure if it captures and/or takes
into consideration both video fields, however.
If I had to rank them, I'd rank the Adaptec VideOh (which is a repackaged, OEM'd Futuretel Video Sphynx)
2nd after the Broadway, the Videonics Python ranking third, and the Dazzle DVC after that. But they're all acceptable if you have clean source material. I own a Python myself, and use it to encode live feeds from my
video camera and images generated from PCs with results comparable to software encoders.
What's the best software MPEG encoder in a consumer price range?
General consensus points to Ligos' LSX-MPEG encoder. It's one of the fastest of the bunch, has a ton of options, and even has support for MPEG-2 if you want to experiment with DVD bitstream creation.
Xing's encoder is just as fast, but doesn't handle low bitrate or high-motion clips quite as well as Ligos' encoder does. On the other hand, Xing's encoder comes with a free Premiere plug-in, which is an enormous time saver. You can do the same with Darim's DVMPEG because it installs as a VFW CODEC, but I believe it
costs the most out of all three encoders listed.
For short projects, I render to an .AVI file and use LSX-MPEG. For long projects, I export via Xing's Premiere
plug-in. YMMV. I strongly suggest you download the trial version of all three encoders listed and test them out
for yourself.
Can I encode MPEGs for free? I can't spend money for an encoder.
There are several encoders for multiple platforms at www.mpeg.org, but the easiest one to use for Windows
that can also output Video CD bitstreams is AVI2MPG1 located at http://www.mnsi.net/~jschlic1/. (Be sure to
grab the GUI front-end.)
Why should I capture at 480 lines when the MPEG output is only 240 lines? Wouldn't a 352x240 capture be more efficient?
See Core Idea #2 listed at the beginning of this document. If your source material was captured at 240 lines
(352x240, for example), you're missing half of the images. Capturing at 480 lines and then deinterlacing ensures that you are encoding as much of the original video signal as possible.
I've deinterlaced and resized my 480-line video, but when I look at a single frame, it looks "blurry". My
240-line video looks fine. What gives?
A single deinterlaced frame will indeed look more blurry than the same image captured at 240 lines. But if you
play the two captures side by side, you'll notice that the 240-line capture doesn't look as "smooth" during
playback as the 480-line capture that was deinterlaced and resized down to 240 lines.
I have an "all-in-one" video card with embedded capture, but it can only capture at 240 lines. Will my
MPEG output suffer?
It won't be as good as a 480-line capture, but there's nothing preventing you from doing it. :-) 240-line captures aren't bad--they're just not as good as 480-line captures that have been deinterlaced and resized properly.
My MPEG output has horizontal "lines" all over the place whenever there is heavy motion in the content. What gives?
It sounds like you captured at 480 lines, but either forgot to resize cleanly or you're letting the encoder resize
for you. Review the process listed above in the question "What process should I follow to create the best
possible MPEGs?".
How can I avoid the Windows 2gig .AVI file size limitation when encoding MPEGs?
Two ways: You can either generate many MPEG files from different clips and later join the MPEGs together,
or you can generate the entire thing from your editing program.
Joining clips together is the cheap method; you can find several programs to do this at www.mpeg.org, but
one popular program that does this under Windows is Camel's MPEG Joiner. Note: If you are creating a Video CD, you might not have to join video clips together at all. Most VCD authoring programs allow you to
create a "simple video sequence" that plays the MPEGs one right after the other.
There are a couple of ways to do a long, unbroken sequence. The method I use is to put together my entire
project in Premiere, then use Xing's Premiere plug-in to export the entire timeline to a single MPEG file. You
can also use Darim's DVMPEG to output an entire timeline to a single MPEG.
How can I avoid the Windows 2gig .AVI file size limitation when outputting to tape?
If you have a "prosumer" package, such as the Miro DC30+, you probably already have a special version of
Premiere that can either work with files larger than 2gig, or can play multiple files from the timeline seamlessly after rendering transitions. In the Miro product, this appears as a plug-in called "Miro InstantVideo". For
those of us without the budget for such a product, there is an excellent shareware program that, in addition to
being a powerful real-time NLE program, can string multiple pre-rendered clips together on a timeline and
play them in sequence without dropping a single frame. This product is called DDClip, and is well worth the
registration money. I've used it to string together multiple Iomega BUZ-captured clips with the same resolution and audio parameters, and it played them one right after the other without any dropped frames. I was able
to output 10 2gig clips to tape (about 24 minutes of video) using DDClip without having to touch the VCR.
How can I avoid the Windows 2gig .AVI file size limitation when capturing?
AVI_IO was written by Markus Zingg expressly for this purpose: It is a better VidCap32 than VidCap32. You
can capture to multiple files--even on multiple drives--and it won't drop a single frame. I have used it myself
and can verify its effectiveness; I routinely use it to put together 30 minute and 60-minute VideoCDs.
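If you want to plan a multi-file capture ahead of time, the arithmetic is simple. The sketch below assumes the limit is exactly 2 GB (2 x 1024^3 bytes) per file and uses a purely hypothetical capture rate of 3 MB/s; substitute the data rate of your own capture device.

```python
import math

AVI_LIMIT_BYTES = 2 * 1024**3  # assumed size of the 2gig per-file limit


def segments_needed(minutes, bytes_per_second, limit=AVI_LIMIT_BYTES):
    """Number of files a capture of the given length must be split into."""
    total_bytes = minutes * 60 * bytes_per_second
    return math.ceil(total_bytes / limit)


def seconds_per_segment(bytes_per_second, limit=AVI_LIMIT_BYTES):
    """How long a single file can run before it hits the limit."""
    return limit / bytes_per_second


rate = 3 * 1024**2  # hypothetical capture rate: 3 MB/s
print(segments_needed(60, rate))         # 6
print(round(seconds_per_segment(rate)))  # 683
```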
Is it possible to create MPEGs with a low framerate, like 15- or 10-fps? My low-bitrate MPEG has
many artifacts.
You can simulate a low framerate by encoding blank B-frames; this leaves more bits for the encoding of I- and P-frames. Ligos' encoder can be configured to do this; check the help file for exact configuration options.
The Xing encoder can do this as well, but it does so automatically under low-bitrate conditions and its exact
behavior cannot be specifically controlled. (I've found the results to be perfectly fine--I just like to tweak options ;-)
Is it possible to specify key frames manually? My MPEG has many artifacts because of swiftly-changing scenes in the source material.
Unless you have professional hardware, no. Consumer encoding hardware and software usually don't allow
you to arbitrarily specify where I (key) frames go. (If you're willing to pay for professional hardware, then they
will do this automatically. Jason Livingston had this to contribute: "The professional MPEG encoders (Heuris,
Philips/Sun, the high end C-Cube chips) will automatically insert I frames when there is a significant change
in the scene (called auto-scene change detection), and will even choose whether a I, B, or P frame would be
more appropriate based on the current frame content. It wouldn't be unusual to see a professionally encoded
MPEG stream look like IBBPBIBBBBBBBPP...")
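The idea behind auto-scene change detection can be illustrated with a toy example. This is a sketch only: it assumes greyscale frames stored as lists of rows, uses the mean absolute pixel difference as the change measure, and picks an arbitrary threshold of 40. Real encoders use their own, more sophisticated criteria, which are not described here.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized greyscale frames."""
    total = sum(
        abs(x - y)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )
    return total / (len(a) * len(a[0]))


def scene_changes(frames, threshold=40.0):
    """Indices of frames that differ sharply from their predecessor.

    These are the places where an encoder might want to start with an I frame.
    """
    return [
        i for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


def flat(value, width=4, height=4):
    """Helper: a tiny frame filled with one grey level."""
    return [[value] * width for _ in range(height)]


# Two dark frames followed by a cut to two bright frames.
frames = [flat(20), flat(22), flat(200), flat(198)]
print(scene_changes(frames))  # [2]
```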
The best way to avoid the "blocky scene-change" effect you correctly described earlier is to either use a better/different encoder (Xing and Ligos are best, IMHO), or to encode at a higher bitrate. If you're already using
VideoCD bitrates, then try a different encoder.
Another thing to try is to apply a low-pass filter, median filter, or a very soft Gaussian blur (no more than a 1-pixel radius) to the entire video as the last filter in any filter sequences you have. (Apply this as a filter if you're
doing the Make Movie-Xing MPEG Export function, since it will *not* be applied if you specify it in the Make
Movie special/advanced options.) This removes random noise and softens the entire image, which aids compression. This may not eliminate the "blocky sudden scene-change" effect, but it may help reduce it to the
point that only trained eyes can see it.
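For readers who want to see what such a soft blur does to noise, here is a minimal sketch. It assumes a greyscale frame stored as a list of rows and uses a 3x3 binomial kernel as a stand-in for a 1-pixel Gaussian; the kernel and the edge handling (clamping) are illustrative choices, not the filter any particular editing package ships.

```python
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]  # weights sum to 16


def soft_blur(frame):
    """Apply the 3x3 kernel to every pixel, clamping coordinates at the edges."""
    height, width = len(frame), len(frame[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    acc += KERNEL[dy + 1][dx + 1] * frame[yy][xx]
            row.append(acc // 16)
        out.append(row)
    return out


# A single bright "noise" pixel on a flat black background.
frame = [[0] * 5 for _ in range(5)]
frame[2][2] = 160
blurred = soft_blur(frame)
print(blurred[2][2], blurred[2][1], blurred[1][1])  # 40 20 10
```

The isolated spike drops from 160 to 40 and spreads into its neighbours, which is exactly the kind of high-frequency detail the encoder no longer has to spend bits on.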
WHAT YOU SHOULD KNOW BEFORE YOU SHOOT YOUR VIDEO (Heuris, Big Squeeze)
DO opt for a component video format if available.
AVOID converting your video to or from a composite format at any time
during production or post-production....
BECAUSE you will suffer an irreversible quality loss and potentially introduce artifacts that will stick with you
all the way to the finished product.
DO use high quality, first generation video.
AVOID using second or third generation video.....
BECAUSE the higher the quality you start with, the higher the quality of the end result. High quality video is
often less “noisy”. Since MPEG cannot distinguish between moving video and “noise”, it will attempt to encode the noise, taking bits and
quality away from your moving video.
DO use nice big fonts.
AVOID MPEG encoding text over moving video.....
BECAUSE text is high frequency video data. The moving video in the background will cause your foreground
text to fade in and out. In addition, the text uses lots of bits that could be allocated to making your video look
better.
DO use animation with medium amounts of detail and lines which are several pixels thick.
AVOID using computer-rendered animation with extremely fine lines (less than 3 pixels) or extremely fine
detail...
BECAUSE extremely thin lines and fine details tend to “disappear” due to MPEG’s lower resolutions.
DO use fast moving video with tightly focused close-ups.
AVOID using fast moving video where background and foreground are both highly detailed and in-focus....
BECAUSE when background and foreground are both in focus, they vie with each other for bit allocation-both will require a lot of bits. This can lead to “blockiness” or “pixelization.”
DO use talking heads in video; preferably a tightly focused close-up on the face.
AVOID using talking heads that are too small.....
BECAUSE when characters on the screen talk, the viewer’s focus is drawn to the mouth. If the mouth is too
small, it will not be clear and will be distracting.
DO use computer or hand-drawn animation.
AVOID using computer or hand-drawn animation with very sharp diagonal or vertical lines....
BECAUSE this can lead to “aliasing” which makes smooth lines look like stairsteps.
DO use scene changes and relatively quick cuts.
AVOID using extremely fast scene changes that comprise less than 2 frames or blinking or flashing
screens...
BECAUSE very rapidly blinking screens and rapid scene changes are difficult for MPEG encoders to handle.
Encoders need to work over multiple frames in order to achieve optimal compression.
DO use video with contrast.
AVOID using high contrasts in luminance, i.e. flames, explosions, fireworks, etc.
BECAUSE high contrasts lead to blockiness.
DO use video with lots of colors.
AVOID using monochrome scenes.....
BECAUSE while the resolution levels MPEG can handle are lower than computers, the number of colors
MPEG is capable of is very high. So, if you can “say it” with
color rather than “cross-hatching,” by all means do so.
These guidelines are not meant to suggest that there are “hard and fast” rules for MPEG encoding.
HOW TO JUDGE MPEG QUALITY (Heuris, Big Squeeze)
Image quality is subjective at best. What looks good to one person may not look good to another. This is
especially frustrating when you are trying to decide which MPEG encoding house to go with. However, you
can educate yourself as to what to look for in MPEG encoded material.
First of all, compare apples to apples. MPEG has a difficult time handling lots of fast motion with detailed
backgrounds, areas of highly contrasting light intensity, (explosions, fireworks, lightning, etc.), and (believe it
or not) simple 2-D animation sequences. Try to compare demos which display some of these difficult scenes.
Just about anybody can make flowers blowing gently on the breeze or a duck gliding over the water look
good.
Next - get close. All MPEG encoding looks the same from 20 feet away. Optimal viewing distance for MPEG
on a standard size computer monitor is 5 feet away, at about eye level. Finally, turn down the sound. The
sound can have a strong effect on your perception of the video quality. If you’re really trying to level the playing field, turn off the sound.
MPEG encoding has a host of potential quality problems all its own. Special things to look out for include:
Blockiness:
When your picture breaks up into little squares. Especially noticeable in fast moving highly detailed sequences, and sequences with high contrasts in light intensity like explosions and fireworks.
Aliasing:
When lines that are supposed to be straight (especially diagonal ones) look like little “stairsteps”. Not necessarily indicative of bad encoding, but aliasing may be reduced by good encoding or extra image processing.
Fuzz and snow:
Images that look as though your monitor is dirty or you lost a contact lens. Little gray or white flecks that intrude randomly throughout the picture.
Worms:
Crawling dots and squirming lines. Probably the result of low quality video or bad digitizing.
Halos:
Small area of distortion surrounding the outline of moving objects.
Balancing Quality and Performance (Ligos)
This section provides some recommendations on how to encode MPEG files achieving optimum balance
between picture quality and performance requirements.
Maximum motion vectors
P frame maximum horizontal motion vector    16, 24, 32
P frame maximum vertical motion vector      16, 24
B frame maximum horizontal motion vector    8, 16, 24
B frame maximum vertical motion vector      8, 16
Frames
In most cases the default sequence of 3 P frames between I frames and 2 B frames between P frames gives
fast encoding.
If you want faster encoding then reduce the amount of B frames, because encoding B frames takes more time than encoding P frames. Set 1 B frame between P frames.
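The frame sequences described above can be written out as a pattern string. The sketch below assumes display order and assumes that the B frames also follow the I frame and the last P frame of the group; the function name `gop_pattern` is hypothetical and is not a Ligos setting.

```python
def gop_pattern(p_frames=3, b_frames=2):
    """Display-order frame types for one group of pictures.

    p_frames: number of P frames between two I frames
    b_frames: number of B frames between consecutive I/P frames
    """
    group = "B" * b_frames
    return "I" + group + ("P" + group) * p_frames


print(gop_pattern())      # IBBPBBPBBPBB  (default: 3 P frames, 2 B frames)
print(gop_pattern(3, 1))  # IBPBPBPB      (faster: 1 B frame between P frames)
```

Dropping to one B frame shortens the group from 12 to 8 frames and cuts the number of B frames per group from 8 to 4.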
DVD Basics and Authoring
This chapter gives you an introduction to DVD.
DVD In Short
DVD, which stands for Digital Video Disc or Digital Versatile Disc, is the next generation of optical disc
storage technology. It's essentially a bigger, faster CD that can hold video as well as audio and computer
data. DVD aims to encompass home entertainment, computers, and business information with a single digital
format, eventually replacing audio CD, videotape, laserdisc, CD-ROM, and perhaps even video game cartridges. DVD has widespread support from all major electronics companies, all major computer hardware
companies, and about half of the major movie and music studios, which is unprecedented and says much
for its chances of success.
It's important to understand the difference between DVD-Video and DVD-ROM. DVD-Video (often simply
called DVD) holds video programs and is played in a DVD player hooked up to a TV. DVD-ROM holds computer data and is read by a DVD-ROM drive hooked up to a computer. The difference is similar to that between Audio CD and CD-ROM. DVD-ROM also includes future variations that are recordable one time
(DVD-R) or many times (DVD-RAM). Most people expect DVD-ROM to be initially much more successful
than DVD-Video.
Most new computers with DVD-ROM drives can also play DVD-Videos. DVD discs can hold up to 5.2 GB of
data; the first discs could hold 2.6 GB, so the technology is still not finished...
These are the advantages of DVD compared to tapes and/or normal CDs:
Superior picture quality - digital video technology offers more than twice the resolution of a VHS video
picture and eliminates static and snow. DVD Video offers pictures that are twice as sharp and clear as VHS.
DVD video has up to 500 lines of horizontal resolution, compared to only 240 lines of horizontal resolution for
VHS.
Superior sound quality - DVDs provide CD quality digital audio. Movie DVDs released in the United States
and Canada use Dolby Digital™ (AC-3) multi-channel Surround Sound to bring true theater audio experience
to the home. Dolby Digital Surround Sound provides five completely separate channels plus a bass channel
(i.e. 5.1): Left, Center, Right, Left-Rear and Right-Rear, plus a subwoofer channel for special bass effects.
As a true digital system, Dolby Digital (AC-3) audio encoding offers CD quality sound, with outstanding dynamic range, low distortion, wide frequency response and wow & flutter beneath the threshold of measurement.
High viewer enjoyment - most DVDs support additional camera angles, wide screen formats, and "behind
the scenes commentary" that is not available on VHS movies.
Multiple Aspect Ratios - Most DVD titles feature both the traditional full-screen television format and also
the widescreen or letterbox format, which presents movies in the same aspect ratio as shown in theaters.
Informational Features - Many DVD titles include additional information on the DVD, such as biographies of
the performers in the movie or music video, notes on the production of the movie, and behind-the-scenes
commentary from the director or actors.
Scene Access - Because DVDs are not tape-based, you can instantly access any specific scene in the
movie. You no longer have to rewind or fast-forward through an entire movie to find your favorite scene.
Camera Angles - Some DVD titles were filmed with multiple camera angles - easy-to-use menus allow you
to choose these alternative angles.
More compact - DVDs are easier to store than VHS tapes.
High durability - a DVD can be played repeatedly without wear and tear and without any degradation to
image quality.
Backward compatibility with audio CDs - you can play music CDs on your DVD player.
THE DVD FAQ
http://perso.libertysurf.fr/dvdutils/start_dvdfaq.htm
What is DVD-ROM ?
What are the main features of the DVD ?
How does DVD-ROM differ from DVD-Video ?
Why is an MPEG-2 card required to use a DVD-ROM drive ?
Can DVD-ROM drives play/read standard CD-ROM discs ?
What about compatibility with CD-R discs ?
Can DVD-Video discs be played on a DVD-ROM drive ?
Why should I purchase a DVD-ROM drive instead of a CD-ROM drive ?
What is the capacity of DVD-ROM discs ?
What kind of DVD-ROM titles are available now ?
Can I get the Dolby® AC-3 digital surround sound from a DVD-ROM drive ?
How much can I expect to pay for a complete DVD-ROM kit ?
What sort of system do I need to run a DVD-ROM drive ?
Will DVD-ROM drives be compatible with the upcoming DVD-R write once recordable discs ?
What about compatibility with rewritable DVD ?
What advantage does the proposed DVD+RW format have over DVD-RAM ?
Will DVD+RW discs require a caddy-like cartridge ?
Are larger capacities possible for DVD+RW ?
Are larger capacities possible for DVD+RW and is dual layer recording possible ?
What is DVD-ROM ?
DVD-ROM stands for Digital Versatile Disc Read Only Memory. Like CD-ROM discs, DVD-ROM discs are
intended for computer use, and are molded with the information pressed right into the disc. However, unlike
CD-ROM with its 650 MB capacity, DVD-ROM discs can hold up to 4.7 GB of information. Even higher capacities are possible with additional information layers and double sided DVD-ROM discs.
What are the main features of the DVD ?
Over 2 hours of very high-quality (better than laser disc) video on a single disc.
Over 8 hours on a double-sided dual layer disc.
Support for wide screen movies.
Some DVD movies allow you to select wide screen or standard screen.
Up to 8 tracks of digital audio for multiple language support.
Up to 32 subtitle/karaoke tracks.
Up to 9 different viewing angles (DVD disc must be encoded with the different angles).
Automatic "seamless" branching of video for multiple story lines or different ratings of one movie.
Menus and interactive features.
Title, Chapter, and track search.
Durability.
Compact Size.
Language choices.
Parental lock.
Random accessibility.
Dolby Digital AC-3 audio.
...
Making a DVD-Video disc
You need three things to create a DVD:
1. digital content creation system
2. professional MPEG encoder
3. DVD Authoring system
The first thing that you need to create your DVD is a digital content creation system. These are generally nonlinear video editors from companies like Avid and Media 100. These systems are used to create the clips or
movies that will be put on the DVD. The second thing that you need is a professional MPEG encoder. This is
a tool that will create the MPEG files that will be put on the DVD. Generally you want to put Variable Bit Rate
MPEG-2 on your DVD. Many companies offer MPEG-2 realtime encoders : Zapex, Minerva, Optibase, Optivision, etc. The final thing that you need is a DVD authoring system from a company like Daikin or Spruce. A
DVD authoring system is used to ensure that the DVD that is created complies with the DVD-VIDEO specifications. A DVD authoring system does two basic things. First, it ensures that the disc image that you create conforms to the DVD-VIDEO specification. The DVD authoring system understands the DVD-VIDEO specification and ensures that what you produce will be playable on any DVD-VIDEO player. It does this by creating the
correct files and folders. Second, a DVD authoring system allows you to visually hook up the elements that
make up your DVD-VIDEO. Good DVD authoring systems work much like non-linear video editing systems
from Avid or Media 100. The advantage of this is that they are easy to use and are familiar to existing users
of non-linear video editors.
Requirements for MPEG files for DVD-Video:

Standard  Compression format  Picture resolution                          Pictures in Group of Pictures (GOP)  Aspect ratio  Bit rate (maximum)
NTSC      MPEG-1 and MPEG-2   720 x 480, 704 x 480, 352 x 480, 352 x 240  Fewer than 36 fields                 4:3 or 16:9   9.8 Mbps*
PAL       MPEG-1 and MPEG-2   720 x 576, 704 x 576, 352 x 576, 352 x 288  Fewer than 30 fields                 4:3 or 16:9   9.8 Mbps*

Table 1: Requirements
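The requirements in Table 1 can be turned into a simple pre-flight check before handing files to an authoring system. This is a sketch that encodes only the values listed in the table; the function name `check_dvd_video` and its parameters are made up for illustration, and a real authoring system checks far more than this.

```python
ALLOWED_RESOLUTIONS = {
    "NTSC": {(720, 480), (704, 480), (352, 480), (352, 240)},
    "PAL": {(720, 576), (704, 576), (352, 576), (352, 288)},
}
GOP_FIELD_LIMIT = {"NTSC": 36, "PAL": 30}  # a GOP must have fewer fields than this
ALLOWED_ASPECTS = {"4:3", "16:9"}
MAX_MBPS = 9.8


def check_dvd_video(standard, width, height, gop_fields, aspect, mbps):
    """Return a list of problems; an empty list means the stream fits Table 1."""
    problems = []
    if (width, height) not in ALLOWED_RESOLUTIONS[standard]:
        problems.append(f"{width}x{height} is not a listed {standard} resolution")
    if gop_fields >= GOP_FIELD_LIMIT[standard]:
        problems.append(
            f"GOP of {gop_fields} fields is not fewer than {GOP_FIELD_LIMIT[standard]}"
        )
    if aspect not in ALLOWED_ASPECTS:
        problems.append(f"aspect ratio {aspect} is not 4:3 or 16:9")
    if mbps > MAX_MBPS:
        problems.append(f"{mbps} Mbps exceeds the {MAX_MBPS} Mbps maximum")
    return problems


print(check_dvd_video("PAL", 720, 576, 24, "16:9", 6.0))        # []
print(len(check_dvd_video("NTSC", 640, 480, 36, "4:3", 10.5)))  # 3
```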
THE CHALLENGE OF DVD AUTHORING
Panos Nasiopoulos, Rabab K. Ward and Masato Otsuka
Abstract: The evolution of DVD promises telecomputers will find their way into our living rooms, linked to a
large flat screen which will be able to display HDTV, Standard Definition TV, video games, interactive movies
from Digital Versatile Discs (DVD), Internet, video telephony and computer graphics. For the first time, Hollywood Studios and consumer electronic companies have formed an alliance to support the DVD technology
which is critical for both sides. In this paper we address the complex process of DVD authoring.
INTRODUCTION
One of the most significant technological achievements in the consumer and entertainment industries is the
development of the Digital Versatile Disc (DVD). DVD is the first union of emerging technologies, bringing together computers, consumer electronics and entertainment. As a result, we are witnessing the generation of an
entirely new infrastructure that is reshaping the world of entertainment.
DVD is a lot more than just a storage medium. It is a new multi-purpose technology that will affect both the
entertainment and computer worlds. For consumers, this is the first digital medium that offers studio video
and audio quality combined with unprecedented interactivity at a very low cost. For the PC multimedia side,
DVD is the first video distribution medium designed for very high data rates. Its interactivity and distribution
format have the potential to revolutionize the entertainment software industry. This paper describes the complex process of DVD authoring.
DVD: AN OVERVIEW
Storage
As a storage medium, DVD can hold from 4.7 GB of digital data on one-side single-layer format to 17 GB on a
double-side dual-layer format [1,2]. This increase in capacity (up to 25 times that of CDs) is achieved by introducing a shorter wavelength laser beam, dual focusing mechanism that allows the use of two layers per
side, smaller pit size and tighter spirals. Furthermore, DVD discs offer ten times the speed of the CD rate,
opening the way to numerous new real-time applications. While storage capacity is very important, it is DVD's
other capabilities that make this technology so attractive.
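To get a feel for what 4.7 GB means in playing time, capacity can be divided by the combined data rate of the program. The sketch below assumes 4.7 GB means 4.7 billion bytes and uses a hypothetical total program rate of 4.7 Mbps (video, audio and sub-pictures together); the actual rate depends on the trade-offs made during authoring.

```python
def playing_time_minutes(capacity_bytes, program_rate_mbps):
    """Minutes of program that fit on a disc at a constant average rate."""
    seconds = capacity_bytes * 8 / (program_rate_mbps * 1_000_000)
    return seconds / 60


SINGLE_LAYER_BYTES = 4.7e9  # one-side single-layer disc, decimal gigabytes assumed

print(round(playing_time_minutes(SINGLE_LAYER_BYTES, 4.7), 1))  # 133.3
```

Halving the average rate doubles the playing time, which is the basic trade-off between picture quality and program length.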
Hollywood
As an entertainment product, DVD satisfies the goals established by the Hollywood Digital Video Disc Advisory Committee, delivering extraordinary picture and sound quality. DVD takes advantage of a two-pass variable bit rate MPEG-2 video encoding process to offer a superb picture quality comparable to D-1, the studio
production standard. To make it more exciting, this is the first medium to introduce a number of viewing
formats such as the 4:3 TV screen format, the 16:9 HDTV screen format and the 20:9 letterbox format [1,2].
Combine this picture experience with Dolby's AC-3 5.1 channel surround sound (or MPEG-2 7.1 for Europe)
and you have reproduced video and audio quality that rivals that of a theater.
In addition, this technology allows the use of 8 different languages, 32 subtitles, different camera angles and
video-clip paths including interviews with producers and actors. The viewer can choose the camera angles
and language, switching seamlessly from one to another, scan forwards and backwards and play slow motion. Parents are given the option to lock out versions of the movie which range from the director's cut, to R-rated, to PG-13.
Hybrid DVD-Internet
But it is the PC world where DVD will have its biggest impact. The first read-only DVD drives are expected to
offer over 7 times the storage capacity of the current CD, and will also be able to play DVD movie titles and
existing music CDs. DVD brings the added capability of supporting the implementation of interactive adjuncts
to traditional PC content. Embedded navigation such as web browsing can be added to video, enhancing the
user's experience with hybrid content.
A DVD-based PC application that combines DVD's interactive performance and rich video and audio capabilities with the Internet would offer a wealth of opportunities. For example, it is possible to produce a DVD-based department store catalogue that offers an interactive showcase of all the departments' merchandise,
complete with audio and video. Playing the disc automatically connects the user with the store via the Internet, allowing the consumer to get current prices, order merchandise, communicate with a personal shopper
or pay bills. Such a service is not viable today because of the Internet's low bandwidth. A similar hybrid application allows DVD-based courses and encyclopedias (which devour huge amounts of space with text, pictures, video, sound and animations) to remain up-to-date by cross-referencing a constantly updated web site.
DVD AUTHORING
Authoring is generally defined as the process of preparing content, encoding video and audio, and creating
the final DVD image. In the case of DVD, authoring is a complex process since it involves the laying out of
multiple audio tracks and a video track, generation of sub-titles, menu pages, parental lock-out features, interactive functions such as program search, time search, seamless play, and pause, and finally editing of video and audio. Since authoring is always performed along with encoding and disc formatting, it is, in many
cases, referred to as the entire DVD pre-mastering process.
Preparation of Materials
The first step in authoring is the collection of materials. These materials include video, audio, still images, and
sub-pictures. DVD's video source format is the CCIR-601 studio format compressed to MPEG-2 format. The
frame rate is 29.97 f/s for NTSC sources (North America) and 25 f/s for PAL/SECAM sources (Europe). The
maximum allowable bit rate is 9.8 Mbps. Audio includes the surround track and up to 8 different language
tracks for each title. All language tracks must be compared for level, mix, and equalization so that seamless
switching between languages can be achieved. Still images are used to provide break points in the title, so
that search functions and other interactive functions can be implemented. The preparation of still images includes identification of the breakpoints in the video and definition of the time duration for each image. Sub-pictures are bitmaps that are overlaid on top of the video. They include menus, sub-titles, graphics, and
simple animation. Once created, their start and stop time must be defined in order to be synchronized with
the associated video and audio elements. Up to a maximum of 32 sub-picture bit-streams are allowed in a
title.
Techniques and Parameters
A good understanding of how the various elements will be used in constructing the title is the key to choosing parameters intelligently. This involves tradeoffs between picture quality, length of program, number and quality of audio channels, number of subtitles, and level of interactivity. The following is a list of some of the basic
parameters that must be determined for a DVD title [3]:
1. the number of audio channels
2. the number of language versions
3. the number of sub-picture elements
4. the number of breakpoints in the video
5. the number and the levels of rated versions of the title
6. the number of still images used at each breakpoint
7. the type of parental lock-outs
8. the type of director's cuts
9. the audio encoding techniques
10. the format used for still images
A single-layer single-sided DVD disc can store 2 hours and 13 minutes of video compressed at a nominal
average bit rate of 3.5 Mbps combined with 3 languages encoded using AC-3 5.1 channels and 4 additional
languages encoded as sub-titles [4]. The maximum program rate (i.e., video + audio + sub-pictures) is specified to be 10.08 Mbps. Given the disc capacity, the overall quality depends on determining the different tradeoffs between several parameters. For example, Table 1 shows the average storage requirements for a DVD
title with the following parameters:
1. Audio tracks encoded using Dolby AC-3 5.1
2. 4 unique languages supported
3. 4 sub-picture streams supported
4. "G" rated version has a total run length of 100 minutes
5. "PG" rated version has an additional 4 minute run length
6. 2 previews; each has a run length of 3 minutes
7. 4 trailers; each has a run length of 2.5 minutes
Note that 4% of the total disc capacity is always reserved for backup of the program control data and for additional information that is added after editing. The total run length is 120 minutes, resulting in the average bit
rate of 3.43 Mbps.
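The arithmetic behind this example can be retraced with a short script. This is only a sketch: the per-stream rates (0.384 Mbps per language track, 0.01 Mbps per sub-picture stream) are the figures used in Table 1, the function and constant names are made up for illustration, and 1 Gbyte is taken as 1000 MB, as the table does.

```python
# Disc budget for the example title; all figures come from the text and Table 1.
DISC_CAPACITY_MB = 4700      # single-layer, single-sided disc (4.7 Gbytes)
RESERVED_FRACTION = 0.04     # 4% reserved for control data backup and late additions
AUDIO_MBPS = 0.384           # per AC-3 language track
SUBPICTURE_MBPS = 0.01       # per sub-picture stream


def run_length_minutes():
    """Total run length: G version + PG extra + 2 previews + 4 trailers."""
    return 100 + 4 + 2 * 3 + 4 * 2.5


def stream_storage_mb(streams, minutes, mbps):
    """Storage in MB for `streams` parallel streams running at `mbps` each."""
    return streams * minutes * 60 * mbps / 8


def video_budget(languages=4, subpictures=4):
    """Return (minutes, MB left for video, resulting average video bit rate)."""
    minutes = run_length_minutes()
    audio = stream_storage_mb(languages, minutes, AUDIO_MBPS)
    subs = stream_storage_mb(subpictures, minutes, SUBPICTURE_MBPS)
    reserved = DISC_CAPACITY_MB * RESERVED_FRACTION
    video_mb = DISC_CAPACITY_MB - (audio + subs + reserved)
    video_mbps = video_mb * 8 / (minutes * 60)
    return minutes, video_mb, video_mbps


minutes, video_mb, video_mbps = video_budget()
print(f"{minutes:.0f} min, {video_mb:.0f} MB for video, {video_mbps:.2f} Mbps")
```

Changing the number of language tracks or sub-picture streams in `video_budget` shows directly how much video bit rate each additional stream costs.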
Video Encoding
DVD takes advantage of the MPEG-2 compression technology to achieve picture quality comparable to that
of D-1, the CCIR-601 TV studio production standard. MPEG-2 is a flexible and scalable compression scheme which can produce bit rates that range from 1 to 40 Mbps. As implemented for DVD, MPEG-2 encoding is
a two-pass process. During the first pass, the encoder scans the video source, detects scene changes and
determines the optimal bit rates for each frame. During the second pass, higher bit rates are assigned to
complex frames and sequences with more activity and lower bit rates to "simple" frames. The two-pass
process guarantees the best possible picture quality for the given video clip and disc storage capacity.
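The idea of the second pass can be illustrated with a very small sketch. It assumes that the first pass has produced one complexity score per frame (the scores and the function name `allocate_bits` are hypothetical) and simply shares the available bits in proportion to those scores. A real encoder must additionally respect the maximum bit rate and the decoder buffer, which this sketch ignores.

```python
def allocate_bits(complexities, total_bits):
    """Share `total_bits` among frames in proportion to their complexity.

    `complexities` holds one non-negative score per frame, as measured
    in a (hypothetical) first analysis pass over the video source.
    """
    total_complexity = sum(complexities)
    if total_complexity == 0:
        # No frame is more complex than another: split evenly.
        return [total_bits / len(complexities)] * len(complexities)
    return [total_bits * c / total_complexity for c in complexities]


# Two simple frames, one very active frame, one moderately active frame.
budget = allocate_bits([1, 1, 4, 2], total_bits=8000)
print(budget)
```

The complex third frame receives four times as many bits as the simple frames, while the total stays within the given budget.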
For video material originating from film, inverse telecine may be used to improve compression performance. Film runs at 24 f/s, which must be converted to the 30 f/s (nominally 29.97 f/s) required by the NTSC standard. This conversion process is known as telecine and works by repeating picture material at regular intervals (in the usual 3:2 pulldown, selected fields are repeated). Inverse telecine removes the duplicates before encoding, so that no bits are spent on redundant pictures and more of the bit budget is available for the remaining video.
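A toy model makes the conversion easier to follow. The sketch below assumes the common 3:2 pulldown pattern, in which film frames are alternately held for two and three video fields, so that 4 film frames become 10 fields (5 video frames). Frames are represented by labels, and the inverse step simply drops consecutive repeats; a real inverse telecine has to detect repeated fields by comparing picture content. All names are illustrative.

```python
def telecine(film_frames):
    """3:2 pulldown: spread every 4 film frames over 10 video fields."""
    fields = []
    for i, frame in enumerate(film_frames):
        repeat = 2 if i % 2 == 0 else 3   # alternate 2 fields, 3 fields
        fields.extend([frame] * repeat)
    # Pair consecutive fields into interlaced video frames.
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


def inverse_telecine(video_frames):
    """Recover the film frames by dropping consecutive repeated fields."""
    film = []
    for frame in video_frames:
        for field in frame:
            if not film or film[-1] != field:
                film.append(field)
    return film


video = telecine(["A", "B", "C", "D"])
print(video)                    # 5 video frames built from 4 film frames
print(inverse_telecine(video))  # the 4 original film frames
```

Only four of the five video frames' worth of picture data is unique, which is the redundancy the encoder avoids spending bits on.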
Audio Encoding
Movies released in North America and Japan can carry Dolby's AC-3 stereo or 5.1 audio, which offers 5 surround channels plus a low frequency (sub-woofer) channel. For movies released in Europe, AC-3 is replaced
by MPEG-2 stereo or 7.1 surround sound. In addition, as an option to AC-3 and MPEG-2 audio, DVD enables
producers to choose uncompressed 16-bit linear PCM stereo sound with Dolby Pro Logic encoding. Table 2
shows audio encoding options as well as the specified sampling frequency rates, bit and transfer rates and
number of channels supported by each option.
Sub-Picture Encoding
Sub-pictures are run-length compressed bit-maps using 2 bits/pixel and 4 colors out of a 16 color palette. The
sub-picture size is 62KB per GOP/cell with 32 KB allocated for control data. Applications may vary from
simple text (sub-titles) to menus to still images used for presentation effects. Pixels are categorized as foreground, background, emphasis-1 and emphasis-2. The still picture format must be a standard image format
such as TIFF, GIF, or BMP. MPEG is used to encode still images which are then incorporated into the video
stream.
Putting it Together
After preparing the different "segments" of a DVD title, a multiplexing process should link everything together
and define the program flow of the DVD title. This final step should specify how each of the media elements
will be presented to the user and how the user can interact with the program. Program flow specifications are
translated to navigation commands that are, in turn, incorporated into program cells and program chains. A
cell consists of a navigation command and all the video and audio data associated with a GOP. The navigation command (button) defines the playback behavior of the corresponding cell and it consists of one or at
most a combination of three of the following instructions [4]:
GoTo       →  branch between commands
Link       →  transfer within the same domain
Jump       →  transfer between domains
Compare    →  recognition of parameter value
SetSystem  →  player system setting
Set        →  calculate GPRM values
A sequence of cells and cell commands (navigation commands) form a program (PG). A program usually
corresponds to one scene. Programs and video objects (nominally a GOP) form a program chain (PGC). A
program chain is separated into the control information (PGCI) and the video object (VOB). PGCI acts as an
address table pointing to cells, thus defining the playback order of Programs.
The Part of Title (PTT) helps to construct multiple versions of the same title. A DVD title can have one or
more program chains. Interactive functions such as PTT searches, director's cuts, and parental lock-outs
can be achieved by creating the title as a multi-PGC title, with different director's cuts and different rated versions on different program chains.
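The relationship between cells, programs and program chains can be pictured with a small data model. This is a conceptual sketch only: the class and field names are invented for illustration and do not reflect the on-disc layout of PGCI or VOB data. It shows how two program chains can act as different "address tables" over the same set of cells, for example a "G" and a "PG" version of one title.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Cell:
    cell_id: int
    description: str


@dataclass
class ProgramChain:
    name: str
    programs: List[List[int]]  # each program is an ordered list of cell ids

    def playback_order(self) -> List[int]:
        """Flatten the programs into the order in which cells are played."""
        return [cell_id for program in self.programs for cell_id in program]


def play(chain: ProgramChain, cells: Dict[int, Cell]) -> List[str]:
    """Look up each referenced cell, as a player would follow the table."""
    return [cells[cell_id].description for cell_id in chain.playback_order()]


CELLS = {
    1: Cell(1, "opening"),
    2: Cell(2, "scene shared by both versions"),
    3: Cell(3, "scene only in the PG version"),
    4: Cell(4, "ending"),
}
G_VERSION = ProgramChain("G", [[1, 2], [4]])
PG_VERSION = ProgramChain("PG", [[1, 2], [3], [4]])

print(play(G_VERSION, CELLS))
print(play(PG_VERSION, CELLS))
```

Both versions share the stored cells; only the chain that references them differs, so the extra material for the longer version is stored once.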
Simulation and Verification
After all the media elements and control information are multiplexed into one stream, simulation testing is to
be performed. The stream must guarantee that audio, video, and sub-pictures are synchronized; otherwise,
the content must be re-edited or re-encoded. Besides synchronization, interactive functions may also be simulated and verified.
References
[1] DVD Format, TOSHIBA, DVD Forum April 1996.
[2] DVD Presentation Data Specifications, VICTOR Company of Japan Ltd., DVD Forum, April 1996.
[3] C. Fogg, DVD Technical Notes, July 1996.
[4] Interactive Functions, HITACHI Ltd., DVD Forum, April 1996.
Table 1. Storage Requirements for Each Media Element in Average-Bit-Rate Calculation Example
Media Element          Total Length  Average Bit Rate             Total Storage Requirements
---------------------------------------------------------------------------------------------
4 Language Tracks      120 minutes   0.384 Mbps per language      4*120*60*0.384Mbps/8 = 1382 MB
4 Sub-picture streams  120 minutes   0.01 Mbps per language       4*120*60*0.01Mbps/8 = 36 MB
Reserved                                                          4% of 4.7 Gbytes = 188 MB
SUBTOTAL                                                          1606 MB
Video                  120 minutes   3094/(120*60)*8 = 3.43 Mbps  3094 MB
Table 2. Audio Data Specifications
                    Linear PCM       Dolby AC-3     MPEG Audio
----------------------------------------------------------------
Sampling Frequency  48K, 96K         48K            48K
Number of Bits      16/20/24 bits    compressed     compressed
Transfer Rate       max. 6.144 Mbps  max. 448 kbps  max. 640 kbps
Number of Channels  max. 8           max. 5.1       max. 7.1
Creating VOB Files (StreamWeaver 5.4)
Authoring DVD-Video titles [using CDMotion] is a multi-step process.
In the first step the motion video and audio files are captured to digital format as a function of MPEG encoding. This is done using an encoder suitable for the task, that is an encoder that meets at least the minimum
requirements for DVD-Video content. This first step may also include the capture or rendering of raster image
bit map files that are to be used in the DVD-Video title.
In the second step of DVD-Video authoring, the stream content files are created using the Track StreamWeaver tool, the subject of this help file section.
When motion video and audio files are captured by the MPEG encoder, they are stored on the development
station as "elementary" stream files. In this format the files are not suitable for use in DVD-Video. They must
be combined into the multiplex file format which in DVD-Video is referred to as the VOB file. VOB files are
created using StreamWeaver.
StreamWeaver accepts as input a number of different types of files. These are then combined by StreamWeaver into the DVD-Video VOB file format. StreamWeaver makes certain assumptions regarding the content of a file based on the file's name extension. It is very important that the file name extension correctly
identify the type of content within the file. The extension types supported by StreamWeaver and the assumed
file content for each type are:
• *.M2V   MPEG2 Video
• *.MPV   MPEG1 Video
• *.AC3   Dolby AC3 Audio
• *.MPA   MPEG1 or MPEG2 Audio
• *.WAV   PCM Audio
• *.BMP   Windows BMP Raster Image
• *.SUP   DVD-Video Sub Picture
• *.PLT   DVD-Video Sub Picture Palette
Failing to comply with these file naming conventions will result in errors occurring during the multiplexing
process.
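Because the extension alone decides how a file is interpreted, it can be worth checking the file names before starting the multiplexer. The following sketch is a stand-alone helper, not part of StreamWeaver: the function name is hypothetical, and treating extensions as case-insensitive is an assumption.

```python
import os

# Extension -> content that StreamWeaver assumes, as listed above.
STREAM_TYPES = {
    ".m2v": "MPEG2 Video",
    ".mpv": "MPEG1 Video",
    ".ac3": "Dolby AC3 Audio",
    ".mpa": "MPEG1 or MPEG2 Audio",
    ".wav": "PCM Audio",
    ".bmp": "Windows BMP Raster Image",
    ".sup": "DVD-Video Sub Picture",
    ".plt": "DVD-Video Sub Picture Palette",
}


def assumed_content(filename):
    """Return the content type implied by the extension of `filename`.

    Raises ValueError for extensions that are not in the supported list.
    """
    ext = os.path.splitext(filename)[1].lower()
    try:
        return STREAM_TYPES[ext]
    except KeyError:
        raise ValueError(
            f"unsupported extension {ext!r} for {filename!r}"
        ) from None


for name in ["feature.M2V", "english.ac3", "menu.bmp"]:
    print(name, "->", assumed_content(name))
```

Note that such a check only confirms the name, not the content: an MPEG-1 stream saved as `*.M2V` would still pass here and fail later.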
Further Documentation
This chapter lists the documentation used in this guide and recommended for further reading, organized by
topic: MPEG reference, MPEG in practice, MPEG clips, video technology, DVD.
MPEG Reference Documentation
1. C-Cube: Compression Technology : An MPEG Overview
http://www.c-cube.com/technology/mpeg.html
2. MPEG Group FAQ -- http://www.crs4.it/~luigi/MPEG/ ; MPEG FAQs and standards
3. MPEG.org – http://www.mpeg.org/ ; MPEG Pointers and Resources
4. Berkeley Multimedia Research Center
MPEG-1 FAQ - http://bmrc.berkeley.edu/frame/research/mpeg/faq/mpeg1.html
MPEG-2 FAQ - http://bmrc.berkeley.edu/frame/research/mpeg/faq/mpeg2.html
5. Haskell, B.; Puri, A.; Netravali, A.; Digital Video: An Introduction to MPEG-2; Chapman & Hall, 1997.
6. Orzessek, M.; Sommer, P.; ATM & MPEG-2: Integrating Digital Video Into Broadband Networks; Hewlett-Packard Professional Books, 1998.
7. Symes, P.; Video Compression; McGraw-Hill, 1998.
8. Fraunhofer Institut: MPEG 1 Layer 3 (part of the Fraunhofer MP3 encoder)
MPEG Encoders, Players, Tips and Tricks
1. Ligos: Guide to MPEG Encoding
http://www.ligos.com/support/guide2MPEG.pdf
2. Trixter's Desktop MPEG Authoring FAQ –
http://www.oldskool.org/mpeg/mpegfaq.html
3. Markus Zingg’s AVI_IO – http://www.nct.ch/multimedia/avi_io/
excellent shareware application that works around the AVI 2 GB file size problem
4. Camel’s MPEGJoin – http://extra.newsguy.com/~theprof/Readme.html
useful utility for joining MPEG streams together
MPEG Clips
1. Darim MPEG Clips
ftp://ftp.darvision.com/pub/mpegs/
2. Ligos MPEG Clips
http://www.ligos.com/products/sample_clips.shtm
Video Technology in General
1. John McGowan’s AVI Overview – http://www.rahul.net/jfm/avi.html ;
Probably the best resource anywhere on details regarding the AVI format
2. Interactive Technology Primer – http://tlc.nlm.nih.gov/resources/publications/primer/primer.html;
the most complete guide to the “big picture” of multimedia
3. Color FAQ
http://www.inforamp.net/~poynton/notes/colour_and_gamma/ColorFAQ.html;
explains all color spaces and gives transformation matrices from RGB to YCrCb
4. AV Video Multimedia Producer- http://www.avvideo.com/
5. Camcorder & Computer Video – Miller Magazines, 4880 Market St., Ventura, CA 93003
6. Computer Videomaker - http://www.videomaker.com/
7. DV (Digital Video) - http://www.dv.com/
8. Videography - http://www.vidy.
9. Multimedia data formats
http://i31www.ira.uka.de/~semin94/Seminar.html
http://i31www.ira.uka.de/~semin94/02_JPEG/ (JPEG)
http://i31www.ira.uka.de/~semin94/06_MPEG/main_html.html (MPEG)
DVD
1. DVD Authoring Tool Scenarist
http://www.scenarist.com/products/index/snt_fam.html,
http://www.mtc2000.com/main.html
2. DVD FAQ at Videodiscovery – http://www.videodiscovery.com/vdyweb/dvd/dvdfaq.html
the basics of DVD
3. A Day at the DVD Forum – http://reality.sgi.com/nemec/dvd.html
technical notes on the requirements of MPEG-1 and MPEG-2 for DVD applications, highly recommended
4. Disctronics and Freehand DVD Video website - http://www.dvd-video.co.uk/
technical documents on DVD Video specifications and requirements
5. The Challenge of DVD Authoring
http://www.scenarist.com/white_papers/wp_challenge.html
6. DVD Authoring Tool Sonic DVD
http://www.dvdit.com/
7. DVD Authoring Tool Minerva
http://www.minervasys.com/dvd_solutions/dvd_default.html
8. Publishing in the Age of DVD
http://www.dvdcreator.com/pdf/dvd_primer.pdf;
an excellent introduction to all aspects of DVD-Video production
9. DVD FAQ
http://perso.libertysurf.fr/dvdutils/start_dvdfaq.htm
10. DVD Utils Frequently Asked Annoying ;-) Questions
http://perso.libertysurf.fr/dvdutils/start_f2aq.htm;
covers DVD players and drives from the point of view of "dedicated" users (region-free, macrovision, ...)
11. Digital Video Disc: The Coming Revolution in Consumer Electronics
http://www.c-cube.com/technology/dvd.html;
an excellent introduction to all aspects of DVD-Video
12. DVD Utils
http://www.dvdutils.com/;
the site for technically minded DVD enthusiasts who like to experiment
13. DVD Tools
http://perso.libertysurf.fr/dvdrip/
Appendix
This appendix contains excerpts from supplementary documentation (see the preceding chapter).
An Interactive Technology Primer (Excerpt: Compression)
This document is accessible from The Learning Center home page at http://tlc.nlm.nih.gov under "Resources",
"Publications".
Role of Compression in Digital Multimedia
Compression defined
Since digital multimedia files take up so much space and take so much time to transfer and present, they are
often compressed. Compression involves reducing the size of a file for storage and transmission and reconstituting (decompressing) the information for presentation. There are different compression-decompression
algorithms (CODECs). Information can be stored in compressed or uncompressed form on either digital optical or magnetic media. Special hardware may be required to decompress and display compressed information in some cases; in other cases only software may be needed. In the latter, display rate will depend more on
the speed of the computer's microprocessor and the speed at which information can be transferred from a
compact disc or hard disk or sent over a network. Compression can be done for still images, motion video,
and audio.
7.2 Still compression
One compression method, run length encoding, identifies adjacent pixels or lines on the screen having the
same luminance (brightness) and chrominance (color) and records this value along with how many times it
should be repeated. For example, instead of using 10 bytes to denote 5 red pixels followed by 5 blue ones by
encoding the information as RRRRRBBBBB, only 4 bytes are needed if the information is encoded as 5R5B.
Another compression method, differential pulse code modulation, records only the differences between adjacent pixels. A third method, discrete cosine transformation, converts blocks of screen pixels into frequency coefficients and
stores the coefficients for fine detail, which the eye notices least, with reduced precision.
Detail that the eye cannot detect is discarded, and more luminance information is
sampled than chrominance, since the eye is more sensitive to brightness. A fourth method, fractal compression, does not represent pixels, but uses mathematical formulas called fractals. It is based on the assumption
that image objects are made up of smaller objects that are just like them. For example, the entire sky is made
up of patches of sky that look like it. Fractal compression finds these image relationships, generates formulas
representing them, and discards pixel data. The result is high compression that allows scaling images to any
size without distortion. Since pictures can be scaled to be larger than originally, the technique can be used for
image enhancement. Scanners are used to digitally capture slides and photos.
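The run length example from the start of this section (RRRRRBBBBB stored as 5R5B) can be reproduced in a few lines. This is a minimal sketch that assumes each pixel is represented by a single non-digit character; real image formats store runs in binary form.

```python
from itertools import groupby


def rle_encode(pixels):
    """Replace each run of identical symbols by its length and the symbol."""
    return "".join(f"{len(list(run))}{value}" for value, run in groupby(pixels))


def rle_decode(encoded):
    """Expand count/symbol pairs back into the original pixel string."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch          # counts may have several digits
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)


packed = rle_encode("RRRRRBBBBB")
print(packed, len(packed), "bytes instead of", len("RRRRRBBBBB"))
print(rle_decode(packed))
```

The method only pays off when runs are long: a string with no repeats, such as "RBRB", grows to "1R1B1R1B" instead of shrinking.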
7.3 Motion compression
Full motion video involves recording images at a rate of 30 per second. Lots of images must be recorded, but
the information within each image is mostly redundant with prior and subsequent ones. Usually, the only thing
different in each image are those objects that have moved or changed position from one frame to the next.
This compression involves sampling full frames at specified intervals, compressing them, and then only compressing those parts of the intervening frames that have changed from one frame to the next. Video capture
boards often are needed to digitally capture motion episodes and they may be needed to display or enhance
the display of the recorded files. There are some software-only CODECs for presenting compressed motion
video files, including Apple's Quicktime (for Macintosh and Windows), Microsoft's Video for Windows, and
MPEG (a compression standard from the Moving Picture Experts Group of the ISO for a variety of platforms).
Indeo is compression software from Intel for capturing motion and creating compressed files in Video for
Windows audio video interleave (.avi) or Quicktime (.mov) formats. It is a derivative of the older Digital Video
Interactive (DVI) technology. There are two MPEG video compression formats -- MPEG or MPEG-1 and
MPEG-2 that create .mpg files. MPEG-2 is more recent and is used in digital video disc and digitally
broadcast video. It is a scalable compression standard offering several levels of audio quality and a variety of
frame sizes and transfer rates. Low level provides 352 x 240 pixel displays at 30 frames per second (fps) at a
maximum bit rate of 4 Mbps. Main level provides 720 x 480 pixel displays at 30 fps at a maximum bit rate of
15 Mbps. High 1440 provides 1440 x 1152 pixel displays at 30 fps at a maximum bit rate of 60 Mbps. High 1920
provides 1920 x 1080 pixel displays at 30 fps at a maximum bit rate of 80 Mbps.
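To get a feeling for how much compression these limits imply, the uncompressed data rate of each level can be compared with its maximum bit rate. The sketch below uses the frame sizes and rates listed above; the figure of 12 bits per pixel is an assumption (8-bit samples with 4:2:0 chroma subsampling), and the names are illustrative.

```python
# (width, height, frames per second, maximum bit rate in Mbps), as listed above
LEVELS = {
    "Low": (352, 240, 30, 4),
    "Main": (720, 480, 30, 15),
    "High 1440": (1440, 1152, 30, 60),
    "High 1920": (1920, 1080, 30, 80),
}

BITS_PER_PIXEL = 12  # assumption: 8-bit samples, 4:2:0 chroma subsampling


def raw_rate_mbps(width, height, fps, bits_per_pixel=BITS_PER_PIXEL):
    """Uncompressed video data rate in Mbps."""
    return width * height * fps * bits_per_pixel / 1_000_000


def minimum_compression_ratio(level):
    """Ratio between the raw rate and the level's maximum bit rate."""
    width, height, fps, max_mbps = LEVELS[level]
    return raw_rate_mbps(width, height, fps) / max_mbps


for name in LEVELS:
    print(f"{name}: at least {minimum_compression_ratio(name):.1f} to 1")
```

These are lower bounds at the maximum rate; a DVD title encoded at an average of about 3.5 Mbps needs a correspondingly higher ratio.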
7.4 Audio compression
There are several types of audio compression. Digital audio compression is a function of the sampling rate,
usually in kilohertz (kHz), at which sound is originally captured, the number of bits used to store the captured
sound, and how the bits are allocated. Higher sampling rates and the use of more bits result in higher quality.
Monaural sound is produced when all bits are allocated to a single channel and stereo sound is produced
when they are divided among two channels. Sometimes the quality of a compression level may be equated
with "telephone" quality or "CD" quality. MP3 (MPEG-1 Audio Layer 3, sometimes loosely called "MPEG 3") is a standard to compress audio in a
way that approaches the quality of CD audio.
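The dependence on sampling rate, bits per sample and channels can be written as a one-line formula for uncompressed (linear PCM) audio. In the sketch below, the CD parameters (44.1 kHz, 16 bits, stereo) are the standard ones; the "telephone" setting and the 128 kbps MP3 rate are merely illustrative assumptions.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed (linear PCM) bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(pcm_bps, compressed_bps):
    """How many times smaller the compressed stream is."""
    return pcm_bps / compressed_bps


cd_quality = pcm_bit_rate(44_100, 16, 2)   # CD audio: 44.1 kHz, 16 bits, stereo
telephone = pcm_bit_rate(8_000, 8, 1)      # assumed "telephone quality" setting

print(cd_quality, "bit/s for CD audio")
print(telephone, "bit/s for telephone quality")
print(compression_ratio(cd_quality, 128_000), "to 1 for an assumed 128 kbps MP3")
```

The same formula reproduces the linear PCM limit quoted for DVD: 48 kHz, 16 bits and 8 channels give 6,144,000 bit/s, i.e. 6.144 Mbps.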
7.5 Types of compression
Compression can be either symmetrical or asymmetrical, lossless or lossy. In symmetrical compression,
compression and decompression take the same time. In asymmetrical compression, compression takes longer
than decompression. Lossless compression means there is no deterioration of the image; lossy means there
is. Low compression ratios of, say, 2 to 1 can be lossless, but the higher ones needed for digital multimedia,
usually 50 or 60 to one, are lossy. This means that any one of the following may change: 1) the size of the
image, 2) the resolution of the image, 3) the amount of color in the image, and 4) if motion is used, the rate at
which individual pictures or frames can be displayed. When motion compression is done, there can be intraframe and interframe compression. The former is compression within a given frame or picture and the latter is compression between frames or pictures (e.g., of the changes between one picture and the next).
Scalable motion compression sacrifices the quality of the image and sound data to maintain a specified rate
of motion or frame rate (e.g., 30 frames per second or 15 frames per second). Scalable timing motion compression is when audio quality is preserved and frames are dropped to ensure an uninterrupted soundtrack.
AVI Overview (Excerpt: MPEG)
by John F. McGowan, Ph.D.
(c) 1996-1999, John F. McGowan
http://www.rahul.net/jfm/
How to convert AVI to MPEG?
AVI to MPEG Conversion at a Glance
Company/Author(s)         Product             Price         URL
---------------------------------------------------------------------
Corel                     PhotoPaint          $500?         http://www.corel.com/
Ulead                     MPEG Converter      $249          http://www.ulead.com/
Xing Technologies         XingMPEG Encoder    $89           http://www.xingtech.com/
                          XingMPEG Encoder 2 (May 6, 1997 release)
CeQuadrat                 PixelShrink         $199          http://www.cequadrat.com/
Vitec                     MPEG Maker          $125          http://vitechts.com/
MainConcept               MainActor           shareware     http://www.mainconcept.de/
Unknown                   avi2mpg1            freeware      http://www.mnsi.net/~jschlic1/
Stefan Eckart and others  CONVMPG3            freeware kit  http://www.powerweb.de/mpeg/msdos.html
Ligos Technology          LSX-MPEG Encoder    $179.95       http://www.ligos.com/
---------------------------------------------------------------------
Further information, reviews, and live links follow:
The following posting from the comp.graphics.animation USENET
newsgroup provides a good answer to this question. I have retained
the header to ensure proper credit to the author.
Note: LW refers to the Lightwave 3D animation software package.
Hi, I use LW to do animation, and basically I am not happy with any of the
compression engines aviable for avi. Those codes suck. So what I want to do
is make an UNCOMPRESSED AVI and then translate it to MPEG. anyone know of
any good converters to MPEG or hoe about plug-in for LW to be able to do
MPEG files from the start.
Thank You
Hi,
you're right, every single AVI compression codec is lame.
5 years of the AVI format existance and zero progress so far.
If you're talking about freeware or budget-priced MPEG codecs,
it's a tough task, to find the damn thing. I'm busy in this area
quite for a while already, and here are my findings:
1. XING's MPEG encoder is a classical name on the scene. Had
compatibility problems before, not anymore, I believe. Can cost
you $150 or more, not sure. Scan for 'XING' on the Net, you'll
definitely find some tracks (www.xing.com doesn't show up).
2. Stefan Eckart's CMPEG (DOS) encoder is FREE and GOOD, and stays
so for a couple of years already. Can have troubles converting
some particular streams, but generally not worse than many
commercial programs. (You need to make a TGA sequence first out
of your AVI, though). Again, scan for CMPEG, or use my bookmarks
found on the site Im introducing below.
3. To my surprise, Corel Photopaint 6 has got very decent built-in
MPEG compression option. Open an AVI, Save As an MPEG, and see what
happens (get some coffie, as it'll take a while ;) I checked it out
on a stream where CMPEG gave up and the Corel's conversion did make
a wonder. (If you like to see the result, download my 'Liquid Beatles'
morph clip, 1 Mb: http://www.proteon.nl/synth_art/movies/cross.mpg).
4. Ulead's MPEG converter (www.ulead.com) seems to be the major
player (priced below $250) on the Windows arena. I've heard good
references about their MPEG's quality, but I feel that their
biggest advantage is good integration with Windows and AVI format.
If I'm not mistaken, a very slow codec.
5. Don't mess with DARIM Vision's codec (Korea). I've tried their
demo, it produces low-quality crap. Though fast and cheap (you bet :-).
See my MPEG clips, fractals, morphs, and in general lots of
advanced graphics at
http://www.proteon.nl/synth_art/
Hope this helps,
Valery
http://www.proteon.nl/synth_art/movies.html
---------------------------------------------------------------
In addition to the above, there is MPEG Maker from
VITEC-HTS (formerly Vitec Multimedia). Vitec is:
Vitec
4366 Independence Court, Suite C
Sarasota, FL 34234
Voice: (941) 351-9344
FAX: (941) 351-9423
http://vitechts.com
CeQuadrat makes a software-only AVI to MPEG converter called
PixelShrink. CeQuadrat is:
CeQuadrat
1804 Embarcadero Road, Suite 101
Palo Alto, CA 94303
Voice: (415) 843-3780
FAX: (415) 843-3799
http://www.cequadrat.com/
And the freeware kit CONVMPG3, a collection of MS-DOS
utilities that can be used to convert AVI to MPEG-1 or
MPEG-1 to AVI. CONVMPG3 includes Stefan
Eckart's CMPEG MPEG-1 encoder mentioned above
but also includes utilities to generate the sequence
of Targa files required by CMPEG. The URL for CONVMPG3 is:
http://www.powerweb.de/mpeg/msdos.html
avi2mpg1 is a freeware command line application for Windows 95/NT
that can convert AVI to MPEG-1, supports audio, video, and
interleaved audio/video.
http://www.mnsi.net/~jschlic1/
MainConcept's MainActor product now (March 1997) includes
add-on modules to output MPEG-1 and MPEG-2. With these add-on
modules, MainActor can convert AVI to MPEG-1 or MPEG-2.
Marcus Moenig at MainConcept provided an evaluation copy of the
MPEG-1/2 modules.
In tests, these modules could convert AVI files
to MPEG-1 that could be played using the ActiveMovie software
MPEG player shipping with Microsoft's Windows 95 OSR2.
MainConcept is:
MainConcept
http://www.mainconcept.de
The URL for Ulead is:
Ulead MPEG Converter
http://www.ulead.com/
On May 6, 1997, Xing announced a new product, the Xing MPEG Encoder 2,
which accelerates MPEG encoding using Intel MMX instructions on PCs.
The original Xing MPEG Encoder did not use MMX instructions.
The Xing MPEG Encoder 2 can convert AVI and WAV files to MPEG-1.
The URL for Xing is:
Xing Technology Corporation
http://www.xingtech.com/
Ligos Technology markets an LSX-MPEG Encoder to convert
AVI to MPEG-1 and MPEG-2.
Ligos Technology
1475 Folsom St. Suite 200
San Francisco, CA 94103
+1-415-437-6137
+1-415-437-6139 FAX
[email protected]
http://www.ligos.com/
For further information on the MPEG digital audio and video
format see Tristan Savatier's comprehensive MPEG site:
http://www.mpeg.org/
and The MPEG Home Page:
http://drogo.cselt.it/mpeg/