ISO/IEC 13818-2, Information technology — Generic coding of moving pictures and associated audio information — Part 2: Video (Third edition). Amendment 2 to the International Standard was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology.
The standard was later extended by two amendments to include the Registration of Copyright Identifiers and the 4:2:2 Profile. Video compression is practical because the data in pictures is often redundant in space and time.
For example, the sky can be blue at the top of a picture and that blue sky can persist for frame after frame. Also, because of the way the eye works, it is possible to delete some data from video pictures with almost no noticeable degradation in image quality.
TV cameras used in broadcasting usually generate 50 pictures a second (in Europe) or about 60 (in North America). Digital television requires that these pictures be digitized so that they can be processed by computer hardware. Each picture element (a pixel) is then represented by one luma number and two chroma numbers. These describe the brightness and the color of the pixel (see YCbCr).
Thus, each digitized picture is initially represented by three rectangular arrays of numbers. If the video is not interlaced, then it is called progressive video and each picture is a frame.
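The split of each pixel into one luma number and two chroma numbers can be illustrated with a full-range BT.601-style conversion from RGB. This is a minimal sketch for illustration; the exact coefficients, offsets, and value ranges used in practice depend on the video standard in question:

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel (0-255 per channel) to full-range Y'CbCr
    using BT.601-style luma coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b              # luma: weighted brightness
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b   # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b   # red-difference chroma
    return y, cb, cr

# A neutral gray carries no color information: both chroma values sit
# at the midpoint 128, and luma equals the gray level.
print(tuple(round(v, 3) for v in rgb_to_ycbcr(100, 100, 100)))  # → (100.0, 128.0, 128.0)
```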
H.262/MPEG-2 Part 2
MPEG-2 supports both options. Another common practice to reduce the amount of data to be processed is to subsample the two chroma planes after low-pass filtering to avoid aliasing. This works because the human visual system resolves details of brightness better than details in the hue and saturation of colors. Video that has luma and chroma at the same resolution is called 4:4:4; the subsampled formats are called 4:2:2 and 4:2:0. MPEG-2 supports all three sampling types, although 4:2:0 is by far the most common.

MPEG-2 includes three basic types of coded frames: I-frames, P-frames, and B-frames. An I-frame is a compressed version of a single uncompressed raw frame.
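The chroma subsampling just described can be sketched as follows: for 4:2:0, each 2×2 neighborhood of a chroma plane is reduced to a single value. A plain average stands in here for the low-pass filtering; the function name and list-of-lists plane representation are illustrative assumptions, not anything defined by the standard:

```python
def subsample_chroma_420(plane):
    """Reduce a chroma plane to half resolution in both directions (4:2:0-style).

    `plane` is a list of rows with even dimensions; each 2x2 block is
    replaced by its average, discarding three quarters of the samples.
    """
    h, w = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]

# A 4x4 chroma plane becomes 2x2.
plane = [[10, 20, 30, 40],
         [10, 20, 30, 40],
         [50, 60, 70, 80],
         [50, 60, 70, 80]]
print(subsample_chroma_420(plane))  # → [[15.0, 35.0], [55.0, 75.0]]
```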
It takes advantage of spatial redundancy and of the inability of the eye to detect certain changes in the image. Unlike P-frames and B-frames, I-frames do not depend on data in the preceding or the following frames.
Briefly, the raw frame is divided into 8 pixel by 8 pixel blocks. The data in each block is transformed by the discrete cosine transform (DCT). The result is an 8 by 8 matrix of coefficients.
The transform converts spatial variations into frequency variations, but it does not change the information in the block; the original block can be recreated exactly by applying the inverse cosine transform. The advantage of doing this is that the image can now be simplified by quantizing the coefficients. Many of the coefficients, usually the higher frequency components, will then be zero. The penalty of this step is the loss of some subtle distinctions in brightness and color.
If one applies the inverse transform to the matrix after it is quantized, one gets an image that looks very similar to the original image but that is not quite as nuanced.
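The transform-and-quantize round trip can be sketched with a direct (unoptimized) 2-D DCT on an 8×8 block. The quantizer below divides every coefficient by a single step size and rounds, which is a deliberate simplification: a real MPEG-2 encoder uses per-coefficient quantization matrices and scale factors:

```python
import math

N = 8  # MPEG-2 transform block size

def dct2(block):
    """Naive 2-D type-II DCT of an NxN block (orthonormal scaling)."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    for x in range(N) for y in range(N))
            cu = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
            cv = math.sqrt(1.0 / N) if v == 0 else math.sqrt(2.0 / N)
            out[u][v] = cu * cv * s
    return out

def idct2(coeff):
    """Inverse of dct2: reconstructs the spatial block from its coefficients."""
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    cu = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
                    cv = math.sqrt(1.0 / N) if v == 0 else math.sqrt(2.0 / N)
                    s += (cu * cv * coeff[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[x][y] = s
    return out

def quantize(coeff, q):
    """Divide and round: small (mostly high-frequency) coefficients become zero."""
    return [[round(c / q) for c in row] for row in coeff]

def dequantize(qcoeff, q):
    return [[c * q for c in row] for row in qcoeff]

# A gentle horizontal ramp: after quantization most coefficients are zero,
# yet the inverse transform still yields a close approximation of the block.
block = [[x * 4 for x in range(N)] for _ in range(N)]
q = 16
approx = idct2(dequantize(quantize(dct2(block), q), q))
max_err = max(abs(approx[i][j] - block[i][j]) for i in range(N) for j in range(N))
print(max_err < q)  # the reconstruction error stays below one quantizer step
```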
Next, the quantized coefficient matrix is itself compressed. Typically, one corner of the quantized matrix is filled with zeros.
By starting in the opposite corner of the matrix, then zigzagging through the matrix to read the coefficients into a string, then substituting run-length codes for consecutive zeros in the string, and then applying Huffman coding to that result, one reduces the matrix to a smaller array of numbers. It is this array that is broadcast or that is put on DVDs. In the receiver or the player, the whole process is reversed, enabling the receiver to reconstruct, to a close approximation, the original frame.
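The zigzag-and-run-length step can be sketched as follows. The scan below is the standard diagonal zigzag order; the run-length code is a simplified (zero_run, value) scheme with an end-of-block marker, not the actual variable-length code tables defined by the standard:

```python
def zigzag_order(n=8):
    """Coordinates of an n x n block in zigzag scan order (low to high frequency)."""
    return sorted(((u, v) for u in range(n) for v in range(n)),
                  key=lambda p: (p[0] + p[1],                       # walk anti-diagonals
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))  # alternate direction

def run_length(values):
    """Encode a scanned coefficient list as (zero_run, value) pairs plus end-of-block."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block marker
    return pairs

# A typical quantized block: a few low-frequency coefficients, then all zeros.
coeffs = [[0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[1][1] = 7, -5, 3
scanned = [coeffs[u][v] for u, v in zigzag_order()]
print(run_length(scanned))  # → [(0, 7), (0, -5), (2, 3), 'EOB']
```

Note how the 64-entry block collapses to four symbols: the long tail of zeros costs nothing beyond the end-of-block marker, which is where most of the compression in this step comes from.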
Typically, every 15th frame or so is made into an I-frame. P-frames provide more compression than I-frames because they take advantage of the data in a previous I-frame or P-frame – a reference frame. To generate a P-frame, the previous reference frame is reconstructed, just as it would be in a TV receiver or DVD player.
The frame being compressed is divided into 16 pixel by 16 pixel macroblocks. Then, for each of those macroblocks, the reconstructed reference frame is searched to find the 16 by 16 macroblock that best matches the macroblock being compressed.

The offset is encoded as a "motion vector." Often the offset is zero, but if something in the picture is moving, the offset might be something like 23 pixels to the right and 4 pixels up. The match between the two macroblocks will often not be perfect.
To correct for this, the encoder takes the difference of all corresponding pixels of the two macroblocks, and on that macroblock difference then computes the strings of coefficient values as described above. This "residual" is appended to the motion vector and the result sent to the receiver or stored on the DVD for each macroblock being compressed.
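The macroblock search and residual described above can be sketched with an exhaustive sum-of-absolute-differences (SAD) search over a small window. A 4×4 block and a ±4 search range keep the example small; real encoders use 16×16 macroblocks, larger search ranges, sub-pixel precision, and faster search strategies than this brute-force scan:

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def extract(frame, x, y, n):
    """Cut an n x n block out of a frame at top-left corner (x, y)."""
    return [row[x:x + n] for row in frame[y:y + n]]

def best_match(ref, block, x, y, search=4):
    """Exhaustively search `ref` around (x, y) for the block with the lowest SAD."""
    n = len(block)
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = x + dx, y + dy
            if 0 <= x0 and 0 <= y0 and x0 + n <= len(ref[0]) and y0 + n <= len(ref):
                cost = sad(extract(ref, x0, y0, n), block)
                if cost < best_cost:
                    best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost

# Reference frame with unique pixel values; the current block is reference
# content that "moved" 2 right and 1 down, so the search recovers that vector.
ref = [[16 * y + x for x in range(16)] for y in range(16)]
block = extract(ref, 6, 5, 4)            # content that moved from (6, 5)
mv, cost = best_match(ref, block, 4, 4)  # macroblock position is (4, 4)
matched = extract(ref, 4 + mv[0], 4 + mv[1], 4)
residual = [[b - m for b, m in zip(brow, mrow)] for brow, mrow in zip(block, matched)]
print(mv, cost, all(v == 0 for row in residual for v in row))  # → (2, 1) 0 True
```

With a perfect match the residual is all zeros and compresses to almost nothing; in practice the match is imperfect and the residual is transform-coded exactly like an I-frame block.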
Sometimes no suitable match is found.
Then, the macroblock is treated like an I-frame macroblock. The processing of B-frames is similar to that of P-frames except that B-frames use the picture in a subsequent reference frame as well as the picture in a preceding reference frame. As a result, B-frames usually provide more compression than P-frames.
B-frames are never reference frames. While the above generally describes MPEG-2 video compression, there are many details that are not discussed including details involving fields, chrominance formats, responses to scene changes, special codes that label the parts of the bitstream, and other pieces of information.
For many applications, it is unrealistic and too expensive to support the entire standard. To allow such applications to support only subsets of it, the standard defines profiles and levels. A profile defines sets of features such as B-pictures, 3D video, chroma format, etc. The level constrains the memory and processing power needed, defining maximum bit rates, frame sizes, and frame rates.
An MPEG application then specifies its capabilities in terms of profile and level.
The tables below summarize the limitations of each profile and level, though there are constraints not listed here; see Annex E of the standard. Note that not all profile and level combinations are permissible, and scalable modes modify the level restrictions.