Video Transcoding: January 2009

Thursday, January 29, 2009

Resizing SD to HD

Let’s now take a look at up rezing from SD to HD and a couple of different available options. We are going to consider NTSC and assume square pixels for simplicity.

SD has an aspect ratio of 4:3, with a video frame that is 640 pixels wide and 480 pixels high, or 640 x 480 (we always count horizontal first). The two flavors of HD are 1280 x 720 and 1920 x 1080, both with an aspect ratio of 16:9. Figure 2 shows frames with the same proportions to illustrate.

The simplest approach is to stretch the SD image to fill the HD frame. But this results in very fat looking people. Not very pretty. The next easy option is to enlarge the SD image in the vertical dimension to match HD first. For that we compute the ratio of the vertical sizes. For 720 we have 720/480 = 1.5. So we need to enlarge the horizontal size by the same ratio to maintain the picture proportions: 640 x 1.5 = 960. That means that we are left with 1280 – 960 = 320 pixels of width without picture. We can split the 320 in half to create two vertical black bands, each 160 pixels wide, one on each side of the SD image.

Now for the 1080 case we have 1080/480 = 2.25 and 640 x 2.25 = 1440. So we are left with 1920 – 1440 = 480 pixels of width in the 16:9 picture, which can be split into two bands of 240 pixels each, one on each side of the SD image.
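The arithmetic for both cases can be sketched in a few lines of Python. This is only an illustration of the calculation above, assuming square pixels; the function name pillarbox is my own and not part of any tool.

```python
def pillarbox(src_w, src_h, dst_w, dst_h):
    """Scale the source to the target height and center it horizontally.

    Returns (scaled_width, bar_width), where bar_width is the width of
    each black band on the left and right. Square pixels are assumed.
    """
    scale = dst_h / src_h               # e.g. 720 / 480 = 1.5
    scaled_w = round(src_w * scale)     # keep the picture proportions
    bar_w = (dst_w - scaled_w) // 2     # split the leftover width in two
    return scaled_w, bar_w


print(pillarbox(640, 480, 1280, 720))   # (960, 160)
print(pillarbox(640, 480, 1920, 1080))  # (1440, 240)
```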

In both cases (720 and 1080) the resulting video looks like figure 3.

There are some algorithms that stretch the image horizontally to fill the entire HD frame at the expense of distorting the original image. Some algorithms evenly stretch the image horizontally and others keep the center of the original image intact and incrementally stretch out to the sides (kind of like a fish eye effect.)

Next post: Resizing affects color?

Monday, January 26, 2009

Under $1000 Encoding Software Reviews

Below are two links to really good reviews of video conversion software under $1000 for the Mac and PC.

They contain samples and compare not only quality but also speed.

Check them out:

http://digitalcontentproducer.com/videoencodvd/revfeat/encoder_shootout_1208/index.html

http://digitalcontentproducer.com/videoencodvd/revfeat/expertise_encoder_shootout_0109/index3.html

Thursday, January 22, 2009

Resizing HD to SD

Let’s look at a very common problem these days: Down rezing from high definition (HD) to standard definition (SD) in NTSC.

HD comes in two flavors, 720 or 1080, which refers to the vertical dimension of the video frame. Considering that the image has square pixels (as in the previous post) the dimensions are:
Horizontal = 1280 and Vertical = 720, where the aspect ratio is 1280/720 = 1.78, more commonly known as 16:9 (16/9 = 1.78).

And the other flavor where Horizontal = 1920 and Vertical = 1080 that also gives an aspect ratio of 16:9. The full NTSC frame dimensions are Horizontal = 640 and Vertical = 480 with an aspect ratio of 4:3. (Figure 3)
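If you want to check the aspect ratios yourself, here is a small Python sketch that reduces each frame size to its simplest ratio. The helper name aspect_ratio is just for illustration.

```python
from fractions import Fraction


def aspect_ratio(width, height):
    """Reduce width/height to its simplest ratio, e.g. 1280/720 -> 16/9."""
    return Fraction(width, height)


for w, h in [(1280, 720), (1920, 1080), (640, 480)]:
    r = aspect_ratio(w, h)
    print(f"{w} x {h}: {r.numerator}:{r.denominator} = {float(r):.2f}")
```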

If we want to shrink the image without distortion we have to make a “Letterbox” 4:3 meaning that it will have black horizontal bands at the top and at the bottom of the frame. Otherwise people would look really skinny. (Figure 4)

Let’s say that we are working with 1080. We know that the horizontal dimension in SD NTSC should be 640 and that the HD horizontal dimension is 1920. To find the resizing factor we compute the ratio of the horizontal dimensions: 640/1920 = 1/3. We are shrinking the image to one third of its size. To keep the proportions of the HD image unaltered we also need to shrink the vertical dimension to one third. That means that the new vertical dimension is 1080/3 = 360 pixels. But since the full NTSC frame is 480 pixels high, we have 480 – 360 = 120 pixels that are unused. That means that each black band above and below the letterbox image is 60 pixels high.
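The same letterbox calculation can be written as a short Python sketch. It assumes square pixels, and the function name letterbox is only illustrative.

```python
def letterbox(src_w, src_h, dst_w, dst_h):
    """Scale the source to the target width and center it vertically.

    Returns (scaled_height, bar_height), where bar_height is the height
    of each black band at the top and bottom. Square pixels are assumed.
    """
    scaled_h = round(src_h * dst_w / src_w)   # 1080 * 640 / 1920 = 360
    bar_h = (dst_h - scaled_h) // 2           # split the unused lines in two
    return scaled_h, bar_h


print(letterbox(1920, 1080, 640, 480))  # (360, 60)
print(letterbox(1280, 720, 640, 480))   # (360, 60)
```

Both HD flavors end up with the same 360-line picture and 60-pixel bands, because they share the 16:9 aspect ratio.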

The most common method to fill an entire 4:3 target image with a 16:9 source is to do a Pan and Scan. That means that we use a 4:3 area of interest inside the 16:9 picture and move it around to where the action is in the image. Of course the drawback is that we miss a good portion of the original image. It is a costly and time consuming process that is performed manually to a great extent (Figure 5).
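To give an idea of the geometry, here is a rough Python sketch that computes the 4:3 window for a given point of interest. It assumes a full-height window and square pixels; the name pan_scan_window and the center_x parameter are made up for this example, and in real work an operator picks the position shot by shot.

```python
def pan_scan_window(src_w, src_h, center_x, target_aspect=4 / 3):
    """Return (left, width) of a full-height crop window.

    The window has the target aspect ratio and is centered on center_x
    as closely as the frame edges allow.
    """
    win_w = round(src_h * target_aspect)      # 1080 * 4/3 = 1440
    left = round(center_x - win_w / 2)
    left = max(0, min(left, src_w - win_w))   # keep the window inside the frame
    return left, win_w


print(pan_scan_window(1920, 1080, center_x=960))   # (240, 1440)
print(pan_scan_window(1920, 1080, center_x=100))   # (0, 1440)
```

For a 1920 x 1080 source the window is 1440 pixels wide, so 480 pixels of the original width are always out of view.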

Next post I will cover the "Up Rezing" algorithms and tips to up convert SD to HD.

Thursday, January 15, 2009

Resizing Algorithms

In this article I will describe the most common algorithms for resizing images and in particular how to convert images to a smaller size (“Down Rezing.”)

Resizing bitmaps or pixel based images to a smaller dimension consists of mapping a larger group of pixels in the original picture to a smaller group of pixels or even a single pixel. This seems pretty obvious; you start with a larger image (more pixels) and end up with a smaller one (fewer pixels). The important question is how you go about it. What algorithm will produce the best quality picture?

When you are shrinking an image to half its size you could simply decimate or discard every other pixel. But you end up with a very bad looking picture.
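A tiny Python sketch shows why. The image here is a made-up 4 x 4 checkerboard stored as a list of rows; decimate is my own helper name.

```python
def decimate(image, factor=2):
    """Keep every `factor`-th pixel in each direction and drop the rest."""
    return [row[::factor] for row in image[::factor]]


# A checkerboard of black (0) and white (255) pixels.
image = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]

print(decimate(image))  # [[0, 0], [0, 0]]
```

The checkerboard should shrink to a mid gray, but because every white pixel happens to be one of the discarded ones, the result is completely black.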

Let’s say that you divide the target image in a grid. Now, each cell in this grid corresponds to one pixel. We’ll assume that the dimensions of the source image are multiples of the target image dimensions to make things easier. You then make the same grid division on the source image so that you end up with the same number of cells in the grid. The cells in the source image grid will now contain more pixels than the target image. (Figure 1)

In nearest neighbor interpolation (Figure 1) you choose the pixel that is closest to the center of each grid cell without any regard to the other pixels in that cell. The images produced by this algorithm are of very poor quality, as much of the detail is lost because none of the surrounding pixels are taken into account when computing pixels for the new image.
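Here is a minimal Python sketch of nearest neighbor on a list-of-rows image. It is only an illustration; the function name and the way ties are broken (toward the lower right pixel when the cell center falls between pixels) are my own choices.

```python
def nearest_neighbor(image, dst_w, dst_h):
    """Resize a list-of-rows image by picking one source pixel per cell."""
    src_h, src_w = len(image), len(image[0])
    out = []
    for y in range(dst_h):
        # Center of target cell y, mapped into source coordinates.
        sy = min(int((y + 0.5) * src_h / dst_h), src_h - 1)
        row = []
        for x in range(dst_w):
            sx = min(int((x + 0.5) * src_w / dst_w), src_w - 1)
            row.append(image[sy][sx])   # every other pixel in the cell is ignored
        out.append(row)
    return out


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
print(nearest_neighbor(image, 2, 2))  # [[5, 7], [13, 15]]
```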

Bi-linear is another fast and very common algorithm. It takes the four pixels that surround the sample position and computes their weighted average, where each weight depends on how close that pixel is to the sample position. The result of this equation gives you the value of the pixel in the target image.
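As a rough sketch, the bi-linear equation for one sample looks like this in Python. The function name bilinear_sample is mine, and the code assumes the position (x, y) lies inside the image.

```python
def bilinear_sample(image, x, y):
    """Sample a list-of-rows image at fractional position (x, y).

    Assumes 0 <= x <= width - 1 and 0 <= y <= height - 1.
    """
    src_h, src_w = len(image), len(image[0])
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, src_w - 1), min(y0 + 1, src_h - 1)
    fx, fy = x - x0, y - y0             # distance from the top-left neighbor

    # Blend horizontally along the top and bottom rows, then vertically.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


image = [[10, 20], [30, 40]]
print(bilinear_sample(image, 0.5, 0.5))  # 25.0
```

Exactly halfway between the four pixels each one gets a weight of one quarter, so the result is their plain average.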

Bi-cubic interpolation produces much better quality pictures than bi-linear, but it takes much more processing time as the equations for calculating the value of pixels in the target picture are more complex. It takes into account the values of neighboring pixels and how they change.
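To show where the extra work comes from, here is a one-dimensional Python sketch using a cubic convolution kernel with a = -0.5, which is one common choice and not necessarily what any particular product uses. Bi-cubic applies the same idea along rows and then columns, so it looks at 16 neighboring pixels instead of 4. The function names are my own.

```python
def cubic_weight(d, a=-0.5):
    """Cubic convolution kernel; d is the distance to a neighboring sample."""
    d = abs(d)
    if d <= 1:
        return (a + 2) * d**3 - (a + 3) * d**2 + 1
    if d < 2:
        return a * d**3 - 5 * a * d**2 + 8 * a * d - 4 * a
    return 0.0


def cubic_sample(samples, x):
    """Interpolate a row of samples at fractional position x (x >= 0)."""
    i = int(x)
    total = 0.0
    for k in range(i - 1, i + 3):                     # four neighbors
        k_clamped = max(0, min(k, len(samples) - 1))  # repeat edge samples
        total += samples[k_clamped] * cubic_weight(x - k)
    return total


print(cubic_sample([0, 1, 2, 3], 1.5))  # 1.5
```

Note that the two outer neighbors get small negative weights, which is what lets the method follow how the values change instead of just averaging them.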

There are many other methods, such as fractals and super sampling, that produce even better quality images. They usually take longer to compute, but if quality is what you are after, you should consider using them. The link below shows some very good samples of each method. The samples on the page show enlarging, but I hope you get the idea.

http://www.dpreview.com/learn/?/key=interpolation

Next post will cover the specifics of resizing High Definition content to Standard Definition.

Monday, January 12, 2009

CES 2009

First, I want to thank all of you for taking the time to give me such valuable feedback for the blog.

Now, I will deviate a bit from what I usually write to report what I thought was notable at the CES show in Las Vegas last week.

In general, there were many products for consumers to view video on the internet. There were many devices that connected directly to the internet and offered direct access to yahoo! or YouTube videos. The internet is really making its way into the living room with monitors that are networkable as well as very inexpensive network appliances (under $200) that let you view videos on the internet using a simple remote control.

Monitor resolutions, contrast and size have greatly improved. I saw some 4K monitors that looked fantastic, but for which there is no content yet. High Definition video cameras with solid state drives are now the norm from every manufacturer. Many of the cameras use the AVCHD codec.

The definite trend is that video on-demand and on the internet will be the norm for years to come.

Next post I will try to cover some key concepts for resizing videos and how it affects quality.

Wednesday, January 7, 2009

Generation Loss. Avoid it!

Happy new year to all. I hope that everybody had a nice time over the holidays.

Generation loss is a term used to describe the degradation of video or audio quality when making copies. It is usually the primary cause for bad quality video. In this article I will give some tips to avoid it and encoding tips when it is unavoidable.

Let’s look at how generation loss happens. For example, you shoot a video with your FireWire camcorder and produce a camera original that is stored on a tape. You then transfer that video to your Windows computer using Movie Maker. The result is a Windows Media file that is the second generation from the camera original. Depending on the settings that you choose, there will be varying degrees of quality degradation. Now suppose that you would like to send the video to YouTube, but the file is too big for YouTube to accept. You are now forced to convert the video into a smaller file, so you open the second generation Windows Media file with Movie Maker one more time. You play with the parameters to compress the video further and manage to produce a file that is small enough to be accepted by YouTube. This file is the third generation from the camera original, with more quality degradation. So you upload it to YouTube and a few days later it shows up on the website, but it has now been converted to Flash format and it doesn’t look very good. This is the fourth generation from the camera original.

The obvious first tip to avoid generation loss is not to cause it in the first place. In the above example, when you get ready to transfer the video from the camera to the computer, you could choose video encoding parameters that produce a file smaller than the maximum size accepted by YouTube. You would then upload the second generation file directly to YouTube and save one generation of loss. If that sounds too complicated you can go to a free site, www.compressmyvideos.com, where you can transfer the videos from your DV camcorder, phone or webcam, or even use a file, and make the output file less than 100MB.

But you could also transfer the video from the camcorder to a lossless format or, in the case of digital cameras, to a format that is an exact copy of what is on the tape without any loss. Some software packages allow you to do that, including Movie Maker. Now, the size of the file will be pretty big. For a DV AVI the file size is roughly 215MB per minute. For uncompressed YUV video files (4:2:2) it is roughly 1.25GB per minute. For full RGB uncompressed video you need about 1.87GB per minute in NTSC (29.97 frames per second). Always try to make a lossless copy of the original video. If it came from a DV camcorder, use a DV AVI or “bump it up” to YUV or RGB.
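You can estimate the uncompressed sizes yourself with a little arithmetic. The Python sketch below assumes a 720 x 480 frame, 8 bits per sample and decimal gigabytes; those are my assumptions, and the function name gb_per_minute is only for illustration. With 24 bits per pixel (full RGB) it comes out close to the RGB figure above, with 16 bits per pixel (4:2:2 sampling) it gives about 1.24GB, and with 12 bits per pixel (4:2:0 sampling) about 0.93GB.

```python
def gb_per_minute(width, height, bits_per_pixel, fps=29.97):
    """Uncompressed video size for one minute, in decimal gigabytes."""
    bytes_per_frame = width * height * bits_per_pixel / 8
    return bytes_per_frame * fps * 60 / 1e9


# Average bits per pixel with 8 bits per sample (assumed).
for name, bpp in [("RGB", 24), ("YUV 4:2:2", 16), ("YUV 4:2:0", 12)]:
    print(f"{name}: {gb_per_minute(720, 480, bpp):.2f} GB per minute")
```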

With tapeless cameras it is easier to obtain a lossless copy of the camera original, as you can just copy the file to your computer via USB. Just make sure that the appropriate codec to “decode” this file is installed on the computer, otherwise you might be forced to use some kind of software utility to convert it to another format that your computer knows how to “decode.” This can potentially induce a generation loss.

The next tip is not so obvious, but it has a great impact in image quality when making several copies or transcodes. Try not to resize your video. Always try to keep the size of your video picture the same through the conversion process. If you really have to change the size, it is always better to go from big to small.

My last tip: if you need to make the video file small, convert your video using a high quality codec. Some codecs are much better than others. MPEG-2 is better than MPEG-1 and MPEG-4 (Part 2) is better than MPEG-2, but VC-1 (Windows Media), H.264 (also known as MPEG-4 Part 10 or AVC) and VP6 are generally better than the older MPEG codecs. Always keep looking for new and improved video codecs, as technology in this field changes rapidly.

Tomorrow I'll be traveling to Vegas to check out CES. I'll be back with a report of the newest, coolest gadgets for video transcoding presented at the show.
