Wednesday, October 7, 2009
Practical Mobile Video Workflow
http://mobiledevdesign.com/tutorials/ensure-video-quality-your-mobile-delivery-100209/
Thursday, June 18, 2009
Open Video Conference
What is open video? (Excerpt from the website)
Open Video is a movement to promote free expression and innovation in online video. Join us for three days of inspiring talks, awesome video and film, open hacking sessions, parties, and cutting edge tech from around the world.
Check it out.
Tuesday, May 19, 2009
The order of processing affects the result.
I always try to follow the order below for best results.
1) De-interlace if needed.
2) Frame rate conversion if the target rate is lower than the original.
3) Cropping if needed.
4) Resize if reducing the image size.
5) Color space conversion.
6) De-noise.
7) Resize if enlarging the image.
8) Frame rate conversion if the target rate is higher than the original.
9) Sharpen or other enhancements as needed.
The idea is to start with a progressive image, then reduce the frame count and frame size early so that you work with smaller data sets. Perform color space conversion and apply any noise reduction next. If you are enlarging the image, do it after the color space conversion and noise reduction, and only then apply any frame rate increase.
Finally, apply sharpening or any other image enhancement filters to the finished image.
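As an illustration, here is a minimal sketch of that order expressed as an ffmpeg filter chain driven from Python. The file names, crop window, sizes and frame rates are placeholder choices, and it assumes an ffmpeg build with the yadif, fps, crop, scale, hqdn3d and unsharp filters:

```python
import subprocess

# Filters run left to right, mirroring the order above for a downscaling job:
# de-interlace -> reduce frame rate -> crop -> downscale -> de-noise -> sharpen
filters = ",".join([
    "yadif",             # 1) de-interlace
    "fps=15",            # 2) frame rate conversion (target lower than source)
    "crop=704:464:8:8",  # 3) crop away 8 pixels of edge junk on each side
    "scale=320:240",     # 4) resize (reducing)
    "hqdn3d",            # 6) de-noise (step 5, color space conversion, is implicit)
    "unsharp",           # 9) sharpen last
])
subprocess.run(["ffmpeg", "-i", "source.avi", "-vf", filters, "out.mp4"], check=True)
```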
Thursday, May 14, 2009
Preprocessing
There are six important processes that you should consider applying to your source video prior to encoding:
1) De-interlacing. Avoids creating line-crawling artifacts.
2) Noise reduction. Motion-adaptive noise reduction can greatly improve the performance of your encoder, as random noise in the image produces artifacts in the encoded picture.
3) Color space conversion. High quality color space conversion algorithms can reduce the color bands and "blotches" introduced by bit-depth reduction and color plane decimation (going from RGB or 4:4:4 to 4:2:0, for example).
4) Resampling and resizing. Will greatly reduce encoding time and improve quality. Especially true if you are downsampling or reducing the image size.
5) Cropping. Same as above.
6) Frame rate conversion. Allows you to use high quality frame rate conversion algorithms. If you are reducing the frame rate, you get the added benefit of faster encodes.
Tuesday, April 28, 2009
Video Encoding Parameter Calculator
Tuesday, April 21, 2009
AviSynth
AviSynth is not a video editing or special effects workstation. Its strengths lie in letting the video professional pre-process video to resize, crop or remove noise, and conform video sources with simple cuts. Basically, you create a script that tells AviSynth which filters to apply and in what order. If you are not comfortable writing scripts in text form, you can use the VirtualDubMod or MeGUI graphical user interfaces to create them.
AviSynth comes in very handy when preparing videos for compression in an automated and streamlined fashion. In particular, you can perform noise reduction, resizing and color conversions to ensure that the compressed video looks as good as possible.
It is an open source project with a large following. Check it out.
Friday, April 17, 2009
Fonts
Some broadcasters recommend, as a rule of thumb, scaling font sizes for HD 1920 x 1080 video by about 80% of the full HD/SD height ratio. For example, say you have been using a font that is 23 scan lines high on SD NTSC. First take the ratio of the HD and SD heights: 1080 / 486 = 2.22, and 80% of 2.22 is 1.776. Use 1.776 as the font scaling factor. That means our new font size is 23 x 1.776 = 41 scan lines high (with rounding).
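Here is that arithmetic as a small Python helper, a sketch using the 80% factor and the 486 active NTSC lines from the rule of thumb above:

```python
def hd_font_size(sd_font_lines, sd_height=486, hd_height=1080, factor=0.8):
    """Scale an SD font height (in scan lines) to HD with the 80% rule of thumb."""
    scale = (hd_height / sd_height) * factor  # 1080/486 * 0.8 = 1.776 (rounded)
    return round(sd_font_lines * scale)

print(hd_font_size(23))  # -> 41 scan lines, matching the example above
```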
Always check how your titles look in the target player and display to be sure that the text is readable and that aesthetics translate.
Wednesday, April 8, 2009
Audio Quality
Below is a list of audio encoding artifacts to check for as well as some tips.
- Lip-sync. Most people find "out-of-sync" audio unacceptable.
- Audio dropouts and pre-echo are perceived as "very annoying" artifacts by most people. They tend to be more prevalent in percussive material like a gunshot or a drum hit. Increase the bit rate as much as possible to compensate.
- Narrow audio program bandwidth. If high audio frequencies have been removed, the audio will sound "nasal" or "muddy". If low audio frequencies have been removed, the audio will sound "thin" and "harsh". Make sure the resulting audio covers as much of the frequency spectrum as the original. Keep the audio sampling rate the same as the source, or at or above 44.1 kHz, for best results.
- "Garbled" audio due to over-compression tends to have a sort of underwater sound. These artifacts are more noticeable in material with random characteristics like rain, applause or city traffic. Give the audio stream enough bits; try to keep audio bit rates above 96 kbps for best results.
- Summing some stereo material to mono can cause phasing artifacts and audio dropouts. If you hear this artifact, experiment with taking only the left or only the right channel; you might get better results.
- Audio resampling or clock errors often have a metallic, ringing sound and can produce loud ticks. If you are encoding video from tape or another hardware playback device, make sure the audio is properly clocked and sync-locked. When using AES/EBU or S/PDIF audio, clock your input device to the source's digital signal as a rule of thumb.
- Make sure the audio is free of noise (hiss). If you have to pre-process the audio with noise suppression, be careful not to go overboard, as you might induce more serious artifacts. Also, make sure your audio has the required bit resolution. Today, 16-bit audio is commonplace; avoid anything lower.
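To put the sampling rate and bit rate tips into practice, here is a minimal sketch using ffmpeg from Python. The file names are placeholders, and 128 kbps is just one reasonable choice consistent with the "above 96 kbps" guideline:

```python
import subprocess

subprocess.run([
    "ffmpeg", "-i", "source.mov",
    "-ar", "44100",  # sampling rate: 44.1 kHz (or match the source if higher)
    "-ac", "2",      # stereo; try a single channel if a mono downmix sounds phasey
    "-b:a", "128k",  # keep the audio bit rate above ~96 kbps
    "-c:v", "copy",  # leave the video stream untouched in this example
    "out.mov",
], check=True)
```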
Thursday, April 2, 2009
De-interlacing algorithms
De-interlacing algorithms fall into two broad families:
- Spatial based
- Motion based
Some popular spatial based algorithms are:
- Simply keeping one field, discarding the other, and doubling each line. (Produces aliasing)
- Averaging or blending both fields. (Produces ghosting)
- Edge-dependent interpolation between adjacent lines of the odd and even fields. (Few artifacts as long as the edge direction is estimated accurately)
Some motion based algorithms are:
- Selective blending (or motion-adaptive blending) blends only the areas where there is motion and weaves where there is none. (Few artifacts if the motion detection is correct; a toy sketch follows this list)
- Motion- and edge-adaptive algorithms combine edge interpolation techniques with motion-adaptive blending. (Fewer artifacts, at computational expense)
- Motion-compensated algorithms use motion vectors to find the matching blocks in neighboring fields. (Fewer artifacts, at computational expense)
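To make selective blending concrete, here is a toy NumPy sketch. The per-pixel difference test and the threshold are arbitrary illustrations, nothing like a production de-interlacer:

```python
import numpy as np

def motion_adaptive_deinterlace(prev_frame, cur_frame, threshold=12):
    """Weave (keep pixels) where the picture is static, blend fields where it moved.

    prev_frame, cur_frame: 2-D uint8 luma arrays of full interlaced frames.
    """
    cur = cur_frame.astype(np.float32)
    # Crude motion detector: per-pixel difference against the previous frame.
    moving = np.abs(cur - prev_frame.astype(np.float32)) > threshold
    # Blended version: average each line with the one below, mixing the two fields.
    blended = (cur + np.roll(cur, -1, axis=0)) / 2.0
    return np.where(moving, blended, cur).astype(np.uint8)
```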
Be sure to check which algorithms your encoding software implements; the choice can have a great impact on your final video quality.
Thursday, March 26, 2009
De-interlacing
To reproduce the image in its original resolution on a "progressive" type display (LCD, plasma, etc.), both fields can be combined. However, because each field is recorded at a slightly different time (1/59.94 s, about 16.68 ms apart), some artifacts may show up in the resulting picture. These artifacts, particularly visible in scenes with fast motion, appear as a kind of line crawling over the picture.
Most video-originated material is interlaced, and some film material that has been converted to video (telecined) can be effectively interlaced as well. For that reason, when transcoding video for playback on computer systems or other "progressive" type displays, always check whether the video source is interlaced, and if so, apply de-interlacing. If the source is a telecine of a film, use an inverse telecine setting instead.
The link below gives some great examples of interlacing. Check it out.
Monday, March 16, 2009
VirtualDub
VirtualDub is one of many very useful open source programs that allows you to quickly convert and process video.
VirtualDub provides you with a framework for basic video editing, compression and filtering. It comes with a few plug-ins for encoding, frame rate conversion and image processing, but there are many third party plug-ins in the open source community that can be added very easily.
http://www.thedeemon.com/VirtualDubFilters/
With this powerful application you can stitch videos together, change audio or add overlays very quickly. This is one tool that video compression professionals must have in their arsenal. You can get it here:
http://www.virtualdub.org/
More technical video professionals can create or adapt their own plug-ins for VirtualDub using its open source filter SDK.
http://www.virtualdub.org/filtersdk.html
As with any open source solution, the main issue is support. However, with VirtualDub there is a very large community of developers and users that you can tap to find answers.
Monday, March 2, 2009
Video stream analyzers
Elecard StreamEye
StreamEye has an intuitive interface that lets you navigate and display the stream picture by picture while it determines the real bit rate and compares it to the header values. It will tell you the stream's real peak bit rate and how it varies, as well as per-frame data sizes. It gives you access to macroblock information and its file address as it is displayed on the corresponding image.
http://www.elecard.com/products/products-pc/professional/streameye-studio/
MTS4EA, Tektronix's compressed video elementary stream analyzer.
A much more powerful tool that can help codec developers debug and optimize codec code. The learning curve is steep, as its user interface offers a lot of features. Among them:
- Verifies stream compliance against the standard.
- Performs a difference analysis against a reference video (.yuv file) and calculates PSNR, RMSE, MSE and other measurements.
- Provides macroblock statistics, and the video view selection can show you the hex code for a macroblock or a set of macroblocks.
- Includes a bit stream editor for debugging applications and a batch mode with logging capability.
- The image inspector lets you view pixel data for individual macroblocks and separates them into their component channels for YUV or RGB images.
- A macroblock type overlay lets you see, in color, whether a macroblock is an intra 4x4 or inter type block, and motion vectors are overlaid as color-coded arrows on B and P frames.
- Produces a bit usage histogram and provides buffer analysis.
http://www.tek.com/Measurement/applications/video/mpeg_derivatives.html?WT.srch=1&WT.mc_id=ppc,ggl,vid_aw_us_us_ngt,k3AF2,s,1520098873&
Interra Vega-Media analyzers
I wasn’t able to get a trial version of this product so I can only paraphrase what they have on their website.
It supports H.264, VC-1, MPEG-2 and audio, and compares against other encoded streams. It has a YUV diff utility to evaluate quality against an original video, and it provides syntax analysis tools and frame statistics for analysis of interpolation and prediction.
http://www.interrasystems.com/dms/dms_vega.php
Monday, February 23, 2009
HPA Retreat Notes
Here are my notes.
The EBU has published their recommendations for HDTV video compression for acquisition, production and distribution in this document: http://tech.ebu.ch/docs/r/r124.pdf
The full report is only available to EBU members, but we can take note that:
- Progressive pictures look better than interlaced.
- Acquisition is recommended at 4:2:2, where 8-bit is sufficient.
- To maintain quality after 7 cycles of encoding, long-GOP, CBR MPEG-2 at 50 Mbit/sec or more is recommended (production and post).
- For distribution, H.264/AVC CBR encoding requires about 50% less bit rate than MPEG-2.
- Interlaced video requires 20% more bit rate than progressive video for the same quality.
- The dominant image quality impairments are determined by the distribution codec.
On that panel, engineers from three major broadcast studios shared their own conclusions: MPEG-4 @ 20 Mbit/sec, MPEG-4 @ 25 Mbit/sec and MPEG-2 @ 45 Mbit/sec are used in their respective facilities to maintain reasonably constant quality.
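Treating those findings as simple multipliers gives a toy rate estimator; a sketch where the 50 Mbit/sec MPEG-2 figure, the 50% H.264 saving and the 20% interlace penalty come straight from the notes above, and everything else is an assumption:

```python
def bitrate_estimate_mbps(codec="h264", interlaced=False, mpeg2_baseline=50.0):
    """Rough bit rate estimate from the rules of thumb in the EBU notes."""
    rate = mpeg2_baseline
    if codec == "h264":
        rate *= 0.5  # H.264/AVC needs ~50% less bit rate than MPEG-2
    if interlaced:
        rate *= 1.2  # interlaced needs ~20% more bit rate than progressive
    return rate

print(bitrate_estimate_mbps("h264", interlaced=True))  # -> 30.0 Mbit/sec
```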
Monday, February 16, 2009
Automated Transcoding and Publishing
A few companies have made great efforts to automate this kind of workflow, but one that stood out for me recently is Entriq's Dayport solution: http://www.entriq.com/.
Their solution lets content creators integrate their workflow with Final Cut (carrying relevant metadata into the content management system), capture live off air or from tape (which can be scheduled to trigger automatically), generate thumbnails, create rules for security and usage policy, and handle DRM, syndication and CDN integration for delivery. Companies that syndicate content to multiple partners can automate the process and save a lot of time re-formatting metadata, transcoding, and so on.
Entriq provides a UGC platform that incorporates an approval process along with download controls and players, making it possible to automate video workflows from creation to publishing.
Wednesday, February 11, 2009
Cloud Video Transcoding
Then there is Panda: http://pandastream.com/. It is an open source solution that runs on Amazon's EC2 and S3 web services. The main issue here is licensing, as it is an open source application based on ffmpeg.
Zencoder (http://zencoder.tv), currently in beta, also runs on EC2 or on your own Linux hardware. It works with the On2 Flix engine and ffmpeg.
Sorenson Squish: http://www.sorensonmedia.com/products/?pageID=1&ppc=12&p=41. It is a distributed, Java-based, client-side encoding solution that uses the submitter's computer as the transcoding node.
Now for the plug: Framecaster iNCoder Pro (http://framecaster.com/products.html). It is a distributed, client-side web plug-in. Like Squish, it performs all video encoding on the submitter's computer, directly from a DV cam, webcam, or a Bluetooth enabled cell phone. It offers a JavaScript API for easy integration with an asset management system, and can output H.264, FLV (VP6), MPEG-4, MPEG-2, or WMV.
A couple of questions about Panda and Zencoder, since they use and deploy ffmpeg: How does one manage the licensing of codecs across all the different IP owners? Is that a potential liability for patent infringement? Maybe someone out there has answers to those questions.
Monday, February 9, 2009
Standard Video Players
Akamai has spearheaded the Open Video Player initiative to provide a set of Flash and Silverlight classes, sample code and other documentation that serves as a media framework and a path to standardize video players.
The application structure from the engineering perspective is described in figure 1 below.
The sample code will give you a head start in creating and fine-tuning your own player that adheres to this set of best practices. There are many samples showing how to integrate with advertising platforms and create custom skins. You'll also get samples showing how to connect to Akamai's servers and manage different types of streams and playlists.
Check it out: http://www.openvideoplayer.com/
Saturday, February 7, 2009
Video encoding basics
Check it out: http://www.flashconnections.com/?p=66
Wednesday, February 4, 2009
Resizing Affects Color?
4:2:2, 4:2:0, 4:0:0, etc. usually refer to a YCbCr or YUV (Y'UV) image, where Y is the luma (intensity or detail) and U and V are the chrominance (color) components. You can think of these as three planes, Y, U and V, that create a color image when superimposed. More on the subject here: http://en.wikipedia.org/wiki/YUV.
In the case of 4:2:2 images, the horizontal size of the chrominance planes is reduced by half, usually by discarding every other pixel. 4:2:0 images have their UV planes reduced by half in both the horizontal and vertical directions. When the original image is recreated, the UV planes are enlarged back to their original size using simple linear interpolation. This process in itself causes artifacts, and they are compounded when the video is resized (see previous articles on resizing algorithms). These errors in the UV planes are most obvious in smooth colors and slow gradients.
Whenever possible, it is better to resize your images in the RGB or 4:4:4 space and then convert to a subsampled YUV type format to minimize artifacts. Try to follow this rule especially with computer animation or other computer generated images.
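A toy NumPy sketch of the 4:2:0 round trip on a single chroma plane; the 2x2 averaging and the pixel-repeat upsampler are illustrative choices, not any particular codec's filters:

```python
import numpy as np

def chroma_420_round_trip(u_plane):
    """Decimate a chroma plane 2x in both directions, then enlarge it back."""
    u = u_plane.astype(np.float32)
    # 4:2:0 decimation: average each 2x2 block into one chroma sample.
    small = (u[0::2, 0::2] + u[1::2, 0::2] + u[0::2, 1::2] + u[1::2, 1::2]) / 4.0
    # Cheap reconstruction: repeat each sample 2x2, as a fast upsampler would.
    return small.repeat(2, axis=0).repeat(2, axis=1).astype(np.uint8)

# On a smooth gradient the round trip leaves visible steps in the chroma.
gradient = np.tile(np.arange(0, 256, 2, dtype=np.uint8), (16, 1))
error = np.abs(gradient.astype(int) - chroma_420_round_trip(gradient).astype(int))
print(error.max())  # non-zero: chroma detail was lost in the round trip
```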
Sunday, February 1, 2009
Non Square Pixels
Thursday, January 29, 2009
Resizing SD to HD
SD has an aspect ratio of 4:3, with a video frame 640 pixels wide by 480 pixels high, or 640 x 480 (we always count horizontal first). The two flavors of HD are 1280 x 720 and 1920 x 1080, both with an aspect ratio of 16:9. Figure 2 shows frames with the same proportions to illustrate.
The simplest approach is to stretch the SD image to fill the HD frame, but this results in very fat-looking people. Not very pretty. The next easy option is to enlarge the SD image to match the HD height first. For that we compute the ratio of the vertical sizes: for 720 we have 720/480 = 1.5. We then enlarge the horizontal size by the same ratio to maintain the picture proportions: 640 x 1.5 = 960. That leaves 1280 - 960 = 320 pixels of width without picture, which we can split in half to create two vertical black bands, each 160 pixels wide.
For the 1080 case we have 1080/480 = 2.25 and 640 x 2.25 = 1440, leaving a 1920 - 1440 = 480 pixel wide area of the 16:9 frame that splits into two bands of 240 pixels, one on each side of the SD image.
In both cases (720 and 1080) the resulting video looks like figure 3.
There are algorithms that stretch the image horizontally to fill the entire HD frame at the expense of distorting the original image. Some stretch the image evenly, while others keep the center of the original image intact and stretch progressively more toward the sides (somewhat like a fish-eye effect).
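The pillar-box arithmetic above generalizes to a small helper; a sketch, assuming square pixels like the rest of this example:

```python
def pillarbox(sd_w, sd_h, hd_w, hd_h):
    """Scale SD to the HD height; return the scaled size and the side-bar width."""
    scale = hd_h / sd_h          # e.g. 720/480 = 1.5 or 1080/480 = 2.25
    new_w = round(sd_w * scale)  # enlarge the width by the same ratio
    bar = (hd_w - new_w) // 2    # black band on each side
    return new_w, hd_h, bar

print(pillarbox(640, 480, 1280, 720))   # -> (960, 720, 160)
print(pillarbox(640, 480, 1920, 1080))  # -> (1440, 1080, 240)
```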
Next post: Resizing affects color?
Monday, January 26, 2009
Under $1000 Encoding Software Reviews
http://digitalcontentproducer.com/videoencodvd/revfeat/encoder_shootout_1208/index.html
http://digitalcontentproducer.com/videoencodvd/revfeat/expertise_encoder_shootout_0109/index3.html
Thursday, January 22, 2009
Resizing HD to SD
HD comes in two flavors, 720 and 1080, numbers that refer to the vertical dimension of the video frame. Assuming the image has square pixels (as in the previous post), the dimensions are:
Horizontal = 1280 and Vertical = 720, where the aspect ratio is 1280/720 = 1.78, more commonly known as 16:9 (16/9 = 1.78).
The other flavor has Horizontal = 1920 and Vertical = 1080, which also gives an aspect ratio of 16:9. The full NTSC frame dimensions are Horizontal = 640 and Vertical = 480, with an aspect ratio of 4:3. (Figure 3)
The most common method to fill an entire 4:3 target image with a 16:9 source is pan and scan. That means we use a 4:3 area of interest inside the 16:9 picture and move it around to wherever the action is in the frame. The drawback, of course, is that we miss a good portion of the original image. It is a costly and time-consuming process that is largely performed manually (figure 5).
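The size of the 4:3 area of interest is easy to compute; where it sits horizontally is the manual, creative part. A quick sketch:

```python
def pan_scan_window(hd_w, hd_h, target_aspect=4 / 3):
    """Width and height of the 4:3 area of interest inside a 16:9 frame."""
    crop_w = round(hd_h * target_aspect)  # the operator slides this window around
    return crop_w, hd_h

print(pan_scan_window(1920, 1080))  # -> (1440, 1080): 480 pixels of width lost
print(pan_scan_window(1280, 720))   # -> (960, 720)
```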
Next post I will cover the "Up Rezing" algorithms and tips to up-convert SD to HD.
Thursday, January 15, 2009
Resizing Algorithms
Resizing bitmap (pixel based) images to a smaller dimension consists of mapping a larger group of pixels in the original picture to a smaller group of pixels, or even a single pixel. This seems pretty obvious; you start with a larger image (more pixels) and end up with a smaller one (fewer pixels). The important question is how you go about it: what algorithm will produce the best quality picture?
When you are shrinking an image to half its size you could simply decimate or discard every other pixel. But you end up with a very bad looking picture.
Let’s say that you divide the target image in a grid. Now, each cell in this grid corresponds to one pixel. We’ll assume that the dimensions of the source image are multiples of the target image dimensions to make things easier. You then make the same grid division on the source image so that you end up with the same number of cells in the grid. The cells in the source image grid will now contain more pixels than the target image. (Figure 1)
In nearest neighbor interpolation (Figure 1) you choose the pixel that is closest to the center of the grid without any regard to the other pixels in the grid. The images produced by this algorithm are of very poor quality as much of the detail is lost because none of the surrounding pixels are taken into account when computing pixels for the new image.
Bi-linear interpolation is another common, fast algorithm. It enters the four neighboring pixel values into a bilinear equation (a weighted average of the four pixels), and the result gives the value of the pixel in the target image.
Bi-cubic interpolation produces much better quality pictures than bi-linear, but it takes much more processing time as the equations for calculating the value of pixels in the target picture are more complex. It takes into account the values of neighboring pixels and how they change.
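Here is a toy NumPy version of bilinear resizing that makes the weighted average explicit. Single channel only, with simple clamping at the edges; real resizers are considerably more careful:

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Weighted average of the four pixels around the real-valued point (x, y)."""
    h, w = img.shape
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0  # fractional distances act as the weights
    top = img[y0, x0] * (1 - fx) + img[y0, x1] * fx
    bottom = img[y1, x0] * (1 - fx) + img[y1, x1] * fx
    return top * (1 - fy) + bottom * fy

def resize_bilinear(img, new_w, new_h):
    h, w = img.shape
    out = np.empty((new_h, new_w), dtype=np.float32)
    for j in range(new_h):
        for i in range(new_w):
            # Map each target pixel back into source coordinates and sample.
            out[j, i] = bilinear_sample(img, i * (w - 1) / (new_w - 1),
                                        j * (h - 1) / (new_h - 1))
    return out.astype(img.dtype)
```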
There are many other methods, such as fractal and super-sampling approaches, that produce even better quality images. They usually take longer to compute as well, but if quality is what you are after, you should consider using them. The link below shows some very good samples of each method. The samples on the page show enlarging, but I hope you get the idea.
http://www.dpreview.com/learn/?/key=interpolation
Next post will cover the specifics of resizing High Definition content to Standard Definition.
Monday, January 12, 2009
CES 2009
Now, I will deviate a bit from what I usually write to report what I thought was notable at the CES show in Las Vegas last week.
In general, there were many products for consumers to view video on the internet. Many devices connected directly to the internet and offered direct access to Yahoo! or YouTube videos. The internet is really making its way into the living room, with networkable monitors as well as very inexpensive network appliances (under $200) that let you view internet video with a simple remote control.
Monitor resolutions, contrast and sizes have greatly improved. I saw some 4K monitors that looked fantastic, but for which there is no content yet. High definition, solid-state-drive video cameras are now the norm from every manufacturer, and many of them use the AVCHD codec.
The definite trend is that video on-demand and on the internet will be the norm for years to come.
Next post I will try to cover some key concepts of video resizing and how it affects quality.
Wednesday, January 7, 2009
Generation Loss. Avoid it!
Let's look at how generation loss happens. Say you shoot a video with your FireWire camcorder, producing a camera original stored on tape. You then transfer that video to your Windows computer using Movie Maker. The result is a Windows Media file that is the second generation from the camera original; depending on the settings you choose, there will be varying degrees of quality degradation. Now suppose you would like to send the video to YouTube, but the file is too big for YouTube to accept. You are forced to convert the video into a smaller file, so you open the second generation Windows Media file with Movie Maker one more time, play with the parameters to compress the video further, and manage to produce a file small enough for YouTube. This file is now the third generation from the camera original, with more quality degradation. You upload it, and a few days later it shows up on the website, converted to Flash format, and it doesn't look very good. This is the fourth generation from the camera original.
The obvious first tip for avoiding generation loss is not to cause it in the first place. In the example above, when you get ready to transfer the video from the camera to the computer, you could choose encoding parameters that produce a file smaller than YouTube's maximum, going straight to the second generation before uploading and saving one generation of loss. If that sounds too complicated, you can go to a free site, www.compressmyvideos.com, where you can transfer videos from your DV camcorder, phone or webcam (or use a file) and make the output file less than 100MB.
But you could also transfer the video from the camcorder to a lossless format, or in the case of digital cameras, to a format that is an exact copy of what is on the tape without any loss. Some software packages, including Movie Maker, allow you to do that. The resulting file will be pretty big: for DV AVI, roughly 215MB per minute; for uncompressed YUV video (4:2:2), roughly 1.25GB per minute; and for full RGB uncompressed video, about 1.87GB per minute in NTSC (29.97 frames per second). Always try to make a lossless copy of the original video. If it came from a DV camcorder, use DV AVI or "bump it up" to YUV or RGB.
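Those per-minute figures fall out of frame size x bytes per pixel x frame rate on a 720 x 480 NTSC frame. A quick sketch to reproduce them (DV is different: it is compressed at a fixed rate, roughly 3.6MB/sec including audio, which is where the ~215MB per minute comes from):

```python
def gb_per_minute(width, height, bytes_per_pixel, fps=29.97):
    """Storage per minute of uncompressed NTSC video, in gigabytes."""
    return width * height * bytes_per_pixel * fps * 60 / 1e9

print(gb_per_minute(720, 480, 2))  # 4:2:2 YUV, 2 bytes/pixel -> ~1.24 GB/min
print(gb_per_minute(720, 480, 3))  # RGB, 3 bytes/pixel       -> ~1.86 GB/min
```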
With tapeless cameras it is easier to obtain a lossless copy of the camera original, since you can just copy the file to your computer via USB. Just make sure you have the appropriate codec installed to decode the file; otherwise you might be forced to use a software utility to convert it to another format your computer can decode, which can itself introduce a generation loss.
The next tip is not so obvious, but it has a great impact on image quality when making several copies or transcodes: try not to resize your video. Keep the picture size the same throughout the conversion process. If you really have to change the size, it is always better to go from big to small.
My last tip: if you need to make the video file small, convert your video using a high quality codec. Some codecs are much better than others. MPEG-2 is better than MPEG-1, and MPEG-4 is better than MPEG-2, but VC-1 (Windows Media), H.264 and VP6 are better than all the MPEGs. Keep looking for new and improved video codecs, as technology in this field changes rapidly.