Audio-video processing

Demo Segmentation ACMMM Grand Challenge

The Temporal Segmentation and Annotation Grand Challenge at ACM Multimedia 2013 required, among other things, an accurate segmentation of the input video. The algorithm classifies each frame as talk (the lecturer is shown), presentation (the camera shows the screen), mix (both the lecturer and the screen are shown), or blackboard (the lecturer is writing on the blackboard).
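Per-frame labels of this kind are usually turned into time segments by merging consecutive frames that share a label. The following is a minimal sketch of that grouping step only, not the challenge system itself; the function name `frames_to_segments` and the example label sequence are assumptions made for illustration.

```python
from itertools import groupby


def frames_to_segments(labels):
    """Merge runs of identical per-frame labels.

    Returns a list of (label, first_frame, last_frame) tuples,
    with frame indices starting at 0 and both ends inclusive.
    """
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length - 1))
        start += length
    return segments


# Hypothetical classifier output, one label per frame
frame_labels = ["talk", "talk", "mix", "mix", "mix",
                "presentation", "blackboard", "blackboard"]
segments = frames_to_segments(frame_labels)
```

In practice a smoothing step (for example a majority filter over a short window) would probably be applied before grouping, so that a single misclassified frame does not split a segment.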

Demo Segmentation TED videos

Another segmentation task concerns videos from the TED dataset. Here, a segmentation algorithm is needed to separate the content into talk and presentation segments, and to remove the introduction and the commercial that appear at the start and end of the videos. After segmentation, different algorithms can be applied to the different segment types.
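Once a video has been split into labelled segments, removing the material at the start and end amounts to trimming everything before the first and after the last content segment. The sketch below assumes segments are given as `(label, start_seconds, end_seconds)` tuples; the label names `"intro"` and `"commercial"`, the set `CONTENT_LABELS`, and the function `trim_non_content` are hypothetical and only serve the illustration.

```python
# Labels treated as actual content (assumed names)
CONTENT_LABELS = {"talk", "presentation"}


def trim_non_content(segments):
    """Drop leading and trailing segments that are not content.

    Segments between the first and last content segment are kept
    unchanged, whatever their label.
    """
    first = 0
    while first < len(segments) and segments[first][0] not in CONTENT_LABELS:
        first += 1
    last = len(segments)
    while last > first and segments[last - 1][0] not in CONTENT_LABELS:
        last -= 1
    return segments[first:last]


# Hypothetical segmentation of one video, times in seconds
video = [
    ("intro", 0, 15),
    ("talk", 15, 300),
    ("presentation", 300, 420),
    ("talk", 420, 900),
    ("commercial", 900, 930),
]
content = trim_non_content(video)
```

The remaining talk and presentation segments can then be routed to separate processing steps.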

Demo Presentation Analysis

One example of an algorithm applied to presentation data: the presentation content is analysed to detect changes from one slide to the next, as well as animations and other dynamic content (such as videos embedded in the presentation).
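A common way to approach this kind of analysis is frame differencing: a slide change shows up as a brief jump between consecutive frames, while a video or animation produces change that lasts over many frames. The sketch below illustrates that idea under stated assumptions and is not necessarily the method used in the demo. Frames are assumed to be grayscale images stored as lists of rows, and the names `mean_abs_diff`, `classify_changes`, `threshold`, and `min_run` are hypothetical.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def classify_changes(frames, threshold=10.0, min_run=3):
    """Split frame-to-frame changes into slide changes and dynamic content.

    Returns (slide_changes, dynamic_runs):
      slide_changes: frame indices where a short burst of change starts
      dynamic_runs:  (first_frame, last_frame) ranges of sustained change
    """
    # changed[k] is True when frame k + 1 differs noticeably from frame k
    changed = [mean_abs_diff(frames[k], frames[k + 1]) > threshold
               for k in range(len(frames) - 1)]
    slide_changes, dynamic_runs = [], []
    i = 0
    while i < len(changed):
        if not changed[i]:
            i += 1
            continue
        j = i
        while j < len(changed) and changed[j]:
            j += 1
        if j - i >= min_run:
            # Sustained change: treat as animation or embedded video
            dynamic_runs.append((i + 1, j))
        else:
            # Short burst: treat as a single slide change
            slide_changes.append(i + 1)
        i = j
    return slide_changes, dynamic_runs
```

Both `threshold` and `min_run` would need tuning on real recordings, since camera noise and compression artefacts also produce small frame differences.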

Demo Face Detection and Tracking

Faces are detected and tracked in every frame. For each person, this makes it possible to track the position of the eyes and mouth, the relative distance from the camera, and whether the face is frontal or in profile. It also enables more complex tasks such as person recognition or rating a person's involvement in a meeting. In this demo, red rectangles mark profile faces and blue rectangles mark frontal faces. The position of the eyes is marked with white circles.
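The text does not say how the relative distance is computed. One simple possibility, shown here purely as an assumption, is to use the pixel distance between the two detected eyes: under a pinhole camera model, apparent size is inversely proportional to distance, so halving the eye separation in the image roughly doubles the distance. The function `relative_distance` and the reference value `reference_eye_px` are hypothetical.

```python
import math


def relative_distance(left_eye, right_eye, reference_eye_px=60.0):
    """Estimate distance relative to a reference pose from eye positions.

    left_eye, right_eye: (x, y) pixel coordinates of the detected eyes.
    reference_eye_px: eye separation in pixels at relative distance 1.0
                      (an assumed calibration value).
    """
    eye_px = math.hypot(right_eye[0] - left_eye[0],
                        right_eye[1] - left_eye[1])
    if eye_px == 0:
        raise ValueError("eye positions coincide; cannot estimate distance")
    return reference_eye_px / eye_px


near = relative_distance((100, 100), (160, 100))  # eyes 60 px apart
far = relative_distance((100, 100), (130, 100))   # eyes 30 px apart
```

Such an estimate is only meaningful for frontal faces: when the head turns towards profile, the eyes appear closer together in the image even though the distance has not changed.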

