Automatic recommendation of lectures

We demonstrate a generic (user-independent) recommendation system over the inEvent repository (containing talks from Klewel plus imported metadata). In a generic recommendation task, the goal is to predict lectures that are related to the lecture currently being viewed, without any knowledge of the user's profile. For each talk in the repository, we recommend the five most similar talks based on text and audio content.

To view the demo, please use the following link: Automatic recommendation of lectures

Regarding the text-based recommendations, the proposed solution uses techniques from content-based recommender systems to compute similarity between lectures based on their content descriptors, through standard vector space models. We make use of all available hyper-event meta-data such as titles, speaker names, descriptions, subtitles, and slide titles with start time, in addition to words extracted through speech recognition (or manual transcripts when they are available).

Initially, a vector is built for each hyper-event (each position corresponding to a word of the vocabulary), with weights computed using TF-IDF coefficients. The feature vectors are then projected to a low-dimensional semantic space with Latent Semantic Indexing, and finally a talk similarity matrix is computed with a proximity measure (such as cosine similarity).
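As an illustration of the text pipeline, here is a minimal sketch in plain Python. It is not the code behind the demo: the function names (tfidf_vectors, cosine, top_k_similar), the whitespace tokenizer, and the particular TF-IDF weighting are assumptions made for the example. The Latent Semantic Indexing projection is deliberately left out, so similarity is computed directly on the TF-IDF vectors.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: word -> weight) per document."""
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_similar(docs, k=5):
    """For each document, return the indices of its k most similar documents."""
    vectors = tfidf_vectors(docs)
    recommendations = []
    for i, u in enumerate(vectors):
        scored = [(cosine(u, v), j) for j, v in enumerate(vectors) if j != i]
        # Highest similarity first; ties are broken by the lower index.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        recommendations.append([j for _, j in scored[:k]])
    return recommendations
```

In a fuller version, each document string would concatenate the available metadata fields and transcript words, and the TF-IDF vectors would be reduced with a truncated SVD before the similarity step.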

Regarding the audio-based recommendations, a speaker diarization and linking system is used to structure the dataset by finding the audio segments spoken by each speaker. Then, a segment similarity matrix is computed, which in turn is converted to a talk similarity matrix by averaging the pairwise similarities of the segments belonging to each pair of talks.
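The averaging step can be sketched as follows, assuming the segment similarity matrix is already available as a square list of lists and that each segment is labeled with the talk it comes from. The function name talk_similarity and both parameter names are hypothetical, and how the segment similarities themselves are obtained is outside the scope of this example.

```python
def talk_similarity(segment_sim, segment_talk):
    """Average pairwise segment similarities into a talk-level similarity.

    segment_sim  -- square matrix (list of lists) of segment similarities
    segment_talk -- segment_talk[i] is the talk that segment i belongs to
    Returns a dict mapping (talk_a, talk_b) to the mean similarity.
    """
    talks = sorted(set(segment_talk))
    # Group segment indices by the talk they belong to.
    members = {
        talk: [i for i, owner in enumerate(segment_talk) if owner == talk]
        for talk in talks
    }
    matrix = {}
    for a in talks:
        for b in talks:
            # All segment pairs (i, j) with i in talk a and j in talk b.
            # For a == b this includes each segment paired with itself.
            pairs = [segment_sim[i][j] for i in members[a] for j in members[b]]
            matrix[(a, b)] = sum(pairs) / len(pairs)
    return matrix
```

For example, with three segments where the first two belong to talk "A" and the third to talk "B", the entry for ("A", "B") is the mean of the two cross-talk segment similarities.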

Using the ground truth provided by TED human-made recommendations, we evaluated recommendation performance and found that using the entire feature set gave the best results. Other models, based on semantic vector spaces using LSI, LDA, Random Projections, and Explicit Semantic Analysis, have been evaluated as well (in the publication cited below).

Pappas N. and Popescu-Belis A., “Combining Content with User Preferences for TED Lecture Recommendation”, Proceedings of the 11th International Workshop on Content Based Multimedia Indexing (CBMI 2013), Veszprém, Hungary, 2013