Datasets

MMTF-14K: A Multifaceted Movie Trailer Feature Dataset for Recommendation and Retrieval

The MMTF-14K dataset provides a stable and extensive source for devising and evaluating movie recommender systems. MMTF-14K contains audio and visual descriptors, in addition to ratings and metadata, for 13,623 Hollywood-type movie trailers. It facilitates research on content-based recommender systems, where content refers not only to metadata but specifically to the visual and auditory characteristics of movies. In addition, the rich data supports the exploration of other multimedia tasks such as popularity prediction, genre classification, and auto-tagging (also known as tag prediction).

Mise-en-Scène Dataset: MPEG-7 Visual Features of Movie Trailers

This dataset provides a set of 774 low-level visual features extracted from 3,964 movie trailers. The movie IDs match those of the MovieLens (ML) dataset (ML-20M or the full version, as of May 2017). All movie titles, ratings, and associated genres and tags can be collected from the MovieLens website. We extracted the MPEG-7 visual features from the movie trailers using the MPEG-7 low-level feature extraction software from Bilkent University, namely BilVideo-7.
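Because the movie IDs are shared with MovieLens, the visual features can be attached to ratings with a plain key join. The following is a minimal sketch under stated assumptions, not the dataset's actual layout: the toy CSV contents, the feature column names (`f1`, `f2`, `f3`), and the helper functions are hypothetical, and the real files contain 774 feature columns.

```python
import csv
import io

# Hypothetical toy data: the real file layout and column names may differ.
FEATURES_CSV = """movieId,f1,f2,f3
1,0.12,0.80,0.33
2,0.45,0.10,0.91
"""
RATINGS_CSV = """userId,movieId,rating
10,1,4.0
10,3,2.5
11,2,5.0
"""


def load_features(text):
    """Map each movie ID to its list of float feature values."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return {int(row[0]): [float(x) for x in row[1:]] for row in reader}


def join_ratings(ratings_text, features):
    """Yield (userId, movieId, rating, features) for rated movies that have features."""
    for row in csv.DictReader(io.StringIO(ratings_text)):
        movie_id = int(row["movieId"])
        if movie_id in features:  # movies without a trailer feature vector are skipped
            yield int(row["userId"]), movie_id, float(row["rating"]), features[movie_id]


features = load_features(FEATURES_CSV)
joined = list(join_ratings(RATINGS_CSV, features))
```

In this toy example the rating for movie 3 is dropped because no feature vector exists for it, which mirrors the fact that the trailer dataset covers only a subset of MovieLens movies.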

Million Playlist Dataset

As part of the ACM Recommender Systems Challenge 2018, Spotify has released the Million Playlist Dataset (MPD). The 2018 challenge focuses on a novel task in the field of recommender systems and information retrieval: automatic playlist continuation. The MPD comprises a set of 1,000,000 playlists that have been created by Spotify users, and includes playlist titles, track listings, and other metadata. In building the MPD, we've made every effort to produce a dataset that has real research value while still protecting the privacy of our users. The MPD offers some unique research value.
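To make the automatic playlist continuation task concrete, here is a minimal sketch of a track co-occurrence baseline: tracks that often appear in the same playlists as the seed tracks are suggested as continuations. This is an illustrative assumption, not a method prescribed by the challenge. The toy playlists, the track IDs, and the function names are all hypothetical, and the real MPD track listings would need to be loaded separately.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy playlists; the track IDs are made up for illustration.
PLAYLISTS = [
    ["t1", "t2", "t3"],
    ["t1", "t2", "t4"],
    ["t2", "t3", "t5"],
]


def build_cooccurrence(playlists):
    """Count, for each track, how often every other track shares a playlist with it."""
    cooc = {}
    for tracks in playlists:
        for a, b in permutations(set(tracks), 2):
            cooc.setdefault(a, Counter())[b] += 1
    return cooc


def continue_playlist(seed_tracks, cooc, k=2):
    """Return up to k tracks that co-occur most often with the seed tracks."""
    seeds = set(seed_tracks)
    scores = Counter()
    for track in seeds:
        for other, count in cooc.get(track, {}).items():
            if other not in seeds:  # never suggest a track already in the playlist
                scores[other] += count
    # Sort by descending score, then by track ID so that ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [track for track, _ in ranked[:k]]


cooc = build_cooccurrence(PLAYLISTS)
suggestions = continue_playlist(["t1"], cooc)
```

A baseline like this ignores playlist titles and track order, both of which the MPD provides and which a stronger approach could exploit.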