It really depends on the task we perform. Let's consider three tasks:
1. Find a movie starring Sean Connery (either seen or not seen before by the user)
This is basically searching by attributes. An example is IMDb, but more interesting visualizations are used in e.g. FilmFinder, which uses dynamic queries and a starfield display.
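To make the dynamic-query idea concrete, here is a minimal sketch of how FilmFinder-style filtering could work: every change to a slider or selector re-runs the filter, and each hit becomes a point in a starfield scatter plot. The movie records and field names are hypothetical illustrations, not FilmFinder's actual data model.

```python
# Sketch of dynamic-query filtering in the spirit of FilmFinder.
# All records and field names below are hypothetical examples.
from dataclasses import dataclass

@dataclass
class Movie:
    title: str
    actors: list
    year: int
    rating: float  # one axis of the starfield display

MOVIES = [
    Movie("Goldfinger", ["Sean Connery"], 1964, 7.7),
    Movie("The Untouchables", ["Sean Connery", "Kevin Costner"], 1987, 7.8),
    Movie("Casablanca", ["Humphrey Bogart"], 1942, 8.5),
]

def dynamic_query(movies, actor=None, year_range=None):
    """Re-filter on every slider/selector change; results update instantly."""
    hits = movies
    if actor is not None:
        hits = [m for m in hits if actor in m.actors]
    if year_range is not None:
        lo, hi = year_range
        hits = [m for m in hits if lo <= m.year <= hi]
    return hits

# Each hit becomes a point in the starfield: x = year, y = rating.
for m in dynamic_query(MOVIES, actor="Sean Connery", year_range=(1960, 1990)):
    print((m.year, m.rating, m.title))
```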
2. Find a video excerpt with a lion and a blond girl in it (either seen or not seen before by the user)
This task is a bit more complicated.
Christel and Martin (Information Visualization within a Digital Video Library) wrote:
Through the use of speech recognition (transcripts of speech), image processing (key frames of each camera shot), and natural language processing (headline generation), the digital video can be:
- segmented into smaller pieces; each segment consists of a contiguous range of video and/or audio (or an extent of text such as a paragraph or chapter) that is deemed conceptually similar
- analyzed to derive additional data (metadata) for creating alternate representations of the video
- augmented with indices for fast searching and retrieval of segments
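The sketch below shows one way these three steps could fit together: segments carrying derived metadata (transcript, key frame, headline) feed an inverted index for fast retrieval. The class and field names are hypothetical, not taken from Christel and Martin's system.

```python
# Sketch of the segment + metadata + index pipeline described above;
# the schema is a hypothetical illustration.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Segment:
    video_id: str
    start: float          # seconds into the video
    end: float
    transcript: str       # from speech recognition
    key_frame: str        # representative frame from image processing
    headline: str = ""    # short summary from natural language processing

class SegmentIndex:
    """Inverted index from transcript words to segments, for fast retrieval."""
    def __init__(self):
        self._index = defaultdict(list)

    def add(self, segment):
        for word in set(segment.transcript.lower().split()):
            self._index[word].append(segment)

    def search(self, word):
        return self._index.get(word.lower(), [])

index = SegmentIndex()
index.add(Segment("wizard_of_oz", 3600.0, 3642.5,
                  "the lion roars at the girl", "frames/oz_1.jpg",
                  "Lion meets Dorothy"))
for seg in index.search("lion"):
    print(seg.video_id, seg.start, seg.headline)
```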
Several of these ideas are used in today's products. Key frames are used in e.g. iMovie, and YouTube lets users add tags to videos, which enables social tagging.
Another interesting idea is to navigate videos by emotions: record the viewer's emotions via facial expressions while they watch, and then let them navigate the video simply by remembering a feeling. This opens the door to automatically gathered SOCIAL TAGS of FEELINGS. Imagine YouTube (or TV) recording the emotions of users (anonymously, of course) and storing them in a database. People wouldn't even have to tag anything: now I want to see the top 10 movies in which people cried a lot; I want to see only the funniest excerpts of another movie; I want to …
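A minimal sketch of what querying such an emotion database could look like, assuming a hypothetical stream of anonymized (movie, timestamp, emotion) events; nothing like this schema exists in YouTube today.

```python
# Sketch of queries over automatically gathered emotion tags;
# the event schema is a hypothetical assumption.
from collections import Counter

# (movie_id, timestamp_seconds, detected_emotion) — anonymized viewer events
emotion_events = [
    ("titanic", 5400, "crying"),
    ("titanic", 5410, "crying"),
    ("up", 240, "crying"),
    ("airplane", 1200, "laughing"),
]

def top_movies_by_emotion(events, emotion, n=10):
    """'The top 10 movies in which people cried a lot.'"""
    counts = Counter(movie for movie, _, e in events if e == emotion)
    return counts.most_common(n)

def funniest_excerpts(events, movie_id, n=3):
    """Timestamps with the most laughter in one movie: its funniest excerpts."""
    counts = Counter(t for m, t, e in events if m == movie_id and e == "laughing")
    return [t for t, _ in counts.most_common(n)]

print(top_movies_by_emotion(emotion_events, "crying"))
print(funniest_excerpts(emotion_events, "airplane"))
```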
For several other possibilities, read the Future applications section in the FAQ.