Date Posted: February 28, 2005
Update: May 21, 2009
The new version fixes some problems with processing large collections and with appending or reprocessing collections, and includes an updated set of classifiers.
What is IBM Multimedia Analysis and Retrieval System?
IBM Multimedia Analysis and Retrieval System (IMARS) is a powerful system that can be used to automatically index, classify, and search large collections of digital images and videos. IMARS works by applying computer-based algorithms that analyze visual features of the images and videos, and subsequently allows them to be automatically organized and searched based on their visual content. In addition to search and browse features, IMARS also:
- Automatically identifies, and optionally removes, exact duplicates from large collections of images and videos
- Automatically identifies near-duplicates
- Automatically clusters images into groups of similar images based on visual content
- Automatically classifies images and videos according to whether they belong to each category in a pre-defined set (hereafter called the taxonomy) of semantic categories (such as ‘Landmark’ and ‘Infant’)
- Performs content-based retrieval to search for similar images based on one or more query images
- Tags images to create user defined categories within the collection
- Performs text-based and metadata-based searches
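As an illustration of the exact-duplicate feature listed above, the following is a minimal sketch that groups items whose raw bytes are identical by comparing SHA-256 digests. It is not the IMARS implementation, which is not described here; the function name `find_exact_duplicates`, the sample file names, and the choice of hashing whole-file bytes are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(items):
    """Group item names whose raw bytes are identical.

    `items` maps a name (hypothetical, e.g. a file path) to that item's bytes.
    Returns a sorted list of name groups that contain more than one member.
    """
    groups = defaultdict(list)
    for name, data in items.items():
        # Identical byte content always yields the same digest.
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


# Hypothetical in-memory collection; real use would read bytes from files.
collection = {
    "a.jpg": b"\x00\x01\x02",
    "b.jpg": b"\x00\x01\x02",
    "c.jpg": b"\x09\x09\x09",
}
duplicates = find_exact_duplicates(collection)
print(duplicates)
```

A byte-level digest only catches exact copies. Near-duplicates (for example, a resized or re-encoded image) differ in their bytes and would need a comparison based on visual features instead.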
How does it work?
IMARS consists of the IMARS extraction tool and the IMARS search tool. The IMARS extraction tool takes a collection of images and videos from the user, and produces indexes based on mathematical analyses of each piece of content. These indexes organize the results of the analyses for the IMARS search tool.
IMARS extraction tool
The IMARS extraction functionality is enabled by two main categories of computer algorithms that work together to bridge the “semantic gap” for images and videos:
- The first category is visual feature extraction, which works by using the computer to analyze the pixel-level contents of each image and video, and create a multi-dimensional vector description of its visual features. Since there are many important dimensions of visual contents, such as color, texture, shape and spatial layout, IMARS utilizes a large set of visual feature extraction algorithms that extract descriptors across a wide array of visual dimensions.
- The second category is visual semantic extraction, which works by applying machine learning techniques to the extracted visual descriptors. IMARS is supported by a broad array of pre-trained semantic classifiers that automatically identify whether each new image and video belongs to one or more of the pre-defined semantic categories in the taxonomy based on its extracted visual descriptors. IMARS provides additional capabilities based on unsupervised classification that cluster the images and videos purely based on their extracted visual descriptors, without assigning them any label, and allow searching based on visual similarity.
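The two categories above can be sketched in a few lines. The example below is a simplified stand-in, not the IMARS algorithms: it uses a coarse RGB color histogram as the only visual descriptor, Euclidean distance for similarity search, and a nearest-centroid rule in place of the pre-trained semantic classifiers. All function names, labels, and pixel values are hypothetical.

```python
import math


def color_histogram(pixels, bins=2):
    """Return a normalized RGB histogram with bins**3 entries.

    `pixels` is a list of (r, g, b) tuples with channel values in 0..255.
    """
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        # Clamp so the top values stay in the last bin when 256 % bins != 0.
        ri, gi, bi = (min(v // step, bins - 1) for v in (r, g, b))
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = len(pixels)
    return [count / total for count in hist] if total else hist


def distance(u, v):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_by_similarity(query, index):
    """Return names in `index` ordered from most to least similar to `query`."""
    return sorted(index, key=lambda name: distance(query, index[name]))


def classify(vector, centroids):
    """Return the label whose centroid descriptor is nearest to `vector`."""
    return min(centroids, key=lambda label: distance(vector, centroids[label]))


# Tiny synthetic "images": four pixels each (hypothetical data).
red_image = [(250, 10, 10)] * 4
dark_red_image = [(200, 30, 20)] * 4
blue_image = [(10, 10, 250)] * 4

# Step 1: visual feature extraction builds an index of descriptors.
index = {
    "dark_red": color_histogram(dark_red_image),
    "blue": color_histogram(blue_image),
}

# Search by visual similarity using a query image.
ranking = rank_by_similarity(color_histogram(red_image), index)

# Step 2: semantic labeling, here with a nearest-centroid stand-in.
centroids = {
    "reddish": color_histogram(red_image),
    "bluish": color_histogram(blue_image),
}
label = classify(color_histogram(dark_red_image), centroids)
print(ranking, label)
```

A single color histogram ignores texture, shape, and spatial layout, which is why a system of this kind would combine many descriptors rather than rely on one.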
About the technology author(s)
This tool was developed by the IBM T. J. Watson Research Center Multimedia Research team: John R. Smith, Apostol (Paul) Natsev, Jelena Tešić, Lexing Xie, and Rong Yan.
