Alexander Hauptmann
Alexander Hauptmann is a Senior Systems Scientist in Computer Science at Carnegie Mellon University and a faculty member in CMU's Language Technologies Institute. His main current interest is multimedia analysis and retrieval. Other research interests include speech recognition and interfaces, translation, and natural language in general. Most of his time is spent on the Informedia Digital Video project. This work has also spawned three spin-off companies related to digital video archiving and video question answering.
He is also pursuing projects on video observation for the care of elderly patients and on personal wearable memory devices. His current passion is the pursuit of a large-scale concept ontology for multimedia to help narrow the semantic gap. Alexander Hauptmann holds BA and MA degrees in Psychology from Johns Hopkins University and a 'Diplom' in Computer Science from the Technische Universität Berlin, and obtained a Ph.D. in Computer Science at Carnegie Mellon.
Looking Ahead: Media Understanding and Data Fusion
presentation slides
It is reasonable to predict that over the next few years the internet will see an accumulation of increasingly large collections of audio (e.g., iTunes), imagery (e.g., Flickr), video (e.g., YouTube), and sensor information (e.g., weather and traffic data), together with rapid and widespread growth and innovation in new information services in the form of mashups (combinations of multiple, separate data sources into one application or display) and social web activities (e.g., blogging, podcasting, media editing). All this is driving a need for improvements in semantic information extraction from structured and unstructured sources and across media; social network and contextual user modeling; multimedia retrieval; summarization of large, diverse data sets; and collaborative work environments and interfaces.
This talk will motivate some of the 'grand research challenges' for the next few years based on my research perspectives. These challenges include video, audio, image, and graphics understanding, as well as algorithms to support cross-source and cross-media mining, retrieval, and fusion. The challenges are coupled with increasingly personalized user modeling and adaptive, device-appropriate summarization and presentation design.
The result will be an increasing pervasiveness of internet services and noticeable performance enhancements that make everything faster, easier, and just better.



