Topics
S3MR2011 will feature the topics mentioned below. For each topic, a lecture will be presented by an experienced and leading researcher in the field. Please note that the details given are subject to change.
Automatically converting visual content into textual description has long been a dream of many multimedia, computer vision and machine learning researchers. Image and video tagging, which has been studied heavily in recent years, can be regarded as a more realistic step toward that ambitious goal. In particular, the explosion of media data and media users on the Internet, as well as the connections among users, among data and between users and data, brings us both challenges and opportunities. In this lecture, we will first review the evolution of multimedia tagging over the past decade and then introduce state-of-the-art learning-based tagging approaches. We will then summarize manual tagging schemes on the Internet and present Internet-scale data-driven methods for scalable image and video tagging. Finally, we will discuss the roles of models, data and users in multimedia tagging systems and study a sustainable ecosystem for multimedia tagging on the Internet. We will also discuss promising research and development directions in this area.
Annotation of multimedia data is usually done either manually by human annotators or automatically, using algorithms that directly analyze the multimedia data. An alternative to these methods is implicit tagging, where the aim is to observe the user as he/she engages with the multimedia data and to infer knowledge about the data by analyzing the user's reactions. Implicit tagging can be performed through various modalities, such as facial expressions, body gestures, EEG analysis, gaze tracking, and verbal and non-verbal communication.
In this talk we focus on the continuum of human sensing in a sensor-equipped environment for 'off-line tagging' for future retrieval purposes and 'on-line tagging' for real-time support of a user. What can we learn from a user by observing his or her interaction patterns with an application or with another human? And how can we make off-line and on-line use of this knowledge? Sensor-equipped humans will become 'things' (or nodes) in the 'Internet of Things'. They will voluntarily and involuntarily feed the Internet of Things with their behavioral information, allowing the 'Internet of Things' to do behavioral retrieval, but also allowing the 'Internet of Things' to learn more about their lives than they may find acceptable.
In our talk we will also look at possibilities to deceive the environment, that is, feeding the environment with signals that hide our true feelings and intentions. This is not unusual in daily life, and being able to do so in 'everywhere and always' sensing environments will make it possible to maintain 'naturalness' in our behavior in such environments.
Finding similar and relevant media content given a user query or sample image has been at the core of the multimedia retrieval community for a long time. In this talk, I will identify and address multimedia challenges that play a role at Yahoo!, and which go beyond relevancy of images and video to a given multimedia retrieval task.
The true challenge for multimedia is to find a balance between relevancy, freshness, quality, interestingness and diversity in order to provide an engaging rich media experience to the user.
The list of social networking websites is diverse across the globe, but the popularity of social media is indisputable. The 640M+ Facebook users, the 480M+ QZone users and the 200M+ Twitter users share observations, opinions and media, acting as citizen sensors of society. This has given unprecedented access to a vast amount of data on which research communities can perform social media analytics to support socially intelligent applications such as targeted online content delivery, crisis management, organizing revolutions or promoting social development in underdeveloped countries.
This lecture will address challenges in building applications using social media. It will focus on the notion of the event as a way of structuring users' online activities. We will briefly show how semantic web technologies can be used to break the silos in which each social network platform tends to lock its users. We will then present methods combining semantic inferencing and visual analysis for detecting events from social media activities and for automatically finding media (photos and videos) illustrating events. We will review the numerous works that try to make sense of microposts (e.g. challenges in extracting named entities and the role of techniques enhanced with semantic/background knowledge) in order to detect events. We will conclude this talk with research questions spanning multidisciplinary fields such as information retrieval, data mining, semantic web and web science.
Video services, user-generated video, and P2P data exchange are major current trends on the Internet. Hence, it is essential to understand the basics of P2P video transport over the Internet that can ensure some level of quality of experience (QoE) for end-users. To this end, this talk will cover fundamentals of streaming models and protocols; P2P networking concepts and protocols; very basic video coding; an introduction to P2P video, including chunk formation, peer selection and chunk selection; fairness in P2P networking; rate adaptation strategies; and the definition of quality of experience, including rate, distortion, freeze/skip duration, and pre-roll delay. Examples of P2P video streaming applications and solutions, including social media applications, from recent European projects such as P2PNext, SARACEN and DIOMEDES will be provided.
Lecturer: Prof. Murat Tekalp