Interaction Metadata from Multimodal Interactions


With the advent of the digital camera, people have been taking more photos than ever [1], and as a result our photo collections have exploded in size. Retrieving the right photograph from these enormous collections is an obvious problem. However, it turns out that people lack the motivation to carry out the daunting task of tagging and indexing these huge photo collections [2, 3, 4]. As Fleck [5] points out, people do not see the usefulness of annotating and indexing photographs at the moment they add new images to the collection. The real need only surfaces once they already have a massive, unannotated photo collection, by which point annotating the whole collection is no longer an appealing task. There has been a lot of work on making it easier for the user to tag photographs: some systems let the user drag and drop names from a list, and others accept speech input rather than typed content information.
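Both of these aids amount to attaching names from a known vocabulary to a photo's metadata, whether the name arrives by drag-and-drop or by speech. The following is a minimal sketch of that idea in Python; the Photo class, the KNOWN_NAMES vocabulary and the naive transcript matching are illustrative assumptions, not taken from any of the cited systems.

    # Sketch (not from any cited system) of the two tagging aids above:
    # picking names from a predefined list (the drag-and-drop case) and
    # turning a free-form speech transcript into the same kind of tags.
    from dataclasses import dataclass, field

    @dataclass
    class Photo:
        path: str
        tags: set[str] = field(default_factory=set)

    # Hypothetical controlled vocabulary the user would drag names from.
    KNOWN_NAMES = {"Alice", "Bob", "Grandma"}

    def tag_from_list(photo: Photo, name: str) -> None:
        """Attach a tag chosen from the known-name list (drag-and-drop path)."""
        if name in KNOWN_NAMES:
            photo.tags.add(name)

    def tag_from_speech(photo: Photo, transcript: str) -> None:
        """Scan a speech transcript for known names instead of requiring typing."""
        for word in transcript.replace(",", " ").split():
            if word.capitalize() in KNOWN_NAMES:
                photo.tags.add(word.capitalize())

    if __name__ == "__main__":
        p = Photo("2007/wedding/img_0042.jpg")
        tag_from_list(p, "Alice")
        tag_from_speech(p, "that's Bob and grandma at the reception")
        print(p.tags)  # {'Alice', 'Bob', 'Grandma'}

Real systems would of course use a speech recognizer rather than a transcript string, but the point of both paths is the same: lowering the per-photo cost of annotation.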

Apart from this, there is a major body of work aimed at making content extraction automatic. Some computer vision algorithms attempt to identify what the content of a picture is and to infer the occasion [6, TBD]. For example, the presence of a bride in a photo suggests that it was taken at a wedding. These methods are still unreliable, and the underlying problem is very expensive to solve. GPS data has also been used to infer context [7, 8]; for example, knowing that a photograph was taken next to a tourist landmark can help label it. If the four main elements of a photo's structure are taken to be what, who, location and emotion [Unpublished thesis!], it can be argued that these methods can at best give us what, who and location; they are incapa...
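As a rough illustration of both automatic approaches, the sketch below maps detected objects to a likely occasion and labels a photo by its nearest known landmark given GPS coordinates. The object detector is assumed to exist elsewhere; the rule table, landmark coordinates and distance threshold are hypothetical examples, not values from the cited work.

    # Illustrative sketch of the two automatic-context ideas above:
    # (1) mapping objects found by some vision model to a likely occasion,
    # (2) labelling a photo by the nearest known landmark to its GPS fix.
    from math import radians, sin, cos, asin, sqrt

    # Hypothetical mapping from detected objects to an inferred occasion.
    OCCASION_RULES = {"bride": "wedding", "birthday cake": "birthday party"}

    # Hypothetical landmark database: name -> (latitude, longitude).
    LANDMARKS = {"Eiffel Tower": (48.8584, 2.2945), "Tower Bridge": (51.5055, -0.0754)}

    def infer_occasion(detected_objects):
        """Return the first occasion implied by any detected object, if any."""
        for obj in detected_objects:
            if obj in OCCASION_RULES:
                return OCCASION_RULES[obj]
        return None

    def haversine_km(a, b):
        """Great-circle distance in kilometres between two (lat, lon) points."""
        lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
        h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371 * asin(sqrt(h))

    def label_by_location(gps, max_km=1.0):
        """Label a photo with the nearest landmark within max_km, else None."""
        name, dist = min(((n, haversine_km(gps, pos)) for n, pos in LANDMARKS.items()),
                         key=lambda x: x[1])
        return name if dist <= max_km else None

    if __name__ == "__main__":
        print(infer_occasion(["bride", "cake"]))     # wedding
        print(label_by_location((48.8590, 2.2950)))  # Eiffel Tower

Such rule-based inference is deliberately simplistic here; it shows why these pipelines can recover the what, who and location of a photo but have no signal from which to infer emotion.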

... middle of paper ...


[14] P. Barthelmess, E. Kaiser, and D. R. McGee. "Toward content-aware multimodal tagging of personal photo collections." Proceedings of the 9th International Conference on Multimodal Interfaces (ICMI 2007), Nagoya, Japan, November 2007.

[15] K. Rodden and K. Wood. "How do people manage their digital photographs?" Proceedings of CHI 2003, pp. 409–416, 2003.

[16] D. J. Schiano, C. P. Chen, and E. Isaacs. "How teens take, view, share, and store photos." CSCW 2002, 2002.

[17] P. Worthington. "Kiosks and print services for consumer digital photography." Future Image Market Analysis, 2004.

[18] Y. Qian and L. M. G. Feijs. "Exploring the potentials of combining photo annotating tasks with instant messaging fun." Proceedings of MUM '04: 3rd International Conference on Mobile and Ubiquitous Multimedia, pp. 11–17, ACM Press, 2004.


