The biggest issue in building a system that retrieves images from image queries is the semantic gap. The semantic gap is defined as “the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation” (Smeulders). What a system can discern from an image can be completely different from what a human user can discern. Two human users may also assign different descriptions to the same image, which adds to the complexity of the problem.
I became interested in this topic when I first heard of content-based image retrieval (CBIR) about a year ago. At that time, CBIR systems were still developing and maturing. Most of the results they returned for an image query were images almost identical to the query. Today, systems are able to identify the main object of focus in a picture and return a more varied assortment of results. This is what interests me the most: how such systems became more accurate at giving appropriate results to the end user.
The main component of CBIR systems is vision science. Vision science is the “study of how humans see and interpret the light that lands on the sensor known as the retina” (Palmer). There are five key research points within vision science that relate to CBIR (Marques). First is attention, which concerns how the human visual system (HVS) prioritizes and selects the region of a scene it attends to. Second is perception, the interpretation of sampled visual information. Third is memory: access to past knowledge, rules, and intuition, as well as the recording of the current imagery. Fourth is contextual effects, the environ...
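The semantic gap can be illustrated with a small sketch. It assumes a retrieval system that compares images only by their color histograms, using histogram intersection; this feature choice, the function names, and the toy images are illustrative assumptions, not a method described in the sources cited here. The two images contain the same pixels in different arrangements, so the system scores them as identical even though a person would describe one as a red square and the other as scattered red dots.

```python
from collections import Counter


def color_histogram(image):
    """Count how often each pixel value occurs, ignoring pixel position."""
    return Counter(pixel for row in image for pixel in row)


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: shared pixel counts divided by the total in h1."""
    total = sum(h1.values())
    shared = sum(min(count, h2[color]) for color, count in h1.items())
    return shared / total


# Hypothetical 4x4 toy images; "R" and "B" stand for red and blue pixels.
square = [
    list("RRBB"),
    list("RRBB"),
    list("BBBB"),
    list("BBBB"),
]
scattered = [
    list("RBBB"),
    list("BBRB"),
    list("BRBB"),
    list("BBBR"),
]

similarity = histogram_intersection(
    color_histogram(square), color_histogram(scattered)
)
print(similarity)
```

Because the histogram discards where each pixel sits, the score reflects only what the system can extract, not how a viewer would interpret the picture.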
...ww.sdsc.edu/~gupta/publications/PAMI-01-review.pdf
http://neurobio.drexelmed.edu/Rybakweb/vr.pdf
S. Palmer. Vision Science: Photons to Phenomenology. MIT Press, Cambridge, MA, 1999.
O. Marques, L. M. Mayron, G. B. Borba, and H. R. Gamba. On the potential of incorporating knowledge of human visual attention into CBIR systems. In Special Session on Perceptual Visual Processing, IEEE International Conference on Multimedia & Expo (ICME 2006), Toronto, Canada, July 2006.
D. Noton and L. Stark. Scanpaths in eye movements during pattern perception. Science, 171:308-311, Jan. 1971.
L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. on PAMI, 20(11):1254-1259, Nov. 1998.
D. Jobson, Z. Rahman, and G. Woodell. Properties and performance of a center/surround retinex. IEEE Trans. on Image Processing, 6(3):451-462, March 1997.
A video is put on, and at the beginning you're told to count how many times the people in the white shirts pass the ball. By the time the scene is over, most of the people watching have a number in their head. What these people missed was the gorilla walking through, because they were so focused on counting the passes made by the white team. Would you have noticed the gorilla? Cathy Davidson calls this attention blindness. In her words, "Attention blindness is the key to everything we do as individuals, from how we work in groups to what we value in our classrooms, at work, and in ourselves" (Davidson, 2011, p. 4). Davidson served as the vice provost for interdisciplinary studies at Duke University, helping to create the Program in Science and Information Studies and the Center of Cognitive Neuroscience. She also holds highly distinguished chairs in English and Interdisciplinary Studies at Duke and has written a dozen books. By the end of the introduction Davidson poses five questions to the general population: "Where do our patterns of attention come from? How can what we know about attention help us change how we teach and learn? How can the science of attention alter our ideas about how we test and what we measure? How can we work better with others with different skills and expertise in order to see what we're missing in a complicated and interdependent world? How does attention change as we age, and how can understanding the science of attention actually help us along the way?" (Davidson, 2011, pp. 19-20). Although Davidson makes many good points in Now You See It, overall the book isn't valid. She doesn't exactly provide answers ...
Hubel and Wiesel defined the classic receptive field as a restricted region of the visual field. If a specific stimulus falls within this region, it may drive the cell to fire action potentials (Zipser, Lamme & Schiller, 1996). By shining oriented slits of light into the cat’s eye, they discovered that each cell had its own specific stimulus requirements (Barlow, 1982). Cells differed from each other in many ways; some preferred a spe...
According to Dr. Vilayanur Ramachandran, in his movie “Secrets of the Mind,” our vision system is divided into two parts, one involving our eyes and the other our brain. He also says that there are two different pathways that our brain uses to “see.” He calls one of these the evolutionarily new pathway (the more sophisticated pathway), in which our eyes see, the information is sent to the thalamus, and it eventually enters the visual cortex of the brain. This pathway is the conscious part of seeing. The other pathway, Dr. Ramachandran says, is more prominent as well as evolutionarily primitive. An iguana uses this system of seeing. In this second pathway, information enters through the eyes and is then sent to the brain stem, which in turn relays it to the higher centers of the brain. Dr. Ramachandran says that this second system is used to orient our eyes toward things, especially movement. He has studied patients with what is known as blindsight to form his hypothesis.
The 'where' visual pathway is concerned with constructing three-dimensional representations of the environment and helps the brain work out where things are in space relative to itself, independently of what they are (Mishkin, Ungerleider & Macko, 1983). ... The 'Standard' of the 'Standard'.
Recent theories of the vigilance decrement suggest that it results from a decrease in “processing resources”. The participants in this experiment were required to monitor the radar screen continuously for a long period. During this time the continuous groups had to make “targets or neutral stimuli discrimination” without any form of rest or other activity. With these things in mind, it is suggested that the resource section probably w...
...visual information is processed to extract identity, location, and ways that we might interact with objects. A prominent anatomical distinction is drawn between the "what" and "where" pathways in visual processing. However, the commonly labeled "where" pathway is also the "how" pathway, at least partially dedicated to action.
...ical pathway. The dorsal pathway provides visual information that detects the movement of objects, while the ventral pathway provides visual information for recognizing objects. The distinct properties of location (where) and shape (what) are estimated from very differently sized regions (Majaj, Palomares & Pelli, 2004).
Sajda P. & Finkle, L.H. (1995) Intermediate Visual Representations and the Construction of Surface Perception. Journal of Cognitive Neuroscience, 7, 267-291.
Ratey, John J., and Albert M. Galaburda. A User's Guide to the Brain: Perception, Attention, and
In 1987, Francine Shapiro was walking in a park while moving her eyes from side to side and noticed a reduction in the undesirable emotions she was experiencing from disturbing memories. She presumed the desensitizing effect was a result of her eye movement. She later learned that isolated eye movement did not produce the complete therapeutic effect and added treatment features that included cognitive components (EMDR Institute, 2017). To explain her intervention, Shapiro developed an information processing theory.
Though the experiment shows that attention is vital for change detection, we should consider the size and impact of the change in the environment. If the change to an environment is small, will it be detected? Does providing small clues effectively draw attention to where the change is being made? In support of this argument, Rensink (1997) showed that even small clues have no effect on change detection if they are not directed properly. Rensink proposes that the absence of attention causes visual contents to be missed. On the other hand, Simons and Levin (1998) suggest that a person could miss things happening in their environment if his or her attention is occupied by something
Many people do not consider where the images they see daily come from. A person can see thousands of different designs in their daily life, and these designs vary depending on where they are placed. A design on a shirt, an image on a billboard, and the cover of a magazine all share something in common. Each was once on a computer screen or a piece of paper, created by an artist known as a graphic designer. Graphic design is a steadily growing occupation today, as the media need original and creative designs for things like packaging and magazine covers. The occupation has grown over the years but still shares the basic components it started with. Despite these tremendous amounts of growth,
There are many different principles of visual perception. The main ones are the Gestalt principles. Gestalt is a German word meaning 'form' or 'shape'. Gestalt psychologists formulated a series of principles that describe how t...
Blakslee, S. (1993, August 31). The New York Times. Retrieved May 2, 2014, from www.nytimes.com: http://www.nytimes.com/1993/08/31/science/seeing-and-imagining-clues-to-the-workings-of-the-mind-s-eye.html
Sensory memory briefly provides an almost photographic record of an image, enabling one to focus on its details (Sperling, 1960). However, seconds later, short-term memory can recover only a few details from the image (Phillips, 1974). Days later, it may be difficult to recall the whole image, so the brain may recall only its gist (Brainerd & Reyna, 2005). According to research from the book “To see or not to see: The need for attention to perceive changes in