Cookies Policy

This site uses cookies. By continuing to browse the site you are agreeing to our use of cookies.

I accept this policy

Find out more here

To what extent can matching algorithms based on direct outputs of spatial filters account for human object recognition?

No metrics data to plot.
The attempt to load metrics for this article has failed.
The attempt to plot a graph for these metrics has failed.
The full text of this article is not currently available.

Brill’s MyBook program is exclusively available on BrillOnline Books and Journals. Students and scholars affiliated with an institution that has purchased a Brill E-Book on the BrillOnline platform automatically have access to the MyBook option for the title(s) acquired by the Library. Brill MyBook is a print-on-demand paperback copy which is sold at a favorably uniform low price.

Access this article

+ Tax (if applicable)
Add to Favorites
You must be logged in to use this functionality

image of Spatial Vision
For more content, see Multisensory Research and Seeing and Perceiving.

A number of recent successful models of face recognition posit only two layers, an input layer consisting of a lattice of spatial filters and a single subsequent stage by which those descriptor values are mapped directly onto an object representation layer by standard matching methods such as stochastic optimization. Is this approach sufficient for modeling human object recognition? We tested whether a highly efficient version of such a two-layer model would manifest effects similar to those shown by humans when given the task of recognizing images of objects that had been employed in a series of psychophysical experiments. System accuracy was quite high overall, but was qualitatively different from that evidenced by humans in object recognition tasks. The discrepancy between the system's performance and human performance is likely to be revealed by all models that map filter values directly onto object units. These results suggest that human object recognition (as opposed to face recognition) may be difficult to approximate by models that do not posit hidden units for explicit representation of intermediate entities such as edges, viewpoint invariant classifiers, axes, shocks and object parts.

Affiliations: 1: Department of Psychology and Computer Science, University of Southern California, Hedco Neuroscience Bldg., Los Angeles, CA 90089-2520, USA; 2: Department of Psychology, Iowa State University, Ames, IA 50011-3180, USA


Full text loading...


Data & Media loading...

Article metrics loading...



Can't access your account?
  • Tools

  • Add to Favorites
  • Printable version
  • Email this page
  • Subscribe to ToC alert
  • Get permissions
  • Recommend to your library

    You must fill out fields marked with: *

    Librarian details
    Your details
    Why are you recommending this title?
    Select reason:
    Spatial Vision — Recommend this title to your library
  • Export citations
  • Key

  • Full access
  • Open Access
  • Partial/No accessInformation