Cortical Encoding of Spatial Structure and Semantic Content in 3D Natural Scenes

Our visual system enables us to effortlessly navigate and recognize real-world visual environments. Functional magnetic resonance imaging (fMRI) studies suggest a network of scene-responsive cortical visual areas, but much less is known about the temporal order in which different scene properties are analyzed by the human visual system. In this study, we selected a set of 36 full-color natural scenes varying in spatial structure and semantic content, which our male and female human participants viewed both in 2D and 3D while we recorded magnetoencephalography (MEG) data. MEG enables tracking of cortical activity in humans at a millisecond timescale. We compared the representational geometry of the MEG responses with predictions based on the scene stimuli using the representational similarity analysis framework. The representational structure first reflected the spatial structure of the scenes in the 90–125 ms time window, followed by the semantic content in the 140–175 ms time window after stimulus onset. Stereoscopic 3D viewing of the scenes affected the responses relatively late, from ~140 ms after stimulus onset. Taken together, our results indicate that the human visual system rapidly encodes a scene's spatial structure and suggest that this information is based on monocular rather than binocular depth cues.
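The representational similarity analysis (RSA) mentioned above can be sketched in a few lines: a representational dissimilarity matrix (RDM) is computed over all pairs of conditions for both the neural data and a model prediction, and the two RDMs are compared with a rank correlation. The sketch below uses simulated sensor patterns and a hypothetical "spatial structure" model RDM; the scene labels, sensor count, and noise level are illustrative assumptions, not the study's actual data or pipeline.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(patterns):
    """Condensed RDM: pairwise correlation distance (1 - Pearson r)
    between condition response patterns (rows)."""
    return pdist(patterns, metric="correlation")

rng = np.random.default_rng(0)
n_scenes, n_sensors = 36, 60  # 36 scenes as in the study; sensor count is illustrative

# Hypothetical model RDM: scenes split into two spatial-structure categories,
# dissimilarity 1 between categories and 0 within.
labels = np.repeat([0, 1], n_scenes // 2)
model_vec = pdist(labels[:, None], metric="cityblock")

# Simulated MEG sensor patterns at one time point, weakly driven by the labels.
patterns = rng.normal(size=(n_scenes, n_sensors)) + 0.5 * labels[:, None]
data_vec = rdm(patterns)

# RSA statistic: Spearman rank correlation between model and data dissimilarities.
rho, p = spearmanr(model_vec, data_vec)
print(round(rho, 3))
```

In practice this comparison is repeated at every time point of the MEG response, yielding a time course of model–data correspondence from which onset latencies such as 90–125 ms can be read off.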