Maheswaranathan et al. create a three-layer network model that captures retinal encoding of natural scenes and many ethological phenomena. The model’s structure is interpretable in terms of the actions of real interneurons. A new computational approach automatically generates hypotheses for how interneurons generate ethological computations under natural visual scenes.