The Third Type of Memory: The Role of Memory in Immediate Perception
Memory in humans and other creatures is popularly considered to have two types, or facets: short-term memory and long-term memory. The distinction is made because recall of recent events plays a different role within experience than recall of a more distant past, and because differences in physiological function between the two memory types can be noted.
The fact that a distinction exists allows the freedom to propose a third type of memory, which can be called real-time sensory memory, or, simply, sensory memory. Without this sensory memory, involving a time span far shorter than even short-term memory, it would not be possible to discern periodicity (pitch) in sound, nor could any kind of intelligence for identifying objects be applied to the sense of sight.
It is logical to start with the explanation of sensory memory as it relates to hearing, because hearing seems more obviously linear. The ear communicates with the brain via the cochlear (auditory) nerve, which, unlike the optic nerve, is not an outgrowth of the brain itself and is considered to be peripheral to it. The brain innervates (activates and keeps active) the cochlear nerve, and the cochlea sends an incoming signal representing the air pressure of ambient sound at all times, even, apparently, during sleep. The brain does send outgoing signals to the ear, but these are not involved in the process of perception itself; at most they optimize the mechanical environment of the ear, not the neural one.
The ear sends an electrical signal that is an analog of each oscillation that the ear detects via mechanical processes. This is true up to a frequency of about 4000 Hz, above which the analog becomes more general and the ability to define pitch decreases.
Hearing requires the employment of sensory memory. The act of hearing (in humans) involves the establishment of a dynamic memory buffer of approximately 50 milliseconds, during which the memories of all the oscillation signals that have been received in the 50 ms prior to now are “chunked together” or jammed together. It is this jamming together of oscillation memories that creates the sense of periodicity and accompanying complexities of pitch, and the greater the number of oscillations within the time span of the memory, the greater the stress upon it, and the higher the periodicity sounds.
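The relationship between frequency and the number of oscillations held in the buffer can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the 50 ms figure is the buffer length proposed in this essay, and the name cycles_in_buffer is hypothetical, chosen for this example.

```python
# Illustrative arithmetic only. BUFFER_MS is the 50 ms buffer length
# proposed in the essay, not a measured constant.
BUFFER_MS = 50.0


def cycles_in_buffer(frequency_hz: float, buffer_ms: float = BUFFER_MS) -> float:
    """Return how many oscillations of a tone fall within the buffer."""
    return frequency_hz * buffer_ms / 1000.0


if __name__ == "__main__":
    for frequency in (20, 110, 440, 4000):
        count = cycles_in_buffer(frequency)
        print(f"{frequency:>5} Hz: {count:6.1f} oscillations in {BUFFER_MS:.0f} ms")
```

On this reckoning a 440 Hz tone places 22 oscillations in the buffer, while a 4000 Hz tone places 200 in it, which is the sense in which a higher frequency puts more "stress" on the same span of memory.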
It is important to understand that there has to be a memory buffer in order for periodicity detection and all intelligibility and identification of sound to occur. Without employing memory, the sense of hearing would be nothing but a continuously varying sense of excitation, with no ability to exercise cognitive processes upon it. What is explicated here is just the basic, momentary memory without which perception could not occur at all, and that has to be distinguished from processes, also involving memory, that enable the recognition of sounds and the intelligibility of speech and melody. The latter processes are longer in duration, and would logically fall into the category of conventional short-term memory.
It seems likely that we are alert to periodicity in sound all the time, and the act of "listening" involves taking small intervals of memory (but still longer than 50 ms) and analyzing them for patterns. This could explain how we are able to follow a conversation amid noise or other conversations in the vicinity. These intervals of memory that are analyzed could be as brief as a phoneme or even briefer.
A period of 50 milliseconds corresponds to a frequency of 20 Hz, which is considered to be the lowest frequency that humans can hear. In point of fact, however, the ear does transmit frequencies lower than that, but such oscillations take longer than 50 milliseconds to complete. By the time the end of an oscillation is reached, the beginning has already fallen out of the memory buffer, and so it does not become an auditory experience. Instead, the oscillation (if perceivable by other means, say, visceral pressure) would be experienced as a discrete event rather than a tone.
An oscillation with the exact duration of the memory buffer would probably be perceived as a discrete event also, but any oscillation just a bit briefer, and thus putting a fraction of an extra oscillation into the buffer, would create the stress of a tone.
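The threshold described in the last two paragraphs can be written out as a small Python sketch. It assumes the 50 ms buffer proposed here and applies the rule as stated: an oscillation is heard as a tone only if its period is shorter than the buffer. The function names are hypothetical, and the hard cutoff is a simplification of what would presumably be a gradual transition.

```python
# A sketch of the rule described above, under the essay's 50 ms assumption.
BUFFER_MS = 50.0


def period_ms(frequency_hz: float) -> float:
    """Duration of one complete oscillation, in milliseconds."""
    return 1000.0 / frequency_hz


def threshold_hz(buffer_ms: float = BUFFER_MS) -> float:
    """Frequency whose single oscillation exactly fills the buffer."""
    return 1000.0 / buffer_ms


def perceived_as(frequency_hz: float, buffer_ms: float = BUFFER_MS) -> str:
    """Classify an oscillation as a 'tone' or a 'discrete event'."""
    if period_ms(frequency_hz) < buffer_ms:
        return "tone"
    return "discrete event"


if __name__ == "__main__":
    print(f"Threshold: {threshold_hz():.0f} Hz")
    for frequency in (10, 20, 21, 440):
        print(f"{frequency:>4} Hz -> {perceived_as(frequency)}")
```

With these definitions, 20 Hz sits exactly at the boundary and is classed as a discrete event, while 21 Hz, whose period is about 47.6 ms, is already classed as a tone.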
Beyond a frequency of about 4000 Hz, pitches of different frequencies excite specific areas of the brain, these areas being arranged in a pattern that can be compared to the keys on a piano or other musical instrument. The sensation of these high frequencies is perhaps an analog of the spatial arrangement rather than a true perception of accurate pitch, as indicated by there being only a vague ability to discern pitch at higher frequencies.
Vision is also dependent on a linear signal, although this does not seem as obvious. The optic nerve and the retina of the eye are outgrowths of the embryonic brain, and the optic nerve is covered by the meninges, the same sheath that covers the brain, so these structures can be considered a part of the brain itself. This makes the sensory data gathering of vision perhaps not as passive as that of hearing and other senses.
The retina is covered by a grid of photoreceptor cells upon which the image received by the lens of the eye is projected. The photoreceptors are “wired” to the optic nerve by nerve tissue. The image that reaches the eye is processed at the eye, so that only the fovea sends a high-definition signal to the rest of the brain. In fact, only a tiny percentage of the light information that falls upon the retina is relayed further. (If the optic nerve did relay all the light information on the retina it would have to be much, much thicker, perhaps almost 100 times thicker, than it is.)
The optic nerve sends its information to its terminus, and from there the visual field would be displayed as an array on the brain. It seems likely that any impression is arrayed for approximately 50 milliseconds, which gives rise to a shaky “panorama” that constitutes the visual impression at any given time.
If all the information gathered by the retina were sent back simultaneously, the sense of vision would be as unintelligent as a camera. The basic function of vision is identification, and that involves an intelligent scanning of an object for clues as to its identity. Take for example the letters in these words. As you read them you look at tiny points on the letters one after the other, and that is sufficient for identification. Without deliberately examining it now, only a small percentage of those reading this text could give information about the characteristics of the font that it’s in. Similarly, while looking at any novel object (say, a tree), only someone approaching it in the manner of an artist or a naturalist, and hence deliberately identifying more and more detail, could know or say much about it.
The 50 ms memory buffer explains why we can be fooled by a movie into thinking that the intermittent images projected on the screen display continuous motion. At a projection rate of 24 frames per second, each image is displayed for only 41.7 ms, shorter than the 50 ms buffer, so anything looked at in the previous frame would still be sitting in its corresponding position in the visual array when the next frame appeared. It explains blurring (for example, when moving one's finger rapidly back and forth in front of a static background), because the minute points of the finger that were looked at within the past 50 ms still appear. It explains the philosophical enigma of why you can drop something on your foot and feel it at the exact same time as you see it, even though the visual sensation reaches the brain much sooner: the time necessary to send the touch sensation from foot to brain is, assuming a transmission speed of 100 meters per second, only about 15 to 20 ms, which falls within the 50 ms buffer.
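The two figures in this paragraph can be checked with a short Python sketch. The 50 ms buffer and the 100 meter per second transmission speed come from the text; the foot-to-brain distances of 1.5 to 2.0 meters are an assumption made here, since they are what the stated 15 to 20 ms implies at that speed. The function names are hypothetical.

```python
# Illustrative arithmetic only, under the assumptions named above.
BUFFER_MS = 50.0  # buffer length proposed in the essay


def frame_duration_ms(frames_per_second: float) -> float:
    """Time each frame stays on screen, in milliseconds."""
    return 1000.0 / frames_per_second


def conduction_delay_ms(distance_m: float, speed_m_per_s: float = 100.0) -> float:
    """Travel time of a nerve signal over the given distance, in milliseconds."""
    return distance_m / speed_m_per_s * 1000.0


if __name__ == "__main__":
    frame = frame_duration_ms(24)
    print(f"24 fps frame: {frame:.1f} ms, within buffer: {frame < BUFFER_MS}")

    # Assumed foot-to-brain path lengths, in meters.
    for distance in (1.5, 2.0):
        delay = conduction_delay_ms(distance)
        print(f"{distance} m path: {delay:.0f} ms, within buffer: {delay < BUFFER_MS}")
```

Both the frame duration and the conduction delay come out shorter than 50 ms, which is the condition the argument relies on in each case.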
Edit Feb. 27, 2012:
Changes were made to the hearing section to clarify it. Very minor changes were made to the vision section.
Edit Mar. 6, 2012:
Changes were made to the vision section.
Edit April 25, 2017:
Very minor edit.