For humans, stereoscopic vision has always been a necessity. Throughout their evolution, humans have preserved the ability to estimate distances by sight. This demonstrates how important this physiological characteristic is.
To perceive distances, the brain must perform a complex analysis of a huge amount of information. Today's computer imaging systems are still far from being able to do such an analysis in real time. To get a better idea of the amount of information to analyze, let's take a closer look at the starting point of the human visual system. This point is the eye, which can produce an instantaneous data flow capable of overflowing worldwide computer resources in a fraction of a second. In fact, the optic nerve is considered the densest communication channel known to humans on the scale of the universe!
Beyond handling the massive amount of information that each eye sends to it, the brain must also analyze the pictures to correlate them, that is, to find the points common to the two perceived images. This seems to be a trivial task when we look at two closely related images of a scene. Nevertheless, our brain must perform a great deal of computation to give us an interpretation that seems so evident at first sight. To appreciate the difficulty of the task, the reader is invited to imagine an iterative mathematical algorithm that would perform such an analysis in real time.
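To give a feel for what such an algorithm involves, here is a minimal sketch of a brute-force search for corresponding points along a single image row. It is an illustration only and makes no claim about how the brain proceeds: the function name, the sum-of-absolute-differences cost, and the toy data are all assumptions made for this example.

```python
def best_disparity(left_row, right_row, x, half_window, max_disparity):
    """Find the shift that best matches a patch of left_row inside right_row.

    Hypothetical helper: the patch centered on x in the left row is compared
    with patches shifted 0..max_disparity pixels to the left in the right row,
    using the sum of absolute differences (SAD) as the matching cost.
    The caller must ensure x + half_window stays inside the row.
    """
    # Never shift so far that the window would fall off the start of the row.
    largest_shift = min(max_disparity, x - half_window)
    best_shift, best_cost = 0, float("inf")
    for shift in range(largest_shift + 1):
        cost = 0
        for offset in range(-half_window, half_window + 1):
            cost += abs(left_row[x + offset] - right_row[x + offset - shift])
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


# Toy data: the right row is the left row displaced by 3 pixels.
left_row = [0, 0, 0, 0, 0, 10, 80, 200, 80, 10, 0, 0, 0, 0]
right_row = left_row[3:] + [0, 0, 0]

shift = best_disparity(left_row, right_row, x=7, half_window=2, max_disparity=4)
print(shift)  # prints 3
```

Even this toy version needs a nested loop to match a single point. A complete image pair repeats the search for every pixel of every row, many times per second.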
Now that we have a better understanding of the complexity of this task, let us look at how the visual information from the retina is sent to the processing center, the brain. The following figure illustrates the way the two eyes interface with it.
Figure 1 - Illustration inspired by the book "Virtual Reality ExCursions with programs in C"
Contrary to popular belief, stereoscopic analysis is not done by each eye sending all of its information to the opposite hemisphere of the brain. In reality, only a part of the picture perceived by each eye is sent to the opposite hemisphere. So it is not the entire picture from an eye that crosses over, but more precisely a part of that eye's visual field. The illustration above shows this. We can see that the light rays coming from one side of the visual field, relative to the center of the observer's face, are detected by both eyes, and that both eyes relay this information to the hemisphere on the opposite side.
This is what makes the correlative analysis possible. Given that the two hemispheres of the brain are almost isolated from one another, such a stereoscopic analysis would not be possible if each hemisphere received only monoscopic information. Furthermore, in the event of a malfunction of one hemisphere, the subject would be stereo blind (unable to see depth in a scene). This factor is an essential element of survival. About 2% of the population (excluding those who are blind) is stereo blind. The cause can be neurological (a malfunction of the brain's stereoscopic analysis) or optical (poor visual acuity in one of the two eyes).
After this brief physiological study, we must now analyze stereoscopic vision from another angle, the mathematical one. Why do certain objects appear closer to us than others? Observing the world around us is the best way to answer such a question.
Everyone has noticed at some point that a person looking at a nearby object tends to cross his eyes slightly toward one another. At the other extreme, a person looking at something more than a few hundred feet away will have both eyes perfectly centered. The angle the eyes adopt in this way is what we call the convergence angle. We can illustrate this as follows:
The above figure illustrates two different convergence cases. When the observer fixes his sight on plane A, his eyes adopt a convergence angle theta 1. When plane B is looked at instead, the convergence angle is more pronounced, giving theta 2. The brain primarily uses this convergence angle to perform its stereoscopic analysis. It obtains the angle theta through nerve signals from sensors in the muscle tissues responsible for orienting the eye in the desired direction.
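The way the convergence angle shrinks with distance can be sketched numerically. The snippet below is a minimal illustration, assuming the eyes converge symmetrically on a point straight ahead and an inter-pupillary distance of 69 mm (the average figure quoted later in this text). The function name and the sample distances are hypothetical.

```python
import math

IPD_M = 0.069  # assumed inter-pupillary distance, in meters


def convergence_half_angle_deg(distance_m, ipd_m=IPD_M):
    """Angle (in degrees) each eye turns inward to fixate a point straight ahead."""
    return math.degrees(math.atan((ipd_m / 2) / distance_m))


# Nearby targets need a pronounced angle; distant ones leave the eyes almost parallel.
for distance_m in (0.25, 1.0, 2.0, 100.0):
    angle = convergence_half_angle_deg(distance_m)
    print(f"{distance_m:7.2f} m -> {angle:.3f} degrees per eye")
```

Under these assumptions the angle at 100 meters is already a small fraction of a degree, which is consistent with the eyes appearing perfectly centered for far-away targets.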
The convergence information is used for various purposes. It primarily informs the brain of the relative position adjustment between the two images fed to it, since simply observing two images side by side does not allow the common points of the two pictures to be found quickly. In fact, as we will see later, it is often exactly the opposite that happens in many virtual reality head-mounted displays (HMDs). Indeed, in every HMD, the image pair is presented at a fixed distance called the image plane. This distance will therefore not necessarily match the distance at which an object must be seen according to the stereoscopic correlation.
Secondly, the convergence angle of the eyes enables the brain to determine the required accommodation. Accommodation is the act of changing the focal distance of the eye by modifying the muscular tension applied to the crystalline lens, a lens made of relatively deformable biological tissue. Accommodation therefore amounts to a modification of the curvature of the crystalline lens.
To recreate binocular (stereoscopic) vision by artificial means, we must always keep the preceding considerations in mind. How can we generate two images that appear to have volume? The answer partially lies in figure 2. Consider a real scene: for example, a box two meters from an observer. The image plane where the two stereoscopic images will be located is 1 meter from the observer. What will the two images recreating the scene look like, and what must the separation between them be, in millimeters?
First, there must be a rotation between the two images. This rotation depends on the inter-pupillary distance (IPD) of the observer and on the distance between him and the object. Secondly, both images must be placed so that they lie on the lines of sight of the two eyes. Thus, the closer the image plane is to the observer, the more the separation tends toward approximately 69 mm. The separation can be measured from any reference point, as long as the corresponding point is used on the other image. Here, the center of the cube is used.
theta = tan^-1 (0.0345 meter / 2 meters) = 0.988 degrees (0.0345 meter being half of the inter-pupillary distance)
angle between the two images = 2 x 0.988 degrees = 1.98 degrees
dx = tan (0.988 degrees) x 1 meter = 17.25 millimeters
separation = 2 x dx = 34.5 mm (note: the separation is always between 0 and the inter-pupillary distance, about 69 mm for the average human)
Note that the left/right eye representation in figure 3 does not reflect the numerical values calculated above. It only gives an idea of the appearance of the perceived views.
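The worked example above can be reproduced with a few lines of Python. This is a sketch under one stated assumption: the "1 meter" in the dx formula is read as the distance between the object and the image plane (2 m - 1 m), which is the reading under which the separation tends toward the IPD as the image plane approaches the observer. The function and parameter names are hypothetical.

```python
import math


def image_separation_mm(ipd_m, object_distance_m, plane_distance_m):
    """Return (theta_deg, dx_mm, separation_mm) for an object straight ahead.

    Assumption: dx is measured on the image plane from the observer's center
    line, so the lever arm is the object-to-plane distance.
    """
    theta = math.atan((ipd_m / 2) / object_distance_m)
    dx_m = math.tan(theta) * (object_distance_m - plane_distance_m)
    return math.degrees(theta), dx_m * 1000, 2 * dx_m * 1000


# Values from the text: IPD of 69 mm, box at 2 m, image plane at 1 m.
theta_deg, dx_mm, separation_mm = image_separation_mm(0.069, 2.0, 1.0)
print(f"theta = {theta_deg:.3f} degrees")
print(f"angle between the two images = {2 * theta_deg:.2f} degrees")
print(f"dx = {dx_mm:.2f} mm, separation = {separation_mm:.1f} mm")
```

With the same function, moving the image plane toward the observer (plane_distance_m approaching 0) makes the separation approach the full 69 mm, and placing the plane at the object's own distance gives 0, which matches the bounds stated in the note.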
Biological stress caused by stereoscopic vision
At this stage, the reader should better appreciate how important it is to respect the way the human body works when generating artificial stereoscopic images. What is important to remember is that it is never possible to completely recreate the conditions of binocular vision found in the real scenes around us. The problem is that, as previously mentioned, the only current means of recreating stereoscopic vision of a scene is to place two images with slight disparities at a fixed distance, called the image plane, in front of the user's eyes. Because these images are essentially two-dimensional, the observer is forced to adopt a fixed accommodation, defined by the physical distance between the user's eyes and the image plane.
As we have already seen, it is possible to make virtual objects appear at a distance specified by the system geometry. This implies that we can present a virtual object 10 meters in front of the observer while he looks at a pair of two-dimensional images only 10 centimeters from him. This does not seem to cause any problem. Wrong. There will be a more or less severe conflict between the pieces of information sent to the brain. This conflict is the fundamental clue that there will be a problem when using the given stereoscopic system.
Let's look at the problem more closely. The observer gazes at a scene located at a fixed distance "A" by means of stereoscopic images placed at a distance "B". Refer to figure 4, which follows, for the current discussion. As previously stated, the brain starts by adjusting the convergence so that the corresponding points of the two images given by the two eyes merge. Then it sends nerve impulses to the muscles of the crystalline lens so that the accommodation (the focal distance of the eye) agrees with the convergence information. Note that it is rather unclear at this point whether convergence precedes or follows accommodation; these two adjustments are made in no predefined order in time. So the observer adopts an accommodation that lets him see clearly at the apparent distance of the scene (10 meters in the example above), whereas the images are physically located at the image plane. He will then see a blurry scene. The brain does not understand the nature of this conflict. It tries to minimize the blur by modifying the muscle tension around the crystalline lens to obtain a clearer image. The more pronounced the required muscular compensation, the greater the biological stress. As a result, symptoms appear, such as temporarily impaired accommodation in the subject even half an hour after being immersed in the erroneous visual stimulus.
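The size of this mismatch can be put into numbers. The sketch below expresses focusing demand in diopters, the standard optical unit equal to the reciprocal of the distance in meters. Using the difference in diopters as a measure of the conflict is an assumption of this example, not something stated in the text above, and the function names are hypothetical.

```python
def accommodation_demand_diopters(distance_m):
    """Focusing demand on the eye for a target at distance_m (diopters = 1 / meters)."""
    return 1.0 / distance_m


def accommodation_conflict_diopters(virtual_distance_m, image_plane_distance_m):
    """Gap between the focus suggested by convergence and the focus the image plane requires."""
    suggested = accommodation_demand_diopters(virtual_distance_m)
    required = accommodation_demand_diopters(image_plane_distance_m)
    return abs(required - suggested)


# Example from the text: virtual object at 10 m, image pair at 10 cm.
conflict = accommodation_conflict_diopters(virtual_distance_m=10.0,
                                           image_plane_distance_m=0.10)
print(f"conflict = {conflict:.1f} diopters")
```

When the virtual object sits exactly on the image plane, the two demands coincide and the computed conflict is zero. The further the apparent distance departs from the image plane, the larger the gap the eye is asked to compensate for.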