Introduction to spectrometric modeling
Consider the formidable task of computer modeling a sophisticated spectrometric technique or instrument and doing it so well that the resulting simulation is of research quality, yielding highly accurate results immediately publishable in the primary refereed scientific literature. What is desired are results so good that they differ negligibly from actual experimental results obtained at far greater expense and with far greater effort. Performance of this task requires a computer program (or computer language) that has the following characteristics:
The first two criteria may be met with standard programming languages or with specialized simulation program "engines". However, implementation of an optical calculus is the sine qua non of the spectrometric modeling task, so it may be useful* to take a brief look at what an optical calculus is and why it is essential to the task at hand.
*Important note! It is not essential for one interested in quantitative spectrometric modeling to actually know how an optical calculus works, because the requisite mathematics (i.e., the optical calculus principles and computational aspects) can be hidden away completely, much like the internal hardware of a computer system. This happy possibility is amply demonstrated in the many free annotated models available. However, it certainly does not hurt to know a little about what is "under the hood" of one's computational vehicle. So, if you want to know a little about optical calculi, continue on; otherwise, follow this link to learn more about the software that will accomplish our spectrometric modeling task.
Background
Many years ago, scientists realized that when a single ray of monochromatic light interacts linearly with a homogeneous optical medium (often called a linear optical "device"), the interaction may be described quantitatively by means of a so-called "optical calculus". In the early 1940s, Dr. R. Clark Jones (of the Polaroid Corporation) invented his elegant and powerful Jones optical calculus [1-7]. At nearly the same time, Prof. Hans Mueller (of M.I.T.) invented his Mueller optical calculus [8]. Both of these optical calculi have subsequently been heavily developed and refined, resulting in many hundreds of papers in the research literature. In a real sense, these two optical calculi are complementary, so while each has its proponents, neither has supplanted the other.
At this point, it should be noted that a full discussion of either optical calculus requires a book-length treatment and, happily, such books already exist. I strongly recommend the following:
Accordingly, I shall not present a detailed tutorial on optical calculi, since this is already very well done in the above references and in the extensive research literature. However, what IS needed is a discussion of how to convert the "theory" in the textbooks and published papers into a practical and facile means of solving real spectrometric problems. This means that we need to put the optical calculi on a computer and use the computer to perform the requisite numerical computations that are implicit in the use of an optical calculus. So, starting at the beginning, exactly what is an optical calculus?
What is an optical calculus?
An optical calculus is simply a means of using linear algebra to model the linear interaction of a ray of monochromatic light and an optical medium that is homogeneous and well behaved (i.e., no non-linear effects). The monochromatic light ray is modeled with a vector while the medium is modeled via a matrix. For Jones calculus, the optical vector describing the light is a complex two-vector, called the Jones vector, and the matrix describing the optical medium is a complex 2 by 2 matrix, called the Jones matrix for the component. For Mueller calculus, the optical vector describing the light is a real four-vector, called the Stokes vector, and the matrix describing the optical medium is a real 4 by 4 matrix, called the Mueller matrix for the component.
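To make the shapes and number types just described concrete, here is a minimal Python/NumPy sketch. The particular entries (x-polarized light and an ideal x polarizer) are standard textbook forms chosen purely for illustration, and all variable names are hypothetical.

```python
import numpy as np

# Jones calculus: complex 2-vector for the light, complex 2x2 matrix for the medium.
# (Assumed example: x-polarized light of unit intensity, ideal x polarizer.)
jones_x_light = np.array([1.0 + 0.0j, 0.0 + 0.0j])
jones_x_polarizer = np.array([[1, 0],
                              [0, 0]], dtype=complex)

# Mueller calculus: real 4-vector (Stokes vector) and real 4x4 matrix
# for the same light and the same component.
stokes_x_light = np.array([1.0, 1.0, 0.0, 0.0])
mueller_x_polarizer = 0.5 * np.array([[1, 1, 0, 0],
                                      [1, 1, 0, 0],
                                      [0, 0, 0, 0],
                                      [0, 0, 0, 0]], dtype=float)

# In both calculi the medium acts on the light by a matrix-vector product.
jones_out = jones_x_polarizer @ jones_x_light
stokes_out = mueller_x_polarizer @ stokes_x_light

# Intensity is the squared modulus of the Jones vector,
# or simply the first element of the Stokes vector.
jones_intensity = float(np.vdot(jones_out, jones_out).real)
stokes_intensity = float(stokes_out[0])
```

Both descriptions give the same transmitted intensity here, which is one way to see that the two calculi are complementary views of the same linear interaction.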
Examples of Jones and Mueller optical vectors
For a detailed discussion of how these vectors (and more general ones) are defined in the first place, and the significance of the component locations within the vectors, see the references above. Note that any light ray, at a given instant in time and at a given spatial location, has an optical vector representation.
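As a hedged illustration, a few common Stokes vectors can be written down in Python/NumPy as follows. The sign chosen for the circular component (right circular positive, left circular negative) is an assumption intended to follow Shurcliff's convention; other texts use the opposite sign. The dictionary and the helper function are hypothetical names introduced only for this sketch.

```python
import numpy as np

# Stokes vectors for unit-intensity light (assumed convention: last element
# is +1 for right circular and -1 for left circular polarization).
STOKES = {
    "unpolarized":    np.array([1.0,  0.0,  0.0,  0.0]),
    "horizontal":     np.array([1.0,  1.0,  0.0,  0.0]),
    "vertical":       np.array([1.0, -1.0,  0.0,  0.0]),
    "plus_45":        np.array([1.0,  0.0,  1.0,  0.0]),
    "minus_45":       np.array([1.0,  0.0, -1.0,  0.0]),
    "right_circular": np.array([1.0,  0.0,  0.0,  1.0]),
    "left_circular":  np.array([1.0,  0.0,  0.0, -1.0]),
}

def degree_of_polarization(stokes):
    """Return the polarized fraction of the total intensity."""
    intensity = stokes[0]
    polarized = np.sqrt(np.sum(stokes[1:] ** 2))
    return float(polarized / intensity)

dop_lcp = degree_of_polarization(STOKES["left_circular"])
dop_unpolarized = degree_of_polarization(STOKES["unpolarized"])
```

The first element carries the total intensity, while the remaining three carry the horizontal/vertical, plus/minus 45 degree, and circular preferences of the light.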
Examples of Jones and Mueller optical matrices
For a detailed discussion of how these matrices are defined in the first place, and the significance of the component locations within the matrices, see the references above, especially Kliger et al. Note that the above matrices are simple special cases of far more complicated matrices applicable to general anisotropic media. For full details, see H. P. Jensen, J. A. Schellman, and T. Troxell, Applied Spectroscopy, 32, 192-200 (1978).
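For readers who prefer code to figures, the two simplest Mueller matrices used in the examples that follow might be sketched in Python/NumPy as shown here. The function names are hypothetical, and the matrices are the standard textbook forms for an isotropic absorber of transmittance T and an ideal linear polarizer with its transmission axis along x.

```python
import numpy as np

def isotropic_absorber(transmittance):
    """Mueller matrix of an isotropic absorber: T times the 4x4 identity."""
    return transmittance * np.eye(4)

def ideal_x_polarizer():
    """Mueller matrix of an ideal linear polarizer, transmission axis along x."""
    return 0.5 * np.array([[1, 1, 0, 0],
                           [1, 1, 0, 0],
                           [0, 0, 0, 0],
                           [0, 0, 0, 0]], dtype=float)

absorber = isotropic_absorber(0.5)
polarizer = ideal_x_polarizer()

# Sanity check: passing light through an ideal polarizer twice changes
# nothing further, so the matrix equals its own square.
polarizer_twice = polarizer @ polarizer
```

The idempotence check is a cheap way to catch a mistyped factor of 0.5 when entering a polarizer matrix by hand.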
OK, so how does an optical calculus work?
The easiest way to see how it all works is to look at a few simple examples. For this purpose, we will take several examples from Shurcliff's text, using them either as is or with some clarifying modification. Shurcliff's notation is used throughout, as is his choice of matrix element positioning (which differs from that of Jensen et al. above) and his choice of a right-handed coordinate system:
For simplicity, all of the examples chosen will be based on Mueller calculus, since this avoids a digression into how complex numbers, which are fundamental to the Jones calculus, are handled. First, however, we should list the restrictions that are almost always implicitly assumed at the outset. Our initial assumptions are as follows:
This is a highly restrictive set of conditions, but, as we will see later, almost all of them may be relaxed or even entirely eliminated.
Example 1
Consider now the following schematic block diagram:
The light source is of unit intensity (arbitrary units) and is left circularly polarized (LCP). The Stokes optical vector for this light source is one of those given above. The optical medium is an isotropic absorber having a transmittance of 0.5 at every wavelength. Its Mueller matrix is given above. Hence, we have the two items needed to set up and perform an optical calculus computation, but, before doing that, note that we can immediately state that the medium will simply attenuate the incident light ray, giving a resultant light intensity half that of the incident ray. This follows because the transmittance, denoted by "T", is defined as the transmitted light intensity divided by the incident light intensity and, in the present example, T = 0.5. There will clearly be no change in the polarization properties of the incident light ray.
Now, using Mueller optical calculus, we have the following:
The Stokes optical vector is at the far right and the absorber's Mueller matrix is to its immediate left. Thus, the ordering of mathematical objects is the reverse of the physical order of components, assuming conventional left to right physical component ordering. Starting from the right side of the succession of equalities, and working left, it is seen that the Mueller matrix of the absorber is really just a scaled identity matrix, so the required matrix-vector product is trivial. Then the last equality, at the far left, gives the output Stokes vector.
Clearly, this example is trivially easy because the medium is simply an isotropic absorber, having Mueller matrix equal to T times the identity matrix.
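Example 1 can be reproduced numerically with a few lines of Python/NumPy. This is only a sketch: the variable names are hypothetical, and the Stokes vector for LCP light assumes the sign convention in which left circular light has a negative last element.

```python
import numpy as np

# Unit-intensity left circularly polarized source (assumed sign convention).
lcp = np.array([1.0, 0.0, 0.0, -1.0])

# Isotropic absorber with transmittance T = 0.5: a scaled identity matrix.
T = 0.5
absorber = T * np.eye(4)

# The matrix sits to the left of the vector it acts on.
stokes_out = absorber @ lcp

intensity_out = stokes_out[0]
# stokes_out is (0.5, 0, 0, -0.5): half the intensity, still fully LCP.
```

The output intensity is T times the input intensity, and the ratios of the remaining elements to the first are unchanged, confirming that the polarization state is untouched.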
The next example is slightly more difficult.
Example 2
Consider the following schematic block diagram:
where the light source is again LCP of unit intensity and the polarizer is an ideal linear polarizer with transmission axis at zero degrees, i.e., an "x" polarizer by our coordinate system definition. The corresponding Mueller optical calculus expression is:
Note that this example and the previous one show that a matrix with constant coefficients may sometimes be usefully simplified merely by factoring out a constant. This is convenient for hand numerical calculations, but is unimportant when the numerical computations are performed using a computer because the time required for the computer to process the matrix numerically is negligible compared to the time required for any human manipulation of a matrix or vector. Of course, for analytical (in the mathematics sense) manipulation of matrices, factoring and simplification will always be important, but one of our goals is to entirely avoid having to deal with explicit matrices and vectors.
Henceforth, we will not show explicit factoring of matrices or vectors.
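A numerical version of Example 2 might look like the following Python/NumPy sketch. The names are hypothetical and the LCP sign convention (negative last element) is an assumption. The sketch also shows that factoring out the constant 1/2 makes no difference to the computed result.

```python
import numpy as np

# Unit-intensity left circularly polarized source (assumed sign convention).
lcp = np.array([1.0, 0.0, 0.0, -1.0])

# Ideal x polarizer, written with the factor 1/2 pulled out in front.
unscaled = np.array([[1, 1, 0, 0],
                     [1, 1, 0, 0],
                     [0, 0, 0, 0],
                     [0, 0, 0, 0]], dtype=float)
x_polarizer = 0.5 * unscaled

# Apply the full matrix directly ...
stokes_out = x_polarizer @ lcp

# ... or multiply first and scale afterwards, as one would by hand.
stokes_out_factored = 0.5 * (unscaled @ lcp)
# Both give (0.5, 0.5, 0, 0): half the intensity, linearly polarized along x.
```

Unlike the absorber of Example 1, the polarizer changes the polarization state: the circular component is removed and the output is fully linearly polarized along x.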
Example 3
Combining elements from the two previous examples gives the following schematic block diagram:
where the light source is as given above, the polarizer is an ideal "x" oriented linear polarizer, and the absorbing medium is a "neutral density" absorber with T = 0.5. The corresponding Mueller optical calculus expression is:
The component order dictates the order of the matrices, as noted above. In passing, we note that the two matrices could be multiplied first, resulting in an "equivalent component" matrix which may then be used to transform the incident Stokes vector to the output Stokes vector. However, as noted above, such reduction is only important analytically (in the mathematics sense), not numerically, and we will have no need, per se, for the analytic "equivalent component" matrices. Rather, we will simply set up the computation and let a computer program do the numerical computation required. This also affects how we will deal with optical components (optical media) that are not in standard orientation, i.e., rotated components.
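The ordering rule and the idea of an "equivalent component" matrix can both be sketched in Python/NumPy. The helper `propagate` is a hypothetical name, the LCP sign convention is an assumption, and the physical order (polarizer first, then absorber) is assumed from the description of Example 3.

```python
import numpy as np

def propagate(stokes_in, components):
    """Apply Mueller matrices in physical order (first component hit first)."""
    stokes = np.asarray(stokes_in, dtype=float)
    for mueller in components:
        stokes = mueller @ stokes
    return stokes

lcp = np.array([1.0, 0.0, 0.0, -1.0])          # assumed LCP convention
x_polarizer = 0.5 * np.array([[1, 1, 0, 0],
                              [1, 1, 0, 0],
                              [0, 0, 0, 0],
                              [0, 0, 0, 0]], dtype=float)
absorber = 0.5 * np.eye(4)                     # neutral density, T = 0.5

# Step through the train in physical order: polarizer, then absorber.
out_stepwise = propagate(lcp, [x_polarizer, absorber])

# Equivalent single component: the matrices appear in reverse physical order.
equivalent = absorber @ x_polarizer
out_equivalent = equivalent @ lcp
# Both routes give (0.25, 0.25, 0, 0).
```

Listing the components in physical order and letting the loop handle the reversal keeps the setup close to the block diagram, which is the point of leaving the arithmetic to the computer.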
How are rotated components treated?
When an optical component is rotated, its Jones and Mueller matrices must be modified. For either calculus, the modification procedure involves multiplying the standard matrix by a pair of rotation matrices, with the original matrix sandwiched between the pair of rotation matrices. This is shown in the figure below:
In dealing with analytic matrices with rotational offset from standard orientation, it may be best to perform the requisite pair of matrix multiplications, resulting in the matrix for a rotated optical component. For example, suppose an ideal "x" polarizer is to be rotated by an angle theta. Then the result is:
As a practical matter, it may be better to leave all components in standard orientation and perform the rotation matrix multiplications numerically as they may be needed. This avoids messy analytical matrix manipulation, with attendant possibility of derivation error, and the time penalty imposed is relatively small. This is the approach adopted below.
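A minimal Python/NumPy sketch of the numerical approach follows, for the Mueller calculus only. The function names are hypothetical, and the sign convention for the rotation matrix (and hence which member of the pair carries the negative angle) is an assumption; texts differ on this point. The numerically rotated polarizer is compared against the standard closed-form matrix for an ideal linear polarizer at angle theta.

```python
import numpy as np

def rotation(theta):
    """Mueller rotation matrix for angle theta in radians (assumed sign convention)."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return np.array([[1, 0, 0, 0],
                     [0, c, s, 0],
                     [0, -s, c, 0],
                     [0, 0, 0, 1]], dtype=float)

def rotate_component(mueller, theta):
    """Sandwich a standard-orientation matrix between a pair of rotations."""
    return rotation(-theta) @ mueller @ rotation(theta)

x_polarizer = 0.5 * np.array([[1, 1, 0, 0],
                              [1, 1, 0, 0],
                              [0, 0, 0, 0],
                              [0, 0, 0, 0]], dtype=float)

def rotated_polarizer_closed_form(theta):
    """Analytic Mueller matrix of an ideal linear polarizer at angle theta."""
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return 0.5 * np.array([[1, c, s, 0],
                           [c, c * c, c * s, 0],
                           [s, c * s, s * s, 0],
                           [0, 0, 0, 0]], dtype=float)

theta = np.deg2rad(30.0)
numeric = rotate_component(x_polarizer, theta)
analytic = rotated_polarizer_closed_form(theta)
agree = np.allclose(numeric, analytic)
```

Only `rotation` and the standard-orientation matrices need to be entered correctly; the rotated form of any component then follows without further derivation.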
The next example is taken from Shurcliff and is directly quoted.
Example 4
"Consider the experiment in which a unit-intensity beam of left-circularly polarized light is incident on a train containing the following four devices: (1) a linear polarizer with transmission axis horizontal, (2) a linear 90 degree retarder with fast axis at 45 degrees, (3) a linear polarizer with axis at 45 degrees, (4) a 90 degree right-circular retarder. ... The multiplication called for is
... On carrying out the four multiplications, one finds the over-all product to be
This shows that the emerging beam is 100-percent linearly, horizontally polarized and has an intensity of 1/4.
Obviously, the four devices considered in this example are so simple that a person familiar with polarizers and retarders could predict the result without recourse to algebra. This would no longer be true, however, if the polarizers were of nonideal type, or if the polarizers or retarders were mounted at unusual angles, or if some of the polarizers or retarders were of elliptical type. The outcome would then be far too complicated to compute in one's head. The Mueller-calculus method, however, goes through exactly as in the simple case (Shurcliff's italics). The numbers within the matrix are more complicated, but the procedure is identical. No special insight or planning is required, other than looking up the matrices, writing them down in proper sequence, and multiplying them starting from the right."
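Shurcliff's four-device train can be checked numerically with the Python/NumPy sketch below. The variable names are hypothetical. The sign conventions for the LCP Stokes vector and for the two retarder matrices are assumptions and vary between texts; those used here reproduce the result quoted above.

```python
import numpy as np

# Unit-intensity left circularly polarized input (assumed sign convention).
lcp = np.array([1.0, 0.0, 0.0, -1.0])

# (1) Ideal linear polarizer, transmission axis horizontal.
pol_h = 0.5 * np.array([[1, 1, 0, 0],
                        [1, 1, 0, 0],
                        [0, 0, 0, 0],
                        [0, 0, 0, 0]], dtype=float)

# (2) Linear 90 degree retarder, fast axis at 45 degrees (assumed signs).
ret_45 = np.array([[1, 0, 0, 0],
                   [0, 0, 0, -1],
                   [0, 0, 1, 0],
                   [0, 1, 0, 0]], dtype=float)

# (3) Ideal linear polarizer, transmission axis at 45 degrees.
pol_45 = 0.5 * np.array([[1, 0, 1, 0],
                         [0, 0, 0, 0],
                         [1, 0, 1, 0],
                         [0, 0, 0, 0]], dtype=float)

# (4) 90 degree right-circular retarder (assumed signs).
circ_ret = np.array([[1, 0, 0, 0],
                     [0, 0, 1, 0],
                     [0, -1, 0, 0],
                     [0, 0, 0, 1]], dtype=float)

# Matrices are written in reverse physical order, with the vector at far right.
stokes_out = circ_ret @ pol_45 @ ret_45 @ pol_h @ lcp
# stokes_out is (0.25, 0.25, 0, 0): intensity 1/4, horizontal linear polarization.
```

Replacing any of the four matrices with a nonideal or rotated version changes the numbers but not the procedure, which is exactly the point Shurcliff makes.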
What now?
Obviously, any reasonable computer programming language (e.g., FORTRAN, BASIC, C, C++, Pascal, APL, etc.) is acceptable as a means of performing the numerical computations outlined by Shurcliff at the end of the quoted passage. However, use of a general programming language forces the user to do almost all of the real work: everything from the thinking down to making sure that all the trivial specifics of the computer language are properly handled. Using a program such as Mathcad (or MATLAB or Mathematica or ...) is a significant improvement, because the interface makes the matrix and vector multiplications "visible" to the user, thereby reducing the need for the human user to "think like a computer", but, even so, the user is still stuck with almost all of the real work.
To put the task we face in sharp focus, consider that we are not really interested in easy problems: we want to use optical calculus methods to solve research level problems arising in the real world. As an example, consider the following scheme for measuring ultra-low optical activities of compounds eluting from a liquid chromatographic system:
This "real world" problem is vastly more complicated than the four simple examples that preceded it. Among the many complications it exhibits are the following:
And, of course, where do you get the matrices for the Faraday effect modulator, let alone the flow cell with the highly complicated eluent from a chromatographic system flowing through it? The latter is far from trivial. So, for all practical purposes, this "real world" problem is impossible to solve using optical calculus methodology instantiated with Mathcad, FORTRAN, Mathematica, etc. But, it CAN be solved, with relative ease, if we use the right software and optical calculus-based methodology. This leads us to a consideration of exactly what might be required to solve real world problems.