We may take an image to be some mapping between properties of radiation
from the sky and one, two, or three (sometimes even more)
dimensions of a data structure. Most commonly, we refer
to a mapping of sky intensity onto an *(x,y)* grid, but many of the
processes we discuss apply just as well to spectral images and more
exotic data forms. This includes indirect images, where there is no
optical formation of an image but one is constructed from interferometer
measurements, photon counts through moving masks, fiber-array spectra,
and so forth.

Quite generally, imaging takes the form of an inverse problem. Some true
source distribution *S* is observed by a system with some response
pattern *R* (broader than a δ-function), in the
presence of noise *N*, to give an observed image *I*. We may express
this as a convolution equation
*I(x,y) = ∫∫ S(v,w) R(x-v, y-w) dv dw + N(x,y)*
which we must sometimes consider to be further integrated within the
area of each pixel, perhaps with some nonuniform internal weighting.
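As a concrete sketch, the convolution equation is easy to simulate numerically. Everything below is illustrative rather than taken from any particular instrument: a hypothetical source of two point sources, a Gaussian response, and additive Gaussian noise.

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)

# Hypothetical true source S: two point sources on a 64x64 grid.
S = np.zeros((64, 64))
S[20, 20] = 100.0
S[40, 45] = 30.0

# Illustrative response R: a normalized Gaussian profile (not a δ-function).
y, x = np.mgrid[-8:9, -8:9]
R = np.exp(-(x**2 + y**2) / (2 * 2.0**2))
R /= R.sum()

# Observed image I = S convolved with R, plus additive noise N.
I = fftconvolve(S, R, mode="same") + rng.normal(0.0, 0.5, S.shape)
```

Because *R* is normalized to unit sum, the convolution conserves total flux: the point sources simply spread into round images of fixed shape, brighter for the brighter source.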

Often the convolution is understood so intuitively that it is not explicitly
treated. We expect photographs to have large round images of bright stars,
though the source distributions of these stars will be no larger than
for faint ones. Our eyes accustom us to this, and it is second nature
to consider what an object will look like at a given resolution.
Complexities enter when the response *R* has substantial structure, or wings
that are very extended compared to its core profile. Such systems have introduced
the need for deconvolution - inverting the convolution equation to
estimate *S* given *I* and *R*. This may not be mathematically
well-conditioned, so different approaches may be needed for various
regimes. In principle, the problem lends itself to a Fourier approach,
since convolution in the function domain becomes multiplication in the
Fourier (spatial frequency) domain. Denoting the Fourier transform by
lower case, we have (neglecting the noise term) *i = s r* and thus
(in principle, again) *s = i / r* from which the inverse Fourier
transform gives the source distribution. No matter what the textbook
says, this is almost never useful in imaging applications. The reasons lie
first in the noise behavior, and second in the fact that real response
functions have zeros, or near-zeros, in their Fourier transforms. Thus
certain spatial frequencies are almost pure noise in the data, and
Fourier deconvolution will amplify this noise with no attendant signal.

There are many distinct algorithms used to approach this problem. Modified
Fourier methods, such as the Wiener filter, use a smoothed or filtered
Fourier transform to suppress noise amplification. Methods working
directly in the Fourier domain are known collectively as linear methods.
Radio astronomers have widely used the CLEAN algorithm, in which
point sources are iteratively removed from the image until the residuals
are "adequately" close to the expected noise level. The Lucy-Richardson
technique has found wide use in HST data reduction; this takes a guess
image, convolves it with the response to predict its observed counterpart,
compares that prediction with the actual image, and uses the
differences to improve the guess. Various more sophisticated techniques
are being tested, most notably maximum-entropy and Bayesian methods.
If we have extensive *a priori* knowledge about an image,
we can do a better job of inferring the source distribution. For example,
if we know that a globular cluster contains only stars whose images are
identical except for a scale factor, we can derive the brightnesses of
10^5 stars per cluster much more accurately than we could without
such knowledge. If we know (say from a short pre-refurbishment HST exposure)
exactly where
they are, we may do better yet. In general, the fewer parameters we
need to determine at once, the more accurately they can be measured from
a given set of data. Another example is optimal extraction from two-dimensional
spectra - one can impose a constant spatial profile and smoothly varying
location along the spectrograph slit to increase the signal-to-noise ratio
of a spectrum by at least √2.
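As a sketch of how such an iterative scheme works, here is a minimal Richardson-Lucy-style loop in the spirit described above. This is a bare-bones illustration, not a production implementation; the iteration count and the small floor guarding the division are arbitrary choices.

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(I, R, n_iter=50):
    """Refine a guess for the source S so that, convolved with the
    response R, it reproduces the observed image I."""
    R_mirror = R[::-1, ::-1]             # mirrored response for the back-projection
    S_est = np.full_like(I, I.mean())    # flat, positive starting guess
    for _ in range(n_iter):
        model = fftconvolve(S_est, R, mode="same")    # predicted observation
        ratio = I / np.maximum(model, 1e-12)          # compare to the data
        S_est = S_est * fftconvolve(ratio, R_mirror, mode="same")
    return S_est
```

The multiplicative update keeps the estimate non-negative and, for a unit-sum response, approximately conserves total flux, which is part of why this family of methods has been popular for photon-counting data.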

For use in many of these techniques, and in further analysis, it can be crucial to understand the noise. We have already discussed noise models for CCDs. Other detectors, such as Vidicon TV systems and photographic plates, have their own behaviors, such as noise that is constant with intensity or varies slowly with it. It is often useful to propagate the error per pixel through subsequent analysis, always watching whether the pixels remain statistically independent (an assumption that breaks down during deconvolution or any rebinning operation, for example).
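As a small sketch of per-pixel error propagation, assume Poisson statistics for the raw counts (variance ≈ counts) and a hypothetical sky-model variance; all the numbers below are made up for illustration.

```python
import numpy as np

# Three hypothetical pixels from a sky-subtracted, flat-fielded frame.
raw = np.array([120.0, 95.0, 310.0])    # raw counts (assumed Poisson)
sky = np.array([40.0, 38.0, 41.0])      # sky model to subtract
flat = np.array([0.98, 1.01, 1.00])     # flat-field correction

var_raw = raw.copy()                    # Poisson: variance equals the counts
var_sky = np.full(3, 4.0)               # assumed variance of the sky model

# D = (raw - sky) / flat: variances add under the subtraction and
# scale by 1/flat^2 under division by a (noiseless) flat.
D = (raw - sky) / flat
var_D = (var_raw + var_sky) / flat**2
sigma_D = np.sqrt(var_D)
```

This bookkeeping remains valid only while the pixels are independent; after a deconvolution or rebinning step, a full covariance treatment would be needed.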
