Every image known to the Elixir system, whether obtained by a
camera closely integrated with Elixir or introduced from outside by a
user, is recorded in the registration database, Imreg.db. This
database contains various pieces of information about each image which
are useful for organizing or categorizing the images, and for
identifying trends.
The normal way in which images are introduced to Elixir is at
the time of acquisition, with a method to be discussed below. The data
stored in the Imreg.db are those which are reported by
the telescope (such as RA & DEC), measured by the telescope
support environment (such as various temperatures), or easily
determined from the images themselves with little effort (bias, sky).
The Imreg.db, like all of the Elixir databases, consists of a binary
file with a FITS-like header. The binary data of the file consists of
a series of C-structures written in sequence, with a defined byte
order for multiple byte fields. APIs are defined by the Elixir system
to make access to the data very straightforward.
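To illustrate what a defined byte order means for a reader of the
file, here is a minimal sketch of decoding multi-byte fields from a raw
buffer independently of the host machine. The text does not say which
byte order Elixir uses; big-endian (the order used by FITS binary data)
is assumed here, and the helper names get_u32_be and get_f32_be are
hypothetical, not part of the Elixir APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 4-byte unsigned integer stored most-significant byte first.
   ASSUMPTION: big-endian storage; the actual Elixir byte order is not
   stated in the text. */
static uint32_t get_u32_be(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode a 4-byte IEEE-754 float stored in the same byte order.
   Assumes the host float is a 32-bit IEEE-754 value. */
static float get_f32_be(const unsigned char *p)
{
    uint32_t bits = get_u32_be(p);
    float value;

    memcpy(&value, &bits, sizeof value);  /* reinterpret the bit pattern */
    return value;
}
```

Decoding byte by byte, rather than casting the buffer to a structure,
gives the same result on little-endian and big-endian hosts.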
The following C structure defines the Imreg.db:
/* structure for Image Registration Database */
typedef struct {
char filename[64]; /* image filename */
char pathname[128]; /* image pathname */
char filter[32]; /* filter name */
char instrument[32]; /* name of camera */
char ccd; /* number of ccd if mosaic */
char mode; /* format of image file (MEF/SPLIT) */
char type; /* imagetype (OBJECT, FLAT, etc) */
char junk[25]; /* extra space */
float exptime; /* exposure time, seconds */
float airmass; /* airmass */
float sky; /* median of data region */
float bias; /* median of overscan */
float fwhm; /* fwhm of stars in image */
float telfocus; /* telescope focus setting for image */
float xprobe, yprobe; /* x & y position of guide probe */
float zprobe; /* z position of camera */
float dettemp; /* temperature of CCD, K */
float teltemp[4]; /* several environment temps, C */
float rotangle; /* rotation angle of instrument */
float ra, dec; /* TCS-reported ra & dec */
unsigned long int obstime; /* time of image */
unsigned long int regtime; /* time image was registered */
} RegImage; /* 360 bytes / image */
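The 360 byte figure holds when 'unsigned long int' occupies 4 bytes.
On platforms where it occupies 8 bytes the compiled structure is
larger and no longer matches the file layout. As a sketch only, the
layout can be pinned down with fixed-width types and checked at compile
time (C11). The name RegImageFixed is hypothetical; the field order is
copied from the structure above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width mirror of RegImage: same fields, same order,
   but the two time stamps are forced to 4 bytes each. */
typedef struct {
    char     filename[64];
    char     pathname[128];
    char     filter[32];
    char     instrument[32];
    char     ccd;
    char     mode;
    char     type;
    char     junk[25];
    float    exptime;
    float    airmass;
    float    sky;
    float    bias;
    float    fwhm;
    float    telfocus;
    float    xprobe, yprobe;
    float    zprobe;
    float    dettemp;
    float    teltemp[4];
    float    rotangle;
    float    ra, dec;
    uint32_t obstime;
    uint32_t regtime;
} RegImageFixed;

/* 284 bytes of char data + 17 floats (68 bytes) + 2 time stamps (8 bytes) */
static_assert(sizeof(float) == 4, "float must be 4 bytes");
static_assert(sizeof(RegImageFixed) == 360, "record must be 360 bytes");
```

The 284 bytes of character data are a multiple of 4, so no padding is
needed before the first float on typical compilers.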
This structure defines a wide variety of information which may not be
either relevant or automatically available at all sites. This
database is intended to allow diagnostics of the telescope system, so
while it is convenient to fill as many of these fields as possible, it
is not particularly necessary to other parts of the Elixir system.
There are some entries which deserve extra attention.
At CFHT, most of the data in this structure can be determined by
examining the image headers. Currently (10/2000), the names of the
relevant header keywords are hard-coded in the relevant software.
However, a reasonable and obvious extension would provide these entries
in a lookup table of some sort. Some entries are not available in the
image header and have to be provided in an alternate way.
The telescope temperature data is not provided in the image headers.
Currently, these are recorded by the 'data logger', a daemon which
runs at the summit and stores a wide variety of data in a proprietary
database. This data is relevant not only to images which come off the
telescope, but also to images which are being analysed from the
archives. To access this data with Elixir, we have provided two
mechanisms. First, we have a small script which can extract the data
for the current night. This script runs as a daemon during a 12k run,
and places the temperature information in a specified location. Then,
the Elixir programs can search this file for the appropriate
temperatures at the appropriate time. Second, for an archive
analysis, the temperatures for the appropriate time range (e.g., Sept
1999) are similarly extracted into a specific file. The Elixir
software then searches through this file for the temperatures for a
given image. This two step process is somewhat cumbersome, but is
used to speed up the otherwise somewhat slow access to the complete
datalogger database. Ideally, the data which goes in this database
should be available from the image itself. This suggests that the TCS
should write the temperatures of interest in the header, which would
avoid the current cumbersome system.
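The lookup step can be sketched as follows. The format of the
extracted temperature file is not described here, so this sketch
assumes the samples have already been read into memory and sorted by
time. The TempSample type and the nearest_sample function are
hypothetical names, and choosing the sample closest in time is one
plausible matching rule, not necessarily the one Elixir uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of one extracted data logger sample. */
typedef struct {
    unsigned long time;     /* sample time, seconds */
    float         temp[4];  /* environment temperatures, C */
} TempSample;

/* Return the index of the sample closest in time to obstime.
   samples must be sorted by increasing time and n must be > 0. */
static size_t nearest_sample(const TempSample *samples, size_t n,
                             unsigned long obstime)
{
    size_t lo = 0, hi = n;

    assert(samples != NULL && n > 0);

    /* binary search for the first sample with time >= obstime */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (samples[mid].time < obstime)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return 0;        /* obstime precedes every sample */
    if (lo == n)
        return n - 1;    /* obstime follows every sample */

    /* pick the nearer neighbour; a tie goes to the earlier sample */
    return (obstime - samples[lo - 1].time <= samples[lo].time - obstime)
               ? lo - 1 : lo;
}
```

The four values in samples[i].temp would then be copied into the
teltemp field of the image entry.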
The image name and pathname are available to the program which
registers the image in the database. This program, since it must read
data from the image header, must have access to the image itself. It
can either be passed a full path to the image or it can be given a
relative path. In either case, it records the current path to the
image in the 'pathname' variable. This information is used by later
elements of Elixir to find and display images as desired. One
drawback of the current implementation is that it assumes the images
remain on disk in the same location. This is clearly not guaranteed.
It is one thing if the user removes the image of interest: this is
expected and a simple error saying the image is unavailable can be
returned. But, if the user moves the directories containing the
images, it would be helpful to provide a mechanism to allow the
database to make this change trivially. Possible options along these
lines might include: 1) A program to change the pathname for specific
images or for groups of images following specific rules (perhaps
similar to sed's substitution rules). 2) The upper fraction of the
path can be represented by a variable which can be easily changed by
the user at will. 3) The move / remove function for images can be
implemented with an Elixir specific version of mv / rm, which would
make the update automatically. Each of these strategies has
advantages and disadvantages, which should be weighed before
implementation.
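As an illustration of the first option in its simplest form, the
sketch below replaces a leading directory prefix of a stored pathname.
None of this exists in Elixir: rebase_pathname and PATHNAME_LEN are
hypothetical names, and the stored pathname is assumed to be
NUL-terminated within its 128 byte field.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PATHNAME_LEN 128   /* size of the pathname field in RegImage */

/* Replace a leading old_prefix in pathname with new_prefix.
   Returns 1 if the pathname was updated, 0 if the prefix does not
   match or the result would not fit in the field. */
static int rebase_pathname(char pathname[PATHNAME_LEN],
                           const char *old_prefix, const char *new_prefix)
{
    char updated[PATHNAME_LEN];
    size_t old_len;
    int written;

    assert(pathname != NULL && old_prefix != NULL && new_prefix != NULL);

    old_len = strlen(old_prefix);
    if (strncmp(pathname, old_prefix, old_len) != 0)
        return 0;

    /* match only at a directory boundary, so that "/data" does not
       match "/data2/..." */
    if (old_len > 0 && old_prefix[old_len - 1] != '/' &&
        pathname[old_len] != '/' && pathname[old_len] != '\0')
        return 0;

    written = snprintf(updated, sizeof updated, "%s%s",
                       new_prefix, pathname + old_len);
    if (written < 0 || (size_t)written >= sizeof updated)
        return 0;   /* new path too long for the field */

    memcpy(pathname, updated, (size_t)written + 1);
    return 1;
}
```

A tool built on this would loop over the selected database entries and
rewrite each one whose pathname matches.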
The remaining data are those which cannot be determined from the
image header, but only from the image itself. These include the
bias and data region median values, the latter representing the sky
flux, and the average FWHM of objects in the image. Within the Elixir
system, these values are determined by a fast analysis pipeline called
'imstats', which is documented in detail elsewhere. These values are
determined and added to Imreg.db after the images themselves have been
added.
One point to be made relates to the issue of mosaic data formats.
There are two typical ways mosaic images are stored: MEF and SPLIT.
In MEF, the entire mosaic is stored as a series of extensions in a
single FITS file. In SPLIT, each CCD image is stored as a separate
FITS file with related names. The Imreg.db allows for three types of
images: MEF, SPLIT, and SINGLE, the last referring to images from a
single CCD detector. A philosophical choice we have made is to have
every CCD represented in the Imreg.db. This implies an entry for
each CCD of a MEF image, even if these have the same file name (not to
mention RA, DEC, temp, etc). This is necessary since several elements
of the database table (sky, bias, ccd) refer to CCD-specific
information. It is also necessary to minimize the differences in
handling the SPLIT and MEF images.
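The one-entry-per-CCD choice can be sketched as below. To keep the
example short it uses a reduced stand-in for RegImage. The MiniEntry
type, the expand_mef function, the numeric mode codes, and CCD
numbering starting at 0 are all assumptions made for illustration; the
text does not give the stored values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mode codes; the stored values are not given in the text. */
enum { MODE_SINGLE = 0, MODE_SPLIT = 1, MODE_MEF = 2 };

/* Reduced stand-in for RegImage, holding only the fields used here. */
typedef struct {
    char filename[64];
    char ccd;
    char mode;
} MiniEntry;

/* Fill out[0..nccd-1] with one entry per CCD of a single MEF file.
   Every entry shares the file name; only the ccd field differs.
   Returns the number of entries written, or 0 if they do not fit. */
static size_t expand_mef(const char *filename, int nccd,
                         MiniEntry *out, size_t out_cap)
{
    size_t i, n;

    assert(filename != NULL && out != NULL);
    if (nccd <= 0 || (size_t)nccd > out_cap)
        return 0;

    n = (size_t)nccd;
    for (i = 0; i < n; i++) {
        memset(&out[i], 0, sizeof out[i]);
        strncpy(out[i].filename, filename, sizeof out[i].filename - 1);
        out[i].ccd  = (char)i;      /* ASSUMPTION: CCDs numbered from 0 */
        out[i].mode = MODE_MEF;
    }
    return n;
}
```

With the table filled this way, a query on a single CCD works the same
for MEF and SPLIT data.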
Image Registration Database input / output functions
There are several routines which are related to maintaining Imreg.db.
Several are used to introduce images or relevant data into the
database, while others are used to extract data as needed from the
database.
The basic data entry program is 'imregister', which places the basic
information about an image in the database, as determined from the
header. The program is invoked with: imregister (filename)
[-split]. The optional flag is used to tell the program to
distinguish the individual (SPLIT) frames of a mosaic CCD from an
individual CCD which should be treated as an isolated image (SINGLE).
The MEF images can be identified by the information in the image
header. The imregister program also searches for temperature
information as described above.
A variant of the 'imregister' program is 'imsort', which does the same
task as 'imregister', but it also sends a trigger to the IMSTAT elixir
and if needed to the PTOLEMY elixir. These triggers tell these
elixirs to perform their analyses on the image. In 'imsort' and in
the related elixirs, the -split flag makes a difference in the naming
convention of the derived data products. The basic point is that a
SINGLE image /fullpath/filename.fits produces analysis files of the
form /newpath/filename.ext while a SPLIT image will have the form
/fullpath/word/wordNN.fits and output files of the form
/newpath/word/wordNN.ext. The SPLIT maintains an extra directory
level in the path in common with the input path.
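The naming rule can be sketched as a small function. This is an
illustration of the convention as described, not the code used by
'imsort'; the name product_name and its argument list are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the name of a derived product.
   SINGLE (split == 0): /fullpath/filename.fits  -> newpath/filename.ext
   SPLIT  (split != 0): /fullpath/word/wordNN.fits -> newpath/word/wordNN.ext
   Returns 1 on success, 0 if the input lacks the needed directory
   level or the result does not fit in out. */
static int product_name(const char *input, const char *newpath,
                        const char *ext, int split,
                        char *out, size_t out_size)
{
    const char *last, *file, *tail, *dot;
    size_t stem_len;
    int written;

    assert(input != NULL && newpath != NULL && ext != NULL && out != NULL);

    last = strrchr(input, '/');
    file = (last != NULL) ? last + 1 : input;   /* file name only */
    tail = file;

    if (split) {
        /* keep one extra directory level in front of the file name */
        if (last == NULL)
            return 0;
        tail = last;
        while (tail > input && tail[-1] != '/')
            tail--;
        if (tail == last)
            return 0;           /* empty directory component */
    }

    /* drop the input extension, if any */
    dot = strrchr(file, '.');
    stem_len = (dot != NULL) ? (size_t)(dot - tail) : strlen(tail);

    written = snprintf(out, out_size, "%s/%.*s.%s",
                       newpath, (int)stem_len, tail, ext);
    return written >= 0 && (size_t)written < out_size;
}
```

This sketch builds the name only; whether the output directory already
exists is left to the caller.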
As mentioned above, Imreg.db includes statistics determined from the
images themselves by the imstats elixir process. The results of these
measurements are included in the database with the command
'imstatreg', which finds the database entry for the given image and
adds in the new statistics as needed.
For data extraction, the program 'imsearch' is an all-purpose
searching tool. This program lists all images which match a set of
constraints listed on the command line. Without any arguments, it
therefore lists all images in the Imreg.db, along with a summary of
the interesting information. Flags to the program can restrict the
search, including options such as:
- -ccd N : restrict search to CCD number N
- -type WORD : restrict by image type (flat, etc)
- -mode WORD : restrict by MEF, SPLIT, SINGLE
- -filter NAME : restrict by filter
- -time date range : restrict by date & time period
- -trange date1 date2 : restrict to range date1 - date2
- name?
These allow for easy searches for specific images. Since the data are
passed to standard out, more sophisticated searches and analyses may
be easily performed by passing the data to other UNIX filters like
sort, grep, and awk. The above entries are generally case
insensitive. The dates may be specified in the format:
[yy]yy/mm/dd,hh:mm:ss. The separators may be any character except
space or period (.), as long as it is not parsed by the shell (e.g., ?).
Entries in the dates may be dropped from the right (least
significant) as needed and default to their minimum values. The
words TODAY and NOW may also be used. The date may also be specified
as a Julian day if suffixed with 'j' or as a length of time since Jan
1, 1970 (or is it REFDATE?), if the units are specified with suffixes
as described below. The time range can be specified in several units
if specified with the following suffixes: s (seconds), m (minutes), h
(hours), d (days), M (months), y (years). Note that only the minute and
month suffixes are case sensitive here.
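The date and time-span rules above can be sketched as two small
parsers. These are illustrations, not the imsearch code. The names
DateFields, parse_date, and span_seconds are hypothetical. The pivot
used to expand two digit years and the lengths of a month (30 days)
and a year (365 days) are assumptions. The TODAY, NOW, and Julian day
forms are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

typedef struct { int year, month, day, hour, min, sec; } DateFields;

/* Parse [yy]yy/mm/dd,hh:mm:ss. Any single non-digit character other
   than space or period is accepted as a separator. Fields dropped from
   the right keep their minimum values. Returns the number of fields
   read, or 0 on a malformed string. */
static int parse_date(const char *text, DateFields *out)
{
    assert(text != NULL && out != NULL);

    int *slot[6] = { &out->year, &out->month, &out->day,
                     &out->hour, &out->min, &out->sec };
    const char *p = text;
    int n = 0;

    out->year = 0; out->month = 1; out->day = 1;   /* minimum values */
    out->hour = 0; out->min = 0;   out->sec = 0;

    while (n < 6 && *p != '\0') {
        char *end;
        long v;

        if (!isdigit((unsigned char)*p))
            return 0;
        v = strtol(p, &end, 10);
        *slot[n] = (int)v;
        if (n == 0 && end - p <= 2)     /* ASSUMPTION: pivot at 70 */
            out->year += (v < 70) ? 2000 : 1900;
        n++;
        p = end;
        if (*p == ' ' || *p == '.')
            return 0;                   /* separators not allowed */
        if (*p != '\0')
            p++;                        /* skip one separator */
    }
    return (*p == '\0') ? n : 0;
}

/* Convert a span such as "90m" or "2M" to seconds. Only m (minutes)
   and M (months) are case sensitive. Returns 1 on success, 0 on a
   missing or unknown suffix. */
static int span_seconds(const char *text, double *seconds)
{
    char *end;
    double value, unit;

    assert(text != NULL && seconds != NULL);
    value = strtod(text, &end);
    if (end == text || end[0] == '\0' || end[1] != '\0')
        return 0;

    if (end[0] == 'm') {
        unit = 60.0;
    } else if (end[0] == 'M') {
        unit = 30.0 * 86400.0;          /* ASSUMPTION: 30 day month */
    } else {
        switch (tolower((unsigned char)end[0])) {
        case 's': unit = 1.0; break;
        case 'h': unit = 3600.0; break;
        case 'd': unit = 86400.0; break;
        case 'y': unit = 365.0 * 86400.0; break;  /* ASSUMPTION */
        default:  return 0;
        }
    }
    *seconds = value * unit;
    return 1;
}
```

For example, "2000/10" yields two fields and leaves the day at 1 and
the time at 00:00:00.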
Data extraction may also be performed in more detail with the 'status'
program. Functions in 'status' allow the extraction of the various
fields into vectors (or string vectors??) which may then be
manipulated as needed. Unlike imsearch, where only a minimal subset
of the available data are reported, status allows access to all
entries in the database table.
Short side note (implementation specific issues): there are two
major aspects of the analysis process which depend to some extent on
the assumptions we have made at CFHT. I have tried to minimize the
number of places that programs need to know something specific to CFHT
that might not be treated exactly the same elsewhere. The first of
these aspects is the use of SPLIT vs MEF format images for the files.
It is necessary to distinguish them and to treat these differently in
some cases (particularly when an image is loaded). It is particularly
difficult to decide when a specific CCD in SPLIT format is one of a
mosaic or if it is just an isolated (SINGLE) CCD image. Header
keywords to distinguish these cases are not well defined. Related to
this is the question of the number of CCDs in the mosaic. In a
limited number of locations, it is important to know that there are 12
(or N) CCDs in the mosaic. The other implementation specific issue is
the naming convention. This is related to the SPLIT/MEF issue as
well. At CFHT, we have used the convention that a MEF image has a
name of the form /some/path/NNNNNNx.fits, where NNNNNN is a sequence
number and the 'x' is a flag associated with the image type (FLAT,
OBJECT, etc). For a SPLIT image, the NN CCD images are placed in
files with names of the form /some/path/NNNNNNx/NNNNNNxMM.fits where
MM is a 2 digit number representing the CCD number. At some level, it
shouldn't matter that this is the format for the split image. We
could just accept each file of the form NNNNNNxMM.fits as a separate
fits image. But, we have chosen to keep the organizational structure
and allow the processing result files (products produced by 'elixir'
typically) to have names of the format
/new/path/NNNNNNx/NNNNNNxMM.ext, not only for the SPLIT but also for
the MEF products. As a result, it is necessary for certain
functions to know this naming convention and apply it as necessary.
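A function that knows this convention might look like the sketch
below. It is not taken from the Elixir sources: CfhtName and
parse_cfht_name are hypothetical names, and the sketch assumes exactly
six sequence digits, a single letter as the type flag, and a '.fits'
extension, following the pattern written above.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef struct {
    long seq;    /* sequence number NNNNNN */
    char flag;   /* image type flag 'x' */
    int  ccd;    /* CCD number MM, or -1 when the name has none (MEF) */
} CfhtName;

/* Parse NNNNNNx.fits or NNNNNNxMM.fits from the last path component.
   Returns 1 if the name follows the convention, 0 otherwise. */
static int parse_cfht_name(const char *path, CfhtName *out)
{
    const char *slash, *p;
    int i;

    assert(path != NULL && out != NULL);

    slash = strrchr(path, '/');
    p = (slash != NULL) ? slash + 1 : path;

    out->seq = 0;
    for (i = 0; i < 6; i++, p++) {          /* six digit sequence number */
        if (!isdigit((unsigned char)*p))
            return 0;
        out->seq = out->seq * 10 + (*p - '0');
    }

    if (!isalpha((unsigned char)*p))        /* image type flag */
        return 0;
    out->flag = *p++;

    out->ccd = -1;                          /* optional two digit CCD */
    if (isdigit((unsigned char)p[0]) && isdigit((unsigned char)p[1])) {
        out->ccd = (p[0] - '0') * 10 + (p[1] - '0');
        p += 2;
    }
    return strcmp(p, ".fits") == 0;
}
```

Keeping the convention in one function like this limits the number of
places that need to change at a site with a different naming scheme.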