CFHT archive manual - utilities

User Utilities


find-archive.tcl

Find-archive searches the archiving records for information about a file. It exits as soon as it finds a match, so it returns only one entry. The search is flexible: give it only as much of the file name as you are sure of, and do not use wildcards (such as *, ? or []). Use it like this:

find-archive.tcl file

For example:

[archive@kapu:~] find-archive.tcl 528777
working...
going back further into the records...
528777 found in document tarexa_list.CADC223 :
528777f.fits on CFHT-CADC223_TAR - tar set 4 - Thu Apr 13 21:28:46 HST 2000 10494400
[archive@kapu:~]

CFHT-CADC223 is the tape the file is on.
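
If the result needs to be fed to another script, the record line can be split into its fields. The following is a minimal sketch only: the pattern is inferred from the single sample record shown above, the function name `parse_record` is hypothetical, and it assumes that the tape name is the label without its `_TAR` suffix and that the trailing number is a file size.

```python
import re

# Pattern inferred from the one sample record in this manual; other records
# may be formatted differently.
RECORD = re.compile(
    r"^(?P<file>\S+) on (?P<label>\S+) - tar set (?P<set>\d+) - "
    r"(?P<date>.+) (?P<size>\d+)$"
)

def parse_record(line):
    """Split one find-archive record line into its fields, or return None."""
    m = RECORD.match(line.strip())
    if m is None:
        return None
    label = m.group("label")
    # Assumption: the tape name is the label without its "_TAR" suffix.
    tape = label[:-4] if label.endswith("_TAR") else label
    return {
        "file": m.group("file"),
        "tape": tape,
        "set": int(m.group("set")),
        "date": m.group("date"),
        # Assumption: the trailing number is the size of the file.
        "size": int(m.group("size")),
    }

sample = ("528777f.fits on CFHT-CADC223_TAR - tar set 4 - "
          "Thu Apr 13 21:28:46 HST 2000 10494400")
record = parse_record(sample)
print(record["tape"], record["set"])
```

A line that does not match the pattern yields `None`, so the progress messages ("working...") can be passed through the same function and ignored.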



Alternatively, you can search for records of files using find-archive directly through this web page:
search archives for file 



extract.tcl

Extract takes three arguments: the destination directory, the source device and the number of sets on the tape. The number of sets on a tape should be written on the outside of its box. The command is constructed like this:

[archive@kapu:~] extract.tcl -dst destination -src source device -sets no. of sets

For example:

[archive@kapu:~] extract.tcl -dst /local/data/dearchive -src /dev/rmt/0 -sets 33

Extract makes a subdirectory for each set, where it writes the files, named 0, 1, 2, 3, etc.
Note that each DLT holds about 35 gigabytes of data, so dearchiving an entire DLT takes time and space.
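
Because a full tape needs that much room, it can be worth checking the destination before starting. The sketch below is not part of extract.tcl. It only uses Python's standard `shutil.disk_usage`, and the 35 GB figure is the approximate capacity quoted above.

```python
import shutil

# "About 35 gigabytes" per the note above; treat this as a rough lower bound.
DLT_CAPACITY_BYTES = 35 * 10**9

def has_room_for_full_tape(path, needed=DLT_CAPACITY_BYTES):
    """Return True if the filesystem holding `path` has at least `needed` bytes free."""
    return shutil.disk_usage(path).free >= needed

# Check the current directory; substitute the intended -dst directory.
print(has_room_for_full_tape("."))
```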

extract-sets.tcl

Extract-sets works like extract except that it extracts only the sets that you specify. It is also interactive, so rather than giving it a string of arguments you enter information as it asks for it. It needs essentially the same information as extract, except that it asks for a first and last set number rather than the total number of sets on the tape. One other difference is that it puts all of the files into the current directory, which makes it easier to sort them.

mta.tcl

Mta works on a directory of dearchived files, spooling them into a temporary directory in groups appropriate to the size of the media you are using, then writing the files to tapes. Mta can use any size of media and can write to any number of tapes. It gets information from you interactively, so you don't pass it any arguments. It needs to know the directory you want to work in, the device you want to write to, the size of the media you will be using in kilobytes (a 5 GB Exabyte tape, for example, would be 5000000), and the first and last files in your range. When entering the file names, enter just the number associated with each file; for example, if the first file is 528777f.fits, enter 528777.
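
The grouping step can be pictured as follows. This is a minimal sketch of one plausible strategy (fill each medium in file order and start a new one when the next file would not fit), not a description of how mta.tcl is actually implemented. The function name `group_for_media`, the second and third file names and all sizes are made up for illustration.

```python
def group_for_media(files, media_kb):
    """Split (name, size_kb) pairs, in order, into groups that each fit on one medium."""
    groups, current, used = [], [], 0
    for name, size_kb in files:
        if size_kb > media_kb:
            raise ValueError(f"{name} ({size_kb} KB) is larger than the media")
        if used + size_kb > media_kb:
            # The next file would overflow this medium, so start a new group.
            groups.append(current)
            current, used = [], 0
        current.append(name)
        used += size_kb
    if current:
        groups.append(current)
    return groups

# Hypothetical files and sizes in kilobytes.
files = [("528777f.fits", 10249), ("528778f.fits", 10249), ("528779f.fits", 10249)]
print(group_for_media(files, media_kb=25000))
```

With these numbers the first two files share one medium and the third starts a second one.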


Administrative utilities

adh
Usage: adh <action> <media type> <runid> <MEF/SPLIT/RAW>
-delete : delete files from disk and update database.
-nospool : same as -delete only without processing data or writing any media.
-restore : restore to disk an entire storage area or files associated with a runid from archive tapes.
-file : specify a file that contains a list of the files to write to media, otherwise specify runid or use -recover
-recover : recover from an aborted distribution, specify the spool directory originally used.
Currently the supported media types are: dlt autodlt disk

Adh (archive data handler) is a multipurpose utility for distributing queue data and maintaining the online data archives. It provides an interface for processing data for distribution to investigators, removing data from the online storage system and recovering parts of the online storage system from archive tapes to disk. It uses internal interfaces to the archive database and interfaces with the DLT library through libh. It is easily extensible to write distribution data to virtually any type of media.

To distribute data adh generally takes some input as to which files to process, how to store them and in what format they should be stored. Input files may be specified by a list in a text file of the format:

12345o.fits
54321o.fits
...

Unless RAW mode is specified all fits data is processed by elixir.

If a runid is specified, all data associated with that runid is processed. Once a list of files has been generated or imported, a list of the elixir master detrend files that will be used to process the data is generated and each of those files is copied into the spool area, whose location is the first value returned by adh for a distribution.

The spool area is a directory that will be used for temporary files associated with processing and media writing and will also contain all non-fits data associated with the distribution. Next, if any of the files are standard stars acquired in queue mode, they are processed by elixir and temporarily spooled. Then each file is examined to determine the calendar nights on which the observations were taken, and a detailed report of the weather conditions for those dates is generated.

All the steps up to this point are considered preliminary processes. If adh is directed to "recover" a distribution by pointing it to an existing spool area, all of the information given at the command line, including the list of files, is inferred from control data in the spool area, and all of the preliminary processes that have already been completed are skipped.

Next, each of the science images is processed with elixir and written to the distribution media. The fits header of each post-processed image is copied into the "headers" directory in the spool area, and two jpeg images, a thumbnail and a larger, binned image, are copied into the "img" directory. A tape manifest is also maintained in the spool area as a text file. Once all the science data has been processed and written to media, the master detrend data and the processed standard stars are written to the media as well. At the end of the distribution you are given the option to clean up the spool area of bulky data products so that only control data and the records of the distribution remain.
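
The recovery behaviour described above, where completed preliminary steps are skipped on a second run, follows a common checkpointing pattern. The sketch below illustrates that pattern only. The step names, the `control.json` file name and the JSON format are all hypothetical, and the real control data kept by adh is not documented here.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical step names and control-file name, chosen for illustration only.
PRELIMINARY_STEPS = ["detrend_list", "standard_stars", "weather_report"]
CONTROL_FILE = "control.json"

def run_preliminary(spool, actions):
    """Run each preliminary step not yet recorded as done in the spool area."""
    control = Path(spool) / CONTROL_FILE
    done = json.loads(control.read_text()) if control.exists() else []
    ran = []
    for step in PRELIMINARY_STEPS:
        if step in done:
            continue  # finished before the interruption, so skip it
        actions[step]()
        done.append(step)
        control.write_text(json.dumps(done))  # record progress after every step
        ran.append(step)
    return ran

# Stand-in actions that do nothing; a real step would do the processing.
actions = {step: (lambda: None) for step in PRELIMINARY_STEPS}
with tempfile.TemporaryDirectory() as spool:
    first = run_preliminary(spool, actions)   # runs every step
    second = run_preliminary(spool, actions)  # finds them all recorded, runs none
print(first, second)
```

Writing the control file after each step, rather than once at the end, is what lets a run that was interrupted partway resume from the first unfinished step.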

Examples:

adh -file files.txt dlt MEF
Will process all the files listed in files.txt in the manner described above. They will be written to DLT tape in multiextension fits format.
adh 01ax99 dlt MEF
Will process all the files taken with the fits keyword RUNID set to 01ax99 in the manner described above. Note that no standard stars will be associated with a distribution of this type.
adh -recover /data/kapu/spool/spool3
Will look in the spool directory specified to obtain the original list of files, media type and file format. Any preliminary processing of weather, standard star, and master detrend data that is already complete will be skipped.
adh -restore /data/pono autodlt
Will restore the files that were stored on /data/pono from archive tapes in the dlt library. You will be prompted with options to restore only a subset of the filesystem or to duplicate the files to a host other than pono.
adh -delete 01ax99
Will delete all files associated with the 01ax99 runid from disk and update the database to reflect the changes.
libh
--print_report Prints a summary of the library's contents.
--tape_info Prints a summary of the archive information associated with the tapes currently loaded in the library except for the tapes currently operated on.
--init_slot [slot] Initialize a slot. Slot numbers may be specified individually or as a range. Note that initializing a slot is equivalent to deleting a tape as it will be free for allocation by any program.
--allocate_tape [name] [owner] Allocates a tape for a specific name and user. A tape must be allocated before it can be used. This function returns the address of the slot that has been allocated.
--use_tape [name] Loads a specified tape into a free tape drive. Returns the device name of the tape drive used.
--free_tape [name] Unloads a tape from whatever tape drive it was used in. Returns the tape to the slot it was originally allocated in.
--clean [device] [reset] Clean drives; tapes will be temporarily moved back to their slots. If a device is not specified, all drives will be cleaned. The reset flag resets the cleaning counter, which is kept in the "owner" field. Of course, you should only do this after actually changing the cleaning tape.
--load Change the names of the tapes in the library.
--help Print this message

Libh provides a uniform interface to the dlt library. It is extensible to include other robotic autoloaders as well. The basic functions serve to move tapes to and from drives, clean drives, and maintain a database that associates tape names with locations within the library. Libh uses a locking scheme so that only one request may be made of the library at a time, and only from the applicable host. Before a tape can be used it must first be allocated, whereby the name and primary user of the tape are specified. The index number of the tape that was allocated is returned.

Loading tapes into the drives is a non-specific operation: the name of the tape to be used is specified, the first available device is used, and the drive into which the tape was loaded is returned. Unloading tapes is non-specific in the same way: the name of the tape to be unloaded is specified and the tape is returned to the slot indicated in the database.

The "load" option invokes an interactive process where the database entry for each tape element is reviewed. This is helpful if you are loading or unloading a large number of tapes to or from the library at one time.
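
The allocate, use and free cycle can be modelled with a small in-memory structure. This is a toy sketch for orientation only: the class `TapeLibrary`, the slot numbering from 1 and the device names are assumptions, and the real libh also handles locking, cleaning and the robot itself, none of which is shown.

```python
class TapeLibrary:
    """Toy in-memory model of the allocate / use / free cycle."""

    def __init__(self, slots, drives):
        # slot number -> (tape name, owner), or None when the slot is free
        self.slots = {n: None for n in range(1, slots + 1)}
        # device name -> name of the loaded tape, or None when the drive is free
        self.drives = {d: None for d in drives}

    def allocate_tape(self, name, owner):
        """Record name and owner in the first free slot and return that slot."""
        for slot, entry in self.slots.items():
            if entry is None:
                self.slots[slot] = (name, owner)
                return slot
        raise RuntimeError("no free slot")

    def _slot_of(self, name):
        for slot, entry in self.slots.items():
            if entry is not None and entry[0] == name:
                return slot
        raise KeyError(f"{name} is not allocated")

    def use_tape(self, name):
        """Load an allocated tape into the first free drive and return the device."""
        self._slot_of(name)  # raises if the tape was never allocated
        for device, loaded in self.drives.items():
            if loaded is None:
                self.drives[device] = name
                return device
        raise RuntimeError("no free drive")

    def free_tape(self, name):
        """Unload the tape and return the slot it was originally allocated in."""
        for device, loaded in self.drives.items():
            if loaded == name:
                self.drives[device] = None
                return self._slot_of(name)
        raise KeyError(f"{name} is not loaded")

# Hypothetical library with three slots and two drives.
lib = TapeLibrary(slots=3, drives=["/dev/rmt/0", "/dev/rmt/1"])
slot = lib.allocate_tape("CFHT-CADC234_3d", "tarexa3d")
device = lib.use_tape("CFHT-CADC234_3d")
returned_to = lib.free_tape("CFHT-CADC234_3d")
print(slot, device, returned_to)
```

Note that the caller never names a drive or a slot. It supplies only the tape name and is told where the tape ended up, which mirrors the behaviour described above.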

Examples:

libh --print_report
Writes a summary of the contents of the tape library to the terminal, including the tape name, status, owner and date of allocation.
libh --use_tape CFHT-CADC234_3d
Loads the named tape into the first available drive and returns the device name of that drive.
libh --free_tape CFHT-CADC234_3d
Returns the tape loaded in the above example to its original slot in the library.
sql_update
Sql_update provides a generic interface to the archive database. It has two main modes of operation: one where a single record is updated or created, and another where a list of files is parsed and the data recorded in the database. The first mode, invoked with -l, takes a sequence of keyword/value pairs, one of which must be "name" (the name of the file), and updates the corresponding record in the database. If no record exists, a new one is created. The second mode is invoked by default; it scans the list file generated by tarexad to update the database with information about when and where a file was archived.
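
The first mode amounts to an "update or create" keyed on the file name. The sketch below shows that logic against a plain dictionary standing in for the database. The function names are hypothetical, and it assumes that the single quotes around each value are SQL string quoting that can be stripped for this illustration.

```python
def parse_pairs(args):
    """Turn keyword=value arguments into a dict; the 'name' keyword is mandatory."""
    pairs = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"expected keyword=value, got {arg!r}")
        # Assumption: surrounding single quotes are SQL quoting, not data.
        pairs[key] = value.strip("'")
    if "name" not in pairs:
        raise ValueError("a 'name' keyword is required")
    return pairs

def upsert(table, pairs):
    """Update the record keyed by pairs['name'], creating it if it is absent.

    Returns True when a new record was created.
    """
    created = pairs["name"] not in table
    table.setdefault(pairs["name"], {}).update(pairs)
    return created

table = {}  # stand-in for the archive database
args = ["name='12345o.fits'", "RUNID='01ax99'"]
created_first = upsert(table, parse_pairs(args))  # no record yet, so one is created
created_again = upsert(table, parse_pairs(args))  # record exists, so it is updated
print(created_first, created_again, table["12345o.fits"]["RUNID"])
```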

Examples:

sql_update -l name="'12345o.fits'" RUNID="'01ax99'" path="'/data/loa/01ax99/12345o.fits'"
Updates the record for 12345o.fits to make an association to a runid and a location in an online storage volume.
sql_update
Parses the listfile generated by tarexad and updates the information on each line to the database.
change_auto_media
Change_auto_media is a utility invoked by distd when the archive tapes are full. It also includes an interactive mode for use by humans. It takes the tarexa daemon name as an argument. It unloads the old tape, allocates a new one and loads it into an available drive. If the drive needs cleaning, it performs this operation. The new device name is inserted into the tarexa daemon's configuration files, which will be reread when distd restarts it.

Examples:

change_auto_media tarexa3d
Change the tape currently used by tarexa3d. Rotate log and list files, clean the device if it is time, update the configuration files for tarexa3d with the new device name.
change_auto_media -i tarexa3d
Perform the operations above, but interactively. Debugging messages are printed to the screen and errors are trapped. At each error you are given the option to proceed or cancel the change of media.
kill_daemon
Kill_daemon cleanly terminates the operation of the specified daemon if it is not busy.

Example:

kill_daemon tarexa3d
start_daemon
Start_daemon initializes a daemon based on information in its parfile and starts its process.

Example:

start_daemon tarexa3d
kill_archive
Kill_archive invokes kill_daemon for every archive daemon listed in the main archive parfile.

Example:

kill_archive
start_archive
Start_archive invokes start_daemon for every archive daemon listed in the main archive parfile.

Example:

start_archive
check_arch
Check_arch reports on the status of each archive daemon listed in the main archive parfile, including the amount of storage space in the daemon's working directory and in the remote directories from which files are copied.

Example:

check_arch
tape-test
Tape-test reads each set from an archive tape to verify that it is readable. It is interactive and prompts for information that it needs.

Example:

tape-test
itape
Itape returns information about an archive tape including the range of files on the tape, the number of sets and the date the tape was started and finished.

Example:

itape 220
Returns information about tape CFHT-CADC220.
tarexah
Tarexah is the handler used by the tarexa daemons to write files to media. Using such a handler allows the daemons to be easily modified to write to various types of media. The handler checks the tape label to ensure that it is using the correct tape, positions the tape, writes the files, rereads the files and compares them to the originals for verification, and manages a counter of how much data has been written to the tape thus far and in how many sets.
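
The write, reread and compare cycle can be illustrated with ordinary files. In this sketch a directory stands in for the tape, and the function name `write_and_verify` is hypothetical. Label checking, tape positioning and set counting are not shown, and nothing here reflects the actual tarexah code.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path

def write_and_verify(sources, dest_dir):
    """Copy each file into dest_dir, reread it, and compare it byte for byte.

    Returns the total number of bytes written and verified.
    """
    dest_dir = Path(dest_dir)
    written = 0
    for src in map(Path, sources):
        copy = dest_dir / src.name
        shutil.copyfile(src, copy)
        # shallow=False forces a comparison of contents, not just file metadata.
        if not filecmp.cmp(src, copy, shallow=False):
            raise IOError(f"verification failed for {src.name}")
        written += copy.stat().st_size
    return written

# A temporary directory stands in for the media; the file content is dummy data.
with tempfile.TemporaryDirectory() as work, tempfile.TemporaryDirectory() as media:
    original = Path(work) / "12345o.fits"
    original.write_bytes(b"x" * 2880)
    total = write_and_verify([original], media)
print(total)
```

The running total returned here plays the role of the counter of data written so far.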

Example:

The handler is always invoked by the daemon, which passes it the daemon's working directory and the device that it is currently using.

tarexa_init
Tarexa_init is a small handler used by the tarexa daemons to initialize and label new media before use.

Example:

The handler is always invoked by the daemon, which passes it the daemon's working directory, the device that it is currently using and the new label for the media.