CFHT archive manual - utilities
Find-archive searches the archiving records for information about a file. Note that find-archive exits as soon as it finds a match, so it returns only one entry. Find-archive matches flexibly, so give it only as much of the file name as you are sure of, and do not give it any wildcards (such as *, ?, or []). Use it like this:
find-archive.tcl file
For example:
[archive@kapu:~] find-archive.tcl 528777
working...
going back further into the records...
528777 found in document tarexa_list.CADC223 :
528777f.fits on CFHT-CADC223_TAR - tar set 4 - Thu Apr 13 21:28:46 HST
2000 10494400
[archive@kapu:~]
CFHT-CADC223 is the tape the file is on.
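If the result of find-archive has to be passed on to another script, the match line can be picked apart with a regular expression. The following Python sketch assumes the layout seen in the single example above (file name, "on", tape label, "tar set N"); the function name parse_find_archive and the pattern are illustrative and are not part of the archive tools.

```python
import re

# Pattern inferred from the one sample line in this manual; other
# record layouts may exist and would need their own pattern.
MATCH_LINE = re.compile(
    r"(?P<file>\S+\.fits) on (?P<label>\S+) - tar set (?P<set>\d+)"
)


def parse_find_archive(output):
    """Return the file name, tape label and set number, or None if no match line is found."""
    # The sample output wraps across lines, so join the lines before matching.
    text = " ".join(line.strip() for line in output.splitlines())
    m = MATCH_LINE.search(text)
    if m is None:
        return None
    return {
        "file": m.group("file"),
        "label": m.group("label"),  # label exactly as printed, e.g. CFHT-CADC223_TAR
        "set": int(m.group("set")),
    }
```

The set number is the useful part when only one set is to be pulled from the tape rather than the whole DLT.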
Extract takes three arguments: the destination directory, the source device, and the number of sets on the tape. The number of sets on a tape should be written on the outside of its box. The general form is:
[archive@kapu:~] extract.tcl -dst <destination> -src <source device> -sets <no. of sets>
For example:
[archive@kapu:~] extract.tcl -dst /local/data/dearchive -src /dev/rmt/0 -sets 33
Extract makes a subdirectory for each set, named 0, 1, 2, 3, etc., and writes the files of that set there.
Note that each DLT holds about 35 gigabytes of data, so dearchiving an entire
DLT takes time and space.
Extract-sets works like extract except that it extracts only the sets that you specify. It is also interactive, so rather than giving it a string of arguments you enter information as it asks for it. It needs essentially the same information as extract, except that it asks for a first and last set number rather than the total number of sets on the tape. One other difference is that it puts all of the files into the current directory, which makes it easier to sort them.
Mta works on a directory of dearchived files, spooling them into a temporary directory in groups appropriate to the size of the media you are using, then writing the files to tapes. Mta can use any size of media and can write to any number of tapes. It gets information from you interactively, so you don't send it any arguments. It needs to know the directory you want to work in, the device you want to write to, the size of the media you will be using in kilobytes (a 5 GB Exabyte tape, for example, would be 5000000), the first file in your range and the last file. When entering the file names, enter only the number associated with each file; for example, if the first file is 528777f.fits, enter 528777.
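The values that mta asks for can be worked out ahead of time. The Python sketch below shows the kilobyte conversion used in the example above, how to get the number from a file name, and one simple way to split files into media-sized groups. The manual does not describe how mta actually forms its groups, so the greedy, in-order packing here is an assumption, and all three function names are hypothetical.

```python
import re


def file_number(name):
    """Return the leading digits of a file name, e.g. '528777f.fits' -> '528777'."""
    m = re.match(r"\d+", name)
    if m is None:
        raise ValueError(f"no leading number in {name!r}")
    return m.group(0)


def media_size_kb(gigabytes):
    """Convert a nominal size in gigabytes to kilobytes, decimal as in the manual (5 -> 5000000)."""
    return int(gigabytes * 1_000_000)


def group_for_media(files, capacity_kb):
    """Pack (name, size_kb) pairs, in order, into groups that each fit on one medium.

    Assumption: a simple greedy fill; mta's real grouping may differ.
    """
    groups, current, used = [], [], 0
    for name, size in files:
        if size > capacity_kb:
            raise ValueError(f"{name} is larger than the media")
        if used + size > capacity_kb:
            # Current medium is full: close this group and start a new one.
            groups.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        groups.append(current)
    return groups
```

The number of groups returned gives a rough count of how many tapes a range of files would need.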
adh
Usage: adh <action> <options>
Adh (archive data handler) is a multipurpose utility for distributing queue data and maintaining the online data archives. It provides an interface for processing data for distribution to investigators, removing data from the online storage system and recovering parts of the online storage system from archive tapes to disk. It uses internal interfaces to the archive database and interfaces with the DLT library through libh. It is easily extensible to write distribution data to virtually any type of media.
To distribute data adh generally takes some input as to which files to process, how to store them and in what format they should be stored. Input files may be specified by a list in a text file of the format:
12345o.fits
54321o.fits
...
Alternatively, OD numbers alone may be given. Unless RAW mode is specified, all FITS data is processed by elixir.
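A list file can be checked before it is handed to adh. The Python sketch below reads one name per line and rejects anything that does not look like the names shown above. The assumed name shape (digits, one lowercase letter, ".fits") is inferred only from the examples in this manual, and read_file_list is a hypothetical helper, not part of adh.

```python
import re

# Assumed name shape, based on 12345o.fits and 528777f.fits in this manual.
FITS_NAME = re.compile(r"^\d+[a-z]\.fits$")


def read_file_list(text):
    """Return the file names in a list file, one per line, skipping blank lines."""
    names = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if not FITS_NAME.match(name):
            raise ValueError(f"unexpected entry: {name!r}")
        names.append(name)
    return names
```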
If a runid is specified, all data associated with that runid is processed. Once a list of files has been generated or imported, adh generates a list of the elixir master detrend files that will be used to process the data and copies each of those files into the spool area. The spool area is the first value returned by adh for a distribution.
The spool area is a directory that is used for temporary files associated with processing and media writing, and that also contains all non-FITS data associated with the distribution. Next, if any of the files are standard stars acquired in queue mode, they are processed by elixir and temporarily spooled. Each file is then examined to determine the calendar nights on which the observations were taken, and a detailed report of the weather conditions for those dates is generated.
All the steps up to this point are considered preliminary processes. If adh is directed to "recover" a distribution by pointing it to an existing spool area, all of the information given at the command line, including the list of files, is inferred from control data in the spool area, and all of the preliminary processes that have already been completed are skipped.
Next, each of the science images is processed with elixir and written to the distribution media. The FITS header of each post-processed image is copied into the "headers" directory in the spool area, and two JPEG images, a thumbnail and a larger binned image, are copied into the "img" directory. A tape manifest is also maintained in the spool area as a text file. Once all the science data has been processed and written to media, the master detrend data and the processed standard stars are written to the media as well. At the end of the distribution you are given the option to clean up the spool area of bulky data products so that only control data and the records of the distribution remain.
Examples:
adh -file files.txt dlt MEF
Will process all the files listed in files.txt in the manner described above. They will be written to DLT tape in multiextension fits format.
adh 01ax99 dlt MEF
Will process all the files taken with the fits keyword RUNID set to 01ax99 in the manner described above. Note that no standard stars will be associated with a distribution of this type.
adh -recover /data/kapu/spool/spool3
Will look in the spool directory specified to obtain the original list of files, media type and file format. Any preliminary processing of weather, standard star, and master detrend data that is already complete will be skipped.
adh -restore /data/pono autodlt
Will restore the files that were stored on /data/pono from archive tapes in the dlt library. You will be prompted with options to restore only a subset of the filesystem or to duplicate the files to a host other than pono.
adh -delete 01ax99
Will delete all files associated with the 01ax99 runid from disk and update the database to reflect the changes.
libh
--print_report Prints a summary of the library's contents.
Libh provides a uniform interface to the DLT library. It is extensible to include other robotic autoloaders as well. The basic functions serve to move tapes to and from drives, clean drives, and maintain a database that associates tape names with locations within the library. Libh uses a locking scheme so that only one request may be made of the library at a time, and only from the applicable host. Before a tape can be used it must first be allocated, whereby the name and primary user of the tape are specified. The index number of the tape that was allocated is returned.
Loading tapes into the drives is an ambiguous operation: the name of the tape to be used is specified, and the drive into which it was loaded is returned. The first available device is used. Unloading tapes is also ambiguous: the name of the tape to be unloaded is specified, and the tape is returned to the slot indicated in the database.
The "load" option invokes an interactive process in which the database entry for each tape element is reviewed. This is helpful if you are loading or unloading a large number of tapes to or from the library at one time.
Examples:
libh --print_report
Writes a summary of the contents of the tape library to the terminal, including the tape name, status, owner and date of allocation.
libh --use_tape CFHT-CADC234_3d
Loads the named tape into the first available drive and returns the device name of that drive.
libh --free_tape CFHT-CADC234_3d
Returns the tape loaded in the above example to its original slot in the library.
check_pool
Check_pool reports either or both of the following: a list of the runids that currently have files on disk according to the database, and a summary of the state of those files. The summary gives the number of files listed in the database, the number actually present on disk, and the amount of disk space currently used on all storage pools by each runid.
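The per-runid summary described above comes down to comparing what the database lists with what is actually on disk. The Python sketch below illustrates that comparison only. It does not query the archive database: the list of paths is passed in by the caller, and both function names are hypothetical.

```python
import os


def size_or_none(path):
    """Return the size of a file in bytes, or None if it is not on disk."""
    try:
        return os.path.getsize(path)
    except OSError:
        return None


def summarise_runid(db_paths, size_of=size_or_none):
    """Compare the paths listed for one runid with what is present on disk.

    db_paths: paths the database lists for the runid (supplied by the caller).
    size_of:  callable returning a size in bytes, or None for a missing file.
    """
    sizes = [size_of(p) for p in db_paths]
    present = [s for s in sizes if s is not None]
    return {"listed": len(db_paths), "present": len(present), "bytes": sum(present)}
```

A difference between "listed" and "present" would point to files that the database still records but that are no longer on disk.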
sql_update
Sql_update provides a generic interface to the archive database. It has two main modes of operation: one where a single record is updated or created, and another where a list of files is parsed and the data recorded in the database. The first mode, invoked with -l, takes a sequence of keyword/value pairs, one of which must be "name" (the name of the file), and updates the corresponding record in the database. If no record exists, a new one is created. The second mode is invoked by default; it scans the list file generated by tarexad to update the database with information about when and where a file was archived.
Examples:
sql_update -l name="'12345o.fits'" RUNID="'01ax99'" path="'/data/loa/01ax99/12345o.fits'"
Updates the record for 12345o.fits to make an association to a runid and a location in an online storage volume.
sql_update
Parses the list file generated by tarexad and updates the information on each line to the database.
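When the single-record mode is driven from a script, the keyword/value pairs have to keep the inner single quotes shown in the example above. The Python sketch below builds such an argument list. It only assembles the arguments and does not run sql_update; the helper name sql_update_args is hypothetical, and the quoting rule is taken from the one example in this manual.

```python
def sql_update_args(name, **fields):
    """Build the argument list for single-record mode (-l).

    Each value is wrapped in single quotes, matching the example in the manual,
    where the shell strips the outer double quotes and leaves name='12345o.fits'.
    """
    pairs = {"name": name, **fields}
    args = ["sql_update", "-l"]
    for key, value in pairs.items():
        if "'" in str(value):
            raise ValueError(f"value for {key} must not contain a single quote")
        args.append(f"{key}='{value}'")
    return args
```

The resulting list can be handed to subprocess.run without a shell, which avoids a second layer of quoting.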
change_auto_media
Change_auto_media is a utility invoked by distd when the archive tapes are full. It also includes an interactive mode for use by humans. It takes the tarexa daemon name as an argument. It unloads the old tape, allocates a new one and loads it into an available drive. If the drive needs cleaning, it performs this operation. The new device name is inserted into the tarexa daemon's configuration files, which are reread when distd restarts the daemon.
Examples:
change_auto_media tarexa3d
Changes the tape currently used by tarexa3d: rotates the log and list files, cleans the device if it is time, and updates the configuration files for tarexa3d with the new device name.
change_auto_media -i tarexa3d
Performs the operations above interactively. Debugging messages are printed to the screen and errors are trapped. At each error you are given the option to proceed or cancel the change of media.
kill_daemon
Kill_daemon cleanly terminates the operation of the specified daemon if it is not busy.
Example:
kill_daemon tarexa3d
start_daemon
Start_daemon initializes a daemon based on information in its parfile and starts its process.
Example:
start_daemon tarexa3d
kill_archive
Kill_archive invokes kill_daemon for every archive daemon listed in the main archive parfile.
Example:
kill_archive
start_archive
Start_archive invokes start_daemon for every archive daemon listed in the main archive parfile.
Example:
start_archive
check_arch
Check_arch reports on the status of each archive daemon listed in the main archive parfile, including the amount of storage space in the daemon's working directory and in the remote directories from which files are copied.
Example:
check_arch
tape-test
Tape-test reads each set from an archive tape to verify that it is readable. It is interactive and prompts for information that it needs.
Example:
tape-test
itape
Itape returns information about an archive tape, including the range of files on the tape, the number of sets, and the dates the tape was started and finished.
Example:
itape 220
Returns information about tape CFHT-CADC220.
tarexah
Tarexah is the handler used by the tarexa daemons to write files to media. Using such a handler allows the daemons to be easily modified to write to various types of media. The handler checks the tape label to ensure that it is using the correct tape, positions the tape, writes the files, rereads the files and compares them to the originals for verification, and manages a counter of how much data has been written to the tape thus far and in how many sets.
Example:
The handler is always invoked by the daemon, which passes it the working directory of the daemon and the device that it is currently using.
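The reread-and-compare step can be pictured with a short Python sketch. The manual does not say how tarexah compares the reread files with the originals, so the SHA-256 digest used here is an assumption chosen for illustration, and the function names are hypothetical.

```python
import hashlib


def digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def failed_verification(pairs):
    """Return the originals whose reread copy differs.

    pairs: iterable of (original_path, reread_path) tuples.
    """
    return [original for original, reread in pairs if digest(original) != digest(reread)]
```

An empty result means every reread file matched its original.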
tarexa_init
Tarexa_init is a small handler used by the tarexa daemons to initialize and label new media before use.
Example:
The handler is always invoked by the daemon, which passes it the working directory of the daemon, the device that it is currently using, and the new label for the media.