CFHT archive manual - utilities

User Utilities

  • find-archive.tcl
  • extract.tcl
  • extract-sets.tcl
  • mta.tcl

  • find-archive.tcl

    Find-archive searches the archiving records for information about a file. It exits as soon as it finds a match, so it returns only one entry. The match is flexible, so give it only as much of the file name as you are sure of, and do not use wildcards (such as *, ? or []). Use it like this:

    find-archive.tcl file

    For example:

    [archive@kapu:~] find-archive.tcl 528777
    working...
    going back further into the records...
    528777 found in document tarexa_list.CADC223 :
    528777f.fits on CFHT-CADC223_TAR - tar set 4 - Thu Apr 13 21:28:46 HST 2000 10494400
    [archive@kapu:~]

    CFHT-CADC223 is the tape the file is on.
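
    If the result needs to be used by another script, the record line can be split into fields. The following is a minimal Python sketch, not part of find-archive. The pattern is inferred from the single sample line above, so other records may not match it, and treating the trailing number as a size is an assumption.

```python
import re

# Pattern inferred from the one sample record shown above (an assumption).
RECORD_RE = re.compile(
    r"^(?P<file>\S+) on (?P<tape>\S+?)(?:_TAR)? - tar set (?P<set>\d+) - "
    r"(?P<date>.+) (?P<size>\d+)$"
)

def parse_record(line):
    """Split one find-archive record into its fields, or return None."""
    match = RECORD_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["set"] = int(fields["set"])
    fields["size"] = int(fields["size"])  # assumed to be a size; units not stated
    return fields

record = parse_record(
    "528777f.fits on CFHT-CADC223_TAR - tar set 4 - "
    "Thu Apr 13 21:28:46 HST 2000 10494400"
)
print(record["tape"], record["set"])
```

    The tape name and set number are the two values needed to dearchive the file with the extraction utilities described below.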



    Alternatively, you can search for records of files using find-archive directly through this web page:
    search archives for file 



    extract.tcl

    Extract takes three arguments: the destination directory, the source device and the number of sets on the tape. The number of sets on a tape should be written on the outside of its box. The command is constructed like this:

    [archive@kapu:~] extract.tcl -dst destination -src source device -sets no. of sets

    for example:

    [archive@kapu:~] extract.tcl -dst /local/data/dearchive -src /dev/rmt/0 -sets 33

    Extract makes a subdirectory for each set, named 0, 1, 2, 3, etc., and writes that set's files into it.
    Note that each DLT holds about 35 gigabytes of data, so dearchiving an entire DLT takes time and space.
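
    When extract.tcl is driven from another script, it may help to check the free space first and build the argument list in one place. This is a hedged Python sketch: the three flags are the ones documented above, while the helper names and the 35 GB default (taken from the note above) are illustrative assumptions.

```python
import shlex
import shutil

DLT_BYTES = 35 * 10**9  # rough capacity of one DLT, per the note above

def enough_space(directory, required_bytes=DLT_BYTES):
    """Return True if the filesystem holding directory has the room needed."""
    return shutil.disk_usage(directory).free >= required_bytes

def build_extract_command(destination, source_device, sets):
    """Return the extract.tcl argument list for a full-tape extraction."""
    if sets < 1:
        raise ValueError("a tape holds at least one set")
    return ["extract.tcl", "-dst", destination, "-src", source_device,
            "-sets", str(sets)]

command = build_extract_command("/local/data/dearchive", "/dev/rmt/0", 33)
print(shlex.join(command))
```

    The list form can be passed directly to subprocess.run once enough_space has been checked for the destination directory.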

    extract-sets.tcl

    Extract-sets works like extract except that it extracts only the sets that you specify. It is also interactive, so rather than giving it a string of arguments you enter information as it asks for it. It needs essentially the same information as extract, except that it asks for a first and last set number rather than the total number of sets on the tape. One other difference is that it puts all of the files into the current directory, which makes it easier to sort them.

    mta.tcl

    Mta works on a directory of dearchived files, spooling them into a temporary directory in groups appropriate to the size of the media you are using, then writing the files to tapes. Mta can use any size of media and can write to any number of tapes. It gets its information interactively, so you do not pass it any arguments. It needs to know the directory you want to work in, the device you want to write to, the size of the media in kilobytes (a 5 GB Exabyte tape, for example, would be 5000000), and the first and last files in your range. When entering the file names, enter only the number: if the first file is 528777f.fits, enter 528777.
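
    The grouping step can be pictured as packing files, in order, into batches that each fit on one tape. The Python sketch below illustrates that idea only; it is not mta's actual algorithm, and the function names and sample sizes are made up for the example.

```python
def file_number(name):
    """Return the leading digits of an archive file name, e.g. 528777."""
    digits = ""
    for ch in name:
        if not ch.isdigit():
            break
        digits += ch
    if not digits:
        raise ValueError(f"no leading number in {name!r}")
    return int(digits)

def group_for_media(sizes_kb, media_kb):
    """Greedily pack (name, size_kb) pairs, in order, into media-sized groups."""
    groups, current, used = [], [], 0
    for name, size in sizes_kb:
        if size > media_kb:
            raise ValueError(f"{name} does not fit on one tape")
        if used + size > media_kb and current:
            groups.append(current)      # this tape is full, start the next one
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        groups.append(current)
    return groups

# Illustrative sizes in kilobytes, not real file sizes.
files = [("528777f.fits", 3000000), ("528778f.fits", 3000000),
         ("528779f.fits", 1000000)]
print(group_for_media(files, 5000000))
```

    Each inner list corresponds to one tape's worth of files for the media size given.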


    Administrative Utilities

  • adh
  • libh
  • check_pool
  • sql_update
  • change_auto_media
  • kill_daemon
  • start_daemon
  • kill_archive
  • start_archive
  • check_arch
  • tape-test
  • itape
  • tarexah
  • tarexa_init
  • adh
    Usage: adh <action> <options>

    --delete <path to delete> : delete files from disk, update symlinks and database. Give a filesystem (e.g. /data/ar1), a subdirectory (e.g. /data/ar1/01aq97) or a file (e.g. /data/ar1/01aq97/0101010o.fits).

    --restore <filesystem or list file> <media type> : restore to disk an entire storage area, or the files associated with a runid, from archive tapes. Optionally give the filesystem or the path to a file that contains a list of files to restore. Several interactive questions are asked. It is possible to restore only the files associated with a runid or list of runids in a filesystem. It is also possible to restore the files to a host other than the original (making a duplicate filesystem). Only files not already on the destination filesystem are restored. The media type must be specified; for a restore it is dlt or autodlt.

    --recover <spool directory> : recover from an aborted distribution; specify the spool directory originally used.

    --relocate <source> <destination> : move the files in source to destination and update the symbolic links and database to reflect the new location. Source can be a filesystem, subdirectory or file, but destination must be a filesystem. Only files tracked by the database are moved. If a file in the database is no longer on the source filesystem, its database record is updated to "not.on.disk". Only files not already on the destination filesystem are copied and updated.

    --distribute <runid or --file=<input file>> <media type> <PROCESS/RAW> <MEF/SPLIT> : collect, process and write a collection of files to disk or other media. If a runid is given, all exposures acquired under that runid are taken as input; otherwise specify an input list file. The default output is PROCESS and MEF.

    -y : assume that the answer to all prompts is yes and proceed noninteractively.
    -q : quiet mode, continue to output to the logfile but reduce output to stdout to a minimum, also assume noninteractive mode as in -y.

    Currently the supported media types are dlt, autodlt, dds4, exabyte, mammoth, cdrom and disk, depending on the resources of the host.

    Adh (archive data handler) is a multipurpose utility for distributing queue data and maintaining the online data archives. It provides an interface for processing data for distribution to investigators, removing data from the online storage system and recovering parts of the online storage system from archive tapes to disk. It uses internal interfaces to the archive database and interfaces with the DLT library through libh. It is easily extensible to write distribution data to virtually any type of media.

    To distribute data adh generally takes some input as to which files to process, how to store them and in what format they should be stored. Input files may be specified by a list in a text file of the format:

    12345o.fits
    54321o.fits
    ...

    Entries may also be bare OD numbers. Unless RAW mode is specified, all fits data is processed by elixir.
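
    A script that prepares such a list might normalise both kinds of entry to file names. The Python sketch below is an illustration, not adh's own parser. In particular, mapping a bare number N to the file name No.fits is an assumption based on the sample names above.

```python
def read_input_list(lines, default_suffix="o.fits"):
    """Normalise list entries to file names; bare numbers get default_suffix."""
    names = []
    for raw in lines:
        entry = raw.strip()
        if not entry:
            continue  # ignore blank lines
        if entry.isdigit():
            entry = entry + default_suffix  # assumed mapping, see the note above
        names.append(entry)
    return names

print(read_input_list(["12345o.fits\n", "\n", "54321\n"]))
```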

    If a runid is specified, all data associated with that runid is processed. Once a list of files is generated or imported, a list of the elixir master detrend files that will be used to process the data is generated, and a copy of each is placed in the spool area, which is the first value returned by adh for a distribution.

    The spool area is a directory that is used for temporary files associated with processing and media writing, and it also contains all non-fits data associated with the distribution. Next, if any of the files are standard stars acquired in queue mode, they are processed by elixir and temporarily spooled. Then each file is examined to determine the calendar nights on which the observations were taken, and a detailed report of the weather conditions for those dates is generated.

    All the steps until now are considered preliminary processes. If adh is directed to "recover" a distribution by pointing it to an existing spool area, all of the information given at the command line, including the list of files, is inferred from control data in the spool area, and any preliminary processes that have already been completed are skipped.
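
    The skip-what-is-done behaviour can be sketched with one marker per completed step. This Python example is purely illustrative: the step names and the ".done" marker files are hypothetical and do not describe the control data that adh actually keeps in the spool area.

```python
import os
import tempfile

# Hypothetical step names standing in for the preliminary processes.
PRELIMINARY_STEPS = ["detrend_list", "standard_stars", "weather_report"]

def run_preliminary(spool_dir, actions):
    """Run each step not yet marked done in spool_dir; return the steps run."""
    ran = []
    for step in PRELIMINARY_STEPS:
        marker = os.path.join(spool_dir, step + ".done")  # hypothetical marker
        if os.path.exists(marker):
            continue  # completed in an earlier run, so skip it
        actions[step]()
        with open(marker, "w") as handle:
            handle.write("ok\n")
        ran.append(step)
    return ran

with tempfile.TemporaryDirectory() as spool:
    actions = {step: (lambda: None) for step in PRELIMINARY_STEPS}
    first_run = run_preliminary(spool, actions)
    second_run = run_preliminary(spool, actions)  # behaves like a recover
print(first_run, second_run)
```

    The second call finds every marker in place and therefore runs nothing, which is the effect a recover has on preliminary work that already finished.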

    Next, each of the science images is processed with elixir and written to the distribution media. The fits header of each post-processed image is copied into the "headers" directory in the spool area, and two jpeg images, a thumbnail and a larger, binned image, are copied into the "img" directory. A tape manifest is also maintained in the spool area as a text file. Once all the science data has been processed and written to media, the master detrend data and the processed standard stars are written to the media as well. At the end of the distribution you are given the option to clean up the spool area of bulky data products so that only control data and the records of the distribution remain.

    Examples:

    adh -file files.txt dlt MEF
    Will process all the files listed in files.txt in the manner described above. They will be written to DLT tape in multiextension fits format.
    adh 01ax99 dlt MEF
    Will process all the files taken with the fits keyword RUNID set to 01ax99 in the manner described above. Note that no standard stars will be associated with a distribution of this type.
    adh -recover /data/kapu/spool/spool3
    Will look in the spool directory specified to obtain the original list of files, media type and file format. Any preliminary processing of weather, standard star, and master detrend data that is already complete will be skipped.
    adh -restore /data/pono autodlt
    Will restore the files that were stored on /data/pono from archive tapes in the dlt library. You will be prompted with options to restore only a subset of the filesystem or to duplicate the files to a host other than pono.
    adh -delete 01ax99
    Will delete all files associated with the 01ax99 runid from disk and update the database to reflect the changes.
    libh
    --print_report Prints a summary of the library's contents.
    --tape_info Prints a summary of the archive information associated with the tapes currently loaded in the library, except for the tapes currently being operated on.
    --init_slot [slot] Initialize a slot. Slot numbers may be specified individually or as a range. Note that initializing a slot is equivalent to deleting a tape as it will be free for allocation by any program.
    --allocate_tape [name] [owner] Allocates a tape for a specific name and user. A tape must be allocated before it can be used. This function returns the address of the slot that has been allocated.
    --use_tape [name] Loads a specified tape into a free tape drive. Returns the device name of the tape drive used.
    --free_tape [name] Unloads a tape from whatever tape drive it was used in. Returns the tape to the slot it was originally allocated in.
    --clean [device] [reset] Clean drives; tapes will be temporarily moved back to their slots. If a device is not specified, all drives will be cleaned. The reset flag resets the cleaning counter, which is kept in the "owner" field. Only reset the counter after actually changing the cleaning tape.
    --load Change the names of the tapes in the library.
    --help Print this message

    Libh provides a uniform interface to the dlt library. It is extensible to include other robotic autoloaders as well. The basic functions serve to move tapes to and from drives, clean drives, and maintain a database that associates tape names with locations within the library. Libh uses a locking scheme so that only one request may be made of the library at a time, and only from the applicable host. Before a tape can be used it must first be allocated, whereby the name and primary user of the tape are specified. The index number of the tape that was allocated is returned.

    Loading a tape does not require choosing a drive: you specify only the name of the tape, libh loads it into the first available device and returns the drive it used. Unloading works the same way: you specify the name of the tape to unload and it is returned to the slot indicated in the database.
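
    The allocate, use and free cycle described above can be modelled in a few lines. The following Python class is a toy in-memory model written for illustration. It is not libh's implementation, it omits locking and cleaning, and the class and attribute names are invented.

```python
class TapeLibrary:
    """Toy model of allocate / use / free; names and structure are hypothetical."""

    def __init__(self, slot_count, devices):
        self.slot_count = slot_count
        self.slot_of = {}                                  # tape name -> slot
        self.owner_of = {}                                 # tape name -> owner
        self.drives = {device: None for device in devices}  # device -> tape name

    def allocate_tape(self, name, owner):
        """Reserve the first free slot for a named tape and return that slot."""
        used = set(self.slot_of.values())
        for slot in range(self.slot_count):
            if slot not in used:
                self.slot_of[name] = slot
                self.owner_of[name] = owner
                return slot
        raise RuntimeError("no free slot")

    def use_tape(self, name):
        """Load an allocated tape into the first free drive; return the device."""
        if name not in self.slot_of:
            raise KeyError(f"{name} has not been allocated")
        for device, tape in self.drives.items():
            if tape is None:
                self.drives[device] = name
                return device
        raise RuntimeError("no free drive")

    def free_tape(self, name):
        """Unload the tape from its drive and return the slot it goes back to."""
        for device, tape in self.drives.items():
            if tape == name:
                self.drives[device] = None
                return self.slot_of[name]
        raise KeyError(f"{name} is not loaded in any drive")

library = TapeLibrary(2, ["/dev/rmt/0", "/dev/rmt/1"])
slot = library.allocate_tape("CFHT-CADC234_3d", "tarexa3d")
device = library.use_tape("CFHT-CADC234_3d")
returned_to = library.free_tape("CFHT-CADC234_3d")
print(slot, device, returned_to)
```

    The caller never names a drive or a slot when loading or unloading. Only the tape name is passed, and the library database supplies the rest.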

    The "load" option invokes an interactive process where the database entry for each tape element is reviewed. This is helpful if you are loading or unloading a large number of tapes to or from the library at one time.

    Examples:

    libh --print_report
    Writes a summary of the contents of the tape library to the terminal, including the tape name, status, owner and date of allocation.
    libh --use_tape CFHT-CADC234_3d
    Loads the named tape into the first available drive and returns the device name of that drive.
    libh --free_tape CFHT-CADC234_3d
    Returns the tape loaded in the above example to its original slot in the library.
    check_pool
    Check_pool reports a list of the runids that currently have files on disk according to the database, a summary of the state of those files, or both. The summary gives the number of files listed in the database, the number actually present on disk and the amount of disk space currently used on all storage pools by each runid.
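
    The comparison between the database and the disk can be sketched as follows. This Python example is an illustration of the idea rather than check_pool's code. The record format, a sequence of (runid, path) pairs, is an assumption, and the temporary directory stands in for a storage pool.

```python
import os
import tempfile
from collections import defaultdict

def summarise_pool(db_records):
    """db_records: iterable of (runid, path). Return per-runid counts and bytes."""
    summary = defaultdict(lambda: {"in_db": 0, "on_disk": 0, "bytes": 0})
    for runid, path in db_records:
        entry = summary[runid]
        entry["in_db"] += 1
        if os.path.isfile(path):          # listed in the database and present
            entry["on_disk"] += 1
            entry["bytes"] += os.path.getsize(path)
    return dict(summary)

with tempfile.TemporaryDirectory() as pool:
    present = os.path.join(pool, "12345o.fits")
    with open(present, "wb") as handle:
        handle.write(b"\0" * 2880)
    missing = os.path.join(pool, "54321o.fits")   # in the "database" only
    report = summarise_pool([("01ax99", present), ("01ax99", missing)])
print(report)
```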
    sql_update
    Sql_update provides a generic interface to the archive database. It has two main modes of operation: one where a single record is updated or created, and another where a list of files is parsed and the data recorded in the database. The first mode, invoked with -l, takes a sequence of keyword/value pairs, one of which must be "name" (the name of the file), and updates the corresponding record in the database. If no record exists, a new one is created. The second mode is invoked by default; it scans the list file generated by tarexad to update the database with information about when and where each file was archived.

    Examples:

    sql_update -l name="'12345o.fits'" RUNID="'01ax99'" path="'/data/loa/01ax99/12345o.fits'"
    Updates the record for 12345o.fits to make an association to a runid and a location in an online storage volume.
    sql_update
    Parses the list file generated by tarexad and records the information from each line in the database.
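
    The update-or-create behaviour of the -l mode can be illustrated with a small stand-in. The Python sketch below uses an in-memory SQLite database purely for demonstration. The archive's real database engine, table name and schema are not stated in this manual, so the "files" table and its columns are assumptions.

```python
import sqlite3

ALLOWED_COLUMNS = {"name", "RUNID", "path"}  # hypothetical schema

def update_record(conn, pairs):
    """Update the row whose name matches, or insert it if none exists."""
    if "name" not in pairs:
        raise ValueError('a "name" keyword is required')
    unknown = set(pairs) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    columns = list(pairs)
    values = [pairs[col] for col in columns]
    exists = conn.execute(
        "SELECT 1 FROM files WHERE name = ?", (pairs["name"],)
    ).fetchone()
    if exists:
        assignments = ", ".join(f"{col} = ?" for col in columns)
        conn.execute(
            f"UPDATE files SET {assignments} WHERE name = ?",
            values + [pairs["name"]],
        )
    else:
        placeholders = ", ".join("?" for _ in columns)
        conn.execute(
            f"INSERT INTO files ({', '.join(columns)}) VALUES ({placeholders})",
            values,
        )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, RUNID TEXT, path TEXT)")
update_record(conn, {"name": "12345o.fits", "RUNID": "01ax99"})          # creates
update_record(conn, {"name": "12345o.fits",
                     "path": "/data/loa/01ax99/12345o.fits"})            # updates
rows = conn.execute("SELECT name, RUNID, path FROM files").fetchall()
print(rows)
```

    Column names are checked against a fixed set before being placed in the SQL text, and all values are passed as bound parameters.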
    change_auto_media
    Change_auto_media is a utility invoked by distd when the archive tapes are full; it also includes an interactive mode for use by humans. It takes the tarexa daemon name as an argument. It unloads the old tape, allocates a new one and loads it into an available drive. If the drive needs cleaning, it cleans it. The new device name is inserted into the tarexa daemon's configuration files, which are reread when distd restarts the daemon.

    Examples:

    change_auto_media tarexa3d
    Change the tape currently used by tarexa3d. Rotate log and list files, clean the device if it is time, update the configuration files for tarexa3d with the new device name.
    change_auto_media -i tarexa3d
    Perform the operations above interactively. Debugging messages are printed to the screen and errors are trapped. At each error you are given the option to proceed or cancel the change of media.
    kill_daemon
    Kill_daemon cleanly terminates the operation of the specified daemon if it is not busy.

    Example:

    kill_daemon tarexa3d
    start_daemon
    Start_daemon initializes a daemon based on information in its parfile and starts its process.

    Example:

    start_daemon tarexa3d
    kill_archive
    Kill_archive invokes kill_daemon for every archive daemon listed in the main archive parfile.

    Example:

    kill_archive
    start_archive
    Start_archive invokes start_daemon for every archive daemon listed in the main archive parfile.

    Example:

    start_archive
    check_arch
    Check_arch reports on the status of each archive daemon listed in the main archive parfile, including the amount of storage space in the daemon's working directory and in the remote directories from which files are copied.

    Example:

    check_arch
    tape-test
    Tape-test reads each set from an archive tape to verify that it is readable. It is interactive and prompts for information that it needs.

    Example:

    tape-test
    itape
    Itape returns information about an archive tape including the range of files on the tape, the number of sets and the date the tape was started and finished.

    Example:

    itape 220
    Returns information about tape CFHT-CADC220.
    tarexah
    Tarexah is the handler used by the tarexa daemons to write files to media. Using such a handler allows the daemons to be easily modified to write to various types of media. The handler checks the tape label to ensure that it is using the correct tape, positions the tape, writes the files, rereads the files and compares them to the originals for verification, and keeps a counter of how much data has been written to the tape thus far and in how many sets.

    Example:

    The handler is always invoked by the daemon, which passes it the daemon's working directory and the device that it is currently using.
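
    The write, reread and compare cycle, together with the running counter, can be sketched as below. This Python example is not tarexah itself. A directory stands in for the tape device, label checking and positioning are left out, and the counter layout is an assumption made for the example.

```python
import filecmp
import os
import shutil
import tempfile

def write_and_verify(files, destination, counter):
    """Copy files as one set, reread each copy, and update the running counter."""
    for source in files:
        target = os.path.join(destination, os.path.basename(source))
        shutil.copyfile(source, target)
        # shallow=False forces a byte-by-byte comparison with the original.
        if not filecmp.cmp(source, target, shallow=False):
            raise IOError(f"verification failed for {source}")
        counter["bytes"] += os.path.getsize(target)
    counter["sets"] += 1
    return counter

with tempfile.TemporaryDirectory() as work, tempfile.TemporaryDirectory() as media:
    source = os.path.join(work, "528777f.fits")
    with open(source, "wb") as handle:
        handle.write(b"example data")
    totals = write_and_verify([source], media, {"bytes": 0, "sets": 0})
print(totals)
```

    Raising on the first mismatch keeps a bad copy from being counted as archived, which is the purpose of the verification pass.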

    tarexa_init
    Tarexa_init is a small handler used by the tarexa daemons to initialize and label new media before use.

    Example:

    The handler is always invoked by the daemon, which passes it the daemon's working directory, the device that it is currently using and the new label for the media.


    Kanoa
    Last modified: Fri Nov 23 12:59:39 HST 2001