Re: e-transfer from terapix

From: Frederic Magnard <magnard_at_iap.fr>
Date: Wed, 21 Mar 2007 02:47:43 +0100 (CET)

Hi JJ,

On Sat, 17 Mar 2007, JJ Kavelaars wrote:
> To get e-transfer setup can you please provide a directory that will be OWNED
> by cadc and will have the new/replace/... directories included.

clix.iap.fr:/data/clix/fc6/cadc/
This directory has 1.2 TB free, and is on a Fiber Channel area.
You can create whatever staging directories you wish within /data/clix/fc6/cadc/.
NB: the former cadc directory was on fc3, which is out of order now.

> We also
> request that you 'batch' the placement of the sym-links so that we don't have
> a process that is trying to get more than about 100Gbytes of data in one
> step, just to be sure we aren't making mistakes.

Couldn't this check be done at the e-transfer level? That would save us an
extra layer on our side.
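
If the batching does end up on the sending side, it could look roughly like
the sketch below. This is only an illustration: the function name, the
(name, size) input format and the reading of "about 100Gbytes" as 100 decimal
GB are assumptions, not part of e-transfer.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumption: "about 100Gbytes" is read as 100 * 10**9 bytes.
BATCH_LIMIT_BYTES = 100 * 10**9


def batch_by_size(files: Iterable[Tuple[str, int]],
                  limit: int = BATCH_LIMIT_BYTES) -> Iterator[List[str]]:
    """Yield lists of file names whose sizes sum to at most `limit` bytes.

    `files` is an iterable of (name, size_in_bytes) pairs. A single file
    larger than `limit` is emitted as a batch of its own.
    """
    batch: List[str] = []
    total = 0
    for name, size in files:
        # Close the current batch before it would exceed the limit.
        if batch and total + size > limit:
            yield batch
            batch, total = [], 0
        batch.append(name)
        total += size
    if batch:
        yield batch
```

Each yielded list would then be sym-linked into the staging area as one step,
with the next one placed only once the previous one has been picked up.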

Step1 has 15850 Rice-compressed weightmaps, 4.14 TiB of data in total.
Some of those weightmaps are the same as in T0003, with the same name, e.g.
716303p_weight.fits.fz.
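
To put those figures next to the requested limit, here is a quick
back-of-the-envelope calculation. It assumes the limit is 100 decimal GB per
step and that files pack evenly into batches, so the batch count is a lower
bound.

```python
import math

N_FILES = 15850                   # weightmaps in step1
TOTAL_BYTES = 4.14 * 2**40        # 4.14 TiB
BATCH_LIMIT_BYTES = 100 * 10**9   # assumption: "about 100Gbytes" = 100 GB

# Average size of one compressed weightmap, in MiB.
mean_mib = TOTAL_BYTES / N_FILES / 2**20

# Minimum number of batches if each stays under the limit.
n_batches = math.ceil(TOTAL_BYTES / BATCH_LIMIT_BYTES)

print(f"mean file size: {mean_mib:.0f} MiB, batches needed: {n_batches}")
```

That comes to roughly 274 MiB per file and at least 46 batches for the whole
step.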

> A work around for you is to place all sym-links into the new area. If we
> already have a copy with that name we will put the sym link into an area like
> rejected/not-new. Then you can move it to replace. We will check the replace
> area file and if we have one with the same CRC then we'll put the symlink into
> the directory rejected/not-replace.

OK, so the CRC is computed only at this stage, right? That saves computing
resources and bandwidth.
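
As I understand the workflow described above, it amounts to something like the
following sketch. The function and argument names are hypothetical, and the
handling of a file placed in replace that the archive does not hold is an
assumption, since that case is not covered above.

```python
from typing import Callable, Dict


def route_symlink(area: str, name: str,
                  archive_crcs: Dict[str, str],
                  compute_crc: Callable[[str], str]) -> str:
    """Return where a sym-link placed in `area` would end up.

    `archive_crcs` maps file names already archived to their CRC.
    `compute_crc` is only called in the replace stage.
    """
    if area == "new":
        # Name check only: no CRC is computed at this stage.
        return "rejected/not-new" if name in archive_crcs else "accepted"
    if area == "replace":
        # Assumption: a name unknown to the archive is simply accepted.
        if name not in archive_crcs:
            return "accepted"
        # Same name and same CRC: nothing to replace.
        if compute_crc(name) == archive_crcs[name]:
            return "rejected/not-replace"
        return "accepted"
    raise ValueError(f"unknown staging area: {area!r}")
```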

> Alternatively, I can provide you with our 'CRC' of the file and you can
> compare our 'CRC' to the value you get by running our crc-generator on the file.
> The binary for our crc generator is on your machine at
>
> clix.iap.fr:/home/nis/cadc/bin/cadcCRC

Thanks. Could we please have the source code of this program?
Do you plan to switch to a checksum like md5? It's almost 2 times faster,
and the resulting hash is 4 times longer. Did you ever have collision
problems with this CRC?
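
For comparison, here is a minimal sketch that computes both kinds of checksum
in a single pass over a file. The algorithm used by cadcCRC is not known here,
so zlib's CRC-32 is only a stand-in for it; "4 times longer" corresponds to a
32-bit CRC against the 128-bit md5 digest.

```python
import hashlib
import zlib
from typing import Tuple


def checksums(path: str, chunk_size: int = 1 << 20) -> Tuple[str, str]:
    """Return (crc32_hex, md5_hex) for the file at `path`.

    The file is read in chunks so that large FITS files do not have to fit
    in memory. CRC-32 is a stand-in: cadcCRC may use a different algorithm.
    """
    crc = 0
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # running CRC over all chunks
            md5.update(chunk)
    return f"{crc & 0xFFFFFFFF:08x}", md5.hexdigest()
```

The CRC comes back as 8 hex digits and the md5 as 32.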

> I can add the CRC values we produce to the megaprime_proxy table.

Yes, that would be great. I can then move only the needed files to new.
Could you please also explain again how to use it (and its output format),
as I guess it might have changed a bit?

The list of weight images is ready, just waiting to be filtered to eliminate
already transferred files. This time, the links will point to NFS-mounted
filesystems. That is why I asked to have, in the near future, one
e-transfer daemon per machine hosting the data.
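
Once the CRC values are available, the filtering could be sketched as below.
The two dictionaries are hypothetical inputs: one built from the local file
list with locally computed CRCs, the other from whatever the archive side
exports.

```python
from typing import Dict, List


def plan_transfer(local_crcs: Dict[str, str],
                  archive_crcs: Dict[str, str]) -> Dict[str, List[str]]:
    """Split local files into 'new', 'replace' and 'skip' lists.

    Both arguments map file names to CRC strings produced by the same
    generator, so that equal content gives equal values.
    """
    plan: Dict[str, List[str]] = {"new": [], "replace": [], "skip": []}
    for name, crc in sorted(local_crcs.items()):
        if name not in archive_crcs:
            plan["new"].append(name)       # never transferred
        elif archive_crcs[name] == crc:
            plan["skip"].append(name)      # identical copy already archived
        else:
            plan["replace"].append(name)   # same name, different content
    return plan
```

Only the names under "new" and "replace" would then get sym-links in the
corresponding staging directories.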

The proxy http://cadcwww.hia.nrc.ca/cadcbin/cfhtInfo still seems to be
active. Could you please explain how to use it, in case it is useful? It
looks like it can return the CRC too.

The csv dump of image_name, grade, comment is in
/data/clix/fc6/cadc/new/grade_comments_step1_T0004.csv

Cheers,
Fred.
Received on Tue Mar 20 2007 - 15:47:52 HST
