Re: e-transfer from terapix

From: kanoa <kanoa_at_cfht.hawaii.edu>
Date: Wed, 21 Mar 2007 11:03:40 -1000

Hi JJ,

We have not been using the FITS standard checksums on science data,
though we do use them for metadata tables.

I'm a little concerned about the 'standard' method in that the checksums
are inserted into the header, which is fine until you compress the file.
If you use a tile compression such as Rice, the headers are still
readable and still carry checksum values, but those values no longer
match the compressed data. We would also like to have checksums of the
compressed files, as we do now, because that is the format we usually
transmit data in.

So what do we do? Update the checksums after compressing? Then they will
be invalid when we decompress the file, and we would be making an
unreasonable assumption about data integrity through the codec.

We could record the original checksum values to compare with the
decompressed file later, but we would first have to update the checksum
header values in the decompressed file, since we cannot trust that they
still represent the actual data. Then we would need to compare header
and data checksums for each extension: 72 checksums for each MegaCam
image.
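
To illustrate the bookkeeping this implies, here is a minimal Python
sketch. It is not something we run: the function names are hypothetical,
each extension is represented as a plain (header_bytes, data_bytes)
pair, and MD5 digests stand in for the FITS CHECKSUM/DATASUM values.

```python
import hashlib

def part_digests(extensions):
    """Map (extension index, part) -> MD5 hex digest.

    `extensions` is a hypothetical list of (header_bytes, data_bytes)
    pairs, one per FITS extension. MD5 is only a stand-in for the
    FITS CHECKSUM/DATASUM values.
    """
    digests = {}
    for index, (header, data) in enumerate(extensions):
        digests[(index, "header")] = hashlib.md5(header).hexdigest()
        digests[(index, "data")] = hashlib.md5(data).hexdigest()
    return digests

def mismatches(recorded, recomputed):
    """Return the sorted keys whose digests differ or are missing."""
    keys = set(recorded) | set(recomputed)
    return sorted(k for k in keys if recorded.get(k) != recomputed.get(k))
```

The idea would be to call part_digests() before compression, keep the
result outside the file, and pass it to mismatches() together with the
digests recomputed from the decompressed file. For 36 (header, data)
pairs that is 72 digests to store and compare.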

I don't see an easy way out of the mess, which is why I've been taking
md5sums of the uncompressed file and storing them externally. Maybe we
could store the CRC of the compressed file in addition? The CRC would
help us track the file in our systems and the md5 would help users who
have decompressed the file.
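
A rough Python sketch of what such an external record might look like
follows. The assumptions are loud ones: the function and field names
are made up for illustration, and zlib.crc32 (the common CRC-32) stands
in for whatever CRC is actually used, which is not the cadcCRC
algorithm.

```python
import hashlib
import zlib

def crc32_hex(payload):
    """CRC-32 of `payload` as 8 hex digits (stand-in for the real CRC)."""
    return format(zlib.crc32(payload) & 0xFFFFFFFF, "08x")

def integrity_record(raw, compressed):
    """Build an external record for one file.

    `raw` is the uncompressed file content and `compressed` is the form
    that is actually transmitted; both are bytes objects.
    """
    return {
        "md5_uncompressed": hashlib.md5(raw).hexdigest(),
        "crc32_compressed": crc32_hex(compressed),
    }

def verify_transfer(record, received):
    """Check the compressed file as received against the stored CRC."""
    return crc32_hex(received) == record["crc32_compressed"]

def verify_decompressed(record, raw):
    """Check a decompressed copy against the stored MD5."""
    return hashlib.md5(raw).hexdigest() == record["md5_uncompressed"]
```

Neither value lives in the FITS header, so compressing or decompressing
the file never invalidates the record: verify_transfer() covers the
file as we ship it, and verify_decompressed() covers what the user ends
up with.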

-Kanoa

JJ Kavelaars wrote:
>
> On 20-Mar-07, at 7:34 PM, kanoa wrote:
>
>>
>> I'll second this; it would be nice to use a widely available hash
>> format so we (CFHT) and other users can confirm data integrity. md5
>> would be ideal since we already track md5 signatures for all CFHT data.
>>
>> -Kanoa
>
>
> BTW Kanoa, the cadcCRC binary is on the machines at CFHT also. However,
> as mentioned earlier, we are moving to MD5 and also looking at
> complying with the FITS-standard initiative of CRC in the header.
>
> Has CFHT been looking at following the FITS standard on checksums?
>
> JJ
>
>
>
>
>>
>>
>> Frederic Magnard wrote:
>>
>>>> Alternatively, I can provide you with our 'CRC' of the file and you
>>>> can compare our 'CRC' to the value you get by running our
>>>> crc-generator on the file.
>>>> The binary for our crc generator is on your machine at
>>>>
>>>> clix.iap.fr:/home/nis/cadc/bin/cadcCRC
>>>
>>> Thanks. Could we please have the source code of this program?
>>> Do you plan to switch to a checksum like md5? It's almost 2 times
>>> faster, and the resulting hash is 4 times longer. Did you ever have
>>> collision problems with this CRC?
>>
>>
Received on Wed Mar 21 2007 - 11:03:41 HST
