checksum in archival workflow?

COW Forums : Archiving and Back-Up

Paul Dougherty
checksum in archival workflow?
on Jul 18, 2016 at 2:57:23 pm

I'm trying to help a colleague with a fairly large archiving/digitizing project covering a career's worth of work. As producers go he is pretty technical, but not a hands-on techie per se. We can use ShotPut Pro to copy all files and provide checksum verification (and a checksum value for future reference) every time an asset is copied. I don't think his scale and needs warrant LTO tape.

Most likely the digitized NTSC "master" files will get stored on a minimum of two drives, each in a different location. Good-quality access copies (aka screeners) will get stored in the cloud. Once we accomplish this, it's hard to know what my parting words (or memo) about re-verification and auditing of his archives should be.

I know there are rules of thumb that files should get migrated to new drives every so many years. But trickier still for me is suggesting a re-verification regime for a non-techie. Suggestions?

For years I have used CDFinder (now NeoFinder) to ride herd on file collections spanning many hard drives. I don't have the latest NeoFinder, but it offers FileCheck, which seems to address this issue (see below). This seems like it might be a great fit, but I would love to hear from others.

Thanks in advance for any suggestions.

Paul


http://www.cdfinder.de/en/en/filecheck.html

If you verify the FileCheck values for an entire catalog, CDFinder will even show you a window containing all files that did NOT pass the check, so you know exactly which files are damaged and need to be replaced. Of course, CDFinder also displays the actual MD5 value for every file in the Inspector.
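For anyone who wants to see what this kind of periodic audit amounts to, here is a minimal Python sketch. It is not how CDFinder/NeoFinder implements FileCheck; the function names (`file_md5`, `write_manifest`, `verify_manifest`) and the one-line-per-file manifest layout are assumptions made for illustration. The idea is to record an MD5 value for every file once, then re-check the drive against that record later and list anything missing or changed.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root, manifest_path):
    """Record one 'digest  relative/path' line for every file under root."""
    root = Path(root)
    manifest_path = Path(manifest_path)
    lines = []
    for path in sorted(root.rglob("*")):
        # Skip directories, and skip the manifest itself if it lives under root.
        if path.is_file() and path.resolve() != manifest_path.resolve():
            relative = path.relative_to(root).as_posix()
            lines.append(f"{file_md5(path)}  {relative}\n")
    manifest_path.write_text("".join(lines), encoding="utf-8")


def verify_manifest(root, manifest_path):
    """Return the relative paths that are missing or no longer match the manifest."""
    root = Path(root)
    failed = []
    for line in Path(manifest_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, relative = line.split("  ", 1)
        target = root / relative
        if not target.is_file() or file_md5(target) != expected:
            failed.append(relative)
    return failed
```

Under these assumptions, `write_manifest` would be run once when the archive drive is created, and `verify_manifest` on whatever schedule is agreed (say, once a year); an empty list means every file still matches. Keeping a copy of the manifest somewhere other than the drive it describes would guard against losing both at once.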



Tim Jones
Re: checksum in archival workflow?
on Jul 18, 2016 at 5:10:01 pm

Hi Paul,

Are you strictly working with disk or is LTO coming into this process?

Tim
--
Tim Jones
CTO - TOLIS Group, Inc.
http://www.tolisgroup.com
BRU ... because it's the RESTORE that matters!



Paul Dougherty
Re: checksum in archival workflow?
on Jul 18, 2016 at 7:04:47 pm

Hi Tim. Though I can't absolutely guarantee it forever, I'd say no LTO on this project. And even if that should change, I expect to work with clients who have small collections and will never employ LTO. So I'd still be seeking an answer for a no-LTO scenario.

Thanks,

Paul




Tim Jones
Re: checksum in archival workflow?
on Jul 18, 2016 at 7:17:48 pm

In that case, I would recommend using something like rsync or rsyncX (rsync with a GUI), since they checksum each file automatically as it is copied to the destination. The effect is the same as running an MD5 on the source end, copying the files to the destination end, re-running the MD5 on the destination copy, and then comparing the two MD5 results.
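To make the manual sequence concrete, here is a minimal Python sketch of "MD5 the source, copy, MD5 the destination, compare". It is not what rsync or ShotPut Pro do internally; `md5_of` and `copy_and_verify` are hypothetical helper names chosen for this example.

```python
import hashlib
import shutil


def md5_of(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy one file, then confirm the copy's MD5 matches the source's MD5."""
    source_md5 = md5_of(source)
    shutil.copy2(source, destination)  # copy2 also tries to preserve timestamps
    destination_md5 = md5_of(destination)
    if source_md5 != destination_md5:
        raise IOError(
            f"Checksum mismatch for {destination}: "
            f"{source_md5} (source) != {destination_md5} (copy)"
        )
    # Returned so the caller can store the value for future reference.
    return destination_md5
```

One caveat with a sketch like this: the second read may be served from the operating system's cache rather than from the destination disk itself, so re-checking the drive later, after it has been unmounted and remounted, is a stronger test.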

Any other mechanism would involve manual processes to generate the checksums, and that can lead to the loss of the sidecar MD5 values, leaving you with no way to verify the copied files.

As an aside, you could buy an LTO Thunderbolt solution and use it to provide the LTO side of the equation as a service for the users regardless of their size. LTO-6 tapes are only around $30 each, so there's a potential for a new service offering for your customers.

Tim



Paul Dougherty
Re: checksum in archival workflow?
on Jul 18, 2016 at 8:30:10 pm

Thanks Tim,

I have to admit that the advantages of the rsyncX suggestion went over my head, especially as compared to ShotPut Pro.

Best,

Paul



Tim Jones
Re: checksum in archival workflow?
on Jul 18, 2016 at 9:50:48 pm

Sorry - I totally missed that you were using ShotPut Pro for the offloading and copies (I'm still trying to get used to the new COW forum look and workflow :) ).

The difference is that rsync/rsyncX do the checksumming transparently, so any errors are caught at the point of copy rather than after the fact, and without needing a separate sidecar checksum database or file.

Tim

