I realize the original post is rather old, but I think this information can still be valuable to anyone looking for a way to verify that files were copied correctly. Rsync might be the best method to copy data, and the answers given in this thread are good; for those less experienced with Linux, however, I will try to give a more detailed explanation.
Scenario: You just copied data from one disk to another, with lots of sub-directories and files. You want to verify that all the data was copied correctly.
First check that md5deep is installed by issuing the command md5deep -v.
If you get a message saying something like 'command not found', install md5deep with apt-get install md5deep (on Debian or Ubuntu; this usually has to be run as root, for example via sudo).
This answer assumes you only want to deal with regular files. If you want to deal with other types of files, refer to the -o flag in the md5deep manual (man md5deep).
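The check above can also be scripted. This is only a minimal sketch: the variable name md5deep_status is illustrative, and the install hint assumes a Debian-based system with apt-get.

```shell
# Report whether md5deep is on the PATH, without actually running it.
if command -v md5deep >/dev/null 2>&1; then
    md5deep_status="installed"
else
    md5deep_status="missing"
    # Assumption: Debian/Ubuntu; installing usually needs root.
    echo "md5deep not found; try: sudo apt-get install md5deep"
fi
echo "md5deep is $md5deep_status"
```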
Now you are good to go. We assume that you copied files from /mnt/original to /mnt/backup; substitute whichever directories you are using.
First change to the source directory, that is, the original location of the files you copied or backed up:
cd /mnt/original
Then make a checksum of each file:
md5deep -rel -o f . >> /tmp/checksums.md5
This command explained:
-r enables recursive mode
-e displays progress indicator
-l enables relative file paths
-o f only works on regular files (not block devices, named pipes, etc.)
. tells md5deep to start in the current directory.
>> /tmp/checksums.md5 tells the shell to append all output to /tmp/checksums.md5.
Note: because >> appends, entries from an earlier run stay in /tmp/checksums.md5. If you want to overwrite the previous content instead, use > and not >>.
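The difference between the two redirection operators is easy to try on a throwaway file. The sketch below uses a temporary file created with mktemp rather than a real checksum file; the variable names are illustrative.

```shell
# Demonstrate >> (append) versus > (overwrite) on a temporary file,
# so that no real checksum file is touched.
demo_file=$(mktemp)

echo "entry from first run"  >> "$demo_file"
echo "entry from second run" >> "$demo_file"
lines_after_append=$(wc -l < "$demo_file" | tr -d ' ')

echo "entry from third run" > "$demo_file"
lines_after_overwrite=$(wc -l < "$demo_file" | tr -d ' ')

echo "after two >> runs: $lines_after_append line(s)"
echo "after one > run:   $lines_after_overwrite line(s)"
rm -f "$demo_file"
```

With >>, running the checksum command twice would list every file twice, which is why > is the safer choice when you are starting over.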
Note that this command can take quite a while, depending on the I/O speed and the amount of data. You could experiment with nice and/or ionice to adjust the CPU and I/O scheduling priority of md5deep, but that's outside the scope of this answer.
When the checksums have been created, you have a file with entries similar to:
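For readers who do want to try it, here is a minimal sketch of a wrapper that runs a command at low priority. The function name run_low_priority is hypothetical. It lowers the priority so the job interferes less with other work; it should not be expected to make the hashing itself faster. It falls back to nice alone where ionice is missing or not permitted.

```shell
# Hypothetical wrapper: run any command at the lowest CPU priority and,
# where ionice is available and permitted, in the idle I/O class.
run_low_priority() {
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true 2>/dev/null; then
        ionice -c 3 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}

# Example (not executed here):
#   run_low_priority md5deep -rel -o f . > /tmp/checksums.md5
```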
69c0a826b29c8f40b7ca5e56e53d7f83  ./oldconfig-11-09-2013/etc2/apm/event.d/20hdparm
651f3c7f79a14332f9fa7bb368039210  ./oldconfig-11-09-2013/etc2/apm/event.d/anacron
50d89784c1e201f68ff978b95ff4bdfb  ./oldconfig-11-09-2013/etc2/apm/scripts.d/alsa
e9b9131660a8013983bc5e19d7d669eb  ./oldconfig-11-09-2013/etc2/ld.so.cache
The first column is the MD5 checksum, and the second column is the relative path of the file the checksum belongs to.
If you want to see how many files are listed in the checksum file, issue the command:
wc -l /tmp/checksums.md5
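If you ever need to process such a list in a script, the two columns can be split with the shell's read builtin. The sketch below assumes each entry sits on one line, checksum first and path second, and uses two of the entries above as sample data in a temporary file.

```shell
# Split "hash  path" entries into their two columns.
sample_list=$(mktemp)
cat > "$sample_list" <<'EOF'
69c0a826b29c8f40b7ca5e56e53d7f83  ./oldconfig-11-09-2013/etc2/apm/event.d/20hdparm
e9b9131660a8013983bc5e19d7d669eb  ./oldconfig-11-09-2013/etc2/ld.so.cache
EOF

# read puts the first field in $hash and the rest of the line in $path,
# so spaces inside a path are kept (trailing whitespace is trimmed).
parsed=$(while read -r hash path; do
    printf '%s -> %s\n' "$path" "$hash"
done < "$sample_list")
echo "$parsed"
rm -f "$sample_list"
```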
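That count is most useful when compared with the number of regular files actually present. A minimal sketch follows; the helper names are hypothetical, and file names containing newlines would be miscounted.

```shell
# Hypothetical helpers for comparing the number of regular files in a
# directory tree with the number of entries in a checksum list.
count_regular_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

count_list_entries() {
    wc -l < "$1" | tr -d ' '
}

# Example (not executed here), run from the source directory:
#   [ "$(count_regular_files .)" -eq "$(count_list_entries /tmp/checksums.md5)" ] \
#       && echo "counts agree"
```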
Now, you want to check that the copied data is correct:
cd /mnt/backup
md5deep -o f -reX /tmp/checksums.md5 . >> /tmp/compare.result
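Apart from the output file and the dropped -l, the only difference from when we created the checksums is -X, which takes the checksum file as its argument and displays the current hash of any file whose hash is not found in checksums.md5. So at the end of the test, if /tmp/compare.result is empty, you can trust that the files in /mnt/backup were copied correctly, since their checksums match. Because >> appends, make sure /tmp/compare.result does not already hold output from an earlier run.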
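Checking whether the result file is empty can be scripted with the shell's -s test, which is true for a file that exists and is not empty. The function name report_result is hypothetical.

```shell
# Hypothetical helper: summarize a comparison result file.
# Returns 0 when the file is empty, 1 when it lists mismatches.
report_result() {
    if [ -s "$1" ]; then
        echo "MISMATCH: $(wc -l < "$1" | tr -d ' ') file(s) differ"
        return 1
    else
        echo "OK: no mismatches recorded"
    fi
}

# Example (not executed here):
#   report_result /tmp/compare.result
```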
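Note that md5deep's matching mode compares hashes, not paths: a file in /mnt/backup passes if its hash appears anywhere in /tmp/checksums.md5. As a result, files that are listed in /tmp/checksums.md5 but missing from /mnt/backup are not reported, and an additional file in /mnt/backup shows up in the result only if its content does not match any listed hash. Comparing the number of files on both sides is a useful extra check.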
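If you also want a path-by-path check that reports missing files and lists extra ones, one alternative is md5sum from GNU coreutils instead of md5deep. This is a swapped-in technique, not what the commands above do. The function names below are hypothetical, the manifest path must be absolute because each function changes directory, and file names containing newlines or backslashes are not handled.

```shell
# Stand-in for md5deep: GNU md5sum, find, sort, cut and comm.

# make_manifest SOURCE_DIR MANIFEST_FILE
make_manifest() {
    (cd "$1" && find . -type f -exec md5sum {} + | sort -k 2) > "$2"
}

# verify_manifest TARGET_DIR MANIFEST_FILE
# Prints only failures (changed or missing files); non-zero exit on any.
verify_manifest() {
    (cd "$1" && md5sum -c --quiet "$2")
}

# list_extra_files TARGET_DIR MANIFEST_FILE
# Prints files present in TARGET_DIR that the manifest does not mention.
list_extra_files() {
    known_paths=$(mktemp)
    present_paths=$(mktemp)
    # An md5sum line is 32 hex digits plus two separator characters,
    # so the path starts at column 35.
    cut -c 35- "$2" | LC_ALL=C sort > "$known_paths"
    (cd "$1" && find . -type f | LC_ALL=C sort) > "$present_paths"
    LC_ALL=C comm -13 "$known_paths" "$present_paths"
    rm -f "$known_paths" "$present_paths"
}

# Example (not executed here):
#   make_manifest SOURCE_DIR /tmp/manifest.md5
#   verify_manifest /mnt/backup /tmp/manifest.md5
#   list_extra_files /mnt/backup /tmp/manifest.md5
```

Because md5sum -c looks each file up by its recorded path, a file that is missing from the target is reported as a failure rather than silently skipped.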
Notes:
You don't necessarily have to use redirection to store output files. Refer to the md5deep manual for further information.
You might have to run md5deep commands as root, depending on the permissions of the files you're handling.