
A couple of weeks ago I bought a "2Tb" thumb drive of uncertain origin, with the intention of using it as intermediate storage when rebuilding a system (moving forwards three Debian releases).

Can anybody suggest an efficient way of verifying the actual size of this, i.e. that it actually has "2Tb" of Flash rather than a single "500Mb" device repeating across the storage space?

I'd like to emphasise that I am fully aware of the liberties that manufacturers have long taken when stating capacities, and that my "2Tb" drive is likely to have a maximum real capacity of something like 1.75TiB.

It was originally formatted with unpartitioned exFAT, and while my usual test program would write more than 1Tb of test data to it, it invariably glitched at some random point before reaching the read pass which would verify that the block numbers had actually been retained. While that could point to flakiness in the drive's microcontroller, the problem might equally lie in the comparatively new exFAT support on Linux.

I am able to use gparted to partition and reformat as ext4 or ext2 without error.

Trying to manually run mke2fs with the -cc option for a read/write block test is taking about 80 hours per 1% of the drive. In addition, I've seen no explicit confirmation that it performs the two separate passes which would be needed to verify the size unambiguously.

I've not yet tried running my own test program (which I trust on smaller media, at the tens-of-Gb scale) against this device formatted as ext2.

In cases where my test program is applied to a block device rather than to a file, I could possibly improve efficiency by adding a --sparse option which writes only the block number into e.g. each 4K block tested. This probably wouldn't help if the target were a test file, since (a) the OS might not allocate space for unwritten areas of sparse files and (b) there would be so many layers of translation involved that it would be virtually impossible to hit the Flash device's block boundaries.
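
A minimal sketch of the sort of sparse two-pass test I have in mind (illustrative only, not my actual program — the stride, the 20-byte stamp and the demo image are all arbitrary choices): stamp one 4 KiB block every STRIDE blocks with its own block number, then re-read and compare. A wrapped device overwrites earlier stamps, so the read pass fails.

```shell
# Sparse two-pass capacity test (sketch). Defaults to a throwaway
# 1 MiB image file; pass a device name to run it for real.
TARGET=${1:-demo.img}   # real use: /dev/sdX (destructive!)
STRIDE=${2:-16}         # real use: e.g. 25600 = one sample per 100 MiB
# For the demo only, create a 1 MiB image if no device was given.
[ -e "$TARGET" ] || dd if=/dev/zero of="$TARGET" bs=4096 count=256 status=none
SIZE=$(blockdev --getsize64 "$TARGET" 2>/dev/null || stat -c %s "$TARGET")
BLOCKS=$(( SIZE / 4096 ))

i=0
while [ "$i" -lt "$BLOCKS" ]; do    # write pass: stamp block number
    printf '%020d' "$i" |
        dd of="$TARGET" bs=4096 seek="$i" count=1 conv=sync,notrunc status=none
    i=$(( i + STRIDE ))
done
sync

bad=0
i=0
while [ "$i" -lt "$BLOCKS" ]; do    # verify pass: re-read and compare
    want=$(printf '%020d' "$i")
    got=$(dd if="$TARGET" bs=4096 skip="$i" count=1 status=none | head -c 20)
    [ "$got" = "$want" ] || { echo "mismatch at block $i"; bad=1; }
    i=$(( i + STRIDE ))
done
[ "$bad" -eq 0 ] && echo "all sampled blocks verified"
```

Invoked as e.g. `sh sparse-test.sh /dev/sdX 25600` it would sample one 4 KiB block per 100 MiB across the whole device, which is what makes the sparse approach so much faster than an exhaustive write.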

Any suggestions would be appreciated.

  • Just dd /dev/zero to the raw device and see how far it gets. It's as simple as that. – Bib Aug 18 '22 at 09:33
  • 2
    @Bib that won’t verify that the storage isn’t “duplicated” (_e.g._ 2TiB provided by having 512GiB of storage and wrapping). – Stephen Kitt Aug 18 '22 at 09:34
  • That wouldn't work if the first block (numbered zero) was overwritten when the 500 millionth block was written and so on. After writing the entire device, it's necessary to explicitly go back and check that all blocks are correct. – Mark Morgan Lloyd Aug 18 '22 at 09:35
  • @StephenKitt Then the only way is to strip it down, x-ray it and start analysing it. dd'ing it is about as reliable as you are going to get. You could always use /dev/random and create a 2TB file of it first, then dd it to the device then back again and compare. – Bib Aug 18 '22 at 09:37
  • [H2testw](https://www.heise.de/download/product/h2testw-50539) comes to mind, but that's only available for Windows afaik. Has nobody implemented this for linux yet? – Panki Aug 18 '22 at 09:39
  • 3
    Seems like there is something for linux called `f3`: https://askubuntu.com/questions/737473/check-real-size-of-usb-thumb-drive – Panki Aug 18 '22 at 09:42
  • @Bib `dd` may well be as reliable as it gets (it isn’t, but that doesn’t matter here), that doesn’t mean that a test using it is reliable ;-). – Stephen Kitt Aug 18 '22 at 09:50
  • @StephenKitt I fail to see how this can fail checks... `dd if=/dev/random of=testfile1 bs=1M count=2100000; dd if=testfile1 of=/dev/sdX bs=1M; dd if=/dev/sdX of=testfile2 bs=1M`, then just compare testfile1 & testfile2. Wrapping will not hide the diffs. Sure it could come back with other errors, but... – Bib Aug 18 '22 at 09:55
  • @Bib you should never use `bs` and `count` without `iflag=fullblock`, that’s where `dd` is unreliable (run your command and see how large `testfile1` actually is). The unreliable test I was referring to however was your suggestion to “dd /dev/zero to the raw device”, not your suggestion to compare (which is fine). – Stephen Kitt Aug 18 '22 at 09:59
  • @Bib your solution with 2x test files is of course adequate although in principle the second pass could be achieved by using cksum etc. However it /does/ require >= 2Tb of disc storage, and its performance is at the mercy of the efficiency of /dev/random etc. – Mark Morgan Lloyd Aug 18 '22 at 10:00
  • @Panki thanks for that, f3 looks interesting at first glance and I will continue investigating. Please add that as an answer so that if nothing better comes along I can accept it. – Mark Morgan Lloyd Aug 18 '22 at 10:02
  • @StephenKitt Don't you just love posts here, everyone is guilty of ambiguity... – Bib Aug 18 '22 at 10:12
  • @Bib in what way is the question ambiguous? – Mark Morgan Lloyd Aug 18 '22 at 10:15
  • @MarkMorganLloyd So sodding what!!! That argument has zero validity. I don't think the op cares if it takes a few hours to determine whether the drive is kosher or not. /dev/random on my system is producing 10GB of data at around 90MB/s, hardly slow... – Bib Aug 18 '22 at 10:17
  • @MarkMorganLloyd The question is not ambiguous, it's Stephen's reply, and my initial comment, which he then clarified. – Bib Aug 18 '22 at 10:19
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/138604/discussion-between-mark-morgan-lloyd-and-bib). – Mark Morgan Lloyd Aug 18 '22 at 10:30
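
For the record, the round-trip comparison suggested in the comments, with the `iflag=fullblock` correction applied, can be sketched as below. The device name and size are placeholders; it defaults to a small file-backed demo, so scale `MB` up to the claimed capacity for a genuine test.

```shell
# Round-trip comparison (sketch). Defaults to a file-backed demo;
# point it at /dev/sdX (destructive!) with the real capacity in MiB.
DEV=${1:-disk.img}   # real use: /dev/sdX
MB=${2:-8}           # real use: claimed capacity in MiB
[ -e "$DEV" ] || dd if=/dev/zero of="$DEV" bs=1M count="$MB" status=none

# iflag=fullblock matters: without it dd may accept short reads from
# /dev/urandom and silently produce an undersized test file.
dd if=/dev/urandom of=testfile1 bs=1M count="$MB" iflag=fullblock status=none
dd if=testfile1 of="$DEV" bs=1M conv=notrunc status=none
dd if="$DEV" of=testfile2 bs=1M count="$MB" status=none

if cmp -s testfile1 testfile2; then
    echo "read-back matches: no wrapping detected"
else
    echo "read-back differs: device is smaller than claimed (or faulty)"
fi
```

As noted above, this needs as much spare disc space as the device claims to hold, and its speed is limited by the random-number source as well as by the drive itself.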

1 Answer


I found a tool called f3 (fight flash fraud) which appears to do this.

There even seems to be a Qt GUI for it.

Github

Documentation

A quote from the readme:

Quick capacity tests with f3probe

f3probe is the fastest drive test and suitable for large disks because it only writes what's necessary to test the drive. It operates directly on the (unmounted) block device and needs to be run as a privileged user:

./f3probe --destructive --time-ops /dev/sdX

Warning

This will destroy any previously stored data on your disk!

Panki
  • 2
    I've run the basic f3probe a couple of times, which reports that it's a 32Gb device with a consistent block count (but a couple of other details varying). I'm currently reformatting the filesystem on the assumption that the destructive test has corrupted at least one inode table and will report on the result of f3write/f3read. – Mark Morgan Lloyd Aug 18 '22 at 17:26
  • 1
    f3write claimed to have put 93Gb of data on the formatted filesystem before terminating in good order, and still claimed "Free space: 1.79 TB" with no error indication. This is not a combination which inspires confidence in the test program. TBC. – Mark Morgan Lloyd Aug 22 '22 at 15:14
  • f3read started off reporting that roughly 5% of each file was corrupt, then at the 30Gb point flipped to roughly 95% corrupt... both cases also had sporadic I/O errors. The summary said that roughly 30Gb was OK and roughly 90Gb was corrupt. Even if not stated explicitly, the behaviour change at around 30Gb suggests that this is a roughly 30Gb device. – Mark Morgan Lloyd Aug 23 '22 at 07:43
  • Final comment for the record. As above, f3 suggested that "something" happened at around 30Gb but results were inconclusive. My own test program, modified to test blocks sparsely which means that a "2Tb" device can be written in 2.5 hours, shows large numbers of single-bit errors with the first recognisable verify failure around 8Gb in after roughly 3 hours reading. The device is obviously dud, definitely isn't 2Tb, and might be as small as 32Gb. I'm left feeling good about f3, I can't publish my own program due to possible IP ownership issues... besides which, it's written in Pascal :-) – Mark Morgan Lloyd Aug 30 '22 at 14:10