We are having a quite strange problem. There is a program (a cryptocurrency node, to be precise) which keeps a local database of every transaction ever made. The database is huge, around 15 TB. The problem is that the program won't synchronize with the network, even though it has enough peers and has no trouble learning about new and old blocks.
Now the strange part: I started the same program from scratch, without that 15 TB of history, and it started syncing immediately, loading the disk to about 50% per iostat. CPU and memory utilization are negligible. The absolute figures are:
- Read speed: 5 MB/s
- Write speed: 20 MB/s
- iotop: ~20% I/O on average for this process
When I switch to the historical 15 TB DB, iostat shows 100% disk utilization and iotop shows multiple forked processes, most of them sitting at 99% I/O, yet hardly any actual I/O is happening judging by the volumes reported by iotop and iostat: both reads and writes stay within 1 MB/s. This runs on an MS Azure VM, and in the Azure portal we see disk utilization around 1% in "full" mode and writes around 20% in "fresh" mode, so throttling by the cloud operator is not an issue either.
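To quantify the mismatch I am describing, here is the back-of-the-envelope arithmetic I have been doing (the figures are the ones from the question; the helper names are mine, just for illustration). It shows that ~1 MB/s of traffic is entirely consistent with a saturated disk if the requests are small and issued one at a time:

```python
# Sketch: relate throughput, IOPS, and per-request latency for a
# latency-bound (single-request-in-flight) I/O pattern.

def implied_request_size(throughput_bytes_per_s: float, iops: float) -> float:
    """Average bytes per I/O request = throughput / IOPS."""
    return throughput_bytes_per_s / iops

def implied_iops(avg_latency_s: float) -> float:
    """With ~one request in flight at all times (util ~100%), IOPS ~ 1/latency."""
    return 1.0 / avg_latency_s

# 1 MB/s at 256 IOPS is roughly 4 KiB per request -- a random-read
# pattern like that keeps the device 100% "utilized" at trivial bandwidth.
print(implied_request_size(1_000_000, 256))  # 3906.25 bytes per request
print(implied_iops(0.004))                   # 250.0 IOPS at 4 ms per request
```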
Now the question: how do I diagnose what exactly the program is doing with the disk? I suspected random I/O, so I traced lseek calls with strace; I saw some in both fresh and full modes, but a much lower ratio in full mode, while I expected the opposite. So what is it doing in full mode? The program keeps quite a manageable number of file descriptors (/proc/&lt;pid&gt;/fd), below 50 including peer TCP connections. And how can both iostat and iotop show 100% utilization while almost no I/O bandwidth is being consumed? We even had a call with a Microsoft engineer, who said iostat may be inaccurate, especially with SSDs. Maybe, but when it reports 100% utilization, iotop confirms it, and the program is not doing what it is supposed to do. What is the alternative explanation?
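One thing I have started doing (a minimal sketch, assuming Linux; the field names are the standard keys of /proc/&lt;pid&gt;/io, the helper name is mine) is comparing the process's logical I/O (rchar/wchar, which includes page-cache hits) against the bytes that actually reached the block layer (read_bytes/write_bytes). A large gap means the process mostly hits the cache; near-equal but tiny totals would point at small synchronous disk requests:

```python
# Sketch: parse /proc/<pid>/io to separate logical I/O from real disk I/O.
# In practice, read the file twice with a sleep in between and diff the
# counters to get rates rather than lifetime totals.

def parse_proc_io(text: str) -> dict:
    """Parse the 'key: value' lines of /proc/<pid>/io into integers."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip().isdigit():
            stats[key.strip()] = int(value.strip())
    return stats

# Hypothetical snapshot for illustration:
sample = """rchar: 3012345
wchar: 987654
read_bytes: 40960
write_bytes: 8192"""

stats = parse_proc_io(sample)
# Bytes read through syscalls but satisfied by the page cache:
print(stats["rchar"] - stats["read_bytes"])
```

To use it on the real process, read `/proc/<pid>/io` for the node's pid (requires appropriate permissions) instead of the hard-coded sample.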