
I have a file on an internal server that causes any machine that interacts with it to immediately and irrecoverably freeze. By interaction, I mean just about anything: rm, mv, cp, cat, vi, gedit, less, touch. (ls is fine.) Try any of these commands and the machine will completely halt: the GUI will freeze and all network connectivity goes down. The server hosting the file only goes down if a Windows machine attempts to interact with the file.

The data server is mounted with Lustre and all the servers in question use RHEL 6. This is an internal, well-protected network, and the file was created within the network. There are many, many other files that have been created in a similar fashion that do not exhibit this problem.

What could possibly be going on here? Is there some strange character sequence that causes Lustre or RHEL to flip out? I'm not a sysadmin, and I'm loath to further interact with the file for fear of bringing down machines in active use for other purposes, but I'd like to know what's going on, how to remove the file if possible, or at least how to prevent creating another similar File of Doom.

Gilles 'SO- stop being evil'
pattivacek
  • Can you download said file using wget or FTP? If so, how big is the file? Once downloaded and saved can the file be edited? – eyoung100 Oct 27 '14 at 14:22
  • Since `ls` is fine, please give us the output of `ls -l [file]` (or even `stat [file]`, if it does not freeze your system). It'll provide us with some file metadata. – John WH Smith Oct 27 '14 at 14:30
  • The file is a few MB. I've never tried `wget` or FTP because Lustre is the official network-mounting software. I doubt that FTP or HTTP are enabled. I will try `ls -l` or `stat` when I get a safe opportunity to do so. – pattivacek Oct 27 '14 at 14:33
  • Are there any related logs on the server from after the file was accessed and the accessing machine froze? – xx4h Oct 27 '14 at 14:38
  • @JohnWHSmith, I tried `ls -l` and `stat` and both worked as expected. The permissions are `-rw-rw-r--`, the filesize is `7594488`, there is one link, nothing else was interesting or unexpected. – pattivacek Oct 27 '14 at 14:54
  • @xx4h, those are sadly mostly not accessible to me as a non-root user. If I'm able to get any relevant information, I will post it. – pattivacek Oct 27 '14 at 14:54
  • @xx4h, the only interesting line I could find in `dmesg` after rebooting after the last freeze was `memory for crash kernel (0x0 to 0x0) notwithin range` (sic). – pattivacek Oct 27 '14 at 16:28
  • Does the machine crash when `fsck` runs on the partition containing the File of Doom? You can't run `fsck` without root privileges but it _should_ run automatically at boot time every month (or so). – PM 2Ring Oct 28 '14 at 01:00
  • @PM2Ring, yet to be ascertained, but a good question. I believe `fsck` is scheduled to run in December, and I'm curious to see what happens then. – pattivacek Oct 28 '14 at 12:35
  • I'll be interested to see what the `fsck` turns up. If you create another hard link to the file, does that hard link have the same problem? At least then you can rule out issues with the file reference... – laubai Nov 27 '14 at 12:05
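
The metadata checks suggested in the comments (`ls -l`, `stat`, and a second hard link) could be collected into a small script along the following lines. This is only a sketch: the function names `inspect_metadata` and `link_copy` and the `.hardlink` suffix are made up for illustration, and it assumes GNU coreutils as shipped with RHEL. Note that `touch` and `mv` are reported above to freeze the machine, so `ln` may well do the same; it is untested against the problem file and should only be tried on a machine that can afford to hang.

```shell
#!/bin/sh
# Sketch only: helper names and the ".hardlink" suffix are hypothetical.

# Print metadata without reading the file's contents.
# ls -l and stat were reported in the thread to work on the problem file.
inspect_metadata() {
    target=$1
    ls -li -- "$target"    # inode number, permissions, link count, size
    stat -- "$target"      # the same metadata in more detail
}

# Create a second hard link to the same inode, then list both names.
# CAUTION: whether ln is safe on the problem file is unknown.
link_copy() {
    target=$1
    ln -- "$target" "$target.hardlink"
    ls -li -- "$target" "$target.hardlink"   # both names should share one inode
}
```

Running `inspect_metadata` on an ordinary file in the same directory first gives a baseline to compare against before touching the problem file.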

0 Answers