26

I have a Samsung laptop (Chronos s7) with one SATA hard disk on bus ata:1, which is detected as /dev/sda, an 8G SSD on ata:2, detected as /dev/sdb, and various other devices on the remaining SATA ports.

The problem is that the SSD is

  • soldered to the main board (not removable)
  • busted (it just gives I/O errors for any operation)
  • not shown in the BIOS (probably because it is broken)

Now this disk:

  • delays the boot by three to five minutes while the kernel tries to probe it, which is annoying;
  • but the most annoying thing is that the system fails to suspend due to /dev/sdb failing.

Notice that I can live with the delay at boot --- what worries me is the resume/suspend thing.


So the question is: can I tell the kernel to avoid even probing the device on ata:2?

In older kernels (<3.0), when I was still able to dig a bit into the source, there was a command-line parameter of the form hdb=ignore that would have done the trick.

I have tried all the tricks proposed below with udev and the libata.force kernel parameter, to no avail. Specifically, the following do not work:

  1. Adding a file to /etc/udev/rules.d/ (early in execution, such as 00-ignoredisk.rules, late, such as 99-ignoredisk.rules, or both) containing

    SUBSYSTEMS=="scsi", DRIVERS=="sd", ATTRS{rev}=="SSD ", ATTRS{model}=="SanDisk iSSD P4 ", ENV{UDISKS_IGNORE}="1" 
    

    nor

    KERNEL=="sdb", ENV{UDISKS_IGNORE}="1"
    

    nor many intermediate variants. These make the disk inaccessible after boot, but it is still probed at boot and still checked when suspending, which causes the suspend to fail.

  2. Editing the system files /lib/udev/rules.d/60-persistent-storage.rules (and udisks, udisks2) changing

    KERNEL=="ram*|loop*|fd*|nbd*|gnbd*|dm-*|md*", GOTO="persistent_storage_end"
    

    to

    KERNEL=="ram*|loop*|fd*|nbd*|gnbd*|dm-*|md*|sdb*", GOTO="persistent_storage_end"
    

    again, this has some effect, masking the disk from userspace, but the disk is still visible to the kernel.

  3. Booting with many combinations of the libata.force parameters (found for example here) in order to disable DMA, lower the speed, or otherwise restrict the failing disk. This does not work: the parameter is applied, but the disk is still probed and still fails.

    Full udevadm info -a -n /dev/sdb pasted to http://paste.ubuntu.com/6186145/

    smartctl -i /dev/sdb -T permissive gives:

    root@samsung-romano:/home/romano# smartctl -i /dev/sdb -T permissive
    smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.8.0-31-generic] (local build)
    Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
    
    Vendor:               /1:0:0:0
    Product:              
    User Capacity:        600,332,565,813,390,450 bytes [600 PB]
    Logical block size:   774843950 bytes
    >> Terminate command early due to bad response to IEC mode page
    

    which is clearly wrong. Nevertheless:

    root@samsung-romano:/home/romano# fdisk -b 512 -C 970 -H 256 -S 63 /dev/sdb
    fdisk: unable to read /dev/sdb: Input/output error
    

(SSD data from http://ubuntuforums.org/showthread.php?t=1935699&p=11739579#post11739579 ).

Rui F Ribeiro
Rmano
  • Sorry if this is too obvious, but since you have not included in your question: have you made sure the device name or UUID is not listed in `/etc/fstab`? Because the delay on boot could be caused earlier by the kernel or udev, which seems to be the case, but also later by fsck, when reading `fstab`. – admirabilis Nov 26 '13 at 18:10
  • Yes, there is no mention of /dev/sdb (or its partitions) in system files. The delay is even **before** init starts... it is in a kthread (because the boot continues in parallel), but it's at a more fundamental level. But really the boot delay is the lesser of the problems; if only I could ignore the disk during suspend/resume so that suspend works, I would be happy. (thanks anyway). – Rmano Nov 26 '13 at 18:21
  • Are you using an initrd? If so, whose? – hildred Nov 26 '13 at 19:04
  • @hildred: I am using stock kernel and initramfs from Ubuntu 13.04. I can disable AHCI or all SATA there, but then my system is dead --- no disks at all. – Rmano Nov 26 '13 at 19:06
  • Debian (and Ubuntu) compile the ata subsystem as a module. Have you tried setting parameters for the module when it is loaded by the initrd? – hildred Nov 26 '13 at 19:10
  • @hildred: Yes, see point 3) in my question. If you can suggest an ata parameter that will disable the scanning of a single bus (and not all of them), it will do. – Rmano Nov 26 '13 at 19:14
  • In the back of my fuzzy memory, there used to be an option to specify pci id's for scsi controllers, but I couldn't find it in current documentation. – hildred Nov 26 '13 at 19:38
  • @hildred - I have that same fuzzy memory too! – slm Nov 26 '13 at 20:03
  • Would it be OK for you to modify the kernel? – SHW Dec 02 '13 at 07:25
  • I used to compile my own kernels, ages ago --- probably I will be still able to apply a patch ;-) – Rmano Dec 02 '13 at 15:00
  • @Rmano see that I posted a kernel patch implementing your feature request. – robbat2 Dec 08 '13 at 22:36
  • @Rmano how did you find the ATA:2.0? I tried a bunch of hdparm, hwinfo etc but none of them gives it. My disk is a SATA disk btw, could that be the reason? – Jubei Aug 26 '15 at 22:30
  • @Jubei with `dmesg`, as explained in the accepted answer, adapted to your case, obviously. – Rmano Aug 27 '15 at 05:48

4 Answers

30

libata does not have a noprobe option at all; that was a legacy IDE option...

But I went and wrote a kernel patch for you that implements it. It should apply to many kernels very easily (the line above it was added 2013-05-21/v3.10-rc1*, but the patch can be safely applied manually without that line).

Update: The patch is now upstream (at least in the 3.12.7 stable kernel). It is in the standard kernel distributed with Ubuntu 14.04 (which is based on 3.13-stable).

Once the patch is installed, adding

 libata.force=2.00:disable

to the kernel boot parameters will hide the disk from the Linux kernel. Double check that the number is correct; searching for the device name can help (obviously, you have to check the kernel messages before adding the boot parameters):

(0)samsung-romano:~% dmesg | grep iSSD
[    1.493279] ata2.00: ATA-8: SanDisk iSSD P4 8GB, SSD 9.14, max UDMA/133
[    1.494236] scsi 1:0:0:0: Direct-Access     ATA      SanDisk iSSD P4  SSD  PQ: 0 ANSI: 5

The important number is the ata2.00 in the first line above.
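
As a rough sketch of the lookup above, a small shell helper can pull the `ataN.MM` identifier out of kernel log lines and print the matching boot parameter. The function name `ata_force_param` is made up for illustration, and the match assumes log lines shaped like the ones shown above.

```shell
#!/bin/sh
# Sketch only: ata_force_param is a hypothetical helper, not part of any tool.
# It reads dmesg-style text on stdin; $1 is a string that identifies the drive
# (for example part of its model name).
ata_force_param() {
    pattern=$1
    # Keep only the lines mentioning the drive, then turn the first
    # "ataN.MM:" prefix found into a libata.force parameter.
    grep -F -- "$pattern" |
        sed -n 's/^.*ata\([0-9][0-9]*\)\.\([0-9][0-9]*\):.*$/libata.force=\1.\2:disable/p' |
        head -n 1
}
```

Called as `dmesg | ata_force_param iSSD`, it would print `libata.force=2.00:disable` for the two log lines above. On Ubuntu the parameter is usually made permanent by appending it to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub` and running `sudo update-grub`; other distributions may differ.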

Cristian Ciupitu
robbat2
  • Thanks a lot. I will try to check it as soon as I can remember how to compile and install a kernel on my Ubuntu. Unfortunately, I'll have a very complex week forward... – Rmano Dec 09 '13 at 02:52
  • 1
    +1 It's clearly better than the trick I posted. I hope it will become official. – Emmanuel Dec 09 '13 at 07:41
  • 1
    Ok, tested the patch. It works. If you need to push it upstream I can add my Tested-by: to the patch --- you have my true email in my profile. I installed it following (with quirks) the instructions in https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel. – Rmano Dec 14 '13 at 21:18
  • From what I see it is now upstream (at least in the 3.12.7 stable kernel), see [the git repository](https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-3.12.y&id=83ceef0795fb85bb40bb90b300e3ebd8e8628454). I asked for a backport in the Ubuntu launchpad. – Rmano Jan 13 '14 at 23:56
  • It is now in standard kernels for Ubuntu from 14.04 upward. The exact command to use (in my case) is: `libata.force=2.00:disable`. – Rmano Jun 15 '14 at 19:42
  • 1
    @illuminÉ --- just edited the answer in that sense --- wait for it to be approved. – Rmano Jun 16 '14 at 15:40
  • 1
    Another Reason to LOVE GENTOO!! – eyoung100 Jun 16 '14 at 18:11
  • Another big thanks goes to the Linux kernel team for speedily including a patch into a stable kernel release, probably helping 10,000 people that way with similar problems. – syntaxerror Jan 16 '16 at 02:15
  • Hm, this does not work with scsi hard drives, yes? – Mitar Mar 17 '17 at 01:15
  • @Mitar can you expand on your setup a bit? Maybe post it as a new question, referencing this one, specifying the SAS/SCSI disks. – robbat2 Mar 17 '17 at 05:14
  • I opened it here: https://unix.stackexchange.com/questions/352258/how-to-get-linux-to-completely-ignore-a-scsi-drive – Mitar Mar 18 '17 at 05:06
  • 2
    @robbat2: Thank you, thank you - you turned my 20-minute boot (waiting for a sketchy, hard-wired storage to finally time out) into a 30-second one. Thank you! – Piskvor left the building Aug 29 '19 at 14:02
19

Hardware problems have physical hardware solutions. Did you consider unsoldering the drive or cutting its power supply?

EDIT: OK, if that's not an option: people use the following command before hot-unplugging a hard drive. You could use it to disable your drive.

echo 1 > /sys/block/sdb/device/delete

Note that any other process can force a rescan of the SATA bus, which makes the device come back. Try running the command just before hibernating the laptop.

Edited by OP: it worked. I added the following file:

-rwxr-xr-x 1 root root 204 Dec  6 16:03 99_delete_sdb

with content:

#!/bin/sh

# Remove the failing disk sdb from the kernel before suspending or
# hibernating, so that it cannot make the suspend fail.

case "$1" in
    suspend|hibernate)
        if [ -d /sys/block/sdb ]; then
            echo "Deleting device sdb"
            echo 1 > /sys/block/sdb/device/delete
        fi
        ;;
esac

...and now the system suspends (and resumes) correctly.
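
The hook above hard-codes sdb. A slightly more general sketch of the same sysfs write could take the device name as an argument. The function name `delete_block_device` and the `SYSFS_ROOT` variable are hypothetical, introduced only so the logic can be tried against a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# Sketch only: delete_block_device and SYSFS_ROOT are illustrative names.
# SYSFS_ROOT defaults to /sys; overriding it allows a dry run elsewhere.
delete_block_device() {
    dev=$1
    root=${SYSFS_ROOT:-/sys}
    if [ -d "$root/block/$dev" ]; then
        echo "Deleting device $dev"
        # Writing 1 to the delete attribute asks the kernel to drop the device.
        echo 1 > "$root/block/$dev/device/delete"
    else
        echo "Device $dev not present, nothing to do"
    fi
}
```

On a real system this would be run as root, for example `delete_block_device sdb` from the suspend hook. As noted in the answer, a later bus rescan can bring the device back, so the call belongs right before the suspend.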

Rmano
Emmanuel
  • 1
    If only it were true. I can't even say which chip (or chips) are the SSD drive --- most are unmarked. And unpowering a chip is not safe --- what about undriven three-state pins? I opened the laptop hoping that the SSD drive was connected on some sort of daughterboard. No luck. (And besides, most of the difficulty in writing kernel drivers is to work around bad designed hw). – Rmano Dec 05 '13 at 00:34
  • @Rmano How does the "delete" trick perform? – Emmanuel Dec 06 '13 at 10:29
  • **IT** **WORKS** --- I can suspend after the "delete" trick. Thanks a lot. (It still delays the boot, but well --- not a problem). – Rmano Dec 06 '13 at 22:05
  • Thanks a lot for reminding about `delete`. – Michael Shigorin Nov 14 '14 at 15:05
4

BIOS

Does this device not show up in any type of way via your BIOS?

HDDs are often configured in an "auto" mode. I would go through and make sure that these devices are in a disabled state, and even go as far as explicitly enabling only the one HDD and disabling everything else.

Kernel Boot Options

You can often stop the booting Linux kernel from auto-detecting various subsystems by passing it boot options as switches.

Most if not all of the options are listed here:

Linux Kernel in a Nutshell book

You might want to skim through the O'Reilly book, Linux Kernel in a Nutshell, specifically, Chapter 7: Customizing a Kernel.

This book is made available for free by its author, Greg Kroah-Hartman, on his personal website. The entire book can be downloaded as well.

slm
  • No, the BIOS does not have any trace of this disk; I can see the HDD and the DVD and no more. Before failing, in Windows (now there is no windows anymore in the system) it was used as a speed-up cache for the main disk. I tried to set the AHCI mode to legacy, off, yes or auto (for all disks) but that did not change anything or (for off) simply made the system not boot. – Rmano Nov 26 '13 at 19:17
  • The other method I've used in the past is to tell the kernel at boot, via Grub (kernel boot options), to `noide=....`. There are a host of other options you can provide to the booting kernel to disable auto-detection of hardware. – slm Nov 26 '13 at 19:20
  • The disk is SATA (scsi), not IDE. And the `hdb=noprobe` option was not carried over to scsi (I think it was eliminated around 2.6.x), so there is no (as far as I know) `sdb=noprobe` or `ata:2=noprobe` option. I have read (almost) all of the `kernel-parameters.txt` file in the kernel source and I can't find the correct parameter. If you know one, please post it in an answer; I will be really grateful. – Rmano Nov 26 '13 at 19:24
  • @Rmano - I'll have to dig more to find it, I do remember several options related to HDD and bus detection, but not off the top of my head. – slm Nov 26 '13 at 19:59
  • @Rmano - what about the option: `libata.dma=` – slm Nov 26 '13 at 20:18
  • Yes, tried --- it's in the question, point #3... and also tried all the speed option, the bus options, and the timeout ones. No go. – Rmano Nov 26 '13 at 20:48
  • @Rmano - just a thought but might be worth a look: http://www.6by9.net/using-linux-sys-to-disable-ethernet-hardware-devices/ – slm Nov 27 '13 at 08:13
  • @Rmano - might be some ideas on this page about how to workaround the issue: https://ata.wiki.kernel.org/index.php/Known_issues – slm Nov 27 '13 at 08:22
  • slm: interesting page but still I can't find anything on how to disable a drive. Firmware on my laptop is the last released by Samsung... – Rmano Nov 29 '13 at 04:54
  • @Rmano Joining late to the party, but OK...what you've reported ("error on each and every I/O operation") can even happen with a 100% perfect drive, when ... __security locked__! I once experienced this with `hdparm`, which, after finishing its `--security-erase`, does not reliably remove the security lock from the drive after doing its work (unless by a cold reboot). However, after issuing a `DISPWD` on that drive (system rescue CD), it behaved just fine and there were no errors. So beware of those drives locked by security password!! At least make sure your SSD drive doesn't have LOCK enabled. – syntaxerror Jan 16 '16 at 13:59
  • @syntaxerror: I think that was not the problem; the system never asked for a password, not even when it had Windows on it. And the SSD was supposed to be used as a cache... anyway, is there a (linux only) way to check if it is security locked? I wouldn't mind trying. – Rmano Jan 16 '16 at 17:47
  • @Rmano Are you kidding me? The system would never __ask__ for a password! Nor would Linux ... ever. It would just behave as if the drive was badly broken (flooding screen with timing errors etc.). I am *not* talking about an *encrypted* drive. If you security-lock a hard drive by password (e. g. with `hdparm` under Linux), it will most likely not operate on Windows __at all__ (probably not even be detected). Until it gets unlocked again (temporarily (`UNLOCK`) or permanently (`DISPWD`)) – syntaxerror Jan 16 '16 at 17:59
  • You can check for security lock (Linux only) as follows: `sudo hdparm -I /dev/sdX` (X = a .. z, you have to know the device name of your drive, of course). Look at the last 10 lines of the output. You MUST be able to read `NOT locked` in one line. The "not" is important! Should it be missing, your drive __is__ locked!! – syntaxerror Jan 16 '16 at 18:09
0

Linux way to check for a lock: sudo hdparm -I /dev/sdX (with X = a..z; you must know which device your drive is, of course). Near the end of the (long) output, within the last 10 lines, you MUST be able to read: *not* locked.
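
To avoid reading the long output by eye, the check can be sketched as a small filter. The function name `security_lock_state` is hypothetical, and the matching assumes the "not locked" / "locked" wording described above; the exact layout of the hdparm output may vary between versions.

```shell
#!/bin/sh
# Sketch only: security_lock_state is an illustrative helper name.
# It reads `hdparm -I` output on stdin and prints one of:
#   "not locked", "locked", or "unknown" (no lock line found).
security_lock_state() {
    out=$(cat)
    if printf '%s\n' "$out" | grep -Eq 'not[[:space:]]+locked'; then
        echo "not locked"
    elif printf '%s\n' "$out" | grep -Eq '(^|[[:space:]])locked'; then
        # A "locked" line without a preceding "not" means the lock is set.
        echo "locked"
    else
        echo "unknown"
    fi
}
```

It would be used as `sudo hdparm -I /dev/sdX | security_lock_state`, with X replaced by the letter of the drive in question.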

syntaxerror