20

Is there a generic way to reset a PCI device in Linux from the command line? That is, cause the PCI bus to issue a reset command.

Jonathan

4 Answers

8

According to http://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-pci, you can reset individual functions of the device if the device supports it:

What:       /sys/bus/pci/devices/.../reset
Description:
            Some devices allow an individual function to be reset
            without affecting other functions in the same device.
            For devices that have this support, a file named reset
            will be present in sysfs.  Writing 1 to this file
            will perform reset.
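
As a minimal sketch of the description above, the reset file can be written only when it exists. The function name `pci_function_reset` and the `SYSFS_ROOT` override are assumptions made for illustration (the override only lets the function be tried against a fake directory tree); on a real system it needs root and the full PCI address of the function.

```shell
#!/bin/sh
# Sketch: trigger a reset of one PCI function through sysfs, if offered.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
pci_function_reset() {
    dev="$1"    # full PCI address, e.g. 0000:00:14.0
    reset_file="${SYSFS_ROOT:-/sys}/bus/pci/devices/$dev/reset"

    # The kernel only creates this file for functions it knows how to reset.
    if [ ! -e "$reset_file" ]; then
        echo "no reset file for $dev (reset not supported)" >&2
        return 1
    fi

    echo 1 > "$reset_file"
}
```

For example, `pci_function_reset 0000:00:14.0` (run as root) would write 1 to that function's reset file, or report that the file is missing.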
Andre Holzner
    This issues a function-level reset, which does not have to be implemented and specifically does not reset the entire device, only the function in question. – alex.forencich Oct 08 '20 at 00:28
8

The problem with the solutions above is that they require cooperation from the device; in most scenarios, however, the reason to reset it is precisely that it is not cooperating.

As described here, there is another, "harder" way to reset it at the PCI level: remove the device from the PCI bus, then re-add it with a rescan.

The steps:

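  1. echo 1 >/sys/bus/pci/devices/<pci-id-of-device>/remove. The device's PCI ID can be found with lspci; sysfs expects the full form including the domain (for example 0000:00:14.0, as printed by lspci -D).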
  2. echo 1 >/sys/bus/pci/rescan

I have a buggy PCI device here; sometimes a PCI-level reset fixes it, and sometimes this remove-and-rescan trick does. I am about to write a script to do it automagically. :-)
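
A script for the two steps could look roughly like the sketch below. It is not the script mentioned above, only an illustration: the function name `pci_remove_rescan`, the `SYSFS_ROOT` override, and the `SETTLE_SECONDS` pause are all assumptions, and the one-second default pause is a guess rather than a documented requirement.

```shell
#!/bin/sh
# Sketch: drop a device from the PCI bus, then ask the kernel to rescan.
# SYSFS_ROOT and SETTLE_SECONDS are hypothetical knobs for illustration.
pci_remove_rescan() {
    dev="$1"    # full PCI address, e.g. 0000:00:14.0
    root="${SYSFS_ROOT:-/sys}"
    remove_file="$root/bus/pci/devices/$dev/remove"

    if [ ! -e "$remove_file" ]; then
        echo "device $dev not found" >&2
        return 1
    fi

    # Step 1: remove the device (its driver is unbound as part of this).
    echo 1 > "$remove_file" || return 1

    # Give the removal a moment to finish before rescanning.
    sleep "${SETTLE_SECONDS:-1}"

    # Step 2: rescan all PCI buses so the device is enumerated again.
    echo 1 > "$root/bus/pci/rescan"
}
```

Run as root, for example `pci_remove_rescan 0000:00:14.0`. Anything that was using the device (mounted disks, network interfaces) loses it during the gap.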

peterh
  • Ugh. My USB controller is still buggy, even after this reset. – Chris Feb 03 '20 at 00:25
  • @Chris Are you sure that it is the controller, and not a device on it? – peterh Feb 03 '20 at 00:54
  • Yes, I'm sure it's the controller `00:14.0 USB controller: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller`. I pass it to VM through VFIO and after VM reboot I also have to reboot the host to make it work again, otherwise any USB plugged in is detected, but communication fails. – Chris Feb 04 '20 at 09:32
  • 1
    This worked beautifully for me to periodically reset an old RAID card with a bunch of failing drives attached to it that would keep infinitely retrying bad sectors without ever returning I/O errors to the kernel. Attempts at data recovery using ddrescue would simply lock up once it hit these bad sectors, so the only way to make progress was to keep resetting the card using the method described in this answer. – ven42 Jan 07 '21 at 20:49
  • @ven42 Wow, thanks! But note that the problem may not be the HDD but the HDD controller. It is possible that the controller is bad and the disk is good. Try it with a different controller (not just a different SATA port, as that will likely be on the same controller). The simplest way is to use a USB adapter for that. – peterh Jan 08 '21 at 00:35
  • @peterh-ReinstateMonica Under normal circumstances, I would definitely have started pulling hardware components as you suggest rather than doing this, but thanks to pandemic restrictions, I'm miles away from this ancient, neglected machine with zero prospect of getting hands on it anytime soon. Resorting to fancy kernel tricks over a (half-broken) iKVM is the best I can do right now, so the butt-saving factor for your method is abnormally high. :-) – ven42 Jan 08 '21 at 21:03
7

The closest thing the PCI bus has to a device-level reset is changing the power state to D3 and back to D0. After unloading the driver (it would be bad to reset the hardware out from under the driver), you can use setpci to write to the power management control register to change the power state, but I believe this happens automatically when you unload the driver.
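
A rough sketch of the setpci approach follows, assuming the device exposes the standard power management capability (`CAP_PM` in setpci), whose control/status register sits at offset 4 with the power state in bits 1:0. The function name `pci_power_cycle` and the `SETPCI` and `SETTLE_SECONDS` overrides are assumptions for illustration. Whether the device actually loses its internal state on the way back from D3 depends on the hardware, so treat this as an experiment rather than a guaranteed reset.

```shell
#!/bin/sh
# Sketch: move a PCI function to D3hot and back to D0 using setpci.
# Unload or unbind the driver first. SETPCI and SETTLE_SECONDS are
# hypothetical overrides; they default to the real setpci and 1 second.
pci_power_cycle() {
    dev="$1"    # PCI address as accepted by setpci -s, e.g. 00:14.0
    setpci_cmd="${SETPCI:-setpci}"

    # value:mask form: only bits 1:0 (the power state field) are changed.
    "$setpci_cmd" -s "$dev" CAP_PM+4.b=03:03 || return 1   # enter D3hot

    # Let the device settle before waking it up again.
    sleep "${SETTLE_SECONDS:-1}"

    "$setpci_cmd" -s "$dev" CAP_PM+4.b=00:03               # back to D0
}
```

Run as root, for example `pci_power_cycle 00:14.0`, then reload the driver.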

psusi
1

Since a generic PCI device is not hot-pluggable, there won't be a way to reset it and have the kernel re-enumerate it.

Whatever kind of problems you are trying to solve, there surely is a better way than to just reset it.

  • 5
    I'm simulating a PCI device in QEMU and need to reset its state as I develop. I wanted to do it from within the guest. – Jonathan Jan 24 '12 at 05:20
  • 2
    I have a buggy PCI CCTV card; it works, but sometimes dies with a segfault. After that, the whole system needs to be restarted; with a PCI reset this could be avoided. The system is stable and no harm is done, only the video input goes blank, so sometimes resetting the PCI device is a better solution than restarting the whole machine every day (especially if you are 8000 km away from that machine for 6 months). – Gipsz Jakab Jun 14 '17 at 08:41