1. 23 Jul, 2018 2 commits
    • PCI: pciehp: Fix unprotected list iteration in IRQ handler · 1204e35b
      Lukas Wunner authored
      Commit b440bde7 ("PCI: Add pci_ignore_hotplug() to ignore hotplug
      events for a device") iterates over the devices on a hotplug port's
      subordinate bus in pciehp's IRQ handler without acquiring pci_bus_sem.
      It is thus possible for a user to cause a crash by concurrently
      manipulating the device list, e.g. by disabling slot power via sysfs
      on a different CPU or by initiating a remove/rescan via sysfs.
      
      This can't be fixed by acquiring pci_bus_sem in the IRQ handler because
      doing so may sleep.  The simplest fix is to avoid the list iteration
      altogether and just check the ignore_hotplug flag on the port itself.
      This works because pci_ignore_hotplug() sets the flag on both the device
      and its parent bridge.
      
      We do lose the ability to print the name of the device blocking hotplug
      in the debug message, but that's probably bearable.
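
A minimal user-space sketch of the idea, not the kernel code: it assumes a pared-down device structure with only a parent pointer and the flag. All `demo_` names are hypothetical stand-ins. The point is that the IRQ-context check reads a single flag on the port and never walks the subordinate bus list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-in for struct pci_dev. */
struct demo_pci_dev {
	struct demo_pci_dev *parent_bridge;	/* NULL for a root port */
	bool ignore_hotplug;
};

/* As described above: flag the device and also its parent bridge. */
static void demo_ignore_hotplug(struct demo_pci_dev *dev)
{
	assert(dev != NULL);
	dev->ignore_hotplug = true;
	if (dev->parent_bridge)
		dev->parent_bridge->ignore_hotplug = true;
}

/* IRQ-context check: no list walk and no lock, only the port's own flag. */
static bool demo_port_ignores_hotplug(const struct demo_pci_dev *port)
{
	return port->ignore_hotplug;
}
```

Because the flag lives on the port, the check stays safe even while another CPU adds or removes devices below it.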
      
      Fixes: b440bde7 ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device")
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: stable@vger.kernel.org
    • PCI: pciehp: Fix use-after-free on unplug · 281e878e
      Lukas Wunner authored
      When pciehp is unbound (e.g. on unplug of a Thunderbolt device), the
      hotplug_slot struct is deregistered and thus freed before freeing the
      IRQ.  The IRQ handler and the work items it schedules print the slot
      name referenced from the freed structure in various informational and
      debug log messages, each time resulting in a quadruple dereference of
      freed pointers (hotplug_slot -> pci_slot -> kobject -> name).
      
      At best the slot name is logged as "(null)", at worst kernel memory is
      exposed in logs or the driver crashes:
      
        pciehp 0000:10:00.0:pcie204: Slot((null)): Card not present
      
      An attacker may provoke the bug by unplugging multiple devices on a
      Thunderbolt daisy chain at once.  Unplugging can also be simulated by
      powering down slots via sysfs.  The bug is particularly easy to trigger
      in poll mode.
      
      It has been present since the driver's introduction in 2004:
      https://git.kernel.org/tglx/history/c/c16b4b14d980
      
      Fix by rearranging teardown such that the IRQ is freed first.  Run the
      work items queued by the IRQ handler to completion before freeing the
      hotplug_slot struct by draining the work queue from the ->release_slot
      callback which is invoked by pci_hp_deregister().
      
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: stable@vger.kernel.org # v2.6.4
  2. 23 May, 2018 1 commit
  3. 17 May, 2018 1 commit
  4. 07 May, 2018 1 commit
    • PCI: pciehp: Add quirk for Command Completed errata · d22b3621
      Bjorn Helgaas authored
      Several PCIe hotplug controllers have errata that mean they do not set the
      Command Completed bit unless writes to the Slot Control register change
      "Control" bits.  Command Completed is never set for writes that only change
      software notification "Enable" bits.  This results in timeouts like this:
      
        pciehp 0000:00:1c.0:pcie004: Timeout on hotplug command 0x1038 (issued 65284 msec ago)
      
      When this erratum is present, avoid these timeouts by marking commands
      "completed" immediately unless they change the "Control" bits.
      
      Here's the text of the Intel erratum CF118.  We assume this applies to all
      Intel parts:
      
        CF118        PCIe Slot Status Register Command Completed bit not always
                     updated on any configuration write to the Slot Control
                     Register
      
        Problem:     For PCIe root ports (devices 0 - 10) supporting hot-plug,
                     the Slot Status Register (offset AAh) Command Completed
                     (bit[4]) status is updated under the following condition:
                     IOH will set Command Completed bit after delivering the new
                     commands written in the Slot Controller register (offset
                     A8h) to VPP. The IOH detects new commands written in Slot
                     Control register by checking the change of value for Power
                     Controller Control (bit[10]), Power Indicator Control
                     (bits[9:8]), Attention Indicator Control (bits[7:6]), or
                     Electromechanical Interlock Control (bit[11]) fields. Any
                     other configuration writes to the Slot Control register
                     without changing the values of these fields will not cause
                     Command Completed bit to be set.
      
                     The PCIe Base Specification Revision 2.0 or later describes
                     the “Slot Control Register” in section 7.8.10, as follows
                     (Reference section 7.8.10, Slot Control Register, Offset
                     18h). In hot-plug capable Downstream Ports, a write to the
                     Slot Control register must cause a hot-plug command to be
                     generated (see Section 6.7.3.2 for details on hot-plug
                     commands). A write to the Slot Control register in a
                     Downstream Port that is not hotplug capable must not cause a
                     hot-plug command to be executed.
      
                     The PCIe Spec intended that every write to the Slot Control
                     Register is a command and expected a command complete status
                     to abstract the VPP implementation specific nuances from the
                     OS software. IOH PCIe Slot Control Register implementation
                     is not fully conforming to the PCIe Specification in this
                     respect.
      
        Implication: Software checking on the Command Completed status after
                     writing to the Slot Control register may time out.
      
        Workaround:  Software can read the Slot Control register and compare the
                     existing and new values to determine if it should check the
                     Command Completed status after writing to the Slot Control
                     register.
      
      Per Sinan, the Qualcomm QDF2400 controller also does not set the Command
      Completed bit unless writes to the Slot Control register change "Control"
      bits.
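
The workaround in the erratum text can be sketched as a comparison of old and new Slot Control values. This is an illustrative sketch, not the driver's code: the `DEMO_` masks use the bit positions quoted in the erratum above, and the function name and `has_erratum` parameter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Slot Control fields named in the erratum text (bit positions as quoted). */
#define DEMO_SLTCTL_AIC	0x00c0	/* Attention Indicator Control, bits 7:6 */
#define DEMO_SLTCTL_PIC	0x0300	/* Power Indicator Control, bits 9:8 */
#define DEMO_SLTCTL_PCC	0x0400	/* Power Controller Control, bit 10 */
#define DEMO_SLTCTL_EIC	0x0800	/* Electromechanical Interlock Control, bit 11 */
#define DEMO_CC_ERRATUM_MASK \
	(DEMO_SLTCTL_AIC | DEMO_SLTCTL_PIC | DEMO_SLTCTL_PCC | DEMO_SLTCTL_EIC)

/*
 * Decide whether software should wait for Command Completed after a write.
 * With the erratum, only a change in a "Control" field ever sets the bit.
 */
static bool demo_must_wait_for_completion(uint16_t old_ctl, uint16_t new_ctl,
					  bool has_erratum)
{
	if (!has_erratum)
		return true;
	return ((old_ctl ^ new_ctl) & DEMO_CC_ERRATUM_MASK) != 0;
}

/* Usage: 0x1038 (the command in the timeout log) touches no Control field. */
static void demo_quirk_self_check(void)
{
	assert(!demo_must_wait_for_completion(0x0000, 0x1038, true));
	assert(demo_must_wait_for_completion(0x0000, 0x1038, false));
	assert(demo_must_wait_for_completion(0x0000, DEMO_SLTCTL_PCC, true));
}
```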
      
      Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
      Link: https://lkml.kernel.org/r/8770820b-85a0-172b-7230-3a44524e6c9f@molgen.mpg.de
      Reported-by: Paul Menzel <pmenzel+linux-pci@molgen.mpg.de>	# Lenovo X60
      Tested-by: Paul Menzel <pmenzel+linux-pci@molgen.mpg.de>	# Lenovo X60
      Signed-off-by: Sinan Kaya <okaya@codeaurora.org>		# Qcom quirk
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
  5. 28 Jan, 2018 1 commit
  6. 23 Jan, 2018 1 commit
    • PCI: pciehp: Assume NoCompl+ for Thunderbolt ports · 493fb50e
      Lukas Wunner authored
      Certain Thunderbolt 1 controllers claim to support Command Completed events
      (value of 0b in the No Command Completed Support field of the Slot
      Capabilities register) but in reality they neither set the Command
      Completed bit in the Slot Status register nor signal a Command Completed
      interrupt:
      
        8086:1513  CV82524  [Light Ridge 4C  2010]
        8086:151a  DSL2310  [Eagle Ridge 2C  2011]
        8086:151b  CVL2510  [Light Peak 2C   2010]
        8086:1547  DSL3510  [Cactus Ridge 4C 2012]
        8086:1548  DSL3310  [Cactus Ridge 2C 2012]
        8086:1549  DSL2210  [Port Ridge 1C   2011]
      
      All known newer chips (Redwood Ridge and onwards) set No Command Completed
      Support, indicating that they do not support Command Completed events.
      
      The user-visible impact is that after unplugging such a device, 2 seconds
      elapse until pciehp is unbound.  That's because on ->remove,
      pcie_write_cmd() is called via pcie_disable_notification() and every call
      to pcie_write_cmd() takes 2 seconds (1 second for each invocation of
      pcie_wait_cmd()):
      
        [  337.942727] pciehp 0000:0a:00.0:pcie204: Timeout on hotplug command 0x1038 (issued 21176 msec ago)
        [  340.014735] pciehp 0000:0a:00.0:pcie204: Timeout on hotplug command 0x0000 (issued 2072 msec ago)
      
      That by itself has always been unpleasant, but the situation has become
      worse with commit cc27b735 ("PCI/portdrv: Turn off PCIe services during
      shutdown"):  Now pciehp is unbound on ->shutdown.  Because Thunderbolt
      controllers typically have 4 hotplug ports, every reboot and shutdown is
      now delayed by 8 seconds, plus another 2 seconds for every attached
      Thunderbolt 1 device.
      
      Thunderbolt hotplug slots are not physical slots that one inserts cards
      into, but rather logical hotplug slots implemented in silicon.  Devices
      appear beyond those logical slots once a PCI tunnel is established on top
      of the Thunderbolt Converged I/O switch.  One would expect commands written
      to the Slot Control register to be executed immediately by the silicon, so
      for simplicity we always assume NoCompl+ for Thunderbolt ports.
      
      Fixes: cc27b735 ("PCI/portdrv: Turn off PCIe services during shutdown")
      Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: stable@vger.kernel.org	# v4.12+
      Cc: Sinan Kaya <okaya@codeaurora.org>
      Cc: Yehezkel Bernat <yehezkel.bernat@intel.com>
      Cc: Michael Jamet <michael.jamet@intel.com>
      Cc: Andreas Noever <andreas.noever@gmail.com>
  7. 17 Jan, 2018 1 commit
    • PCI: Remove unnecessary messages for memory allocation failures · c7abb235
      Markus Elfring authored
      Per ebfdc409 ("checkpatch: attempt to find unnecessary 'out of memory'
      messages"), when a memory allocation fails, the memory subsystem emits
      generic "out of memory" messages (see slab_out_of_memory() for some of this
      logging).  Therefore, additional error messages in the caller don't add
      much value.
      
      Remove messages that merely report "out of memory".
      
      This preserves some messages that report additional information, e.g.,
      allocation failures that mean we drop hotplug events.
      
      This issue was detected by using the Coccinelle software.
      
      Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
      [bhelgaas: changelog, squash patches, make similar changes to acpiphp,
      cpqphp, ibmphp, keep warning when dropping hotplug event]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  8. 07 Nov, 2017 3 commits
    • PCI: pciehp: Do not clear Presence Detect Changed during initialization · db63d400
      Mika Westerberg authored
      It is possible that the hotplug event has already happened before the
      driver is attached to a PCIe hotplug downstream port. If we just clear the
      status we never get the hotplug interrupt and thus the event will be
      missed.
      
      To make sure that does not happen, we leave the Presence Detect Changed
      bit untouched during initialization. Then, once the event is unmasked, we
      get an interrupt and handle the hotplug event properly.
      
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    • PCI: pciehp: Fix race condition handling surprise link down · 49902239
      Mika Westerberg authored
      A surprise link down may retrain very quickly, causing the same slot to
      generate a link up event before handling of the link down event completes.
      
      Since the link is active, the power off work queued from the first link
      down will cause a second down event when power is disabled. However, the
      link up event sets the slot state to POWERON_STATE before the event to
      handle this is enqueued, making the second down event believe it needs to
      do something.
      
      This creates a constant cycle of link up and link down events.
      
      To prevent this, it is better to handle one event at a time, in the order
      in which the events occurred, so change the driver to use an ordered
      workqueue instead.
      
      A normal device hotplug triggers two events (presence detect and link up)
      that are already handled properly in the driver, but we currently log an
      error if we find an existing device in the slot. Since this is not an
      error, change the log level to debug to avoid scaring users.
      
      This is based on the original work by Ashok Raj.
      
      Link: https://patchwork.kernel.org/patch/9469023
      Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    • PCI: pciehp: Convert timers to use timer_setup() · c4459a08
      Kees Cook authored
      In preparation for unconditionally passing the struct timer_list pointer to
      all timer callbacks, switch to using the new timer_setup() and from_timer()
      to pass the timer pointer explicitly. This fixes what appears to be a bug
      in passing the wrong pointer to the timer handler (address of ctrl pointer
      instead of ctrl pointer).
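
The pattern behind from_timer() can be sketched in user space as follows. This is only an illustration under stated assumptions: the `demo_` structures and the `demo_container_of()` macro are hypothetical stand-ins for struct timer_list, the controller structure, and the kernel's own helpers. The callback receives a pointer to the embedded timer and recovers the enclosing structure from it, so there is no separate data pointer to get wrong.

```c
#include <assert.h>
#include <stddef.h>

struct demo_timer {			/* stand-in for struct timer_list */
	unsigned long expires;
};

struct demo_ctrl {			/* stand-in for the controller struct */
	int slot_id;
	struct demo_timer poll_timer;	/* timer embedded in its owner */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int demo_last_polled_slot = -1;

/* Timer callback: it is handed the timer itself, not an opaque argument. */
static void demo_poll_timeout(struct demo_timer *t)
{
	struct demo_ctrl *ctrl;

	assert(t != NULL);
	ctrl = demo_container_of(t, struct demo_ctrl, poll_timer);
	demo_last_polled_slot = ctrl->slot_id;
}
```
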
      
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
  9. 15 Aug, 2017 1 commit
    • PCI: pciehp: Report power fault only once until we clear it · 7612b3b2
      Keith Busch authored
      When a power fault occurs, the power controller sets Power Fault Detected
      in the Slot Status register, and pciehp_isr() queues an INT_POWER_FAULT
      event to handle it.
      
      It also clears Power Fault Detected, but since nothing has yet changed to
      correct the power fault, the power controller will likely set it again
      immediately, which may cause an infinite loop when pciehp_isr() rechecks
      Slot Status.
      
      Fix that by masking off Power Fault Detected from new events if the driver
      hasn't seen the power fault clear from the previous handling attempt.
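
A rough sketch of the masking step, assuming a per-controller flag that records an unhandled power fault. The structure, flag, and function names are hypothetical. The bit value 0x0002 is the Power Fault Detected position in Slot Status, and clearing the flag once the fault has been handled is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLTSTA_PFD	0x0002	/* Power Fault Detected */

struct demo_pf_ctrl {
	bool power_fault_detected;	/* set until the fault is seen to clear */
};

/* Drop a repeated power fault from the event set; remember a new one. */
static uint16_t demo_filter_events(struct demo_pf_ctrl *ctrl, uint16_t events)
{
	assert(ctrl != NULL);
	if (ctrl->power_fault_detected)
		events &= (uint16_t)~DEMO_SLTSTA_PFD;
	else if (events & DEMO_SLTSTA_PFD)
		ctrl->power_fault_detected = true;
	return events;
}
```

With the repeat masked off, a recheck of Slot Status that finds only the same power fault yields no new events, so the loop can terminate.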
      
      Fixes: fad214b0 ("PCI: pciehp: Process all hotplug events before looking for new ones")
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      [bhelgaas: changelog, pull test out and add comment]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
      Cc: stable@vger.kernel.org	# 4.9+
  10. 07 Dec, 2016 1 commit
    • PCI: pciehp: Prioritize data-link event over presence detect · 385895fe
      Ashok Raj authored
      If Slot Status indicates changes in both Data Link Layer Status and
      Presence Detect, prioritize the Link status change.
      
      When both events are observed, pciehp currently relies on the Slot Status
      Presence Detect State (PDS) to agree with the Link Status Data Link Layer
      Active status.  The Presence Detect State, however, may be set to 1 through
      out-of-band presence detect even if the link is down, which creates
      conflicting events.
      
      Since the Link Status accurately reflects the reachability of the
      downstream bus, the Link Status event should take precedence over a
      Presence Detect event.  Skip checking the PDC status if we handled a link
      event in the same handler.
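
The precedence rule can be sketched as a small decision function. This is an illustration, not the handler itself: the enum and function names are hypothetical, and the two bit values are the Slot Status positions of Data Link Layer State Changed and Presence Detect Changed.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SLTSTA_PDC		0x0008	/* Presence Detect Changed */
#define DEMO_SLTSTA_DLLSC	0x0100	/* Data Link Layer State Changed */

enum demo_action {
	DEMO_NONE,
	DEMO_HANDLE_LINK_CHANGE,
	DEMO_HANDLE_PRESENCE_CHANGE,
};

/* A link event wins; presence detect is only considered without one. */
static enum demo_action demo_pick_event(uint16_t events)
{
	if (events & DEMO_SLTSTA_DLLSC)
		return DEMO_HANDLE_LINK_CHANGE;
	if (events & DEMO_SLTSTA_PDC)
		return DEMO_HANDLE_PRESENCE_CHANGE;
	return DEMO_NONE;
}

/* Usage: both bits set resolves to the link change alone. */
static void demo_priority_self_check(void)
{
	assert(demo_pick_event(DEMO_SLTSTA_DLLSC | DEMO_SLTSTA_PDC) ==
	       DEMO_HANDLE_LINK_CHANGE);
	assert(demo_pick_event(DEMO_SLTSTA_PDC) == DEMO_HANDLE_PRESENCE_CHANGE);
}
```
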
      
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
  11. 22 Sep, 2016 1 commit
    • PCI: pciehp: Allow exclusive userspace control of indicators · 576243b3
      Keith Busch authored
      PCIe hotplug supports optional Attention and Power Indicators, which are
      used internally by pciehp.  Users can't control the Power Indicator, but
      they can control the Attention Indicator by writing to a sysfs "attention"
      file.
      
      The Slot Control register has two bits for each indicator, and the PCIe
      spec defines the encodings for each as (Reserved/On/Blinking/Off).  For
      sysfs "attention" writes, pciehp_set_attention_status() maps into these
      encodings, so the only useful write values are 0 (Off), 1 (On), and 2
      (Blinking).
      
      However, some platforms use all four bits for platform-specific indicators,
      and they need to allow direct user control of them while preventing pciehp
      from using them at all.
      
      Add a "hotplug_user_indicators" flag to the pci_dev structure.  When set,
      pciehp does not use either the Attention Indicator or the Power Indicator,
      and the low four bits (values 0x0 - 0xf) of sysfs "attention" write values
      are written directly to the Attention Indicator Control and Power Indicator
      Control fields.
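
As a sketch of the raw write, assume the four user bits map contiguously onto Slot Control bits 9:6, so the low two land in Attention Indicator Control (bits 7:6) and the high two in Power Indicator Control (bits 9:8). The helper name is hypothetical, and a real driver would issue the result as a hotplug command rather than just compute it.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_INDICATOR_SHIFT	6
#define DEMO_INDICATOR_MASK	0x03c0	/* AIC (7:6) and PIC (9:8) together */

/* Place the low four bits of the sysfs value into the two indicator fields. */
static uint16_t demo_raw_indicator_ctl(uint16_t old_ctl, unsigned int user_value)
{
	uint16_t field = (uint16_t)((user_value & 0xfu) << DEMO_INDICATOR_SHIFT);

	return (uint16_t)((old_ctl & ~DEMO_INDICATOR_MASK) | field);
}

/* Usage: other Slot Control bits are preserved, extra user bits are dropped. */
static void demo_indicator_self_check(void)
{
	assert(demo_raw_indicator_ctl(0x1038, 0xf) == 0x13f8);
	assert(demo_raw_indicator_ctl(0x13f8, 0x5) == 0x1178);
	assert(demo_raw_indicator_ctl(0x0000, 0x1f) == 0x03c0);
}
```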
      
      [bhelgaas: changelog, rename flag and accessors to s/attention/indicator/]
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  12. 14 Sep, 2016 5 commits
  13. 12 Sep, 2016 1 commit
  14. 20 Jun, 2016 1 commit
    • PCI: pciehp: Ignore interrupts during D3cold · ed91de7e
      Lukas Wunner authored
      If a hotplug port is suspended to D3cold, its slot status register cannot
      be read.  If that hotplug port happens to share its IRQ with other devices,
      whenever an interrupt occurs for one of these devices, pciehp logs a
      "no response from device" message and tries to read the PCI_EXP_SLTSTA
      register, even though we know that will fail.
      
      Ignore interrupts while we're in D3cold.
      
      [bhelgaas: changelog]
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  15. 10 Aug, 2015 2 commits
    • PCI: pciehp: Remove ignored MRL sensor interrupt events · 2db0f71f
      Bjorn Helgaas authored
      We queued interrupt events for the MRL being opened or closed, but the code
      in interrupt_event_handler() that handles these events ignored them.
      
      Stop enabling MRL interrupts and remove the ignored events.
      
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    • PCI: pciehp: Handle invalid data when reading from non-existent devices · 1469d17d
      Jarod Wilson authored
      It's platform-dependent, but an MMIO read to a non-existent PCI device
      generally returns data with all bits set.  This happens when the host
      bridge or Root Complex times out waiting for a response from the device and
      fabricates return data to complete the CPU's read.
      
      One example, reported in the bugzilla below, involved this hierarchy:
      
        pci 0000:00:1c.0: PCI bridge to [bus 02-3a] Root Port
        pci 0000:02:00.0: PCI bridge to [bus 03-0a] Upstream Port
        pci 0000:03:03.0: PCI bridge to [bus 05-07] Downstream Port
        pci 0000:05:00.0: PCI bridge to [bus 06-07] Thunderbolt Upstream Port
        pci 0000:06:00.0: PCI bridge to [bus 07]    Thunderbolt Downstream Port
        pci 0000:07:00.0: BCM57762 NIC
      
      Unplugging the Thunderbolt switch and the NIC below it resulted in this:
      
        pciehp 0000:03:03.0: Surprise Removal
        tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
        pciehp 0000:06:00.0: unloading service driver pciehp
        pciehp 0000:06:00.0: pcie_isr: intr_loc 11f
        pciehp 0000:06:00.0: Switch interrupt received
        pciehp 0000:06:00.0: Latch open on Slot
        pciehp 0000:06:00.0: Attention button interrupt received
        pciehp 0000:06:00.0: Button pressed on Slot
        pciehp 0000:06:00.0: Presence/Notify input change
        pciehp 0000:06:00.0: Card present on Slot
        pciehp 0000:06:00.0: Power fault interrupt received
        pciehp 0000:06:00.0: Data Link Layer State change
        pciehp 0000:06:00.0: Link Up event
      
      The pciehp driver correctly noticed that the Thunderbolt switch (05:00.0
      and 06:00.0) and NIC (07:00.0) had been removed, and it called their driver
      remove methods.
      
      Since the NIC was already gone, tg3 received 0xffffffff when it tried to
      read from the device.  The resulting timeout is a tg3 issue and not of
      interest here.
      
      Similarly, since the 06:00.0 Thunderbolt switch was already gone,
      pcie_isr() received 0xffff when it tried to read PCI_EXP_SLTSTA, and pciehp
      thought that was valid status showing that many events had happened: the
      latch had been opened, the attention button had been pressed, a card was
      now present, and the link was now up.  These are all wrong, of course, but
      pciehp went on to try to power up and enumerate devices below the
      non-existent bridge:
      
        pciehp 0000:06:00.0: PCI slot - powering on due to button press
        pciehp 0000:06:00.0: Surprise Insertion
        pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff
      
      [bhelgaas: changelog, also check in pcie_poll_cmd() & pcie_do_write_cmd()]
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=99841
      Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  16. 16 Jul, 2015 1 commit
  17. 18 Jun, 2015 1 commit
  18. 17 Jun, 2015 1 commit
    • PCI: pciehp: Clean up debug logging · 3784e0c6
      Bjorn Helgaas authored
      The pciehp debug logging is overly verbose and often redundant.  Almost all
      of the information printed by dbg_ctrl() is also printed by the normal PCI
      core enumeration code and by pcie_init().
      
      Remove the redundant debug info.
      
      When claiming a pciehp bridge, we print the slot characteristics, e.g.,
      
        Slot #6 AttnBtn- AttnInd- PwrInd- PwrCtrl- MRL- Interlock- NoCompl+ LLActRep+
      
      Add the Hot-Plug Capable and Hot-Plug Surprise bits to this information,
      and print it all in the same order as lspci does.
      
      No functional change except the message text changes.
      
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Rajat Jain <rajatja@google.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
  19. 09 Jun, 2015 1 commit
    • PCI: pciehp: Wait for hotplug command completion where necessary · a5dd4b4b
      Alex Williamson authored
      The commit referenced below deferred waiting for command completion until
      the start of the next command, allowing hardware to do the latching
      asynchronously.  Unfortunately, being ready to accept a new command is the
      only indication we have that the previous command is completed.  In cases
      where we need that state change to be enabled, we must still wait for
      completion.  For instance, pciehp_reset_slot() attempts to disable anything
      that might generate a surprise hotplug on slots that support presence
      detection.  If we don't wait for those settings to latch before the
      secondary bus reset, we negate any value in attempting to prevent the
      spurious hotplug.
      
      Create a base function with optional wait and helper functions so that
      pcie_write_cmd() turns back into the "safe" interface which waits before
      and after issuing a command and add pcie_write_cmd_nowait(), which
      eliminates the trailing wait for asynchronous completion.  The following
      functions are returned to their previous behavior:
      
        pciehp_power_on_slot
        pciehp_power_off_slot
        pcie_disable_notification
        pciehp_reset_slot
      
      The rationale is that pciehp_power_on_slot() enables the link and therefore
      relies on completion of power-on.  pciehp_power_off_slot() and
      pcie_disable_notification() need a wait because data structures may be
      freed after these calls and continued signaling from the device would be
      unexpected.  And, of course, pciehp_reset_slot() needs to wait for the
      scenario outlined above.
      
      Fixes: 3461a068 ("PCI: pciehp: Wait for hotplug command completion lazily")
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      CC: stable@vger.kernel.org	# v3.17+
  20. 23 Sep, 2014 4 commits
  21. 13 Sep, 2014 1 commit
  22. 10 Sep, 2014 1 commit
    • PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device · b440bde7
      Bjorn Helgaas authored
      Powering off a hot-pluggable device, e.g., with pci_set_power_state(D3cold),
      normally generates a hot-remove event that unbinds the driver.
      
      Some drivers expect to remain bound to a device even while they power it
      off and back on again.  This can be dangerous, because if the device is
      removed or replaced while it is powered off, the driver doesn't know that
      anything changed.  But some drivers accept that risk.
      
      Add pci_ignore_hotplug() for use by drivers that know their device cannot
      be removed.  Using pci_ignore_hotplug() tells the PCI core that hot-plug
      events for the device should be ignored.
      
      The radeon and nouveau drivers use this to switch between a low-power,
      integrated GPU and a higher-power, higher-performance discrete GPU.  They
      power off the unused GPU, but they want to remain bound to it.
      
      This is a reimplementation of f244d8b6 ("ACPIPHP / radeon / nouveau:
      Fix VGA switcheroo problem related to hotplug") but extends it to work with
      both acpiphp and pciehp.
      
      This fixes a problem where systems with dual GPUs using the radeon driver
      become unusable, freezing every few seconds (see bugzillas below).  The
      failure may occur while suspending the device, as in bug 79701:
      
          [drm] radeon: finishing device.
          radeon 0000:01:00.0: Userspace still has active objects !
          radeon 0000:01:00.0: ffff8800cb4ec288 ffff8800cb4ec000 16384 4294967297 force free
          ...
          WARNING: CPU: 0 PID: 67 at /home/apw/COD/linux/drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xd2/0xe0 [radeon]()
          trying to unbind memory from uninitialized GART !
      
      or while resuming it, as in bug 77261:
      
          radeon 0000:01:00.0: ring 0 stalled for more than 10158msec
          radeon 0000:01:00.0: GPU lockup ...
          radeon 0000:01:00.0: GPU pci config reset
          pciehp 0000:00:01.0:pcie04: Card not present on Slot(1-1)
          radeon 0000:01:00.0: GPU reset succeeded, trying to resume
          *ERROR* radeon: dpm resume failed
          radeon 0000:01:00.0: Wait for MC idle timedout !
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=77261
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=79701
      Reported-by: Shawn Starr <shawn.starr@rogers.com>
      Reported-by: Jose P. <lbdkmjdf@sharklasers.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Acked-by: Rajat Jain <rajatxjain@gmail.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Dave Airlie <airlied@redhat.com>
      CC: stable@vger.kernel.org	# v3.15+
  23. 07 Jul, 2014 1 commit
    • PCI: pciehp: Clear Data Link Layer State Changed during init · 0d25d35c
      Myron Stowe authored
      During PCIe hot-plug initialization - pciehp_probe() - data structures
      related to slot capabilities are set up.  As part of this setup, ISRs are
      put in place to handle slot events and all event bits are cleared out.
      
      This patch adds the Data Link Layer State Changed (PCI_EXP_SLTSTA_DLLSC)
      Slot Status bit to the event bits that are cleared out during
      initialization.
      
      If the BIOS doesn't clear DLLSC before handoff to the OS, pciehp notices
      that it's set and interprets it as a new Link Up event, which results in
      spurious messages:
      
        pciehp 0000:82:04.0:pcie24: slot(4): Link Up event
        pciehp 0000:82:04.0:pcie24: Device 0000:83:00.0 already exists at 0000:83:00, cannot hot-add
        pciehp 0000:82:04.0:pcie24: Cannot add device at 0000:83:00
      
      Prior to e48f1b67 ("PCI: pciehp: Use link change notifications for
      hot-plug and removal"), pciehp ignored DLLSC.
      
      Reference:
        PCI-SIG.  PCI Express Base Specification Revision 4.0 Version 0.3
        (PCI-SIG, 2014): 7.8.11. Slot Status Register (Offset 1Ah).
      
      [bhelgaas: add e48f1b67 ref and stable tag]
      Fixes: e48f1b67 ("PCI: pciehp: Use link change notifications for hot-plug and removal")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=79611
      Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      CC: stable@vger.kernel.org	# v3.15+
  24. 05 Jul, 2014 1 commit
  25. 17 Jun, 2014 3 commits
    • PCI: pciehp: Remove assumptions about which commands cause completion events · 2cc56f30
      Bjorn Helgaas authored
      We use incorrect logic to decide whether a PCIe hotplug controller
      generates command completion events.
      
      5808639b ("pciehp: fix slow probing") assumed that the Slot Status
      "Command Completed" bit was set only for commands affecting slot power,
      indicators, or electromechanical interlock.  That assumption is false: per
      sec. 6.7.3.2 of PCIe spec r3.0, a write targeting any portion of the Slot
      Control register is a command, and (if command completed events are
      supported) software must wait for a command to complete before issuing the
      next command.
      
      5808639b was intended to fix boot-time timeouts (see bugzilla below) on a
      Lenovo ThinkPad R61 with an Intel hotplug controller.  The controller
      probably has the Intel CF118 erratum, which means it doesn't report
      Command Completed unless the Slot Control power, indicator, or interlock
      bits are changed.  This causes a timeout because pciehp always waits for
      Command Completed (if supported), regardless of which bits are changed.
      
      Remove the incorrect logic because the timeouts have been addressed
      differently by these changes:
      
        PCI: pciehp: Wait for hotplug command completion lazily
        PCI: pciehp: Compute timeout from hotplug command start time
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=10751
      Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      2cc56f30
      PCI: pciehp: Compute timeout from hotplug command start time · 40b96083
      Bjorn Helgaas authored
      If we issue a hotplug command, go do something else, then come back and
      wait for the command to complete, we don't have to wait the whole timeout
      period, because some of it elapsed while we were doing something else.
      
      Keep track of the time we issued the command, and wait only until the
      timeout period from that point has elapsed.
      
      For controllers with errata like Intel CF118, we previously timed out
      before issuing the second hotplug command:
      
        At time T1 (during boot):
          - Write DLLSCE, ABPE, PDCE, etc. to Slot Control
        At time T2 (hotplug event):
          - Wait for command completion (CC) in Slot Status
          - Timeout at T2 + 1 second because CC is never set in Slot Status
          - Write PCC, PIC, etc. to Slot Control
      
      With this change, we wait until T1 + 1 second instead of T2 + 1 second.
      If the hotplug event is more than 1 second after the boot-time
      initialization, we won't wait for the timeout at all.
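      
      The deadline arithmetic above can be sketched as follows.  This is a
      minimal illustration, not the pciehp code: it assumes a millisecond
      clock instead of jiffies, and the names CMD_TIMEOUT_MS and
      remaining_wait_ms() are hypothetical.
      
      ```c
      #include <stdint.h>
      
      /* Assumed timeout of 1 second, matching the description above. */
      #define CMD_TIMEOUT_MS 1000u
      
      /*
       * Return how long we still have to wait for a command that was issued
       * at cmd_started_ms, given the current time now_ms.  The deadline is
       * measured from when the command was issued (T1), not from when we
       * start waiting (T2), so the result is 0 once the timeout period has
       * already elapsed.
       */
      static uint32_t remaining_wait_ms(uint64_t cmd_started_ms, uint64_t now_ms)
      {
      	uint64_t deadline = cmd_started_ms + CMD_TIMEOUT_MS;
      
      	if (now_ms >= deadline)
      		return 0;
      	return (uint32_t)(deadline - now_ms);
      }
      ```
      
      With a command issued at T1 = 0, a hotplug event at 400 ms would wait at
      most another 600 ms, and an event at 5000 ms would not wait at all.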
      
      We still emit a "Timeout on hotplug command" message if it timed out; we
      should see this on the first hotplug event on every controller with this
      erratum, as well as on real errors on controllers without the erratum.
      
      Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
      Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      40b96083
      PCI: pciehp: Wait for hotplug command completion lazily · 3461a068
      Bjorn Helgaas authored
      Previously we issued a hotplug command and waited for it to complete.  But
      there's no need to wait until we're ready to issue the *next* command.  The
      next command will probably be much later, so the first one may have already
      completed and we may not have to actually wait at all.
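      
      As a rough sketch of the ordering change, the wait moves from after the
      write to before the next write.  Everything here is a hypothetical model
      (struct ctrl_model and the *_model helpers are not pciehp structures or
      functions), and the wait itself is reduced to a counter:
      
      ```c
      #include <stdbool.h>
      #include <stdint.h>
      
      /* Hypothetical model of a hotplug controller, for illustration only. */
      struct ctrl_model {
      	bool cmd_busy;		/* a command was issued, completion not yet seen */
      	uint16_t slot_ctrl;	/* last value written to Slot Control */
      	unsigned int waits;	/* number of times we actually had to wait */
      };
      
      /* Stand-in for sleeping or polling until Command Completed is seen. */
      static void wait_for_completion_model(struct ctrl_model *c)
      {
      	c->waits++;
      	c->cmd_busy = false;
      }
      
      /* Stand-in for the hardware reporting Command Completed on its own. */
      static void command_completed_model(struct ctrl_model *c)
      {
      	c->cmd_busy = false;
      }
      
      /* Lazy scheme: wait for the previous command only when issuing the next. */
      static void write_cmd_lazy(struct ctrl_model *c, uint16_t val)
      {
      	if (c->cmd_busy)
      		wait_for_completion_model(c);
      	c->slot_ctrl = val;
      	c->cmd_busy = true;
      }
      ```
      
      If the completion arrives between two writes, the second write proceeds
      without waiting; only back-to-back writes pay for the wait.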
      
      Because of hardware errata, some controllers generate command completion
      events for some commands but not others.  In the case of Intel CF118 (see
      spec update reference), the controller indicates command completion only
      for Slot Control writes that change the value of the following bits:
      
        Power Controller Control
        Power Indicator Control
        Attention Indicator Control
        Electromechanical Interlock Control
      
      Changes to other bits, e.g., the interrupt enable bits, do not cause the
      Command Completed bit to be set.  Controllers from AMD and Nvidia are
      reported to have similar errata.
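      
      The erratum behavior described above can be modeled as a mask check on
      the bits that changed.  This is only a sketch: the SLTCTL_* values are
      assumed to mirror the Slot Control definitions in linux/pci_regs.h, the
      helper name is hypothetical, and the interlock bit is treated as an
      ordinary read/write field for simplicity.
      
      ```c
      #include <stdbool.h>
      #include <stdint.h>
      
      /* Slot Control masks (assumed to match linux/pci_regs.h). */
      #define SLTCTL_PDCE	0x0008	/* Presence Detect Changed Enable */
      #define SLTCTL_AIC	0x00c0	/* Attention Indicator Control */
      #define SLTCTL_PIC	0x0300	/* Power Indicator Control */
      #define SLTCTL_PCC	0x0400	/* Power Controller Control */
      #define SLTCTL_EIC	0x0800	/* Electromechanical Interlock Control */
      #define SLTCTL_DLLSCE	0x1000	/* Data Link Layer State Changed Enable */
      
      /* Bits whose change makes a CF118-affected controller report completion. */
      #define CF118_CC_BITS	(SLTCTL_AIC | SLTCTL_PIC | SLTCTL_PCC | SLTCTL_EIC)
      
      /*
       * Model of the erratum: Command Completed is reported only if the write
       * changes at least one of the power, indicator, or interlock bits.
       */
      static bool cf118_sets_cmd_completed(uint16_t old_val, uint16_t new_val)
      {
      	return ((old_val ^ new_val) & CF118_CC_BITS) != 0;
      }
      ```
      
      In this model, a write that only enables DLLSCE and PDCE never reports
      completion, while a later write that also flips PCC does.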
      
      These errata cause timeouts when pcie_enable_notification() enables
      interrupts.  Previously that timeout occurred at boot-time.  With this
      change, the timeout occurs later, when we change the state of the slot
      power, indicators, or interlock.  This speeds up boot but causes a timeout
      at the first hotplug event on the slot.  Subsequent events don't time out
      because only the first (boot-time) hotplug command updates Slot Control
      without touching the power/indicator/interlock controls.
      
      Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
      Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      3461a068
  26. 16 Jun, 2014 1 commit
  27. 11 Jun, 2014 1 commit