Skip to content
  • Ashok Raj's avatar
    KVM/x86: Add IBPB support · 15d45071
    Ashok Raj authored
    The Indirect Branch Predictor Barrier (IBPB) is an indirect branch
    control mechanism. It keeps earlier branches from influencing
    later ones.
    
    Unlike IBRS and STIBP, IBPB does not define a new mode of operation.
    It's a command that ensures predicted branch targets aren't used after
    the barrier. Although IBRS and IBPB are enumerated by the same CPUID
    enumeration, IBPB is very different.
    
    IBPB helps mitigate against three potential attacks:
    
    * Mitigate guests from being attacked by other guests.
      - This is addressed by issing IBPB when we do a guest switch.
    
    * Mitigate attacks from guest/ring3->host/ring3.
      These would require a IBPB during context switch in host, or after
      VMEXIT. The host process has two ways to mitigate
      - Either it can be compiled with retpoline
      - If its going through context switch, and has set !dumpable then
        there is a IBPB in that path.
        (Tim's patch: https://patchwork.kernel.org/patch/10192871)
      - The case where after a VMEXIT you return back to Qemu might make
        Qemu attackable from guest when Qemu isn't compiled with retpoline.
      There are issues reported when doing IBPB on every VMEXIT that resulted
      in some tsc calibration woes in guest.
    
    * Mitigate guest/ring0->host/ring0 attacks.
      When host kernel is using retpoline it is safe against these attacks.
      If host kernel isn't using retpoline we might need to do a IBPB flush on
      every VMEXIT.
    
    Even when using retpoline for indirect calls, in certain conditions 'ret'
    can use the BTB on Skylake-era CPUs. There are other mitigations
    available like RSB stuffing/clearing.
    
    * IBPB is issued only for SVM during svm_free_vcpu().
      VMX has a vmclear and SVM doesn't.  Follow discussion here:
      https://lkml.org/lkml/2018/1/15/146
    
    Please refer to the following spec for more details on the enumeration
    and control.
    
    Refer here to get documentation about mitigations.
    
    https://software.intel.com/en-us/side-channel-security-support
    
    
    
    [peterz: rebase and changelog rewrite]
    [karahmed: - rebase
               - vmx: expose PRED_CMD if guest has it in CPUID
               - svm: only pass through IBPB if guest has it in CPUID
               - vmx: support !cpu_has_vmx_msr_bitmap()]
               - vmx: support nested]
    [dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS)
            PRED_CMD is a write-only MSR]
    
    Signed-off-by: default avatarAshok Raj <ashok.raj@intel.com>
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
    Signed-off-by: default avatarKarimAllah Ahmed <karahmed@amazon.de>
    Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
    Reviewed-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: kvm@vger.kernel.org
    Cc: Asit Mallick <asit.k.mallick@intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
    Cc: Greg KH <gregkh@linuxfoundation.org>
    Cc: Jun Nakajima <jun.nakajima@intel.com>
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Tim Chen <tim.c.chen@linux.intel.com>
    Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com
    Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de
    15d45071