Skip to content
  • Kan Liang's avatar
    perf/x86/intel/lbr: Fix incomplete LBR call stack · d5963fae
    Kan Liang authored
    [ Upstream commit 0592e57b
    
     ]
    
    LBR has a limited stack size. If a task has a deeper call stack than
    LBR's stack size, only the overflowed part is reported. A complete call
    stack may not be reconstructed by perf tool.
    
    Current code doesn't access all LBR registers. It only read the ones
    below the TOS. The LBR registers above the TOS will be discarded
    unconditionally.
    
    When a CALL is captured, the TOS is incremented by 1 , modulo max LBR
    stack size. The LBR HW only records the call stack information to the
    register which the TOS points to. It will not touch other LBR
    registers. So the registers above the TOS probably still store the valid
    call stack information for an overflowed call stack, which need to be
    reported.
    
    To retrieve complete call stack information, we need to start from TOS,
    read all LBR registers until an invalid entry is detected.
    0s can be used to detect the invalid entry, because:
    
     - When a RET is captured, the HW zeros the LBR register which TOS points
       to, then decreases the TOS.
     - The LBR registers are reset to 0 when adding a new LBR event or
       scheduling an existing LBR event.
     - A taken branch at IP 0 is not expected
    
    The context switch code is also modified to save/restore all valid LBR
    registers. Furthermore, the LBR registers, which don't have valid call
    stack information, need to be reset in restore, because they may be
    polluted while swapped out.
    
    Here is a small test program, tchain_deep.
    Its call stack is deeper than 32.
    
     noinline void f33(void)
     {
            int i;
    
            for (i = 0; i < 10000000;) {
                    if (i%2)
                            i++;
                    else
                            i++;
            }
     }
    
     noinline void f32(void)
     {
            f33();
     }
    
     noinline void f31(void)
     {
            f32();
     }
    
     ... ...
    
     noinline void f1(void)
     {
            f2();
     }
    
     int main()
     {
            f1();
     }
    
    Here is the test result on SKX. The max stack size of SKX is 32.
    
    Without the patch:
    
     $ perf record -e cycles --call-graph lbr -- ./tchain_deep
     $ perf report --stdio
     #
     # Children      Self  Command      Shared Object     Symbol
     # ........  ........  ...........  ................  .................
     #
       100.00%    99.99%  tchain_deep    tchain_deep       [.] f33
                |
                 --99.99%--f30
                           f31
                           f32
                           f33
    
    With the patch:
    
     $ perf record -e cycles --call-graph lbr -- ./tchain_deep
     $ perf report --stdio
     # Children      Self  Command      Shared Object     Symbol
     # ........  ........  ...........  ................  ..................
     #
        99.99%     0.00%  tchain_deep    tchain_deep       [.] f1
                |
                ---f1
                   f2
                   f3
                   f4
                   f5
                   f6
                   f7
                   f8
                   f9
                   f10
                   f11
                   f12
                   f13
                   f14
                   f15
                   f16
                   f17
                   f18
                   f19
                   f20
                   f21
                   f22
                   f23
                   f24
                   f25
                   f26
                   f27
                   f28
                   f29
                   f30
                   f31
                   f32
                   f33
    
    Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Vince Weaver <vincent.weaver@maine.edu>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: acme@kernel.org
    Cc: eranian@google.com
    Link: https://lore.kernel.org/lkml/1528213126-4312-1-git-send-email-kan.liang@linux.intel.com
    
    
    Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    Signed-off-by: default avatarSasha Levin <alexander.levin@microsoft.com>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    d5963fae