    coredump: add %i/%I in core_pattern to report the tid of the crashed thread · b03023ec
    Oleg Nesterov authored
    format_corename() can only pass the leader's pid to the core handler,
    so the handler has no simple way to figure out which thread originated
    the coredump.
    
    As Jan explains, this also means that there is no simple way to create
    the backtrace of the crashed process:
    
    As programs are mostly compiled with gcc's implicit
    -fomit-frame-pointer, one needs the program's .eh_frame section
    (equivalently, the PT_GNU_EH_FRAME segment) or its .debug_frame
    section.  .debug_frame is usually present only in separate debug info
    files, which are often not even installed on the system.  While
    .eh_frame is part of the executable/library (and is even always
    mapped, for C++ exception unwinding), it no longer has to be present
    anywhere on disk: the program could have been upgraded in the
    meantime, leaving the running instance with an executable file that
    is already unlinked.
    
    One possibility is to echo 0x3f >/proc/*/coredump_filter and dump all
    the file-backed memory, including the executable's .eh_frame section.
    But that can create huge core files, for example because mmapped data
    files get dumped as well.
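
    To make the cost of 0x3f concrete, here is a minimal sketch that
    decodes a coredump_filter mask.  The bit meanings are taken from
    core(5); the helper names (dumps_file_backed, describe_filter) are
    made up for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* coredump_filter bits for file-backed mappings, as listed in core(5). */
enum {
    CF_FILE_PRIVATE = 1 << 2,
    CF_FILE_SHARED  = 1 << 3,
};

/* Nonzero if the mask asks the kernel to dump any file-backed mapping. */
int dumps_file_backed(unsigned int mask)
{
    return (mask & (CF_FILE_PRIVATE | CF_FILE_SHARED)) != 0;
}

/* Print one line per mapping class selected by `mask`; return the count. */
int describe_filter(unsigned int mask, FILE *out)
{
    /* Index in this table == bit number in coredump_filter (bits 0..6). */
    static const char *const names[] = {
        "anonymous private mappings",
        "anonymous shared mappings",
        "file-backed private mappings",
        "file-backed shared mappings",
        "ELF headers",
        "private huge pages",
        "shared huge pages",
    };
    int n = 0;

    assert(out != NULL);
    for (unsigned int bit = 0; bit < sizeof(names) / sizeof(names[0]); bit++) {
        if (mask & (1u << bit)) {
            fprintf(out, "bit %u: %s\n", bit, names[bit]);
            n++;
        }
    }
    return n;
}
```

    Compared with the usual default of 0x33, 0x3f additionally sets bits
    2 and 3, which is exactly what pulls every mmapped file into the core.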
    
    Another possibility is to read .eh_frame from /proc/PID/mem in the
    core_pattern handler while the core is being dumped.  The backtrace
    also needs the register state, which the core_pattern handler can
    read first:
    
        ptrace(PTRACE_SEIZE, tid, 0, PTRACE_O_TRACEEXIT)
        close(0);    // close pipe fd to resume the sleeping dumper
        waitpid();   // should report EXIT
        PTRACE_GETREGS or other requests
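
    The sequence above could be spelled out roughly as follows.  This is
    only a sketch under assumptions: the function name and the pipe_fd
    parameter are hypothetical, PTRACE_GETREGSET with NT_PRSTATUS stands
    in for "PTRACE_GETREGS or other requests", and error handling is
    reduced to returning -1.

```c
#include <assert.h>
#include <elf.h>          /* NT_PRSTATUS */
#include <signal.h>       /* SIGTRAP */
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>      /* struct iovec */
#include <sys/user.h>     /* struct user_regs_struct */
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: attach to `tid`, let the dumper continue by closing
 * `pipe_fd` (fd 0 in a real core_pattern handler), wait for the exit stop
 * and read the general-purpose registers.  Returns 0 on success, -1 on error.
 */
int grab_regs_at_exit(pid_t tid, int pipe_fd, struct user_regs_struct *regs)
{
    struct iovec iov;
    int status;

    assert(regs != NULL);
    iov.iov_base = regs;
    iov.iov_len = sizeof(*regs);

    /* Attach without stopping the tracee; ask for a stop at exit. */
    if (ptrace(PTRACE_SEIZE, tid, (void *)0,
               (void *)(long)PTRACE_O_TRACEEXIT) == -1)
        return -1;

    /* Closing the pipe resumes the thread sleeping on it. */
    close(pipe_fd);

    /* The next report should be the PTRACE_EVENT_EXIT stop. */
    if (waitpid(tid, &status, __WALL) != tid)
        return -1;
    if (!WIFSTOPPED(status) ||
        (status >> 8) != (SIGTRAP | (PTRACE_EVENT_EXIT << 8)))
        return -1;

    /* Registers are still available in the exit stop. */
    if (ptrace(PTRACE_GETREGSET, tid, (void *)(long)NT_PRSTATUS, &iov) == -1)
        return -1;

    /* Let the thread finish exiting. */
    ptrace(PTRACE_DETACH, tid, (void *)0, (void *)0);
    return 0;
}
```

    A real handler would also read the unwind data from /proc/PID/mem
    before detaching; that part is left out here.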
    
    The remaining problem is how to get the 'tid' value of the crashed
    thread.  It could be read from the first NT_PRSTATUS note of the core
    file but that makes the core_pattern handler complicated.
    
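    Unfortunately %t is already used, so this patch uses %i/%I: %i is the
    tid as seen in the task's own pid namespace, %I is the tid in the
    initial pid namespace.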
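
    With the new specifier the handler can simply take the tid from its
    command line.  As an illustration only, assume core_pattern is set to
    "|/usr/local/bin/core-handler %P %I" (the handler path is made up);
    the handler could then validate its arguments with a small helper
    like the hypothetical parse_id() below.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper: parse a decimal pid/tid as substituted by the kernel
 * for %P or %I.  Returns 0 and stores the value in *out on success, -1 if
 * the string is empty, has trailing garbage, or is out of range.
 */
int parse_id(const char *s, pid_t *out)
{
    char *end;
    long v;

    assert(s != NULL && out != NULL);

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v <= 0 || v > INT_MAX)
        return -1;

    *out = (pid_t)v;
    return 0;
}
```

    In main(), such a handler would call parse_id(argv[2], &tid) and pass
    the resulting tid to ptrace().  %P/%I are chosen in this sketch on the
    assumption that the handler itself runs in the initial pid namespace.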
    
    Automatic Bug Reporting Tool (https://github.com/abrt/abrt/wiki/overview)
    is experimenting with this.  It is using the elfutils
    (https://fedorahosted.org/elfutils/) unwinder for generating the
    backtraces.  Apart from not needing matching executables as mentioned
    above, another advantage is that we can get the backtrace without saving
    the core (which might be quite large) to disk.
    
    [mmilata@redhat.com: final paragraph of changelog]
    Signed-off-by: Jan Kratochvil <jan.kratochvil@redhat.com>
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Denys Vlasenko <dvlasenk@redhat.com>
    Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
    Cc: Mark Wielaard <mjw@redhat.com>
    Cc: Martin Milata <mmilata@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>