1. 19 Sep, 2014 1 commit
    • init/main.c: Give init_task a canary · d4311ff1
      Aaron Tomlin authored
      Tasks get their end of stack set to STACK_END_MAGIC with the
      aim to catch stack overruns. Currently this feature does not
      apply to init_task. This patch removes this restriction.
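
The end-of-stack canary idea can be modelled in user space. The sketch below is not kernel code: it is a toy Python model in which a stack is a byte array that grows downward and the lowest word holds a magic value. The constant is assumed to match the kernel's STACK_END_MAGIC, and the helper names (new_stack, push_frame, stack_end_corrupted) are hypothetical.

```python
import struct

STACK_END_MAGIC = 0x57AC6E9D  # assumed to match the kernel constant


def new_stack(size):
    """Return a zeroed stack whose lowest word holds the canary."""
    stack = bytearray(size)
    struct.pack_into("<Q", stack, 0, STACK_END_MAGIC)
    return stack


def push_frame(stack, sp, frame_size):
    """Simulate a call: the stack grows downward and the frame is written."""
    new_sp = sp - frame_size
    start = max(new_sp, 0)
    stack[start:sp] = b"\xAA" * (sp - start)
    return new_sp


def stack_end_corrupted(stack):
    """True if the canary at the end of the stack was overwritten."""
    (value,) = struct.unpack_from("<Q", stack, 0)
    return value != STACK_END_MAGIC
```

A task that never receives the canary, as init_task did before this patch, can never fail such a check, which is the restriction the patch lifts.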
      
      Note that a similar patch was posted by Prarit Bhargava
      some time ago but was never merged:
      
        http://marc.info/?l=linux-kernel&m=127144305403241&w=2

      Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: aneesh.kumar@linux.vnet.ibm.com
      Cc: dzickus@redhat.com
      Cc: bmr@redhat.com
      Cc: jcastillo@redhat.com
      Cc: jgh@redhat.com
      Cc: minchan@kernel.org
      Cc: tglx@linutronix.de
      Cc: hannes@cmpxchg.org
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Daeseok Youn <daeseok.youn@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Michael Opdenacker <michael.opdenacker@free-electrons.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/1410527779-8133-2-git-send-email-atomlin@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 02 Jun, 2014 1 commit
    • tracing: Print max callstack on stacktrace bug · e3172181
      Minchan Kim authored
      While I was experimenting with my own feature (e.g., something in the reclaim path),
      the kernel would easily oops. I guessed that the reason had to do with
      stack overflow and wanted to prove it.
      
      I discovered the stack tracer which proved to be very useful for me but
      the kernel would oops before my user program could gather the information via
      "watch cat /sys/kernel/debug/tracing/stack_trace" so I couldn't get any
      message from that. What I needed was to have the stack tracer emit the
      kernel stack usage before it does the oops so I could find what was
      hogging the stack.
      
      This patch shows the callstack of max stack usage right before an oops so
      we can find a culprit.
      
      So, the result is as follows.
      
      [ 1116.522206] init: lightdm main process (1246) terminated with status 1
      [ 1119.922916] init: failsafe-x main process (1272) terminated with status 1
      [ 3887.728131] kworker/u24:1 (6637) used greatest stack depth: 256 bytes left
      [ 6397.629227] cc1 (9554) used greatest stack depth: 128 bytes left
      [ 7174.467392]         Depth    Size   Location    (47 entries)
      [ 7174.467392]         -----    ----   --------
      [ 7174.467785]   0)     7248     256   get_page_from_freelist+0xa7/0x920
      [ 7174.468506]   1)     6992     352   __alloc_pages_nodemask+0x1cd/0xb20
      [ 7174.469224]   2)     6640       8   alloc_pages_current+0x10f/0x1f0
      [ 7174.469413]   3)     6632     168   new_slab+0x2c5/0x370
      [ 7174.469413]   4)     6464       8   __slab_alloc+0x3a9/0x501
      [ 7174.469413]   5)     6456      80   __kmalloc+0x1cb/0x200
      [ 7174.469413]   6)     6376     376   vring_add_indirect+0x36/0x200
      [ 7174.469413]   7)     6000     144   virtqueue_add_sgs+0x2e2/0x320
      [ 7174.469413]   8)     5856     288   __virtblk_add_req+0xda/0x1b0
      [ 7174.469413]   9)     5568      96   virtio_queue_rq+0xd3/0x1d0
      [ 7174.469413]  10)     5472     128   __blk_mq_run_hw_queue+0x1ef/0x440
      [ 7174.469413]  11)     5344      16   blk_mq_run_hw_queue+0x35/0x40
      [ 7174.469413]  12)     5328      96   blk_mq_insert_requests+0xdb/0x160
      [ 7174.469413]  13)     5232     112   blk_mq_flush_plug_list+0x12b/0x140
      [ 7174.469413]  14)     5120     112   blk_flush_plug_list+0xc7/0x220
      [ 7174.469413]  15)     5008      64   io_schedule_timeout+0x88/0x100
      [ 7174.469413]  16)     4944     128   mempool_alloc+0x145/0x170
      [ 7174.469413]  17)     4816      96   bio_alloc_bioset+0x10b/0x1d0
      [ 7174.469413]  18)     4720      48   get_swap_bio+0x30/0x90
      [ 7174.469413]  19)     4672     160   __swap_writepage+0x150/0x230
      [ 7174.469413]  20)     4512      32   swap_writepage+0x42/0x90
      [ 7174.469413]  21)     4480     320   shrink_page_list+0x676/0xa80
      [ 7174.469413]  22)     4160     208   shrink_inactive_list+0x262/0x4e0
      [ 7174.469413]  23)     3952     304   shrink_lruvec+0x3e1/0x6a0
      [ 7174.469413]  24)     3648      80   shrink_zone+0x3f/0x110
      [ 7174.469413]  25)     3568     128   do_try_to_free_pages+0x156/0x4c0
      [ 7174.469413]  26)     3440     208   try_to_free_pages+0xf7/0x1e0
      [ 7174.469413]  27)     3232     352   __alloc_pages_nodemask+0x783/0xb20
      [ 7174.469413]  28)     2880       8   alloc_pages_current+0x10f/0x1f0
      [ 7174.469413]  29)     2872     200   __page_cache_alloc+0x13f/0x160
      [ 7174.469413]  30)     2672      80   find_or_create_page+0x4c/0xb0
      [ 7174.469413]  31)     2592      80   ext4_mb_load_buddy+0x1e9/0x370
      [ 7174.469413]  32)     2512     176   ext4_mb_regular_allocator+0x1b7/0x460
      [ 7174.469413]  33)     2336     128   ext4_mb_new_blocks+0x458/0x5f0
      [ 7174.469413]  34)     2208     256   ext4_ext_map_blocks+0x70b/0x1010
      [ 7174.469413]  35)     1952     160   ext4_map_blocks+0x325/0x530
      [ 7174.469413]  36)     1792     384   ext4_writepages+0x6d1/0xce0
      [ 7174.469413]  37)     1408      16   do_writepages+0x23/0x40
      [ 7174.469413]  38)     1392      96   __writeback_single_inode+0x45/0x2e0
      [ 7174.469413]  39)     1296     176   writeback_sb_inodes+0x2ad/0x500
      [ 7174.469413]  40)     1120      80   __writeback_inodes_wb+0x9e/0xd0
      [ 7174.469413]  41)     1040     160   wb_writeback+0x29b/0x350
      [ 7174.469413]  42)      880     208   bdi_writeback_workfn+0x11c/0x480
      [ 7174.469413]  43)      672     144   process_one_work+0x1d2/0x570
      [ 7174.469413]  44)      528     112   worker_thread+0x116/0x370
      [ 7174.469413]  45)      416     240   kthread+0xf3/0x110
      [ 7174.469413]  46)      176     176   ret_from_fork+0x7c/0xb0
      [ 7174.469413] ------------[ cut here ]------------
      [ 7174.469413] kernel BUG at kernel/trace/trace_stack.c:174!
      [ 7174.469413] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ 7174.469413] Dumping ftrace buffer:
      [ 7174.469413]    (ftrace buffer empty)
      [ 7174.469413] Modules linked in:
      [ 7174.469413] CPU: 0 PID: 440 Comm: kworker/u24:0 Not tainted 3.14.0+ #212
      [ 7174.469413] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [ 7174.469413] Workqueue: writeback bdi_writeback_workfn (flush-253:0)
      [ 7174.469413] task: ffff880034170000 ti: ffff880029518000 task.ti: ffff880029518000
      [ 7174.469413] RIP: 0010:[<ffffffff8112336e>]  [<ffffffff8112336e>] stack_trace_call+0x2de/0x340
      [ 7174.469413] RSP: 0000:ffff880029518290  EFLAGS: 00010046
      [ 7174.469413] RAX: 0000000000000030 RBX: 000000000000002f RCX: 0000000000000000
      [ 7174.469413] RDX: 0000000000000000 RSI: 000000000000002f RDI: ffffffff810b7159
      [ 7174.469413] RBP: ffff8800295182f0 R08: ffffffffffffffff R09: 0000000000000000
      [ 7174.469413] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff82768dfc
      [ 7174.469413] R13: 000000000000f2e8 R14: ffff8800295182b8 R15: 00000000000000f8
      [ 7174.469413] FS:  0000000000000000(0000) GS:ffff880037c00000(0000) knlGS:0000000000000000
      [ 7174.469413] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 7174.469413] CR2: 00002acd0b994000 CR3: 0000000001c0b000 CR4: 00000000000006f0
      [ 7174.469413] Stack:
      [ 7174.469413]  0000000000000000 ffffffff8114fdb7 0000000000000087 0000000000001c50
      [ 7174.469413]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      [ 7174.469413]  0000000000000002 ffff880034170000 ffff880034171028 0000000000000000
      [ 7174.469413] Call Trace:
      [ 7174.469413]  [<ffffffff8114fdb7>] ? get_page_from_freelist+0xa7/0x920
      [ 7174.469413]  [<ffffffff816eee3f>] ftrace_call+0x5/0x2f
      [ 7174.469413]  [<ffffffff81165065>] ? next_zones_zonelist+0x5/0x70
      [ 7174.469413]  [<ffffffff810a23fa>] ? __bfs+0x11a/0x270
      [ 7174.469413]  [<ffffffff81165065>] ? next_zones_zonelist+0x5/0x70
      [ 7174.469413]  [<ffffffff8114fdb7>] ? get_page_from_freelist+0xa7/0x920
      [ 7174.469413]  [<ffffffff8119092f>] ? alloc_pages_current+0x10f/0x1f0
      [ 7174.469413]  [<ffffffff811507fd>] __alloc_pages_nodemask+0x1cd/0xb20
      [ 7174.469413]  [<ffffffff810a4de6>] ? check_irq_usage+0x96/0xe0
      [ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
      [ 7174.469413]  [<ffffffff8119092f>] alloc_pages_current+0x10f/0x1f0
      [ 7174.469413]  [<ffffffff81199cd5>] ? new_slab+0x2c5/0x370
      [ 7174.469413]  [<ffffffff81199cd5>] new_slab+0x2c5/0x370
      [ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
      [ 7174.469413]  [<ffffffff816db002>] __slab_alloc+0x3a9/0x501
      [ 7174.469413]  [<ffffffff8119af8b>] ? __kmalloc+0x1cb/0x200
      [ 7174.469413]  [<ffffffff8141dc46>] ? vring_add_indirect+0x36/0x200
      [ 7174.469413]  [<ffffffff8141dc46>] ? vring_add_indirect+0x36/0x200
      [ 7174.469413]  [<ffffffff8141dc46>] ? vring_add_indirect+0x36/0x200
      [ 7174.469413]  [<ffffffff8119af8b>] __kmalloc+0x1cb/0x200
      [ 7174.469413]  [<ffffffff8141de10>] ? vring_add_indirect+0x200/0x200
      [ 7174.469413]  [<ffffffff8141dc46>] vring_add_indirect+0x36/0x200
      [ 7174.469413]  [<ffffffff8141e402>] virtqueue_add_sgs+0x2e2/0x320
      [ 7174.469413]  [<ffffffff8148e35a>] __virtblk_add_req+0xda/0x1b0
      [ 7174.469413]  [<ffffffff8148e503>] virtio_queue_rq+0xd3/0x1d0
      [ 7174.469413]  [<ffffffff8134aa0f>] __blk_mq_run_hw_queue+0x1ef/0x440
      [ 7174.469413]  [<ffffffff8134b0d5>] blk_mq_run_hw_queue+0x35/0x40
      [ 7174.469413]  [<ffffffff8134b7bb>] blk_mq_insert_requests+0xdb/0x160
      [ 7174.469413]  [<ffffffff8134be5b>] blk_mq_flush_plug_list+0x12b/0x140
      [ 7174.469413]  [<ffffffff81342237>] blk_flush_plug_list+0xc7/0x220
      [ 7174.469413]  [<ffffffff816e60ef>] ? _raw_spin_unlock_irqrestore+0x3f/0x70
      [ 7174.469413]  [<ffffffff816e16e8>] io_schedule_timeout+0x88/0x100
      [ 7174.469413]  [<ffffffff816e1665>] ? io_schedule_timeout+0x5/0x100
      [ 7174.469413]  [<ffffffff81149415>] mempool_alloc+0x145/0x170
      [ 7174.469413]  [<ffffffff8109baf0>] ? __init_waitqueue_head+0x60/0x60
      [ 7174.469413]  [<ffffffff811e246b>] bio_alloc_bioset+0x10b/0x1d0
      [ 7174.469413]  [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
      [ 7174.469413]  [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
      [ 7174.469413]  [<ffffffff81184110>] get_swap_bio+0x30/0x90
      [ 7174.469413]  [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
      [ 7174.469413]  [<ffffffff81184660>] __swap_writepage+0x150/0x230
      [ 7174.469413]  [<ffffffff810ab405>] ? do_raw_spin_unlock+0x5/0xa0
      [ 7174.469413]  [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
      [ 7174.469413]  [<ffffffff81184515>] ? __swap_writepage+0x5/0x230
      [ 7174.469413]  [<ffffffff81184782>] swap_writepage+0x42/0x90
      [ 7174.469413]  [<ffffffff8115ae96>] shrink_page_list+0x676/0xa80
      [ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
      [ 7174.469413]  [<ffffffff8115b872>] shrink_inactive_list+0x262/0x4e0
      [ 7174.469413]  [<ffffffff8115c1c1>] shrink_lruvec+0x3e1/0x6a0
      [ 7174.469413]  [<ffffffff8115c4bf>] shrink_zone+0x3f/0x110
      [ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
      [ 7174.469413]  [<ffffffff8115c9e6>] do_try_to_free_pages+0x156/0x4c0
      [ 7174.469413]  [<ffffffff8115cf47>] try_to_free_pages+0xf7/0x1e0
      [ 7174.469413]  [<ffffffff81150db3>] __alloc_pages_nodemask+0x783/0xb20
      [ 7174.469413]  [<ffffffff8119092f>] alloc_pages_current+0x10f/0x1f0
      [ 7174.469413]  [<ffffffff81145c0f>] ? __page_cache_alloc+0x13f/0x160
      [ 7174.469413]  [<ffffffff81145c0f>] __page_cache_alloc+0x13f/0x160
      [ 7174.469413]  [<ffffffff81146c6c>] find_or_create_page+0x4c/0xb0
      [ 7174.469413]  [<ffffffff811463e5>] ? find_get_page+0x5/0x130
      [ 7174.469413]  [<ffffffff812837b9>] ext4_mb_load_buddy+0x1e9/0x370
      [ 7174.469413]  [<ffffffff81284c07>] ext4_mb_regular_allocator+0x1b7/0x460
      [ 7174.469413]  [<ffffffff81281070>] ? ext4_mb_use_preallocated+0x40/0x360
      [ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
      [ 7174.469413]  [<ffffffff81287eb8>] ext4_mb_new_blocks+0x458/0x5f0
      [ 7174.469413]  [<ffffffff8127d83b>] ext4_ext_map_blocks+0x70b/0x1010
      [ 7174.469413]  [<ffffffff8124e6d5>] ext4_map_blocks+0x325/0x530
      [ 7174.469413]  [<ffffffff81253871>] ext4_writepages+0x6d1/0xce0
      [ 7174.469413]  [<ffffffff812531a0>] ? ext4_journalled_write_end+0x330/0x330
      [ 7174.469413]  [<ffffffff811539b3>] do_writepages+0x23/0x40
      [ 7174.469413]  [<ffffffff811d2365>] __writeback_single_inode+0x45/0x2e0
      [ 7174.469413]  [<ffffffff811d36ed>] writeback_sb_inodes+0x2ad/0x500
      [ 7174.469413]  [<ffffffff811d39de>] __writeback_inodes_wb+0x9e/0xd0
      [ 7174.469413]  [<ffffffff811d40bb>] wb_writeback+0x29b/0x350
      [ 7174.469413]  [<ffffffff81057c3d>] ? __local_bh_enable_ip+0x6d/0xd0
      [ 7174.469413]  [<ffffffff811d6e9c>] bdi_writeback_workfn+0x11c/0x480
      [ 7174.469413]  [<ffffffff81070610>] ? process_one_work+0x170/0x570
      [ 7174.469413]  [<ffffffff81070672>] process_one_work+0x1d2/0x570
      [ 7174.469413]  [<ffffffff81070610>] ? process_one_work+0x170/0x570
      [ 7174.469413]  [<ffffffff81071bb6>] worker_thread+0x116/0x370
      [ 7174.469413]  [<ffffffff81071aa0>] ? manage_workers.isra.19+0x2e0/0x2e0
      [ 7174.469413]  [<ffffffff81078e53>] kthread+0xf3/0x110
      [ 7174.469413]  [<ffffffff81078d60>] ? flush_kthread_worker+0x150/0x150
      [ 7174.469413]  [<ffffffff816ef0ec>] ret_from_fork+0x7c/0xb0
      [ 7174.469413]  [<ffffffff81078d60>] ? flush_kthread_worker+0x150/0x150
      [ 7174.469413] Code: c0 49 bc fc 8d 76 82 ff ff ff ff e8 44 5a 5b 00 31 f6 8b 05 95 2b b3 00 48 39 c6 7d 0e 4c 8b 04 f5 20 5f c5 81 49 83 f8 ff 75 11 <0f> 0b 48 63 05 71 5a 64 01 48 29 c3 e9 d0 fd ff ff 48 8d 5e 01
      [ 7174.469413] RIP  [<ffffffff8112336e>] stack_trace_call+0x2de/0x340
      [ 7174.469413]  RSP <ffff880029518290>
      [ 7174.469413] ---[ end trace c97d325b36b718f3 ]---
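
The Depth/Size/Location table printed before the BUG can be post-processed to find the largest frames. The following is a minimal sketch, assuming the row layout shown above with an optional dmesg timestamp prefix; the function names are hypothetical and not part of any kernel tool.

```python
import re

# Optional "[ 7174.467785]" prefix, then "N)  depth  size  location".
ROW = re.compile(
    r"^\s*(?:\[\s*\d+\.\d+\]\s*)?(\d+)\)\s+(\d+)\s+(\d+)\s+(\S+)\s*$"
)


def parse_stack_trace(text):
    """Return (index, depth, size, location) tuples for each table row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            index, depth, size, location = match.groups()
            rows.append((int(index), int(depth), int(size), location))
    return rows


def biggest_frames(rows, count=3):
    """Return the rows with the largest Size column, largest first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:count]


def sizes_consistent(rows):
    """Each Size should equal this row's Depth minus the next row's Depth."""
    return all(cur[1] - nxt[1] == cur[2] for cur, nxt in zip(rows, rows[1:]))
```

Feeding the captured console log to parse_stack_trace() and then biggest_frames() gives a short list of candidate stack hogs.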
      
      Link: http://lkml.kernel.org/p/1401683592-1651-1-git-send-email-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  3. 24 Apr, 2014 1 commit
  4. 24 Mar, 2014 1 commit
  5. 02 Jan, 2014 1 commit
  6. 13 Apr, 2013 1 commit
  7. 12 Apr, 2013 1 commit
  8. 15 Mar, 2013 3 commits
    • tracing: Remove most or all of stack tracer stack size from stack_max_size · 4df29712
      Steven Rostedt (Red Hat) authored
      Currently, the depth reported in the stack tracer stack_trace file
      does not match the stack_max_size file. This is because the stack_max_size
      includes the overhead of stack tracer itself while the depth does not.
      
      The first time a max is triggered, a calculation is now performed that
      figures out the overhead of the stack tracer and subtracts it from
      the stack_max_size variable. The overhead is stored and is subtracted
      from the reported stack size for comparing for a new max.
      
      Now the stack_max_size corresponds to the reported depth:
      
       # cat stack_max_size
      4640
      
       # cat stack_trace
              Depth    Size   Location    (48 entries)
              -----    ----   --------
        0)     4640      32   _raw_spin_lock+0x18/0x24
        1)     4608     112   ____cache_alloc+0xb7/0x22d
        2)     4496      80   kmem_cache_alloc+0x63/0x12f
        3)     4416      16   mempool_alloc_slab+0x15/0x17
      [...]
      
      While testing against an older gcc on x86 that uses mcount instead
      of fentry, I found that passing in ip + MCOUNT_INSN_SIZE let the
      stack trace show one more function deep which was missing before.
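
The overhead bookkeeping described earlier in this message can be sketched as a small model. This is an illustration only, not the kernel's implementation: the class name, the method signature, and the input numbers other than 4640 are assumptions made for the example.

```python
class MaxStackTracker:
    """Toy model: learn the tracer's own overhead once, then subtract it."""

    def __init__(self):
        self.tracer_overhead = 0  # unknown until the first maximum is seen
        self.max_size = 0

    def check(self, raw_size, traced_depth):
        """raw_size: stack use measured from inside the tracer.
        traced_depth: depth of the outermost traced function (no tracer).
        Returns True when a new maximum is recorded."""
        size = raw_size - self.tracer_overhead
        if size <= self.max_size:
            return False
        if self.tracer_overhead == 0:
            # First maximum: work out how much stack the tracer itself uses.
            self.tracer_overhead = raw_size - traced_depth
            size = raw_size - self.tracer_overhead
        self.max_size = size
        return True
```

After the first maximum, later measurements are compared with the overhead already removed, so the stored maximum and the reported depth stay in step.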
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix stack tracer with fentry use · d4ecbfc4
      Steven Rostedt (Red Hat) authored
      When gcc 4.6 on x86 is used, the function tracer will use the new
      option -mfentry which does a call to "fentry" at every function
      instead of "mcount". The significance of this is that fentry is
      called as the first operation of the function, whereas mcount is
      called after the stack frame has been set up.
      
      This causes the stack tracer to show some bogus results for the size
      of the last function traced, as well as showing "ftrace_call" instead
      of the function. This is due to the stack frame not being set up
      by the function that is about to be traced.
      
       # cat stack_trace
              Depth    Size   Location    (48 entries)
              -----    ----   --------
        0)     4824     216   ftrace_call+0x5/0x2f
        1)     4608     112   ____cache_alloc+0xb7/0x22d
        2)     4496      80   kmem_cache_alloc+0x63/0x12f
      
      The 216 size for ftrace_call includes both the ftrace_call stack
      (which includes the saving of registers it does), as well as the
      stack size of the parent.
      
      To fix this, if CC_USING_FENTRY is defined, then the stack_tracer
      will reserve the first item in stack_dump_trace[] array when
      calling save_stack_trace(), and it will fill it in with the parent ip.
      Then the code will look for the parent pointer on the stack and
      give the real size of the parent's stack pointer:
      
       # cat stack_trace
              Depth    Size   Location    (14 entries)
              -----    ----   --------
        0)     2640      48   update_group_power+0x26/0x187
        1)     2592     224   update_sd_lb_stats+0x2a5/0x4ac
        2)     2368     160   find_busiest_group+0x31/0x1f1
        3)     2208     256   load_balance+0xd9/0x662
      
      I'm Cc'ing stable, although it's not urgent, as it only shows bogus
      size for item #0, the rest of the trace is legit. It should still be
      corrected in previous stable releases.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Use stack of calling function for stack tracer · 87889501
      Steven Rostedt (Red Hat) authored
      Use the stack of stack_trace_call() instead of check_stack() as
      the test pointer for max stack size. It makes it a bit cleaner
      and a little more accurate.
      
      Adding stable, as a later fix depends on this patch.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  9. 19 Nov, 2012 1 commit
  10. 31 Jul, 2012 1 commit
    • ftrace: Add default recursion protection for function tracing · 4740974a
      Steven Rostedt authored
      As more users of the function tracer utility are being added, they do
      not always add the necessary recursion protection. To protect from
      function recursion due to tracing, if the callback ftrace_ops does not
      specifically specify that it protects against recursion (by setting
      the FTRACE_OPS_FL_RECURSION_SAFE flag), the list operation will be
      called by the mcount trampoline which adds recursion protection.
      
      If the flag is set, then the function will be called directly with no
      extra protection.
      
      Note, the list operation is called if more than one function callback
      is registered, or if the arch does not support all of the function
      tracer features.
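
The dispatch rule described above can be modelled in a few lines. This is a user-space sketch under stated assumptions: RECURSION_SAFE is a stand-in for FTRACE_OPS_FL_RECURSION_SAFE, and the Ops and Dispatcher classes are hypothetical, not kernel structures.

```python
RECURSION_SAFE = 0x1  # hypothetical stand-in for FTRACE_OPS_FL_RECURSION_SAFE


class Ops:
    """A registered callback plus its flags."""

    def __init__(self, func, flags=0):
        self.func = func
        self.flags = flags


class Dispatcher:
    """Calls a callback directly if it claims to be recursion safe,
    otherwise wraps the call in a simple recursion guard."""

    def __init__(self):
        self.in_callback = False

    def call(self, ops, ip):
        if ops.flags & RECURSION_SAFE:
            ops.func(ip)  # direct call, no extra protection
            return
        if self.in_callback:
            return  # nested invocation caused by tracing: drop it
        self.in_callback = True
        try:
            ops.func(ip)
        finally:
            self.in_callback = False
```

A callback that does not set the flag and happens to trigger tracing again is invoked only once per outer call in this model.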
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  11. 19 Jul, 2012 2 commits
  12. 21 Dec, 2011 2 commits
  13. 15 Jun, 2011 1 commit
  14. 18 May, 2011 1 commit
    • ftrace: Implement separate user function filtering · b848914c
      Steven Rostedt authored
      ftrace_ops that are registered to trace functions can now be
      agnostic to each other in respect to what functions they trace.
      Each ops has its own hash of the functions it wants to trace
      and a hash of those it does not want to trace. An empty hash for
      the functions it wants to trace denotes that all functions should
      be traced that are not in the notrace hash.
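
The per-ops filtering rule can be written as a single predicate. The sketch below uses Python sets in place of the kernel's hashes; the function name ops_traces is hypothetical.

```python
def ops_traces(function, filter_hash, notrace_hash):
    """Return True if an ops with these two hashes should trace `function`.

    An empty filter hash means "trace everything", and the notrace hash
    always wins over the filter hash.
    """
    if function in notrace_hash:
        return False
    return not filter_hash or function in filter_hash
```

Because each ops is evaluated against its own pair of hashes, two ops can trace entirely different sets of functions at the same time.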
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  15. 15 Oct, 2010 1 commit
    • llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann authored
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
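
The rule ordering in the semantic patch can be summarised as a decision function. This is a simplified model, not a replacement for the Coccinelle rules: the boolean parameters are hypothetical summaries of what each rule matches, and the function name choose_llseek is made up for the example.

```python
def choose_llseek(has_llseek=False, uses_nonseekable_open=False,
                  read_is_seq_read=False, has_readdir=False,
                  read_touches_pos=False, write_touches_pos=False):
    """Return the .llseek value the patch would add, or None if the
    file_operations already has one and is left untouched."""
    if has_llseek:
        return None
    if uses_nonseekable_open:
        return "no_llseek"
    if read_is_seq_read:
        return "seq_lseek"
    if has_readdir or read_touches_pos or write_touches_pos:
        return "default_llseek"
    # Neither read nor write uses the file position (or neither exists).
    return "noop_llseek"
```

The order of the checks matters: earlier rules exclude later ones, which mirrors the "depends on !..." clauses in the patch.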
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
  16. 25 Aug, 2010 1 commit
    • tracing/trace_stack: Fix stack trace on ppc64 · 151772db
      Anton Blanchard authored
      save_stack_trace() stores the instruction pointer, not the
      function descriptor. On ppc64 the trace stack code currently
      dereferences the instruction pointer and shows 8 bytes of
      instructions in our backtraces:
      
       # cat /sys/kernel/debug/tracing/stack_trace
              Depth    Size   Location    (26 entries)
              -----    ----   --------
        0)     5424     112   0x6000000048000004
        1)     5312     160   0x60000000ebad01b0
        2)     5152     160   0x2c23000041c20030
        3)     4992     240   0x600000007c781b79
        4)     4752     160   0xe84100284800000c
        5)     4592     192   0x600000002fa30000
        6)     4400     256   0x7f1800347b7407e0
        7)     4144     208   0xe89f0108f87f0070
        8)     3936     272   0xe84100282fa30000
      
      Since we aren't dealing with function descriptors, use %pS
      instead of %pF to fix it:
      
       # cat /sys/kernel/debug/tracing/stack_trace
              Depth    Size   Location    (26 entries)
              -----    ----   --------
        0)     5424     112   ftrace_call+0x4/0x8
        1)     5312     160   .current_io_context+0x28/0x74
        2)     5152     160   .get_io_context+0x48/0xa0
        3)     4992     240   .cfq_set_request+0x94/0x4c4
        4)     4752     160   .elv_set_request+0x60/0x84
        5)     4592     192   .get_request+0x2d4/0x468
        6)     4400     256   .get_request_wait+0x7c/0x258
        7)     4144     208   .__make_request+0x49c/0x610
        8)     3936     272   .generic_make_request+0x390/0x434
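
      The difference between the two format specifiers can be sketched with a
      toy model. This is only an illustration: the addresses, memory words and
      the format_pS/format_pF helper names are invented here, and real output
      also carries the symbol size, which the model omits.

```python
# Toy model of symbol formatting on an ABI that uses function descriptors.
# All addresses, memory words and symbol names are made up for illustration.

SYMBOLS = {0x1000: ".get_request", 0x2000: ".elv_set_request"}  # symbol start addresses
SYMBOL_SIZE = 0x1000  # assumed size of every symbol in this model

MEMORY = {
    0x1004: 0x600000002FA30000,  # instruction bytes inside .get_request
    0x9000: 0x2000,              # function descriptor: first word is the entry point
}


def lookup(addr):
    """Resolve a text address to 'symbol+0xoffset', or raw hex if unknown."""
    for base, name in SYMBOLS.items():
        if base <= addr < base + SYMBOL_SIZE:
            return f"{name}+{hex(addr - base)}"
    return hex(addr)


def format_pS(addr):
    """Treat addr as an instruction pointer and resolve it directly."""
    return lookup(addr)


def format_pF(addr):
    """Treat addr as a function descriptor: load its first word, then resolve."""
    return lookup(MEMORY[addr])
```

      In this model format_pS(0x1004) resolves to a symbol plus offset, while
      format_pF(0x1004) loads the instruction bytes stored at that address and
      prints them as a raw number, which mirrors the broken backtrace above.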
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: rostedt@goodmis.org
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100825013238.GE28360@kryten>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 03 Jun, 2010 1 commit
    • tracing: Remove ftrace_preempt_disable/enable · 5168ae50
      Steven Rostedt authored
      The ftrace_preempt_disable/enable functions were to address a
      recursive race caused by the function tracer. The function tracer
      traces all functions which makes it easily susceptible to recursion.
      One area was preempt_enable(). This would call the scheduler and
      the scheduler would call the function tracer and loop.
      (Or so it was thought.)
      
      The ftrace_preempt_disable/enable was made to protect against recursion
      inside the scheduler by storing the NEED_RESCHED flag. If it was
      set before the ftrace_preempt_disable() it would not call schedule
      on ftrace_preempt_enable(), thinking that if it was set before then
      it would have already scheduled unless it was already in the scheduler.
      
      This worked fine except in the case of SMP, where another task would set
      the NEED_RESCHED flag for a task on another CPU, and then kick off an
      IPI to trigger it. This could cause the NEED_RESCHED to be saved at
      ftrace_preempt_disable() but the IPI to arrive in the preempt
      disabled section. The ftrace_preempt_enable() would not call the scheduler
      because the flag was already set before entering the section.
      
      This bug would cause a missed preemption check and cause higher latencies.
      
      Investigating further, I found that the recursion caused by the function
      tracer was not due to schedule(), but due to preempt_schedule(). Now
      that preempt_schedule is completely annotated with notrace, the recursion
      is no longer an issue.
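
      The missed preemption can be reproduced in a small model. This is a
      sketch under assumptions: the Cpu class, resched_ipi() and run() are
      invented for illustration, and the helper bodies follow only the
      description above, not the removed kernel code.

```python
class Cpu:
    """Minimal per-CPU state; every field is a simplification."""

    def __init__(self):
        self.need_resched = False
        self.preempt_count = 0
        self.schedules = 0

    def schedule(self):
        self.need_resched = False
        self.schedules += 1


def preempt_enable(cpu):
    cpu.preempt_count -= 1
    if cpu.preempt_count == 0 and cpu.need_resched:
        cpu.schedule()


def preempt_enable_no_resched(cpu):
    cpu.preempt_count -= 1


def ftrace_preempt_disable(cpu):
    resched = cpu.need_resched  # remember the flag as seen on entry
    cpu.preempt_count += 1
    return resched


def ftrace_preempt_enable(cpu, resched):
    # If the flag was already set on entry, skip the reschedule check.
    if resched:
        preempt_enable_no_resched(cpu)
    else:
        preempt_enable(cpu)


def resched_ipi(cpu):
    # The IPI can only preempt when preemption is currently enabled.
    if cpu.preempt_count == 0 and cpu.need_resched:
        cpu.schedule()


def run(use_ftrace_helpers):
    """Return how many times the CPU scheduled during the scenario."""
    cpu = Cpu()
    cpu.need_resched = True  # set remotely by a task on another CPU
    if use_ftrace_helpers:
        saved = ftrace_preempt_disable(cpu)
    else:
        cpu.preempt_count += 1
    resched_ipi(cpu)  # arrives inside the preempt-disabled section
    if use_ftrace_helpers:
        ftrace_preempt_enable(cpu, saved)
    else:
        preempt_enable(cpu)
    return cpu.schedules
```

      In this model run(True) returns 0 (the reschedule is lost), while
      run(False), which uses the plain preempt helpers, returns 1.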
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  18. 02 Feb, 2010 1 commit
    • tracing: Fix circular dead lock in stack trace · 4f48f8b7
      Lai Jiangshan authored
      When we cat <debugfs>/tracing/stack_trace, we may cause a circular lock:
      sys_read()
        t_start()
           arch_spin_lock(&max_stack_lock);
      
        t_show()
           seq_printf(), vsnprintf() .... /* they are all traceable;
             when they are traced, max_stack_lock may be acquired again. */
      
      The following script can trigger this circular deadlock very easily:
      #!/bin/bash
      
      echo 1 > /proc/sys/kernel/stack_tracer_enabled
      
      mount -t debugfs xxx /mnt > /dev/null 2>&1
      
      (
      # reset stack_max_size so check_stack() keeps acquiring max_stack_lock
      for ((; ;))
      {
      	echo 1 > /mnt/tracing/stack_max_size
      }
      ) &
      
      for ((; ;))
      {
      	cat /mnt/tracing/stack_trace > /dev/null
      }
      
      To fix this bug, we increment the per-CPU trace_active counter before
      acquiring the lock.
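
      The effect of the counter can be shown with a single-CPU model. This is
      a hedged sketch: SpinLock, traced_seq_printf() and read_stack_trace()
      are hypothetical stand-ins, and a real spinlock would spin forever
      where this model raises an error.

```python
class SpinLock:
    """Non-recursive lock; re-acquiring it is reported as a deadlock."""

    def __init__(self):
        self.held = False

    def lock(self):
        if self.held:
            raise RuntimeError("deadlock: max_stack_lock already held")
        self.held = True

    def unlock(self):
        self.held = False


max_stack_lock = SpinLock()
trace_active = 0  # stands in for the per-CPU trace_active counter


def check_stack():
    """Tracer callback: runs whenever a traced function is entered."""
    global trace_active
    if trace_active:
        return  # tracing is suppressed on this CPU, so do not touch the lock
    trace_active += 1
    try:
        max_stack_lock.lock()
        max_stack_lock.unlock()
    finally:
        trace_active -= 1


def traced_seq_printf():
    check_stack()  # the function tracer fires on entry


def read_stack_trace(fixed):
    """Model t_start()/t_show(); `fixed` selects the behaviour of this patch."""
    global trace_active
    if fixed:
        trace_active += 1  # raise the counter before taking the lock
    max_stack_lock.lock()
    try:
        traced_seq_printf()
    finally:
        max_stack_lock.unlock()
        if fixed:
            trace_active -= 1
    return "ok"
```

      Here read_stack_trace(fixed=True) completes, while
      read_stack_trace(fixed=False) raises RuntimeError because the tracer
      callback tries to take a lock that the reader already holds.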
      Reported-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4B67D4F9.9080905@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  19. 14 Dec, 2009 3 commits
  20. 24 Sep, 2009 1 commit
  21. 17 Aug, 2009 1 commit
  22. 23 Jul, 2009 1 commit
  23. 17 Jul, 2009 1 commit
  24. 26 Jun, 2009 1 commit
    • tracing: Fix stack tracer sysctl handling · a32c7765
      Li Zefan authored
      This froze my machine completely:
      
        # echo 1 > /proc/sys/kernel/stack_tracer_enabled
        # echo 2 > /proc/sys/kernel/stack_tracer_enabled
      
      The cause is that register_ftrace_function() was called twice.
      
      Also fix the ftrace_enabled sysctl, though nothing bad seemed to happen
      when I tested it.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A448D17.9010305@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  25. 03 Jun, 2009 1 commit
    • tracing/trace_stack: fix the number of entries in the header · 083a63b4
      walimis authored
      The last entry in stack_dump_trace is ULONG_MAX, which is not
      a valid entry, but max_stack_trace.nr_entries counts it anyway.
      So when printing the header, we should decrease the count by one.
      Before the fix, the output looked like this, for example:
      
      	Depth    Size   Location    (53 entries)	<--- should be 52
      	-----    ----   --------
        0)     3264     108   update_wall_time+0x4d5/0x9a0
        ...
       51)       80      80   syscall_call+0x7/0xb
       ^^^
         it's correct.
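
      The off-by-one can be checked with a few lines. This is an illustrative
      sketch: valid_entries() and header() are hypothetical helpers, and the
      sentinel value assumes a 64-bit unsigned long.

```python
ULONG_MAX = 2**64 - 1  # end-of-trace sentinel; assumes a 64-bit unsigned long


def valid_entries(stack_dump_trace):
    """Count the entries that precede the ULONG_MAX sentinel."""
    count = 0
    for addr in stack_dump_trace:
        if addr == ULONG_MAX:
            break
        count += 1
    return count


def header(nr_entries):
    """Format the header; nr_entries includes the sentinel, so subtract one."""
    return f"        Depth    Size   Location    ({nr_entries - 1} entries)"
```

      For a trace holding two addresses followed by the sentinel, nr_entries
      is 3 and the header reports 2 entries, which matches valid_entries().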
      Signed-off-by: walimis <walimisdev@gmail.com>
      LKML-Reference: <1244016090-7814-1-git-send-email-walimisdev@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  26. 07 Apr, 2009 1 commit
  27. 13 Mar, 2009 3 commits
    • tracing: left align location header in stack_trace · eb1871f3
      Steven Rostedt authored
      Ingo Molnar suggested, instead of:
      
              Depth    Size      Location    (27 entries)
              -----    ----      --------
        0)     2880      48   lock_timer_base+0x2b/0x4f
        1)     2832      80   __mod_timer+0x33/0xe0
        2)     2752      16   __ide_set_handler+0x63/0x65
      
      To have it be:
      
              Depth    Size   Location    (27 entries)
              -----    ----   --------
        0)     2880      48   lock_timer_base+0x2b/0x4f
        1)     2832      80   __mod_timer+0x33/0xe0
        2)     2752      16   __ide_set_handler+0x63/0x65
      Requested-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: explain why stack tracer is empty · e447e1df
      Steven Rostedt authored
      If stack tracing is disabled (the default), the stack_trace file
      will contain only the header:
      
       # cat /debug/tracing/stack_trace
              Depth    Size      Location    (0 entries)
              -----    ----      --------
      
      This can be frustrating to a developer who does not realize that the
      stack tracer is disabled. This patch adds the following text:
      
        # cat /debug/tracing/stack_trace
              Depth    Size      Location    (0 entries)
              -----    ----      --------
       #
       #  Stack tracer disabled
       #
       # To enable the stack tracer, either add 'stacktrace' to the
       # kernel command line
       # or 'echo 1 > /proc/sys/kernel/stack_tracer_enabled'
       #
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: fix stack tracer header · 2da03ece
      Steven Rostedt authored
      The stack tracer used to look like this:
      
       # cat /debug/tracing/stack_trace
               Depth  Size      Location    (57 entries)
               -----  ----      --------
        0)     5088      16   mempool_alloc_slab+0x16/0x18
        1)     5072     144   mempool_alloc+0x4d/0xfe
        2)     4928      16   scsi_sg_alloc+0x48/0x4a [scsi_mod]
      
      Now it looks like this:
      
       # cat /debug/tracing/stack_trace
      
              Depth    Size      Location    (57 entries)
              -----    ----      --------
        0)     5088      16   mempool_alloc_slab+0x16/0x18
        1)     5072     144   mempool_alloc+0x4d/0xfe
        2)     4928      16   scsi_sg_alloc+0x48/0x4a [scsi_mod]
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
  28. 18 Dec, 2008 2 commits
  29. 03 Dec, 2008 2 commits
  30. 21 Nov, 2008 1 commit
    • function tracing: fix wrong position computing of stack_trace · 522a110b
      Liming Wang authored
      Impact: make output of stack_trace complete if buffer overruns
      
      When the read buffer overruns, the output of stack_trace is incomplete.
      
      When t_show prints records with seq_printf and the current record
      overruns the read buffer, that record is not copied to user space
      through the read buffer; it is simply dropped from this read.
      
      On the next read, t_start should return the "*pos"th record, which
      is the one dropped by the previous read, but it instead returns the
      (m->private + *pos)th record.
      
      Here we use a more sane method of implementing seq_operations, one that
      can be found elsewhere in the kernel code. Thus we needn't initialize
      m->private.
      
      As for testing, it's not easy to overrun the read buffer, but we can use
      seq_printf to print extra padding bytes in t_show; then it's easy to
      check whether or not records are lost.
      
      This commit has been tested under both overrun and non-overrun
      conditions.
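
      The position bug can be illustrated with a simplified reader loop. This
      is a model under assumptions, not the kernel code: read_all() is a
      hypothetical function, `private` stands in for m->private, and the
      buffer is modelled as holding a fixed number of records per read.

```python
def read_all(records, per_read, buggy):
    """Read every record through a buffer that holds `per_read` records.

    With buggy=True each read restarts at (private + pos), as in the old
    t_start; with buggy=False it restarts at pos, as seq_file expects.
    """
    if per_read < 1:
        raise ValueError("per_read must be at least 1")
    out = []
    pos = 0      # seq_file position: index of the next record to deliver
    private = 0  # iterator state that the old code kept across reads
    while True:
        idx = private + pos if buggy else pos
        if idx >= len(records):
            return out
        for _ in range(per_read):
            if idx >= len(records):
                break
            out.append(records[idx])
            idx += 1
            pos += 1
        # The record at idx (if any) overran the buffer and was dropped;
        # the old code left its iterator state pointing at it.
        private = idx
```

      With ten records and room for three per read, the fixed variant returns
      all ten, while the buggy variant returns only records 0, 1, 2, 6, 7
      and 8, because the second read starts at index 6 instead of 3.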
      Signed-off-by: Liming Wang <liming.wang@windriver.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>