1. 21 Dec, 2018 1 commit
  2. 20 Dec, 2018 2 commits
  3. 17 Dec, 2018 37 commits
    • Linux 4.14.89 · 3beeb261
      Greg Kroah-Hartman authored
    • tcp: lack of available data can also cause TSO defer · 4465b31b
      Eric Dumazet authored
      commit f9bfe4e6a9d08d405fe7b081ee9a13e649c97ecf upstream.
      
      tcp_tso_should_defer() can return true in three different cases:
      
       1) We are cwnd-limited
       2) We are rwnd-limited
       3) We are application limited.
      
      Neal pointed out that my recent fix went too far, since
      it assumed that if we were not in case 1), we must be rwnd-limited.
      
      Fix this by properly populating the is_cwnd_limited and
      is_rwnd_limited booleans.
      
      After this change, we can finally move the silly check for FIN
      flag only for the application-limited case.
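
      The three cases can be sketched in user space as follows. This is a
      simplified model under stated assumptions, not the kernel function:
      struct defer_input and should_defer() are hypothetical names, and the
      real code works on struct sock and struct sk_buff with more conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified inputs; not the kernel's struct sock / sk_buff. */
struct defer_input {
	uint32_t cong_win;  /* bytes the congestion window still allows */
	uint32_t send_win;  /* bytes the receive window still allows */
	uint32_t skb_len;   /* bytes queued in the candidate skb */
	bool     has_fin;   /* skb carries FIN, so no more data will follow */
};

/*
 * Returns true when sending should be deferred. At most one of the two
 * output flags is set; both stay false when the sender is application
 * limited.
 */
static bool should_defer(const struct defer_input *in,
			 bool *is_cwnd_limited, bool *is_rwnd_limited)
{
	assert(in && is_cwnd_limited && is_rwnd_limited);

	*is_cwnd_limited = false;
	*is_rwnd_limited = false;

	if (in->cong_win < in->send_win) {
		if (in->cong_win <= in->skb_len) {
			*is_cwnd_limited = true;  /* case 1 */
			return true;
		}
	} else {
		if (in->send_win <= in->skb_len) {
			*is_rwnd_limited = true;  /* case 2 */
			return true;
		}
	}

	/* Case 3: application limited. Waiting only helps if more data may come. */
	if (in->has_fin)
		return false;

	return true;
}
```

      In this model the FIN check is reached only when neither window is
      the limit, which mirrors the move described above.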
      
      The same move for EOR bit will be handled in net-next,
      since commit 1c09f7d073b1 ("tcp: do not try to defer skbs
      with eor mark (MSG_EOR)") is scheduled for linux-4.21
      
      Tested by running 200 concurrent netperf -t TCP_RR -- -r 60000,100
      and checking none of them was rwnd_limited in the chrono_stat
      output from "ss -ti" command.
      
      Fixes: 41727549de3e ("tcp: Do not underestimate rwnd_limited")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Suggested-by: Neal Cardwell <ncardwell@google.com>
      Reviewed-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Reviewed-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/hfi1: Fix an out-of-bounds access in get_hw_stats · 01a16601
      Piotr Stankiewicz authored
      commit 36d842194a57f1b21fbc6a6875f2fa2f9a7f8679 upstream.
      
      When running with KASAN, the following trace is produced:
      
      [   62.535888]
      
      ==================================================================
      [   62.544930] BUG: KASAN: slab-out-of-bounds in
      get_hw_stats+0x122/0x230 [hfi1]
      [   62.553856] Write of size 8 at addr ffff88080e8d6330 by task
      kworker/0:1/14
      
      [   62.565333] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted
      4.19.0-test-build-kasan+ #8
      [   62.575087] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
      SE5C610.86B.01.01.0019.101220160604 10/12/2016
      [   62.587951] Workqueue: events work_for_cpu_fn
      [   62.594050] Call Trace:
      [   62.598023]  dump_stack+0xc6/0x14c
      [   62.603089]  ? dump_stack_print_info.cold.1+0x2f/0x2f
      [   62.610041]  ? kmsg_dump_rewind_nolock+0x59/0x59
      [   62.616615]  ? get_hw_stats+0x122/0x230 [hfi1]
      [   62.622985]  print_address_description+0x6c/0x23c
      [   62.629744]  ? get_hw_stats+0x122/0x230 [hfi1]
      [   62.636108]  kasan_report.cold.6+0x241/0x308
      [   62.642365]  get_hw_stats+0x122/0x230 [hfi1]
      [   62.648703]  ? hfi1_alloc_rn+0x40/0x40 [hfi1]
      [   62.655088]  ? __kmalloc+0x110/0x240
      [   62.660695]  ? hfi1_alloc_rn+0x40/0x40 [hfi1]
      [   62.667142]  setup_hw_stats+0xd8/0x430 [ib_core]
      [   62.673972]  ? show_hfi+0x50/0x50 [hfi1]
      [   62.680026]  ib_device_register_sysfs+0x165/0x180 [ib_core]
      [   62.687995]  ib_register_device+0x5a2/0xa10 [ib_core]
      [   62.695340]  ? show_hfi+0x50/0x50 [hfi1]
      [   62.701421]  ? ib_unregister_device+0x2e0/0x2e0 [ib_core]
      [   62.709222]  ? __vmalloc_node_range+0x2d0/0x380
      [   62.716131]  ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
      [   62.723735]  ? vmalloc_node+0x5c/0x70
      [   62.729697]  ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt]
      [   62.737347]  ? rvt_driver_mr_init+0x1f5/0x2d0 [rdmavt]
      [   62.744998]  ? __rvt_alloc_mr+0x110/0x110 [rdmavt]
      [   62.752315]  ? rvt_rc_error+0x140/0x140 [rdmavt]
      [   62.759434]  ? rvt_vma_open+0x30/0x30 [rdmavt]
      [   62.766364]  ? mutex_unlock+0x1d/0x40
      [   62.772445]  ? kmem_cache_create_usercopy+0x15d/0x230
      [   62.780115]  rvt_register_device+0x1f6/0x360 [rdmavt]
      [   62.787823]  ? rvt_get_port_immutable+0x180/0x180 [rdmavt]
      [   62.796058]  ? __get_txreq+0x400/0x400 [hfi1]
      [   62.802969]  ? memcpy+0x34/0x50
      [   62.808611]  hfi1_register_ib_device+0xde6/0xeb0 [hfi1]
      [   62.816601]  ? hfi1_get_npkeys+0x10/0x10 [hfi1]
      [   62.823760]  ? hfi1_init+0x89f/0x9a0 [hfi1]
      [   62.830469]  ? hfi1_setup_eagerbufs+0xad0/0xad0 [hfi1]
      [   62.838204]  ? pcie_capability_clear_and_set_word+0xcd/0xe0
      [   62.846429]  ? pcie_capability_read_word+0xd0/0xd0
      [   62.853791]  ? hfi1_pcie_init+0x187/0x4b0 [hfi1]
      [   62.860958]  init_one+0x67f/0xae0 [hfi1]
      [   62.867301]  ? hfi1_init+0x9a0/0x9a0 [hfi1]
      [   62.873876]  ? wait_woken+0x130/0x130
      [   62.879860]  ? read_word_at_a_time+0xe/0x20
      [   62.886329]  ? strscpy+0x14b/0x280
      [   62.891998]  ? hfi1_init+0x9a0/0x9a0 [hfi1]
      [   62.898405]  local_pci_probe+0x70/0xd0
      [   62.904295]  ? pci_device_shutdown+0x90/0x90
      [   62.910833]  work_for_cpu_fn+0x29/0x40
      [   62.916750]  process_one_work+0x584/0x960
      [   62.922974]  ? rcu_work_rcufn+0x40/0x40
      [   62.928991]  ? __schedule+0x396/0xdc0
      [   62.934806]  ? __sched_text_start+0x8/0x8
      [   62.941020]  ? pick_next_task_fair+0x68b/0xc60
      [   62.947674]  ? run_rebalance_domains+0x260/0x260
      [   62.954471]  ? __list_add_valid+0x29/0xa0
      [   62.960607]  ? move_linked_works+0x1c7/0x230
      [   62.967077]  ?
      trace_event_raw_event_workqueue_execute_start+0x140/0x140
      [   62.976248]  ? mutex_lock+0xa6/0x100
      [   62.982029]  ? __mutex_lock_slowpath+0x10/0x10
      [   62.988795]  ? __switch_to+0x37a/0x710
      [   62.994731]  worker_thread+0x62e/0x9d0
      [   63.000602]  ? max_active_store+0xf0/0xf0
      [   63.006828]  ? __switch_to_asm+0x40/0x70
      [   63.012932]  ? __switch_to_asm+0x34/0x70
      [   63.019013]  ? __switch_to_asm+0x40/0x70
      [   63.025042]  ? __switch_to_asm+0x34/0x70
      [   63.031030]  ? __switch_to_asm+0x40/0x70
      [   63.037006]  ? __schedule+0x396/0xdc0
      [   63.042660]  ? kmem_cache_alloc_trace+0xf3/0x1f0
      [   63.049323]  ? kthread+0x59/0x1d0
      [   63.054594]  ? ret_from_fork+0x35/0x40
      [   63.060257]  ? __sched_text_start+0x8/0x8
      [   63.066212]  ? schedule+0xcf/0x250
      [   63.071529]  ? __wake_up_common+0x110/0x350
      [   63.077794]  ? __schedule+0xdc0/0xdc0
      [   63.083348]  ? wait_woken+0x130/0x130
      [   63.088963]  ? finish_task_switch+0x1f1/0x520
      [   63.095258]  ? kasan_unpoison_shadow+0x30/0x40
      [   63.101792]  ? __init_waitqueue_head+0xa0/0xd0
      [   63.108183]  ? replenish_dl_entity.cold.60+0x18/0x18
      [   63.115151]  ? _raw_spin_lock_irqsave+0x25/0x50
      [   63.121754]  ? max_active_store+0xf0/0xf0
      [   63.127753]  kthread+0x1ae/0x1d0
      [   63.132894]  ? kthread_bind+0x30/0x30
      [   63.138422]  ret_from_fork+0x35/0x40
      
      [   63.146973] Allocated by task 14:
      [   63.152077]  kasan_kmalloc+0xbf/0xe0
      [   63.157471]  __kmalloc+0x110/0x240
      [   63.162804]  init_cntrs+0x34d/0xdf0 [hfi1]
      [   63.168883]  hfi1_init_dd+0x29a3/0x2f90 [hfi1]
      [   63.175244]  init_one+0x551/0xae0 [hfi1]
      [   63.181065]  local_pci_probe+0x70/0xd0
      [   63.186759]  work_for_cpu_fn+0x29/0x40
      [   63.192310]  process_one_work+0x584/0x960
      [   63.198163]  worker_thread+0x62e/0x9d0
      [   63.203843]  kthread+0x1ae/0x1d0
      [   63.208874]  ret_from_fork+0x35/0x40
      
      [   63.217203] Freed by task 1:
      [   63.221844]  __kasan_slab_free+0x12e/0x180
      [   63.227844]  kfree+0x92/0x1a0
      [   63.232570]  single_release+0x3a/0x60
      [   63.238024]  __fput+0x1d9/0x480
      [   63.242911]  task_work_run+0x139/0x190
      [   63.248440]  exit_to_usermode_loop+0x191/0x1a0
      [   63.254814]  do_syscall_64+0x301/0x330
      [   63.260283]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      [   63.270199] The buggy address belongs to the object at
      ffff88080e8d5500
       which belongs to the cache kmalloc-4096 of size 4096
      [   63.287247] The buggy address is located 3632 bytes inside of
       4096-byte region [ffff88080e8d5500, ffff88080e8d6500)
      [   63.303564] The buggy address belongs to the page:
      [   63.310447] page:ffffea00203a3400 count:1 mapcount:0
      mapping:ffff88081380e840 index:0x0 compound_mapcount: 0
      [   63.323102] flags: 0x2fffff80008100(slab|head)
      [   63.329775] raw: 002fffff80008100 0000000000000000 0000000100000001
      ffff88081380e840
      [   63.340175] raw: 0000000000000000 0000000000070007 00000001ffffffff
      0000000000000000
      [   63.350564] page dumped because: kasan: bad access detected
      
      [   63.361974] Memory state around the buggy address:
      [   63.369137]  ffff88080e8d6200: 00 00 00 00 00 00 00 00 00 00 00 00 00
      00 00 00
      [   63.379082]  ffff88080e8d6280: 00 00 00 00 00 00 00 00 00 00 00 00 00
      00 00 00
      [   63.389032] >ffff88080e8d6300: 00 00 00 00 00 00 fc fc fc fc fc fc fc
      fc fc fc
      [   63.398944]                                      ^
      [   63.406141]  ffff88080e8d6380: fc fc fc fc fc fc fc fc fc fc fc fc fc
      fc fc fc
      [   63.416109]  ffff88080e8d6400: fc fc fc fc fc fc fc fc fc fc fc fc fc
      fc fc fc
      [   63.426099]
      ==================================================================
      
      The trace happens because get_hw_stats() assumes there is room in the
      memory allocated in init_cntrs() to accommodate the driver counters.
      Unfortunately, that routine only allocated space for the device
      counters.
      
      Fix by ensuring the allocation has room for the additional driver
      counters.
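
      The sizing rule can be illustrated with a small user-space sketch.
      It is not the hfi1 code: struct cntr_buf and the cntr_buf_*() helpers
      are hypothetical, and the only point is that the buffer must be sized
      for both counter groups the reader later writes into it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the driver's counter bookkeeping. */
struct cntr_buf {
	uint64_t *vals;
	size_t    ndev;  /* device counters */
	size_t    ndrv;  /* driver counters stored after the device ones */
};

/* Allocate room for both groups, not just the device counters. */
static int cntr_buf_init(struct cntr_buf *b, size_t ndev, size_t ndrv)
{
	assert(b);
	b->vals = calloc(ndev + ndrv, sizeof(*b->vals));
	if (!b->vals)
		return -1;
	b->ndev = ndev;
	b->ndrv = ndrv;
	return 0;
}

/* Store device values first, then driver values; returns counters written. */
static size_t cntr_buf_fill(struct cntr_buf *b,
			    const uint64_t *dev, const uint64_t *drv)
{
	memcpy(b->vals, dev, b->ndev * sizeof(*dev));
	memcpy(b->vals + b->ndev, drv, b->ndrv * sizeof(*drv));
	return b->ndev + b->ndrv;
}

static void cntr_buf_free(struct cntr_buf *b)
{
	free(b->vals);
	b->vals = NULL;
}
```

      Had cntr_buf_init() allocated only ndev entries, the second memcpy()
      would write past the end of the allocation, which is the kind of
      access KASAN reports above.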
      
      Cc: <Stable@vger.kernel.org> # v4.14+
      Fixes: b7481944 ("IB/hfi1: Show statistics counters under IB stats interface")
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: Mike Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: hda/realtek - Fixed headphone issue for ALC700 · d655a1a6
      Kailang Yang authored
      commit bde1a7459623a66c2abec4d0a841e4b06cc88d9a upstream.
      
      If a headphone or headset is plugged into the jack and the system is
      then rebooted, the headphone may produce no sound.
      Running the headphone mode procedure after boot fixes the issue.
      The fix also applies to ALC234, ALC274 and ALC294.

      Signed-off-by: Kailang Yang <kailang@realtek.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: fireface: fix reference to wrong register for clock configuration · 62711dc6
      Takashi Sakamoto authored
      commit fa9c98e4b975bb3192ed6af09d9fa282ed3cd8a0 upstream.
      
      The initial commit reads the 'SYNC_STATUS' register to get the
      clock configuration. This is wrong according to my
      reverse-engineering notes on packet dumps; it should read the
      'CLOCK_CONFIG' register. ff400_dump_clock_config() already
      uses the correct register.
      
      This commit fixes the bug.
      
      Cc: <stable@vger.kernel.org> # v4.12+
      Fixes: 76fdb3a9 ('ALSA: fireface: add support for Fireface 400')
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • staging: speakup: Replace strncpy with memcpy · 16906e5a
      Guenter Roeck authored
      commit fd29edc7 upstream.
      
      gcc 8.1.0 generates the following warnings.
      
      drivers/staging/speakup/kobjects.c: In function 'punc_store':
      drivers/staging/speakup/kobjects.c:522:2: warning:
      	'strncpy' output truncated before terminating nul
      	copying as many bytes from a string as its length
      drivers/staging/speakup/kobjects.c:504:6: note: length computed here
      
      drivers/staging/speakup/kobjects.c: In function 'synth_store':
      drivers/staging/speakup/kobjects.c:391:2: warning:
      	'strncpy' output truncated before terminating nul
      	copying as many bytes from a string as its length
      drivers/staging/speakup/kobjects.c:388:8: note: length computed here
      
      Using strncpy() is indeed less than perfect since the length of data to
      be copied has already been determined with strlen(). Replace strncpy()
      with memcpy() to address the warning and optimize the code a little.
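
      The pattern can be sketched as below. copy_known_len() is a
      hypothetical helper written for illustration, not a function from
      the speakup driver; it assumes the caller has already computed the
      length and wants an explicitly terminated copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes of src into dst (capacity dst_size) and terminate the
 * result explicitly. Returns 0 on success, -1 if the copy plus the
 * terminating NUL would not fit.
 */
static int copy_known_len(char *dst, size_t dst_size,
			  const char *src, size_t len)
{
	assert(dst && src);

	if (len >= dst_size)
		return -1;

	memcpy(dst, src, len);  /* length is already known, no scan needed */
	dst[len] = '\0';        /* termination is explicit, not implied */
	return 0;
}
```

      Because the terminator is written by hand, nothing relies on
      strncpy() finding a NUL within the given length, which is what the
      gcc warning is about.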
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • flexfiles: enforce per-mirror stateid only for v4 DSes · 5d2cc520
      Tigran Mkrtchyan authored
      commit 320f35b7bf8cccf1997ca3126843535e1b95e9c4 upstream.
      
      Since commit bb21ce0ad227 we always enforce per-mirror stateid.
      However, this makes sense only for v4+ servers.
      Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • lib/rbtree-test: lower default params · 891e5a89
      Davidlohr Bueso authored
      commit 0b548e33 upstream.
      
      Fengguang reported soft lockups while running the rbtree and interval
      tree test modules.  The logic for these tests all runs in the init
      phase, and we currently pound the system with the default values for
      the number of nodes and the number of iterations of each test.  Reduce
      the latter by two orders of magnitude.  This does not reduce the value
      of the tests, since one thousand iterations by default is enough to
      get the picture.
      
      Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • printk: Wake klogd when passing console_lock owner · 16c9a316
      Petr Mladek authored
      [ Upstream commit c14376de ]
      
      wake_klogd is a local variable in console_unlock(). The information
      is lost when the console_lock owner passes the lock using the busy
      wait added by commit dbdda842 ("printk: Add console owner and waiter
      logic to load balance console writes"). The following race is
      possible:
      
      CPU0				CPU1
      console_unlock()
      
        for (;;)
           /* calling console for last message */
      
      				printk()
      				  log_store()
      				    log_next_seq++;
      
           /* see new message */
           if (seen_seq != log_next_seq) {
      	wake_klogd = true;
      	seen_seq = log_next_seq;
           }
      
           console_lock_spinning_enable();
      
      				  if (console_trylock_spinning())
      				     /* spinning */
      
           if (console_lock_spinning_disable_and_check()) {
      	printk_safe_exit_irqrestore(flags);
      	return;
      
      				  console_unlock()
      				    if (seen_seq != log_next_seq) {
      				    /* already seen */
      				    /* nothing to do */
      
      Result: Nobody would wake up klogd.
      
      One solution would be to make a global variable from wake_klogd.
      But then we would need to manipulate it under a lock or so.
      
      This patch wakes klogd also when console_lock is passed to the
      spinning waiter. It looks like the right way to go. Also userspace
      should have a chance to see and store any "flood" of messages.
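
      A deterministic, single-threaded model of the idea is sketched
      below. It is an illustration under stated assumptions: struct
      log_model and console_unlock_model() are hypothetical, there is no
      locking, and waking klogd is reduced to bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the sequence counters; not the kernel's types. */
struct log_model {
	unsigned long log_next_seq;   /* sequence number of the next message */
	unsigned long seen_seq;       /* last value the unlock path noticed */
	unsigned int  klogd_wakeups;  /* how often the model woke klogd */
};

/*
 * One pass through the unlock path. 'handoff' says whether a spinning
 * waiter takes over the console lock instead of it being released.
 */
static void console_unlock_model(struct log_model *lm, bool handoff)
{
	bool wake_klogd = false;

	assert(lm);

	if (lm->seen_seq != lm->log_next_seq) {
		wake_klogd = true;
		lm->seen_seq = lm->log_next_seq;
	}

	if (handoff)
		goto out;  /* a plain return here would drop wake_klogd */

	/* ... the real code would release the console lock here ... */

out:
	if (wake_klogd)
		lm->klogd_wakeups++;
}
```

      In the model, the pass that hands the lock over still wakes klogd,
      and the new owner's later pass finds nothing new, so exactly one
      wakeup is recorded for the new message.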
      
      Note that the very late klogd wake up was a historic solution.
      It made sense on single CPU systems or when sys_syslog() operations
      were synchronized using the big kernel lock like in v2.1.113.
      But it is questionable these days.
      
      Fixes: dbdda842 ("printk: Add console owner and waiter logic to load balance console writes")
      Link: http://lkml.kernel.org/r/20180226155734.dzwg3aovqnwtvkoy@pathway.suse.cz
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: Tejun Heo <tj@kernel.org>
      Suggested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • printk: Never set console_may_schedule in console_trylock() · 08b7a8f8
      Sergey Senozhatsky authored
      [ Upstream commit fd5f7cde ]
      
      This patch, basically, reverts commit 6b97a20d ("printk:
      set may_schedule for some of console_trylock() callers").
      That commit was a mistake, it introduced a big dependency
      on the scheduler, by enabling preemption under console_sem
      in printk()->console_unlock() path, which is rather too
      critical. The patch did not significantly reduce the
      possibilities of printk() lockups, but made it possible to
      stall printk(), as has been reported by Tetsuo Handa [1].
      
      Another issue is that preemption under console_sem also
      interferes with Steven Rostedt's hand off scheme, by making
      it possible to sleep with console_sem both in console_unlock()
      and in vprintk_emit(), after acquiring the console_sem
      ownership (anywhere between printk_safe_exit_irqrestore() in
      console_trylock_spinning() and printk_safe_enter_irqsave()
      in console_unlock()). This makes hand off less likely and,
      at the same time, may result in a significant amount of
      pending logbuf messages. Preempted console_sem owner makes
      it impossible for other CPUs to emit logbuf messages, but
      does not make it impossible for other CPUs to append new
      messages to the logbuf.
      
      Reinstate the old behavior and make printk() non-preemptible.
      Should any printk() lockup reports arrive they must be handled
      in a different way.
      
      [1] http://lkml.kernel.org/r/201603022101.CAH73907.OVOOMFHFFtQJSL%20()%20I-love%20!%20SAKURA%20!%20ne%20!%20jp
      Fixes: 6b97a20d ("printk: set may_schedule for some of console_trylock() callers")
      Link: http://lkml.kernel.org/r/20180116044716.GE6607@jagdpanzerIV
      To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: akpm@linux-foundation.org
      Cc: linux-mm@kvack.org
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • printk: Hide console waiter logic into helpers · ef433725
      Petr Mladek authored
      [ Upstream commit c162d5b4 ]
      
      The commit ("printk: Add console owner and waiter logic to load balance
      console writes") made vprintk_emit() and console_unlock() even more
      complicated.
      
      This patch extracts the new code into 3 helper functions. They should
      help to keep it rather self-contained. It will be easier to use and
      maintain.
      
      This patch just shuffles the existing code. It does not change
      the functionality.
      
      Link: http://lkml.kernel.org/r/20180112160837.GD24497@linux.suse
      Cc: akpm@linux-foundation.org
      Cc: linux-mm@kvack.org
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: rostedt@home.goodmis.org
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • printk: Add console owner and waiter logic to load balance console writes · 59423114
      Steven Rostedt (VMware) authored
      [ Upstream commit dbdda842 ]
      
      This patch implements what I discussed in Kernel Summit. I added
      lockdep annotation (hopefully correctly), and it hasn't had any splats
      (since I fixed some bugs in the first iterations). It did catch
      problems when I had the owner covering too much. But now that the owner
      is only set when actively calling the consoles, lockdep has stayed
      quiet.
      
      Here's the design again:
      
      I added a "console_owner" which is set to a task that is actively
      writing to the consoles. It is *not* the same as the owner of the
      console_lock. It is only set when doing the calls to the console
      functions. It is protected by a console_owner_lock which is a raw spin
      lock.
      
      There is a console_waiter. This is set when there is an active console
      owner that is not current, and waiter is not set. This too is protected
      by console_owner_lock.
      
      In printk() when it tries to write to the consoles, we have:
      
      	if (console_trylock())
      		console_unlock();
      
      Now I added an else, which will check if there is an active owner, and
      no current waiter. If that is the case, then console_waiter is set, and
      the task goes into a spin until it is no longer set.
      
      When the active console owner finishes writing the current message to
      the consoles, it grabs the console_owner_lock and sees if there is a
      waiter, and clears console_owner.
      
      If there is a waiter, then it breaks out of the loop, clears the waiter
      flag (because that will release the waiter from its spin), and exits.
      Note, it does *not* release the console semaphore. Because it is a
      semaphore, there is no owner. Another task may release it. This means
      that the waiter is guaranteed to be the new console owner! Which it
      becomes.
      
      Then the waiter calls console_unlock() and continues to write to the
      consoles.
      
      If another task comes along and does a printk() it too can become the
      new waiter, and we wash rinse and repeat!
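
      The hand-off rules described above can be modelled as a small state
      machine. This is a single-threaded sketch, not the printk code:
      struct console_state, try_become_waiter() and owner_finish_message()
      are hypothetical names, and the raw spin lock that protects these
      fields in the design is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task { int id; };

/* Hypothetical model of the shared state. */
struct console_state {
	const struct task *owner;  /* task actively calling the consoles */
	bool waiter;               /* a task is spinning for the hand-off */
	bool sem_held;             /* the console semaphore is taken */
};

/*
 * printk() path after the trylock failed. Returns true if the caller
 * became the waiter and should spin until 'waiter' is cleared.
 */
static bool try_become_waiter(struct console_state *cs, const struct task *me)
{
	assert(cs && me);

	if (cs->owner && cs->owner != me && !cs->waiter) {
		cs->waiter = true;
		return true;
	}
	return false;
}

/*
 * The owner finished writing one message. Returns true if it handed the
 * console over; in that case it exits without releasing the semaphore,
 * so sem_held is deliberately left untouched.
 */
static bool owner_finish_message(struct console_state *cs)
{
	bool handoff;

	assert(cs);

	handoff = cs->waiter;
	cs->owner = NULL;
	if (handoff)
		cs->waiter = false;  /* releases the waiter from its spin */
	return handoff;
}
```

      Only one task can be the waiter at a time in this model, and the
      semaphore stays held across the hand-off, which is why the released
      waiter is guaranteed to be the next console owner.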
      
      By Petr Mladek about possible new deadlocks:
      
      The thing is that we move console_sem only to printk() call
      that normally calls console_unlock() as well. It means that
      the transferred owner should not bring new type of dependencies.
      As Steven said somewhere: "If there is a deadlock, it was
      there even before."
      
      We could look at it from this side. The possible deadlock would
      look like:
      
      CPU0                            CPU1
      
      console_unlock()
      
        console_owner = current;
      
      				spin_lockA()
      				  printk()
      				    spin = true;
      				    while (...)
      
          call_console_drivers()
            spin_lockA()
      
      This would be a deadlock. CPU0 would wait for the lock A.
      While CPU1 would own the lockA and would wait for CPU0
      to finish calling the console drivers and pass the console_sem
      owner.
      
      But if the above is true then the following scenario was
      already possible before:
      
      CPU0
      
      spin_lockA()
        printk()
          console_unlock()
            call_console_drivers()
      	spin_lockA()
      
      In other words, this deadlock was there even before. Such
      deadlocks are prevented by using printk_deferred() in
      the sections guarded by the lock A.
      
      By Steven Rostedt:
      
      To demonstrate the issue, this module has been shown to lock up a
      system with 4 CPUs and a slow console (like a serial console). It is
      also able to lock up a 8 CPU system with only a fast (VGA) console, by
      passing in "loops=100". The changes in this commit prevent this module
      from locking up the system.
      
       #include <linux/module.h>
       #include <linux/delay.h>
       #include <linux/sched.h>
       #include <linux/mutex.h>
       #include <linux/workqueue.h>
       #include <linux/hrtimer.h>
      
       static bool stop_testing;
       static unsigned int loops = 1;
      
       static void preempt_printk_workfn(struct work_struct *work)
       {
       	int i;
      
       	while (!READ_ONCE(stop_testing)) {
       		for (i = 0; i < loops && !READ_ONCE(stop_testing); i++) {
       			preempt_disable();
       			pr_emerg("%5d%-75s\n", smp_processor_id(),
       				 " XXX NOPREEMPT");
       			preempt_enable();
       		}
       		msleep(1);
       	}
       }
      
       static struct work_struct __percpu *works;
      
       static void finish(void)
       {
       	int cpu;
      
       	WRITE_ONCE(stop_testing, true);
       	for_each_online_cpu(cpu)
       		flush_work(per_cpu_ptr(works, cpu));
       	free_percpu(works);
       }
      
       static int __init test_init(void)
       {
       	int cpu;
      
       	works = alloc_percpu(struct work_struct);
       	if (!works)
       		return -ENOMEM;
      
       	/*
       	 * This is just a test module. This will break if you
       	 * do any CPU hot plugging between loading and
       	 * unloading the module.
       	 */
      
       	for_each_online_cpu(cpu) {
       		struct work_struct *work = per_cpu_ptr(works, cpu);
      
       		INIT_WORK(work, &preempt_printk_workfn);
       		schedule_work_on(cpu, work);
       	}
      
       	return 0;
       }
      
       static void __exit test_exit(void)
       {
       	finish();
       }
      
       module_param(loops, uint, 0);
       module_init(test_init);
       module_exit(test_exit);
       MODULE_LICENSE("GPL");
      
      Link: http://lkml.kernel.org/r/20180110132418.7080-2-pmladek@suse.com
      Cc: akpm@linux-foundation.org
      Cc: linux-mm@kvack.org
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      [pmladek@suse.com: Commit message about possible deadlocks]
      Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Revert "printk: Never set console_may_schedule in console_trylock()" · 62582f67
      Sasha Levin authored
      This reverts commit c9b8d580.
      
      This is just a technical revert to make the printk fix apply cleanly;
      this patch will be re-picked in about 3 commits.
    • ocfs2: fix potential use after free · 56926f91
      Pan Bian authored
      [ Upstream commit 164f7e586739d07eb56af6f6d66acebb11f315c8 ]
      
      ocfs2_get_dentry() calls iput(inode) to drop the reference count of
      inode, and if the reference count hits 0, inode is freed.  However, in
      this function, it then reads inode->i_generation, which may result in a
      use after free bug.  Move the put operation later.
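
      The ordering rule is sketched below with a hypothetical refcounted
      object standing in for the inode. struct obj, obj_new(), obj_put()
      and generation_matches_and_put() are illustration-only names and do
      not come from ocfs2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for an inode. */
struct obj {
	int      refcount;
	uint32_t generation;
};

static struct obj *obj_new(uint32_t generation)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	o->refcount = 1;
	o->generation = generation;
	return o;
}

/* Drop one reference; the object is freed when the count hits 0. */
static void obj_put(struct obj *o)
{
	assert(o && o->refcount > 0);
	if (--o->refcount == 0)
		free(o);
}

/* Read the field first, then drop the reference. */
static bool generation_matches_and_put(struct obj *o, uint32_t wanted)
{
	bool match = (o->generation == wanted);  /* reference still held */

	obj_put(o);  /* o may be freed here; it is not touched again */
	return match;
}
```

      Swapping the two statements in generation_matches_and_put() would
      read o->generation after a possible free, which is the bug pattern
      described above.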
      
      Link: http://lkml.kernel.org/r/1543109237-110227-1-git-send-email-bianpan2016@163.com
      Fixes: 781f200c ("ocfs2: Remove masklog ML_EXPORT.")
      Signed-off-by: Pan Bian <bianpan2016@163.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • debugobjects: avoid recursive calls with kmemleak · 225b1137
      Qian Cai authored
      [ Upstream commit 8de456cf87ba863e028c4dd01bae44255ce3d835 ]
      
      CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to
      recursive calls.
      
      fill_pool
        kmemleak_ignore
          make_black_object
            put_object
              __call_rcu (kernel/rcu/tree.c)
                debug_rcu_head_queue
                  debug_object_activate
                    debug_object_init
                      fill_pool
                        kmemleak_ignore
                          make_black_object
                            ...
      
      So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly
      allocated debug objects at all.
      
      Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.us
      Signed-off-by: Qian Cai <cai@gmx.us>
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Waiman Long <longman@redhat.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      225b1137
    • Pan Bian's avatar
      hfsplus: do not free node before using · 95c8714e
      Pan Bian authored
      [ Upstream commit c7d7d620dcbd2a1c595092280ca943f2fced7bbd ]
      
      hfs_bmap_free() frees the node via hfs_bnode_put(node).  However, it then
      reads node->this when dumping an error message on an error path, which may
      result in a use-after-free bug.  This patch frees the node only once it is
      no longer used.
      
      Link: http://lkml.kernel.org/r/1543053441-66942-1-git-send-email-bianpan2016@163.com
      Signed-off-by: Pan Bian <bianpan2016@163.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ernesto A. Fernandez <ernesto.mnd.fernandez@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Viacheslav Dubeyko <slava@dubeyko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      95c8714e
    • Pan Bian's avatar
      hfs: do not free node before using · 5ee5fa61
      Pan Bian authored
      [ Upstream commit ce96a407adef126870b3f4a1b73529dd8aa80f49 ]
      
      hfs_bmap_free() frees the node via hfs_bnode_put(node).  However, it
      then reads node->this when dumping an error message on an error path, which
      may result in a use-after-free bug.  This patch frees the node only once
      it is no longer used.
      
      Link: http://lkml.kernel.org/r/1542963889-128825-1-git-send-email-bianpan2016@163.com
      Fixes: a1185ffa2fc ("HFS rewrite")
      Signed-off-by: Pan Bian <bianpan2016@163.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: Ernesto A. Fernandez <ernesto.mnd.fernandez@gmail.com>
      Cc: Viacheslav Dubeyko <slava@dubeyko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5ee5fa61
    • Wei Yang's avatar
      mm/page_alloc.c: fix calculation of pgdat->nr_zones · c7aafad0
      Wei Yang authored
      [ Upstream commit 8f416836c0d50b198cad1225132e5abebf8980dc ]
      
      init_currently_empty_zone() will adjust pgdat->nr_zones and set it to
      'zone_idx(zone) + 1' unconditionally.  This is correct in the normal
      case, but not in the hot-plug case.
      
      This function is used in two places:
      
        * free_area_init_core()
        * move_pfn_range_to_zone()
      
      In the first case, we are sure the zone index increases monotonically.  In
      the second one, the order is under the user's control.
      
      One way to reproduce this is:
      ----------------------------
      
      1. create a virtual machine with empty node1
      
         -m 4G,slots=32,maxmem=32G \
         -smp 4,maxcpus=8          \
         -numa node,nodeid=0,mem=4G,cpus=0-3 \
         -numa node,nodeid=1,mem=0G,cpus=4-7
      
      2. hot-add cpu 3-7
      
         cpu-add [3-7]
      
      3. hot-add memory to node1
      
         object_add memory-backend-ram,id=ram0,size=1G
         device_add pc-dimm,id=dimm0,memdev=ram0,node=1
      
      4. online memory in the following order
      
         echo online_movable > memory47/state
         echo online > memory40/state
      
      After this, node1 will have its nr_zones equal to (ZONE_NORMAL + 1)
      instead of (ZONE_MOVABLE + 1).
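      The effect of the online order can be sketched with a toy model. The
      types and function names below are hypothetical, and treating the fix as
      "nr_zones only ever grows" is an assumption made for illustration; the
      message above does not spell out the exact change.

```c
/* Zone indices in ascending order; names mirror the kernel's, values are illustrative. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_MOVABLE, MAX_NR_ZONES };

/* Hypothetical, stripped-down stand-in for pg_data_t. */
struct node_data {
    int nr_zones;
};

/* Unconditional variant: every call overwrites nr_zones. */
static void init_zone_unconditional(struct node_data *pgdat, enum zone_type idx)
{
    pgdat->nr_zones = (int)idx + 1;
}

/* Monotonic variant: nr_zones is kept as a running maximum. */
static void init_zone_monotonic(struct node_data *pgdat, enum zone_type idx)
{
    int end = (int)idx + 1;

    if (end > pgdat->nr_zones)
        pgdat->nr_zones = end;
}
```

      Onlining the movable zone first and the normal zone second leaves the
      unconditional variant at ZONE_NORMAL + 1, while the monotonic variant stays
      at ZONE_MOVABLE + 1.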
      
      Michal said:
       "Having an incorrect nr_zones might result in all sorts of problems
        which would be quite hard to debug (e.g. reclaim not considering the
        movable zone). I do not expect many users would suffer from this it
        but still this is trivial and obviously right thing to do so
        backporting to the stable tree shouldn't be harmful (last famous
        words)"
      
      Link: http://lkml.kernel.org/r/20181117022022.9956-1-richard.weiyang@gmail.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")
      Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c7aafad0
    • Larry Chen's avatar
      ocfs2: fix deadlock caused by ocfs2_defrag_extent() · df2055c0
      Larry Chen authored
      [ Upstream commit e21e57445a64598b29a6f629688f9b9a39e7242a ]
      
      ocfs2_defrag_extent may fall into deadlock.
      
      ocfs2_ioctl_move_extents
          ocfs2_ioctl_move_extents
            ocfs2_move_extents
              ocfs2_defrag_extent
                ocfs2_lock_allocators_move_extents
      
                  ocfs2_reserve_clusters
                    inode_lock GLOBAL_BITMAP_SYSTEM_INODE
      
                __ocfs2_flush_truncate_log
                    inode_lock GLOBAL_BITMAP_SYSTEM_INODE
      
      As the backtrace above shows, ocfs2_reserve_clusters() will call inode_lock
      against the global bitmap if the local allocator does not have sufficient
      clusters.  Once the global bitmap can meet the demand,
      ocfs2_reserve_clusters() will return success with the global bitmap locked.
      
      After ocfs2_reserve_clusters(), if the truncate log is full,
      __ocfs2_flush_truncate_log() will definitely deadlock because it needs
      to inode_lock the global bitmap, which has already been locked.
      
      To fix this bug, remove from ocfs2_lock_allocators_move_extents() the
      code which locks the global allocator, and put the removed code after
      __ocfs2_flush_truncate_log().
      
      ocfs2_lock_allocators_move_extents() is called from two places: one is
      here, and the other does not need the data allocator context, so this
      patch does not affect that caller.
      
      Link: http://lkml.kernel.org/r/20181101071422.14470-1-lchen@suse.com
      Signed-off-by: Larry Chen <lchen@suse.com>
      Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      df2055c0
    • Lorenzo Pieralisi's avatar
      ACPI/IORT: Fix iort_get_platform_device_domain() uninitialized pointer value · 8e3f3c06
      Lorenzo Pieralisi authored
      [ Upstream commit ea2412dc21cc790335d319181dddc43682aef164 ]
      
      Running the Clang static analyzer on IORT code detected the following
      error:
      
      Logic error: Branch condition evaluates to a garbage value
      
      in
      
      iort_get_platform_device_domain()
      
      If the named component associated with a given device has no IORT
      mappings, iort_get_platform_device_domain() exits its MSI mapping loop
      with msi_parent pointer containing garbage, which can lead to erroneous
      code path execution.
      
      Initialize the msi_parent pointer, fixing the bug.
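      The class of bug described above can be illustrated with a small,
      self-contained sketch. `struct mapping` and `find_msi_parent()` are
      hypothetical and much simpler than the IORT structures; the point is only
      that a pointer assigned inside a loop needs a defined value for the case
      where the loop never assigns it.

```c
#include <stddef.h>

/* Hypothetical mapping entry; the real IORT structures differ. */
struct mapping {
    int is_msi;
    const char *parent;
};

/*
 * Return the parent of the first MSI mapping, or NULL if there is none.
 * Initializing `found` matters: with no matching entry the loop never
 * assigns it, and an uninitialized pointer would be returned as garbage.
 */
static const char *find_msi_parent(const struct mapping *maps, size_t count)
{
    const char *found = NULL;
    size_t i;

    for (i = 0; i < count; i++) {
        if (maps[i].is_msi) {
            found = maps[i].parent;
            break;
        }
    }
    return found;
}
```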
      
      Fixes: d4f54a18 ("ACPI: platform: setup MSI domain for ACPI based platform device")
      Reported-by: Patrick Bellasi <patrick.bellasi@arm.com>
      Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8e3f3c06
    • Sagi Grimberg's avatar
      nvme: flush namespace scanning work just before removing namespaces · 9407fd14
      Sagi Grimberg authored
      [ Upstream commit f6c8e432cb0479255322c5d0335b9f1699a0270c ]
      
      nvme_stop_ctrl can be called also for reset flow and there is no need to
      flush the scan_work as namespaces are not being removed. This can cause
      deadlock in rdma, fc and loop drivers since nvme_stop_ctrl barriers
      before controller teardown (and specifically I/O cancellation of the
      scan_work itself) takes place, but the scan_work will be blocked anyways
      so there is no need to flush it.
      
      Instead, move scan_work flush to nvme_remove_namespaces() where it really
      needs to flush.
      Reported-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: James Smart <jsmart2021@gmail.com>
      Tested-by: Ewan D. Milne <emilne@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9407fd14
    • Colin Ian King's avatar
      fscache, cachefiles: remove redundant variable 'cache' · 0859bb25
      Colin Ian King authored
      [ Upstream commit 31ffa563833576bd49a8bf53120568312755e6e2 ]
      
      Variable 'cache' is being assigned but is never used hence it is
      redundant and can be removed.
      
      Cleans up clang warning:
      warning: variable 'cache' set but not used [-Wunused-but-set-variable]
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0859bb25
    • NeilBrown's avatar
      fscache: fix race between enablement and dropping of object · 38026d1a
      NeilBrown authored
      [ Upstream commit c5a94f434c82529afda290df3235e4d85873c5b4 ]
      
      It was observed that a process blocked indefinitely in
      __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP
      to be cleared via fscache_wait_for_deferred_lookup().
      
      At this time, ->backing_objects was empty, which would normally prevent
      __fscache_read_or_alloc_page() from getting to the point of waiting.
      This implies that ->backing_objects was cleared *after*
      __fscache_read_or_alloc_page() was entered.
      
      When an object is "killed" and then "dropped",
      FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then
      KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is
      ->backing_objects cleared.  This leaves a window where
      something else can set FSCACHE_COOKIE_LOOKING_UP and
      __fscache_read_or_alloc_page() can start waiting, before
      ->backing_objects is cleared.
      
      There is some uncertainty in this analysis, but it seems to fit the
      observations.  Adding the wake in this patch will be handled correctly
      by __fscache_read_or_alloc_page(), as it checks if ->backing_objects
      is empty again, after waiting.
      
      The customer who reported the hang also reports that the hang cannot be
      reproduced with this fix.
      
      The backtrace for the blocked process looked like:
      
      PID: 29360  TASK: ffff881ff2ac0f80  CPU: 3   COMMAND: "zsh"
       #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1
       #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed
       #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8
       #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e
       #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache]
       #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache]
       #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs]
       #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs]
       #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73
       #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs]
      #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756
      #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa
      #12 [ffff881ff43eff18] sys_read at ffffffff811fda62
      #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      38026d1a
    • Kees Cook's avatar
      pstore/ram: Correctly calculate usable PRZ bytes · b718d6be
      Kees Cook authored
      [ Upstream commit 89d328f637b9904b6d4c9af73c8a608b8dd4d6f8 ]
      
      The actual number of bytes stored in a PRZ is smaller than the
      bytes requested by platform data, since there is a header on each
      PRZ. Additionally, if ECC is enabled, there are trailing bytes used
      as well. Normally this mismatch doesn't matter since PRZs are circular
      buffers and the leading "overflow" bytes are just thrown away. However, in
      the case of a compressed record, this rather badly corrupts the results.
      
      This corruption was visible with "ramoops.mem_size=204800 ramoops.ecc=1".
      Any stored crashes could not be decompressed (producing a pstorefs
      "dmesg-*.enc.z" file), and errors were triggered at boot:
      
        [    2.790759] pstore: crypto_comp_decompress failed, ret = -22!
      
      Backporting this depends on commit 70ad35db ("pstore: Convert console
      write to use ->write_buf")
      Reported-by: Joel Fernandes <joel@joelfernandes.org>
      Fixes: b0aad7a9 ("pstore: Add compression support to pstore")
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b718d6be
    • Igor Druzhinin's avatar
      Revert "xen/balloon: Mark unallocated host memory as UNUSABLE" · b9c242b4
      Igor Druzhinin authored
      [ Upstream commit 123664101aa2156d05251704fc63f9bcbf77741a ]
      
      This reverts commit b3cf8528.
      
      That commit unintentionally broke Xen balloon memory hotplug with
      "hotplug_unpopulated" set to 1. Because the "System RAM" resource
      got assigned under a new "Unusable memory" resource in the IO/Mem tree,
      any attempt to online this memory would fail due to the general kernel
      restriction that "System RAM" resources must be 1st level only.
      
      The original issue that commit tried to work around, fa564ad9
      ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f,
      60-7f)"), was also amended by the following 03a55173 ("x86/PCI: Move
      and shrink AMD 64-bit window to avoid conflict"), which made the
      original fix to Xen ballooning unnecessary.
      Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b9c242b4
    • Srikanth Boddepalli's avatar
      xen: xlate_mmu: add missing header to fix 'W=1' warning · f02e0d5f
      Srikanth Boddepalli authored
      [ Upstream commit 72791ac854fea36034fa7976b748fde585008e78 ]
      
      Add a missing header; otherwise the compiler warns about a missing prototype:
      
      drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range' [-Wmissing-prototypes]
        int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
            ^~~~~~~~~~~~~~~~~~~~~~~~~
      Signed-off-by: Srikanth Boddepalli <boddepalli.srikanth@gmail.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f02e0d5f
    • Y.C. Chen's avatar
      drm/ast: fixed reading monitor EDID not stable issue · 6ee9b4de
      Y.C. Chen authored
      [ Upstream commit 300625620314194d9e6d4f6dda71f2dc9cf62d9f ]
      
      v1: over-sample data to increase the stability with some specific monitors
      v2: refine to avoid infinite loop
      v3: remove un-necessary "volatile" declaration
      
      [airlied: fix two checkpatch warnings]
      Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/1542858988-1127-1-git-send-email-yc_chen@aspeedtech.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6ee9b4de
    • shaoyunl's avatar
      drm/amdgpu: Add delay after enable RLC ucode · cc92ade5
      shaoyunl authored
      [ Upstream commit ad97d9de45835b6a0f71983b0ae0cffd7306730a ]
      
      The driver shouldn't try to access any GFX registers until RLC is idle.
      During the test, it took 12 seconds for RLC to clear the BUSY bit
      in the RLC_GPM_STAT register, which is unacceptable for the driver.
      As per the RLC engineer, it would take the RLC ucode less than 10,000 GFXCLK
      cycles to finish its critical section. At the lowest 300M engine clock
      setting (default from vbios), a 50 us delay is enough.
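      The arithmetic behind the 50 us figure can be checked with a short helper.
      This is a sketch that reads "300M" as 300 MHz, which is an assumption;
      `cycles_to_us()` is a hypothetical name and not part of the driver.

```c
/* Convert a cycle count to microseconds, rounding up to the next whole us. */
static unsigned long long cycles_to_us(unsigned long long cycles,
                                       unsigned long long clock_hz)
{
    return (cycles * 1000000ULL + clock_hz - 1) / clock_hz;
}
```

      With 10,000 cycles at 300,000,000 Hz the exact value is about 33.3 us, so
      the helper rounds up to 34 us, which leaves some margin below 50 us.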
      
      This commit fixes the hang when RLC introduces the workaround for XGMI,
      which requires more cycles to set up more registers than normal.
      Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
      Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      cc92ade5
    • Pan Bian's avatar
      net: hisilicon: remove unexpected free_netdev · 8a69edb8
      Pan Bian authored
      [ Upstream commit c758940158bf29fe14e9d0f89d5848f227b48134 ]
      
      The net device ndev is freed via free_netdev when failing to register
      the device. The control flow then jumps to the error handling code
      block, where ndev is used and freed again, resulting in a use-after-free bug.
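      A common way to avoid this pattern is to free the object in exactly one
      place, at the cleanup label. The sketch below is a user-space illustration
      only; `struct dev`, `dev_register()`, `dev_free()` and `dev_probe()` are
      hypothetical names, and the `frees` counter exists purely so the behaviour
      can be observed.

```c
#include <stdlib.h>

/* Hypothetical device object. */
struct dev {
    int registered;
};

/* Counts calls to dev_free() so a caller can check for double frees. */
static int frees;

static void dev_free(struct dev *d)
{
    free(d);
    frees++;
}

/* Stand-in for a registration call that can fail. */
static int dev_register(struct dev *d, int should_fail)
{
    if (should_fail)
        return -1;
    d->registered = 1;
    return 0;
}

/*
 * Probe with a single cleanup label: the failure branch only jumps,
 * and the object is freed in exactly one place.
 */
static struct dev *dev_probe(int should_fail)
{
    struct dev *d = calloc(1, sizeof(*d));

    if (!d)
        return NULL;
    if (dev_register(d, should_fail) != 0)
        goto err_free;   /* no dev_free() here: the label does it */
    return d;

err_free:
    dev_free(d);
    return NULL;
}
```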
      Signed-off-by: Pan Bian <bianpan2016@163.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8a69edb8
    • Josh Elsasser's avatar
      ixgbe: recognize 1000BaseLX SFP modules as 1Gbps · 689dee4b
      Josh Elsasser authored
      [ Upstream commit a8bf879af7b1999eba36303ce9cc60e0e7dd816c ]
      
      Add the two 1000BaseLX enum values to the X550's check for 1Gbps modules,
      allowing the core driver code to establish a link over this SFP type.
      
      This is done by the out-of-tree driver but the fix wasn't in mainline.
      
      Fixes: e23f3336 ("ixgbe: Fix 1G and 10G link stability for X550EM_x SFP+")
      Fixes: 6a14ee0c ("ixgbe: Add X550 support function pointers")
      Signed-off-by: Josh Elsasser <jelsasser@appneta.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      689dee4b
    • Yunjian Wang's avatar
      igb: fix uninitialized variables · 9966f78a
      Yunjian Wang authored
      [ Upstream commit e4c39f7926b4de355f7df75651d75003806aae09 ]
      
      This patch fixes the possible use of the uninitialized variable 'phy_word'.
      Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9966f78a
    • Kiran Kumar Modukuri's avatar
      cachefiles: Fix page leak in cachefiles_read_backing_file while vmscan is active · 7b99a0d7
      Kiran Kumar Modukuri authored
      [ Upstream commit 9a24ce5b66f9c8190d63b15f4473600db4935f1f ]
      
      [Description]
      
      In a heavily loaded system where the system pagecache is nearing memory
      limits and fscache is enabled, pages can be leaked by fscache while trying
      to read pages from the cachefiles backend.  This can happen because two
      applications can be reading the same page from a single mount, so two
      threads can be trying to read the backing page at the same time.  This
      results in one of the threads finding that a page for the backing file or
      netfs file is already in the radix tree.  During the error handling,
      cachefiles does not clean up the reference on the backing page, leading to
      a page leak.
      
      [Fix]
      The fix is straightforward: decrement the reference when an error is
      encountered.
      
        [dhowells: Note that I've removed the clearance and put of newpage as
         they aren't attested in the commit message and don't appear to actually
         achieve anything since a new page is only allocated is newpage!=NULL and
         any residual new page is cleared before returning.]
      
      [Testing]
      I have tested the fix using the following method for 12+ hrs.
      
      1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc <server_ip>:/export /mnt/nfs
      2) create 10000 files of 2.8MB in an NFS mount.
      3) start a thread to simulate heavy VM pressure
         (while true ; do echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; done)&
      4) start multiple parallel reader for data set at same time
         find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
         find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
         find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
         ..
         ..
         find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
         find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
      5) finally check using cat /proc/fs/fscache/stats | grep -i pages ;
         free -h , cat /proc/meminfo and page-types -r -b lru
         to ensure all pages are freed.
      Reviewed-by: Daniel Axtens <dja@axtens.net>
      Signed-off-by: Shantanu Goel <sgoel01@yahoo.com>
      Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com>
      [dja: forward ported to current upstream]
      Signed-off-by: Daniel Axtens <dja@axtens.net>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7b99a0d7
    • Taehee Yoo's avatar
      netfilter: nf_tables: deactivate expressions in rule replecement routine · 397727e7
      Taehee Yoo authored
      [ Upstream commit ca08987885a147643817d02bf260bc4756ce8cd4 ]
      
      There is no expression deactivation call from the rule replacement path,
      hence, chain counter is not decremented. A few steps to reproduce the
      problem:
      
         %nft add table ip filter
         %nft add chain ip filter c1
         %nft add chain ip filter c2
         %nft add rule ip filter c1 jump c2
         %nft replace rule ip filter c1 handle 3 accept
         %nft flush ruleset
      
      <jump c2> expression means immediate NFT_JUMP to chain c2.
      Reference count of chain c2 is increased when the rule is added.
      
      When rule is deleted or replaced, the reference counter of c2 should be
      decreased via nft_rule_expr_deactivate() which calls
      nft_immediate_deactivate().
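      The reference-counting rule described above can be modelled with a few
      lines of plain C. The structures and helpers below are hypothetical and
      far simpler than nf_tables; they only show why a replace operation has to
      drop the old rule's reference to its jump target.

```c
#include <stddef.h>

/* Hypothetical chain with a use counter. */
struct chain {
    int use;
};

/* Hypothetical rule; jump_target is NULL if the rule does not jump. */
struct rule {
    struct chain *jump_target;
};

static void rule_activate(struct rule *r)
{
    if (r->jump_target)
        r->jump_target->use++;
}

static void rule_deactivate(struct rule *r)
{
    if (r->jump_target)
        r->jump_target->use--;
}

/* Replace: take the new rule's references and drop the old rule's. */
static void rule_replace(struct rule *old_rule, struct rule *new_rule)
{
    rule_activate(new_rule);
    rule_deactivate(old_rule);
}

/* A chain may only be destroyed when nothing refers to it. */
static int chain_can_destroy(const struct chain *c)
{
    return c->use == 0;
}
```

      If `rule_replace()` skipped `rule_deactivate()`, the target chain would
      keep a non-zero counter after the jump rule is gone, which matches the
      leaked count the message describes.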
      
      Splat looks like:
      [  214.396453] WARNING: CPU: 1 PID: 21 at net/netfilter/nf_tables_api.c:1432 nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
      [  214.398983] Modules linked in: nf_tables nfnetlink
      [  214.398983] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 4.20.0-rc2+ #44
      [  214.398983] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
      [  214.398983] RIP: 0010:nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
      [  214.398983] Code: 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8e 00 00 00 48 8b 7b 58 e8 e1 2c 4e c6 48 89 df e8 d9 2c 4e c6 eb 9a <0f> 0b eb 96 0f 0b e9 7e fe ff ff e8 a7 7e 4e c6 e9 a4 fe ff ff e8
      [  214.398983] RSP: 0018:ffff8881152874e8 EFLAGS: 00010202
      [  214.398983] RAX: 0000000000000001 RBX: ffff88810ef9fc28 RCX: ffff8881152876f0
      [  214.398983] RDX: dffffc0000000000 RSI: 1ffff11022a50ede RDI: ffff88810ef9fc78
      [  214.398983] RBP: 1ffff11022a50e9d R08: 0000000080000000 R09: 0000000000000000
      [  214.398983] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff11022a50eba
      [  214.398983] R13: ffff888114446e08 R14: ffff8881152876f0 R15: ffffed1022a50ed6
      [  214.398983] FS:  0000000000000000(0000) GS:ffff888116400000(0000) knlGS:0000000000000000
      [  214.398983] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  214.398983] CR2: 00007fab9bb5f868 CR3: 000000012aa16000 CR4: 00000000001006e0
      [  214.398983] Call Trace:
      [  214.398983]  ? nf_tables_table_destroy.isra.37+0x100/0x100 [nf_tables]
      [  214.398983]  ? __kasan_slab_free+0x145/0x180
      [  214.398983]  ? nf_tables_trans_destroy_work+0x439/0x830 [nf_tables]
      [  214.398983]  ? kfree+0xdb/0x280
      [  214.398983]  nf_tables_trans_destroy_work+0x5f5/0x830 [nf_tables]
      [ ... ]
      
      Fixes: bb7b40ae ("netfilter: nf_tables: bogus EBUSY in chain deletions")
      Reported-by: Christoph Anton Mitterer <calestyo@scientia.net>
      Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914505
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=201791
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      397727e7
    • Marek Szyprowski's avatar
      usb: gadget: u_ether: fix unsafe list iteration · b641ee14
      Marek Szyprowski authored
      [ Upstream commit c9287fa657b3328b4549c0ab39ea7f197a3d6a50 ]
      
      list_for_each_entry_safe() is not safe for deleting entries from the
      list if the spin lock, which protects it, is released and reacquired during
      the list iteration. Fix this issue by replacing this construction with
      a simple check if list is empty and removing the first entry in each
      iteration. This is almost equivalent to a revert of the commit mentioned in
      the Fixes: tag.
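      The "check for empty, then remove the first entry" loop can be sketched
      as follows. This is a user-space illustration with a toy lock flag instead
      of a spinlock and a hand-rolled singly linked list instead of the kernel
      list API; all names are hypothetical.

```c
#include <stddef.h>

/* Minimal singly linked request; a hypothetical stand-in for a queued request. */
struct req {
    struct req *next;
    int submitted;
};

struct queue {
    struct req *head;
    int locked;          /* toy lock flag, not a real spinlock */
};

static void q_lock(struct queue *q)   { q->locked = 1; }
static void q_unlock(struct queue *q) { q->locked = 0; }

/* Work done outside the lock; other contexts could change the list meanwhile. */
static void submit(struct req *r)
{
    r->submitted = 1;
}

/*
 * Drain the queue by repeatedly detaching the first entry. No cursor or
 * cached "next" pointer survives across the unlocked section, so changes
 * made to the list while the lock is dropped cannot leave a stale pointer.
 */
static int drain(struct queue *q)
{
    int n = 0;

    q_lock(q);
    while (q->head != NULL) {
        struct req *r = q->head;

        q->head = r->next;   /* detach while holding the lock */
        r->next = NULL;
        q_unlock(q);

        submit(r);           /* lock is dropped here */
        n++;

        q_lock(q);           /* re-read q->head from scratch */
    }
    q_unlock(q);
    return n;
}
```

      A "safe" iterator, by contrast, caches the next element before the loop
      body runs, and that cached pointer can go stale once the lock is released.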
      
      This patch fixes following issue:
      --->8---
      Unable to handle kernel NULL pointer dereference at virtual address 00000104
      pgd = (ptrval)
      [00000104] *pgd=00000000
      Internal error: Oops: 817 [#1] PREEMPT SMP ARM
      Modules linked in:
      CPU: 1 PID: 84 Comm: kworker/1:1 Not tainted 4.20.0-rc2-next-20181114-00009-g8266b35ec404 #1061
      Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
      Workqueue: events eth_work
      PC is at rx_fill+0x60/0xac
      LR is at _raw_spin_lock_irqsave+0x50/0x5c
      pc : [<c065fee0>]    lr : [<c0a056b8>]    psr: 80000093
      sp : ee7fbee8  ip : 00000100  fp : 00000000
      r10: 006000c0  r9 : c10b0ab0  r8 : ee7eb5c0
      r7 : ee7eb614  r6 : ee7eb5ec  r5 : 000000dc  r4 : ee12ac00
      r3 : ee12ac24  r2 : 00000200  r1 : 60000013  r0 : ee7eb5ec
      Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
      Control: 10c5387d  Table: 6d5dc04a  DAC: 00000051
      Process kworker/1:1 (pid: 84, stack limit = 0x(ptrval))
      Stack: (0xee7fbee8 to 0xee7fc000)
      ...
      [<c065fee0>] (rx_fill) from [<c0143b7c>] (process_one_work+0x200/0x738)
      [<c0143b7c>] (process_one_work) from [<c0144118>] (worker_thread+0x2c/0x4c8)
      [<c0144118>] (worker_thread) from [<c014a8a4>] (kthread+0x128/0x164)
      [<c014a8a4>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
      Exception stack(0xee7fbfb0 to 0xee7fbff8)
      ...
      ---[ end trace 64480bc835eba7d6 ]---
      
      Fixes: fea14e68 ("usb: gadget: u_ether: use better list accessors")
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b641ee14
    • Lorenzo Bianconi's avatar
      net: thunderx: fix NULL pointer dereference in nic_remove · ff791c9e
      Lorenzo Bianconi authored
      [ Upstream commit 24a6d2dd263bc910de018c78d1148b3e33b94512 ]
      
      Fix a possible NULL pointer dereference in the nic_remove routine
      when removing the nicpf module if nic_probe fails.
      The issue can be triggered with the following reproducer:
      
      $rmmod nicvf
      $rmmod nicpf
      
      [  521.412008] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000014
      [  521.422777] Mem abort info:
      [  521.425561]   ESR = 0x96000004
      [  521.428624]   Exception class = DABT (current EL), IL = 32 bits
      [  521.434535]   SET = 0, FnV = 0
      [  521.437579]   EA = 0, S1PTW = 0
      [  521.440730] Data abort info:
      [  521.443603]   ISV = 0, ISS = 0x00000004
      [  521.447431]   CM = 0, WnR = 0
      [  521.450417] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000072a3da42
      [  521.457022] [0000000000000014] pgd=0000000000000000
      [  521.461916] Internal error: Oops: 96000004 [#1] SMP
      [  521.511801] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018
      [  521.518664] pstate: 80400005 (Nzcv daif +PAN -UAO)
      [  521.523451] pc : nic_remove+0x24/0x88 [nicpf]
      [  521.527808] lr : pci_device_remove+0x48/0xd8
      [  521.532066] sp : ffff000013433cc0
      [  521.535370] x29: ffff000013433cc0 x28: ffff810f6ac50000
      [  521.540672] x27: 0000000000000000 x26: 0000000000000000
      [  521.545974] x25: 0000000056000000 x24: 0000000000000015
      [  521.551274] x23: ffff8007ff89a110 x22: ffff000001667070
      [  521.556576] x21: ffff8007ffb170b0 x20: ffff8007ffb17000
      [  521.561877] x19: 0000000000000000 x18: 0000000000000025
      [  521.567178] x17: 0000000000000000 x16: 000000000000010ffc33ff98 x8 : 0000000000000000
      [  521.593683] x7 : 0000000000000000 x6 : 0000000000000001
      [  521.598983] x5 : 0000000000000002 x4 : 0000000000000003
      [  521.604284] x3 : ffff8007ffb17184 x2 : ffff8007ffb17184
      [  521.609585] x1 : ffff000001662118 x0 : ffff000008557be0
      [  521.614887] Process rmmod (pid: 1897, stack limit = 0x00000000859535c3)
      [  521.621490] Call trace:
      [  521.623928]  nic_remove+0x24/0x88 [nicpf]
      [  521.627927]  pci_device_remove+0x48/0xd8
      [  521.631847]  device_release_driver_internal+0x1b0/0x248
      [  521.637062]  driver_detach+0x50/0xc0
      [  521.640628]  bus_remove_driver+0x60/0x100
      [  521.644627]  driver_unregister+0x34/0x60
      [  521.648538]  pci_unregister_driver+0x24/0xd8
      [  521.652798]  nic_cleanup_module+0x14/0x111c [nicpf]
      [  521.657672]  __arm64_sys_delete_module+0x150/0x218
      [  521.662460]  el0_svc_handler+0x94/0x110
      [  521.666287]  el0_svc+0x8/0xc
      [  521.669160] Code: aa1e03e0 9102c295 d503201f f9404eb3 (b9401660)
      
      Fixes: 4863dea3 ("net: Adding support for Cavium ThunderX network controller")
      Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ff791c9e
    • Yi Wang's avatar
      x86/kvm/vmx: fix old-style function declaration · 7fdd58de
      Yi Wang authored
      [ Upstream commit 1e4329ee2c52692ea42cc677fb2133519718b34a ]
      
      The inline keyword which is not at the beginning of the function
      declaration may trigger the following build warnings, so let's fix it:
      
      arch/x86/kvm/vmx.c:1309:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      arch/x86/kvm/vmx.c:5947:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      arch/x86/kvm/vmx.c:5985:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      arch/x86/kvm/vmx.c:6023:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7fdd58de
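
For readers unfamiliar with the -Wold-style-declaration warning quoted in the commit above, the following is a minimal sketch of the pattern. It is not taken from arch/x86/kvm/vmx.c; the function name `add_one` is hypothetical and only illustrates where `inline` is expected to sit in a declaration.

```c
/* Hypothetical helper for illustration; not code from vmx.c. */

/*
 * Old style: 'inline' placed after the return type. With
 * -Wold-style-declaration enabled, GCC reports that 'inline' is not
 * at the beginning of the declaration:
 *
 *     static int inline add_one(int x) { return x + 1; }
 */

/* Preferred order: storage class and 'inline' first, then the type. */
static inline int add_one(int x)
{
	return x + 1;
}
```

Both spellings compile to the same code; the reordering only silences the warning.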
    • Yi Wang's avatar
      KVM: x86: fix empty-body warnings · bb3f8691
      Yi Wang authored
      [ Upstream commit 354cb410d87314e2eda344feea84809e4261570a ]
      
      We get the following warnings about empty statements when building
      with 'W=1':
      
      arch/x86/kvm/lapic.c:632:53: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1907:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1936:65: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      arch/x86/kvm/lapic.c:1975:44: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
      
      Rework the debug helper macro to get rid of these warnings.
      Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bb3f8691
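
As a rough illustration of the -Wempty-body issue described in the commit above, here is a minimal sketch of a debug macro that expands to nothing versus one that expands to a no-op statement. The names `demo_debug`, `DEMO_DEBUG`, and `clamp_positive` are assumptions made up for this example; the actual helper in lapic.c may be written differently.

```c
#include <stdio.h>

/* Hypothetical debug macro; names are assumptions, not from lapic.c. */
#ifdef DEMO_DEBUG
#define demo_debug(...) printf(__VA_ARGS__)
#else
/*
 * If this expanded to nothing, the call site below would become
 * "if (v < 0) ;", which -Wempty-body flags. Expanding to a no-op
 * statement keeps the 'if' body non-empty.
 */
#define demo_debug(...) do {} while (0)
#endif

/* Returns v, or 0 when v is negative; logs only in debug builds. */
static int clamp_positive(int v)
{
	if (v < 0)
		demo_debug("negative value %d\n", v);
	return v < 0 ? 0 : v;
}
```

The `do {} while (0)` form also keeps the macro safe to use as a single statement followed by a semicolon, including inside an unbraced `if`/`else`.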