1. 06 Jul, 2017 3 commits
  2. 28 Feb, 2017 3 commits
  3. 03 Feb, 2017 1 commit
    • Dan Streetman's avatar
      zswap: disable changing params if init fails · d7b028f5
      Dan Streetman authored
      Add zswap_init_failed bool that prevents changing any of the module
      params, if init_zswap() fails, and set zswap_enabled to false.  Change
      'enabled' param to a callback, and check zswap_init_failed before
      allowing any change to 'enabled', 'zpool', or 'compressor' params.
      
      Any driver that is built-in to the kernel will not be unloaded if its
      init function returns error, and its module params remain accessible for
      users to change via sysfs.  Since zswap uses param callbacks, which
      assume that zswap has been initialized, changing the zswap params after
      a failed initialization will result in WARNING due to the param
      callbacks expecting a pool to already exist.  This prevents that by
      immediately exiting any of the param callbacks if initialization failed.
      
      This was reported here:
        https://marc.info/?l=linux-mm&m=147004228125528&w=4
      
      And fixes this WARNING:
        [  429.723476] WARNING: CPU: 0 PID: 5140 at mm/zswap.c:503 __zswap_pool_current+0x56/0x60
      
      The warning is just noise, and not serious.  However, when init fails,
      zswap frees all its percpu dstmem pages and its kmem cache.  The kmem
      cache might be serious, if kmem_cache_alloc(NULL, gfp) has problems; but
      the percpu dstmem pages are definitely a problem, as they're used as
      temporary buffer for compressed pages before copying into place in the
      zpool.
      
      If the user does get zswap enabled after an init failure, then zswap
      will likely Oops on the first page it tries to compress (or worse, start
      corrupting memory).
      
      Fixes: 90b0fc26 ("zswap: change zpool/compressor at runtime")
      Link: http://lkml.kernel.org/r/20170124200259.16191-2-ddstreet@ieee.orgSigned-off-by: default avatarDan Streetman <dan.streetman@canonical.com>
      Reported-by: default avatarMarcin Miroslaw <marcin@mejor.pl>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d7b028f5
  4. 01 Dec, 2016 2 commits
  5. 21 May, 2016 1 commit
    • Dan Streetman's avatar
      mm/zswap: use workqueue to destroy pool · 200867af
      Dan Streetman authored
      Add a work_struct to struct zswap_pool, and change __zswap_pool_empty to
      use the workqueue instead of using call_rcu().
      
      When zswap destroys a pool no longer in use, it uses call_rcu() to
      perform the destruction/freeing.  Since that executes in softirq
      context, it must not sleep.  However, actually destroying the pool
      involves freeing the per-cpu compressors (which requires locking the
      cpu_add_remove_lock mutex) and freeing the zpool, for which the
      implementation may sleep (e.g.  zsmalloc calls kmem_cache_destroy, which
      locks the slab_mutex).  So if either mutex is currently taken, or any
      other part of the compressor or zpool implementation sleeps, it will
      result in a BUG().
      
      It's not easy to reproduce this when changing zswap's params normally.
      In testing with a loaded system, this does not fail:
      
        $ cd /sys/module/zswap/parameters
        $ echo lz4 > compressor ; echo zsmalloc > zpool
      
      nor does this:
      
        $ while true ; do
        > echo lzo > compressor ; echo zbud > zpool
        > sleep 1
        > echo lz4 > compressor ; echo zsmalloc > zpool
        > sleep 1
        > done
      
      although it's still possible either of those might fail, depending on
      whether anything else besides zswap has locked the mutexes.
      
      However, changing a parameter with no delay immediately causes the
      schedule while atomic BUG:
      
        $ while true ; do
        > echo lzo > compressor ; echo lz4 > compressor
        > done
      
      This is essentially the same as Yu Zhao's proposed patch to zsmalloc,
      but moved to zswap, to cover compressor and zpool freeing.
      
      Fixes: f1c54846 ("zswap: dynamic pool creation")
      Signed-off-by: default avatarDan Streetman <ddstreet@ieee.org>
      Reported-by: default avatarYu Zhao <yuzhao@google.com>
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Dan Streetman <dan.streetman@canonical.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      200867af
  6. 06 May, 2016 1 commit
    • Dan Streetman's avatar
      mm/zswap: provide unique zpool name · 32a4e169
      Dan Streetman authored
      Instead of using "zswap" as the name for all zpools created, add an
      atomic counter and use "zswap%x" with the counter number for each zpool
      created, to provide a unique name for each new zpool.
      
      As zsmalloc, one of the zpool implementations, requires/expects a unique
      name for each pool created, zswap should provide a unique name.  The
      zsmalloc pool creation does not fail if a new pool with a conflicting
      name is created, unless CONFIG_ZSMALLOC_STAT is enabled; in that case,
      zsmalloc pool creation fails with -ENOMEM.  Then zswap will be unable to
      change its compressor parameter if its zpool is zsmalloc; it also will
      be unable to change its zpool parameter back to zsmalloc, if it has any
      existing old zpool using zsmalloc with page(s) in it.  Attempts to
      change the parameters will result in failure to create the zpool.  This
      changes zswap to provide a unique name for each zpool creation.
      
      Fixes: f1c54846 ("zswap: dynamic pool creation")
      Signed-off-by: default avatarDan Streetman <ddstreet@ieee.org>
      Reported-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Dan Streetman <dan.streetman@canonical.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      32a4e169
  7. 04 Apr, 2016 1 commit
    • Kirill A. Shutemov's avatar
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  8. 18 Dec, 2015 1 commit
  9. 07 Nov, 2015 3 commits
    • Dan Streetman's avatar
      zswap: use charp for zswap param strings · c99b42c3
      Dan Streetman authored
      Instead of using a fixed-length string for the zswap params, use charp.
      This simplifies the code and uses less memory, as most zswap param strings
      will be less than the current maximum length.
      Signed-off-by: default avatarDan Streetman <ddstreet@ieee.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c99b42c3
    • Alexey Klimov's avatar
      mm/zswap.c: remove unneeded initialization to NULL in zswap_entry_find_get() · b0c9865f
      Alexey Klimov authored
      On the next line entry variable will be re-initialized so no need to init
      it with NULL.
      Signed-off-by: default avatarAlexey Klimov <alexey.klimov@linaro.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b0c9865f
    • Mel Gorman's avatar
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman authored
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimisitic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and was depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d0164adc
  10. 10 Sep, 2015 2 commits
  11. 08 Sep, 2015 2 commits
  12. 26 Jun, 2015 1 commit
  13. 13 Feb, 2015 1 commit
  14. 13 Dec, 2014 2 commits
  15. 13 Nov, 2014 1 commit
  16. 08 Aug, 2014 2 commits
    • Fabian Frederick's avatar
      mm/zswap.c: add __init to zswap_entry_cache_destroy() · c119239b
      Fabian Frederick authored
      zswap_entry_cache_destroy() is only called by __init init_zswap().
      
      This patch also fixes function name zswap_entry_cache_ s/destory/destroy
      Signed-off-by: default avatarFabian Frederick <fabf@skynet.be>
      Acked-by: default avatarSeth Jennings <sjennings@variantweb.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c119239b
    • Johannes Weiner's avatar
      mm: memcontrol: rewrite uncharge API · 0a31bc97
      Johannes Weiner authored
      The memcg uncharging code that is involved towards the end of a page's
      lifetime - truncation, reclaim, swapout, migration - is impressively
      complicated and fragile.
      
      Because anonymous and file pages were always charged before they had their
      page->mapping established, uncharges had to happen when the page type
      could still be known from the context; as in unmap for anonymous, page
      cache removal for file and shmem pages, and swap cache truncation for swap
      pages.  However, these operations happen well before the page is actually
      freed, and so a lot of synchronization is necessary:
      
      - Charging, uncharging, page migration, and charge migration all need
        to take a per-page bit spinlock as they could race with uncharging.
      
      - Swap cache truncation happens during both swap-in and swap-out, and
        possibly repeatedly before the page is actually freed.  This means
        that the memcg swapout code is called from many contexts that make
        no sense and it has to figure out the direction from page state to
        make sure memory and memory+swap are always correctly charged.
      
      - On page migration, the old page might be unmapped but then reused,
        so memcg code has to prevent untimely uncharging in that case.
        Because this code - which should be a simple charge transfer - is so
        special-cased, it is not reusable for replace_page_cache().
      
      But now that charged pages always have a page->mapping, introduce
      mem_cgroup_uncharge(), which is called after the final put_page(), when we
      know for sure that nobody is looking at the page anymore.
      
      For page migration, introduce mem_cgroup_migrate(), which is called after
      the migration is successful and the new page is fully rmapped.  Because
      the old page is no longer uncharged after migration, prevent double
      charges by decoupling the page's memcg association (PCG_USED and
      pc->mem_cgroup) from the page holding an actual charge.  The new bits
      PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
      to the new page during migration.
      
      mem_cgroup_migrate() is suitable for replace_page_cache() as well,
      which gets rid of mem_cgroup_replace_page_cache().  However, care
      needs to be taken because both the source and the target page can
      already be charged and on the LRU when fuse is splicing: grab the page
      lock on the charge moving side to prevent changing pc->mem_cgroup of a
      page under migration.  Also, the lruvecs of both pages change as we
      uncharge the old and charge the new during migration, and putback may
      race with us, so grab the lru lock and isolate the pages iff on LRU to
      prevent races and ensure the pages are on the right lruvec afterward.
      
      Swap accounting is massively simplified: because the page is no longer
      uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
      transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
      before the final put_page() in page reclaim.
      
      Finally, page_cgroup changes are now protected by whatever protection the
      page itself offers: anonymous pages are charged under the page table lock,
      whereas page cache insertions, swapin, and migration hold the page lock.
      Uncharging happens under full exclusion with no outstanding references.
      Charging and uncharging also ensure that the page is off-LRU, which
      serializes against charge migration.  Remove the very costly page_cgroup
      lock and set pc->flags non-atomically.
      
      [mhocko@suse.cz: mem_cgroup_charge_statistics needs preempt_disable]
      [vdavydov@parallels.com: fix flags definition]
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Tested-by: default avatarJet Chen <jet.chen@intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Tested-by: default avatarFelipe Balbi <balbi@ti.com>
      Signed-off-by: default avatarVladimir Davydov <vdavydov@parallels.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0a31bc97
  17. 07 Aug, 2014 1 commit
  18. 04 Jun, 2014 1 commit
  19. 07 Apr, 2014 4 commits
  20. 20 Mar, 2014 1 commit
    • Srivatsa S. Bhat's avatar
      mm, zswap: Fix CPU hotplug callback registration · 57637824
      Srivatsa S. Bhat authored
      Subsystems that want to register CPU hotplug callbacks, as well as perform
      initialization for the CPUs that are already online, often do it as shown
      below:
      
      	get_online_cpus();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	register_cpu_notifier(&foobar_cpu_notifier);
      
      	put_online_cpus();
      
      This is wrong, since it is prone to ABBA deadlocks involving the
      cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
      with CPU hotplug operations).
      
      Instead, the correct and race-free way of performing the callback
      registration is:
      
      	cpu_notifier_register_begin();
      
      	for_each_online_cpu(cpu)
      		init_cpu(cpu);
      
      	/* Note the use of the double underscored version of the API */
      	__register_cpu_notifier(&foobar_cpu_notifier);
      
      	cpu_notifier_register_done();
      
      Fix the zswap code by using this latter form of callback registration.
      
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      57637824
  21. 24 Jan, 2014 1 commit
  22. 13 Nov, 2013 3 commits
  23. 17 Oct, 2013 1 commit
    • Weijie Yang's avatar
      mm/zswap: bugfix: memory leak when re-swapon · aa9bca05
      Weijie Yang authored
      zswap_tree is not freed when swapoff, and it got re-kmalloced in swapon,
      so a memory leak occurs.
      
      Free the memory of zswap_tree in zswap_frontswap_invalidate_area().
      Signed-off-by: default avatarWeijie Yang <weijie.yang@samsung.com>
      Reviewed-by: default avatarBob Liu <bob.liu@oracle.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Reviewed-by: default avatarMinchan Kim <minchan@kernel.org>
      Cc: <stable@vger.kernel.org>
      From: Weijie Yang <weijie.yang@samsung.com>
      Subject: mm/zswap: bugfix: memory leak when invalidate and reclaim occur concurrently
      
      Consider the following scenario:
      thread 0: reclaim entry x (get refcount, but not call zswap_get_swap_cache_page)
      thread 1: call zswap_frontswap_invalidate_page to invalidate entry x.
      	finished, entry x and its zbud is not freed as its refcount != 0
      	now, the swap_map[x] = 0
      thread 0: now call zswap_get_swap_cache_page
      	swapcache_prepare return -ENOENT because entry x is not used any more
      	zswap_get_swap_cache_page return ZSWAP_SWAPCACHE_NOMEM
      	zswap_writeback_entry do nothing except put refcount
      Now, the memory of zswap_entry x and its zpage leak.
      
      Modify:
       - check the refcount in fail path, free memory if it is not referenced.
      
       - use ZSWAP_SWAPCACHE_FAIL instead of ZSWAP_SWAPCACHE_NOMEM as the fail path
         can be not only caused by nomem but also by invalidate.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarWeijie Yang <weijie.yang@samsung.com>
      Reviewed-by: default avatarBob Liu <bob.liu@oracle.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: <stable@vger.kernel.org>
      Acked-by: default avatarSeth Jennings <sjenning@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      aa9bca05
  24. 11 Sep, 2013 1 commit