1. 05 Dec, 2017 1 commit
  2. 30 Nov, 2017 1 commit
  3. 01 Sep, 2017 1 commit
    • md/bitmap: disable bitmap_resize for file-backed bitmaps. · e8a27f83
      NeilBrown authored
      bitmap_resize() does not work for file-backed bitmaps. The
      buffer_heads are allocated and initialized when the bitmap is read
      from the file, but resize doesn't read from the file; it loads
      from the internal bitmap. When it comes time to write the new
      bitmap, the bh is non-existent and we crash.
      
      The common case when growing an array involves making the array larger,
      and that normally means making the bitmap larger.  Doing
      that inside the kernel is possible, but would need more code.
      It is probably easier to require people who use file-backed
      bitmaps to remove them and re-add after a reshape.
      
      So this patch disables the resizing of arrays which have
      file-backed bitmaps.  This is better than crashing.
      Reported-by: Zhilong Liu <zlliu@suse.com>
      Fixes: d60b479d ("md/bitmap: add bitmap_resize function to allow bitmap resizing.")
      Cc: stable@vger.kernel.org (v3.5+).
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
  4. 24 Aug, 2017 1 commit
    • md/bitmap: copy correct data for bitmap super · 8031c3dd
      Shaohua Li authored
      raid5 cache could write the bitmap superblock before the superblock is
      initialized. The bitmap superblock is less than 512B. The current code
      copies the superblock to a new page and writes the whole 512B, which
      zeroes the data after the superblock. Unfortunately that data could
      include the bitmap, which we should preserve. This patch makes the
      superblock read fetch a 4k chunk and always copies the full 4k to the
      new page, so the superblock write puts the old data back on disk and
      we don't change the bitmap.
      Reported-by: Song Liu <songliubraving@fb.com>
      Reviewed-by: Song Liu <songliubraving@fb.com>
      Cc: stable@vger.kernel.org (4.10+)
      Signed-off-by: Shaohua Li <shli@fb.com>
  5. 10 Jul, 2017 1 commit
  6. 24 May, 2017 1 commit
  7. 24 Mar, 2017 1 commit
  8. 16 Mar, 2017 2 commits
    • md: move bitmap_destroy to the beginning of __md_stop · 48df498d
      Guoqing Jiang authored
      Since we have switched to handling the METADATA_UPDATED msg for
      md-cluster synchronously, process_metadata_update now depends on
      mddev->thread->wqueue.

      With this change, a clustered raid could possibly hang if the array
      receives a METADATA_UPDATED msg after it has unregistered
      mddev->thread, so we need to stop the clustered raid
      (bitmap_destroy -> bitmap_free -> md_cluster_stop) earlier than we
      unregister the thread (mddev_detach -> md_unregister_thread).

      This change should be safe for non-clustered raid, since all writes
      are stopped before the destroy. Also, in md_run we activate the
      personality (pers->run()) before activating the bitmap
      (bitmap_create()), so it is pleasingly symmetric to stop the bitmap
      (bitmap_destroy()) before stopping the personality (__md_stop()
      calls pers->free()). We achieve this by moving bitmap_destroy to
      the beginning of __md_stop.

      But we don't want to break the code that waits for behind IO, as
      Shaohua mentioned, so introduce bitmap_wait_behind_writes to run
      that code, and call the new function in both mddev_detach and
      bitmap_destroy. That keeps the original behind-IO behaviour and
      also fits the new condition.
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
    • md-cluster: introduce cluster_check_sync_size · b98938d1
      Guoqing Jiang authored
      Supporting resize is a little complex for clustered raid, since we
      need to ensure all the nodes share the same knowledge about the
      size of the raid.

      We achieve this by checking the sync_size stored in each node's
      bitmap; we can only change the capacity after
      cluster_check_sync_size returns 0.

      Also, get_bitmap_from_slot is added to get a slot's bitmap, and
      some functions are exported because they are used in
      cluster_check_sync_size().

      We can also reuse get_bitmap_from_slot to remove redundant code in
      bitmap_copy_from_slot.
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
  9. 09 Dec, 2016 1 commit
    • md: separate flags for superblock changes · 2953079c
      Shaohua Li authored
      The mddev->flags are used for different purposes. In a lot of
      places we check/change the flags without masking unrelated flags,
      so we can accidentally check/change flags we did not intend to.
      Most of these usages are for superblock writes, so separate out
      the superblock-related flags. This should make the code clearer
      and also fixes real bugs.
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
  10. 22 Nov, 2016 1 commit
    • md: Use REQ_FAILFAST_* on metadata writes where appropriate · 46533ff7
      NeilBrown authored
      This can only be supported on personalities which ensure that
      md_error() never causes an array to enter the 'failed' state,
      i.e. if marking a device Faulty would make some data inaccessible,
      the device's status is left as non-Faulty. This is true for RAID1
      and RAID10.
      
      If we get a failure writing metadata but the device doesn't
      fail, it must be the last device so we re-write without
      FAILFAST to improve chance of success.  We also flag the
      device as LastDev so that future metadata updates don't
      waste time on failfast writes.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
  11. 18 Nov, 2016 1 commit
  12. 07 Nov, 2016 3 commits
  13. 21 Sep, 2016 1 commit
  14. 06 Aug, 2016 1 commit
  15. 07 Jun, 2016 2 commits
  16. 09 May, 2016 1 commit
    • md-cluster: gather resync infos and enable recv_thread after bitmap is ready · 51e453ae
      Guoqing Jiang authored
      The in-memory bitmap is not ready when a node joins the cluster,
      so it doesn't make sense to call gather_all_resync_info() that
      early; we need to call it after the node's bitmap is set up. Also,
      recv_thread could be woken up after the node joins the cluster,
      but that could cause problems if the node receives a RESYNCING
      message without a personality, since mddev->pers->quiesce is
      called in process_suspend_info.

      This commit introduces a new cluster interface, load_bitmaps, to
      fix these problems. load_bitmaps is called in bitmap_load, where
      both bitmap and personality are ready, and does the following
      tasks:

      1. call gather_all_resync_info to load all the nodes'
         bitmap info.
      2. set the MD_CLUSTER_ALREADY_IN_CLUSTER bit so that recv_thread
         can be woken up, and wake up recv_thread if there is a
         pending recv event.

      Then ack_bast only wakes up recv_thread after the IN_CLUSTER bit
      is set; otherwise MD_CLUSTER_PENDING_RESYNC_EVENT is set.
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
  17. 04 May, 2016 6 commits
  18. 04 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      The PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a
      *long* time ago with the promise that one day it would be possible
      to implement the page cache with bigger chunks than PAGE_SIZE.

      That promise never materialized, and it is unlikely it ever will.

      We have many places where PAGE_CACHE_SIZE is assumed to be equal
      to PAGE_SIZE, and it's a constant source of confusion whether
      PAGE_CACHE_* or PAGE_* constants should be used in a particular
      case, especially on the border between fs and mm.

      Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too
      much breakage to be doable.

      Let's stop pretending that pages in the page cache are special.
      They are not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle
      using the script below. For some reason, coccinelle doesn't patch
      header files; I've called spatch for them manually.

      The only adjustment after coccinelle is a revert of the changes to
      the PAGE_CACHE_ALIGN definition: we are going to drop it later.

      There are a few places in the code that coccinelle didn't reach.
      I'll fix them manually in a separate patch. Comments and
      documentation will also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 01 Apr, 2016 1 commit
  20. 14 Mar, 2016 1 commit
  21. 07 Mar, 2016 1 commit
  22. 25 Jan, 2016 1 commit
  23. 12 Oct, 2015 2 commits
    • md-cluster: Use a small window for resync · c40f341f
      Goldwyn Rodrigues authored
      Suspending the entire device for resync could take too long. Resync
      in small chunks.
      
      The cluster's resync window (32M) is maintained in r1conf as
      cluster_sync_low and cluster_sync_high and is processed in raid1's
      sync_request(). If the current resync is outside the cluster
      resync window:
      
      1. Set the cluster_sync_low to curr_resync_completed.
      2. Check if the sync will fit in the new window, if not issue a
         wait_barrier() and set cluster_sync_low to sector_nr.
      3. Set cluster_sync_high to cluster_sync_low + resync_window.
      4. Send a message to all nodes so they may add it in their suspension
         list.
      
      bitmap_cond_end_sync is modified to allow forcing a sync, in order
      to get curr_resync_completed up to date with the sector passed in.
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: Increment version for clustered bitmaps · 3c462c88
      Goldwyn Rodrigues authored
      Add BITMAP_MAJOR_CLUSTERED as 5, in order to prevent older kernels
      from assembling a clustered device.

      To maximize compatibility, the major version is set to
      BITMAP_MAJOR_CLUSTERED *only* if the bitmap is clustered.

      MD_FEATURE_CLUSTERED is added in order to return an error for
      older kernels, which would otherwise assemble the MD device even
      if the bitmap is corrupted.
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
  24. 02 Oct, 2015 1 commit
  25. 24 Jul, 2015 2 commits
  26. 22 Jul, 2015 1 commit
    • md: Skip cluster setup for dm-raid · d3b178ad
      Goldwyn Rodrigues authored
      There is a bug: the bitmap superblock isn't initialised properly
      for dm-raid, so new fields can contain garbage. (dm-raid does the
      initialisation in the kernel; for md, mdadm initialises the
      superblock.)

      This means that for dm-raid we cannot currently trust the new
      ->nodes field. So:
       - use __GFP_ZERO to initialise the superblock properly for all new
         arrays
       - initialise all fields of bitmap_info in bitmap_new_disk_sb
       - ignore ->nodes for dm arrays (yes, this is a hack)

      This bug exposes dm-raid to a bug in the (still experimental)
      md-cluster code, so it is suitable for -stable. It does cause
      crashes.
      
      References: https://bugzilla.kernel.org/show_bug.cgi?id=100491
      Cc: stable@vger.kernel.org (v4.1)
      Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
  27. 23 Jun, 2015 2 commits
  28. 20 May, 2015 1 commit