    memory-hotplug: add zone_for_memory() for selecting zone for new memory · 63264400
    Wang Nan authored
    
    
    This series of patches fixes a problem that occurs when memory is added
    in an unexpected order.  For example, on an x86_64 machine booted with
    "mem=400M" and with 2GiB of memory installed, the following commands
    trigger the problem:
    
      # echo 0x40000000 > /sys/devices/system/memory/probe
     [   28.613895] init_memory_mapping: [mem 0x40000000-0x47ffffff]
      # echo 0x48000000 > /sys/devices/system/memory/probe
     [   28.693675] init_memory_mapping: [mem 0x48000000-0x4fffffff]
      # echo online_movable > /sys/devices/system/memory/memory9/state
      # echo 0x50000000 > /sys/devices/system/memory/probe
     [   29.084090] init_memory_mapping: [mem 0x50000000-0x57ffffff]
      # echo 0x58000000 > /sys/devices/system/memory/probe
     [   29.151880] init_memory_mapping: [mem 0x58000000-0x5fffffff]
      # echo online_movable > /sys/devices/system/memory/memory11/state
      # echo online> /sys/devices/system/memory/memory8/state
      # echo online> /sys/devices/system/memory/memory10/state
      # echo offline> /sys/devices/system/memory/memory9/state
     [   30.558819] Offlined Pages 32768
      # free
                  total       used       free     shared    buffers     cached
     Mem:        780588 18014398509432020     830552          0          0      51180
     -/+ buffers/cache: 18014398509380840     881732
     Swap:            0          0          0
    
    This is because the above commands probe higher memory after onlining a
    section with online_movable, which causes ZONE_HIGHMEM (or ZONE_NORMAL
    for systems without ZONE_HIGHMEM) to overlap ZONE_MOVABLE.
    
    After the second online_movable, the problem can be observed from
    zoneinfo:
    
      # cat /proc/zoneinfo
      ...
      Node 0, zone  Movable
        pages free     65491
              min      250
              low      312
              high     375
              scanned  0
              spanned  18446744073709518848
              present  65536
              managed  65536
      ...
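
    The two implausible numbers above ("used" in free and "spanned" in
    zoneinfo) are both consistent with a small negative value stored in an
    unsigned 64-bit counter.  The sketch below is only an illustration of
    that arithmetic, not code from this series: the helper names are
    hypothetical, and the bytes-to-KiB scaling in wrapped_used_kib() is an
    assumption about how the printed figure could arise.

```c
#include <stdint.h>

/*
 * Interpret an unsigned 64-bit counter as a two's-complement signed value,
 * without relying on implementation-defined conversions.
 */
static int64_t as_signed(uint64_t counter)
{
    if (counter > (uint64_t)INT64_MAX)
        return -(int64_t)(~counter) - 1;
    return (int64_t)counter;
}

/*
 * Assumed model: "used" is computed as total - free in KiB, scaled to bytes
 * in 64-bit arithmetic (wrapping modulo 2^64), then scaled back to KiB.
 */
static uint64_t wrapped_used_kib(uint64_t total_kib, uint64_t free_kib)
{
    uint64_t used_bytes = (total_kib - free_kib) * 1024u; /* wraps */

    return used_bytes / 1024u;
}
```

    Under these assumptions, as_signed(18446744073709518848) is -32768,
    which matches the 32768 pages reported as offlined, and
    wrapped_used_kib(780588, 830552) reproduces the 18014398509432020 shown
    by free.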
    
    This series of patches solves the problem by checking ZONE_MOVABLE when
    choosing a zone for new memory.  If the new memory is inside or higher
    than ZONE_MOVABLE, it is added there instead.
    
    After applying this series of patches, the free and zoneinfo results are
    as follows (after offlining memory9):
    
      bash-4.2# free
                    total       used       free     shared    buffers     cached
       Mem:        780956      80112     700844          0          0      51180
       -/+ buffers/cache:      28932     752024
       Swap:            0          0          0
    
      bash-4.2# cat /proc/zoneinfo
    
      Node 0, zone      DMA
        pages free     3389
              min      14
              low      17
              high     21
              scanned  0
              spanned  4095
              present  3998
              managed  3977
          nr_free_pages 3389
      ...
        start_pfn:         1
        inactive_ratio:    1
      Node 0, zone    DMA32
        pages free     73724
              min      341
              low      426
              high     511
              scanned  0
              spanned  98304
              present  98304
              managed  92958
          nr_free_pages 73724
        ...
        start_pfn:         4096
        inactive_ratio:    1
      Node 0, zone   Normal
        pages free     32630
              min      120
              low      150
              high     180
              scanned  0
              spanned  32768
              present  32768
              managed  32768
          nr_free_pages 32630
      ...
        start_pfn:         262144
        inactive_ratio:    1
      Node 0, zone  Movable
        pages free     65476
              min      241
              low      301
              high     361
              scanned  0
              spanned  98304
              present  65536
              managed  65536
          nr_free_pages 65476
      ...
        start_pfn:         294912
        inactive_ratio:    1
    
    This patch (of 7):
    
    Introduce zone_for_memory() in arch-independent code for use by
    arch_add_memory().
    
    Many arch_add_memory() functions simply select ZONE_HIGHMEM or
    ZONE_NORMAL and add new memory to it.  However, with the existence of
    ZONE_MOVABLE, the selection method should be carefully considered: if
    new, higher memory is added after ZONE_MOVABLE is set up, the default
    zone and ZONE_MOVABLE may overlap each other.
    
    should_add_memory_movable() checks the status of ZONE_MOVABLE.  If it
    already contains memory, the address of the new memory is compared with
    that of the movable memory.  If the new memory is higher than the
    movable memory, it should be added to ZONE_MOVABLE instead of the
    default zone.
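
    The selection rule described above can be sketched in user space as
    follows.  This is a simplified model under stated assumptions, not the
    code added by the patch: struct zone_model, the reduced enum, the
    parameter lists, and the 4 KiB PAGE_SHIFT are all stand-ins chosen for
    illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Reduced zone list, for illustration only. */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE };

/* Hypothetical stand-in for the kernel's per-zone bookkeeping. */
struct zone_model {
    uint64_t zone_start_pfn; /* first page frame number of the zone */
    uint64_t spanned_pages;  /* 0 means the zone holds no memory */
};

/* Return 1 if new memory at physical address 'start' belongs in ZONE_MOVABLE. */
static int should_add_memory_movable(const struct zone_model *movable,
                                     uint64_t start)
{
    uint64_t start_pfn = start >> PAGE_SHIFT;

    if (movable->spanned_pages == 0)
        return 0; /* ZONE_MOVABLE is empty: keep the default zone */

    /* New memory at or above the start of ZONE_MOVABLE goes there. */
    return movable->zone_start_pfn <= start_pfn;
}

/* Pick ZONE_MOVABLE when required, otherwise the caller's default zone. */
static enum zone_type zone_for_memory(const struct zone_model *movable,
                                      uint64_t start,
                                      enum zone_type zone_default)
{
    if (should_add_memory_movable(movable, start))
        return ZONE_MOVABLE;

    return zone_default;
}
```

    With ZONE_MOVABLE starting at pfn 294912 (physical address 0x48000000,
    as in the zoneinfo output above), this model places memory probed at
    0x50000000 in ZONE_MOVABLE and leaves memory at 0x40000000 in the
    default zone.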
    
    Signed-off-by: Wang Nan <wangnan0@huawei.com>
    Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Cc: "Mel Gorman" <mgorman@suse.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: "H. Peter Anvin" <hpa@zytor.com>
    Cc: "Luck, Tony" <tony.luck@intel.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Chris Metcalf <cmetcalf@tilera.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>