Skip to content
  • Linus Torvalds's avatar
    Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma · 3d59eebc
    Linus Torvalds authored
    Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
     "There are three implementations for NUMA balancing, this tree
      (balancenuma), numacore which has been developed in tip/master and
      autonuma which is in aa.git.
    
      In almost all respects balancenuma is the dumbest of the three because
      its main impact is on the VM side with no attempt to be smart about
      scheduling.  In the interest of getting the ball rolling, it would be
      desirable to see this much merged for 3.8 with the view to building
      scheduler smarts on top and adapting the VM where required for 3.9.
    
      The most recent set of comparisons available from different people are
    
        mel:    https://lkml.org/lkml/2012/12/9/108
        mingo:  https://lkml.org/lkml/2012/12/7/331
        tglx:   https://lkml.org/lkml/2012/12/10/437
        srikar: https://lkml.org/lkml/2012/12/10/397
    
      The results are a mixed bag.  In my own tests, balancenuma does
      reasonably well.  It's dumb as rocks and does not regress against
      mainline.  On the other hand, Ingo's tests shows that balancenuma is
      incapable of converging for this workloads driven by perf which is bad
      but is potentially explained by the lack of scheduler smarts.  Thomas'
      results show balancenuma improves on mainline but falls far short of
      numacore or autonuma.  Srikar's results indicate we all suffer on a
      large machine with imbalanced node sizes.
    
      My own testing showed that recent numacore results have improved
      dramatically, particularly in the last week but not universally.
      We've butted heads heavily on system CPU usage and high levels of
      migration even when it shows that overall performance is better.
      There are also cases where it regresses.  Of interest is that for
      specjbb in some configurations it will regress for lower numbers of
      warehouses and show gains for higher numbers which is not reported by
      the tool by default and sometimes missed in treports.  Recently I
      reported for numacore that the JVM was crashing with
      NullPointerExceptions but currently it's unclear what the source of
      this problem is.  Initially I thought it was in how numacore batch
      handles PTEs but I'm no longer think this is the case.  It's possible
      numacore is just able to trigger it due to higher rates of migration.
    
      These reports were quite late in the cycle so I/we would like to start
      with this tree as it contains much of the code we can agree on and has
      not changed significantly over the last 2-3 weeks."
    
    * tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
      mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
      mm/rmap: Convert the struct anon_vma::mutex to an rwsem
      mm: migrate: Account a transhuge page properly when rate limiting
      mm: numa: Account for failed allocations and isolations as migration failures
      mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
      mm: numa: Add THP migration for the NUMA working set scanning fault case.
      mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
      mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
      mm: sched: numa: Control enabling and disabling of NUMA balancing
      mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
      mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships
      mm: numa: migrate: Set last_nid on newly allocated page
      mm: numa: split_huge_page: Transfer last_nid on tail page
      mm: numa: Introduce last_nid to the page frame
      sched: numa: Slowly increase the scanning period as NUMA faults are handled
      mm: numa: Rate limit setting of pte_numa if node is saturated
      mm: numa: Rate limit the amount of memory that is migrated between nodes
      mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
      mm: numa: Migrate pages handled during a pmd_numa hinting fault
      mm: numa: Migrate on reference policy
      ...
    3d59eebc