    thp: reduce usage of huge zero page's atomic counter · 6fcb52a5
    Aaron Lu authored
    The global zero page is used to satisfy an anonymous read fault.  If
    THP (Transparent HugePage) is enabled, then the global huge zero page is
    used.  The global huge zero page uses an atomic counter for reference
    counting and is allocated/freed dynamically according to its counter
    value.
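
    The baseline behaviour can be modelled in userspace roughly as follows
    (a minimal sketch, not the kernel code; fault_in_huge_zero_page() and
    unmap_huge_zero_page() are illustrative names): every read fault that
    maps the huge zero page does an atomic read-modify-write on one shared
    counter, so the counter's cache line bounces between all CPUs taking
    such faults.

      #include <stdatomic.h>
      #include <stdio.h>

      /* One global refcount shared by every process, as in the baseline. */
      static atomic_long huge_zero_refcount;

      /* Illustrative stand-in for the fault path: one atomic RMW per fault. */
      static void fault_in_huge_zero_page(void)
      {
              atomic_fetch_add(&huge_zero_refcount, 1);
      }

      /* Illustrative stand-in for the unmap path: one atomic RMW per unmap. */
      static void unmap_huge_zero_page(void)
      {
              atomic_fetch_sub(&huge_zero_refcount, 1);
      }

      int main(void)
      {
              /* Every fault and every unmap hits the shared cache line. */
              for (int i = 0; i < 1000; i++)
                      fault_in_huge_zero_page();
              for (int i = 0; i < 1000; i++)
                      unmap_huge_zero_page();
              printf("refcount = %ld\n", atomic_load(&huge_zero_refcount));
              return 0;
      }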
    
    CPU time spent on that counter will greatly increase if there are a lot
    of processes doing anonymous read faults.  This patch proposes a way to
    reduce accesses to the global counter so that the CPU load can be
    reduced accordingly.
    
    To do this, a new flag of the mm_struct is introduced:
    MMF_USED_HUGE_ZERO_PAGE.  With this flag, a process only needs to touch
    the global counter in two cases (see the sketch after this list):

     1 The first time it uses the global huge zero page;
     2 The time when mm_users of its mm_struct reaches zero.
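
    A minimal userspace sketch of this scheme follows (an illustrative
    model, not the kernel implementation: struct mm_model and its flag
    field stand in for the mm_struct and MMF_USED_HUGE_ZERO_PAGE, and the
    mm_get_huge_zero_page()/mm_put_huge_zero_page() names mirror the
    symbol visible in the perf output further down):

      #include <stdatomic.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* The global counter, still shared by everyone. */
      static atomic_long huge_zero_refcount;

      /* Stand-in for mm_struct; the flag plays the role of
       * MMF_USED_HUGE_ZERO_PAGE. */
      struct mm_model {
              atomic_bool used_huge_zero_page;
      };

      /* Case 1: only the first use by this mm touches the global counter. */
      static void mm_get_huge_zero_page(struct mm_model *mm)
      {
              if (atomic_load(&mm->used_huge_zero_page))
                      return;                 /* fast path: no global access */

              atomic_fetch_add(&huge_zero_refcount, 1);
              /* If another thread of the same mm set the flag first,
               * drop the extra reference we just took. */
              if (atomic_exchange(&mm->used_huge_zero_page, true))
                      atomic_fetch_sub(&huge_zero_refcount, 1);
      }

      /* Case 2: one decrement when mm_users of the mm reaches zero. */
      static void mm_put_huge_zero_page(struct mm_model *mm)
      {
              if (atomic_load(&mm->used_huge_zero_page))
                      atomic_fetch_sub(&huge_zero_refcount, 1);
      }

      int main(void)
      {
              struct mm_model mm = { 0 };

              /* Many read faults, but only the first touches the counter. */
              for (int i = 0; i < 1000; i++)
                      mm_get_huge_zero_page(&mm);
              mm_put_huge_zero_page(&mm);     /* teardown: one decrement */

              printf("refcount = %ld\n", atomic_load(&huge_zero_refcount));
              return 0;
      }

    In the patch itself the flag lives in the mm_struct and the put side
    runs when mm_users drops to zero, as described above; the model only
    shows why the fault path no longer has to hit the shared counter.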
    
    Note that right now, the huge zero page is eligible to be freed as soon
    as its last use goes away.  With this patch, the page will not be
    eligible to be freed until the last process that ever used it exits.
    
    And because mm_users is used, kernel threads are not eligible to use the
    huge zero page either.  Since no kthread uses the huge zero page today,
    there is no difference after applying this patch.  But if that is not
    desired, I can change it to trigger when mm_count reaches zero instead.
    
    Test case used on a Haswell-EP system:
    
      usemem -n 72 --readonly -j 0x200000 100G
    
    This spawns 72 processes; each mmaps 100G of anonymous space and then
    does read-only, sequential access to that space with a step of 2MB.
    
      CPU cycles from perf report for base commit:
          54.03%  usemem   [kernel.kallsyms]   [k] get_huge_zero_page
      CPU cycles from perf report for this commit:
           0.11%  usemem   [kernel.kallsyms]   [k] mm_get_huge_zero_page
    
    Performance (throughput) of the workload for base commit: 1784430792
    Performance (throughput) of the workload for this commit: 4726928591
    164% increase.
    
    Runtime of the workload for base commit: 707592 us
    Runtime of the workload for this commit: 303970 us
    57% drop.
    
    Link: http://lkml.kernel.org/r/fe51a88f-446a-4622-1363-ad1282d71385@intel.com
    
    
    Signed-off-by: Aaron Lu <aaron.lu@intel.com>
    Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Tim Chen <tim.c.chen@linux.intel.com>
    Cc: Huang Ying <ying.huang@intel.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Jerome Marchand <jmarchan@redhat.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>