Skip to content
  • Wu Fengguang's avatar
    readahead: introduce context readahead algorithm · 10be0b37
    Wu Fengguang authored
    Introduce page cache context based readahead algorithm.
    This is to better support concurrent read streams in general.
    
    RATIONALE
    ---------
    The current readahead algorithm detects interleaved reads in a _passive_ way.
    Given a sequence of interleaved streams 1,1001,2,1002,3,4,1003,5,1004,1005,6,...
    By checking for (offset == prev_offset + 1), it will discover the sequentialness
    between 3,4 and between 1004,1005, and start doing sequential readahead for the
    individual streams since page 4 and page 1005.
    
    The context readahead algorithm guarantees to discover the sequentialness no
    matter how the streams are interleaved. For the above example, it will start
    sequential readahead since page 2 and 1002.
    
    The trick is to poke for page @offset-1 in the page cache when it has no other
    clues on the sequentialness of request @offset: if the current requenst belongs
    to a sequential stream, that stream must have accessed page @offset-1 recently,
    and the page will still be cached now. So if page @offset-1 is there, we can
    take request @offset as a sequential access.
    
    BENEFICIARIES
    -------------
    - strictly interleaved reads  i.e. 1,1001,2,1002,3,1003,...
      the current readahead will take them as silly random reads;
      the context readahead will take them as two sequential streams.
    
    - cooperative IO processes   i.e. NFS and SCST
      They create a thread pool, farming off (sequential) IO requests to different
      threads which will be performing interleaved IO.
    
      It was not easy(or possible) to reliably tell from file->f_ra all those
      cooperative processes working on the same sequential stream, since they will
      have different file->f_ra instances. And NFSD's file->f_ra is particularly
      unusable, since their file objects are dynamically created for each request.
      The nfsd does have code trying to restore the f_ra bits, but not satisfactory.
    
      The new scheme is to detect the sequential pattern via looking up the page
      cache, which provides one single and consistent view of the pages recently
      accessed. That makes sequential detection for cooperative processes possible.
    
    USER REPORT
    -----------
    Vladislav recommends the addition of context readahead as a result of his SCST
    benchmarks. It leads to 6%~40% performance gains in various cases and achieves
    equal performance in others.                http://lkml.org/lkml/2009/3/19/239
    
    
    
    OVERHEADS
    ---------
    In theory, it introduces one extra page cache lookup per random read.  However
    the below benchmark shows context readahead to be slightly faster, wondering..
    
    Randomly reading 200MB amount of data on a sparse file, repeat 20 times for
    each block size. The average throughputs are:
    
                           	original ra	context ra	gain
     4K random reads:	 65.561MB/s	 65.648MB/s	+0.1%
    16K random reads:	124.767MB/s	124.951MB/s	+0.1%
    64K random reads: 	162.123MB/s	162.278MB/s	+0.1%
    
    Cc: Jens Axboe <jens.axboe@oracle.com>
    Cc: Jeff Moyer <jmoyer@redhat.com>
    Tested-by: default avatarVladislav Bolkhovitin <vst@vlnb.net>
    Signed-off-by: default avatarWu Fengguang <fengguang.wu@intel.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    10be0b37