Skip to content
  • Amir Goldstein's avatar
    xfs: fix incorrect log_flushed on fsync · f46a61f6
    Amir Goldstein authored
    commit 47c7d0b1
    
     upstream.
    
    When calling into _xfs_log_force{,_lsn}() with a pointer
    to log_flushed variable, log_flushed will be set to 1 if:
    1. xlog_sync() is called to flush the active log buffer
    AND/OR
    2. xlog_wait() is called to wait on a syncing log buffers
    
    xfs_file_fsync() checks the value of log_flushed after
    _xfs_log_force_lsn() call to optimize away an explicit
    PREFLUSH request to the data block device after writing
    out all the file's pages to disk.
    
    This optimization is incorrect in the following sequence of events:
    
     Task A                    Task B
     -------------------------------------------------------
     xfs_file_fsync()
       _xfs_log_force_lsn()
         xlog_sync()
            [submit PREFLUSH]
                               xfs_file_fsync()
                                 file_write_and_wait_range()
                                   [submit WRITE X]
                                   [endio  WRITE X]
                                 _xfs_log_force_lsn()
                                   xlog_wait()
            [endio  PREFLUSH]
    
    The write X is not guarantied to be on persistent storage
    when PREFLUSH request in completed, because write A was submitted
    after the PREFLUSH request, but xfs_file_fsync() of task A will
    be notified of log_flushed=1 and will skip explicit flush.
    
    If the system crashes after fsync of task A, write X may not be
    present on disk after reboot.
    
    This bug was discovered and demonstrated using Josef Bacik's
    dm-log-writes target, which can be used to record block io operations
    and then replay a subset of these operations onto the target device.
    The test goes something like this:
    - Use fsx to execute ops of a file and record ops on log device
    - Every now and then fsync the file, store md5 of file and mark
      the location in the log
    - Then replay log onto device for each mark, mount fs and compare
      md5 of file to stored value
    
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Josef Bacik <jbacik@fb.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarAmir Goldstein <amir73il@gmail.com>
    Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    f46a61f6