1. 04 Sep, 2017 1 commit
  2. 29 Jun, 2017 1 commit
  3. 02 Mar, 2017 1 commit
  4. 20 Feb, 2017 1 commit
  5. 16 Feb, 2017 1 commit
  6. 27 Dec, 2016 3 commits
  7. 21 Dec, 2016 1 commit
  8. 27 Nov, 2016 1 commit
  9. 10 Nov, 2016 1 commit
  10. 01 Nov, 2016 1 commit
  11. 10 Oct, 2016 1 commit
    • Al Viro's avatar
      fix ITER_PIPE interaction with direct_IO · c3a69024
      Al Viro authored
      by making sure we call iov_iter_advance() on original
      iov_iter even if direct_IO (done on its copy) has returned 0.
      It's a no-op for old iov_iter flavours and does the right thing
      (== truncation of the stuff we'd allocated, but not filled) in
      ITER_PIPE case.  Failures (e.g. -EIO) get caught and dealt with
      by cleanup in generic_file_read_iter().
      Signed-off-by: 's avatarAl Viro <viro@zeniv.linux.org.uk>
      c3a69024
  12. 05 Oct, 2016 6 commits
  13. 04 Oct, 2016 4 commits
  14. 04 Apr, 2016 1 commit
    • Kirill A. Shutemov's avatar
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: 's avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: 's avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  15. 03 Apr, 2016 1 commit
  16. 18 Mar, 2016 1 commit
    • Rabin Vincent's avatar
      splice: handle zero nr_pages in splice_to_pipe() · d6785d91
      Rabin Vincent authored
      Running the following command:
      
       busybox cat /sys/kernel/debug/tracing/trace_pipe > /dev/null
      
      with any tracing enabled pretty very quickly leads to various NULL
      pointer dereferences and VM BUG_ON()s, such as these:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
       IP: [<ffffffff8119df6c>] generic_pipe_buf_release+0xc/0x40
       Call Trace:
        [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0
        [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10
        [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0
        [<ffffffff81196869>] do_sendfile+0x199/0x380
        [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0
        [<ffffffff8192cbee>] entry_SYSCALL_64_fastpath+0x12/0x6d
      
       page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
       kernel BUG at include/linux/mm.h:367!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       RIP: [<ffffffff8119df9c>] generic_pipe_buf_release+0x3c/0x40
       Call Trace:
        [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0
        [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10
        [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0
        [<ffffffff81196869>] do_sendfile+0x199/0x380
        [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0
        [<ffffffff8192cd1e>] tracesys_phase2+0x84/0x89
      
      (busybox's cat uses sendfile(2), unlike the coreutils version)
      
      This is because tracing_splice_read_pipe() can call splice_to_pipe()
      with spd->nr_pages == 0.  spd_pages underflows in splice_to_pipe() and
      we fill the page pointers and the other fields of the pipe_buffers with
      garbage.
      
      All other callers of splice_to_pipe() avoid calling it when nr_pages ==
      0, and we could make tracing_splice_read_pipe() do that too, but it
      seems reasonable to have splice_to_page() handle this condition
      gracefully.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarRabin Vincent <rabin@rab.in>
      Reviewed-by: 's avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: 's avatarAl Viro <viro@zeniv.linux.org.uk>
      d6785d91
  17. 04 Mar, 2016 1 commit
  18. 09 Jan, 2016 1 commit
    • Abhi Das's avatar
      fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE · 90330e68
      Abhi Das authored
      During testing, I discovered that __generic_file_splice_read() returns
      0 (EOF) when aops->readpage fails with AOP_TRUNCATED_PAGE on the first
      page of a single/multi-page splice read operation. This EOF return code
      causes the userspace test to (correctly) report a zero-length read error
      when it was expecting otherwise.
      
      The current strategy of returning a partial non-zero read when ->readpage
      returns AOP_TRUNCATED_PAGE works only when the failed page is not the
      first of the lot being processed.
      
      This patch attempts to retry lookup and call ->readpage again on pages
      that had previously failed with AOP_TRUNCATED_PAGE. With this patch, my
      tests pass and I haven't noticed any unwanted side effects.
      
      This version removes the thrice-retry loop and instead indefinitely
      retries lookups on AOP_TRUNCATED_PAGE errors from ->readpage. This
      behavior is now similar to do_generic_file_read().
      Signed-off-by: 's avatarAbhi Das <adas@redhat.com>
      Reviewed-by: 's avatarJan Kara <jack@suse.cz>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: 's avatarAl Viro <viro@zeniv.linux.org.uk>
      90330e68
  19. 24 Nov, 2015 2 commits
    • Jan Kara's avatar
      vfs: Avoid softlockups with sendfile(2) · c2489e07
      Jan Kara authored
      The following test program from Dmitry can cause softlockups or RCU
      stalls as it copies 1GB from tmpfs into eventfd and we don't have any
      scheduling point at that path in sendfile(2) implementation:
      
              int r1 = eventfd(0, 0);
              int r2 = memfd_create("", 0);
              unsigned long n = 1<<30;
              fallocate(r2, 0, 0, n);
              sendfile(r1, r2, 0, n);
      
      Add cond_resched() into __splice_from_pipe() to fix the problem.
      
      CC: Dmitry Vyukov <dvyukov@google.com>
      CC: stable@vger.kernel.org
      Signed-off-by: 's avatarJan Kara <jack@suse.cz>
      Signed-off-by: 's avatarAl Viro <viro@zeniv.linux.org.uk>
      c2489e07
    • Jan Kara's avatar
      vfs: Make sendfile(2) killable even better · c725bfce
      Jan Kara authored
      Commit 296291cd (mm: make sendfile(2) killable) fixed an issue where
      sendfile(2) was doing a lot of tiny writes into a filesystem and thus
      was unkillable for a long time. However sendfile(2) can be (mis)used to
      issue lots of writes into arbitrary file descriptor such as evenfd or
      similar special file descriptors which never hit the standard filesystem
      write path and thus are still unkillable. E.g. the following example
      from Dmitry burns CPU for ~16s on my test system without possibility to
      be killed:
      
              int r1 = eventfd(0, 0);
              int r2 = memfd_create("", 0);
              unsigned long n = 1<<30;
              fallocate(r2, 0, 0, n);
              sendfile(r1, r2, 0, n);
      
      There are actually quite a few tests for pending signals in sendfile
      code however we data to write is always available none of them seems to
      trigger. So fix the problem by adding a test for pending signal into
      splice_from_pipe_next() also before the loop waiting for pipe buffers to
      be available. This should fix all the lockup issues with sendfile of the
      do-ton-of-tiny-writes nature.
      
      CC: stable@vger.kernel.org
      Reported-by: 's avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: 's avatarJan Kara <jack@suse.cz>
      Signed-off-by: 's avatarAl Viro <viro@zeniv.linux.org.uk>
      c725bfce
  20. 07 Nov, 2015 1 commit
  21. 25 Jun, 2015 1 commit
    • Michal Hocko's avatar
      mm: do not ignore mapping_gfp_mask in page cache allocation paths · 6afdb859
      Michal Hocko authored
      page_cache_read, do_generic_file_read, __generic_file_splice_read and
      __ntfs_grab_cache_pages currently ignore mapping_gfp_mask when calling
      add_to_page_cache_lru which might cause recursion into fs down in the
      direct reclaim path if the mapping really relies on GFP_NOFS semantic.
      
      This doesn't seem to be the case now because page_cache_read (page fault
      path) doesn't seem to suffer from the reclaim recursion issues and
      do_generic_file_read and __generic_file_splice_read also shouldn't be
      called under fs locks which would deadlock in the reclaim path.  Anyway it
      is better to obey mapping gfp mask and prevent from later breakage.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: 's avatarMichal Hocko <mhocko@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Anton Altaparmakov <anton@tuxera.com>
      Signed-off-by: 's avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      6afdb859
  22. 25 May, 2015 1 commit
  23. 06 May, 2015 1 commit
    • Christophe Leroy's avatar
      splice: sendfile() at once fails for big files · 0ff28d9f
      Christophe Leroy authored
      Using sendfile with below small program to get MD5 sums of some files,
      it appear that big files (over 64kbytes with 4k pages system) get a
      wrong MD5 sum while small files get the correct sum.
      This program uses sendfile() to send a file to an AF_ALG socket
      for hashing.
      
      /* md5sum2.c */
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>
      #include <string.h>
      #include <fcntl.h>
      #include <sys/socket.h>
      #include <sys/stat.h>
      #include <sys/types.h>
      #include <linux/if_alg.h>
      
      int main(int argc, char **argv)
      {
      	int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
      	struct stat st;
      	struct sockaddr_alg sa = {
      		.salg_family = AF_ALG,
      		.salg_type = "hash",
      		.salg_name = "md5",
      	};
      	int n;
      
      	bind(sk, (struct sockaddr*)&sa, sizeof(sa));
      
      	for (n = 1; n < argc; n++) {
      		int size;
      		int offset = 0;
      		char buf[4096];
      		int fd;
      		int sko;
      		int i;
      
      		fd = open(argv[n], O_RDONLY);
      		sko = accept(sk, NULL, 0);
      		fstat(fd, &st);
      		size = st.st_size;
      		sendfile(sko, fd, &offset, size);
      		size = read(sko, buf, sizeof(buf));
      		for (i = 0; i < size; i++)
      			printf("%2.2x", buf[i]);
      		printf("  %s\n", argv[n]);
      		close(fd);
      		close(sko);
      	}
      	exit(0);
      }
      
      Test below is done using official linux patch files. First result is
      with a software based md5sum. Second result is with the program above.
      
      root@vgoip:~# ls -l patch-3.6.*
      -rw-r--r--    1 root     root         64011 Aug 24 12:01 patch-3.6.2.gz
      -rw-r--r--    1 root     root         94131 Aug 24 12:01 patch-3.6.3.gz
      
      root@vgoip:~# md5sum patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
      
      root@vgoip:~# ./md5sum2 patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      5fd77b24e68bb24dcc72d6e57c64790e  patch-3.6.3.gz
      
      After investivation, it appears that sendfile() sends the files by blocks
      of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each
      block, the SPLICE_F_MORE flag is missing, therefore the hashing operation
      is reset as if it was the end of the file.
      
      This patch adds SPLICE_F_MORE to the flags when more data is pending.
      
      With the patch applied, we get the correct sums:
      
      root@vgoip:~# md5sum patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
      
      root@vgoip:~# ./md5sum2 patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
      Signed-off-by: 's avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: 's avatarJens Axboe <axboe@fb.com>
      0ff28d9f
  24. 15 Apr, 2015 1 commit
    • Boaz Harrosh's avatar
      dax: unify ext2/4_{dax,}_file_operations · be64f884
      Boaz Harrosh authored
      The original dax patchset split the ext2/4_file_operations because of the
      two NULL splice_read/splice_write in the dax case.
      
      In the vfs if splice_read/splice_write are NULL we then call
      default_splice_read/write.
      
      What we do here is make generic_file_splice_read aware of IS_DAX() so the
      original ext2/4_file_operations can be used as is.
      
      For write it appears that iter_file_splice_write is just fine.  It uses
      the regular f_op->write(file,..) or new_sync_write(file, ...).
      Signed-off-by: 's avatarBoaz Harrosh <boaz@plexistor.com>
      Reviewed-by: 's avatarJan Kara <jack@suse.cz>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: 's avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      be64f884
  25. 12 Apr, 2015 1 commit
  26. 26 Mar, 2015 1 commit
  27. 29 Jan, 2015 2 commits
  28. 23 Oct, 2014 1 commit