1. 23 Feb, 2017 1 commit
  2. 12 Apr, 2016 1 commit
    • Rabin Vincent's avatar
      splice: handle zero nr_pages in splice_to_pipe() · ab14444f
      Rabin Vincent authored
      commit d6785d91 upstream.
      
      Running the following command:
      
       busybox cat /sys/kernel/debug/tracing/trace_pipe > /dev/null
      
      with any tracing enabled pretty very quickly leads to various NULL
      pointer dereferences and VM BUG_ON()s, such as these:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
       IP: [<ffffffff8119df6c>] generic_pipe_buf_release+0xc/0x40
       Call Trace:
        [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0
        [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10
        [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0
        [<ffffffff81196869>] do_sendfile+0x199/0x380
        [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0
        [<ffffffff8192cbee>] entry_SYSCALL_64_fastpath+0x12/0x6d
      
       page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
       kernel BUG at include/linux/mm.h:367!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       RIP: [<ffffffff8119df9c>] generic_pipe_buf_release+0x3c/0x40
       Call Trace:
        [<ffffffff811c48a3>] splice_direct_to_actor+0x143/0x1e0
        [<ffffffff811c42e0>] ? generic_pipe_buf_nosteal+0x10/0x10
        [<ffffffff811c49cf>] do_splice_direct+0x8f/0xb0
        [<ffffffff81196869>] do_sendfile+0x199/0x380
        [<ffffffff81197600>] SyS_sendfile64+0x90/0xa0
        [<ffffffff8192cd1e>] tracesys_phase2+0x84/0x89
      
      (busybox's cat uses sendfile(2), unlike the coreutils version)
      
      This is because tracing_splice_read_pipe() can call splice_to_pipe()
      with spd->nr_pages == 0.  spd_pages underflows in splice_to_pipe() and
      we fill the page pointers and the other fields of the pipe_buffers with
      garbage.
      
      All other callers of splice_to_pipe() avoid calling it when nr_pages ==
      0, and we could make tracing_splice_read_pipe() do that too, but it
      seems reasonable to have splice_to_page() handle this condition
      gracefully.
      Signed-off-by: default avatarRabin Vincent <rabin@rab.in>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ab14444f
  3. 24 Nov, 2015 2 commits
    • Jan Kara's avatar
      vfs: Avoid softlockups with sendfile(2) · c2489e07
      Jan Kara authored
      The following test program from Dmitry can cause softlockups or RCU
      stalls as it copies 1GB from tmpfs into eventfd and we don't have any
      scheduling point at that path in sendfile(2) implementation:
      
              int r1 = eventfd(0, 0);
              int r2 = memfd_create("", 0);
              unsigned long n = 1<<30;
              fallocate(r2, 0, 0, n);
              sendfile(r1, r2, 0, n);
      
      Add cond_resched() into __splice_from_pipe() to fix the problem.
      
      CC: Dmitry Vyukov <dvyukov@google.com>
      CC: stable@vger.kernel.org
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      c2489e07
    • Jan Kara's avatar
      vfs: Make sendfile(2) killable even better · c725bfce
      Jan Kara authored
      Commit 296291cd (mm: make sendfile(2) killable) fixed an issue where
      sendfile(2) was doing a lot of tiny writes into a filesystem and thus
      was unkillable for a long time. However sendfile(2) can be (mis)used to
      issue lots of writes into arbitrary file descriptor such as evenfd or
      similar special file descriptors which never hit the standard filesystem
      write path and thus are still unkillable. E.g. the following example
      from Dmitry burns CPU for ~16s on my test system without possibility to
      be killed:
      
              int r1 = eventfd(0, 0);
              int r2 = memfd_create("", 0);
              unsigned long n = 1<<30;
              fallocate(r2, 0, 0, n);
              sendfile(r1, r2, 0, n);
      
      There are actually quite a few tests for pending signals in sendfile
      code however we data to write is always available none of them seems to
      trigger. So fix the problem by adding a test for pending signal into
      splice_from_pipe_next() also before the loop waiting for pipe buffers to
      be available. This should fix all the lockup issues with sendfile of the
      do-ton-of-tiny-writes nature.
      
      CC: stable@vger.kernel.org
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      c725bfce
  4. 07 Nov, 2015 1 commit
  5. 25 Jun, 2015 1 commit
    • Michal Hocko's avatar
      mm: do not ignore mapping_gfp_mask in page cache allocation paths · 6afdb859
      Michal Hocko authored
      page_cache_read, do_generic_file_read, __generic_file_splice_read and
      __ntfs_grab_cache_pages currently ignore mapping_gfp_mask when calling
      add_to_page_cache_lru which might cause recursion into fs down in the
      direct reclaim path if the mapping really relies on GFP_NOFS semantic.
      
      This doesn't seem to be the case now because page_cache_read (page fault
      path) doesn't seem to suffer from the reclaim recursion issues and
      do_generic_file_read and __generic_file_splice_read also shouldn't be
      called under fs locks which would deadlock in the reclaim path.  Anyway it
      is better to obey mapping gfp mask and prevent from later breakage.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Anton Altaparmakov <anton@tuxera.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6afdb859
  6. 25 May, 2015 1 commit
  7. 06 May, 2015 1 commit
    • Christophe Leroy's avatar
      splice: sendfile() at once fails for big files · 0ff28d9f
      Christophe Leroy authored
      Using sendfile with below small program to get MD5 sums of some files,
      it appear that big files (over 64kbytes with 4k pages system) get a
      wrong MD5 sum while small files get the correct sum.
      This program uses sendfile() to send a file to an AF_ALG socket
      for hashing.
      
      /* md5sum2.c */
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>
      #include <string.h>
      #include <fcntl.h>
      #include <sys/socket.h>
      #include <sys/stat.h>
      #include <sys/types.h>
      #include <linux/if_alg.h>
      
      int main(int argc, char **argv)
      {
      	int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
      	struct stat st;
      	struct sockaddr_alg sa = {
      		.salg_family = AF_ALG,
      		.salg_type = "hash",
      		.salg_name = "md5",
      	};
      	int n;
      
      	bind(sk, (struct sockaddr*)&sa, sizeof(sa));
      
      	for (n = 1; n < argc; n++) {
      		int size;
      		int offset = 0;
      		char buf[4096];
      		int fd;
      		int sko;
      		int i;
      
      		fd = open(argv[n], O_RDONLY);
      		sko = accept(sk, NULL, 0);
      		fstat(fd, &st);
      		size = st.st_size;
      		sendfile(sko, fd, &offset, size);
      		size = read(sko, buf, sizeof(buf));
      		for (i = 0; i < size; i++)
      			printf("%2.2x", buf[i]);
      		printf("  %s\n", argv[n]);
      		close(fd);
      		close(sko);
      	}
      	exit(0);
      }
      
      Test below is done using official linux patch files. First result is
      with a software based md5sum. Second result is with the program above.
      
      root@vgoip:~# ls -l patch-3.6.*
      -rw-r--r--    1 root     root         64011 Aug 24 12:01 patch-3.6.2.gz
      -rw-r--r--    1 root     root         94131 Aug 24 12:01 patch-3.6.3.gz
      
      root@vgoip:~# md5sum patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
      
      root@vgoip:~# ./md5sum2 patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      5fd77b24e68bb24dcc72d6e57c64790e  patch-3.6.3.gz
      
      After investivation, it appears that sendfile() sends the files by blocks
      of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each
      block, the SPLICE_F_MORE flag is missing, therefore the hashing operation
      is reset as if it was the end of the file.
      
      This patch adds SPLICE_F_MORE to the flags when more data is pending.
      
      With the patch applied, we get the correct sums:
      
      root@vgoip:~# md5sum patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
      
      root@vgoip:~# ./md5sum2 patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      0ff28d9f
  8. 15 Apr, 2015 1 commit
    • Boaz Harrosh's avatar
      dax: unify ext2/4_{dax,}_file_operations · be64f884
      Boaz Harrosh authored
      The original dax patchset split the ext2/4_file_operations because of the
      two NULL splice_read/splice_write in the dax case.
      
      In the vfs if splice_read/splice_write are NULL we then call
      default_splice_read/write.
      
      What we do here is make generic_file_splice_read aware of IS_DAX() so the
      original ext2/4_file_operations can be used as is.
      
      For write it appears that iter_file_splice_write is just fine.  It uses
      the regular f_op->write(file,..) or new_sync_write(file, ...).
      Signed-off-by: default avatarBoaz Harrosh <boaz@plexistor.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      be64f884
  9. 12 Apr, 2015 1 commit
  10. 26 Mar, 2015 1 commit
  11. 29 Jan, 2015 2 commits
  12. 23 Oct, 2014 1 commit
  13. 12 Jun, 2014 3 commits
  14. 28 May, 2014 1 commit
  15. 06 May, 2014 1 commit
    • Al Viro's avatar
      start adding the tag to iov_iter · 71d8e532
      Al Viro authored
      For now, just use the same thing we pass to ->direct_IO() - it's all
      iovec-based at the moment.  Pass it explicitly to iov_iter_init() and
      account for kvec vs. iovec in there, by the same kludge NFS ->direct_IO()
      uses.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      71d8e532
  16. 02 Apr, 2014 2 commits
  17. 22 Jan, 2014 1 commit
    • Miklos Szeredi's avatar
      fuse: fix pipe_buf_operations · 28a625cb
      Miklos Szeredi authored
      Having this struct in module memory could Oops when if the module is
      unloaded while the buffer still persists in a pipe.
      
      Since sock_pipe_buf_ops is essentially the same as fuse_dev_pipe_buf_steal
      merge them into nosteal_pipe_buf_ops (this is the same as
      default_pipe_buf_ops except stealing the page from the buffer is not
      allowed).
      Reported-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@suse.cz>
      Cc: stable@vger.kernel.org
      28a625cb
  18. 25 Oct, 2013 1 commit
  19. 29 Jun, 2013 3 commits
  20. 24 Jun, 2013 1 commit
  21. 20 Jun, 2013 1 commit
  22. 09 Apr, 2013 5 commits
  23. 21 Mar, 2013 1 commit
  24. 04 Mar, 2013 1 commit
  25. 26 Feb, 2013 1 commit
  26. 23 Feb, 2013 1 commit
  27. 07 Jan, 2013 1 commit
    • Eric Dumazet's avatar
      tcp: fix MSG_SENDPAGE_NOTLAST logic · ae62ca7b
      Eric Dumazet authored
      commit 35f9c09f (tcp: tcp_sendpages() should call tcp_push() once)
      added an internal flag : MSG_SENDPAGE_NOTLAST meant to be set on all
      frags but the last one for a splice() call.
      
      The condition used to set the flag in pipe_to_sendpage() relied on
      splice() user passing the exact number of bytes present in the pipe,
      or a smaller one.
      
      But some programs pass an arbitrary high value, and the test fails.
      
      The effect of this bug is a lack of tcp_push() at the end of a
      splice(pipe -> socket) call, and possibly very slow or erratic TCP
      sessions.
      
      We should both test sd->total_len and fact that another fragment
      is in the pipe (pipe->nrbufs > 1)
      
      Many thanks to Willy for providing very clear bug report, bisection
      and test programs.
      Reported-by: default avatarWilly Tarreau <w@1wt.eu>
      Bisected-by: default avatarWilly Tarreau <w@1wt.eu>
      Tested-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ae62ca7b
  28. 12 Dec, 2012 1 commit
  29. 27 Sep, 2012 1 commit