1. 04 Dec, 2007 3 commits
  2. 30 Nov, 2007 1 commit
  3. 29 Nov, 2007 2 commits
  4. 28 Nov, 2007 2 commits
  5. 20 Nov, 2007 1 commit
    • Christian Borntraeger's avatar
      [S390] Optimize storage key handling for anonymous pages · ce7e9fae
      Christian Borntraeger authored
      page_mkclean used to call page_clear_dirty for every given page. This
      is different to all other architectures, where the dirty bit in the
      PTEs is only resetted, if page_mapping() returns a non-NULL pointer.
      We can move the page_test_dirty/page_clear_dirty sequence into the
      2nd if to avoid unnecessary iske/sske sequences, which are expensive.
      
      This change also helps kvm for s390 as the host must transfer the
      dirty bit into the guest status bits. By moving the page_clear_dirty
      operation into the 2nd if, the vm will only call page_clear_dirty
      for pages where it walks the mapping anyway. There it calls
      ptep_clear_flush for writable ptes, so we can transfer the dirty bit
      to the guest.
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      ce7e9fae
  6. 16 Nov, 2007 1 commit
    • Linus Torvalds's avatar
      dirty page balancing: Get rid of broken unmapped_ratio logic · 8c086340
      Linus Torvalds authored
      This code harks back to the days when we didn't count dirty mapped
      pages, which led us to try to balance the number of dirty unmapped pages
      by how much unmapped memory there was in the system.
      
      That makes no sense any more, since now the dirty counts include the
      mapped pages.  Not to mention that the math doesn't work with HIGHMEM
      machines anyway, and causes the unmapped_ratio to potentially turn
      negative (which we do catch thanks to clamping it at a minimum value,
      but I mention that as an indication of how broken the code is).
      
      The code also was written at a time when the default dirty ratio was
      much larger, and the unmapped_ratio logic effectively capped that large
      dirty ratio a bit.  Again, we've since lowered the dirty ratio rather
      aggressively, further lessening the point of that code.
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8c086340
  7. 15 Nov, 2007 17 commits
    • Nick Piggin's avatar
      slob: fix memory corruption · d32ddd8f
      Nick Piggin authored
      Previously, it would be possible for prev->next to point to
      &free_slob_pages, and thus we would try to move a list onto itself, and
      bad things would happen.
      
      It seems a bit hairy to be doing list operations with the list marker as
      an entry, rather than a head, but...
      
      this resolves the following crash:
      
        http://bugzilla.kernel.org/show_bug.cgi?id=9379Signed-off-by: default avatarNick Piggin <npiggin@suse.de>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Acked-by: default avatarMatt Mackall <mpm@selenic.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d32ddd8f
    • Balbir Singh's avatar
      Swap delay accounting, include lock_page() delays · 20a1022d
      Balbir Singh authored
      The delay incurred in lock_page() should also be accounted in swap delay
      accounting
      Reported-by: default avatarNick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: default avatarBalbir Singh <balbir@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      20a1022d
    • Randy Dunlap's avatar
      vmstat: fix section mismatch warning · 42614fcd
      Randy Dunlap authored
      Mark start_cpu_timer() as __cpuinit instead of __devinit.
      Fixes this section warning:
      
      WARNING: vmlinux.o(.text+0x60e53): Section mismatch: reference to .init.text:start_cpu_timer (between 'vmstat_cpuup_callback' and 'vmstat_show')
      Signed-off-by: default avatarRandy Dunlap <randy.dunlap@oracle.com>
      Acked-by: default avatarChristoph Lameter <clameter@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      42614fcd
    • Adrian Bunk's avatar
      fix mm/util.c:krealloc() · be21f0ab
      Adrian Bunk authored
      Commit ef8b4520 added one NULL check for
      "p" in krealloc(), but that doesn't seem to be enough since there
      doesn't seem to be any guarantee that memcpy(ret, NULL, 0) works
      (spotted by the Coverity checker).
      
      For making it clearer what happens this patch also removes the pointless
      min().
      Signed-off-by: default avatarAdrian Bunk <bunk@kernel.org>
      Acked-by: default avatarChristoph Lameter <clameter@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      be21f0ab
    • Ken Chen's avatar
      hugetlb: fix i_blocks accounting · 45c682a6
      Ken Chen authored
      For administrative purpose, we want to query actual block usage for
      hugetlbfs file via fstat.  Currently, hugetlbfs always return 0.  Fix that
      up since kernel already has all the information to track it properly.
      Signed-off-by: default avatarKen Chen <kenchen@google.com>
      Acked-by: default avatarAdam Litke <agl@us.ibm.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      45c682a6
    • Adrian Bunk's avatar
      mm/hugetlb.c: make a function static · 8cde045c
      Adrian Bunk authored
      return_unused_surplus_pages() can become static.
      Signed-off-by: default avatarAdrian Bunk <bunk@kernel.org>
      Acked-by: default avatarAdam Litke <agl@us.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8cde045c
    • Adam Litke's avatar
      hugetlb: enforce quotas during reservation for shared mappings · 90d8b7e6
      Adam Litke authored
      When a MAP_SHARED mmap of a hugetlbfs file succeeds, huge pages are reserved
      to guarantee no problems will occur later when instantiating pages.  If quotas
      are in force, page instantiation could fail due to a race with another process
      or an oversized (but approved) shared mapping.
      
      To prevent these scenarios, debit the quota for the full reservation amount up
      front and credit the unused quota when the reservation is released.
      Signed-off-by: default avatarAdam Litke <agl@us.ibm.com>
      Cc: Ken Chen <kenchen@google.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <hermes@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      90d8b7e6
    • Adam Litke's avatar
      hugetlb: allow bulk updating in hugetlb_*_quota() · 9a119c05
      Adam Litke authored
      Add a second parameter 'delta' to hugetlb_get_quota and hugetlb_put_quota to
      allow bulk updating of the sbinfo->free_blocks counter.  This will be used by
      the next patch in the series.
      Signed-off-by: default avatarAdam Litke <agl@us.ibm.com>
      Cc: Ken Chen <kenchen@google.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <hermes@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9a119c05
    • Adam Litke's avatar
      hugetlb: debit quota in alloc_huge_page · 2fc39cec
      Adam Litke authored
      Now that quota is credited by free_huge_page(), calls to hugetlb_get_quota()
      seem out of place.  The alloc/free API is unbalanced because we handle the
      hugetlb_put_quota() but expect the caller to open-code hugetlb_get_quota().
      Move the get inside alloc_huge_page to clean up this disparity.
      
      This patch has been kept apart from the previous patch because of the somewhat
      dodgy ERR_PTR() use herein.  Moving the quota logic means that
      alloc_huge_page() has two failure modes.  Quota failure must result in a
      SIGBUS while a standard allocation failure is OOM.  Unfortunately, ERR_PTR()
      doesn't like the small positive errnos we have in VM_FAULT_* so they must be
      negated before they are used.
      
      Does anyone take issue with the way I am using PTR_ERR.  If so, what are your
      thoughts on how to clean this up (without needing an if,else if,else block at
      each alloc_huge_page() callsite)?
      Signed-off-by: default avatarAdam Litke <agl@us.ibm.com>
      Cc: Ken Chen <kenchen@google.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <hermes@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2fc39cec
    • Adam Litke's avatar
      hugetlb: fix quota management for private mappings · c79fb75e
      Adam Litke authored
      The hugetlbfs quota management system was never taught to handle MAP_PRIVATE
      mappings when that support was added.  Currently, quota is debited at page
      instantiation and credited at file truncation.  This approach works correctly
      for shared pages but is incomplete for private pages.  In addition to
      hugetlb_no_page(), private pages can be instantiated by hugetlb_cow(); but
      this function does not respect quotas.
      
      Private huge pages are treated very much like normal, anonymous pages.  They
      are not "backed" by the hugetlbfs file and are not stored in the mapping's
      radix tree.  This means that private pages are invisible to
      truncate_hugepages() so that function will not credit the quota.
      
      This patch (based on a prototype provided by Ken Chen) moves quota crediting
      for all pages into free_huge_page().  page->private is used to store a pointer
      to the mapping to which this page belongs.  This is used to credit quota on
      the appropriate hugetlbfs instance.
      Signed-off-by: default avatarAdam Litke <agl@us.ibm.com>
      Cc: Ken Chen <kenchen@google.com>
      Cc: Ken Chen <kenchen@google.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <hermes@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c79fb75e
    • Adam Litke's avatar
      hugetlb: split alloc_huge_page into private and shared components · 348ea204
      Adam Litke authored
      Hugetlbfs implements a quota system which can limit the amount of memory that
      can be used by the filesystem.  Before allocating a new huge page for a file,
      the quota is checked and debited.  The quota is then credited when truncating
      the file.  I found a few bugs in the code for both MAP_PRIVATE and MAP_SHARED
      mappings.  Before detailing the problems and my proposed solutions, we should
      agree on a definition of quotas that properly addresses both private and
      shared pages.  Since the purpose of quotas is to limit total memory
      consumption on a per-filesystem basis, I argue that all pages allocated by the
      fs (private and shared) should be charged against quota.
      
      Private Mappings
      ================
      
      The current code will debit quota for private pages sometimes, but will never
      credit it.  At a minimum, this causes a leak in the quota accounting which
      renders the accounting essentially useless as it is.  Shared pages have a one
      to one mapping with a hugetlbfs file and are easy to account by debiting on
      allocation and crediting on truncate.  Private pages are anonymous in nature
      and have a many to one relationship with their hugetlbfs files (due to copy on
      write).  Because private pages are not indexed by the mapping's radix tree,
      thier quota cannot be credited at file truncation time.  Crediting must be
      done when the page is unmapped and freed.
      
      Shared Pages
      ============
      
      I discovered an issue concerning the interaction between the MAP_SHARED
      reservation system and quotas.  Since quota is not checked until page
      instantiation, an over-quota mmap/reservation will initially succeed.  When
      instantiating the first over-quota page, the program will receive SIGBUS.
      This is inconsistent since the reservation is supposed to be a guarantee.  The
      solution is to debit the full amount of quota at reservation time and credit
      the unused portion when the reservation is released.
      
      This patch series brings quotas back in line by making the following
      modifications:
       * Private pages
         - Debit quota in alloc_huge_page()
         - Credit quota in free_huge_page()
       * Shared pages
         - Debit quota for entire reservation at mmap time
         - Credit quota for instantiated pages in free_huge_page()
         - Credit quota for unused reservation at munmap time
      
      This patch:
      
      The shared page reservation and dynamic pool resizing features have made the
      allocation of private vs.  shared huge pages quite different.  By splitting
      out the private/shared-specific portions of the process into their own
      functions, readability is greatly improved.  alloc_huge_page now calls the
      proper helper and performs common operations.
      
      [akpm@linux-foundation.org: coding-style cleanups]
      Signed-off-by: default avatarAdam Litke <agl@us.ibm.com>
      Cc: Ken Chen <kenchen@google.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: David Gibson <hermes@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      348ea204
    • Adam Litke's avatar
      hugetlb: follow_hugetlb_page() for write access · 5b23dbe8
      Adam Litke authored
      When calling get_user_pages(), a write flag is passed in by the caller to
      indicate if write access is required on the faulted-in pages.  Currently,
      follow_hugetlb_page() ignores this flag and always faults pages for
      read-only access.  This can cause data corruption because a device driver
      that calls get_user_pages() with write set will not expect COW faults to
      occur on the returned pages.
      
      This patch passes the write flag down to follow_hugetlb_page() and makes
      sure hugetlb_fault() is called with the right write_access parameter.
      
      [ezk@cs.sunysb.edu: build fix]
      Signed-off-by: default avatarAdam Litke <agl@us.ibm.com>
      Reviewed-by: default avatarKen Chen <kenchen@google.com>
      Cc: David Gibson <hermes@gibson.dropbear.id.au>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: default avatarErez Zadok <ezk@cs.sunysb.edu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5b23dbe8
    • Yasunori Goto's avatar
      Add IORESOUCE_BUSY flag for System RAM · 887c3cb1
      Yasunori Goto authored
      i386 and x86-64 registers System RAM as IORESOURCE_MEM | IORESOURCE_BUSY.
      
      But ia64 registers it as IORESOURCE_MEM only.
      In addition, memory hotplug code registers new memory as IORESOURCE_MEM too.
      
      This difference causes a failure of memory unplug of x86-64.  This patch
      fixes it.
      
      This patch adds IORESOURCE_BUSY to avoid potential overlap mapping by PCI
      device.
      Signed-off-by: default avatarYasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: default avatarBadari Pulavarty <pbadari@us.ibm.com>
      Cc: Luck, Tony" <tony.luck@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      887c3cb1
    • Peter Zijlstra's avatar
      mm: speed up writeback ramp-up on clean systems · 5fce25a9
      Peter Zijlstra authored
      We allow violation of bdi limits if there is a lot of room on the system.
      Once we hit half the total limit we start enforcing bdi limits and bdi
      ramp-up should happen.  Doing it this way avoids many small writeouts on an
      otherwise idle system and should also speed up the ramp-up.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Reviewed-by: default avatarFengguang Wu <wfg@mail.ustc.edu.cn>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5fce25a9
    • KAMEZAWA Hiroyuki's avatar
      memory hotremove: unset migrate type "ISOLATE" after removal · dbc0e4ce
      KAMEZAWA Hiroyuki authored
      We should unset migrate type "ISOLATE" when we successfully removed memory.
       But current code has BUG and cannot works well.
      
      This patch also includes bugfix?  to change get_pageblock_flags to
      get_pageblock_migratetype().
      
      Thanks to Badari Pulavarty for finding this.
      Signed-off-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: default avatarBadari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dbc0e4ce
    • Lee Schermerhorn's avatar
      Migration: find correct vma in new_vma_page() · 3ad33b24
      Lee Schermerhorn authored
      We hit the BUG_ON() in mm/rmap.c:vma_address() when trying to migrate via
      mbind(MPOL_MF_MOVE) a non-anon region that spans multiple vmas.  For
      anon-regions, we just fail to migrate any pages beyond the 1st vma in the
      range.
      
      This occurs because do_mbind() collects a list of pages to migrate by
      calling check_range().  check_range() walks the task's mm, spanning vmas as
      necessary, to collect the migratable pages into a list.  Then, do_mbind()
      calls migrate_pages() passing the list of pages, a function to allocate new
      pages based on vma policy [new_vma_page()], and a pointer to the first vma
      of the range.
      
      For each page in the list, new_vma_page() calls page_address_in_vma()
      passing the page and the vma [first in range] to obtain the address to get
      for alloc_page_vma().  The page address is needed to get interleaving
      policy correct.  If the pages in the list come from multiple vmas,
      eventually, new_page_address() will pass that page to page_address_in_vma()
      with the incorrect vma.  For !PageAnon pages, this will result in a bug
      check in rmap.c:vma_address().  For anon pages, vma_address() will just
      return EFAULT and fail the migration.
      
      This patch modifies new_vma_page() to check the return value from
      page_address_in_vma().  If the return value is EFAULT, new_vma_page()
      searchs forward via vm_next for the vma that maps the page--i.e., that does
      not return EFAULT.  This assumes that the pages in the list handed to
      migrate_pages() is in address order.  This is currently case.  The patch
      documents this assumption in a new comment block for new_vma_page().
      
      If new_vma_page() cannot locate the vma mapping the page in a forward
      search in the mm, it will pass a NULL vma to alloc_page_vma().  This will
      result in the allocation using the task policy, if any, else system default
      policy.  This situation is unlikely, but the patch documents this behavior
      with a comment.
      
      Note, this patch results in restarting from the first vma in a multi-vma
      range each time new_vma_page() is called.  If this is not acceptable, we
      can make the vma argument a pointer, both in new_vma_page() and it's caller
      unmap_and_move() so that the value held by the loop in migrate_pages()
      always passes down the last vma in which a page was found.  This will
      require changes to all new_page_t functions passed to migrate_pages().  Is
      this necessary?
      
      For this patch to work, we can't bug check in vma_address() for pages
      outside the argument vma.  This patch removes the BUG_ON().  All other
      callers [besides new_vma_page()] already check the return status.
      
      Tested on x86_64, 4 node NUMA platform.
      Signed-off-by: default avatarLee Schermerhorn <lee.schermerhorn@hp.com>
      Acked-by: default avatarChristoph Lameter <clameter@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3ad33b24
    • Akinobu Mita's avatar
      slab: fix typo in allocation failure handling · cc550def
      Akinobu Mita authored
      This patch fixes wrong array index in allocation failure handling.
      
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: default avatarChristoph Lameter <clameter@sgi.com>
      Signed-off-by: default avatarAkinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cc550def
  8. 12 Nov, 2007 2 commits
  9. 05 Nov, 2007 2 commits
  10. 31 Oct, 2007 1 commit
    • Linus Torvalds's avatar
      Remove broken ptrace() special-case code from file mapping · 5307cc1a
      Linus Torvalds authored
      The kernel has for random historical reasons allowed ptrace() accesses
      to access (and insert) pages into the page cache above the size of the
      file.
      
      However, Nick broke that by mistake when doing the new fault handling in
      commit 54cb8821 ("mm: merge populate and
      nopage into fault (fixes nonlinear)".  The breakage caused a hang with
      gdb when trying to access the invalid page.
      
      The ptrace "feature" really isn't worth resurrecting, since it really is
      wrong both from a portability _and_ from an internal page cache validity
      standpoint.  So this removes those old broken remnants, and fixes the
      ptrace() hang in the process.
      
      Noticed and bisected by Duane Griffin, who also supplied a test-case
      (quoth Nick: "Well that's probably the best bug report I've ever had,
      thanks Duane!").
      
      Cc: Duane Griffin <duaneg@dghda.com>
      Acked-by: default avatarNick Piggin <npiggin@suse.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5307cc1a
  11. 30 Oct, 2007 3 commits
    • Zach Brown's avatar
      dio: fix cache invalidation after sync writes · bdb76ef5
      Zach Brown authored
      Commit commit 65b8291c ("dio: invalidate
      clean pages before dio write") introduced a bug which stopped dio from
      ever invalidating the page cache after writes.  It still invalidated it
      before writes so most users were fine.
      
      Karl Schendel reported ( http://lkml.org/lkml/2007/10/26/481 ) hitting
      this bug when he had a buffered reader immediately reading file data
      after an O_DIRECT wirter had written the data.  The kernel issued
      read-ahead beyond the position of the reader which overlapped with the
      O_DIRECT writer.  The failure to invalidate after writes caused the
      reader to see stale data from the read-ahead.
      
      The following patch is originally from Karl.  The following commentary
      is his:
      
      	The below 3rd try takes on your suggestion of just invalidating
      	no matter what the retval from the direct_IO call.  I ran it
      	thru the test-case several times and it has worked every time.
      	The post-invalidate is probably still too early for async-directio,
      	but I don't have a testcase for that;  just sync.  And, this
      	won't be any worse in the async case.
      
      I added a test to the aio-dio-regress repository which mimics Karl's IO
      pattern.  It verifed the bad behaviour and that the patch fixed it.  I
      agree with Karl, this still doesn't help the case where a buffered
      reader follows an AIO O_DIRECT writer.  That will require a bit more
      work.
      
      This gives up on the idea of returning EIO to indicate to userspace that
      stale data remains if the invalidation failed.
      Signed-off-by: default avatarZach Brown <zach.brown@oracle.com>
      Cc: Karl Schendel <kschendel@datallegro.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Leonid Ananiev <leonid.i.ananiev@linux.intel.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bdb76ef5
    • Hugh Dickins's avatar
      fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE · 487e9bf2
      Hugh Dickins authored
      It's possible to provoke unionfs (not yet in mainline, though in mm and
      some distros) to hit shmem_writepage's BUG_ON(page_mapped(page)).  I expect
      it's possible to provoke the 2.6.23 ecryptfs in the same way (but the
      2.6.24 ecryptfs no longer calls lower level's ->writepage).
      
      This came to light with the recent find that AOP_WRITEPAGE_ACTIVATE could
      leak from tmpfs via write_cache_pages and unionfs to userspace.  There's
      already a fix (e4230030 - writeback: don't
      propagate AOP_WRITEPAGE_ACTIVATE) in the tree for that, and it's okay so
      far as it goes; but insufficient because it doesn't address the underlying
      issue, that shmem_writepage expects to be called only by vmscan (relying on
      backing_dev_info capabilities to prevent the normal writeback path from
      ever approaching it).
      
      That's an increasingly fragile assumption, and ramdisk_writepage (the other
      source of AOP_WRITEPAGE_ACTIVATEs) is already careful to check
      wbc->for_reclaim before returning it.  Make the same check in
      shmem_writepage, thereby sidestepping the page_mapped BUG also.
      Signed-off-by: default avatarHugh Dickins <hugh@veritas.com>
      Cc: Erez Zadok <ezk@cs.sunysb.edu>
      Cc: <stable@kernel.org>
      Reviewed-by: default avatarPekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      487e9bf2
    • Glauber de Oliveira Costa's avatar
      mm/sparse-vmemmap.c: make sure init_mm is included · 8bca44bb
      Glauber de Oliveira Costa authored
      mm/sparse-vmemmap.c uses init_mm in some places.  However, it is not
      present in any of the headers currently included in the file.
      
      init_mm is defined as extern in sched.h, so we add it to the headers list
      
      Up to now, this problem was masked by the fact that functions like
      set_pte_at() and pmd_populate_kernel() are usually macros that expand to
      simpler variants that does not use the first parameter at all.
      Signed-off-by: default avatarGlauber de Oliveira Costa <gcosta@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8bca44bb
  12. 29 Oct, 2007 4 commits
  13. 23 Oct, 2007 1 commit