1. 09 Sep, 2017 1 commit
  2. 29 Jul, 2017 1 commit
  3. 20 Jun, 2017 1 commit
• sched/wait: Rename wait_queue_t => wait_queue_entry_t · ac6424b9
      Ingo Molnar authored
      Rename:
      
      	wait_queue_t		=>	wait_queue_entry_t
      
      'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
      but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
      which had to carry the name.
      
      Start sorting this out by renaming it to 'wait_queue_entry_t'.
      
      This also allows the real structure name 'struct __wait_queue' to
      lose its double underscore and become 'struct wait_queue_entry',
      which is the more canonical nomenclature for such data types.
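
For illustration, a minimal sketch of the rename (a sketch of the type
names only, not the complete <linux/wait.h> definitions):

	/* Before: the entry type carried the generic "queue" name. */
	typedef struct __wait_queue wait_queue_t;

	/* After: the struct loses its double underscore and both
	 * names say "entry"; the wait queue head keeps its name. */
	typedef struct wait_queue_entry wait_queue_entry_t;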
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 01 Mar, 2017 1 commit
• vhost: introduce O(1) vq metadata cache · f8894913
      Jason Wang authored
When device IOTLB is enabled, all address translations are stored in
an interval tree. The O(log N) search time can be slow for virtqueue
metadata (avail, used and descriptor rings), since it is accessed much
more often than other addresses. So this patch introduces an O(1)
array which points to the interval tree nodes that store the
translations of the vq metadata. The array is updated during vq IOTLB
prefetching and reset on each invalidation and TLB update. Each time
we want to access vq metadata, this small array is queried before the
interval tree. This is sufficient for static mappings but not for
dynamic mappings; we could add optimizations on top.
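
A sketch of the idea (simplified; names such as iotlb_node and
vq_meta_cache are illustrative, not the actual kernel identifiers):

	struct iotlb_node {		/* hypothetical tree node */
		u64 start, last;	/* inclusive iova range */
		void *uaddr;		/* translated userspace address */
	};

	/* One cached tree node per piece of vq metadata. */
	enum { VQ_META_DESC, VQ_META_AVAIL, VQ_META_USED, VQ_NUM_META };

	struct vq_meta_cache {
		struct iotlb_node *node[VQ_NUM_META]; /* set at prefetch */
	};

	/* O(1) fast path, consulted before the O(log N) tree walk. */
	static struct iotlb_node *vq_meta_lookup(struct vq_meta_cache *c,
						 int type, u64 addr, u64 len)
	{
		struct iotlb_node *n = c->node[type];

		if (n && addr >= n->start && addr + len - 1 <= n->last)
			return n;	/* hit: skip the tree */
		return NULL;		/* miss: fall back to the tree */
	}

	/* Invalidation/TLB update: just drop the cached pointers. */
	static void vq_meta_reset(struct vq_meta_cache *c)
	{
		memset(c->node, 0, sizeof(c->node));
	}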
      
Tests were done with l2fwd in the guest (2M hugepages):
      
         noiommu  | before        | after
      tx 1.32Mpps | 1.06Mpps(82%) | 1.30Mpps(98%)
      rx 2.33Mpps | 1.46Mpps(63%) | 2.29Mpps(98%)
      
      We can almost reach the same performance as noiommu mode.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  5. 15 Dec, 2016 1 commit
• vhost: cache used event for better performance · 809ecb9b
      Jason Wang authored
When event index is enabled, we need to fetch the used event from
userspace memory each time. This userspace fetch (with its memory
barrier) can sometimes be avoided by 1) caching the used event and
2) noting that when the cached used event is ahead of new, and the
old-to-new update does not cross it, there is no need to notify the
guest.
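
A sketch of the check, built on the standard vring_need_event() helper
from the virtio ring UAPI ('cached_event' is an illustrative name for
the cached copy, not the actual kernel identifier):

	/* Only notify (and only re-fetch the event index from guest
	 * memory, with its barrier) when the old->new used index
	 * update crosses the cached used event; otherwise the guest
	 * asked not to be notified yet and the fetch can be skipped. */
	static bool should_notify(u16 cached_event, u16 old, u16 new)
	{
		return vring_need_event(cached_event, new, old);
	}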
      
This is useful for heavy tx loads: e.g. a guest pktgen test with the
Linux driver shows a ~3.5% improvement.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  6. 02 Aug, 2016 1 commit
• vhost: new device IOTLB API · 6b1e6cc7
      Jason Wang authored
This patch implements a device IOTLB for vhost. It can be used with a
userspace (QEMU) implementation of DMA remapping to emulate an IOMMU
for the guest.
      
The idea is simple: cache the translations in a software device IOTLB
(implemented as an interval tree) in vhost, and use the vhost_net file
descriptor for reporting IOTLB misses and for IOTLB
update/invalidation. When vhost hits an IOTLB miss, the fault address,
size and access type can be read from the file. After userspace
finishes the translation, it writes the translated address to the
vhost_net file to update the device IOTLB.
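
A sketch of the userspace side of this flow (the message layout
follows struct vhost_msg/vhost_iotlb_msg from the vhost UAPI;
translate() and region_size() are hypothetical helpers, and error
handling is omitted):

	struct vhost_msg msg;

	/* Read an IOTLB miss event from the vhost_net fd. */
	read(vhost_net_fd, &msg, sizeof(msg));
	if (msg.type == VHOST_IOTLB_MSG &&
	    msg.iotlb.type == VHOST_IOTLB_MISS) {
		/* Translate via userspace's emulated IOMMU. */
		uint64_t uaddr = translate(msg.iotlb.iova);

		msg.iotlb.type = VHOST_IOTLB_UPDATE;
		msg.iotlb.uaddr = uaddr;
		msg.iotlb.size = region_size(msg.iotlb.iova);
		msg.iotlb.perm = VHOST_ACCESS_RW;
		/* Write the translation back to fill the device IOTLB. */
		write(vhost_net_fd, &msg, sizeof(msg));
	}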
      
When the device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM,
all vq addresses set by ioctl are treated as IOVAs instead of virtual
addresses, and accesses can only go through the IOTLB instead of
direct userspace memory access. Before each round of vq processing,
all vq metadata is prefetched into the device IOTLB to make sure no
translation fault happens during vq processing.
      
In most cases, virtqueues are contiguous even in virtual address
space, so the IOTLB translation for the virtqueue itself may make
things a little slower. We might add a fast-path cache on top of this
patch.
Signed-off-by: Jason Wang <jasowang@redhat.com>
[mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ]
[mst: fix build warnings ]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[ weiyj.lk: missing unlock on error ]
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
  7. 01 Aug, 2016 2 commits
• vhost: convert pre sorted vhost memory array to interval tree · a9709d68
      Jason Wang authored
The current pre-sorted memory region array has some limitations for
the future device IOTLB conversion:

1) adding and removing a single region needs extra work, and is
   expected to be slow because of sorting or memory re-allocation.
2) removing a large range which may intersect several regions of
   different sizes needs extra work.
3) a replacement policy like LRU needs tricks to implement.

To overcome the above shortcomings, this patch converts the array to
an interval tree, which can easily address the above issues with
almost no extra work.
      
The patch could be used for:

- Extending the current API so that userspace only sends diffs of the
  memory table.
- Simplifying the device IOTLB implementation.
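
A sketch of the node shape, using the kernel's generic interval tree
machinery (the struct name and field set are simplified/illustrative,
not the exact vhost definitions):

	#include <linux/interval_tree_generic.h>

	/* One translated region, keyed by the inclusive [start, last]. */
	struct vhost_region_node {
		struct rb_node rb;
		__u64 start;
		__u64 last;
		__u64 userspace_addr;
		__u64 __subtree_last;	/* interval tree augmentation */
	};

	#define START(n) ((n)->start)
	#define LAST(n)  ((n)->last)

	/* Generates region_tree_insert/remove/iter_first/iter_next,
	 * so lookup, single-region add/remove, and range removal all
	 * come for free in O(log N). */
	INTERVAL_TREE_DEFINE(struct vhost_region_node, rb, __u64,
			     __subtree_last, START, LAST,
			     static inline, region_tree);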
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
• vhost: lockless enqueuing · 04b96e55
      Jason Wang authored
We currently use a spinlock to synchronize the work list, which may
cause unnecessary contention. So this patch switches to an llist to
remove that contention. Pktgen tests show about a 5% improvement:
      
      Before:
      ~1300000 pps
      After:
      ~1370000 pps
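
A sketch of the lockless pattern using the kernel's <linux/llist.h>
API (simplified; 'dev->work_list' as an llist_head and 'work->node'
as the embedded llist_node are assumptions for illustration):

	struct vhost_work *work, *work_next;
	struct llist_node *node;

	/* Producer side: lockless push, no spinlock taken. */
	llist_add(&work->node, &dev->work_list);
	wake_up_process(dev->worker);

	/* Consumer side: detach the whole list in one atomic op,
	 * then restore FIFO order (llist_del_all returns LIFO). */
	node = llist_del_all(&dev->work_list);
	node = llist_reverse_order(node);
	llist_for_each_entry_safe(work, work_next, node, node)
		work->fn(work);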
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  8. 11 Mar, 2016 3 commits
  9. 02 Mar, 2016 1 commit
• vhost: rename vhost_init_used() · 80f7d030
      Greg Kurz authored
Looking at how callers use this, maybe we should just rename
vhost_init_used() to vhost_vq_init_access(). The _used suffix was a
hint that we access the vq used ring, but what callers really care
about is that it must be called after access_ok.
      
      Also, this function manipulates the vq->is_le field which isn't related
      to the vq used ring.
      
      This patch simply renames vhost_init_used() to vhost_vq_init_access() as
      suggested by Michael.
      
      No behaviour change.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  10. 28 Oct, 2015 1 commit
  11. 16 Sep, 2015 1 commit
  12. 01 Jun, 2015 3 commits
  13. 09 Dec, 2014 3 commits
  14. 09 Jun, 2014 2 commits
• vhost: move memory pointer to VQs · 47283bef
      Michael S. Tsirkin authored
commit 2ae76693b8bcabf370b981cd00c36cd41d33fabc
    vhost: replace rcu with mutex
replaced the RCU synchronization for memory accesses with VQ mutex
lock/unlock. This is correct, since all accesses are under the VQ
mutex, but incomplete: we still do useless RCU lock/unlock operations,
and someone might copy this code into some other context where that
won't be right. This use of RCU is also non-standard and hard to
understand. Let's copy the pointer into each VQ structure; this way
the access rules become straightforward, and there's no need for RCU
anymore.
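
A sketch of the access-rule change (simplified; field names follow
the description above rather than the exact vhost structures):

	/* Before: device-wide pointer, nominally RCU-protected. */
	rcu_read_lock();
	mem = rcu_dereference(dev->memory);
	/* ... use mem ... */
	rcu_read_unlock();

	/* After: plain pointer copied into each VQ, valid whenever
	 * vq->mutex is held; no RCU primitives needed at all. */
	mutex_lock(&vq->mutex);
	mem = vq->memory;
	/* ... use mem ... */
	mutex_unlock(&vq->mutex);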
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
• vhost: move acked_features to VQs · ea16c514
      Michael S. Tsirkin authored
Refactor the code to make sure features are only accessed under the
VQ mutex. This makes everything simpler; no need for RCU here anymore.
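
The resulting feature check becomes a plain read under the VQ mutex,
roughly (a sketch of the per-VQ helper shape):

	/* vq->acked_features is only accessed under vq->mutex,
	 * so a plain load is safe here. */
	static inline bool vhost_has_feature(struct vhost_virtqueue *vq,
					     int bit)
	{
		return vq->acked_features & (1ULL << bit);
	}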
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  15. 06 Dec, 2013 1 commit
  16. 11 Jul, 2013 1 commit
  17. 07 Jul, 2013 1 commit
  18. 11 Jun, 2013 1 commit
  19. 06 May, 2013 4 commits
  20. 01 May, 2013 4 commits
  21. 29 Jan, 2013 1 commit
• vhost_net: handle polling errors when setting backend · 2b8b328b
      Jason Wang authored
Currently, polling errors are ignored, which can lead to the following
issues:

- vhost removes itself unconditionally from the waitqueue when
  stopping the poll; this may crash the kernel, since the previous
  attempt at starting may have failed to add itself to the waitqueue.
- userspace may think the backend was set successfully even when the
  polling failed.
      
Solve this by:

- checking poll->wqh before trying to remove from the waitqueue;
- reporting polling errors in vhost_poll_start() and tx_poll_start();
  the return value is checked and returned when userspace wants to set
  the backend.
      
After this fix, a polling failure can still occur after the backend is
set; that will be addressed by the next patch.
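
A sketch of the two fixes (simplified from drivers/vhost/vhost.c; not
the exact upstream bodies):

	/* Stop: only detach if the earlier start actually attached. */
	void vhost_poll_stop(struct vhost_poll *poll)
	{
		if (poll->wqh) {
			remove_wait_queue(poll->wqh, &poll->wait);
			poll->wqh = NULL;
		}
	}

	/* Start: report the error instead of pretending it worked. */
	int vhost_poll_start(struct vhost_poll *poll, struct file *file)
	{
		unsigned long mask = file->f_op->poll(file, &poll->table);

		if (mask & POLLERR) {
			vhost_poll_stop(poll);	/* safe: checks wqh */
			return -EINVAL;
		}
		return 0;
	}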
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 06 Dec, 2012 1 commit
• vhost: avoid backend flush on vring ops · 935cdee7
      Michael S. Tsirkin authored
      vring changes already do a flush internally where appropriate, so we do
      not need a second flush.
      
It's currently not very expensive, but a follow-up patch makes flush
more heavyweight, so remove the extra flush here to avoid regressing
performance if call or kick fds are changed on the data path.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  23. 03 Nov, 2012 4 commits