1. 03 Jul, 2017 1 commit
    • Paolo Valente's avatar
      block, bfq: don't change ioprio class for a bfq_queue on a service tree · 431b17f9
      Paolo Valente authored
      On each deactivation or re-scheduling (after being served) of a
      bfq_queue, BFQ invokes the function __bfq_entity_update_weight_prio(),
      to perform pending updates of ioprio, weight and ioprio class for the
      bfq_queue. BFQ also invokes this function on I/O-request dispatches,
      to raise or lower weights more quickly when needed, thereby improving
      latency. However, the entity representing the bfq_queue may be on the
      active (sub)tree of a service tree when this happens, and, although
      with a very low probability, the bfq_queue may happen to also have a
      pending change of its ioprio class. If both conditions hold when
      __bfq_entity_update_weight_prio() is invoked, then the entity moves to
      a sort of hybrid state: the new service tree for the entity, as
      returned by bfq_entity_service_tree(), differs from service tree on
      which the entity still is. The functions that handle activations and
      deactivations of entities do not cope with such a hybrid state (and
      would need to become more complex to cope).
      This commit addresses this issue by just making
      __bfq_entity_update_weight_prio() not perform also a possible pending
      change of ioprio class, when invoked on an I/O-request dispatch for a
      bfq_queue. Such a change is thus postponed to when
      __bfq_entity_update_weight_prio() is invoked on deactivation or
      re-scheduling of the bfq_queue.
      Reported-by: default avatarMarco Piazza <mpiazza@gmail.com>
      Reported-by: default avatarLaurentiu Nicola <lnicola@dend.ro>
      Signed-off-by: default avatarPaolo Valente <paolo.valente@linaro.org>
      Tested-by: default avatarMarco Piazza <mpiazza@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
  2. 08 Jun, 2017 1 commit
    • Paolo Valente's avatar
      block, bfq: access and cache blkg data only when safe · 8f9bebc3
      Paolo Valente authored
      In blk-cgroup, operations on blkg objects are protected with the
      request_queue lock. This is no more the lock that protects
      I/O-scheduler operations in blk-mq. In fact, the latter are now
      protected with a finer-grained per-scheduler-instance lock. As a
      consequence, although blkg lookups are also rcu-protected, blk-mq I/O
      schedulers may see inconsistent data when they access blkg and
      blkg-related objects. BFQ does access these objects, and does incur
      this problem, in the following case.
      The blkg_lookup performed in bfq_get_queue, being protected (only)
      through rcu, may happen to return the address of a copy of the
      original blkg. If this is the case, then the blkg_get performed in
      bfq_get_queue, to pin down the blkg, is useless: it does not prevent
      blk-cgroup code from destroying both the original blkg and all objects
      directly or indirectly referred by the copy of the blkg. BFQ accesses
      these objects, which typically causes a crash for NULL-pointer
      dereference of memory-protection violation.
      Some additional protection mechanism should be added to blk-cgroup to
      address this issue. In the meantime, this commit provides a quick
      temporary fix for BFQ: cache (when safe) blkg data that might
      disappear right after a blkg_lookup.
      In particular, this commit exploits the following facts to achieve its
      goal without introducing further locks.  Destroy operations on a blkg
      invoke, as a first step, hooks of the scheduler associated with the
      blkg. And these hooks are executed with bfqd->lock held for BFQ. As a
      consequence, for any blkg associated with the request queue an
      instance of BFQ is attached to, we are guaranteed that such a blkg is
      not destroyed, and that all the pointers it contains are consistent,
      while that instance is holding its bfqd->lock. A blkg_lookup performed
      with bfqd->lock held then returns a fully consistent blkg, which
      remains consistent until this lock is held. In more detail, this holds
      even if the returned blkg is a copy of the original one.
      Finally, also the object describing a group inside BFQ needs to be
      protected from destruction on the blkg_free of the original blkg
      (which invokes bfq_pd_free). This commit adds private refcounting for
      this object, to let it disappear only after no bfq_queue refers to it
      any longer.
      This commit also removes or updates some stale comments on locking
      issues related to blk-cgroup operations.
      Reported-by: default avatarTomas Konir <tomas.konir@gmail.com>
      Reported-by: default avatarLee Tibbert <lee.tibbert@gmail.com>
      Reported-by: default avatarMarco Piazza <mpiazza@gmail.com>
      Signed-off-by: default avatarPaolo Valente <paolo.valente@linaro.org>
      Tested-by: default avatarTomas Konir <tomas.konir@gmail.com>
      Tested-by: default avatarLee Tibbert <lee.tibbert@gmail.com>
      Tested-by: default avatarMarco Piazza <mpiazza@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
  3. 20 Apr, 2017 1 commit
    • Jens Axboe's avatar
      bfq: fix compile error if CONFIG_CGROUPS=n · 659b3394
      Jens Axboe authored
      If we don't have CGROUPS enabled, the compile ends in the
      following misery:
      In file included from ../block/bfq-iosched.c:105:0:
      ../block/bfq-iosched.h:819:22: error: array type has incomplete element type
       extern struct cftype bfq_blkcg_legacy_files[];
      ../block/bfq-iosched.h:820:22: error: array type has incomplete element type
       extern struct cftype bfq_blkg_files[];
      Move the declarations under the right ifdef.
      Reported-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
  4. 19 Apr, 2017 1 commit
    • Paolo Valente's avatar
      block, bfq: split bfq-iosched.c into multiple source files · ea25da48
      Paolo Valente authored
      The BFQ I/O scheduler features an optimal fair-queuing
      (proportional-share) scheduling algorithm, enriched with several
      mechanisms to boost throughput and reduce latency for interactive and
      real-time applications. This makes BFQ a large and complex piece of
      code. This commit addresses this issue by splitting BFQ into three
      main, independent components, and by moving each component into a
      separate source file:
      1. Main algorithm: handles the interaction with the kernel, and
      decides which requests to dispatch; it uses the following two further
      components to achieve its goals.
      2. Scheduling engine (Hierarchical B-WF2Q+ scheduling algorithm):
      computes the schedule, using weights and budgets provided by the above
      3. cgroups support: handles group operations (creation, destruction,
      move, ...).
      Signed-off-by: default avatarPaolo Valente <paolo.valente@linaro.org>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>