blkcg: make sure blkg_lookup() returns %NULL if @q is bypassing
Currently, blkg_lookup() doesn't check @q bypass state. This patch
updates blk_queue_bypass_start() to do synchronize_rcu() before
returning and updates blkg_lookup() to check blk_queue_bypass() and
return %NULL if bypassing. This ensures blkg_lookup() returns %NULL
if @q is bypassing.
This is to guarantee that nobody is accessing policy data while @q is
bypassing, which is necessary to allow replacing blkio_cgroup->pd[] in
place on policy [de]activation.
v2: Added more comments explaining bypass guarantees as suggested by
Vivek.
v3: Added more comments explaining why there's no synchronize_rcu() in
blk_cleanup_queue() as suggested by Vivek.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
diff --git a/block/blk-core.c b/block/blk-core.c
index 991c1d6..f2db628 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -416,7 +416,8 @@
* In bypass mode, only the dispatch FIFO queue of @q is used. This
* function makes @q enter bypass mode and drains all requests which were
* throttled or issued before. On return, it's guaranteed that no request
- * is being throttled or has ELVPRIV set.
+ * is being throttled or has ELVPRIV set and blk_queue_bypass() %true
+ * inside queue or RCU read lock.
*/
void blk_queue_bypass_start(struct request_queue *q)
{
@@ -426,6 +427,8 @@
spin_unlock_irq(q->queue_lock);
blk_drain_queue(q, false);
+ /* ensure blk_queue_bypass() is %true inside RCU read lock */
+ synchronize_rcu();
}
EXPORT_SYMBOL_GPL(blk_queue_bypass_start);
@@ -462,7 +465,15 @@
spin_lock_irq(lock);
- /* dead queue is permanently in bypass mode till released */
+ /*
+ * Dead queue is permanently in bypass mode till released. Note
+ * that, unlike blk_queue_bypass_start(), we aren't performing
+ * synchronize_rcu() after entering bypass mode to avoid the delay
+ * as some drivers create and destroy a lot of queues while
+ * probing. This is still safe because blk_release_queue() will be
+ * called only after the queue refcnt drops to zero and nothing,
+ * RCU or not, would be traversing the queue by then.
+ */
q->bypass_depth++;
queue_flag_set(QUEUE_FLAG_BYPASS, q);