md: fix test in md_write_start()
authorNeilBrown <neilb@suse.com>
Tue, 8 Aug 2017 06:56:36 +0000 (16:56 +1000)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Fri, 25 Aug 2017 00:15:01 +0000 (17:15 -0700)
commitfebaf83252f2e5c94e396619d904047f1c13d4a1
treebea113ddcf62ab6f91d21e103680a1a88e8ccd50
parent5f46f337bdeb3e461d14b8119d12d0d109ab42f4
md: fix test in md_write_start()

commit 81fe48e9aa00bdd509bd3c37a76d1132da6b9f09 upstream.

md_write_start() needs to clear the in_sync flag is it is set, or if
there might be a race with set_in_sync() such that the later will
set it very soon.  In the later case it is sufficient to take the
spinlock to synchronize with set_in_sync(), and then set the flag
if needed.

The current test is incorrect.
It should be:
  if "flag is set" or "race is possible"

"flag is set" is trivially "mddev->in_sync".
"race is possible" should be tested by "mddev->sync_checkers".

If sync_checkers is 0, then there can be no race.  set_in_sync() will
wait in percpu_ref_switch_to_atomic_sync() for an RCU grace period,
and as md_write_start() holds the rcu_read_lock(), set_in_sync() will
be sure ot see the update to writes_pending.

If sync_checkers is > 0, there could be race.  If md_write_start()
happened entirely between
if (!mddev->in_sync &&
    percpu_ref_is_zero(&mddev->writes_pending)) {
and
mddev->in_sync = 1;
in set_in_sync(), then it would not see that is_sync had been set,
and set_in_sync() would not see that writes_pending had been
incremented.

This bug means that in_sync is sometimes not set when it should be.
Consequently there is a small chance that the array will be marked as
"clean" when in fact it is inconsistent.

Fixes: 4ad23a976413 ("MD: use per-cpu counter for writes_pending")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/md/md.c