The bug should be "accidently" fixed by recent changes in 2.6.31,
all kernels <= 2.6.30 need the fix. The problem was never noticed before,
it was found because it causes mysterious failures with GFS mount/umount.
Credits to Robert Peterson. He blaimed kthread.c from the very beginning.
But, despite my promise, I forgot to inspect the old implementation until
he did a lot of testing and reminded me. This led to huge delay in fixing
this bug.
kthread_stop() does put_task_struct(k) before it clears kthread_stop_info.k.
This means another kthread_create() can re-use this task_struct, but the
new kthread can still see kthread_should_stop() == T and exit even without
calling threadfn().
Reported-by: Robert Peterson <rpeterso@redhat.com>
Tested-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
/* Now set kthread_should_stop() to true, and wake it up. */
kthread_stop_info.k = k;
wake_up_process(k);
- put_task_struct(k);
/* Once it dies, reset stop ptr, gather result and we're done. */
wait_for_completion(&kthread_stop_info.done);
kthread_stop_info.k = NULL;
ret = kthread_stop_info.err;
+ put_task_struct(k);
mutex_unlock(&kthread_stop_lock);
trace_sched_kthread_stop_ret(ret);