[PATCH] bcache: fix uninterruptible sleep in writeback thread
Slava Pestov
2014-05-01 21:52:30 UTC
There were two issues here:

- writeback thread did not start until the device first became dirty
- writeback thread used uninterruptible sleep once running

Without this patch I see kernel warnings printed and a load average of
1.52 after booting my test VM. With this patch the warnings are gone and
the load average is near 0.00 as expected.
---
 drivers/md/bcache/super.c     |  3 +++
 drivers/md/bcache/writeback.c | 13 +++++++++----
 drivers/md/bcache/writeback.h |  3 ++-
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 926ded8..3ebe829 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1041,6 +1041,9 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c)
 	 */
 	atomic_set(&dc->count, 1);

+	if (bch_cached_dev_writeback_start(dc))
+		return -ENOMEM;
+
 	if (BDEV_STATE(&dc->sb) == BDEV_STATE_DIRTY) {
 		bch_sectors_dirty_init(dc);
 		atomic_set(&dc->has_dirty, 1);
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index f4300e4..08c1abb 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -239,7 +239,7 @@ static void read_dirty(struct cached_dev *dc)
 		if (KEY_START(&w->key) != dc->last_read ||
 		    jiffies_to_msecs(delay) > 50)
 			while (!kthread_should_stop() && delay)
-				delay = schedule_timeout_uninterruptible(delay);
+				delay = schedule_timeout_interruptible(delay);

 		dc->last_read = KEY_OFFSET(&w->key);

@@ -436,7 +436,7 @@ static int bch_writeback_thread(void *arg)
 			while (delay &&
 			       !kthread_should_stop() &&
 			       !test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags))
-				delay = schedule_timeout_uninterruptible(delay);
+				delay = schedule_timeout_interruptible(delay);
 		}
 	}

@@ -478,7 +478,7 @@ void bch_sectors_dirty_init(struct cached_dev *dc)
 	dc->disk.sectors_dirty_last = bcache_dev_sectors_dirty(&dc->disk);
 }

-int bch_cached_dev_writeback_init(struct cached_dev *dc)
+void bch_cached_dev_writeback_init(struct cached_dev *dc)
 {
 	sema_init(&dc->in_flight, 64);
 	init_rwsem(&dc->writeback_lock);
@@ -494,14 +494,19 @@ int bch_cached_dev_writeback_init(struct cached_dev *dc)
 	dc->writeback_rate_d_term = 30;
 	dc->writeback_rate_p_term_inverse = 6000;

+	INIT_DELAYED_WORK(&dc->writeback_rate_update, update_writeback_rate);
+}
+
+int bch_cached_dev_writeback_start(struct cached_dev *dc)
+{
 	dc->writeback_thread = kthread_create(bch_writeback_thread, dc,
 					      "bcache_writeback");
 	if (IS_ERR(dc->writeback_thread))
 		return PTR_ERR(dc->writeback_thread);

-	INIT_DELAYED_WORK(&dc->writeback_rate_update, update_writeback_rate);
 	schedule_delayed_work(&dc->writeback_rate_update,
 			      dc->writeback_rate_update_seconds * HZ);
+	bch_writeback_queue(dc);

 	return 0;
 }
diff --git a/drivers/md/bcache/writeback.h b/drivers/md/bcache/writeback.h
index e2f8598..0a9dab1 100644
--- a/drivers/md/bcache/writeback.h
+++ b/drivers/md/bcache/writeback.h
@@ -85,6 +85,7 @@ static inline void bch_writeback_add(struct cached_dev *dc)
 void bcache_dev_sectors_dirty_add(struct cache_set *, unsigned, uint64_t, int);

 void bch_sectors_dirty_init(struct cached_dev *dc);
-int bch_cached_dev_writeback_init(struct cached_dev *);
+void bch_cached_dev_writeback_init(struct cached_dev *);
+int bch_cached_dev_writeback_start(struct cached_dev *);

 #endif
--
2.0.0.rc0
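
As background for the load average figure in the patch description: on Linux the load average counts tasks in uninterruptible sleep as well as runnable ones, so a kernel thread that idles in schedule_timeout_uninterruptible() adds to it even though it uses no CPU. What follows is only a simplified userspace sketch of the 1-minute fixed-point average, modelled on the kernel's FSHIFT, FIXED_1 and EXP_1 constants and its 5-second update interval. The function names calc_load_step and simulate_load are made up for illustration, and this is not the kernel's own code.

```c
/* Fixed-point constants as used by the kernel's load average code. */
#define FSHIFT   11                 /* bits of fixed-point precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point */
#define EXP_1    1884UL             /* 1/exp(5s/1min) in fixed point */

/*
 * One 5-second update of the 1-minute average (simplified).
 * Both load and active are task counts scaled by FIXED_1.
 */
static unsigned long calc_load_step(unsigned long load, unsigned long active)
{
    unsigned long newload = load * EXP_1 + active * (FIXED_1 - EXP_1);

    if (active >= load)
        newload += FIXED_1 - 1;     /* round up while the load is rising */
    return newload / FIXED_1;
}

/*
 * Hypothetical helper: run `steps` updates with a constant number of
 * counted tasks (runnable plus uninterruptible) and return the average.
 */
static double simulate_load(unsigned int counted_tasks, unsigned int steps)
{
    unsigned long load = 0;
    unsigned long active = counted_tasks * FIXED_1;
    unsigned int i;

    for (i = 0; i < steps; i++)
        load = calc_load_step(load, active);
    return (double)load / FIXED_1;
}
```

In this model, simulate_load(1, 200) settles at 1.0 for a single task that is always counted, while simulate_load(0, 200) stays at 0, which is the idle behaviour reported with the patch applied.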
Daniel Smedegaard Buus
2014-05-02 07:20:14 UTC
Post by Slava Pestov
- writeback thread did not start until the device first became dirty
- writeback thread used uninterruptible sleep once running
Without this patch I see kernel warnings printed and a load average of
1.52 after booting my test VM. With this patch the warnings are gone and
the load average is near 0.00 as expected.
Ooooh, that sounds fantastic! I can't test it right now: I'm at work and
have other tasks to look at, and at the moment I don't have an instance
with proper sources either. I'm currently running a mainline 3.15-rc3
kernel for Utopic (I'm on Ubuntu Trusty). I would love to give it a
whirl during the weekend, though.

Forgive me for being a rookie, but I'm not entirely sure which sources
I should test this on. Is there a git branch specific to bcache? Or
another specific kernel branch that I should use this on? Or would it
just be any recent version (such as 3.15-rc3, which I kinda need for
Java to work without oopsing)?

Thanks,
Daniel :)
Daniel Smedegaard Buus
2014-05-02 08:10:05 UTC
On Fri, May 2, 2014 at 9:20 AM, Daniel Smedegaard Buus
Post by Daniel Smedegaard Buus
Forgive me for being a rookie, but I'm not entirely sure which sources
I should test this on. Is there a git branch specific to bcache? Or
another specific kernel branch that I should use this on? Or would it
just be any recent version (such as 3.15-rc3, which I kinda need for
Java to work without oopsing)?
Okay, it patched just fine against the 3.15-rc3 tar from kernel.org.
I'm compiling it to debs on a micro instance now, gonna check it out
later (probably gonna take forever ;) ).

Thanks!
Nikolay Amiantov
2014-05-05 22:30:34 UTC
Post by Slava Pestov
- writeback thread did not start until the device first became dirty
- writeback thread used uninterruptible sleep once running
Without this patch I see kernel warnings printed and a load average of
1.52 after booting my test VM. With this patch the warnings are gone and
the load average is near 0.00 as expected.
I've tried this patch and it has indeed fixed [1]! Thanks!

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=69471
Peter Kieser
2014-05-12 18:27:49 UTC
Post by Nikolay Amiantov
Post by Slava Pestov
- writeback thread did not start until the device first became dirty
- writeback thread used uninterruptible sleep once running
Without this patch I see kernel warnings printed and a load average of
1.52 after booting my test VM. With this patch the warnings are gone and
the load average is near 0.00 as expected.
I've tried this patch and it has indeed fixed [1]! Thanks!
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=69471
Kent,

Could you please review this patch, and have it pushed upstream?

Regards,
-Peter
Francis Moreau
2014-05-15 08:02:33 UTC
Hello Jens,
Post by Peter Kieser
Post by Nikolay Amiantov
Post by Slava Pestov
- writeback thread did not start until the device first became dirty
- writeback thread used uninterruptible sleep once running
Without this patch I see kernel warnings printed and a load average of
1.52 after booting my test VM. With this patch the warnings are gone and
the load average is near 0.00 as expected.
I've tried this patch and it has indeed fixed [1]! Thanks!
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=69471
Kent,
Could you please review this patch, and have it pushed upstream?
Would it be possible to merge this patch directly, before 3.15 is
released, since Kent doesn't seem to care about bugs in bcache, or maybe
he does but very selectively?

It would also be great if the stable trees were fixed.

Finally, I would suggest marking bcache as experimental, since it's
really not ready for production; just take a look at the bcache mailing
list to see why. At least people won't be disappointed when they use
bcache and see a ton of kernel oopses.

Thanks.
Peter Kieser
2014-05-15 16:18:35 UTC
Post by Francis Moreau
Finally, I would suggest marking bcache as experimental, since it's
really not ready for production; just take a look at the bcache mailing
list to see why. At least people won't be disappointed when they use
bcache and see a ton of kernel oopses.
I'm going to second this request to mark bcache as experimental. XFS
performance has been stunted (almost a DoS condition) since 3.10.4, btrfs
doesn't currently work, and Kent is leaving it up to the btrfs team to
fix the bugs in either bcache or btrfs. There are multiple deadlocks,
uninterruptible sleeps, and corruption.

-Peter
Ross Anderson
2014-05-15 17:29:53 UTC
Post by Peter Kieser
Post by Francis Moreau
Finally, I would suggest marking bcache as experimental, since it's
really not ready for production; just take a look at the bcache mailing
list to see why. At least people won't be disappointed when they use
bcache and see a ton of kernel oopses.
I'm going to second this request to mark bcache as experimental. XFS
performance has been stunted (almost a DoS condition) since 3.10.4, btrfs
doesn't currently work, and Kent is leaving it up to the btrfs team to
fix the bugs in either bcache or btrfs. There are multiple deadlocks,
uninterruptible sleeps, and corruption.
-Peter
Greetings,

First, let me say that I can empathize with your struggles, and I agree
these bugs need to be resolved. A lot of work has gone into resolving
the issues, so far to no avail. Sometimes the bugs are other teams'
issues, as they are not following the system that is outlined.

I'd like to represent the other side of the coin here. I'm aware of
numerous storage solutions where bcache has been stable and in
production for 18-24 months. I myself have over 20 systems in place
with numerous file systems running without issues. There are storage
vendors in the process of introducing it into their next release. If at
all possible, let's try to reach out to other developers and help Kent
get these few issues tracked down. They are obviously tricky. Again,
it is up to others to decide.

Thanks,
Ross Anderson
Jens Axboe
2014-05-15 17:30:58 UTC
Post by Francis Moreau
Hello Jens,
Post by Peter Kieser
Post by Nikolay Amiantov
Post by Slava Pestov
- writeback thread did not start until the device first became dirty
- writeback thread used uninterruptible sleep once running
Without this patch I see kernel warnings printed and a load average of
1.52 after booting my test VM. With this patch the warnings are gone and
the load average is near 0.00 as expected.
I've tried this patch and it has indeed fixed [1]! Thanks!
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=69471
Kent,
Could you please review this patch, and have it pushed upstream?
Would it be possible to merge this patch directly, before 3.15 is
released, since Kent doesn't seem to care about bugs in bcache, or maybe
he does but very selectively?
It would also be great if the stable trees were fixed.
Finally, I would suggest marking bcache as experimental, since it's
really not ready for production; just take a look at the bcache mailing
list to see why. At least people won't be disappointed when they use
bcache and see a ton of kernel oopses.
I'd really like to get Kent to weigh in on this. Sometimes it appears
straightforward to switch from uninterruptible to interruptible sleep,
but then signals get in the way.
--
Jens Axboe
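
The concern about signals can be illustrated with a userspace analogy. An interruptible sleep may return early when a signal arrives, so the caller has to look at the remaining time and decide whether to go back to sleep, much as the patched loops reassign delay from the return value of schedule_timeout_interruptible(). The sketch below uses POSIX nanosleep(). The stop_requested flag and the sleep_unless_stopped name are hypothetical stand-ins (the flag plays the role of kthread_should_stop()), and this is not kernel code or a statement about how the kernel thread handles signals.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <time.h>

/* Hypothetical stop flag, standing in for kthread_should_stop(). */
static volatile sig_atomic_t stop_requested = 0;

/*
 * Sleep for delay_ms milliseconds, resuming after signal interruptions.
 * Returns 0 if the full delay elapsed, 1 if stop_requested cut it short,
 * and -1 on an unexpected nanosleep() error.
 */
static int sleep_unless_stopped(long delay_ms)
{
    struct timespec remaining = {
        .tv_sec  = delay_ms / 1000,
        .tv_nsec = (delay_ms % 1000) * 1000000L,
    };

    while (!stop_requested) {
        struct timespec request = remaining;

        if (nanosleep(&request, &remaining) == 0)
            return 0;   /* slept for all of the remaining time */
        if (errno != EINTR)
            return -1;  /* a real error, not a signal */
        /* EINTR: a signal woke us early; `remaining` holds what is left. */
    }
    return 1;
}
```

A caller that ignored the early return would simply sleep for less time than intended, which is the kind of subtle behaviour change that makes such a switch worth reviewing.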
Francis Moreau
2014-06-02 14:07:53 UTC
Hello,
Post by Jens Axboe
I'd really like to get Kent to weigh in on this. Sometimes it appears
straightforward to switch from uninterruptible to interruptible sleep,
but then signals get in the way.
Any progress ?

Thanks.
Francis Moreau
2014-07-25 07:30:29 UTC
Hi,
Post by Francis Moreau
Hello,
Post by Jens Axboe
I'd really like to get Kent to weigh in on this. Sometimes it appears
straightforward to switch from uninterruptible to interruptible sleep,
but then signals get in the way.
Any progress ?
still present on 3.15.5-2-ARCH :-/

Bye
Francis Moreau
2014-09-05 07:08:09 UTC
Post by Francis Moreau
Hi,
Post by Francis Moreau
Hello,
Post by Jens Axboe
I'd really like to get Kent to weigh in on this. Sometimes it appears
straightforward to switch from uninterruptible to interruptible sleep,
but then signals get in the way.
Any progress ?
still present on 3.15.5-2-ARCH :-/
still present on 3.14.17 :-/

Pavel Goran
2014-05-17 05:47:21 UTC
Post by Slava Pestov
- writeback thread did not start until the device first became dirty
- writeback thread used uninterruptible sleep once running
Without this patch I see kernel warnings printed and a load average of
1.52 after booting my test VM. With this patch the warnings are gone and
the load average is near 0.00 as expected.
Uninterruptible sleep is indeed fixed (as is the annoying "blocked for
more than 120 seconds" message), but with this patch I have the following
problem: if I run a bcache device with no cache attached, the kernel hangs
when I try to reboot. It prints a long stack trace that fills the whole
screen, so I can't see what is above it.

If a cache is attached to the bcache device, rebooting works fine. Without
the patch, rebooting works both with and without an attached cache.