Correct way to remove a cache device?

Discussion:

Daniel Smedegaard Buus

2014-03-31 11:47:21 UTC

Hi there.

Still having issues with bcache on my AWS EC2 adventure...

I'm trying to figure out what the correct way of taking down a bcache
cache device is.

If I echo 1 to /sys/block/BACKING_DEVICE/bcache/detach, and then to
/sys/fs/bcache/*/unregister, the system will hang. The detach part
goes well, but immediately after unregistering, it will crash.
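
For reference, the sequence described above can be written out as a small shell sketch. It only illustrates the order of the two writes; this thread reports that the same order can oops on 3.13.0-19, so it is not a known-good procedure. The function name, the `SYSFS` override (there so the function can be tried against a scratch directory) and the idea of passing the cache set UUID explicitly are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: detach a backing device, then unregister the cache set.
# SYSFS is a hypothetical override for dry runs; on a real system it is /sys.
SYSFS="${SYSFS:-/sys}"

# detach_and_unregister BACKING_DEVICE CACHE_SET_UUID
detach_and_unregister() {
    backing="$1"
    cset="$2"
    detach_file="$SYSFS/block/$backing/bcache/detach"
    unregister_file="$SYSFS/fs/bcache/$cset/unregister"

    # Check both control files before writing anything, so a typo in the
    # UUID does not leave the device half torn down.
    [ -e "$detach_file" ] || { echo "missing $detach_file" >&2; return 1; }
    [ -e "$unregister_file" ] || { echo "missing $unregister_file" >&2; return 1; }

    # Step 1: detach the backing device from its cache set.
    echo 1 > "$detach_file" || return 1
    # Step 2: unregister the cache set itself.
    echo 1 > "$unregister_file" || return 1
}
```

On a live system this would be called as root with something like `detach_and_unregister md0 <cache-set-uuid>`, where both arguments are placeholders for the real device name and UUID.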

I only have SSH access to this instance, and get no output from the
shell, but if I do this in a startup script, what I see from the
system log in the AWS console is the below output.

Any ideas?

Output:

[ 20.756111] BUG: unable to handle kernel NULL pointer
dereference at 0000000000000a00
[ 20.756125] IP: [<ffffffffa0066280>]
journal_write_unlocked+0x130/0x540 [bcache]
[ 20.756137] PGD 0
[ 20.756139] Oops: 0000 [#1] SMP
[ 20.756143] Modules linked in: dm_crypt isofs raid10 raid456
async_memcpy async_raid6_recov async_pq async_xor async_tx xor
raid6_pq raid1 multipath linear bcache raid0 crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul
glue_helper ablk_helper cryptd
[ 20.756165] CPU: 0 PID: 30 Comm: kworker/0:1 Not tainted
3.13.0-19-generic #39-Ubuntu
[ 20.756173] Workqueue: events journal_write_work [bcache]
[ 20.756176] task: ffff8800e8eedfc0 ti: ffff8800e8fe4000
task.ti: ffff8800e8fe4000
[ 20.756179] RIP: e030:[<ffffffffa0066280>]
[<ffffffffa0066280>] journal_write_unlocked+0x130/0x540 [bcache]
[ 20.756187] RSP: e02b:ffff8800e8fe5d90 EFLAGS: 00010202
[ 20.756189] RAX: 0000000000000000 RBX: 0000000000000001 RCX:
0000000000000000
[ 20.756192] RDX: ffff8800e60c0c48 RSI: ffff8800e60ccad8 RDI:
ffff8800e60f8040
[ 20.756194] RBP: ffff8800e8fe5de8 R08: 200398332f400000 R09:
5e80000000000000
[ 20.756197] R10: dffbefcdb6cccbd0 R11: 0000000000000000 R12:
0000000000000001
[ 20.756200] R13: ffff8800e60ccba0 R14: ffff8800e60ccce8 R15:
ffff8800e60c0000
[ 20.756206] FS: 00007f65089c7740(0000)
GS:ffff8800ef600000(0000) knlGS:0000000000000000
[ 20.756209] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 20.756211] CR2: 0000000000000a00 CR3: 00000000e70f6000 CR4:
0000000000002660
[ 20.756214] Stack:
[ 20.756216] ffff8800e8fe5db0 ffff8800e60c0000 ffffffff81c15480
ffffffff81c15480
[ 20.756221] ffff8800e8fe5dc8 ffffffff8109cb2d ffff8800e60c0000
ffff8800e60ccba0
[ 20.756225] ffff8800e60ccbd0 0000000000000000 0000000000000000
ffff8800e8fe5e08
[ 20.756229] Call Trace:
[ 20.756237] [<ffffffff8109cb2d>] ? vtime_common_task_switch+0x3d/0x40
[ 20.756243] [<ffffffffa00666e0>] journal_try_write+0x50/0x60 [bcache]
[ 20.756248] [<ffffffffa0066712>] journal_write_work+0x22/0x30 [bcache]
[ 20.756253] [<ffffffff810824a2>] process_one_work+0x182/0x450
[ 20.756257] [<ffffffff81083241>] worker_thread+0x121/0x410
[ 20.756260] [<ffffffff81083120>] ? rescuer_thread+0x3e0/0x3e0
[ 20.756264] [<ffffffff81089ed2>] kthread+0xd2/0xf0
[ 20.756267] [<ffffffff81089e00>] ? kthread_create_on_node+0x190/0x190
[ 20.756273] [<ffffffff817219bc>] ret_from_fork+0x7c/0xb0
[ 20.756276] [<ffffffff81089e00>] ? kthread_create_on_node+0x190/0x190
[ 20.756278] Code: 00 00 e8 04 03 30 e1 31 c0 66 41 83 bd 94 38
ff ff 00 49 8b 8d a0 40 ff ff 49 8d 97 48 0c 00 00 74 3c 66 0f 1f 84
00 00 00 00 00 <48> 8b b9 00 0a 00 00 0f b7 89 ce 00 00 00 83 c0 01 49
8b 36 48
[ 20.756310] RIP [<ffffffffa0066280>]
journal_write_unlocked+0x130/0x540 [bcache]
[ 20.756316] RSP <ffff8800e8fe5d90>
[ 20.756317] CR2: 0000000000000a00
[ 20.756320] ---[ end trace 84c8ace3e9ccb27e ]---
[ 20.756384] BUG: unable to handle kernel paging request at
ffffffffffffffd8
[ 20.756390] IP: [<ffffffff8108a570>] kthread_data+0x10/0x20
[ 20.756396] PGD 1c11067 PUD 1c13067 PMD 0
[ 20.756401] Oops: 0000 [#2] SMP
[ 20.756405] Modules linked in: dm_crypt isofs raid10 raid456
async_memcpy async_raid6_recov async_pq async_xor async_tx xor
raid6_pq raid1 multipath linear bcache raid0 crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul
glue_helper ablk_helper cryptd
[ 20.756434] CPU: 0 PID: 30 Comm: kworker/0:1 Tainted: G D
3.13.0-19-generic #39-Ubuntu
[ 20.756450] task: ffff8800e8eedfc0 ti: ffff8800e8fe4000
task.ti: ffff8800e8fe4000
[ 20.756455] RIP: e030:[<ffffffff8108a570>]
[<ffffffff8108a570>] kthread_data+0x10/0x20
[ 20.756461] RSP: e02b:ffff8800e8fe59e8 EFLAGS: 00010002
[ 20.756464] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
0000000000000005
[ 20.756468] RDX: 0000000000000004 RSI: 0000000000000000 RDI:
ffff8800e8eedfc0
[ 20.756472] RBP: ffff8800e8fe59e8 R08: 0000000000000000 R09:
ffff8800ef618580
[ 20.756476] R10: ffffffff8133443a R11: ffffea0003996900 R12:
ffff8800ef614440
[ 20.756481] R13: 0000000000000000 R14: ffff8800e8eedfb0 R15:
ffff8800e8eedfc0
[ 20.756487] FS: 00007f65089c7740(0000)
GS:ffff8800ef600000(0000) knlGS:0000000000000000
[ 20.756492] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 20.756497] CR2: 0000000000000028 CR3: 00000000e70f6000 CR4:
0000000000002660
[ 20.756501] Stack:
[ 20.756504] ffff8800e8fe5a00 ffffffff81083951 ffff8800e8eedfc0
ffff8800e8fe5a60
[ 20.756511] ffffffff81715249 ffff8800e8eedfc0 ffff8800e8fe5fd8
0000000000014440
[ 20.756518] 0000000000014440 ffff8800e8eedfc0 ffff8800e8eee5f8
ffff8800e8eedfb0
[ 20.756525] Call Trace:
[ 20.756530] [<ffffffff81083951>] wq_worker_sleeping+0x11/0x90
[ 20.756536] [<ffffffff81715249>] __schedule+0x589/0x7d0
[ 20.756541] [<ffffffff817154b9>] schedule+0x29/0x70
[ 20.756547] [<ffffffff81068c3f>] do_exit+0x6df/0xa50
[ 20.756553] [<ffffffff8171a539>] oops_end+0xa9/0x150
[ 20.756559] [<ffffffff81709614>] no_context+0x27e/0x28b
[ 20.756564] [<ffffffff81709694>] __bad_area_nosemaphore+0x73/0x1ca
[ 20.756570] [<ffffffff817097fe>] bad_area_nosemaphore+0x13/0x15
[ 20.756576] [<ffffffff8171cf07>] __do_page_fault+0xa7/0x560
[ 20.756582] [<ffffffff81718eb0>] ? _raw_spin_unlock_irqrestore+0x20/0x40
[ 20.756589] [<ffffffff810a95f4>] ? __wake_up+0x44/0x50
[ 20.756595] [<ffffffff81641479>] ?
netlink_broadcast_filtered+0x129/0x3b0
[ 20.756602] [<ffffffff8135c510>] ? kobj_ns_drop+0x50/0x50
[ 20.756607] [<ffffffff8171d3da>] do_page_fault+0x1a/0x70
[ 20.756611] [<ffffffff81719848>] page_fault+0x28/0x30
[ 20.756616] [<ffffffffa0066280>] ?
journal_write_unlocked+0x130/0x540 [bcache]
[ 20.756620] [<ffffffff8109cb2d>] ? vtime_common_task_switch+0x3d/0x40
[ 20.756625] [<ffffffffa00666e0>] journal_try_write+0x50/0x60 [bcache]
[ 20.756630] [<ffffffffa0066712>] journal_write_work+0x22/0x30 [bcache]
[ 20.756634] [<ffffffff810824a2>] process_one_work+0x182/0x450
[ 20.756638] [<ffffffff81083241>] worker_thread+0x121/0x410
[ 20.756641] [<ffffffff81083120>] ? rescuer_thread+0x3e0/0x3e0
[ 20.756644] [<ffffffff81089ed2>] kthread+0xd2/0xf0
[ 20.756648] [<ffffffff81089e00>] ? kthread_create_on_node+0x190/0x190
[ 20.756651] [<ffffffff817219bc>] ret_from_fork+0x7c/0xb0
[ 20.756654] [<ffffffff81089e00>] ? kthread_create_on_node+0x190/0x190
[ 20.756657] Code: 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0
01 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 a8 03 00
00 55 48 89 e5 <48> 8b 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
44 00 00
[ 20.756688] RIP [<ffffffff8108a570>] kthread_data+0x10/0x20
[ 20.756691] RSP <ffff8800e8fe59e8>
[ 20.756693] CR2: ffffffffffffffd8
[ 20.756695] ---[ end trace 84c8ace3e9ccb27f ]---
[ 20.756697] Fixing recursive fault but reboot is needed!

Sitsofe Wheeler

2014-03-31 12:06:24 UTC

Post by Daniel Smedegaard Buus
I'm trying to figure out what the correct way of taking down a bcache
cache device is.
If I echo 1 to /sys/block/BACKING_DEVICE/bcache/detach, and then to
/sys/fs/bcache/*/unregister, the system will hang. The detach part
goes well, but immediately after unregistering, it will crash.

Try sleeping at least two seconds between the operations.
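
One way to make the wait less of a guess is to poll the backing device's bcache `state` file instead of sleeping for a fixed time. The sketch below assumes the file reads `no cache` once the detach has finished, and it uses a hypothetical function name and `SYSFS` override (for trying it against a scratch directory). Later messages in this thread suggest that waiting alone may not avoid the oops, so treat it as an illustration of the suggestion rather than a fix.

```shell
#!/bin/sh
# Sketch only: wait until a backing device reports that it has no cache.
# SYSFS is a hypothetical override for dry runs; on a real system it is /sys.
SYSFS="${SYSFS:-/sys}"

# wait_for_detach BACKING_DEVICE [TIMEOUT_SECONDS]
# Returns 0 once the state file reads "no cache", 1 on timeout or if the
# state file cannot be read.
wait_for_detach() {
    backing="$1"
    timeout="${2:-30}"
    state_file="$SYSFS/block/$backing/bcache/state"
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        state=$(cat "$state_file" 2>/dev/null) || return 1
        [ "$state" = "no cache" ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

A script could then run `wait_for_detach md0 60 || exit 1` between the detach and the unregister step, with `md0` standing in for the real backing device.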

--
Sitsofe | http://sucs.org/~sits/

Daniel Smedegaard Buus

2014-03-31 12:13:56 UTC

Post by Sitsofe Wheeler

Try sleeping at least two seconds between the operations.

Hi Sitsofe, thanks for replying :)

I actually tried sleeping for five seconds, I'll try upping it and see
what happens.

Daniel Smedegaard Buus

2014-03-31 12:30:06 UTC

On Mon, Mar 31, 2014 at 2:13 PM, Daniel Smedegaard Buus

Post by Daniel Smedegaard Buus
I actually tried sleeping for five seconds, I'll try upping it and see
what happens.

Hmmm, it's pretty flaky. At first, increasing the wait time to ten
seconds seemed to work. I then tried again, and this time I got
a (script-produced) message about /sys/fs/bcache/*/stop not existing,
which means it actually doesn't fail on the unregister part, but some
time after the detach part. It's not consistent, though.

Is the sequence incorrect? I.e. detach, then unregister? I actually
had it the other way around at first, but my debugging led me to try
to switch them.

Sitsofe Wheeler

2014-03-31 12:37:57 UTC

Post by Daniel Smedegaard Buus
On Mon, Mar 31, 2014 at 2:13 PM, Daniel Smedegaard Buus

Post by Daniel Smedegaard Buus
I actually tried sleeping for five seconds, I'll try upping it and see
what happens.

Hmmm, it's pretty flaky. At first, increasing the wait time to ten
seconds seemed to work. I then tried again, and this time I got
a (script-produced) message about /sys/fs/bcache/*/stop not existing,
which means it actually doesn't fail on the unregister part, but some
time after the detach part. It's not consistent, though.
Is the sequence incorrect? I.e. detach, then unregister? I actually
had it the other way around at first, but my debugging led me to try
to switch them.

To the best of my knowledge you're not doing anything wrong - it's been
flaky for me too. Offhand I think I could detach the front device, wait,
then stop the backing device but I have a feeling doing it over and over
always resulted in problems (such as the one described on
https://bugzilla.redhat.com/show_bug.cgi?id=1074492 ) until the system
was rebooted...

--
Sitsofe | http://sucs.org/~sits/

Daniel Smedegaard Buus

2014-03-31 12:42:17 UTC

Post by Sitsofe Wheeler

Post by Daniel Smedegaard Buus
Is the sequence incorrect? I.e. detach, then unregister? I actually
had it the other way around at first, but my debugging led me to try
to switch them.

Cool enough then, all things considered ;)

If it's just about waiting a long time and doing an extra reboot, then
that's not an issue. I just need to figure out how to make it work
consistently. This is for disaster recovery anyway, so having to wait
and reboot is no biggie ;)

Thanks!

Daniel

Daniel Smedegaard Buus

2014-03-31 13:04:09 UTC

Correction to the previous correction about it failing on detach, not
unregister. I think I may have been simply running the detach commands
twice in sequence without recreating the cache device and re-attaching
it in-between.

However, it seems that waiting isn't really going to help. 30 seconds
didn't help either; it seems to die whenever some other process
fiddles with the detached cache device, be it the unregister
operation or my subsequent mdadm --stop (as this is a raid0 of two SSD
devices).

I'll try to work around it somehow. Thanks for your time :)

Sitsofe Wheeler

2014-03-31 13:39:25 UTC

Any ideas about this oops Kent? I've seen similar problems too...

Post by Daniel Smedegaard Buus
Still having issues with bcache on my AWS EC2 adventure...
I'm trying to figure out what the correct way of taking down a bcache
cache device is.
If I echo 1 to /sys/block/BACKING_DEVICE/bcache/detach, and then to
/sys/fs/bcache/*/unregister, the system will hang. The detach part
goes well, but immediately after unregistering, it will crash.
I only have SSH access to this instance, and get no output from the
shell, but if I do this in a startup script, what I see from the
system log in the AWS console is the below output.
Any ideas?

--
Sitsofe | http://sucs.org/~sits/

Daniel Smedegaard Buus

2014-04-01 07:01:55 UTC

I couldn't figure out a predictable way to detach it properly. Even
when doing it as early as possible in the boot sequence, it'd succeed
anywhere from the first time I tried to seven reboots later.

I actually ended up doing something very nasty, but very efficient: I
simply dd zeroes over the beginning of the cache device and reboot.
The kernel then no longer recognizes the cache device, and I can
continue normally.

Not pretty by any standard (actually makes me feel like showering),
but it works ;)
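
A rough sketch of that workaround follows. The post does not say how much of the device was zeroed, so the 1 MiB used here is an assumption, as is the function name. This destroys whatever is at the start of the target, so the device path needs to be double-checked before running it.

```shell
#!/bin/sh
# Sketch only: overwrite the start of a cache device with zeroes so that it
# is no longer recognized as a bcache device after the next reboot.

# wipe_cache_header DEVICE
wipe_cache_header() {
    dev="$1"
    [ -e "$dev" ] || { echo "no such device: $dev" >&2; return 1; }
    # 256 blocks of 4096 bytes = 1 MiB (an assumed size, not from the post).
    # conv=notrunc only matters when the target is a regular file, such as
    # a test image; it stops dd from truncating the file.
    dd if=/dev/zero of="$dev" bs=4096 count=256 conv=notrunc 2>/dev/null || return 1
    # Flush the zeroes to the device before the reboot.
    sync
}
```

It would be invoked as root, for example `wipe_cache_header /dev/md0`, where `/dev/md0` is a placeholder for the actual cache device, and followed by a reboot as described above.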