Discussion:
Md stripe on top of bcache (using EC2 SSDs + EBS)
Daniel Smedegaard Buus
2014-02-06 13:05:08 UTC
Hi :)

Sorry if this isn't the right place to ask questions like this, but
I'm having a bit of trouble getting information about this.

Thing is, I'm trying to set up bcache for an Amazon EC2 machine which
has two ephemeral SSDs. They're not very large, and I'm going to need
more space than they provide, so I was thinking about using bcache to
pair them with larger (and much slower) EBS drives, and then adding
mdadm striping on top of that.

This works (I'm using kernel 3.13, by the way), but the performance is
pretty bad. I'm wondering what the optimal settings would be for such
a setup, both for the bcache devices and for the md array itself.

I'm thinking that the chunk size of the md array might negatively
affect the way bcache works, and that it might be better to set it to
something like 4k, assuming that that's what bcache works with
internally (the SSDs being 4k and so on)?
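
To make the chunk size question concrete, here is a minimal Python sketch of how a plain RAID0 stripe maps one request onto its members. It only counts where chunk boundaries cut the request; it assumes simple round-robin striping and ignores any merging the block layer may do afterwards. The function name and the 64K request size are illustrative, not taken from mdadm or bcache.

```python
def split_request(offset, length, chunk_size, n_devices):
    """Count chunk-boundary pieces and bytes per member for one RAID0 request."""
    per_device = [0] * n_devices
    pieces = 0
    end = offset + length
    while offset < end:
        # Round-robin striping: chunk k lives on member k % n_devices.
        device = (offset // chunk_size) % n_devices
        # Bytes left in the current chunk, capped by the end of the request.
        step = min(chunk_size - offset % chunk_size, end - offset)
        per_device[device] += step
        pieces += 1
        offset += step
    return pieces, per_device


KIB = 1024
for chunk_kib in (4, 64, 256):
    pieces, per_device = split_request(0, 64 * KIB, chunk_kib * KIB, 2)
    print(f"{chunk_kib:>3}K chunk: {pieces} pieces, bytes per member {per_device}")
```

Under these assumptions, a 4K chunk cuts an aligned 64K request into 16 pieces spread over both members, while a 64K or 256K chunk leaves it as a single piece on one member.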

Any ideas?

Thanks in advance,
Daniel
Daniel Smedegaard Buus
2014-02-06 14:28:55 UTC
Hi Matthew, and thanks for responding :)
> EBS is dog-ass SLOW! so unless your data happens to be in the locally
> attached SSD you won't see a benefit. Plus as you are using ephemeral
> SSD you have to use write-thru to have a prayer of having your data
> survive so once more you won't see a performance increase on writes.
Well, about the slowness, that's why I'd want to use bcache on top of
it :) With 32GB of SSD, I imagine the hot set of data (Cassandra DB)
would easily fit for the foreseeable future. Plus, switching from
standard EBS to IO-prioritized EBS would be very easy.

Since it's Cassandra, I'm not concerned about the survival of any one
node, and so writeback mode would be perfectly fine for my needs.
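
For reference, bcache exposes the cache mode of each bcache device through sysfs, so switching to writeback can be sketched roughly as below. The helper name and the `sysfs_root` parameter are hypothetical (the parameter exists only so the function can be tried against a scratch directory); writing to the real file needs root.

```python
from pathlib import Path

# Cache modes accepted by bcache's cache_mode attribute.
VALID_MODES = ("writethrough", "writeback", "writearound", "none")


def set_cache_mode(device, mode, sysfs_root="/sys/block"):
    """Write `mode` to <sysfs_root>/<device>/bcache/cache_mode and return the path."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown cache mode: {mode!r}")
    path = Path(sysfs_root) / device / "bcache" / "cache_mode"
    path.write_text(mode + "\n")
    return path


# Example (as root, on a machine that has /dev/bcache0):
#   set_cache_mode("bcache0", "writeback")
```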
> What exactly are you MDRAID'ing across? Your chunk size should be AT
> LEAST 64K and I would suggest 256K as a better value.
Since I have two SSDs, and bcache supports only one cache device for a
bcache device, I'm looking at joining two bcache devices with md in
stripe mode, hence my questions about chunk size in this setup. That
is,

EBS 1 as backing device 1 + SSD 1 as cache device 1 = bcache0
EBS 2 as backing device 2 + SSD 2 as cache device 2 = bcache1

mdadm stripe of bcache0 and bcache1 = storage device
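
Spelled out as commands, that layout would look roughly like what the following Python sketch prints. It only builds and prints the command lines and runs nothing. The device paths are hypothetical placeholders, the 256K chunk is the value suggested above, and it assumes the bcache devices come up as /dev/bcache0 and /dev/bcache1 in order, which is not guaranteed.

```python
import shlex


def layout_commands(pairs, md_device="/dev/md0", chunk_kib=256):
    """Build commands for one bcache device per (backing, cache) pair, striped with md."""
    commands = []
    for backing, cache in pairs:
        # Formatting backing and cache device in one call also attaches them.
        commands.append(["make-bcache", "-B", backing, "-C", cache])
    # Assumption: bcache devices are numbered in creation order.
    bcache_devs = [f"/dev/bcache{i}" for i in range(len(pairs))]
    commands.append([
        "mdadm", "--create", md_device, "--level=0",
        f"--raid-devices={len(pairs)}", f"--chunk={chunk_kib}",
        *bcache_devs,
    ])
    return commands


# Hypothetical device names: two EBS volumes and two ephemeral SSDs.
pairs = [("/dev/xvdf", "/dev/xvdb"), ("/dev/xvdg", "/dev/xvdc")]
for cmd in layout_commands(pairs):
    print(shlex.join(cmd))
```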
> Don't MD raid bcache devices but rather md raid some number of EBS
> volumes (simple stripe is sufficient) and attach THAT to the ephemeral
> SSDs.
Okay, if that's the way to go, that's what I'll do. It'd just be
really nice if I could utilize both SSDs. Does it make sense to mdadm
the two SSDs in stripe and use the array as the cache device or would
that also negatively impact performance?
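
The variant being asked about here, with one md stripe over the EBS volumes as the backing device and a second md stripe over the two SSDs as the cache device, can be sketched the same way. Again this only prints command lines; the device paths, md names and chunk size are hypothetical, and whether striping the cache helps or hurts is exactly the open question.

```python
import shlex


def stripe(md_device, members, chunk_kib=256):
    """Build an mdadm command for a RAID0 stripe over `members`."""
    return [
        "mdadm", "--create", md_device, "--level=0",
        f"--raid-devices={len(members)}", f"--chunk={chunk_kib}",
        *members,
    ]


def striped_cache_layout(ebs_volumes, ssds):
    """Stripe the EBS volumes and the SSDs separately, then put bcache across both."""
    return [
        stripe("/dev/md0", ebs_volumes),  # backing device: striped EBS
        stripe("/dev/md1", ssds),         # cache device: striped SSDs
        ["make-bcache", "-B", "/dev/md0", "-C", "/dev/md1"],
    ]


# Hypothetical device names.
for cmd in striped_cache_layout(["/dev/xvdf", "/dev/xvdg"], ["/dev/xvdb", "/dev/xvdc"]):
    print(shlex.join(cmd))
```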
> What hardware type are you using? The SSDs should be of very good
> quality (Intel DC3500 or Samsung's not publicly available enterprise)
> at least that's the setup we were using for S3 nodes. EC2 should be
> using the same hardware configuration.
I'm currently testing on a c3.large. Not sure what hardware is behind that.

Thank you for your time :)

Daniel
