In “Sharding, horizontal and vertical scaling”, Tom Christ from Tumblr discussed sharding for scale. He broke scale down into three dimensions: read, write, and size. While the challenges are often specific to a data set or application, there are ways to categorize and reason about them. Read challenges include query complexity, query volume, and buffer pool effectiveness. To fix read issues you can scale horizontally with replica fanout, or you can add caching. You can also scale vertically with bigger servers, or use smaller, concurrent queries.
He then covered the options for write and size scaling, which is where sharding comes in. He discussed the trade-offs between cost, complexity, write amplification, and concurrency when deciding how to scale and whether sharding should be part of your solution. Tom described three types of sharding: lookup, range, and mod. He also went over two forms of caching: key/value (memcache) and structured (Redis).
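To make the three sharding types concrete, here is a minimal sketch of how each one might map a key to a shard. This is not code from the talk: the function names, the shard count, the range boundaries, and the lookup table contents are all hypothetical and chosen only for illustration.

```python
import bisect

# Hypothetical shard count, for illustration only.
NUM_SHARDS = 4

def mod_shard(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Mod sharding: the shard is the key modulo the number of shards."""
    return user_id % num_shards

# Hypothetical exclusive upper bounds for the first three shards.
# Shard 0 holds ids below 1,000,000, shard 1 holds ids below 2,000,000,
# shard 2 holds ids below 3,000,000, and shard 3 holds everything else.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]

def range_shard(user_id: int) -> int:
    """Range sharding: the shard is chosen by which key range the id falls in."""
    return bisect.bisect_right(RANGE_UPPER_BOUNDS, user_id)

# Hypothetical lookup table. In practice this mapping would typically live
# in its own datastore; a dict stands in for it here.
LOOKUP_TABLE = {42: 2, 7: 0}

def lookup_shard(user_id: int) -> int:
    """Lookup sharding: an explicit table records which shard owns each key.

    Raises KeyError if the key has not been assigned to a shard.
    """
    return LOOKUP_TABLE[user_id]
```

Under these assumptions, mod sharding needs no extra state but moves most keys when the shard count changes, range sharding keeps neighbouring keys together, and lookup sharding lets individual keys be placed or moved freely at the cost of maintaining the mapping.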
All in all, a good overview of the issues that come with scale, how sharding and caching can help, and the pitfalls of each.