All volume data is stored with 3-way replication on separate disks and separate servers.
The underlying mechanism is Ceph RBD. In our experience, its data storage is very reliable. Even though we have to replace faulty disks every now and then, data remains intact and usually accessible. Access doesn’t even slow down much when we lose a disk, as data is just served from the remaining two copies while the missing third copies are restored across the (hundreds of) remaining disks. Access does slow down noticeably when a replacement disk is inserted, as all those third copies are then copied back to the “right” disk, which creates a bottleneck.
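To give a feel for why the two situations behave so differently, here is a rough back-of-the-envelope sketch. All numbers (4 TB of affected data, 200 remaining disks, 100 MB/s of write bandwidth per disk) are made up for illustration and are not measurements from our cluster. The model also ignores network limits, recovery throttling and regular client traffic.

```python
def recovery_hours(data_tb: float, disks_writing: int, write_mb_per_s: float) -> float:
    """Idealised time to rewrite `data_tb` terabytes when the writes are
    spread evenly over `disks_writing` disks (hypothetical model)."""
    total_mb = data_tb * 1_000_000          # 1 TB = 1,000,000 MB
    seconds = total_mb / (disks_writing * write_mb_per_s)
    return seconds / 3600


# Assumed, illustrative values only.
DATA_TB = 4
REMAINING_DISKS = 200
WRITE_MB_PER_S = 100

# Disk lost: the missing copies are re-created on many disks in parallel.
after_loss = recovery_hours(DATA_TB, REMAINING_DISKS, WRITE_MB_PER_S)

# Replacement inserted: everything is written to one single disk.
after_replacement = recovery_hours(DATA_TB, 1, WRITE_MB_PER_S)

print(f"spread over {REMAINING_DISKS} disks: {after_loss:.2f} h")
print(f"copied back to one disk:  {after_replacement:.2f} h")
```

In this simplified model the copy-back takes as many times longer as there are disks sharing the work in the first case, which is the bottleneck described above.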
In the past few days (between 6 and 14 October) we had some networking issues that led to (temporary) unavailability of volume data. This is a consequence of a tradeoff: “if in doubt, block access rather than risk data loss or inconsistency”. We regret these issues. They were very hard to diagnose, but we think we have found the origin and should be able to work around them from now on, until the underlying software bug (presumably in the Linux kernel) is fixed.
Another nice feature is that Ceph regularly “scrubs” its data: The three copies are read and compared periodically. This will detect accidental modifications of data “at rest”—and also latent disk errors.
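The idea behind scrubbing can be illustrated with a small toy sketch. This is not how Ceph implements it internally. The function name `find_bad_replicas` and the majority rule are our own assumptions, chosen only to show how comparing three copies can reveal a copy that has silently changed.

```python
import hashlib
from collections import Counter


def find_bad_replicas(replicas: list[bytes]) -> list[int]:
    """Return the indices of replicas whose content differs from the majority.

    Toy model of a scrub pass: read every copy, checksum it, compare.
    """
    digests = [hashlib.sha256(data).hexdigest() for data in replicas]
    majority_digest, count = Counter(digests).most_common(1)[0]
    if count <= len(replicas) // 2:
        # Without a strict majority we cannot tell which copy is correct.
        raise ValueError("no majority among replicas")
    return [i for i, digest in enumerate(digests) if digest != majority_digest]


# Three copies of the same object; the third one was modified "at rest".
copies = [b"volume block 17", b"volume block 17", b"volume block 1?"]
print(find_bad_replicas(copies))
```

With three copies, a single damaged one is outvoted by the other two, so it can be identified and then rewritten from a good copy.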