I'm inclined to suggest replication that is data-agnostic, like DRBD. The large number of files is going to cause anything running at a higher level than "block storage" to spend an inordinate amount of time walking the tree - as you've already found using rsync or creating inotify watches.
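To make that concrete, here's a minimal sketch of a DRBD resource definition for block-level replication between two nodes. The resource name, hostnames, devices, and addresses are all made up for illustration - substitute your own:

```
# /etc/drbd.d/r0.res -- illustrative sketch; hostnames, disks,
# and IPs below are assumptions, not from your setup.
resource r0 {
  protocol C;                # synchronous replication
  device    /dev/drbd0;      # the replicated block device
  disk      /dev/sdb1;       # backing disk on each node
  meta-disk internal;
  on alpha {
    address 10.0.0.1:7789;
  }
  on bravo {
    address 10.0.0.2:7789;
  }
}
```

Because DRBD mirrors writes at the block layer, the number of files on the filesystem above it is irrelevant to replication cost - only the volume of changed blocks matters.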
The short version of my personal story backing that up: I've not used Ceph, but I'm pretty sure this isn't its prime target market, based on its similarity to Gluster. I have, however, been trying to implement this kind of solution with Gluster for the past several years. It's been up and running most of that time, through several major version updates, but I've had no end of problems. If your goal is more redundancy than performance, Gluster may not be a good solution, particularly if your usage pattern involves a lot of stat() calls.

Gluster doesn't do really well with replication in that case, because stat() calls to replicated volumes go to all of the replicated nodes (actually "bricks", but you're probably just going to have one brick per host). If you have a 2-way replica, for example, each stat() from a client waits for a response from both bricks to make sure it's using current data. Then you also have the FUSE overhead and lack of caching if you're using the native Gluster filesystem for redundancy (rather than using Gluster as the backend with NFS as the protocol and an automounter for redundancy, which still suffers for the stat() reason).

Gluster does do really well with large files, where you can spread data across multiple servers; the data striping and distribution work well, as that's really what it's for. And the newer RAID10-type replication performs better than the older straight replicated volumes. But based on what I'm guessing your usage model is, I'd advise against it.
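For reference, the replicated setup I'm describing looks roughly like this (server names, brick paths, and the volume name are hypothetical):

```
# Two-way replicated volume: every file exists on both bricks,
# and every stat() through the client waits on both.
gluster volume create myvol replica 2 \
    server1:/export/brick1 server2:/export/brick1
gluster volume start myvol

# Native FUSE mount on a client - this is the path with the
# stat() and caching penalties described above.
mount -t glusterfs server1:/myvol /mnt/myvol
```

With `replica 2`, each lookup has to consult both bricks for self-heal checking, which is exactly why metadata-heavy workloads hurt.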
Bear in mind that you'll probably have to find a way to hold master elections between the machines, or implement distributed locking. The shared-block-device solutions require either a filesystem that is multi-master aware (like GFS), or that only one node mount the filesystem read-write. Filesystems in general dislike having data changed at the block device level underneath them. That means your clients will need to be able to tell which node is the master and direct write requests there, which may turn out to be a big nuisance. If GFS and all of its supporting infrastructure is an option, DRBD in multi-master mode (they call it "dual primary") could work well. See https://www.drbd.org/en/doc/users-guide-83/s-dual-primary-mode for more information on that.
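Per the linked DRBD 8.3 guide, dual-primary mode is enabled in the resource definition roughly like this (sketch only - the resource name is hypothetical, and this is safe only with a cluster filesystem like GFS or OCFS2 plus fencing on top):

```
# Fragment of a DRBD 8.3 resource config enabling dual-primary.
resource r0 {
  net {
    allow-two-primaries;       # let both nodes be Primary at once
  }
  startup {
    become-primary-on both;    # promote both nodes automatically
  }
  # ... rest of the resource definition (devices, hosts) ...
}
```

Without a multi-master-aware filesystem and proper fencing, allowing two primaries is a fast route to split-brain and data corruption, which is why the GFS infrastructure requirement matters.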
Regardless of which direction you go, you're apt to find that doing this in real time is still a fairly big pain without just handing a SAN company a truckload of money.