Historically we have over-provisioned RAM on our on-prem VMs, and we are now running up against the limits of actual usage, with midday spikes threatening to OOM a host. At the moment VMware's balloon driver is kicking in to reclaim RAM from cache, but some of our applications, namely Elasticsearch, are sensitive to this particular blunt instrument, and the result is that oom-killer gets triggered.
What I'm looking for is a tunable that evicts older, inactive pages from the cache after a period of time, rather than leaving them there until some kind of contention throws them out. It looks like RHEL 5 had /proc/sys/vm/pagecache to at least define a ratio for how much overall space the cache could consume, but that didn't even last until RHEL 6. I'm not terribly surprised, since a ratio approach quite obviously "smells bad" and there's already min_free_kbytes, which accomplishes the same goal, but better.
Is there a "cache expiry" tunable I've missed somewhere, or perhaps another approach to clear out the cache that isn't quite as aggressive as sync; echo 1 > /proc/sys/vm/drop_caches?
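For context, here is a minimal sketch of how the pages I'd want expired can be measured. It assumes the standard /proc/meminfo field names (Cached, Inactive(file)) found on RHEL 6 and later; the helper name meminfo_kb is purely illustrative.

```shell
#!/bin/sh
# Hypothetical helper: print the value (in kB) of one field from
# meminfo-formatted input on stdin. The field name is passed without
# its trailing colon, e.g. 'Inactive(file)'.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Compare the total page cache against its inactive, file-backed part.
# Guarded so the script is harmless on systems without /proc/meminfo.
if [ -r /proc/meminfo ]; then
    echo "Cached:         $(meminfo_kb 'Cached' < /proc/meminfo) kB"
    echo "Inactive(file): $(meminfo_kb 'Inactive(file)' < /proc/meminfo) kB"
fi
```

The Inactive(file) figure is roughly the pool that a "cache expiry" tunable would be trimming, so it gives an upper bound on how much such a setting could ever free.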
For the record, I know that the true solution is "use less RAM" and/or "get more RAM" and I am very loudly sounding those alarms, but the business is slow to approve any course of action and I need to address this somehow in the interim.