
We are currently trying to tune ZFS on a machine with 256 GB of RAM.

Our current ZFS ARC settings are a maximum of 255 GB and a minimum of 64 MB.

The main issue we face is that during peak times workflows get aborted with out-of-memory errors (several flows need up to 55 GB of memory).
When we tried to limit the maximum to 4 GB, performance degraded noticeably.

Output of `uname -a`:

    SunOS xxxxx 5.11 11.1 sun4v sparc sun4v
    Publisher: solaris
       Version: 0.5.11 (Oracle Solaris 11.1 SRU 1.4)
       Build Release: 5.11
       Branch: 0.175.1.1.0.4.0

Output of `psrinfo -pv`:

    The physical processor has 2 cores and 16 virtual processors (0-15)
      The core has 8 virtual processors (0-7)
      The core has 8 virtual processors (8-15)
        SPARC-T4 (chipid 0, clock 2848 MHz)

I am looking for a rule of thumb to configure the min/max ARC values.
Should the ARC get a fixed amount of memory (min and max set equal), or should we measure the peak memory use per time slot (one hour or half an hour) and use that value plus a cap of roughly +10%?
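For reference, whichever sizing rule is chosen, on Solaris the ARC limits are typically capped persistently in `/etc/system` (reboot required). This is only a sketch; the values below are illustrative, not recommendations:

```
* /etc/system fragment - cap the ZFS ARC (illustrative values)
* 64 GB maximum:
set zfs:zfs_arc_max=0x1000000000
* 4 GB minimum:
set zfs:zfs_arc_min=0x100000000
```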

Edit 1: It's an application server with Informatica PowerCenter 9.6.1 installed.

Our current hit rate is above 96%

igiannak
  • You haven't given much information as to what the issue is, and what you're using the server for, e.g.: NFS server? DB server? App server? Web server? You only hint that ZFS is to blame for something, but provide no information on why. – sleepyweasel Aug 16 '17 at 17:34
    Does "255" mean "255 GB"? What's your ARC hit rate (see https://community.oracle.com/docs/DOC-914874 for how to get that)? Generally, 90-95% is where you want to be. Making the ARC bigger to get the ARC hit rate higher won't improve performance much if at all. Go below about 90% and you'll start getting performance drop offs. Don't listen to "let the ARC grow, the memory is released when you need it" advice. That's garbage. ARC release is S-L-O-W, and a large ARC will wreak havoc on performance of apps that have transient demands for large amounts of huge pages, such as Oracle databases. – Andrew Henle Aug 17 '17 at 09:15
  • Our current hit rate is 96%+ – igiannak Aug 17 '17 at 09:25
  • @GiannakopoulosJ What's the output from `echo ::memstat | mdb -k`? Run that as root (from the global zone if you have zones). That will provide data as to how memory is being used right now. – Andrew Henle Aug 17 '17 at 09:26
  • I will ask for the info as we are the operations team and infra is supported by other team :/ – igiannak Aug 17 '17 at 09:28
  • @GiannakopoulosJ A 96%+ hit rate might be getting into the "higher than it really needs to be" range. If your ARC is taking a huge amount of RAM, like 180-200 GB, I'd say try limiting it to something like 32 or 64 GB and see what that does to both your performance and the ARC hit rate. If that gets you better performance without dramatically changing the hit rate - say it goes from 96% to 94%, keep shrinking the ARC until you see the hit rate dropping under 90% or so. Just make sure when you measure your hit rate you're not measuring across a full backup of the system or anything like that. – Andrew Henle Aug 17 '17 at 16:20
  • @AndrewHenle Thank you for the pointers. TBH, suggestions like these are exactly what I'm looking for in order to define actions for the infra team. – igiannak Aug 17 '17 at 16:34
  • @GiannakopoulosJ In this answer (https://unix.stackexchange.com/questions/383472/slowness-at-outputs-in-solaris-8/383903#383903) I posted a couple of dTrace scripts that are extremely useful in determining what the kernel is doing at any time. When I've encountered performance issues that involve a lot of accumulated system time in the kernel, I've used those to quickly identify what's really going on. Sometimes, what I've found had nothing to do with the suspected cause... – Andrew Henle Aug 18 '17 at 16:03
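The hit-rate arithmetic discussed in the comments above can be sketched from the ZFS `arcstats` kstat counters. This is only a sketch: on the Solaris box the live counters would come from `kstat`, and the `hits`/`misses` values below are placeholder samples, not real measurements:

```shell
# On Solaris, the live counters would come from:
#   kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses
# Placeholder sample values for illustration:
hits=960000
misses=40000
awk -v h="$hits" -v m="$misses" \
    'BEGIN { printf "ARC hit rate: %.1f%%\n", 100 * h / (h + m) }'
# prints: ARC hit rate: 96.0%
```

Sampling these counters twice and differencing gives the hit rate over an interval rather than since boot, which avoids skew from events like a full backup.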

1 Answer


I still suggest you provide additional information so that you'd get better advice.

First, you may want to consider upgrading to 11.3 for some additional features and ZFS performance tweaks: https://blogs.oracle.com/zfs/welcome-to-oracle-solaris-113-zfs

You don't note which SRU you have installed, but Solaris 11.2 and Solaris 11.1 SRU 20.5 or newer include a new tunable parameter, user_reserve_hint_pct, that provides a hint for how much memory to reserve for application use, thereby limiting how much memory can be used by the ZFS ARC cache.

You could check out Joerg's blog: http://www.c0t0d0s0.org/archives/7757-user_reserve_hint_pct.html, or check out MOS DOC Memory Management Between ZFS and Applications in Oracle Solaris 11.x (Doc ID 1663862.1) directly.
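As a hedged sketch of what the MOS document describes: the parameter is set dynamically with `mdb` as root (the value 80 below is only an example percentage, and the setting does not survive a reboot without a boot-time script):

```
# Hint that ~80% of memory should be reserved for applications (example value):
echo "user_reserve_hint_pct/W0t80" | mdb -kw
# Verify the current value:
echo "user_reserve_hint_pct/D" | mdb -k
```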

sleepyweasel