SUSE 11 Memory Leak

Hey folks,

I was hoping someone could assist. We are experiencing a memory leak on our SUSE 11 physical box. We’ve run supportconfig but were unable to gather any helpful insight from it.

Any suggestions or direction would be helpful.

Regards,

@Rohan_Khanna Hi and welcome to the Forum :smile:

Sounds like you might have to run valgrind, or do you know the misbehaving application/service?

Does the system have ECC RAM? If so, do you run rasdaemon to check for errors?

Thank you @malcolmlewis1!

We haven’t used valgrind yet. We were trying to use supportconfig, which is the tool SUSE advises. Do you believe valgrind can give us more insight into the memory leak?

We don’t know which application is responsible for this, but we have reason to believe it could be OS-related processes.

We don’t have ECC RAM.

Let me know if I can share the supportconfig logs here, in case they help get better insight.

Regards

@Rohan_Khanna You probably need to run top to monitor, and also watch /proc/meminfo if you are not sure which application it is…

Is it over a long period of time, or fairly quickly?

What sort of services are running?

Hi @malcolmlewis1,

So here’s the interesting part: when we run the “top” command, the memory doesn’t add up. For example, when memory consumption on the SUSE box shows 60 GB, the processes listed by top only account for about 30 GB. We don’t know where the remaining 30 GB is coming from.

We have tools like Dynatrace, which attributes the remaining memory consumption to “Other Processes”. When we contacted Dynatrace Support, they indicated it’s not an application process but most likely an OS process.

Not sure if it’s related, but we found that “tmpfs” had an allocated size of 32 GB. However, its used space was only a few KB, so we are still not sure what’s contributing to that 30 GB of memory.
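One way to put a number on the gap described above is to compare the summed resident set size (RSS) of all processes with a "used" figure derived from /proc/meminfo. The following is only a minimal sketch: the function names `sum_kb` and `used_kb` are made up for illustration, "used" is assumed to mean MemTotal - MemFree - Buffers - Cached, and summed RSS counts shared pages more than once, so the result is a rough indication at best.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any SUSE tool).

# Sum the first column of numbers on stdin (values in kB).
sum_kb() {
    awk '{ s += $1 } END { printf "%d\n", s }'
}

# "Used" memory in kB from a meminfo-style file (default: /proc/meminfo).
# Assumption: used = MemTotal - MemFree - Buffers - Cached.
used_kb() {
    awk '
        $1 == "MemTotal:" { total   = $2 }
        $1 == "MemFree:"  { free    = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached  = $2 }
        END { printf "%d\n", total - free - buffers - cached }
    ' "${1:-/proc/meminfo}"
}

# Example (run on the affected box):
#   ps -e -o rss= | sum_kb     # RSS of all processes, in kB
#   used_kb                    # used memory excluding buffers/cache, in kB
```

If the used figure turns out far larger than the RSS total, that would suggest the difference sits outside ordinary process memory, which matches what top appears to show here.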

@Rohan_Khanna Are you sure it’s not just cached? The output of cat /proc/meminfo should show where the memory is going.

If cached, then does it change when you run /usr/bin/echo 3 > /proc/sys/vm/drop_caches?

For example;

free -h
               total        used        free      shared  buff/cache   available
Mem:           125Gi        10Gi        89Gi       395Mi        26Gi       114Gi

/usr/bin/echo 3 > /proc/sys/vm/drop_caches

free -h
               total        used        free      shared  buff/cache   available
Mem:           125Gi        10Gi       115Gi       395Mi       1.7Gi       115Gi

@Rohan_Khanna You said “SUSE 11” but SUSE is the company so presumably you mean SLES11 but which SP?

Apologies, I should’ve mentioned this before, but it’s SUSE 11 SP4.

Hi @malcolmlewis1 ,

Sure, I will execute this and share an update shortly. By the way, it’s a gradual increase in memory consumption. It does not happen overnight; it can take a couple of weeks or a month to consume that much memory.

Hi @malcolmlewis1 ,

Apologies, it is ECC RAM, but we are unable to install rasdaemon. It says “No Provider available”.

As for the cache command, please find the details below. Yes, we do see a change there:

Before: (image)

After execution of the command: (image)

Let me know your thoughts.

@Rohan_Khanna So that recovered 10G by the looks of it? I suggest you capture the output of /proc/meminfo, wait some time, and then do another capture to see what has been consumed.

Maybe some sysctl tweaks are needed…?

So the actual applications you have running all look normal for their memory consumption?

Hi @malcolmlewis1,

So it seems meminfo went back to its original state after we executed the command. What sysctl tweaks are you thinking of?

All applications look normal; we’ve not come across any app that shows abnormal memory consumption.

Do you have any other suggestions around this?

Regards,

@Rohan_Khanna So running the command sufficed to clean things up? So not a memory leak as such?

AFAIK the kernel (newer kernels do, at least) should take care of it when it hits its limits.

You shouldn’t need any tweaks hopefully now.

Hi @malcolmlewis1 ,

Memory was consumed again within a few seconds of executing that command, and at this point our memory consumption continues to increase gradually, with no change in that. We still suspect there is a memory leak.

We tried to run supportconfig, but it produces too much information and we’re not sure how to read through it easily to identify the issue. Any suggestions?

@Rohan_Khanna So I would run /usr/bin/echo 3 > /proc/sys/vm/drop_caches && cat /proc/meminfo > pre-clean.txt, then wait a bit for the memory to be consumed, then run cat /proc/meminfo > post-clean.txt and compare the changes…
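The comparison step above could be done by eye or with diff, but a small helper makes the large movers easier to spot. This is a sketch only: `meminfo_diff` is a hypothetical name, and it assumes both files are plain captures of /proc/meminfo as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: list the /proc/meminfo fields that changed between
# two snapshots, largest absolute change first.
# Output columns: field, before, after, signed delta (kB for most fields).
meminfo_diff() {
    awk '
        # First file: remember each value, keyed by field name without the colon.
        NR == FNR { sub(/:$/, "", $1); before[$1] = $2; next }
        # Second file: emit changed fields, prefixed with |delta| for sorting.
        {
            sub(/:$/, "", $1)
            delta = $2 - before[$1]
            if (delta != 0) {
                abs = (delta < 0) ? -delta : delta
                printf "%d %s %d %d %+d\n", abs, $1, before[$1], $2, delta
            }
        }
    ' "$1" "$2" | sort -rn | cut -d' ' -f2-
}

# Example:
#   meminfo_diff pre-clean.txt post-clean.txt
```

Fields that keep growing between captures taken days apart, while the caches stay roughly the same after a drop, would be the ones worth following up.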

Hey apologies @malcolmlewis1,

I was unavailable for the last few days, and this issue is back to haunt us.

Based on your suggestion I tried the steps, and it seems the memory consumption eventually starts increasing again.

Is there any tool you would suggest we run in order to identify which process is responsible?

Regards

@Rohan_Khanna Hi, I would suggest setting up sar to monitor, and perhaps running ps aux --sort=-%mem | head or ps -e -o pid,vsz,comm= | sort -n -k 2. Then there is always the top command: press M (Shift+m) to sort by memory and monitor…
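Since the growth here takes weeks, a one-off ps run may not show much; recording the top consumers at intervals (for example from cron) gives something to compare later. The sketch below makes some assumptions: the function names and the default log path are hypothetical, and it ranks by RSS rather than the VSZ used in the ps command above.

```shell
#!/bin/sh
# Hypothetical helpers for tracking the biggest memory users over time.

# Read "pid rss command" lines on stdin and print the N largest by RSS
# (N defaults to 10).
rank_by_rss() {
    sort -k2,2nr | head -n "${1:-10}"
}

# Append a timestamped snapshot of the top N processes to a log file.
# Arguments: N (default 10), log path (default is only an example).
log_top_mem() {
    {
        date '+%Y-%m-%d %H:%M:%S'
        ps -e -o pid=,rss=,comm= | rank_by_rss "${1:-10}"
    } >> "${2:-/tmp/top-mem.log}"
}

# Example:
#   log_top_mem 15 /var/tmp/top-mem.log
```

If no process in the log grows over the same period in which used memory climbs, that would point away from user-space processes.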

Thanks, let me try this. I will keep you posted.

Hi @malcolmlewis1,

Monitoring with both was not enough to determine which process is contributing to this consumption.

Do you have any other suggestions in mind?

Regards,