[QUOTE=ab;39178]On 08/20/2017 06:34 PM, alpha754293 wrote:[color=blue]
ab;39171 Wrote:[color=green]
On 08/18/2017 07:14 PM, alpha754293 wrote:[color=darkred]
Code:
aes@aes3:~> free
             total       used       free     shared    buffers     cached
Mem:     132066080  122653184    9412896     155376         24   84712440
-/+ buffers/cache:   37940720   94125360
Swap:    268437500      35108  268402392[/color]
This is showing what we would hope, that swap is basically unused. Sure, 35 MiB are being used, but that's just about nothing, and it is probably only data which should be swapped, like libraries loaded once and never needed again but still needing to be loaded. You could tune swappiness further, but I can hardly imagine it will make a big difference since the system does not need the memory that is even completely free (9 GiB of it) or that is used and freeable by cache (94 GiB).
[color=darkred](I think that you meant /proc/sys/vm/swappiness and that is still at the default value of 60.)[/color]
Change that if you want; sixty (60) is the default I have as well on my boxes that I have not tuned, but again I doubt it matters too much since the system is barely using any swap now that xorg is not trying to use all of the virtual memory the system has available.[/color]
Xorg isn't using it, but cache is (pagecache and slab objects) - 81.74 GiB of it, to be precise.[/color]
I think we were talking about different “it” things here; I was talking
about swap, and on this current system almost nothing is using swap.
Sure, when you were running your analysis program under xorg it was xorg
taking up both RAM and swap, but that was not usual at all because xorg, a
user program, thought it needed (NEEDED) 333977404 KB of RAM, meaning 333
GB all by itself. That’s not normal, and only indicative (usually) of a
memory leak. In that case your kernel was not holding onto either RAM or
swap to the chagrin of xorg, but rather, as I would expect, it had
probably freed up as much RAM as possible from cache and given it to xorg,
which had happily used it terribly. There’s no fix for this other than to
fix the xorg bug, but it shows that the system did not hold onto cache
while keeping memory (RAM) from a user application, and this is how it
should work. Things are cached when nothing else needs the RAM, but the
system will free it at the drop of a hat when something important
(basically anything) needs it.[/quote]
As you recall though, the thread title is “memory leak issue?”.
If an OS uses swap because it ran out of memory, then the issue isn’t about the swap, but RAM.
Swap is just the consequence of what’s going on with the RAM usage. Fix that, and I would say you’d have a 99% chance of fixing the swap issue.
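For what it's worth, a quick way to see which processes actually own whatever swap IS in use (assuming a kernel new enough to report VmSwap in /proc/<pid>/status) is something along these lines:
Code:
for f in /proc/[0-9]*/status; do
    awk '/^Name:/ {name=$2} /^VmSwap:/ {if ($2 > 0) print $2, "kB", name}' "$f"
done | sort -rn | head
That at least separates "the kernel parked a few idle pages in swap" from "a process is genuinely living in swap".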
[quote=ab][color=blue]
So when an application is going to make a request for ca. 70 GiB of RAM,
let’s say, and since the system only has 128 GB installed, it’s going to
push any new demands on the RAM into swap and this is where it becomes a
problem.[/color]
I would agree that it would be a problem, but I think your own xorg
example shows that is not the case. Yes, you were using lots of swap on
the system at that time, but you were also using all of your RAM (or
nearly so). If xorg had been denied RAM because the system just had to
cache things, xorg would have crashed like any application that needed RAM
and was denied it from the OS. More below to prove this, though.[/quote]
[quote=ab][color=blue]
ab Wrote:[color=green]
[color=darkred]These were screenshots that I took of the terminal window (ssh) earlier. You can see that on one system, it was caching 80.77 GiB and on the other, 94.83 GiB. This is confirmed because when I run:
Code:
echo 3 > /proc/sys/vm/drop_caches
it clears the cache up right away.[/color]
Yes, that makes sense, but I do not understand why there is a perceived problem considering the system state now that xorg is stopped. The system is not in need of memory, at least not at the time of the snapshot you took.[/color]
Again, the root cause of the issue actually isn't the swap in and of itself. It first manifested as such, especially with X running, but in run level 3, I was able to find out that the root cause of the issue is the OS kernel's vm caching of pagecache and slab objects.[/color]
I do not see how you reached this conclusion. I see RAM being used, and
by cache, but I also do not see anything in your last bit of output that
shows anything wanting to use all of the RAM, so everybody is running in
RAM, and that’s good (I’m ignoring those tiny 35 MiB of swap because it’s
almost nothing). System performance went down when swap was heavily used
by xorg, yes, but that only happened because xorg wanted nearly 3x your
system RAM, so it was given a lot of RAM and way more swap, because your
swap is (in my opinion) way too big. If you were to start a new HPC job
that needed that much memory, you’d have similar results, but worse
because your programs probably actually use the RAM heavily, rather than
just filling it, and swap, once.[/quote]
So, I’m re-running the tests now (back in run level 5) to see if changing the vfs_cache_pressure has made any difference or to recreate the condition that caused X to want so much RAM in the first place.
It’ll likely be a couple of days before I will be able to report back.
If X was truly the culprit and NOT the caching, then I will rescind and retract my statement. However, right now, with X running on one node and NOT running on another node, it is difficult to tell.
I'll likely have to do a 2x2 test: set vfs_cache_pressure back to 100 and test with X running and with X not running, then set vfs_cache_pressure=200 and run both tests again, in order to collect enough data to confirm.
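The rough shape of one cell of that matrix would be something like the following (run_job.sh is just a stand-in for my actual batch/solver script, and the runlevel switch covers the "with X"/"without X" half of the matrix):
Code:
echo 100 > /proc/sys/vm/vfs_cache_pressure    # or 200 for the other half of the matrix
init 3                                        # or init 5 for the "X running" cells
free > free-before.txt
./run_job.sh                                  # stand-in for the actual batch analysis run
free > free-after.txt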
Thanks.
[quote=ab][color=blue]
Linux marks the RAM that pagecache and slab objects are cached into as being RAM that is used (which is TECHNICALLY true). What it DOESN'T do when an application demands the RAM, though, is release that cache (a la # echo 3 > /proc/sys/vm/drop_caches) so that the cached pagecache and slab objects go back to the free memory pool and can then be used for/by a USER application.
THAT is the part that it DOESN'T seem to do/be doing.
And that is, to be blunt and frank - stupid.[/color]
If true, it would be a terrible thing for sure, but I have never seen
Linux do this, and I’ve tested it many times; as mentioned above, I think
your xorg example also shows this, but you disagree so I would like to
figure out if my conceptions are all wrong, or if you are, perhaps,
interpreting differently than I am and perhaps we can find some agreement.[/quote]
So far, it appears that changing the vfs_cache_pressure to 200 (from the default 100) seems to be helping.
Swap is still at a measly 36 MiB, but the cached mem (read from top) is 63.029 GiB, down from 80+ GiB earlier (when I started this analysis run).
I’ll know more when I run the 2x2 permutative matrix of tests.
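To make the runs comparable, I'm also planning to leave a simple logger like this running during each test (the interval and file name are arbitrary), since Cached, Slab, and SwapFree in /proc/meminfo are the numbers in question:
Code:
while true; do
    date '+%F %T' >> meminfo.log
    grep -E '^(MemFree|Cached|Slab|SwapFree):' /proc/meminfo >> meminfo.log
    sleep 60
done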
[quote=ab][color=blue]
If you have user applications that require RAM, it should take
precedence over the OS’ need/desire to cache pagecache and slab
objects.[/color]
This is what I have always seen over the years; let’s test it.[/quote]
Stay tuned for the results from my 2x2 tests. It'll take a while to run (I'll probably have to bring the other two nodes online in SLES to help speed up the 2x2 matrix; otherwise the two nodes that are currently running SLES will each have to run the cycle twice, which, for the purposes of this discussion, will take twice as much time).
(And I want to stick to my batch processing/shell script only, because it is representative of the conditions that I am actually going to be using the system under, rather than trying to create something new to test it with.)
So, if it takes quite a bit of time, I am okay with that because I really want to have a firm understanding of what’s going on here.
[quote=ab][color=blue]
Yes, I realise that to Linux, cached objects in RAM = RAM in USE, but it should be intelligent enough to know what is TRULY being used vs. what is only cached, so that the cache can be cleared and the corresponding memory/RAM released back into the free/available pool for user apps to use.
THAT is the root cause of the underlying issue.[/color]
No, Linux definitely sees the difference between cached objects in RAM and
everything else in RAM; if that were not the case, the ‘free’ command
could never show you, as any old user, how much of RAM is being used for
mere caching/buffers, and of course it does.[/quote]
cf. my comments about running the 2x2 permutative matrix of tests above.
[quote=ab][color=blue]
The console output of “free” actually tells you that on one of the
nodes, it has cached 81.74 GiB of objects and the other has cached
well…it WAS 94.83 GiB, now it is 116.31 GiB.[/color]
I cannot see this picture for whatever reason; maybe host it on another
site, unless you can just paste it as text (if it was already text).[/quote]
The output of top is difficult to capture as text (or I just don't know how to do it).
The attachment is hosted here natively (no different than some of the other pictures/attachments).
Check the forum (not the email) for details.
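That said, top can write plain text when run in batch mode, so next time I should be able to grab a snapshot directly with something like (file name is arbitrary):
Code:
top -b -n 1 > top-snapshot.txt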
[quote=ab][color=blue]
Here is the output of ps aux for that node:[/color]
The node from which you posted the ps output looks fine to me. As far as
I can tell, the ‘ps’ output shows that this node is using either 15212268
KB (15 GiB) of VSZ, or 9759828 (9 GiB) Resident memory. If that 9 GiB is
added on top of 116 GiB cached data, you are still not using all of your
RAM. Seeing the ‘free’ output would probably show this as well. Sure,
maybe SOME swap was being used, but I would be very surprised if it was
using swap heavily, but still we need to test this.
[color=blue]
Code:
$ cat /proc/sys/vm/swappiness
60[/color]
Yes, if concerned, at least set this to one (1). Unless you are using a
lot of swap it will not matter, but at least it will have the system
prefer RAM more-heavily.
[color=blue]
I highly doubt 116.31 GiB of cached objects is a “perceived” problem.[/color]
It definitely is only a perceived problem if your user processes are able
to take back the RAM when they need it.
In this last bit of ‘ps aux’ output the majority of your solver processes
are only using something like 300 MiB RAM, so much smaller than before.
You have one using around 4 GiB, but it’s definitely the biggest thing on
there. As a result, while your system shows a lot of cache, that is
because nothing else needs the RAM. Get something to use that RAM and
watch it free up as if you had told the system to drop caches with the
echo statement, only just for the amount needed by the program.
[color=blue]
In my case, swap exists in the event of an analysis requiring more
memory than is physically available.[/color]
Fair enough, as that is a decent purpose for having it, so long as
everybody understands that it will perform terribly (compared to RAM) once
needed. Linux, I am arguing, should delay that time as much as possible,
preferring RAM until then, with your swappiness value at the default of
sixty (60).
Ways to test this are varied, but here are a couple. First, on your box
you should have a /dev/shm (shared memory) mountpoint in which you can
write anything you want, and normally it is about half the size of your
system’s RAM (by default), meaning on your box it will be 64 GiB. If your
system is actually using 10 GiB memory for your processes, but it is
caching 116 GiB of stuff, your memory is all filled up and anything you do
will, per my theory, require freeing up RAM. Per your theory, it will
require using swap. Run the following command to request 25 GiB of RAM
for a file in that “ramdisk” area and see where it is used:
free
dd if=/dev/zero of=/dev/shm/25gib bs=1048576 count=25000
free
If my theory is correct, your swap amounts will not change much, and your
used amount will not change much, but your system will have a lot less
cached suddenly. Delete the file and then check ‘free’ again:
rm /dev/shm/25gib
free
At this point you probably have 25 GiB RAM (or a bit more) free and swap
should still be minimally used. Testing this on my system (which has a
lot less RAM than yours) shows these exact results, and they’re the
results I’ve seen for years, and come to expect.
Of course, you’re not foolish and realize that writing a file to a ramdisk
is maybe not exactly the same as any other user process wanting memory.
The easy test there, of course, is to have something gobble RAM.
Thankfully, folks have written programs that will do just that for us.
The original site is gone, but I can paste the code here, you can drop it
into a file, and then compile it see the results; I just tested it on my
server and it still works as hoped; warning, running code from weirdos
online is slightly scary unless you trust them:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    char *p;
    long i;
    size_t n;

    /* I'm too bored to do proper cmdline parsing */
    if (argc != 2 || atol(argv[1]) <= 0) {
        fprintf(stderr, "I'm bored... Give me the size of the memory chunk in KB\n");
        return 1;
    }

    n = 1024 * atol(argv[1]);
    if (! (p = malloc(n))) {
        perror("malloc failed");
        return 2;
    }

    /* Temp, just want to check malloc */
    printf("Malloc was successful\n");
    //return 0;

    /* Touch all of the buffer, to make sure it gets allocated */
    for (i = 0; i < n; i++)
        p[i] = 'A';

    printf("Allocated and touched buffer, sleeping for 60 sec...\n");
    sleep(60);

    printf("Done!\n");
    return 0;
}
Drop that into something like mem-alloc-test.c and then compile it:
gcc mem-alloc-test.c
The resulting executable will be named ‘a.out’ by default, so now run it
and have it allocate 10 GiB RAM, which in theory you do not have free
other than if cache is freed up:
./a.out 10000000   # 10 million KB = 10 GB or so
While it is running, run ‘free’ in another shell and watch the cache get
freed to make room for this memory-gobbling monster. When it finishes
(after sixty seconds, or when you hit Ctrl+c), see that you have free
space, and less cached than when you started.
I think this proves, at least on my systems, that things work as I have
described. Cache is treated separately, it is a second-class consumer
of RAM, and it is freed up nice and quickly, without using swap.
It is entirely possible your system behaves differently; I have low
swappiness values, and my boxes are probably older and running older
versions of SLES than yours, but if that is the case I would like to
understand why since, as you have noted well, this is a big deal.
Either way, I look forward to better-understanding the memory management here.
Good luck.[/QUOTE]
Stand by for the results from the 2x2 matrix testing.
I think that will address pretty much the rest of your points once the results come in.
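For the allocator test specifically, my plan is roughly this (10000000 KB is about 10 GB, and the five-second interval is arbitrary): run your a.out on the loaded node in one shell and watch the cache shrink from another:
Code:
# shell 1: allocate and touch ~10 GB with the test program from your post
./a.out 10000000
# shell 2: watch used/free/cached/swap while it runs
watch -n 5 free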
Thanks.