100% use of /dev/sda3

Dear all,

b1:/ # df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        68G   64G  152M 100% /

Is this the SWAP partition?
How do I check what is consuming the disk space?
How do I free this space?

Regards
GN

Hi GN,

[QUOTE=gniecka;19951]Dear all,

b1:/ # df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        68G   64G  152M 100% /

Is this the SWAP partition?[/QUOTE]

looking at the last column, I have to say: no. This is your root FS, which is close to full - beyond those remaining 152 MB, only root processes can still write (into the file system’s reserved blocks), a safety measure so that your system won’t crash immediately.

Having a root file system at 100% is bad. It’s like packing your car, stuffing in everything, even covering the driver seat :frowning:

[QUOTE=gniecka;19951]How do I check what is consuming the disk space?[/QUOTE]
Using “ls -l” and “du” (on subdirectories, for a quick overview).

Are /var and /tmp on separate file systems? If not, I’d first check these two for large/extra files, e.g. in /var/log/adm. /var/cache might be a good candidate, too, but it requires a bit of knowledge to free space there manually.
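For a first overview, something along these lines usually does (just a sketch; -x keeps du from crossing into other file systems):

[code]
# per-directory allocation in KB, largest last
du -skx /var/* /tmp 2>/dev/null | sort -n
[/code]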

[QUOTE=gniecka;19951]How do I free this space?[/QUOTE]
By removing files.

Sounds obvious, but unfortunately the advice cannot go into more detail without getting hazardous. You need to check where the excess space is used and decide case by case whether you can remove stuff.

As mentioned above, check if at least /var and /tmp are on separate file systems - if not, it’s time to change that. (A root FS with 68G sure looks like the file systems aren’t further split up.) On a production server, spaces “everyone” can fill up should definitely not reside on the root FS. And personally, I’d even separate /var/cache and /var/log as well.
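A quick way to check whether they are separate mounts (a sketch; df reports the file system each path lives on - if the Filesystem column shows the root device for /var and /tmp, they are part of the root FS):

[code]
df -h / /var /tmp
[/code]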

Regards,
Jens

Dear Jens,

thank you for the quick response.

b1:/ # du -skh * | sort -n
0       proc
0       sys
2.8M    root
4.0K    media
4.0K    selinux
4.2G    usr
4.6G    opt
9.2M    bin
9.9G    HANA_LOG
13M     sbin
16G     mnt
16K     lost+found
17G     hana
19M     lib64
29G     HANA_DATA
30M     boot
56K     home
75M     etc
88K     srv
158M    lib
197M    var
272K    dev
456K    tmp

the biggest directories, HANA_LOG and HANA_DATA, are on other drives - sdb1 and sdc1:

b1:/ # df -h
Filesystem       Size  Used Avail Use% Mounted on
/dev/sda3         68G   64G  150M 100% /
udev              64G  196K   64G   1% /dev
tmpfs             64G   76K   64G   1% /dev/shm
/dev/sda1        195M     0  195M   0% /boot/efi
/dev/sdb1        233G   29G  193G  13% /HANA_DATA
/dev/sdc1        233G   11G  211G   5% /HANA_LOG
10.0.0.15:/HANA  1.9T  720G  1.2T  37% /mnt/share01

and once again - why is there 100% usage on /dev/sda3?

I’m attaching a screenshot of my disks with the same question - why is there 100% space usage on /dev/sda3?

Hi GN,

I’m assuming that deleted files are taking up the space. This happens if such files are still held open by some process.

A “file” is, at its core, allocated disk space referenced by its inode. Each directory entry for the file is a reference - multiple ones if you have hard links. Open file handles are references, too. A file, in terms of disk space, is only deleted and the space freed once all references to it are gone.
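A small illustration of that reference counting, if you want to try it in a scratch directory (a harmless sketch):

[code]
echo data > a
ln a b        # hard link: a second directory entry pointing to the same inode
ls -li a b    # identical inode number, link count 2
rm a
ls -li b      # the data is still allocated; only removing the last reference frees it
[/code]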

Have you, for example, cleaned up log files without restarting syslogd? Then the original directory entries are gone - but the actual files are still held open by syslogd, and any new directory entries reference new files.

To check for this, use “lsof” and look out for “(deleted)” entries:

[QUOTE]jmozdzen@myhost:~/Documents> touch /tmp/hugo
jmozdzen@myhost:~/Documents> tail -f /tmp/hugo &
[1]+ tail -f /tmp/hugo &
jmozdzen@myhost:~/Documents> rm /tmp/hugo
jmozdzen@myhost:~/Documents> ls -l /tmp/hugo
ls: cannot access /tmp/hugo: No such file or directory
jmozdzen@myhost:~/Documents> lsof|grep deleted
tail 26293 jmozdzen 3r REG 253,3 0 40 /tmp/hugo (deleted)[/QUOTE]
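As a shortcut, lsof can also select such entries directly: “+L1” lists open files with a link count below one, i.e. deleted but still held open (a sketch; sorting on column 7, SIZE/OFF, puts the biggest ones first):

[code]
lsof +L1 | sort -nrk7 | head -20
[/code]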

Regards,
Jens

Dear Jens,

# lsof | grep deleted
nautilus  8899  root  34r  REG  8,3    272  1302701  /root/.local/share/gvfs-metadata/home (deleted)
nautilus  8899  root  35r  REG  8,3  32768  1302702  /root/.local/share/gvfs-metadata/home-37deb310.log (deleted)

Will server restart fix this space issue?

Regards

A utility I find very useful in this sort of situation is ncdu. It gives you a list of file/directory sizes sorted by size, and you can drill down into directories.
There’s a version for SLE 11 SP3 at the openSUSE Build Service which you can quickly install with

$ rpm -ivh http://download.opensuse.org/repositories/utilities/SLE_11_SP3/x86_64/ncdu-1.10-11.1.x86_64.rpm

You look to have enough free space to do that. Then run with

$ ncdu -x /

-x won’t cross file system boundaries. See also http://dev.yorhel.nl/ncdu/man

[code]
--- / ------------------------------------------------------------------------------------------------------------------------------------------------
   16.2GiB [##########] /hana
    4.6GiB [##        ] /opt
    4.2GiB [##        ] /usr
  197.0MiB [          ] /var
  157.5MiB [          ] /lib
   74.0MiB [          ] /etc
   29.1MiB [          ] /boot
   18.9MiB [          ] /lib64
   13.0MiB [          ] /sbin
    9.1MiB [          ] /bin
    2.8MiB [          ] /root
  612.0KiB [          ] /tmp
   88.0KiB [          ] /srv
   56.0KiB [          ] /home
e  16.0KiB [          ] /lost+found
e   4.0KiB [          ] /selinux
    4.0KiB [          ] /mnt
e   4.0KiB [          ] /media
>   0.0  B [          ] /sys
>   0.0  B [          ] /proc
>   0.0  B [          ] /dev
>   0.0  B [          ] /HANA_LOG
>   0.0  B [          ] /HANA_DATA

total disk usage:  25.4GiB  Apparent size:  25.4GiB  Items: 527980
[/code]

but df -h still shows me 100% usage on /dev/sda3 (/):
[code]
# df -h
Filesystem       Size  Used Avail Use% Mounted on
/dev/sda3         68G   64G  145M 100% /
udev              64G  196K   64G   1% /dev
tmpfs             64G   80K   64G   1% /dev/shm
/dev/sda1        195M     0  195M   0% /boot/efi
/dev/sdb1        233G   29G  193G  13% /HANA_DATA
/dev/sdc1        233G   12G  210G   6% /HANA_LOG
10.0.0.15:/HANA  1.9T  720G  1.2T  37% /mnt/share01
[/code]

Hi GN,

in column 1 you see the name of the process holding the file open - nautilus is the Gnome file manager, so closing that should be sufficient. OTOH, if I read those lines correctly, both files only allocate a few dozen KB in total - that wouldn’t give you back much space…

(On the subject of reboot: a reboot will stop all processes, so all “(deleted)” files should get finally freed within the file system. But this statement is just for completeness’ sake…)
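If you want a rough figure for how much space such deleted-but-open files pin in total, something like this will do as a quick sketch (it sums lsof’s SIZE/OFF column, which for regular files is the file size, so treat the result as an estimate):

[code]
lsof 2>/dev/null | grep '(deleted)' | awk '{ sum += $7 } END { printf "%.1f MB\n", sum/1024/1024 }'
[/code]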

I’m obviously missing something here - you have a 68 GB file system filled to the brim, but “du” does not show an allocation even close to that (it sums up to about 30G), nor are deleted files of large size to be found.

Could you please run “du -skx /* | sort -n”, so that it reports only allocations on that single file system? There are some other oddities in the output of your last “du” call - e.g. /mnt is reported with 16G used, while “df” reports 720G used.

Are there any hidden directories in / (“ls -la” and look for directories starting with “.”) that might have content taking up that space, without being included in the report of “du”?
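For example (a sketch; the glob matches the dot entries directly under /, and -x again keeps du on the root FS):

[code]
du -skx /.[!.]* 2>/dev/null | sort -n
[/code]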

Regards,
Jens

# du -skx /* | sort -n
du: cannot access `/proc/15909/task/15909/fd/4': No such file or directory
du: cannot access `/proc/15909/task/15909/fdinfo/4': No such file or directory
du: cannot access `/proc/15909/fd/4': No such file or directory
du: cannot access `/proc/15909/fdinfo/4': No such file or directory
du: cannot access `/proc/19692/task/30445/fdinfo/272': No such file or directory
0         /proc
0         /sys
4         /media
4         /selinux
5         /mnt
16        /lost+found
56        /home
88        /srv
196       /dev
612       /tmp
2856      /root
9328      /bin
13292     /sbin
19356     /lib64
29768     /boot
75784     /etc
161264    /lib
201776    /var
4383356   /usr
4791916   /opt
12702492  /HANA_LOG
16944068  /hana
29706688  /HANA_DATA

b1:/ # ls -la
total 132
drwxr-xr-x  25 root   root    4096 Mar 14 22:02 .
drwxr-xr-x  25 root   root    4096 Mar 14 22:02 ..
drwxr-x---   6 ndbadm sapsys  4096 Feb 10 12:16 HANA_DATA
drwxr-x---   5 ndbadm sapsys  4096 Feb 10 12:17 HANA_LOG
drwxr-xr-x   2 root   root    4096 Feb  4 14:41 bin
drwxr-xr-x   4 root   root    4096 Mar 14 21:46 boot
drwxr-xr-x  16 root   root    4360 Mar 20 10:53 dev
drwxr-xr-x 108 root   root   12288 Mar 20 10:53 etc
drwxr-xr-x   3 root   root    4096 Feb 10 09:05 hana
drwxr-xr-x   3 root   root    4096 Feb  4 14:52 home
drwxr-xr-x  13 root   root    4096 Mar 14 21:46 lib
drwxr-xr-x   8 root   root   12288 Mar 14 21:46 lib64
drwx------   2 root   root   16384 Feb  4 14:13 lost+found
drwxr-xr-x   2 root   root    4096 Feb 10 12:22 media
drwxr-xr-x   3 root   root    4096 Mar 13 10:57 mnt
drwxr-xr-x   5 root   root    4096 Feb 16 17:58 opt
dr-xr-xr-x 310 root   root       0 Mar 14 22:02 proc
drwx------  26 root   root    4096 Mar 20 12:01 root
drwxr-xr-x   3 root   root   12288 Mar 14 21:46 sbin
drwxr-xr-x   2 root   root    4096 May  5  2010 selinux
drwxr-xr-x   4 root   root    4096 Feb  4 14:14 srv
drwxr-xr-x  12 root   root       0 Mar 14 22:02 sys
drwxrwxrwt  30 root   root   12288 Mar 20 12:45 tmp
drwxr-xr-x  14 root   root    4096 Feb 10 09:08 usr
drwxr-xr-x  15 root   root    4096 Feb 10 10:24 var

this is the mounted Windows NFS share used for backups - these values are correct: the 16 GB comes from Linux, the rest is Windows files inside this folder.

Hi GN,

we might be right on track now:

[QUOTE=gniecka;19963]# du -skx /* | sort -n
[...]
12702492  /HANA_LOG
[...]
29706688  /HANA_DATA
[...][/QUOTE]

While, according to your earlier messages, both /HANA_LOG and /HANA_DATA are mounted file systems, there seems to be data within these directories that is stored on the local root file system.

That content is hidden as soon as you mount a different file system onto that mount point - from the moment you mount, you’ll only see the contents of the mounted file system, not whatever is in those directories on the underlying FS.

To check whether this is really the case, you’d need to stop HANA, unmount those two file systems and then look inside those directories… you’ll probably see lots of content there…
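If stopping HANA right away is inconvenient, a bind mount is an alternative way to peek at the underlying directories without unmounting anything (a sketch; /mnt/rootonly is just an arbitrary temporary mount point). A plain, non-recursive bind mount of / does not carry the submounts along, so you see the root file system as it really is:

[code]
mkdir /mnt/rootonly
mount --bind / /mnt/rootonly
du -sh /mnt/rootonly/HANA_DATA /mnt/rootonly/HANA_LOG
umount /mnt/rootonly
rmdir /mnt/rootonly
[/code]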

Regards,
Jens

sorry - I didn’t get this…

these /HANA_DATA and /HANA_LOG folders are mounted on separate disks:

when I go to /HANA_DATA and delete some files - before deletion it is 29 G:

# df -h
Filesystem       Size  Used Avail Use% Mounted on
/dev/sda3         68G   64G  143M 100% /
...
/dev/sdb1        233G   29G  193G  13% /HANA_DATA
after deleting some files /HANA_DATA is 25 G, but /dev/sda3 still shows 100% usage:

# df -h
Filesystem       Size  Used Avail Use% Mounted on
/dev/sda3         68G   64G  143M 100% /
...
/dev/sdb1        233G   25G  197G  12% /HANA_DATA

# du -skx /* | sort -n
...
25460424  /HANA_DATA

Hi GN,

[QUOTE=gniecka]sorry - I didn’t get this…[/QUOTE]

think of it this way: /HANA_DATA (and /HANA_LOG) are, first of all, just directories on your root file system.

As long as nothing is mounted there, you’ll see it as a local directory and can put files there, like with any other directory. Those files, of course, take up disk space on the root file system.

At some later point in time, you mount a different file system there. Once you’ve done that, the “original” files are no longer accessible - when you look in /HANA_DATA, you’ll only see the content of the mounted file system. Anything you do there affects the mounted file system, not the root file system. (Which is what you just tested and confirmed.)

But the “original files” are still there, just not visible - once you umount the extra file system, you can again see and access those “original files”… and of course they’ll take up disk space all the time.
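You can reproduce the effect in a minute with a scratch directory (a harmless sketch using tmpfs):

[code]
mkdir /tmp/demo
echo hello > /tmp/demo/oldfile
mount -t tmpfs tmpfs /tmp/demo   # mount something over the directory
ls /tmp/demo                     # oldfile is hidden now
umount /tmp/demo
ls /tmp/demo                     # oldfile is visible again
[/code]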

How can you run into such a situation? Two typical occurrences come to mind:

  1. You start without a separate file system… all data is placed on the root FS, then later you decide to separate it into an extra file system. After creating the FS, you copy all the current data from /HANA_DATA to the new file system (temporarily mounted somewhere else, e.g. in /mnt) and then mount the new file system to /HANA_DATA. But since you only copied… the original files are still there on the root FS, taking up disk space.

  2. You indeed created the extra file system right from the start, but at some point in time you started your application while the extra file system wasn’t mounted (think of some error situation or manual action for whatever reason). The application sees the (empty) directory and starts creating files there… filling up the root FS. Upon some later reboot, the extra file system gets mounted again and the application again has access to the files on the extra file system - but the files written to the root file system are still there.

That’s why I suggested stopping the application and umounting those two file systems. If then, after umounting, you see files inside those directories, that’s where your disk space went :wink:

Regards,
Jens

I believe we are at point 2.
When I look into /etc/fstab, all my file systems are mounted at boot time:

/dev/disk/by-id/scsi-360050760409bf7181a83af4e237471e8-part2  swap               swap     defaults              0 0
/dev/disk/by-id/scsi-360050760409bf7181a83af4e237471e8-part3  /                  ext3     acl,user_xattr        1 1
/dev/disk/by-id/scsi-360050760409bf7181a83af4e237471e8-part1  /boot/efi          vfat     umask=0002,utf8=true  0 0
proc                                                          /proc              proc     defaults              0 0
sysfs                                                         /sys               sysfs    noauto                0 0
debugfs                                                       /sys/kernel/debug  debugfs  noauto                0 0
usbfs                                                         /proc/bus/usb      usbfs    noauto                0 0
devpts                                                        /dev/pts           devpts   mode=0620,gid=5       0 0
/dev/disk/by-id/scsi-360050760409bf7181a83af4e23747b4e-part1  /HANA_DATA/        ext3     acl,user_xattr        1 2
/dev/disk/by-id/scsi-360050760409bf7181a83af4e2374869f-part1  /HANA_LOG/         ext3     acl,user_xattr        1 2
10.0.0.15:/HANA                                               /mnt/share01       NFS      defaults              0

What Jens mentioned is that while you may have data on these mounted file systems, you may also have data in the original directories that now have other disks mounted over them. This is not superbly common, but it happens regularly enough that it’s definitely worth checking. For example, perhaps you start SAP HANA with a script that automatically mounts these two disks, but something goes amiss and the mount fails before the application starts, so SAP runs, fills the local / disk a bit, and then later you reboot and things come up normally. The data written into the /HANA_DATA directory is now unavailable to you, but it is still taking up disk space. ‘df’ is smart enough to see it, but ‘du’ has no way to reach it and count it.

To find this, stop SAP HANA, unmount these mount points, and then check the size of the remaining directories. If they are completely empty, that’s good. If not, clean them out.


Good luck.


Hi GN,

[QUOTE=gniecka]I believe we are at point 2.
When I look into /etc/fstab, all my file systems are mounted at boot time[/QUOTE]

which by itself says nothing… are you in a position to stop SAP (you were asking about rebooting the server, so it ought to be an option) and run the check I suggested?

Regards,
Jens

Jens,

this is a production environment, so I have to do it after hours - I will surely update this thread with my results.

Regards
GN

Hi GN,

[QUOTE=gniecka;19973]Jens,

this is a production environment, so I have to do it after hours - I will surely update this thread with my results.

Regards
GN[/QUOTE]

ok, I understand. A suggestion to minimize downtime:

(stop SAP)

umount /HANA_DATA

mv /HANA_DATA /HANA_DATA.old; mkdir /HANA_DATA

mount /HANA_DATA

umount /HANA_LOG

mv /HANA_LOG /HANA_LOG.old; mkdir /HANA_LOG

mount /HANA_LOG

(start SAP)

those two new extra directories won’t cost you much disk space and will serve as guaranteed empty mount points - and afterwards, you can take your time to analyze what’s in /HANA_DATA.old and /HANA_LOG.old without any hurry.
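Afterwards, a quick check tells you how much was hiding underneath (a sketch - only remove anything once you are sure it is stale):

[code]
du -sh /HANA_DATA.old /HANA_LOG.old
[/code]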

Regards,
Jens

Jens,

now I’m testing this disk space usage: I’m copying something big to /HANA_DATA/,
/HANA_DATA is growing with the amount of data being copied, but /dev/sda3 (/) remains at the same level, so I assume that now everything goes to /dev/sdb1 (desired behaviour) instead of to the local file system (/)…
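
A convenient way to watch both file systems live during such a copy test (just a small sketch):

[code]
watch -n 5 'df -h / /HANA_DATA'
[/code]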