rancher-NFS and databases

Now I know that databases have always had a stigma about being in a Docker container, and even more so about being on an NFS share. But I keep trying it out just to punish myself with lots of pain, and I find myself wondering if there are any other masochists out there successfully running these sorts of things. Anyhow, I have a MongoDB that likes to die intermittently with no regularity whatsoever (my favorite sort of problem). It's not in production; it's just hanging out producing errors periodically, and the restart logs look like this:

This is the lowest numbered contianer.. Handling the initiation.
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten] MongoDB starting : pid=14 port=27017 dbpath=/data/db 64-bit host=MongoDB-mongo-cluster-1
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten] db version v3.4.6
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten] git version: c55eb86ef46ee7aede3b1e2a5d184a7df4bfb5b5
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1t  3 May 2016
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten] modules: none
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten] build environment:
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten]     distmod: debian81
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten]     distarch: x86_64
2017-08-12T13:03:22.554+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2017-08-12T13:03:22.555+0000 I CONTROL  [initandlisten] options: { replication: { replSet: "rs0" } }
2017-08-12T13:03:22.555+0000 W -        [initandlisten] Detected unclean shutdown - /data/db/mongod.lock is not empty.
2017-08-12T13:03:22.567+0000 I -        [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-08-12T13:03:22.568+0000 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2017-08-12T13:03:22.568+0000 I STORAGE  [initandlisten]
2017-08-12T13:03:22.568+0000 I STORAGE  [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2017-08-12T13:03:22.568+0000 I STORAGE  [initandlisten] **          See http://dochub.mongodb.org/core/prodnotes-filesystem
2017-08-12T13:03:22.568+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1065M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-08-12T13:03:23.344+0000 I STORAGE  [initandlisten] Starting WiredTigerRecordStoreThread local.oplog.rs
2017-08-12T13:03:23.345+0000 I STORAGE  [initandlisten] The size storer reports that the oplog contains 138731 records totaling to 86143606 bytes
2017-08-12T13:03:23.345+0000 I STORAGE  [initandlisten] Sampling from the oplog between Jul 28 20:38:38:1 and Aug 12 13:02:21:1 to determine where to place markers for truncation
2017-08-12T13:03:23.345+0000 I STORAGE  [initandlisten] Taking 50 samples and assuming that each section of oplog contains approximately 27346 records totaling to 16980221 bytes
2017-08-12T13:03:23.384+0000 I STORAGE  [initandlisten] Placing a marker at optime Jul 29 21:00:06:17
2017-08-12T13:03:23.385+0000 I STORAGE  [initandlisten] Placing a marker at optime Aug  1 19:38:55:1
2017-08-12T13:03:23.385+0000 I STORAGE  [initandlisten] Placing a marker at optime Aug  4 22:37:23:1a1
2017-08-12T13:03:23.385+0000 I STORAGE  [initandlisten] Placing a marker at optime Aug  6 08:44:23:1
2017-08-12T13:03:23.385+0000 I STORAGE  [initandlisten] Placing a marker at optime Aug 11 18:24:15:1
2017-08-12T13:03:24.741+0000 I CONTROL  [initandlisten]
2017-08-12T13:03:24.741+0000 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-08-12T13:03:24.741+0000 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-08-12T13:03:24.741+0000 I CONTROL  [initandlisten]
2017-08-12T13:03:24.741+0000 I CONTROL  [initandlisten]
2017-08-12T13:03:24.741+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2017-08-12T13:03:24.741+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2017-08-12T13:03:24.742+0000 I CONTROL  [initandlisten]
2017-08-12T13:03:24.742+0000 I CONTROL  [initandlisten] ** WARNING: soft rlimits too low. rlimits set to 24462 processes, 1000000 files. Number of processes should be at least 500000 : 0.5 times number of files.
2017-08-12T13:03:24.742+0000 I CONTROL  [initandlisten]

Does anyone else have these sorts of issues? I seem to remember hitting the same kind of problems running Postgres in Rancher (I now run an external Postgres).

EDIT: Some additional info I think is pertinent: I'm running that MongoDB on rancher-nfs, which by all other accounts is healthy.

http://rancher.com/microservices-block-storage/ <-- this is very interesting and may be an alternative

Hi @thoth,

The unclean shutdown, the recovery messages, and the XFS recommendation all look normal if mongod crashed; it isn't clear that storage is actually the cause of your issue.
Usually NFS is not recommended for DB services; performance is not the best, and iSCSI or FC shared LUNs are preferable. But it's strange that NFS would make MongoDB outright crash. Could you please post the logs from right after a MongoDB crash?
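
If the Rancher UI has already lost the output by the time you look, you can usually still pull the previous logs straight from the host with the Docker CLI, assuming the container was restarted in place rather than recreated. The container name is just whatever docker ps reports for your mongo container on that host:

docker ps | grep -i mongo                          # find the mongo container name on that host
docker logs --timestamps --tail 200 <mongo-container>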

Anyway, have you applied the NFS parameters recommended by MongoDB?
https://docs.mongodb.com/manual/administration/production-notes/#remote-filesystems
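
For reference, on a plain NFS mount those recommended options end up looking something like this (server name and export path here are just placeholders):

mount -t nfs -o bg,nolock,noatime nfs-server:/export/mongo /mnt/mongo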

Thanks for the reply @rawmind,

And the link was definitely a good call; I had not applied those parameters when setting up rancher-nfs. These are now my mount options when creating the NFS service:

bg,nolock,noatime
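
For anyone who wants to test the same options outside of the rancher-nfs driver, Docker's plain local NFS volume support takes them too; something along these lines should work (server name and export path are placeholders again, not my setup):

docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=nfs-server,bg,nolock,noatime \
  --opt device=:/export/mongo \
  mongo-data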

Of note, I had to kill off all containers using NFS as well. Mongo does appear to be stable at the moment, and nothing else is complaining about the added params. I'll certainly post back with any more issues, and also to report success if I have no further complications.

cheers!