Persistent Storage with Docker in Production - Which Solution and Why?


#1

Hello, I’ve recently started working for a company that wants to break their monolithic SaaS application up into containerized microservices. I’m having a hard time grasping a fundamental part of persistent storage, though. Why are there so many different competing platforms? Portworx, REX-Ray, StorageOS, Flocker, Infinit, etc.

My Questions

    1. Why wouldn’t someone simply spin up an NFS server and use a hierarchical folder structure there as their storage backend? What gains do you get when using one of these tools?
    2. How dangerous is it to use something like this with Docker? What are the common causes of catastrophic data loss in a Docker-based environment?
    3. What persistent storage solution would you recommend and why? My company operates a SaaS platform. The data payloads are small (5 KB to 100 KB). Data processing is small to medium in resource consumption. Overall volume is medium but continues to grow. We’re hoping to move our monolithic application completely to the cloud as separate containerized microservices, including our data warehouse.
    4. Somewhat unrelated, but it ties in. What are the strengths of using Kubernetes as an orchestrator as opposed to Rancher/Cattle? Isn’t Kubernetes over-engineered for a small-to-medium-sized platform? Are there any strengths to using Kubernetes in Rancher aside from the one-click installation?

Thank you for the insight. Sorry for the naivety. I welcome all documentation and supplemental reading material.


#2

The “Holy Grail” of Cloud Native is letting go of both local and remote storage and being resilient to exactly the scenarios you described. At this point in time (you wrote this in 2017, half a century ago in Docker time) it is wise to look at a resilient K8s stack with underlying nodes that handle the storage.

NFS is local remote storage; there is no sane way to duplicate the data across all the nodes scattered on the Internet IMHO.

I found your question while researching how to build a stack that does just that.
It is based on Rancher 2.0 alpha, K8s 1.8 and StorageOS.

When nodes, containers and storage are independent and can be destroyed and created without losing policies, coherency and responsiveness, I’ll let you know. Should be 2019 by then (?)

I feel the resiliency of a K8s controller is one important aspect; the maturity of Rancher 2.0, with its groups, policies and nice UI, is the other.


#3

Hey twobombs,
I’m currently attempting to get StorageOS set up with my Rancher 2.0 cluster as well. Have you had any luck with it? I’m not sure how to enable mount propagation on the kubelet/API server, since they’re run by Rancher.


#4

Go Longhorn :) StorageOS was a distraction at best :)


#5

Ha, one year later and we’re no closer to finding the answer here, although we have left Rancher in favor of building vanilla K8s clusters with kubeadm instead.

Last I knew the Longhorn project wasn’t under active development.


#6

@Thomas_Zimmerman would you go with StorageOS or Longhorn? I’m setting up Rancher 2.0 because I want to move away from Rancher 1.6 + Cattle, which will soon reach end of support.


#7

I wouldn’t use either, but I operate in the cloud. I think Longhorn is dead. Also, I would suggest you build your own cluster with kubeadm or a cloud provider instead of using Rancher.

https://kubernetes.io/docs/concepts/storage/#types-of-volumes


#8

Thx @Thomas_Zimmerman! I am stuck with bare metal for a while. Longhorn seems to be active again: https://github.com/rancher/longhorn/releases. So is StorageOS, hence my hesitation.

Why were you unsatisfied with Rancher?


#9

No, not dissatisfied. I actually interviewed with them once upon a time. I just think that kubeadm is a better solution for bootstrapping your own cluster, since you won’t have the overhead. Plus, if you use Rancher, you’ll be locked into supported versions of K8s rather than having the newest available version like you would with kubeadm.

A lot of what made Rancher relevant sorta disappeared with Cattle. At least in my opinion.


#10

Longhorn is not dead. RKE is our analog of kubeadm. Rancher uses it internally to create clusters but it is its own standalone tool. Bleeding edge is not something our customers generally need, but you can run any images you want if you don’t want support from us.


#11

Thx @vincent! I really want to stay with Rancher, as we are happily using Cattle with GitLab in production now. So what you’re saying is that Longhorn is alive and ready to be battle-tested in a staging environment?