@vincent So, in talking with my board… I guess my question is this as well: how much would it cost to get support to get this working? If a support contract will get these things resolved faster, I’m very willing to talk about it.
FYI… the 1.2.1 production version does not work either and exhibits the same behavior. The UI does not boot, and the logs look like the ones above (no point in posting new ones).
Haven’t heard from them since Friday, so I’m assuming a case of
People who have no problems don’t come to GitHub or the forum and tell you how awesome things are. There are many, many more installations (even just among the opt-ins I can see) than people commenting on issues/forum posts.
I’ll update. Sorry. Yes, we have been able to do 1 upgrade successfully on the new system (1.2.1 with upgraded hardware). We are doing another one today, and if that goes, it does look like it’s solved.
Sorry for the delay in response. I came down with the flu and was out a couple of days.
So, the net of this is that with a machine that has 64GB RAM, solo… and on 1.2.1, we were able to deploy a few times successfully. However, for the last few days, everything got VERY erratic, and now Rancher is not accepting connections.
This is the machine. I think it’s safe to say this is not a resource problem on the machine itself:
And, we are down again, because this has brought down at least one of our processes… not necessarily that it’s down, but it’s not reachable via DNS, it seems.
I would appreciate any advice at this point. The “failure” was slower… but still there… just like previous times…
We have rebuilt this from scratch again, and are coming back up.
The thing we realized, though, is that while the MySQL database is 5.2GB, the binlog is producing about 150GB per day. So, we modified our MySQL to reduce that…
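For anyone wanting to check for the same thing, here is a minimal sketch for measuring how much disk the binlog files take up. It is not what we ran. The function names, the data directory path, and the `mysql-bin` basename are all assumptions for illustration; the binlog basename depends on your `log_bin` setting.

```python
import re
from pathlib import Path


def binlog_usage(data_dir, basename="mysql-bin"):
    """Return (file_count, total_bytes) for binlog files in data_dir.

    Assumption: binlog files are named "<basename>.<numeric sequence>",
    e.g. mysql-bin.000042. The matching .index file is skipped.
    """
    pattern = re.escape(basename) + r"\.\d{6,}"
    count = 0
    total = 0
    for entry in Path(data_dir).iterdir():
        if entry.is_file() and re.fullmatch(pattern, entry.name):
            count += 1
            total += entry.stat().st_size
    return count, total


def daily_ratio(binlog_gb_per_day, database_gb):
    """How many times the database size is written to the binlog per day."""
    return binlog_gb_per_day / database_gb


if __name__ == "__main__":
    # Numbers from the post above: 150GB/day of binlog vs. a 5.2GB database.
    print(f"binlog/day is about {daily_ratio(150, 5.2):.1f}x the database size")
    # Hypothetical path; adjust to your own data directory.
    # print(binlog_usage("/var/lib/mysql"))
```

With the numbers above, the binlog writes roughly 29 times the database size every day. MySQL has retention settings for this (`binlog_expire_logs_seconds` in 8.0, `expire_logs_days` in older versions); which one applies depends on your version.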
Hopefully this was the problem, and not the server itself.
Things are stable right now. So, so far, looking better.
Note that we made a BIG change to drop our image sizes to below 1GB. Delete all your .git repo stuff (big size shrink), all extraneous libraries, use an Alpine-based image, ensure logs are deleted, etc.
I’ll update in a few days. We’ve actually been doing some development with the devops env stable, and we haven’t really wired it all back in. So, I’ll update when we start regularly pushing again. But, for the pushes we do… seems to be working.
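To see where the bulk is before trimming, something like the sketch below can help. It walks a build directory and totals the bytes under `.git`, in `*.log` files, and in everything else. The function name and the three buckets are made up for illustration; this is not the tooling we used.

```python
import os


def context_breakdown(root):
    """Sum file sizes under root into three buckets: git, logs, other."""
    sizes = {"git": 0, "logs": 0, "other": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        # A file counts as git data if any parent directory is named ".git".
        rel_parts = os.path.relpath(dirpath, root).split(os.sep)
        in_git = ".git" in rel_parts
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            size = os.path.getsize(path)
            if in_git:
                sizes["git"] += size
            elif name.endswith(".log"):
                sizes["logs"] += size
            else:
                sizes["other"] += size
    return sizes


if __name__ == "__main__":
    # Report the breakdown for the current directory in megabytes.
    for bucket, size in context_breakdown(".").items():
        print(f"{bucket:>5}: {size / 1_000_000:.1f} MB")
```

If `git` or `logs` dominate, listing those paths in a `.dockerignore` file keeps them out of the build context in the first place.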
Basically, this stuff is not simple and requires a thorough understanding. “She’ll be right mate” is not going to cut it.
This has been my biggest takeaway from getting involved in the “container way”.
It might be easy to get started; however, once real work becomes a thing, unless you are actually familiar with everything (and there is always something else to figure out), you’re gonna have a bad time.
My biggest focus right now is sorting out the monitoring of hosts, containers and apps, as well as the associated outputs that end up on disk, especially log files. Monitoring and persistent data are massive topics on their own and become essential once containers get involved.
Standing up a WordPress blog that has zero posts or traffic is easy. However, running a WordPress blog that gets significant traffic, and that must stay online because it is the source of all revenue that pays the 10+ people employed, well, that is something completely different.
Glad you guys have been able to narrow things down, and seem to be having success.