I’ve read several articles on setting up Rancher in AWS, the official documentation, the bug reports on “exceeded 60 retries”, etc. None of the articles or instructions is accurate or complete enough to walk someone through the whole process without hitting potholes.
I’ve already spent two days on this. I’m trying to help a customer whose Rancher-generated K8s cluster has certificate issues, and I’m trying to reproduce the problem myself. I don’t have much time to reproduce it, and every minute lost to these docs means $$$ lost in sales.
Can someone point me to documentation that is super accurate and detailed so I don’t spin on silly issues?
The answer to anything in AWS is “it depends” because there are 100 ways to do everything. If your problem is with certificates on the k8s cluster, then the instances are already running and registered, so machine provisioning (“exceeded 60 retries”) is irrelevant, and the only thing left that is specific to the provider is the networking config.
Is there authoritative, accurate, up-to-date documentation that can walk me through the process end to end? What I see is incomplete or inaccurate.
No, there is probably no magic unicorn documentation you haven’t already found. Other than machine drivers to deploy instances, we do little that is cloud-specific.
It’s not a unicorn to have clear documentation with each step detailed and prerequisites clearly identified. Is this really how Rancher responds? Answering a request for accurate documentation by calling it a unicorn? Really? Is that what you want to be known for?
You’ve already seen the documentation we have. Most users seem to find it adequate, but you already know it doesn’t meet your criteria. Everything, including the documentation, is open source, so you can certainly put in PRs to improve it. But the problem you’re currently describing in the other thread is with Docker in general and probably has nothing to do with Rancher at all.
So imagine a production issue for a SaaS company at 3 am. The steps need to work flawlessly and be tested regularly. Following the instructions as written leaves lots of room for interpretation. There are indeed 100 ways to do everything in AWS, but if there were one approach documented clearly end to end, we could always translate it into Terraform + Chef, CloudFormation, or something else easily enough.
It might have something to do with Rancher, it might not. But without step-by-step instructions that are tested daily, or at least weekly, there is room for mistakes. Having someone other than yourself run through your steps is a great way to surface issues in documentation or communication. Documentation is only as good as its author, which is why it should be reviewed by a broader audience. Looking at the support forums reveals several lost opportunities for clarifying the documentation, where a suggestion resolved an issue for someone but the follow-through to update the docs never happened.
As for RCA feedback, it might be nice if the docs were updated to account for each of the prior tickets Rancher has answered.