How fast do your web pages load on the client side, especially the portal or login pages? And how do you detect unexpected 302, 404 or 5XX responses? They can hurt your user experience, or even break functionality. Slow or broken page loads are always bad.
Good, if you have examined that carefully. Most likely, though, it took multiple unpleasant manual steps. The question is: how can you be sure it stays good, even with the endless code changes?
Automate the GUI tests? You would have to learn lots of GUI automation skills and deal with a complicated setup. Yes, I certainly believe you are capable of this, my friend. But you might not have enough time.
Good news! With container technology, things are much easier now. Check out this short post. And give it a try now!
The more projects you handle, the more servers you manage. But when you ssh to servers of different projects, are you using the same private key?
And how secure do you feel about this? Let’s imagine. One day, your powerful private key gets compromised somehow. Boom! All your servers and all your projects are in danger.
Check out this post. And improve things for all your projects, in just five minutes!
People may start their Elasticsearch cluster with very few shards. Or even start with just 1.
WATCH OUT! You might be in big trouble when your data grows much bigger.
Like slow query performance. Endless service crashes with out-of-memory issues. Even worse, adding more VMs won’t help!
And the cure? You have to re-index the data. But that comes at a cost you would rather not pay. Trust me, it hurts. Badly.
As DevOps/Ops, you maintain DB instances or RAM-intensive services. You see OOM issues occasionally, don’t you? Yes, the scary Out-Of-Memory issues.
Nobody enjoys OOM issues. When one does happen, what should be checked? More importantly, how do you monitor for OOM issues, and get alerts before they actually happen?
Here are some of my thoughts. Take a look and discuss with me!
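One way to get a warning before an OOM actually hits is to watch how much memory is still available. The snippet below is only a rough sketch, not necessarily what the post itself proposes: it assumes a Linux host whose /proc/meminfo exposes MemTotal and MemAvailable, and the 10% threshold and the function names are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def memory_pressure(text, threshold=0.10):
    """Return (available_ratio, alert); alert is True below the threshold.

    The 10% default is an arbitrary illustration, tune it per service.
    """
    info = parse_meminfo(text)
    ratio = info["MemAvailable"] / info["MemTotal"]
    return ratio, ratio < threshold


# Hand-written sample in the /proc/meminfo layout, not real measurements.
SAMPLE = """MemTotal:        8000000 kB
MemFree:          200000 kB
MemAvailable:     400000 kB
"""

if __name__ == "__main__":
    ratio, alert = memory_pressure(SAMPLE)
    print(f"available ratio: {ratio:.2f}, alert: {alert}")
```

On a real Linux server you would pass in the content of /proc/meminfo instead of SAMPLE, and run the check from cron or your monitoring agent.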
Say you have issued a command on your servers. Typically the command might either back up something or apply a critical hot fix.
Surely you know the start time of the process. But when will it end? And how can you find the elapsed execution time once the process has already been started?
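For the elapsed time of an already running process, one common approach is to ask ps for its etime column, which is printed as [[dd-]hh:]mm:ss. Here is a small sketch that turns that format into seconds. It assumes a Unix-like system with ps available, and the helper names are hypothetical.

```python
import subprocess


def etime_to_seconds(etime):
    """Convert ps etime output ([[dd-]hh:]mm:ss) into a number of seconds."""
    etime = etime.strip()
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    # Pad missing leading fields (hours) with zeros.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def elapsed_seconds(pid):
    """Ask ps how long the given pid has been running (needs a Unix-like OS)."""
    result = subprocess.run(
        ["ps", "-o", "etime=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return etime_to_seconds(result.stdout)
```

For example, etime_to_seconds("01:02:03") gives 3723. Note this only tells you how long the process has been running so far, not when it will finish.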
Before deployment, people might need to provide several pieces of information. For example, which services to deploy on which nodes, which TCP ports the application endpoints listen on, etc.
Even a very careful person makes stupid mistakes! E.g., a wrong IP format, an invalid port, an unsupported OS version, a machine that doesn’t have enough RAM, etc.
These human errors may not only fail your deployments, but also cause unexpected damage to your existing envs. Sometimes they even mess up critical envs. So it’s better to enforce a pre-check before any update.
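To make the pre-check idea concrete, here is a minimal sketch that validates a few of the inputs mentioned above. The node layout (name, ip, port, ram_mb) and the 1024 MB minimum are assumptions for illustration only, since a real deployment would have its own input format.

```python
import ipaddress


def check_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(str(value))
        return True
    except ValueError:
        return False


def check_port(value):
    """Return True if value is an integer in the valid TCP port range."""
    try:
        port = int(value)
    except (TypeError, ValueError):
        return False
    return 1 <= port <= 65535


def precheck(nodes, min_ram_mb=1024):
    """Collect all errors instead of stopping at the first one."""
    errors = []
    for node in nodes:
        name = node.get("name", "<unnamed>")
        if not check_ip(node.get("ip", "")):
            errors.append(f"{name}: invalid ip {node.get('ip')!r}")
        if not check_port(node.get("port")):
            errors.append(f"{name}: invalid port {node.get('port')!r}")
        if node.get("ram_mb", 0) < min_ram_mb:
            errors.append(f"{name}: needs at least {min_ram_mb} MB RAM")
    return errors
```

Run precheck() before touching any env, and abort the deployment when the returned list is not empty. Reporting every problem at once saves people from fixing mistakes one round trip at a time.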
People occasionally change critical config files on servers by hand. For example, /etc/hosts, /etc/hostname, etc.
As an experienced operator, you will remember to back up before making any changes. Right? What would you do? cp /etc/hosts /etc/hosts.bak.
But is that good enough?
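One weakness of a fixed .bak name is that the second backup silently overwrites the first. A possible improvement, sketched below, is to put a timestamp into the backup name. The naming scheme and the backup_file helper are my own assumptions here, not necessarily the approach the post ends up with.

```python
import os
import shutil
import time


def backup_file(path, backup_dir=None):
    """Copy path to a timestamped backup and return the backup's path.

    By default the backup lands next to the original file.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target_dir = backup_dir or os.path.dirname(os.path.abspath(path))
    target = os.path.join(target_dir, f"{os.path.basename(path)}.{stamp}.bak")
    if os.path.exists(target):
        # Two backups within the same second: refuse rather than overwrite.
        raise FileExistsError(target)
    # copy2 also preserves modification time and permission bits.
    shutil.copy2(path, target)
    return target
```

Calling backup_file("/etc/hosts") before an edit leaves one backup per change, so you can always go back more than one step.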
Using Docker, deployments are more reliable and faster than ever. But how about the Docker image builds? Containers are no silver bullet. They shift installation instability from the deployment cycle to the image build cycle.
I would expect a general solution for verifying all Docker image builds. And it should work across different projects. This means less time and effort. And certainly, it saves money!
Following Git workflows, there is a branch called activesprint, or develop. It is the release candidate. Most active branches should be based on it.
The team needs to be notified whenever a new activesprint branch has been created. To lower the communication effort, we can automate the detection process and get Slack notifications.
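The detection itself can be as simple as comparing the branch list you saw last time with the current one. Below is a rough sketch under a few assumptions: the branch names start with "activesprint", you collect the branch lists yourself (for example from git), and the message goes to a Slack incoming webhook as a JSON body with a "text" field. The function names are hypothetical.

```python
import json


def new_branches(known, current, prefix="activesprint"):
    """Return branches present in current but not in known, matching prefix."""
    return sorted(
        name for name in set(current) - set(known) if name.startswith(prefix)
    )


def slack_payload(branches):
    """Build the JSON body for a Slack incoming webhook message."""
    text = "New branch detected: " + ", ".join(branches)
    return json.dumps({"text": text})
```

A scheduled job would keep the known list in a file, call new_branches() on every run, and POST slack_payload() to the webhook URL only when the result is not empty.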