After almost half a century, I'm still doing it...

[email protected]

Because I plain forgot I was remote. It's as simple and as stupid as that.

[email protected]

In my defense, I just installed the machine. I was configuring it from home after hours.

[email protected]

Fair enough. I've done worse in my time as a keyboard jockey.

[email protected]

I knew a guy who did this and had to fly to Germany to fix it because he didn’t want to admit what he’d done.

[email protected]

There, but for the grace of god....

[email protected]

This hits....

[email protected]

after hours

I've configured PAM to not let me log in remotely after hours, because I just know that someday I'll want to fix "just this tiny thing" and I'll break production because I'm too tired. I clearly need protection from myself, and this is one slice in Dr. Reason's Swiss cheese model.

Don't let the people drag you down, this happens to all of us.

[email protected]

Don't be shitty.

[email protected]

I have a failsafe service for one of my servers: it pings the router, and if it hasn't reached it once in an entire hour, it reboots the server.

This won't save me from all mistakes but it will prevent firewall, link state, routing and a few other issues when I'm not present.
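
For anyone who wants to try something similar, here is a minimal sketch of that kind of watchdog. It is not the actual service described above; the router address, the Linux-style `ping` flags, and the `systemctl reboot` call are all assumptions, and `watchdog`, `router_reachable` and `reboot_host` are made-up names.

```python
import subprocess
import time


def router_reachable(host):
    """Return True if one ping to host succeeds (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def reboot_host():
    """Reboot the machine (assumes systemd and root privileges)."""
    subprocess.run(["systemctl", "reboot"], check=False)


def watchdog(host, limit_s=3600, interval_s=60,
             probe=router_reachable, reboot=reboot_host,
             clock=time.monotonic, sleep=time.sleep):
    """Reboot once the router has been unreachable for limit_s seconds."""
    last_ok = clock()  # treat startup as the last known-good moment
    while True:
        if probe(host):
            last_ok = clock()
        elif clock() - last_ok >= limit_s:
            reboot()
            return
        sleep(interval_s)
```

Calling `watchdog("192.0.2.1")` from a service unit would run it with the defaults (the address is a placeholder). The probe, reboot, clock and sleep functions are parameters only so the loop can be tried out without actually rebooting anything.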

[email protected]

Time to set up a console server so that you don't do that again.

[email protected]

I hope you don't admin any mission critical servers. That's a first year mistake.

[email protected]

This is a server I was setting up. It's not doing anything useful at all at the moment, hence the lax work practice. The only reason I drove back to work is because it's needed tomorrow and I wanted to finish setting it up tonight.

[email protected]

Until they have to troubleshoot the console server ..

[email protected]

Then set up a super console server. lol

[email protected]

Did this once on a router in a datacenter that was a flight away. Have remembered to set the reboot in future command since. As I typed the fatal command I remember part of my brain screaming not to hit enter as my finger approached the keyboard.

[email protected]

Have remembered to set the reboot in future command since

That's not a bad idea actually. I'll have to reuse that one.
Thanks!

[email protected]

You’d think you’d learn from your mistakes

Yes, that's what you'd think. And then you'll sit with a blank terminal once again after making some trivial mistake yet again.

A friend of mine developed a habit (working at a decent-sized ISP 20+ years ago) of scheduling a reboot 30 minutes out on everything, no matter what he was about to do. The hardware back then (I think it was mostly Cisco) had a 'running config' and a 'stored config', which were two separate instances. Log in, set up the scheduled reboot, do whatever you're planning to do, and if you mess up and lock yourself out, the system will come back up on the previous stored config after a while and you can try again without repeating the mistake. Rinse and repeat.
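
On a Linux box the same habit can be sketched roughly like this, assuming the standard `shutdown -r +N` and `shutdown -c` commands are available and the script runs as root; on Cisco IOS the usual equivalents are `reload in 30` and `reload cancel`. `reboot_safety_net` is a made-up name, not an existing tool. As with the running/stored config split, it only helps if the risky change is not persisted across a reboot.

```python
import subprocess
from contextlib import contextmanager


def _run(cmd):
    """Run a command and raise if it fails."""
    subprocess.run(cmd, check=True)


@contextmanager
def reboot_safety_net(minutes=30, run=_run):
    """Arm a scheduled reboot; disarm it only if the block finishes cleanly.

    If the block raises, or the session dies because you locked yourself
    out, the cancel never runs and the reboot still fires.
    """
    run(["shutdown", "-r", f"+{minutes}"])  # arm the dead man's switch
    yield
    run(["shutdown", "-c"])  # reached only when no exception was raised
```

Used as `with reboot_safety_net(30): ...` around the risky change, with a final check inside the block that you can still reach the machine before it returns.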

And, personally, I think that's one of the best ways to differentiate actual professionals from the 'move fast and break things' group. Once you've locked yourself out of a system literally halfway across the globe too many times, you'll eventually learn to think about the next step and failovers. I'm not that much of a network guy, but I have shot myself in the foot enough that whenever there's dd, mkfs or something similar on the root shell I automatically pause for a second to confirm the command before hitting enter.

And while experience teaches you how to avoid the pitfalls, the more important part (at least for me) is to think ahead. The constant mindset of thinking about processes, connectivity, what you can actually do if you fuck up and so on becomes a part of your workflow. Accidents will happen, no matter how much experience you have. The really good admins just know that something will go wrong at some point in the process, and they build stuff to guarantee that when you fuck things up you still have a way in to fix it instead of calling someone 6 timezones away in the middle of the night to clean up your mess.

[email protected]

We've all been there. If you do this stuff for a living, you've done that way more than once.

[email protected]

Every network engineer must lock themselves out of a node at some point; it is a rite of passage.

[email protected]

It's console servers all the way down (up?)

agnos.is Forums
