Engineering Efficiency Explained | Developer Experience

March 3rd, 2021

How to get started in DevOps

"What would be a path to get into DevOps? Would you start with a software engineering or a backend job then transition to DevOps?" I guess the first question I would ask is, what does DevOps mean to you? Are you wanting to become an operations person? Are you wanting to be someone who writes the scripts that causes all the things to happen? Or are you more of a product dev that wants to be more involved in production and shipping code? I think that's probably one of the first places you start with DevOps is, what even is it? Because, people mean different things.

In fact, at my last company, there was a role in a department called DevOps, which makes no sense given my understanding of what DevOps is. DevOps, from what I understand, is a philosophy, an approach to bringing development and operations together. It's taking the things that operations does and using development practices to automate them and make them more self-service, and it's exposing developers more to the operations side by giving them more control over how their application ships to production and operates in production.

So, I don't know how you could have a role in a philosophy, but I think what people usually mean is, how do you get a job on an operations team that follows DevOps practices? I think that's what you mean.

Understand the impact of your code

There are different ways you can go about it. If you're already a software developer, probably your first step is to start going beyond just writing features and understanding how your application runs in production. That means getting familiar with tools like Datadog, or New Relic, or CloudWatch metrics, or logging systems, so that you understand what's going on in production: how your code is scaling, how it's not, how it's affecting CPU, memory, and resource usage, how peak loads happen, and how you could have one request that actually ends up hurting the database because it causes hundreds or thousands of calls to the database. It's understanding the impact of your code. I think that's the first step into operations.
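
To make that a little concrete, here's a minimal sketch of pulling a CPU metric out of CloudWatch with boto3. The AWS/EC2 namespace and metric name are standard, but the instance ID and time window are hypothetical placeholders, so treat this as the shape of the exercise rather than a recipe for your stack.

```python
# Minimal sketch: pull the last hour of average CPU for one instance from CloudWatch.
# The instance ID below is a hypothetical placeholder -- swap in your own.
from datetime import datetime, timedelta, timezone

import boto3  # AWS SDK for Python

cloudwatch = boto3.client("cloudwatch")

end = datetime.now(timezone.utc)
start = end - timedelta(hours=1)

response = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=start,
    EndTime=end,
    Period=300,             # one data point per five minutes
    Statistics=["Average"],
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(f"{point['Timestamp']:%H:%M} CPU avg: {point['Average']:.1f}%")
```

Even a throwaway script like this forces you to learn which metrics your service actually emits and what "normal" looks like for them, which is the real point of the first step.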

Think in DevOps terms

The second step is to actually start automating some of the things around that. Deployment is a common example. Usually when people deploy code, they might run a script on their local machine, they might SSH into a box and run something, or they might go through a big checklist. So, if you want to get involved in DevOps as a developer, you should look at those manual processes you do around delivering your code and find a way to automate them. That could be through a continuous integration server, like CircleCI or Jenkins, or it could even just mean creating a Python script that will SSH into this box, copy this artifact over here, publish it to this repository, restart the service by hitting a webhook API, something like that.
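
As a rough sketch of what that kind of script might look like, here's a hedged example using paramiko and requests. The host name, artifact path, and webhook URL are all hypothetical stand-ins for whatever your own setup uses.

```python
# Sketch of a hand-rolled deploy script: copy an artifact to a box over SSH,
# then restart the service by hitting a webhook. Host, paths, and webhook URL
# are hypothetical placeholders.
import paramiko   # SSH/SFTP client
import requests

HOST = "app-server-01.example.com"
ARTIFACT = "build/myservice-1.2.3.jar"
REMOTE_PATH = "/opt/myservice/myservice.jar"
RESTART_WEBHOOK = "https://ops.example.com/hooks/restart/myservice"

def deploy():
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(HOST, username="deploy")

    # Copy the freshly built artifact onto the box.
    sftp = ssh.open_sftp()
    sftp.put(ARTIFACT, REMOTE_PATH)
    sftp.close()
    ssh.close()

    # Restart the service through its webhook API.
    response = requests.post(RESTART_WEBHOOK, timeout=30)
    response.raise_for_status()

if __name__ == "__main__":
    deploy()
```

It doesn't matter whether this eventually lives in a CI server or stays a script in your repo; what matters is that the steps are now written down in code instead of in someone's head.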

But the point is, start thinking in DevOps terms, which is, "Hey, let's not do this manually. Let's find a way to automate this. Let's find a way to put this in code so that we can track changes in our process by simply looking at git commits." That's a big part of DevOps, and a subset of it that's now evolving is GitOps, where all the operations work you do should be reflected in code and auditable, and you can create pull requests and reviews and all that stuff.

My recommendation is to start with the code you work on daily, start to understand how it behaves in production, and look for operational bits that you do manually, such as deploying code, and find ways to automate them. Another one might be determining the health of your code. Right now, you might have to look at different graphs in your metrics to see that your CPU is high or your database is struggling. Well, why not automate that? Why not create a script that will pull down those values, compare them to known thresholds, and do that every so often? The script will then warn you when your code is unhealthy.
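
A hedged sketch of what that health-check script could look like is below. It assumes a hypothetical metrics endpoint that returns JSON with CPU and database latency numbers, and the threshold values are made up, so treat it as a shape rather than a recipe.

```python
# Sketch of a periodic health check: pull current metric values, compare them
# to known thresholds, and warn when something looks unhealthy. The metrics
# endpoint and the threshold numbers are hypothetical placeholders.
import time

import requests

METRICS_URL = "https://metrics.example.com/api/myservice/current"
THRESHOLDS = {
    "cpu_percent": 80.0,      # warn above 80% CPU
    "db_latency_ms": 250.0,   # warn above 250 ms database latency
}

def check_health() -> bool:
    metrics = requests.get(METRICS_URL, timeout=10).json()
    healthy = True
    for name, limit in THRESHOLDS.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            print(f"WARNING: {name} is {value}, above threshold {limit}")
            healthy = False
    return healthy

if __name__ == "__main__":
    while True:
        if check_health():
            print("Service looks healthy")
        time.sleep(60)  # re-check every minute
```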

That's a great way to automate it, and then you can go even farther and combine the two: you run a script that does a deployment, then you kick off the script that monitors health for, let's say, five minutes. If the service becomes unhealthy, then you kick off another script which rolls back the deployment. Now you're getting into really interesting, modern continuous delivery and continuous deployment practices, where you can really automate all those things that you would have done manually before. And so you've gone beyond being just a developer and become someone who is fully invested in delivering and owning that code, which is what DevOps is all about.
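
Tying the two together might look something like the sketch below. The deploy(), rollback(), and check_health() helpers are assumed to exist already, along the lines of the earlier sketches, and the module names are hypothetical placeholders; the five-minute window just mirrors the example above.

```python
# Sketch of deploy-then-watch-then-roll-back. The deploy(), rollback(), and
# check_health() helpers are hypothetical -- think of the earlier sketches --
# and the module names below are placeholders for wherever you keep them.
import time

from deploy_script import deploy, rollback   # hypothetical module
from health_check import check_health        # hypothetical module

MONITOR_SECONDS = 5 * 60   # watch the new version for five minutes
CHECK_INTERVAL = 30        # re-check health every 30 seconds

def deploy_with_rollback() -> bool:
    deploy()
    deadline = time.time() + MONITOR_SECONDS

    while time.time() < deadline:
        if not check_health():
            print("Deployment looks unhealthy, rolling back")
            rollback()
            return False
        time.sleep(CHECK_INTERVAL)

    print("Deployment still healthy after the monitoring window")
    return True

if __name__ == "__main__":
    deploy_with_rollback()
```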

"Or you Sleuth." No, it's an "and you Sleuth", because once you deploy that code to production, you need to track those deployments so that you know what went, you need to know what's inside of those deployments so when things go wrong, you can quickly diagnose the problem, and that's where Sleuth comes in.

More than a trendy word

If anyone says, "Hey, I have a DevOps job available," I think you really need to ask, "What do you mean by that?" Because they could mean, "Hey, you're on an operations team, you're going to be on call 24/7, you're going to get code handed to you from the developers, and you're going to be responsible for it." Which is not DevOps at all, but some people call it DevOps, thinking it's just a cool new word for operations. At other places it might be a platform team, where your job is to create internal applications that make operations more self-service for developers.

For example, at Atlassian, there's a platform called Micros. Micros is a platform-as-a-service type thing, where I can go to a web UI and say, "Create me a service," and this service will automatically be connected to a data source, it'll be monitored, it'll have logs, it'll be registered in a registry somewhere, it will have all this infrastructure around it, and I don't need to go and create all those things. I just get it by going through a wizard and saying, "Give me a new Java service, go." And it's just done. So, there's a team that builds and operates Micros, and in some ways you could call them a DevOps team, because they're creating software to help with operations, but it's for developers, so developers can build and own their code better.

So, that's also what it could mean, and in fact, if I had to pick the one I'm more interested in, it would be that: the people building platforms for the more product-focused teams. That's a cool one to do.

More examples

"Solving Linux system administration issues, sometimes it can mean setting up services in Amazon..." Right, that's another good point, which is, what is the target of the operations? And that changes wildly. For example, sometimes when people say, "Hey, I want to hire a DevOps engineer," what they really mean is "I want an AWS expert who can help my team deploy their software in AWS." But, Azure is a different system, Google has its own cloud, you could have a customer company that has their own data centers, that does co-locating, that does snowflake development or snowflake operations, which means each Linux machine is handcrafted and updated with something like Chef and is a little bit different than the other ones. The modern approach to operations, at least in the SAS world is to treat your servers as cattle, not as pets.

Basically, the idea is that in the pets model, each one of your servers is special and unique and you manage it individually. In the cattle model, you don't have names for all the cows, you just have cows, and it's a similar thing with servers. I have this service, I want to deploy it on five boxes, go. Then I want to deploy it on a sixth box, and now I want to upgrade it, so I'm going to tear down all six boxes, stand up six new boxes, and send traffic over there. You're just bringing them up and down non-stop.

"A catalyst for microservices as well." Well, cattle is a little bit orthogonal. Cattle versus pets is a different philosophy on how to manage your servers and your infrastructure. Microservices could be done with a pets model, where each machine is unique and you push your microservice to a machine called Obi Wan 03, or something like that. You could do it that way. You just probably wouldn't want to, because it would be really unscalable. The thing about microservices is you take the challenges of running a service and multiply that by a hundred, because you have a hundred different services, which means a hundred different things to log, a hundred different boxes for them to run on, a hundred different database connections, a hundred different everything. Oh, and each one of those is, of course, scaled even further. So, you're blowing any operational challenges up a hundred fold, if not a thousand fold.

The way to deal with that is by using a cattle approach where your servers are very uniform, there's no differences, and so that simplifies the process of adding a new microservice. So, yes, definitely when you're talking about microservices, you want your operations to take a cattle approach where everything is the same, and you can bring servers up and down without even thinking.

And in fact, if you go into a Kubernetes-type world, you don't even have to think about bringing servers up and down. You think about a job that you want to run, which is basically a Docker container, so to speak, that you want to execute, and you say, "Run five copies of it." Those five copies, you have no idea whether they're on five individual machines, or two on one machine and three on another; you don't even know or care. All you do is say, "Scheduler, go. Run this application at least five different times with these requirements," and it figures out the right place for it to go, which makes something like microservices much more feasible. If you tried to maintain a server per microservice, that would get real tedious real quick, not to mention it wouldn't scale, which is something you need to do.
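
As a rough illustration, here's a hedged sketch using the official Kubernetes Python client to ask the scheduler for five replicas of a container with some resource requirements. The image name, labels, and namespace are hypothetical, and in practice you'd more likely express this as a YAML manifest applied with kubectl; the point is just that you declare replicas and requirements, and the scheduler decides where they land.

```python
# Sketch: ask Kubernetes to run five replicas of a container with resource
# requirements, and let the scheduler decide which nodes they land on.
# Image name, labels, and namespace are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()  # use your local kubeconfig credentials

labels = {"app": "myservice"}

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="myservice"),
    spec=client.V1DeploymentSpec(
        replicas=5,  # "run this application at least five different times"
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="myservice",
                        image="registry.example.com/myservice:1.2.3",
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "250m", "memory": "256Mi"},
                        ),
                    )
                ]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```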
