Just as DevOps gives developers and operations teams a more unified way of working together, GitOps brings infrastructure, change and integration intelligence to modern cloud operations and keeps them running.
What is GitOps?
So what is GitOps and how can it be applied to modern cloud software engineering environments? At its core, GitOps isn’t a product, a technology or a singular service. Much like DevOps, it is better described as an operational framework.
GitOps was coined by Weaveworks founder Alexis Richardson but is now being further defined by a Cloud Native Computing Foundation (CNCF) working group co-founded by Amazon, Codefresh, GitHub, Microsoft and Weaveworks itself. The working group defines GitOps as consisting of declarative configuration, version control (i.e. Git), automated delivery and software agents in a closed loop.
Where DevOps is the coming together of Dev (developers) and Ops (operations) people to work in a more symbiotic, goal-sharing way, GitOps is a coming together of core DevOps elements (including version control, collaboration, compliance and Continuous Integration & Continuous Deployment (CI/CD) tooling) applied to infrastructure automation, which in this case is our higher-level Ops.
According to GitLab, GitOps = IaC + MRs + CI/CD. In this equation, IaC denotes Infrastructure as Code, MRs are Merge Requests through which new code is brought into a main (or trunk) branch… and, of course, CI/CD is defined above.
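To make the IaC part of that equation concrete, here is a minimal, hedged sketch using Pulumi’s Python SDK, which is just one of several IaC options (Terraform and Crossplane are others). The resource name, the tag and the assumption of configured AWS credentials are all illustrative, not prescriptive:

```python
"""A minimal Pulumi program: infrastructure declared as versioned code."""
import pulumi
import pulumi_aws as aws

# Declare the infrastructure we want, rather than scripting the steps
# needed to create it. This file lives in Git and is versioned like any
# other source code.
bucket = aws.s3.Bucket(
    "artifact-store",                      # logical name, illustrative only
    tags={"managed-by": "gitops-demo"},
)

# Export the bucket name so other stacks or pipelines can consume it.
pulumi.export("bucket_name", bucket.id)
```

The point is less about the specific tool and more about the shape of the workflow: the desired infrastructure is described in a file, and that file is reviewed, merged and versioned like application code.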
GitOps exists because, while DevOps drove the automation of infrastructure, the scripts involved can still become unwieldy and engineers often report that it is difficult to establish the current ‘state’ of the infrastructure, let alone revert to a known good state when things go awry.
With GitOps, the software engineer merges a pull request that brings the configuration to version "4", for instance. Software agents in the infrastructure then update the running configuration to match. Instead of versioning a pile of DevOps scripts with only a tenuous connection to what is actually running, the configuration and the runtime environment can be thought of as being versioned together. If something goes wrong, we simply roll the branch back to version "3" and let the software agents do their job.
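A rough sketch of that idea in Python, with a hypothetical config path, tag name and helper functions (a real agent such as Flux or Argo CD does far more): the desired configuration is read straight out of Git, and a rollback is nothing more than pointing the agent at an earlier revision.

```python
import json
import subprocess

def desired_config(ref: str) -> dict:
    """Read the declared configuration as it existed at a given Git ref.
    'config/app.json' is a hypothetical path in the config repository."""
    raw = subprocess.run(
        ["git", "show", f"{ref}:config/app.json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(raw)

def reconcile(running: dict, ref: str = "HEAD") -> dict:
    """Converge the running state towards the declared state.
    A real agent would talk to the cluster or cloud API here; this
    sketch only reports the drift and returns the target state."""
    desired = desired_config(ref)
    if running != desired:
        print(f"drift detected, applying configuration from {ref}")
    return desired

# Rolling back from version "4" to version "3" is just a Git operation:
# point the agent at the earlier, known-good revision (e.g. a tag) and
# let it converge the environment again.
# running_state = reconcile(running_state, ref="v3")
```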
According to the GitLab blog, “GitOps is used to automate the process of provisioning infrastructure. Similar to how teams use application source code, operations teams that adopt GitOps use configuration files stored as code (Infrastructure as Code). GitOps configuration files generate the same infrastructure environment every time it’s deployed, just as application source code generates the same application binaries every time it’s built.”
The name is almost a misnomer in a sense, i.e. GitOps has Git from the start and includes elements of DevOps, but it works on cloud infrastructure… so perhaps it should have been called GitCloudOps, or just GitCloud. Let’s not go there and move on, right?
GitOps in practice & production
Organizations in the real world use multi-cloud services architectures - it’s a deployment fact of life - and multi-cloud needs orchestration and smarter infrastructure management.
Businesses deploy multi-cloud for geographical reasons relating to speed and compliance, they deploy it to break the shackles of single-vendor lock-in… and they go multi-cloud for reasons including reliability, price, optimization options and perhaps even good old-fashioned personal service.
What this reality means is that no two private, public or hybrid clouds are alike. Despite the industry standardizing on Kubernetes for cloud container orchestration, configuration is different for every cloud: Amazon's Elastic Kubernetes Service (EKS) requires a different configuration from Microsoft's Azure Kubernetes Service (AKS) or Google Kubernetes Engine (GKE).
In practice, this means that system engineers have to maintain different configurations for each implementation. And that is only half of it: they also have to maintain different scripts for logging in and applying the changes. If one script fails in one environment, getting back to a consistent state is often a manual process.
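To make the pain concrete, a pre-GitOps setup often looks something like the hypothetical sketch below (the cluster names, kubeconfig contexts and manifest path are made up): one block of settings per cluster, plus imperative code that pushes changes to each one in turn.

```python
import subprocess

# One entry per managed cluster; in reality each entry carries further
# cloud-specific settings. The context names here are hypothetical.
CLUSTER_CONTEXTS = {
    "aws-eks":   "eks-prod-us-east-1",
    "azure-aks": "aks-prod-westeurope",
    "gcp-gke":   "gke-prod-asia-east1",
}

def apply_everywhere(manifest_path: str) -> None:
    """Imperatively push the same manifest to every cluster in turn.
    If a single kubectl call fails partway through, the fleet is left in
    a mixed, partially updated state that has to be untangled by hand."""
    for name, context in CLUSTER_CONTEXTS.items():
        subprocess.run(
            ["kubectl", "--context", context, "apply", "-f", manifest_path],
            check=True,
        )
        print(f"applied {manifest_path} to {name}")
```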
However you cut this, it’s a heavy lift
Even if a software team adopts an Infrastructure as Code approach and manages its infrastructure scripts in Git, maintaining those scripts across a set of Git repositories and branches... and then ensuring the right configs are applied to the right runtime environments is a lot of work.
Consider an environment where a cloud network spans not just Amazon Web Services (AWS) and Google Cloud Platform (GCP), but multiple regions and countries, and perhaps includes an edge network. The resulting DevOps scripts and workflow are enormously complex. The opportunity for entropy and chaos is profound, yet this is the environment that many system engineers are working in.
Further automation is clearly called for.
The technology proposition here is that GitOps makes multi-cloud and hybrid cloud feasible when deployment topologies get more complicated. Cloud, Kubernetes and application configurations are stored declaratively in Git, allowing for changes to be submitted as pull requests, approved by a maintainer and automatically applied by software agents.
GitOps certainly involves technology, such as Argo or Flux, but it is also a set of best practices and an overall mindset. Instead of running scripts against an environment, the software engineering team declares the configuration and lets it continuously reconcile against what is actually running.
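A stripped-down sketch of what such a software agent does is shown below; the function bodies, image names and polling interval are stand-ins, and Flux and Argo CD implement this loop far more robustly.

```python
import time

def fetch_declared_state() -> dict:
    """Pull the declared configuration from the Git repository.
    A real agent clones or fetches the config repo and parses manifests;
    this sketch returns a stand-in value."""
    return {"image": "registry.example.com/shop:1.4.2", "replicas": 3}

def observe_running_state() -> dict:
    """Ask the cluster (or cloud API) what is actually running (stand-in)."""
    return {"image": "registry.example.com/shop:1.4.1", "replicas": 3}

def apply(desired: dict) -> None:
    """Make the running system match the declaration.
    A real agent calls the Kubernetes or cloud provider APIs here."""
    print(f"converging running state towards {desired}")

def reconcile_once() -> None:
    # The closed loop at the heart of GitOps: Git is the source of truth
    # and the agent continuously converges reality towards it.
    desired = fetch_declared_state()
    actual = observe_running_state()
    if desired != actual:
        apply(desired)

if __name__ == "__main__":
    while True:              # real agents also react to webhooks and events
        reconcile_once()
        time.sleep(60)
```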
What state do you want to go to today?
Reworking the cheese-factor of Microsoft’s ‘Where do you want to go today?’ advertising tagline for the cloud generation, in GitOps terms we would ask: what state do you want your system to be in today? We’re declaring the desired state of a system (defined by its infrastructure) so that a software agent can apply the right config to enact that state… and so automatically operate deployments.
The idea is that GitOps treats system administration the way you treat a code regression, i.e. if something breaks, you roll it back and let the GitOps process take care of correcting the system.
There are simple tutorials for getting started with GitOps using an open source tool like Flux, and there may be value in the ability to audit and roll back changes even if you are not managing massive, complex systems. Commercial vendor solutions like those from Weaveworks, GitLab or CloudBees also promise to further simplify the process and do things like integrate with existing CI/CD pipelines.
Like any technology, GitOps isn’t a cure-all, and there will be smaller teams who have their own fine-tuned approach as well as larger teams who have their own custom-built techniques for running complex infrastructure on multiple clouds. But panacea or not, if multi-cloud deployments get increasingly complex and fragmented (spoiler alert: they probably will), then this technology could have an increasingly prevalent role in the cloud-native age.