This is the first in a series of posts on how we upped our Jenkins game by treating Jenkins jobs as code, rather than pointing-and-clicking to create jobs.
In this series, I’ll cover:
- the problems we had as our Jenkins use scaled throughout the organization
- the target conditions we wished to achieve
- how we addressed those problems using the job-dsl-plugin along with some sugar on top
- what the development workflow looks like
- what a realistic set of jobs looks like for a sample project
- the sugar we built on top of job-dsl-plugin
- how we encouraged adoption of this approach across teams
- how this complements the new pipeline jobs in Jenkins 2.x
Before I even get started, I want to be very clear that I had very little to do with any of this. On our team, these people did all the hard work, notably David G and Dan D for initial experiments; and especially Irina M for ultimately executing on the vision, to whom I am eternally grateful. And none of this would be possible without the heroes behind the job-dsl-plugin.
At work, we have:
- Multiple Jenkins servers, in multiple separate hosting environments; none of these can communicate with one another
- Most Jenkins jobs run in just one of those hosting environments, but some run in both
- Hundreds of Jenkins jobs across all these environments
- Of the jobs that run in only one environment, many run on multiple Jenkinses in that environment, with slight differences (e.g., in dev all projects deploy on SCM change; in prod, most projects deploy manually and prompt for a tag to deploy)
- No control over the hosting/environment situation
- Dozens of developers, working on dozens of projects
- Significant growth in number of developers, demand for automation, and consequently number of Jenkins jobs
- A very small group of folks who know Jenkins well
We also have:
- A fantastic team of people
- An organizational commitment, with leadership support, to solving the problems described above
For the jobs that were duplicated — with slight differences per environment — we found ourselves doing a significant amount of redundant pointing-and-clicking in different Jenkinses. In addition, we were creating a lot of snowflake jobs that did similar things differently, because of silos, skill gaps, and an absence of consistent standards. We were witnessing job configuration drift both between environments and between teams.
In practice, it looked like this:
“Why does this deploy job do [Thing A] in dev, but [Thing A+] in prod?”
“Why does this app deploy [this way], but this other app which is structurally the same deploy [that way]?”
“Who wants to build this [some job useful everywhere] we need in all of our Jenkinses?”
“What’s our policy for discarding old builds? Because these jobs retain for 30 days, these for 50 builds, and most just retain forever.”
“Why do these jobs use Extended email, and these use plain email?”
“I really, really wish every job would have failure claiming turned on by default. Why the hell is that an option, anyways?”
And on and on. In other words, we accumulated a lot of organizational deployment technical debt, and we were not happy.
The solution: job-dsl-plugin
I’ll spare you the history and cut to where we landed. We realized we couldn’t solve the multi-environment problem… that is our infrastructure reality. And our automations team isn’t big enough to police hundreds of Jenkins jobs across multiple environments and turn into the consistency enforcement team, nor would we want it to. We wanted to continue to empower all developers to use Jenkins, while also satisfying our own need for increased consistency. We wanted to make it easy to do the right thing. After a several-month discovery phase investigating solutions to the problems above, we ended up adopting an approach to creating Jenkins jobs centered on the job-dsl-plugin.
This enabled us to:
- use text to create Jenkins jobs
- store those jobs in source control
- easily code necessary differences per environment
- more easily see and eradicate unnecessary differences in jobs across environments
- easily create these jobs in multiple Jenkinses, with a bit of config
- “make it easy to do the right thing”… providing the consistent defaults we wanted, for free
- simplify the small handful of jobs where we wanted “the one and only one way to do this thing”
- foster knowledge sharing and discovery for job configuration
- perform peer review of Jenkins job configuration
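To make “necessary differences per environment” concrete, here’s a hedged sketch of a job-dsl script that branches on an environment name. The job name, repo, commands, and the `ENVIRONMENT` variable passed in by the seed job are all illustrative, not our real config:

```groovy
// Illustrative sketch only: assumes the seed job passes an ENVIRONMENT
// variable into the DSL; job name, repo, and commands are made up.
def environment = binding.variables.get('ENVIRONMENT') ?: 'dev'

job('my-app-deploy') {
    scm {
        git('git@github.example.com:team/my-app.git')
    }
    if (environment == 'dev') {
        // in dev, deploy on every SCM change
        triggers {
            scm('H/5 * * * *')
        }
    } else {
        // in prod, deploy manually and prompt for the tag to deploy
        parameters {
            stringParam('TAG', '', 'Tag to deploy')
        }
    }
    steps {
        shell('./deploy.sh')
    }
}
```

The necessary per-environment difference lives in one `if` block, in one file, under review, instead of in two hand-edited job configs that drift apart.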
In short, we treat our Jenkins jobs as configuration-as-code.
Some really high-level info just to make what comes below grokkable until I get to the nitty-gritty details:
- Jobs are configured in text, using Groovy. No, you do not need to become a Groovy expert, retool, learn a whole new language, etc. The syntax is very basic, and the API viewer makes it trivial to copy/paste snippets for job configuration
- These jobs, contained in one or more .groovy files, are kept in source control
- One or more “seed jobs” are manually configured to pull those .groovy files from source control and then “process” them, turning the Groovy text into Jenkins jobs (truth: we even automate the creation of seed jobs; more in a future post)
- Nearly all of that happens via the job-dsl-plugin; the only exception is creating the seed jobs.
I also want to mention now — and I’ll repeat this a lot — that the job-dsl API viewer is pure friggin gold: https://jenkinsci.github.io/job-dsl-plugin/
Apologies in advance for starting a few steps ahead and leaving out a lot of hand-wavey stuff for now. I want to begin with the end in mind. I’ll fill in all the gaps later, I promise. As you’re reading along wondering “What’s this Builder stuff? How do these actually turn into Jenkins jobs?”, trust me, I’ll fill it all in.
I’ll start with the dead simplest job you can create with job-dsl. This is, say, step 0. Then I’ll do what everyone hates and rocket ahead to step 10.
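A minimal script might look something like this; the GitHub repo, Gradle task, and artifact path are placeholders, not a real project:

```groovy
// The dead simplest job-dsl script: one freestyle job.
// Repo name, gradle task, and artifact glob are placeholders.
job('example-job-from-job-dsl') {
    scm {
        git {
            remote {
                github('our-org/example-project')
            }
        }
    }
    triggers {
        githubPush()
    }
    steps {
        gradle('clean build')
    }
    publishers {
        archiveArtifacts('build/libs/*.jar')
    }
}
```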
When the seed job runs and pulls that config from SCM, it’ll create a job named “example-job-from-job-dsl”, pulling from a github repo, triggered by a push, with a gradle step, and archiving artifacts.
Now, truth be told, at work we don’t use `job` directly; instead we have Builders that wrap it and add the defaults we want on all our jobs.
Here’s a fairly representative sample of what a simple Jenkins job looks like for us, in code. Ignore the “BaseJobBuilder” business for now, as it’s just some sugar on top of job-dsl-plugin that adds some sane (for us) defaults:
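Below is a sketch of such a script. Note that “BaseJobBuilder” is our internal sugar, so its constructor arguments and the repo URL shown here are illustrative assumptions, not a published API:

```groovy
// Hypothetical sketch: "BaseJobBuilder" is internal sugar on top of
// job-dsl; its arguments and the repo URL are illustrative.
new BaseJobBuilder(
    name: 'operations-jira-restart',
    description: 'Restart JIRA via fabric',
).build(this).with {
    scm {
        git('git@github.example.com:operations/jira-tools.git')
    }
    steps {
        shell('fab restart_jira')
    }
}
```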
When the seed job for this job runs, it results in a Jenkins job named “operations-jira-restart”, configured to pull from a git repo, with a shell step of “fab restart_jira”. The “BaseJobBuilder” bits add the other things we want to exist in all of our jobs (log rotation, failure claiming, and so on).
Here’s another example with a bit more config, using a different Builder, “SiteMonitorJobBuilder”:
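Again a hedged sketch: “SiteMonitorJobBuilder” is internal sugar, so the argument names, schedule, URLs, and repo addresses below are illustrative, not our real configuration:

```groovy
// Hypothetical sketch: "SiteMonitorJobBuilder" is internal sugar; the
// schedule, URLs, and repo addresses are illustrative, not real config.
new SiteMonitorJobBuilder(
    name: 'jenkins-outbound-connectivity-check',
    cronSchedule: '@hourly',
    urls: [
        'https://repo1.maven.org',
        'https://artifacts.example.internal',
    ],
    repos: [
        'git@github.com:our-org/some-public-repo.git',
        'git@github.example.internal:team/some-internal-repo.git',
    ],
).build(this)
```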
This dsl script will result in a Jenkins job named “jenkins-outbound-connectivity-check”, which we have configured in every single one of our Jenkinses. It runs hourly, runs http requests against configured URLs, and pulls from an external and internal GitHub repo, to confirm that the Jenkins instance can talk to all the things it should talk to.
I included this example because it demonstrates how easy it is for us now to solve one of the problems above, namely, how to easily change a job that runs in multiple Jenkinses. If we want to change how this connectivity check runs — or, heck, even delete it entirely — we just change it in code and push to SCM. The seed job responsible for it will run, and update the job in all our Jenkinses.
Next up: Getting started with job-dsl
Now that I’ve covered the problems we needed to solve, and a very high level look at our solution, I’ll go more in depth in the next post, covering the job-dsl-plugin and how to use it.