gagen

GitHub Actions workflow files can be a nightmare to maintain.

Conditions often need to be repeated across many steps.
Referencing values/ids by a string is fragile (ex. matrix values).
Maintaining pinned dependencies is difficult.
YAML is hard to work with.

What's an easier way to maintain these?

Initial Solution

In the Deno repo, our YAML file was complicated and the CI was slow. In 2023, we decided to generate the YAML with TypeScript.

Essentially it looked similar to the following:

const ci = {
  name: "ci",
  jobs: {
    build: {
      name: "...",
      steps: [{
        // ...etc...
      }],
    },
  },
};

const finalText = yaml.stringify(ci);
Deno.writeTextFileSync(
  new URL("./ci.generated.yml", import.meta.url),
  finalText,
);

This was a good first step because now applying a condition to multiple steps only required piping the step objects through functions:

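function skipIfDraftPr(
  steps: Record<string, unknown>[],
): Record<string, unknown>[] {
  // `if` decides when a step runs, so skipping drafts
  // means requiring that the PR is not a draft
  const condition = "github.event.pull_request.draft != true";
  // copy each step instead of mutating the caller's objects
  return steps.map((step) => ({
    ...step,
    if: "if" in step ? `${condition} && (${step.if})` : condition,
  }));
}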
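As a usage sketch, here's roughly how a helper like this gets applied to plain step objects. The `Step` type, the `withCondition` name, and the example steps are assumptions for illustration, not code from the Deno repo:

```typescript
// Hypothetical minimal step shape for this sketch.
interface Step {
  name?: string;
  run?: string;
  uses?: string;
  if?: string;
}

// Returns copies of the steps with `condition` AND-ed onto any existing `if`.
function withCondition(condition: string, steps: Step[]): Step[] {
  return steps.map((step) => ({
    ...step,
    if: step.if === undefined ? condition : `${condition} && (${step.if})`,
  }));
}

// Run only when the pull request is not a draft.
const notDraft = "github.event.pull_request.draft != true";

const steps = withCondition(notDraft, [
  { name: "Build", run: "cargo build" },
  { name: "Test", run: "cargo test", if: "matrix.os == 'linux'" },
]);

// Collect the resulting conditions, one per step.
const ifs = steps.map((s) => s.if);
console.log(ifs);
```

The condition is written once and every step that passes through the function picks it up, with any existing `if` kept in parentheses.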

Although the above was a good first step, a few years passed and our CI was again too slow, mostly because we now have far more tests. So we decided to split our single matrix job into a build job and many test jobs to parallelize the work. We'd tried this in the past, but the upload and download artifact steps were slow enough that it wasn't worth it. It's 2026 now and they're fast.

The problem is that this split would be too complicated to maintain by hand. The solution I came up with was gagen.

`gagen`

gagen lets you define steps, then describe the relationships between them along with the conditions under which each step should run.

import { conditions, step, workflow } from "gagen";

const checkout = step({
  uses: "actions/checkout@v6",
});

const test = step.dependsOn(checkout)({
  name: "Test",
  run: "cargo test",
});

const installDeno = step({
  uses: "denoland/setup-deno@v2",
});

const lint = step
  .dependsOn(checkout)
  // this condition gets propagated to installDeno, but not checkout
  .if(conditions.isBranch("main").not())(
    {
      name: "Clippy",
      run: "cargo clippy",
    },
    step.dependsOn(installDeno)({
      name: "Deno Lint",
      run: "deno lint",
    }),
  );

// only specify the leaf steps — the other steps
// are pulled in automatically
workflow({
  name: "ci",
  on: ["push", "pull_request"],
  jobs: [{
    id: "build",
    runsOn: "ubuntu-latest",
    steps: [lint, test],
  }],
}).writeOrLint({
  filePath: new URL("./ci.generated.yml", import.meta.url),
  header: "# GENERATED BY ./ci.ts -- DO NOT DIRECTLY EDIT",
});

This outputs the following workflow file:

# GENERATED BY ./ci.ts -- DO NOT DIRECTLY EDIT

name: ci
on:
  - push
  - pull_request
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd
      - name: Test
        run: cargo test
      - name: Clippy
        if: github.ref != 'refs/heads/main'
        run: cargo clippy
      - uses: denoland/setup-deno@667a34cdef165d8d2b2e98dde39547c9daac7282
        if: github.ref != 'refs/heads/main'
      - name: Deno Lint
        if: github.ref != 'refs/heads/main'
        run: deno lint

# gagen:pin actions/checkout@v6 = de0fac2e4500dabe0009e67214ff5f5447ce83dd
# gagen:pin denoland/setup-deno@v2 = 667a34cdef165d8d2b2e98dde39547c9daac7282

Notice:

Dependencies like actions/checkout@v6 get locked to the hash.
- On subsequent runs, gagen uses the output file as the lockfile.
The condition to not run on main is specified only once. It's then automatically propagated backward to the necessary steps.
The denoland/setup-deno step runs at the latest time that it can. This means if the cargo clippy step fails, no time is wasted running denoland/setup-deno unnecessarily (so faster feedback).
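To make the lockfile idea concrete, here's a minimal sketch of reading the `# gagen:pin` comment lines back out of a generated file and reusing them. It only assumes the comment format shown in the output above; `readPins` and `pinUses` are hypothetical names, and gagen's real implementation may work differently:

```typescript
// Parses `# gagen:pin <action>@<ref> = <sha>` comment lines into a lookup map.
function readPins(workflowText: string): Map<string, string> {
  const pins = new Map<string, string>();
  const pinLine = /^# gagen:pin (\S+@\S+) = ([0-9a-f]{40})$/;
  for (const line of workflowText.split(/\r?\n/)) {
    const match = pinLine.exec(line.trim());
    if (match) pins.set(match[1], match[2]);
  }
  return pins;
}

// Rewrites `owner/repo@ref` to `owner/repo@<sha>` when a pin is known.
function pinUses(uses: string, pins: Map<string, string>): string {
  const sha = pins.get(uses);
  if (sha === undefined) return uses; // not pinned yet, leave as-is
  const action = uses.slice(0, uses.lastIndexOf("@"));
  return `${action}@${sha}`;
}

const previousOutput = [
  "name: ci",
  "# gagen:pin actions/checkout@v6 = de0fac2e4500dabe0009e67214ff5f5447ce83dd",
  "# gagen:pin denoland/setup-deno@v2 = 667a34cdef165d8d2b2e98dde39547c9daac7282",
].join("\n");

const pins = readPins(previousOutput);
const pinnedCheckout = pinUses("actions/checkout@v6", pins);
console.log(pinnedCheckout);
```

Because the pins live in the generated file itself, there's no separate lockfile to keep in sync.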

Under the hood, gagen builds a graph of the steps, then evaluates that graph and its conditions when generating each workflow. This means you can also reuse step objects across workflows and jobs.
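Here's a rough sketch of what evaluating such a graph could look like. This is not gagen's actual implementation: `StepNode`, `resolveOrder`, and `effectiveConditions` are hypothetical, and the real ordering and propagation rules may differ. It orders steps so dependencies come first, then pushes each step's condition back onto dependencies that are only needed by conditional steps:

```typescript
// Hypothetical step node, not gagen's real data model.
interface StepNode {
  id: string;
  deps: StepNode[];
  condition?: string; // GitHub Actions expression, if any
}

// Depth-first post-order: each dependency is emitted once, right before
// the first step that needs it.
function resolveOrder(leaves: StepNode[]): StepNode[] {
  const seen = new Set<StepNode>();
  const ordered: StepNode[] = [];
  const visit = (node: StepNode): void => {
    if (seen.has(node)) return;
    seen.add(node);
    node.deps.forEach(visit);
    ordered.push(node);
  };
  leaves.forEach(visit);
  return ordered;
}

// Computes the `if` each step ends up with after backward propagation.
function effectiveConditions(leaves: StepNode[]): Map<string, string | undefined> {
  // `null` marks a step that an always-running step (or the job itself) needs.
  const inherited = new Map<StepNode, Set<string> | null>();
  for (const leaf of leaves) inherited.set(leaf, null);

  const result = new Map<string, string | undefined>();
  // Reversed order visits every dependent before its dependencies.
  for (const node of resolveOrder(leaves).reverse()) {
    const fromDependents = inherited.get(node) ?? null;
    const parts: string[] = [];
    if (fromDependents !== null) parts.push(Array.from(fromDependents).join(" || "));
    if (node.condition !== undefined) parts.push(node.condition);
    const effective = parts.length === 0
      ? undefined
      : parts.length === 1
      ? parts[0]
      : parts.map((p) => `(${p})`).join(" && ");
    result.set(node.id, effective);

    for (const dep of node.deps) {
      if (effective === undefined) inherited.set(dep, null);
      else if (!inherited.has(dep)) inherited.set(dep, new Set([effective]));
      else inherited.get(dep)?.add(effective); // no-op if already unconditional
    }
  }
  return result;
}

const notMain = "github.ref != 'refs/heads/main'";
const checkout: StepNode = { id: "checkout", deps: [] };
const installDeno: StepNode = { id: "installDeno", deps: [] };
const testStep: StepNode = { id: "test", deps: [checkout] };
const clippy: StepNode = { id: "clippy", deps: [checkout], condition: notMain };
const denoLint: StepNode = { id: "denoLint", deps: [checkout, installDeno], condition: notMain };

const leaves = [testStep, clippy, denoLint];
const order = resolveOrder(leaves).map((s) => s.id);
const conds = effectiveConditions(leaves);
console.log(order, conds);
```

In this sketch `checkout` stays unconditional because the always-running `test` step needs it, while `installDeno` inherits the condition because only a conditional step depends on it.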

Typed values

We've resolved most of the problems above, but one remains: referencing values/ids by a string is fragile.

1. Conditions often need to be repeated across many steps. (resolved)
2. Referencing values/ids by a string is fragile (ex. matrix values). (still open)
3. Maintaining pinned dependencies is difficult. (resolved)
4. YAML is hard to work with. (resolved)

gagen provides some helpers for the remaining problem. For example, matrices are typed:

import { defineMatrix, workflow } from "gagen";

const matrix = defineMatrix({
  include: [
    { runner: "ubuntu-latest" },
    { runner: "macos-latest" },
  ],
});

matrix.runner; // ExpressionValue("matrix.runner") — autocompletes
matrix.foo; // TypeScript error — not a matrix key

workflow({
  // ...
  jobs: [
    {
      id: "build",
      runsOn: matrix.runner,
      strategy: { matrix },
      steps: [test],
    },
  ],
}).writeOrLint({
  filePath: new URL("./ci.generated.yml", import.meta.url),
});

This gives you auto-complete on the matrix values when writing something like matrix.runner.equals("ubuntu-latest"), which can then be used in a step.
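If you're curious how this kind of typing can work, here's a minimal sketch using a mapped type over the keys of the `include` entries. `ExpressionRef`, `MatrixRefs`, and `defineMatrixSketch` are hypothetical stand-ins written for this post, not gagen's real types, and the string returned by `equals` here is only an assumption about what such an expression could look like:

```typescript
// Hypothetical stand-in for an expression reference such as `matrix.runner`.
class ExpressionRef {
  readonly path: string;
  constructor(path: string) {
    this.path = path;
  }
  // Builds a GitHub Actions comparison expression as a string.
  equals(value: string): string {
    return `${this.path} == '${value}'`;
  }
  toString(): string {
    return this.path;
  }
}

// Every key used in `include` becomes an ExpressionRef property.
type MatrixRefs<T> = { readonly [K in keyof T]: ExpressionRef };

function defineMatrixSketch<T extends Record<string, string>>(
  config: { include: T[] },
): MatrixRefs<T> {
  const refs: Record<string, ExpressionRef> = {};
  for (const entry of config.include) {
    for (const key of Object.keys(entry)) {
      refs[key] = new ExpressionRef(`matrix.${key}`);
    }
  }
  return refs as MatrixRefs<T>;
}

const matrix = defineMatrixSketch({
  include: [{ runner: "ubuntu-latest" }, { runner: "macos-latest" }],
});

const runnerRef = String(matrix.runner);
const onLinux = matrix.runner.equals("ubuntu-latest");
// matrix.foo; // would not compile: `foo` is not a key of the matrix
console.log(runnerRef, onLinux);
```

The point is that the key names are written once in the matrix definition, and the compiler rejects any reference to a key that isn't there.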

Also, there's a helper for artifacts:

import { artifact, step, workflow } from "jsr:@david/gagen@<version>";

const buildArtifact = artifact("build-output");

workflow({
  name: "CI",
  on: ["push", "pull_request"],
  jobs: [
    {
      id: "build",
      runsOn: "ubuntu-latest",
      steps: [
        step({ name: "Build", run: "make build" }),
        buildArtifact.upload({ path: "dist/" }),
      ],
    },
    // `needs: [build]` is inferred automatically from the artifact link
    {
      id: "deploy",
      runsOn: "ubuntu-latest",
      steps: [
        buildArtifact.download({ dirPath: "output/" }),
        step({
          name: "Deploy",
          run: "make deploy",
        }),
      ],
    },
  ],
}).writeOrLint({
  filePath: new URL("./ci.generated.yml", import.meta.url),
});
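The `needs` inference can be pictured with a small sketch like the following. It's a simplified model under my own assumptions (`JobSketch`, `ArtifactUse`, and `inferNeeds` are hypothetical names, and gagen may do this differently): a job that downloads an artifact needs whichever job uploads it.

```typescript
// Hypothetical, simplified model of jobs and their artifact usage.
interface ArtifactUse {
  kind: "upload" | "download";
  artifact: string;
}

interface JobSketch {
  id: string;
  artifacts: ArtifactUse[];
}

// For each job, lists the jobs that upload the artifacts it downloads.
function inferNeeds(jobs: JobSketch[]): Map<string, string[]> {
  const uploader = new Map<string, string>();
  for (const job of jobs) {
    for (const use of job.artifacts) {
      if (use.kind === "upload") uploader.set(use.artifact, job.id);
    }
  }

  const needs = new Map<string, string[]>();
  for (const job of jobs) {
    const ids = new Set<string>();
    for (const use of job.artifacts) {
      if (use.kind !== "download") continue;
      const producer = uploader.get(use.artifact);
      if (producer === undefined) {
        throw new Error(`No job uploads artifact "${use.artifact}"`);
      }
      if (producer !== job.id) ids.add(producer);
    }
    needs.set(job.id, Array.from(ids));
  }
  return needs;
}

const needs = inferNeeds([
  { id: "build", artifacts: [{ kind: "upload", artifact: "build-output" }] },
  { id: "deploy", artifacts: [{ kind: "download", artifact: "build-output" }] },
]);

const deployNeeds = needs.get("deploy") ?? [];
const buildNeeds = needs.get("build") ?? [];
console.log(deployNeeds, buildNeeds);
```

Since the link is an object rather than a string typed in two places, a renamed or missing artifact shows up as an error at generation time instead of a failed CI run.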

How to keep `ci.generated.yml` up-to-date?

An obvious problem with this solution is that we need to ensure the YAML file is up to date with the code generation file.

To handle this, writeOrLint checks that the output file is up to date when the script is run with a --lint CLI flag, so we can add that as a CI step:

// note: this requires ci.ts to have a shebang in it that
// runs the typescript code using your preferred runtime
const lintStep = step({
  name: "Lint CI generation",
  run: "./.github/workflows/ci.ts --lint",
});
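The write-or-lint idea itself is simple. Here's a minimal sketch of the behavior, with file access passed in as functions so it runs anywhere. `writeOrLintSketch` and its options are hypothetical and only mirror the behavior described above, not gagen's actual code:

```typescript
type LintResult = "written" | "up-to-date";

function writeOrLintSketch(options: {
  generated: string; // freshly generated YAML text
  lint: boolean; // true when the script was run with --lint
  read: () => string | undefined; // current file contents, if the file exists
  write: (text: string) => void;
}): LintResult {
  if (!options.lint) {
    options.write(options.generated);
    return "written";
  }
  // In lint mode, never write: only compare and fail on a mismatch.
  if (options.read() !== options.generated) {
    throw new Error("ci.generated.yml is out of date; rerun the generation script.");
  }
  return "up-to-date";
}

// In-memory stand-in for the file on disk.
let fileOnDisk: string | undefined;
const io = {
  read: () => fileOnDisk,
  write: (text: string) => {
    fileOnDisk = text;
  },
};

const first = writeOrLintSketch({ generated: "name: ci\n", lint: false, ...io });
const second = writeOrLintSketch({ generated: "name: ci\n", lint: true, ...io });
console.log(first, second);
```

In CI the lint run fails as soon as someone edits the script without regenerating the YAML, or edits the YAML by hand.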

Impact?

By taking advantage of all this, in February I was able to increase the complexity of the generated output and simplify the maintained code generation script.

Now it has:

A build job for each platform uploading the executable artifacts.
Many test jobs downloading the executable artifacts and running tests in parallel.

Note: The blue dips on main are release workflow runs, which do less work. Also, sorry the chart is not great, but I created this a couple months ago and now the raw data seems gone.

The main slowness now is compiling Deno on certain platforms (like Mac x86).

Code
Output

Sure, this could have been done in regular YAML, but I believe the code is way more maintainable. Yes, it's still complicated, but maintainable.

For more on what gagen can do, read the docs on GitHub: https://github.com/dsherret/gagen
