• Github Actions is a decidedly unserious product, used largely by unserious people.

    It's always been poo, the YAML is bad, the reliability is bad and the cost is bad.

    So there are really no redeeming features, because even if you tout forge integration, its UI is, you guessed it, also bad.

    Putting aside the anti-pattern of using vendor YAML for literally anything (please don't do that), you are distinctly better off with literally any other CI/orchestration service. Buildkite is good, dynamic pipelines are good, and there are other good options. If you are a serious person you will find good things to use.

    Getting back to vendor YAML, please just use a real build system instead. Define all the actual logic there with entry points/targets the YAML hits. Also generally make sure that you don't need the actual CI system to be up to do releases, deployments etc. A sufficiently elevated local user should be able to run the appropriate target with the appropriate credentials to get the job done in absence of said CI system.
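A minimal sketch of that split (target names and script paths are illustrative, not from the comment above):

    ```yaml
    # .github/workflows/ci.yml -- the vendor YAML stays a thin shim
    name: ci
    on: [push]
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v5
          - run: make ci
    ```

    ```make
    # Makefile -- all real logic lives here and runs identically on a laptop
    ci: lint test build

    release:
    	./scripts/release.sh  # hypothetical script; works locally given the right credentials
    ```

    With this split, losing the CI system means losing a trigger, not the ability to ship.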

  • Back when GitHub Actions first came out, I used commit hashes rather than tags in all my `uses:` lines. Some of my colleagues disagreed, saying that tags were secure enough. I eventually said, "Well, for well-known actions like actions/checkout, sure; if that one gets compromised it'll be all over the news within minutes." But for all the third-party actions, I kept commit hashes.

    I feel rather vindicated now. There's still a small possibility of getting supply-chain attacked via a SHA collision, or a relatively much larger (though still small in absolute terms) possibility of getting supply-chain attacked via NPM dependencies of the action you're relying on.

    But if you're not using a commit hash in your `uses:` lines, go switch to it now. And if you're just using major-version-only tags like `v5` then do it RIGHT now, before that action gets a compromised version uploaded with a `v5.2.3` tag.
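A pinned line looks like this (the SHA below is a placeholder, not a real actions/checkout release; tools like Renovate and Dependabot keep the trailing tag comment in sync with the hash):

    ```yaml
    # Immutable: resolved by commit hash, with a human-readable comment for review
    - uses: actions/checkout@0123456789abcdef0123456789abcdef01234567  # v5.0.0
    ```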

    • GitHub Actions doesn't have a lock file, so your repo is still prone to transitive attacks if the SHA-locked actions you use also happen to use other composite actions by tags, which could be compromised in the future.
      • Agreed. Good news is GitHub will address that with Immutable Releases https://github.blog/news-insights/product-news/whats-coming-... You won't even need to use commit SHA as long as the maintainer follows this approach.
        • What an absolute joke that it has taken GitHub this long to clean up its act when it comes to supply chain security.
      • Even with a lock file, the action can download and execute arbitrary code from the internet.
        • It would be cool if CI could inject a platform-wide lockfile into every remote download or lookup made by your scripts. So if you pull a container or git tag, the CI platform would automatically ensure that the exact digest downloaded is controlled by a lock file that you can inspect, check in, etc.
      • "Require actions to be pinned to a full-length commit SHA" applies to composite actions, too. I had to replace pre-commit/action as a result.
    • I feel pretty happy we use Renovator (EDIT: It's Renovate) at my current workplace which by default will raise PRs to change any tags for actions with the SHA instead. Then, even when it bumps the version in future PRs, it bumps the SHA (with a comment of which tag version it represents)
      • Glad to hear you're enjoying Renovate - I'm biased, but I agree that the SHA pinning PR updates are a very nice feature

        We recently found (in Renovate) some edge cases with how tags work in GitHub Actions, which was fun (https://news.ycombinator.com/item?id=47892740), and there are a few things in there Dependabot doesn't seem to support either

        • If you auto-merge those PRs you're back to square one, as you're not vetting your dependency updates. And if you don't, you incur operational overhead unless you put in a fair amount of effort centralizing. I wrote a couple of posts that touched on this: https://developerwithacat.com/blog/202604/github-actions-sup...
        • How many people actually audit the code changes in their dependencies when updating them?
        • Valid point. We have minimum age requirements set on some rules to avoid absorbing every latest change instantly.
          • How would that solve the problem though? You're still bringing compromises in, just with a delay. And the fixes will come in after the compromise, in accordance with the delay policy.

            To make matters worse, you'd lose getting alerts on vulnerabilities. Dependabot won't send them, and neither will Renovate last time I checked.

      • Is it Renovator or Renovate? I'm trying to find it to check it out...
    • just noting that pinning within your own actions is not enough, you also need to ensure any composite actions do not use mutable references (for actions, docker images, etc.)
    • There is no realistic risk of a SHA collision attack. Getting supply chain attacked via NPM dependencies is much more likely. Hopefully the actions creators are also pinning their hashes.
      • > There is no realistic risk of a SHA collision attack.

        Indeed. To illustrate why:

        1. It is not possible to "retroactively" find a SHA-1 collision for an already known hash. If somebody has produced a SHA-1 hash non-maliciously at any point in the past, it is safe from collisions. This is due to second-preimage resistance, which hasn't been broken for SHA-1 and doesn't seem likely to be broken any time soon.

        2. The only way to obtain a SHA-1 collision is to do so knowingly when producing the original hash. You generate a pair of inputs at the same time that both hash to the same value. Certainly, this is an imaginable scenario; e.g. a trusted committer could push one half of the pair wittingly or a reviewer could be fooled into accepting one half of the pair unwittingly, both scenarios creating a timebomb where the malicious actor swaps the commit to the second half of the pair (which presumably carries a malicious payload) later. However, there are two blockers to this approach: Git (not just GitHub) will not accept a commit with a duplicate hash, always sticking with the original one, and GitHub specifically has implemented signature detection for the known SHA-1 collision-generating methods and will reject both halves of such a pair.

        In short, there's just no practical way to exploit this weakness of SHA-1 with Git.

    • Even SHA pinning only lets you go one hop. If the pinned action itself uses any non-pinned actions, you’re still susceptible.

      I don’t think this problem is fixable without a higher-level way to specify the full nested tree. Something like TOFU for the first time your action ran (pinning all children as of that run) might be an improvement, but that can still be gamed by a timed attack that modifies the action at a later date (literally, "if time greater than X, do …").

    • You can enforce at the org level to only allow actions pinned to hashes. You can also choose a small whitelist of actions to allow.
      • I used to think a whitelist could be a partial solution. But after Checkmarx KICS got compromised I can't see this working. I would've expected a well-established brand, in the security industry of all places, to be in the whitelist.
    • There are downsides to it though. You:

      - lose vulnerability alerts
      - increase maintenance overhead
      - take on all that for value that will go to zero once Immutable Releases gets widely adopted

      I wrote a couple of blog posts on it, and a makeshift way of tackling that https://developerwithacat.com/blog/202604/github-actions-sup...

      • You lose vulnerability alerts, on GitHub. This is a (ridiculous, IMO) platform limitation that GitHub could lift by applying more engineering time to Dependabot and Dependabot's integrated security alerts feature.

        zizmor (and other tools) correctly recovers vulnerability information for SHA-pinned actions[1].

        [1]: https://docs.zizmor.sh/audits/#known-vulnerable-actions

        • I agree, silly limitation.

          On zizmor, there's no mention of commit-SHA coverage in the section you've linked, nor anywhere on the page when I Ctrl+F. Is there anything I'm missing?

      • The maintenance aspect is relatively straightforward to automate.

        Renovate handles this well. Ratchet and pinact can also be used

        • I mention in the posts the problem with the likes of Renovate. Auto-merging is equivalent to trusting semantic versioning. You have to properly vet the influx of updates, and that unfortunately won't happen in practice.
    • Maybe it's better to pull that dependency source in your action altogether?
      • Better to treat it as a dependency still, but audit each new commit/release as it comes in, and pin to the exact last commit id that you verified.
      • I hadn't previously considered vendoring GHA dependencies, but yes, that might be a good idea. Perhaps not in all circumstances, but for anything that might be at risk of supply-chain compromise, the same arguments that apply to NPM apply to GHA.
  • I apologize in advance for the plug. I've spent the last 5 years warning of the importance of not leaving CI locked in a black box platform and proprietary DSL. All the while going on a quest to reinvent CI as an open, programmable platform. Honestly it's still a work-in-progress: it turns out that reinvention is hard! But, if you want a glimpse of what CI can be when you shed 30 years of legacy, consider checking out Dagger (https://dagger.io).

    Or, if you just want to talk about the future of CI with like-minded systems engineers, without committing to using a particular product, consider joining our Discord: https://discord.com/invite/dagger-io

    • Wow I've been struggling with deployment/CI on Claude/Codex/devcontainers for the last several weeks and this looks amazing. I'm trying to find a "universal" way to deploy on multiple cloud and baremetal platforms.
    • I ALMOST chose Dagger, but the idea of writing code to build my code felt like maintaining two applications. While I didn't choose it, the idea that new paradigms are needed was the draw.
      • Yes, it can be a double-edged sword. One reason I called Dagger a "work in progress" is that we took it too far. It's one thing that you can write custom code for your pipeline; it's another that you must write custom code.

        We are actively overhauling our design (in a backwards-compatible way) to reach a better balance. The result is that, for most users, writing custom code will not be required to use Dagger. But it will be available for power users who want to extend and customize the platform. Writing code for Dagger will be less like using a framework, and more like writing a plugin for a devops tool.

        If you're interested, you can track our progress in our combined changelog / roadmap page: https://dagger.io/changelog/#modules-v2 . The overhaul project is called "modules v2".

        Perhaps once it ships, you can give Dagger another try :)

    • A while ago I checked this out and the homepage looked like it had fallen for the 'AI hype' trend, you know, like how everything was 'AI-native XYZ for Autonomous Agents' at the time. I'm not seeing that now though.

      Am I thinking of someone else or did you reverse on that?

      • Yes, that was us. And yes, we reversed on that. The feedback from our community was quite clear :)
    • Are you hiring? This sounds like a really cool space and product to work on.
    • No apology necessary - I appreciate the straightforward offer of solutions to difficult problems.
    • Looks cool. Can it be self hosted? I.e. can I self host it next to my self hosted forgejo instance?
      • Yes, the Dagger engine is open source. Note that the engine on its own is not a CI replacement: it provides a runtime for your pipelines, but you still need an external system to trigger pipelines from git events. This decoupling is intentional, because CI should not be tightly coupled to git events. Sometimes you want to run a pipeline after pushing; but sometimes you need it before pushing, or even before committing. The pipeline runtime therefore should operate at a different layer than git events.

        In practice this means you can combine Dagger with, say, Github Actions or another "legacy" CI platform. And use it as runner & event infrastructure for your portable Dagger pipelines.

        We also offer a complete Dagger-native CI platform, which combines hosted Dagger engines, git triggers, and all the infrastructure necessary to run your CI end-to-end. That is in early access as part of Dagger Cloud, our commercial offering.

        • Well, I'm sold! Trying out your offering this weekend :)
  • Programming in YAML has always seemed crazy to me. Actions seem like a great place to create a simple mixed imperative/declarative scripting language (js extension or whatever) with a solid instrumented/observable/debuggable runtime and an OO API that can be run locally against mock infrastructure.
    • No thanks, Jenkins has three DSLs and none of them is good. You don't have to inline code in YAML; you can call a script and call it a day, and write that script in any language you want.
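A sketch of that pattern (the script path is illustrative):

      ```yaml
      steps:
        - uses: actions/checkout@v5
        # All real logic lives in a version-controlled script you can run and debug locally
        - run: ./ci/build.sh
      ```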
      • You can do the same in Jenkins, but a bit of scripting is probably more readable in Groovy than whatever YAML DSL.

        But I totally agree that the Jenkins langs are terrible, the errors even worse, somehow they managed to make jvm backtraces even more unreadable.

        • I don't know why they don't pivot to Kotlin.

          Gradle did it successfully and it's great now.

          • Don't they have a major thing going on with CPS (continuation-passing style, as in Scheme) that sort of persists pipeline state automatically? That would allow you to kill Jenkins and afterwards restart the pipeline from exactly where you left off.

            But I never tried it personally

      • idk I always just wrote shell to be called by jenkins. none of this idiocy of programming with html comboboxes. DSL for the domain is shell, no need to invent hyperwheels here.
    • YAML isn't the problem. It's that every single action is basically curl-to-sudo-bash. Even disregarding the security implications, the ergonomics are truly horrendous. They were with Azure DevOps and they certainly are with GitHub Actions. Bad interfaces, surprising behavior, it's got it all.

      CI must only consist of shell commands. No abstractions, no surprises. (Except maybe with PowerShell, where the principle of most surprise rules.)

    • Having tried Pulumi for IaC, I am not a fan. Pulumi is excellent, but the concept is what I am not keen on. It is a rabbit hole for devs, and it allows complexity where in YAML you are forced to KISS.
    • The YAML is way less concerning than the lack of any decent tooling to test and debug the code.
  • Yup! Still haven't switched off of Github, but considering it at this point. If you're in my shoes, here's some tools we use that help:

    - https://github.com/sethvargo/ratchet for pinning external Actions/Workflows to specific commit hashes

    - https://www.warpbuild.com/ for much faster runners (also: runs-on/namespace/buildjet/blacksmith/depot/... take your pick)

    - soon moving to Buildkite for orchestration of our CI jobs

    I still just need a reasonable alternative for the "store our git repo, allow us to make and merge prs" part of things. Hopefully someone takes all the pieces that the Pierre team is publishing and makes this available soon. The Github UI and the `gh` cli are actually really nice and the existing alternative code storage tools are not great IMO.

    • VP at Buildkite here; let me know if you need anything as you begin to move over to us for orchestration. The new trial we just released unlocks everything in the platform, and we can extend past 30 days if you need.
    • Why warpbuild over the alternatives? I've seen depot before and am tempted, but open to other platforms.
      • Founder is active on HN and the service is high quality. Support is reasonable. Machines are fast and work well. There are a bunch of alternatives, the switching cost is extremely low, pick whatever you'd like.
      • Founder of WarpBuild here. We have faster compute: baremetal for amd64 workloads, AWS for arm64 etc.

        We optimize for overall performance in real world jobs and have a broad selection of regions/OSes/arch available. There aren't any fixed subscription fees either.

  • Great writeup. Though combined with the lack of lockfiles for transitive actions, relying purely on static analysis is tough. Linters like zizmor are great, but they struggle with deep composite-action trees and runtime template injection.

    I got frustrated with the lack of security, so I started working on an open-source runtime sandbox for GHA: https://github.com/electricapp/hasp

    The first check was inspired by the trivy attack. hasp enforces SHA pinning AND checks that a comment (# v4.1.2) actually resolves to its preceding SHA. That grew into a larger suite of checks.

    Instead of just statically parsing YAML it hooks into the runner env itself. Some of its runtime checks mirror what zizmor already does including resolving upstream SHAs to canonical branches (no impostor commits) and traversing the transitive dependency tree. I have a PR up with a comparison document here (hasp vs. zizmor): https://github.com/electricapp/hasp/pull/13/changes#diff-aab...

    Furthermore, it sandboxes itself to prevent secret exfiltration by acting as a token broker that injects the secret at runtime -- the GH token can only ever be used to call the GH API. It uses Landlock, seccomp, and eBPF via Rust, so no Docker. The token broker sandbox can also wrap a generic executable, giving hasp applicability beyond the GHA context (i.e. agentic or other contexts, where runtime token injection seems quite in vogue).

    I'm using this as a stopgap until GH rolls out some of the features on its roadmap. I'm moving toward treating the runner as a zero-trust or actively malicious environment, so this was my small contribution on that front.

  • We just use GHA as a simple caller, and everything is coded in nix scripts. The best part of this is how you can call the CI run directly from your own machine and it works the same.
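For anyone curious what that shape looks like, a sketch (the flake output name and action versions are assumptions, not the poster's actual setup):

    ```yaml
    # GHA is only the trigger; the pipeline itself is a Nix flake app
    jobs:
      ci:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v5
          - uses: cachix/install-nix-action@v27
          - run: nix run .#ci   # the same invocation works on a dev machine
    ```
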
  • John Howard (one of the maintainers of Istio and currently with Solo.io) blogged about "Fast GitHub Actions with Blacksmith" [1]. The blog also contains a link to "GitHub Action Runner Alternatives" [2].

    [1]: https://blog.howardjohn.info/posts/blacksmith-gha/

    [2]: https://binhong.me/blog/github-action-runner-alternatives/

  • I'm personally not a fan of GitHub actions, because of those dependencies outside your control and more because they're a pain to debug. A lot of the time, it feels like I'm tinkering with this huge script then holding my breath and hoping I got it right.

    The reason I use them, however, is that it's more trouble than it's worth to maintain build servers for the 3 platforms I care about (Windows, macOS, Linux) myself, especially for projects that get built sporadically. I think one reason for this pain is that while you can easily run VMs for Windows and Linux on the same host, macOS is kinda its own special unicorn and might need a dedicated box. (But even that aside, maintaining machines you don't use every day can get annoying.)

  • This is really what LLMs ought to bring in terms of security: being able to break things faster, given that it's now easier for maintainers to fix them.

    This has downsides of course, moving further into the "everything rots so fast these days" trope, but we live in an adversarial world where the threat is constantly evolving.

    Tomorrow (today) the servers and repos won't be scanned by scripts anymore but by increasingly capable models with knowledge of more security issues than many researchers.

  • The crazy thing is there are sooo many good alternatives and things built on top of it that github / ms could purchase and integrate. Product is asleep at the wheel.
  • <tangent>

    Github Actions is running like treacle now, even though our company pays lots of money for cloud and private Github runners.

    I know it's the go-to punchbag, but I think enabling Copilot reviews globally for a large proportion of Github was a bit hasty.

    The security problems aside, if it continues this way, people won't be able to ship and deploy code from Github actions.

    We might, dare I say it, have to go back to self-hosted Jenkins or Travis CI.

    • shameless self plug, but please check out RWX! (rwx.com)
  • Question: could someone make a GitHub Pages website that runs zizmor on a dropped yml file and tells you if it’s bad in some way?
    • You’d need to build zizmor for WASM. I’ve thought about doing that work, but I’d happily accept contributions from people towards that who understand WASM better than I do.
  • I just have a Spot instance we use for our builds. It's turned on via serverless, runs its job with a timeout, and exits.

    Lately I don't use any managed services and life couldn't be any simpler.

    • My team has been using https://runs-on.com/ for AWS instance runners, had a few glitches but largely been great for using AWS instances for runners.
  • I still don't understand why the official github pages action is on an account called "peaceiris" ?? peaceiris/actions-gh-pages@v3
  • pull_request_target is criminally negligent -- github should simply disable it.

    The security risk for running unvalidated code on any random PR with access to account secrets has no legitimate use case which outweighs its unbounded risk.
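For reference, the canonical footgun is the shape below -- shown as a warning, not something to copy:

    ```yaml
    on: pull_request_target   # runs in the base repo's context, with its secrets
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v5
            with:
              ref: ${{ github.event.pull_request.head.sha }}  # attacker-controlled code
          - run: npm install && npm test   # any lifecycle script can now read secrets
    ```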

  • This aligns nicely with today's GitHub Actions outage
    • Github outage? Must be a Y in the day
  • I thought GitHub was great back in the day. My account goes back to 2009. It was so much better than what came before, e.g. Sourceforge. Admittedly, the centralised nature was a problem.

    I was heartbroken when Microsoft bought it. There should be a way for citizens to rebel against such things. It feels like it's been on a downward trajectory ever since.

  • The OIDC federation between the runner and the cloud resources it touches: that credential gets created once, permissive enough not to block the first deploy, and it is not what gets reviewed when a pinning incident happens. Everyone is looking at the action. The identity it runs as just sits there.
    • Common mistake is trusting the repo instead of the workflow. Then any workflow inherits the same cloud access.
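In AWS terms, that's the difference between a `sub` condition matching the whole repo and one scoped to a specific ref (or, via GitHub's customizable OIDC claims, a specific workflow). A fragment of a role trust policy, with illustrative values:

      ```json
      {
        "Condition": {
          "StringEquals": {
            "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
          },
          "StringLike": {
            "token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:ref:refs/heads/main"
          }
        }
      }
      ```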