Could lockfiles just be SBOMs?

(nesbitt.io)

50 points | by zdw 6 hours ago

12 comments

  • endorphine 4 hours ago
    From https://en.wikipedia.org/wiki/Software_supply_chain:

    > A software bill of materials (SBOM) declares the inventory of components used to build a software artifact, including any open source and proprietary software components. It is the software analogue to the traditional manufacturing BOM, which is used as part of supply chain management.

  • Ferret7446 2 hours ago
    No because SBOMs are a hot mess and not standardized at all. They're "standardized" in the same sense as HL7 (ask someone in the healthcare industry, make sure to have some sedatives on hand first). A comprehensive SBOM for something like Chromium is many dozens of MBs compressed (I forget exactly, but it's patently ridiculous). Also SBOMs should be build artifacts, so them (also) being build inputs is problematic.
    • zvr 29 minutes ago
      The format is standardized, to the highest level possible: ISO/IEC 5962:2021 defines SPDX v2.2.1. The actual standard text is available for free at the ISO website (and other places, like spdx.org).

      The newer version, SPDX v3.0, will become ISO/IEC 5962:2026, and work is already underway for further versions.

      What is not standardized at all is the integration of processes for producing/consuming/maintaining SBOMs in the software development world.

    • isodev 8 minutes ago
      Oh dear, HL7, I may be suffering from a form of PTSD… my therapist has heard about this “standard” at length.

      But I think SBOMs are better structured. I also feel that if package managers refocus their efforts on that, the standard and its implementations can be evolved. It’s the whole perk of using standards. I think it would be a good thing

    • larusso 1 hour ago
      This year I had to create SBOM files for our Unity projects. Of course there is nothing. For all who don’t know: UPM (Unity Package Manager) is a way to easily install packages in Unity. And as a side note, for whatever reason they decided to build on top of npm, not nuget, for the package infrastructure and metadata format. Anyway: most packages we use are simply wrapper packages for other packages, like a wrapper for a .NET library. There is no clear dependency tree, but based on the package ID I’m able to see them. So I wrote the SBOM files manually with an SBOM library and added pedigree statements pointing to the original nuget package being wrapped. The idea was that if the nuget package has a security issue, the UPM package also gets flagged. I showed that to one of the security engineers of the software we use. The answer was: cool, but that is not a standard. There is also no official package specification for UPM (I also made that up as part of the purl). So yes, SBOM is a standard with a huge array of ways to declare said information. And it seems most companies consuming the files don’t build general parsers but expect specific formats for X.
  • Lvl999Noob 3 hours ago
    Personally, I would prefer that the package managers keep their own lockfiles with all their metadata. A CI process (using the package manager itself) can create the SBOM for every commit in a standardized environment. We get all the same benefits without losing anything (the package managers can keep their own formats and metadata and remove anything unneeded for the SBOM from it).
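
A minimal sketch of the CI step described above, assuming an npm `package-lock.json` in lockfile version 2 or 3 (a `packages` map keyed by install path) and a CycloneDX-shaped JSON output. The function names `integrity_to_hash` and `lockfile_to_sbom` are made up for illustration, and a real generator would emit far more metadata (licenses, suppliers, dependency relationships).

```python
import base64


def integrity_to_hash(integrity: str) -> dict:
    """Convert an npm-style SRI string ("sha512-<base64>") to an alg/content hash entry."""
    first = integrity.split()[0]  # an integrity field may list several hashes
    algorithm, _, b64_digest = first.partition("-")
    alg_names = {"sha1": "SHA-1", "sha256": "SHA-256",
                 "sha384": "SHA-384", "sha512": "SHA-512"}
    # CycloneDX hash content is hex, the lockfile stores base64
    return {"alg": alg_names[algorithm],
            "content": base64.b64decode(b64_digest).hex()}


def lockfile_to_sbom(lock: dict) -> dict:
    """Build a minimal CycloneDX-shaped document from a parsed package-lock.json."""
    components = []
    for path, entry in sorted(lock.get("packages", {}).items()):
        # The empty key is the root project; entries without a version
        # (for example workspace links) are skipped in this sketch.
        if path == "" or "version" not in entry:
            continue
        # "node_modules/a/node_modules/@scope/b" -> "@scope/b"
        name = path.rsplit("node_modules/", 1)[-1]
        component = {
            "type": "library",
            "name": name,
            "version": entry["version"],
            "purl": f"pkg:npm/{name.replace('@', '%40')}@{entry['version']}",
        }
        if "integrity" in entry:
            component["hashes"] = [integrity_to_hash(entry["integrity"])]
        components.append(component)
    return {"bomFormat": "CycloneDX", "specVersion": "1.5",
            "version": 1, "components": components}
```

In a pipeline this would run after the package manager has resolved dependencies, with the result written out via `json.dump` as a build artifact; the lockfile itself stays in whatever format the package manager prefers.
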
    • ozim 1 hour ago
      Second that. It is trivial to add an SBOM generator to your pipeline - it is not trivial to make all kinds of package managers switch, and each format is used for different audiences.
      • zvr 25 minutes ago
        Exactly.

        To understand what an impossible task this is, there is no need to think about different ecosystems (PyPI vs NPM vs Cargo vs ...). Even in the case of different Linux distributions, the package managers are so different that expecting them to support the same formats is a lost cause.

  • notepad0x90 2 hours ago
    Wouldn't lock files require running the thing? People need to be able to verify SBOM without doing that. It's the kind of thing you check against a large fleet of devices. If someone has software installed on their laptop but hasn't run it in a year, you need to be able to measure SBOM for that.

    SBOM is too similar to things like authenticode and package signing for it to be some unique solution. We're too used to how things have always been done. Too stuck in the "monkey see, monkey do" mindset. How about this: any piece of software, under any execution environment, should not only have an SBOM declaration, but cryptographic authentication of all of its components, including any static data files.

    This should be a standardized mechanism. Everyone is doing their own thing and it's creating lots of insecurity and chaos. Why can't I answer all security-related questions about the software I'm running on any device or OS using the same protocol?

    Everyone would consider it absurd if we used a different TLS when talking to an Apache server than when talking to a Windows server.

    SBOM, code signing (originator of the code), capability declarations, access requirements (camera, mic, etc...) are not things that are unique to an OS or platform. And for the details that are, those are data values that should be different, not the entire method of verification.

    I wonder what it would take to enact this, I'd imagine some sort of regulatory push? But we don't even have a good cross-platform and standardized way of doing this for anyone to enforce it to begin with.

    • perbu 1 hour ago
      If you want to verify the installed package, the package should provide checksums you can verify. AFAIK, the SBOM is there to document the build, not the install.
      • zvr 19 minutes ago
        Ah, but there are actually different types of SBOMs, that describe the software in different parts of its lifecycle. It's a completely different outcome to record the software when looking at its source, at what is being distributed, or at what is being installed, for example.

        At some point we realized that we were talking past each other, since everyone was using "SBOM" to describe different contents and use cases.

        The consensus was expressed around 3 years ago, and published in https://www.cisa.gov/sites/default/files/2023-04/sbom-types-...

      • notepad0x90 1 hour ago
        The checksum just tells you what the hash is, nothing more. Supply chain attacks aren't always against the main executable either. With authenticode, the "catalog" can be signed. You're even more opposite of OP than I (OP proposes lockfiles which are at runtime).

        It shouldn't be for "just" any state of the software. We should be able to verify SBOM and take actions at any point. At build time, it is only useful for the developer, so I don't get why SBOM is relevant there at all. I think you mean at deployment time (when someone installs it, they check SBOM). What I'm saying is: when you fetch the software (download, package manager, appstore, curl|sh), when you "install" it, when you run it, and when it is dormant and unused, at all of those times SBOM should be checkable. Hashes are useless unless you want people to collect hashes for every executable constantly, including things like software updates.

        The problem is, people are looking at it only from their own perspective. People interested in audits and compliance don't care about runtime policy enforcement. People worried about software supply chain compromises care more about immediate auditability of their environment and the ability to take actions.

        The recent Shai-Hulud node worm is a good example. Even the best sources were telling people to check specific files at specific locations. There was just one post I found on github issues where someone was suggesting checking the node package cache. Ideally, we would be able to allow-list even js files based on real-time SBOM driven policies. We should be able to easily say "if the software version is published by $developer between dates $start and $end it is disallowed".
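
The rule quoted above ("published by $developer between dates $start and $end") could be sketched roughly as follows. Everything here is hypothetical: the `Component` and `DenyRule` types, their fields, and the assumption that an inventory records a publisher and a publish date per component are illustrative only, not part of any SBOM standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Component:
    """One inventory entry (hypothetical shape)."""
    name: str
    version: str
    publisher: str
    published: date


@dataclass(frozen=True)
class DenyRule:
    """Disallow anything a publisher released inside a date window."""
    publisher: str
    start: date
    end: date

    def matches(self, component: Component) -> bool:
        return (component.publisher == self.publisher
                and self.start <= component.published <= self.end)


def disallowed(components, rules):
    """Return the components hit by at least one deny rule."""
    return [c for c in components if any(r.matches(c) for r in rules)]
```

The point of the sketch is that the rule is pure data evaluated against an inventory, so the same check could run at fetch, install, or audit time without executing the software.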

        • baobun 16 minutes ago
          I still don't see how lockfiles can't be SBOM.

          They contain for each dependency name, version, (derivable) URL and integrity checksum, plus of course the intra-dependency relationships.

          This can all be verified at any point in the lifecycle without running any of the code, provided a network connection and/or the module cache. What's missing?
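
As a rough illustration of that offline check, here is a sketch that verifies a downloaded or cached tarball against a lockfile `integrity` value, assuming the npm-style form `<algorithm>-<base64 digest>`. The function name `verify_integrity` is made up for this example.

```python
import base64
import hashlib
import hmac


def verify_integrity(data: bytes, integrity: str) -> bool:
    """Check bytes against an "<algorithm>-<base64 digest>" integrity string."""
    algorithm, _, expected_b64 = integrity.partition("-")
    if algorithm not in {"sha256", "sha384", "sha512"}:
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    actual = hashlib.new(algorithm, data).digest()
    # Constant-time comparison of the raw digests
    return hmac.compare_digest(actual, base64.b64decode(expected_b64))
```

Nothing in the package is executed; the only inputs are the bytes from the network or module cache and the string from the lockfile.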

          > With authenticode, the "catalog" can be signed

          You could trivially sign any lockfile, though I've never seen it. I think it could be neat and it might have a chance to catch on if manifest and lockfile PGP sigs were natively supported by the NPM registry and tooling.

  • onion2k 1 hour ago
    Isn't one fairly major problem with using lockfiles that there could be packages in the lockfile that aren't used in the application? If I run "npm i package" that doesn't tell you whether or not 'package' is actually used in the app.

    For most things that unused dependency is just annoying but if your government has mandated that you use a specific package for something (e.g. cryptography) the lockfile isn't enough to give you confidence that the app is actually doing that. You'll still need to audit the application code.
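
A crude way to surface the gap described above is to compare declared dependencies with what the source actually imports. The sketch below is only a heuristic under stated assumptions: it scans JavaScript source text with a regular expression rather than a parser, the names `package_name` and `unused_dependencies` are invented for illustration, and an import being present still says nothing about whether the package is used correctly.

```python
import re

# Matches require('x'), import('x'), import 'x', and ... from 'x'
IMPORT_RE = re.compile(
    r"""(?:require\(\s*|import\(\s*|from\s+|import\s+)['"]([^'"]+)['"]"""
)


def package_name(specifier: str):
    """Reduce an import specifier to its package name, or None for file paths."""
    if specifier.startswith((".", "/")):
        return None  # relative or absolute path, not a package
    parts = specifier.split("/")
    # Scoped packages keep two segments: "@scope/pkg/sub" -> "@scope/pkg"
    return "/".join(parts[:2]) if specifier.startswith("@") else parts[0]


def unused_dependencies(declared, sources):
    """Return declared package names that no source text appears to import."""
    used = set()
    for text in sources:
        for specifier in IMPORT_RE.findall(text):
            name = package_name(specifier)
            if name:
                used.add(name)
    return sorted(set(declared) - used)
```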

  • woodruffw 5 hours ago
    This is a great summary, although I think I'm more bearish on SBOMs than Andrew is: my experience integrating them so far (in both pip-audit and uv) has been that there's much more malleability at the representation level than the presence of a standard might imply, and that consumers have adapted (a la Postel) to this reality by being very permissive with the kinds of broken stuff they permit when ingesting third-party SBOMs.

    (Case in point: pip-audit's CycloneDX emission was subtly incorrect for years, and nobody noticed[1].)

    [1]: https://github.com/pypa/pip-audit/pull/981

  • perbu 1 hour ago
    Software I built will have the following ingredients.

    source from git, ~30 go packages, ~150 npm packages, a three-layered docker image

  • ozim 1 hour ago
    Typical software developer fallacy - well it looks the same so we can abstract and merge concept.

    Well, NO: lock file and SBOM formats are used for different purposes and are to be consumed by different audiences. They will evolve at different speeds and in different ways. Ideally SBOM should not evolve, and package lock should be able to change on a whim by package manager developers.

    SBOMs are meant to be shared with 3rd parties while lock files are not. That some tooling accidentally started using lock files for ingestion is just because people didn’t know better or couldn’t explain to their customers why they should do SBOM, so they did the first, easiest thing.

  • voidUpdate 28 minutes ago
  • zingar 5 hours ago
    I'm hearing the SBOM term for the first time from that article and the linked Wikipedia page. For the ignorant like me: what is it that SBOM is used for that lockfiles aren’t? Everything in the article is something that I’m used to seeing automated scanners using lockfiles for.

    Is it just that the two are used by different communities? What is the SBOM community?

    • zvr 8 minutes ago
      Think of the SBOM as a "table of contents" for the software you are receiving. Another metaphor that has been used is the "nutrition label" that you get on all packaged food.

      So, it's a list of the "software components" that are inside a piece of software. And then you add metadata about each of these components: what's its name? its version? its hash? Up to now we're in lockfile territory.

      But you want more information: what is the license? who supplied it? what is the security status? does it have known CVEs? are they relevant?

      And then you go to special cases, like "AI" software: oh, it's a model? how was it trained? on which data? Or like software that has to be certified, to be used when safety is important.

      An SBOM is capable of providing all this information. Take a look at the different parts that SPDX provides, and it's an ever expanding area.

    • edoceo 5 hours ago
      In many cases the lock files are for one part of the stack. Like npm and composer and $other_lang thing. SBOM is when all are together and version-pinned. (I've oversimplified).

      Edit: for my domain we have Alpine, Debian, PHP, JS, Go in the stack. So our BOM has all that (and dependencies). It's a big list. Some are just necessary base (Alpine, Debian), some are core stack, and others are edge (a dependency on a python lib when we're mostly Rust (or something)).

      Mirror/Vendor all these things for supply-chain integrity (it's what they tell me)

    • Khaine 3 hours ago
      SBOMs are a solution intended to help solve a couple of problems:

      1) help identify and remediate software that has been built with vulnerable packages (think log4j).

      2) help protect against supply chain compromise as the SBOM contains hashes that allow packages to be verified
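
The first use case can be sketched as a lookup of SBOM components against advisory data. This assumes a CycloneDX-shaped document with a `components` list of `name`/`version` entries and a hypothetical advisory table keyed by exact name and version; real scanners match version ranges and package identifiers such as purls, which this sketch does not attempt.

```python
def affected_components(sbom: dict, advisories: dict) -> list:
    """Return (name, version, advisory_id) for each component with a known advisory.

    `advisories` maps (name, version) -> list of advisory IDs (hypothetical shape).
    """
    hits = []
    for component in sbom.get("components", []):
        key = (component["name"], component["version"])
        for advisory_id in advisories.get(key, []):
            hits.append((component["name"], component["version"], advisory_id))
    return hits


# Illustrative data only: one log4j-style entry and one unaffected component
example_sbom = {"components": [
    {"name": "log4j-core", "version": "2.14.1"},
    {"name": "some-other-lib", "version": "1.0.0"},
]}
example_advisories = {("log4j-core", "2.14.1"): ["CVE-2021-44228"]}
```

With an SBOM on file for every shipped artifact, answering "which of our products contain the vulnerable version?" becomes a query over stored documents rather than a rebuild or a rescan.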

      • ozim 1 hour ago
        You forgot the important one: SBOMs are created with sharing in mind, to be given to third parties like your customers. Lock files are not.
    • Tomte 1 hour ago
      Software licensing information is the big use case where SPDX originated from.

      In CycloneDX you can also express things like attestations/certifications, possibly down to the code review level (although I think nobody does that).

    • LoganDark 5 hours ago
      > what is it that SBOM is used for that lockfiles aren’t?

      Compliance. The article mentions "the EU’s Cyber Resilience Act will push vendors toward providing SBOMs", and having package managers generate SBOMs directly would certainly be convenient for that.

      • jlubawy 3 hours ago
        The FDA also requires SBOMs as of a few years ago for medical device software.
  • phendrenad2 4 hours ago
    > the security world has been pushing CycloneDX and SPDX

    > CycloneDX supports JSON, XML, and YAML

    And SPDX is JSON.

    Are there any other examples of government-mandated non-human-readable file formats? I feel like bureaucracies have a natural tendency to water down requirements such as this and instead focus on getting wet signatures on pen-and-paper.

    • Tomte 25 minutes ago
      Or tag-value, which is actually preferred by many practitioners. Nesting is implicit in that format, but SBOMs should be mostly flat, anyway.

      Unfortunately, T-V has been dropped in SPDX 3.0.

      • zvr 4 minutes ago
        It was dropped exactly because it was flat and it was becoming completely unmanageable.

        SPDX v3 is based on a graph model that can represent hierarchies natively. It can then be serialized in a file, for example, in JSON format.

  • firloop 5 hours ago
    Another drawback could be that package manager lockfile schemas are optimized for performance[0]. I wouldn't appreciate seeing slower install times by default - especially if the lockfile could be converted with other tooling.

    [0]: https://bun.com/blog/behind-the-scenes-of-bun-install#optimi...