Instant database clones with PostgreSQL 18

(boringsql.com)

181 points | by radimm 7 hours ago

12 comments

  • elitan 1 hour ago
    For those who can't wait for PG18 or need full instance isolation: I built Velo, which does instant branching using ZFS snapshots instead of reflinks.

    Works with any PG version today. Each branch is a fully isolated PostgreSQL container with its own port. ~2-5 seconds for a 100GB database.

    https://github.com/elitan/velo

    Main difference from PG18's approach: you get complete server isolation (useful for testing migrations, different PG configs, etc.) rather than databases sharing one instance.

    • teiferer 1 hour ago
      You mean you told Claude a bunch of details and it built it for you?

      Mind you, I'm not saying it's bad per se. But shouldn't we be open and honest about this?

      I wonder if this is the new normal. Somebody says "I built Xyz" but then you realize it's vibe coded.

      • pritambarhate 57 minutes ago
        Let's say there is an architect and he also owns a construction company. This architect, then designs a building and gets it built from of his employees and contractors.

        In such cases the person says, I have built this building. People who found companies, say they have built companies. It's commonly accepted in our society.

        So even if Claude built for it for GP, as long as GP designed it, paid for tools (Claude) to build it, also tested it to make sure that it works, I personally think, he has right to say he has built it.

        If you don't like it, you are not required to use it.

        • rootnod3 2 minutes ago
          That has to be the worst analogy I have read in a while, and I’m HN that says something.
        • greatgib 54 minutes ago
          The architect knows what it is doing. And the workers are professionals with supervisors to check that the work is done properly.
        • pebble 42 minutes ago
          No, it's more like the architect has a cousin who is like "I totally got this bro" and builds the building for them.
          • foobarbecue 38 minutes ago
            Right and also in this world there are no building codes or building inspections.
        • testdelacc1 23 minutes ago
          What an outrageously bad analogy. Everyone involved in that building put their professional reputations and licenses on the line. If that building collapses, the people involved will lose their livelihoods and be held criminally liable.

          Meanwhile this vibe coded nonsense is provided “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. We don’t even know if he read it before committing and pushing.

      • elAhmo 1 hour ago
        It is the new normal, whether you are against it or not.

        If someone used AI, it is a good discussion to see whether they should explicitly disclose it, but people have been using assisted tools, from auto-complete, text expanders, IDE refactoring tools, for a while - and you wouldn't make a comment that they didn't build it. The lines are becoming more blurry over time, but it is ridiculous to claim that someone didn't build something if they used AI tools.

      • dpedu 1 hour ago
        Huh? It says so right in the README.

        https://github.com/elitan/velo/blame/12712e26b18d0935bfb6c6e...

        And are we really doing this? Do we need to admit how every line of code was produced? Why? Are you expecting to see "built with the influence of Stackoverflow answers" or "google searches" on every single piece of software ever? It's an exercise of pointlessness.

        • renewiltord 17 minutes ago
          I think you need to start with the following statement:

          > We would like to acknowledge the open source people, who are the traditional custodians of this code. We pay our respects to the stack overflow elders, past, present, and future, who call this place, the code and libraries that $program sits upon, their work. We are proud to continue their tradition of coming together and growing as a community. We thank the search engine for their stewardship and support, and we look forward to strengthening our ties as we continue our relationship of mutual respect and understanding

          Then if you would kindly say that a Brazilian invented the airplane that would be good too. If you don’t do this you should be cancelled for your heinous crime.

      • earthnail 1 hour ago
        Not sure why this is downvoted. For a critical tool like DB cloning, I‘d very much appreciate if it was hand written. Simply because it means it’s also hand reviewed at least once (by definition).

        We wouldn’t have called it reviewed in the old world, but in the AI coding world we’re now in it makes me realise that yes, it is a form of reviewing.

        I use Claude a lot btw. But I wouldn’t trust it on mission critical stuff.

        • dpedu 1 hour ago
          It's being downvoted because the commenter is asking for something that is already in the readme. Furthermore, it's ironic that the person raising such an issue is performing the same mistake as they are calling out - neglecting to read something they didn't write.
        • renewiltord 15 minutes ago
          If you don’t read code you execute someone is going to steal everything on your file system one day
        • ffsm8 1 hour ago
          Eh, DB branching is mostly only necessary for testing - locally, in CI or quick rollbacks on a shared dev instance.

          Or at least I cannot come up with a usecase for prod.

          From that perspective, it feels like it'd be a perfect usecase to embrace the LLM guided development jank

          • notKilgoreTrout 1 hour ago
            Mostly..

            App migrations that may fail and need a rollback have the problem that you may not be allowed to wipe any transactions so you may want to be putting data to a parallel world that didn't migrate.

            • parthdesai 5 minutes ago
              > App migrations that may fail and need a rollback have the problem that you may not be allowed to wipe any transactions so you may want to be putting data to a parallel world that didn't migrate.

              This is why migrations are supposed to be backwards compatible

          • gavinray 1 hour ago

              > Eh, DB branching is mostly only necessary for testing - locally
            
            For local DB's, when I break them, I stop the Docker image and wipe the volume mounts, then restart + apply the "migrations" folder (minus whatever new broken migration caused the issue).
    • whalesalad 1 hour ago
      Hell yeah. I’ve been meaning to prototype this exact thing but with btrfs.
    • tobase 1 hour ago
      Cool
    • Rovanion 1 hour ago
      You, is an interesting word to use given that you plagiarized it.
      • anonymars 1 hour ago
        Do you have a link to the original?
  • majodev 2 hours ago
    Uff, I had no idea that Postgres v15 introduced WAL_LOG and changed the defaults from FILE_COPY. For (parallel CI) test envs, it make so much sense to switch back to the FILE_COPY strategy ... and I previously actually relied on that behavior.

    Raised an issue in my previous pet project for doing concurrent integration tests with real PostgreSQL DBs (https://github.com/allaboutapps/integresql) as well.

  • radarroark 2 hours ago
    In theory, a database that uses immutable data structures (the hash array mapped trie popularized by Clojure) could allow instant clones on any filesystem, not just ZFS/XFS, and allow instant clones of any subset of the data, not just the entire db. I say "in theory" but I actually built this already so it's not just a theory. I never understood why there aren't more HAMT based databases.
  • BenjaminFaal 2 hours ago
    For anyone looking for a simple GUI for local testing/development of Postgres based applications. I built a tool a few years ago that simplifies the process: https://github.com/BenjaminFaal/pgtt
    • okigan 2 hours ago
      Would love to see a snapshot of the GUI as part of the README.md.

      Also docker link seems to be broken.

      • BenjaminFaal 2 hours ago
        Fixed the package link. Github somehow made it private. I will add a snapshot right now.
  • christophilus 2 hours ago
    As an aside, I just jumped around and read a few articles. This entire blog looks excellent. I’m going to have to spend some time reading it. I didn’t know about Postgres’s range types.
  • francislavoie 1 hour ago
    Is anyone aware of something like this for MariaDB?

    Something we've been trying to solve for a long time is having instant DB resets between acceptance tests (in CI or locally) back to our known fixture state, but right now it takes decently long (like half a second to a couple seconds, I haven't benchmarked it in a while) and that's by far the slowest thing in our tests.

    I just want fast snapshotted resets/rewinds to a known DB state, but I need to be using MariaDB since it's what we use in production, we can't switch DB tech at this stage of the project, even though Postgres' grass looks greener.

    • proaralyst 1 hour ago
      You could use LVM or btrfs snapshots (at the filesystem level) if you're ok restarting your database between runs
      • briffle 11 minutes ago
        LVM snapshots work well. Used it for years with other database tools.. But make sure you allocate enough write space for the COW.. when the write space fills up, LVM just 'drops' the snapshot.
      • francislavoie 1 hour ago
        Restarting the DB is unfortunately way too slow. We run the DB in a docker container with a tmpfs (in-memory) volume which helps a lot with speed, but the problem is still the raw compute needed to wipe the tables and re-fill them with the fixtures every time.
        • ikatson 9 minutes ago
          How about do the changes then bake them into the DB docker image. I.e. "docker commit".

          Then spin up the dB using that image instead of an empty one for every test run.

          This implies starting the DB through docker is faster than what you're doing now of course.

          • francislavoie 5 minutes ago
            Yeah there's absolutely no way restarting the container will be faster.
        • renewiltord 10 minutes ago
          I have not done this so it’s theorycrafting but can’t you do the following?

          1. Have a local data dir with initial state

          2. Create an overlayfs with a temporary directory

          3. Launch your job in your docker container with the overlayfs bind mount as your data directory

          4. That’s it. Writes go to the overlay and the base directory is untouched

          • francislavoie 3 minutes ago
            But how does the reset happen fast, the problem isn't with preventing permanent writes or w/e, it's with actually resetting for the next test. Also using overlayfs will immediately be slower at runtime than tmpfs which we're already doing.
  • 1f97 3 hours ago
    • horse666 2 hours ago
      Aurora clones are copy-on-write at the storage layer, which solves part of the problem, but RDS still provisions you a new cluster with its own endpoints, etc, which is slow ~10 mins, so not really practical for the integration testing use case.
  • TimH 3 hours ago
    Looks like it would probably be quite useful when setting up git worktrees, to get multiple claude code instances spun up a bit more easily.
  • 1a527dd5 3 hours ago
    Many thanks, this solves integration tests for us!
  • mvcosta91 4 hours ago
    It looks very interesting for integration tests
    • radimm 3 hours ago
      OP here - yes, this is my use case too: integration and regression testing, as well as providing learning environments. It makes working with larger datasets a breeze.
    • presentation 3 hours ago
      We do this, preview deploys, and migration dry runs using Neon Postgres’s branching functionality - seems one benefit of that vs this is that it works even with active connections which is good for doing these things on live databases.
    • drakyoko 3 hours ago
      would this work inside test containers?
      • radimm 2 hours ago
        OP here - still have to try (generally operate on VM/bare metal level); but my understanding is that ioctl call would get passed to the underlying volume; i.e. you would have to mount volume
  • horse666 1 hour ago
    This is really cool, looking forward to trying it out.

    Obligatory mention of Neon (https://neon.com/) and Xata (https://xata.io/) which both support “instant” Postgres DB branching on Postgres versions prior to 18.

  • oulipo2 1 hour ago
    Assuming I'd like to replicate my production database for either staging, or to test migrations, etc,

    and that most of my data is either:

    - business entities (users, projects, etc)

    - and "event data" (sent by devices, etc)

    where most of the database size is in the latter category, and that I'm fine with "subsetting" those (eg getting only the last month's "event data")

    what would be the best strategy to create a kind of "staging clone"? ideally I'd like to tell the database (logically, without locking it expressly): do as though my next operations only apply to items created/updated BEFORE "currentTimestamp", and then:

    - copy all my business tables (any update to those after currentTimestamp would be ignored magically even if they happen during the copy) - copy a subset of my event data (same constraint)

    what's the best way to do this?

    • gavinray 1 hour ago
      You can use "psql" to dump subsets of data from tables and then later import them.

      Something like:

        psql <db_url> -c "\copy (SELECT * FROM event_data ORDER BY created_at DESC LIMIT 100) TO 'event-data-sample.csv' WITH CSV HEADER"
      
      https://www.postgresql.org/docs/current/sql-copy.html

      It'd be really nice if pg_dump had a "data sample"/"data subset" option but unfortunately nothing like that is built in that I know of.