I Don't Maintain My Homelab

(cleberg.net)

56 points | by surprisetalk 1 day ago

32 comments

  • l72 3 minutes ago
    This is surprising to me and the exact opposite of what I want for a few reasons:

    1. I don't like surprise breakages. I am not prepared to fix a service my family uses midday on a Tuesday when I am working since it auto updated. I'd like to specifically make sure I have dedicated time and plan if something is going to go wrong.

    2. My family HATES when things change. I try to run LTS versions of things, but annoyingly, some software like nextcloud doesn't have an LTS version. One of the things my family likes the most is that the stuff I host isn't constantly changing like commercial products. Having google photos change or netflix have a new interface randomly is very, very frustrating for them.

    Since my homelab is completely internal, I avoid quickly doing updates (unless it is a critical security issue), and definitely avoid doing major version upgrades unless there is good value in it.

  • silversmith 2 hours ago
    I also have a "homelab" with minimal maintenance requirements. I'd wager it works out to much less than 15 minutes a month over a year. The strategy is as follows: pin all services to known good versions, deny access from outside LAN, and don't touch it unless there's a new service release with new features I want. Not something I would do at work, but perfectly fine for home setting.
  • kordlessagain 2 hours ago
    I've had "servers" or a "homelab" at home for decades. I stopped a while ago when I burned out. About 4 months ago, I bought a new motherboard and graphics card for my desktop and dropped the old ones into a $70 case I got from Best Buy and put Ubuntu on it. I think I spent 10x that on memory for my new desktop, but that's just a passing grumble. The new server now runs transcription and embeddings for me on the old GPU. That motherboard is still plenty fast, but pushing 8 years old now. That's the advantage of buying a nice board from the outset.

    The rest of the lab is a few ephemeral instances on Google, with dual A100s that spin up when I need to train things.

    I put Ubuntu on the old beast, and never touch it. If the power goes out, it automatically comes on and Docker launches all the services when it comes up.

    About the only thing that needs watching is the tiny SDR radio plugged into it, which I use for pure random numbers and talking to it with a hand held radio from the other house. Sometimes I have to unplug it and then plug it back in to get it back into service. No amount of finagling seems to fix it from software.

    • freedomben 1 hour ago
      > About the only thing that needs watching is the tiny SDR radio plugged into it, which I use for pure random numbers and talking to it with a hand held radio from the other house.

      You are an interesting person! We would be friends IRL :-)

      May I ask what you use the pure random numbers for? And what you use the radio link for?

      • kordlessagain 1 hour ago
        Thanks! I'm a nerd, for sure.

        I've built an SDR radio stack to integrate with a single pane navigation and chart plotting app I'm building for my new company. You use the radio to talk to the local agent (I just finished a submission to the Gemma Challenge on HF to learn how to train models). Wake words show up on the glass, for security. Been working on training a small model to do agentic controls, including changing autopilot and switching displays when you want to see other content on the screen. I've been working on isochrone routing and have it working well now. Waiting for Fabel to come back to continue the work...

        Everything is here: https://deepbluedynamics.com. It's just me as an LLC. No VC. No backers. No users, yet. Few stars, but 100% written by me, and most of it is open source. FWIW, I've been around the block and I'm getting old now. So the radio helps with arthritic fingers! :) The radio stack stuff lives here: https://nuts.services/sdrrand

        I'm also integrating the voice stuff into Hyperia, my terminal emulator forked from Hyper Terminal. It's on Github. Hyperia is also agentic controlled, so I can talk to it by pane name to inject text into the prompts. This lets me get up and roam while I'm roaming around. I'm remodeling my front house, and for a while I had one of those smart lights turning red or blue when things happened in the sessions, but I want something I can talk to without having to click and type. I use the Windows transcription feature a lot, but talking on the radio to it is much easier.

        Oh and to answer your question about random numbers, I use them for various Monte Carlo based approaches. For example, I inject those numbers into training runs, rolls for the iChing (helpful for feeding agents for decision busting), and seeding other things like simulated wind speeds and current (in the sim because I'm not sitting on a boat). I even use it for sampling documents I've been indexing (cut up a bunch of business books and have a local model use them for reference).

        I have the SDRrand sources wired into almost everything now that needs randomness. Is it necessary? Maybe. We only get pseudo random numbers from computers, so I'm just attached to the purity of it, if anything.

    • trey-jones 1 hour ago
      When I found Tesla P100s on ebay for $75 I thought I was getting a cheap server, but 1/3 of the total cost was RAM. Sorry, first time I've built a computer since the shortage.
  • exiguus 28 minutes ago
    This is a fantastic article! I completely agree with the author's philosophy. Simple automation can reduce maintenance to nearly zero, and it's incredible how much can be achieved with just a few well-crafted scripts.

    I use a nearly identical alias for docker pull to keep my containers updated. To ensure everything stays running smoothly, I've built a lightweight watchdog (a mix of bash scripting and Uptime Kuma/Beszel) that monitors my services and containers and restarts them if they crash. This way, I rarely need to intervene manually.
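
    A minimal Python sketch of one watchdog pass like the one described, assuming the Docker CLI is on the PATH. The commenter's version is a mix of bash and Uptime Kuma/Beszel, so the function names and structure here are hypothetical, not their script:

```python
import subprocess
from typing import Callable, Iterable, List


def docker_is_running(name: str) -> bool:
    """Ask Docker whether the named container is currently running."""
    result = subprocess.run(
        ["docker", "inspect", "-f", "{{.State.Running}}", name],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and result.stdout.strip() == "true"


def docker_restart(name: str) -> None:
    """Restart the named container; failures are left for the next pass."""
    subprocess.run(["docker", "restart", name], check=False)


def watchdog_pass(
    containers: Iterable[str],
    is_running: Callable[[str], bool] = docker_is_running,
    restart: Callable[[str], None] = docker_restart,
) -> List[str]:
    """Check each container once and restart the ones that are down.

    Returns the names that were restarted, so a caller can log or alert.
    """
    restarted = []
    for name in containers:
        if not is_running(name):
            restart(name)
            restarted.append(name)
    return restarted
```

    Run from cron or a systemd timer, something like `watchdog_pass(["jellyfin", "nextcloud"])` (container names are placeholders) would cover the "restarts them if they crash" part. The check and restart functions are injectable so the loop can be tested without Docker.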

    For critical services (DNS, VPN, git, web search, crawler and mail, etc.), I add an extra layer of redundancy by running them on multiple servers across different locations. If one server fails, the others seamlessly take over. I also use DNS round-robin as a simple but effective way to handle load balancing and failover; no HAProxy, K8s, expensive IP Takeover (ARP Spoofing) or BGP Anycast and VRRP/CARP, Proxmox or fancy orchestration tools required. If a node goes down, another watchdog script temporarily removes it from DNS, and traffic shifts to the remaining servers. Most often the services are self-healing. The best part? My deployment and monitoring are fully self-scripted (no Terraform, Ansible or BundleWrap). Moving services to a new server is as easy as running some scripts over SSH. Everything sets itself up automatically. Currently I run my services on 2 Pis, 2 stratum 1 servers (from centerclick), and 8 VPSs that cost me around $40/month. It's a great example of how a little automation and redundancy can go a long way in keeping things cheap and reliable without unnecessary complexity.
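
    The "remove a dead node from DNS" step could look roughly like the Python sketch below. It only decides which IPs should stay in the round-robin record set; pushing that set to a DNS provider is left out because it depends on the provider. The node names, the TCP probe, and the fail-open rule are assumptions, not the commenter's implementation:

```python
import socket
from typing import Callable, Dict, List


def tcp_probe(ip: str, port: int = 53, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def records_to_publish(
    nodes: Dict[str, str],
    probe: Callable[[str], bool] = tcp_probe,
) -> List[str]:
    """Return the IPs that should stay in the round-robin record set.

    nodes maps a hypothetical node name to its IP. If every probe fails,
    all IPs are kept: an empty record set would take the service down for
    certain, and the fault may be in the monitor itself.
    """
    healthy = [ip for ip in nodes.values() if probe(ip)]
    return healthy if healthy else list(nodes.values())
```

    Round-robin failover of this kind is only as fast as the record's TTL, so a short TTL on the record is usually part of the design.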

    I invest around 1-2h/month to maintain and (mainly) adjust my setup. Before, I had multiple Proxmox instances and a backup server that cost me around $250/month, and I was spending 1-2h/week just to keep everything running. The difference is night and day.

    Thanks for the inspiration; it's always refreshing to see others embracing simplicity!

  • itomato 2 hours ago
    Yes, but you didn’t mention anything that would suggest a need to ‘maintain’.

    It doesn’t change.

    Many people keep swapping gear in so they can learn BGP on Cisco edge gear or run clusters on salvaged IB.

    OP is not that person.

    • NBJack 2 hours ago
      I gotta agree. I set up a homelab originally to start learning more about virtualization, Kubernetes, etc. It was painful, required time to fix my mistakes, and I hit my head on the ugly realities of distributed hardware. But it was also experience I could (and did) apply to my job.
  • freedomben 1 hour ago
    Getting to this point with my homelab has always been my goal, and I've also arrived. I mainly just want a stable, reliable Jellyfin, Audiobookshelf, archivebox, Navidrome, ollama/openwebui, and a place with plenty of RAM and CPU to spin up and run a half-dozen various VMs at a time, without having to mess around to use them.

    Building/tinkering/playing around is fun, but once you are actually self-hosting services you rely on, it needs to "just work" or you will eventually burn out or lose interest. Especially when you take on more users than just yourself. The day my wife cancelled her audible subscription because audiobookshelf was just as good (IMHO better) was a good day, but that only happens because it is stable/reliable.

  • stego-tech 2 hours ago
    This has been a similar approach to what I did for my own homelab. I still need to set up some sort of GitOps so I don’t have to ssh into the box and manually bootstrap whatever compose file I’ve thrown on there, but that’s honestly about it.

    * Docker Compose files and various folders for containers live on an NFS share

    * SQLite and other databases run off a local SATA SSD for speed and reliability

    * Cronjob tarballs the critical stuff nightly and throws it on another NFS share to get ingested into Backblaze B2.
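
    For the nightly tarball step, a minimal Python sketch (the commenter uses a cronjob, presumably around `tar`; the function name, archive naming scheme, and retention count here are assumptions):

```python
import tarfile
import time
from pathlib import Path
from typing import Iterable


def nightly_backup(sources: Iterable[Path], dest_dir: Path, keep: int = 7) -> Path:
    """Write a timestamped .tar.gz of sources into dest_dir.

    Only the newest `keep` archives are retained; older ones are deleted.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for src in sources:
            # Store each source under its own folder name, not its full path.
            tar.add(src, arcname=src.name)
    # Timestamped names sort chronologically, so the oldest come first.
    archives = sorted(dest_dir.glob("backup-*.tar.gz"))
    for old in archives[:-keep]:
        old.unlink()
    return archive
```

    Pointing `dest_dir` at the second NFS share would leave the upload to Backblaze B2 to whatever already ingests that share.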

    Now I just get to kick back and actually experiment with new things instead of babysitting a convoluted Proxmox upgrade or shunt onto a new container standard.

    Does it run rootless? Not atm (blame FreshRSS, my sole holdout). Is it super secure? Probably not, but I’m not doing anything goofy like mounting the Unix socket into a container at the very least, and the server credentials don’t work anywhere else should it get popped. The blast radius is contained, and that’s more important to me than Enterprise-grade security for my homelab (a la Wazuh, another backlog project TBD).

  • kamov 2 hours ago
    > I've approximated it somewhere around 15 minutes of maintenance per month, barring an emergency. If that's normal to you, congrats - you've peaked in life. However, that's absolutely absurd to me. I used to spend days on end building, maintaining, and debugging various aspects of my servers, databases, apps, etc.

    It's been normal for me for the past 3 years thanks to using NixOS for all server infrastructure.

    • embedding-shape 2 hours ago
      Same. As someone who is OK with a small amount of maintenance every N months, but keeps forgetting how things are set up or what I did, moving absolutely everything into Nix and running NixOS made things a hell of a lot simpler when you come back after 6 months and can easily find where and what to change, as long as you take care to declaratively set things up via Nix as much as possible, and use git.

      Helps that things are really easy to test too, spin up a new test VM with your new config and copy of real data, check if it works, then apply the change to the real hardware and you're good to go. Alternatively, do it live with a copy of real data, then rollback in case it doesn't work.

    • sunaurus 1 hour ago
      I've been maintaining homelab servers for two decades, used all kinds of IaC approaches etc.

      Switched to NixOS a few years ago, and I can't overstate the amount of peace it has brought to my life. It just takes so much stress away, compared to everything else I've used before.

      My only criticism is that the Nix language is not super ergonomic or easy to learn. But with LLMs nowadays, even that is barely ever a problem.

    • ochoseis 1 hour ago
      Options still get deprecated in NixOS and require working around, and while debugging has gotten easier it can be a pain to debug when there’s an error somewhere deep in your configs. I’ve found that NixOS is like 0 maintenance most months and then half a Saturday two or three times a year figuring out why I can’t update.
  • teekert 1 hour ago
    I thought this was going towards the "I have an agent do it". glad it didn't :)

    What this skips though is the complexity of services like NextCloud (stuck in maintenance mode again?), Immich (needs a compose file edit?), MineCraft worlds (Dad! my client is on another version again!), (dmn) AlbyHub (needs re-login and closed its channel).

    But to be fair this is really getting quite minimal these days indeed. I didn't really realize it but I too have a mostly hands-off home-lab... Ok, then it's not really a lab anymore, it's more "stable home-infra" ;)

  • teiferer 1 hour ago
    > I've never required a backup, but it's good to be safe.

    Indeed. And if you never test your recovery then you don't actually have a workable backup.
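
    A restore test does not need to be elaborate. The Python sketch below extracts a backup archive into a scratch directory and compares file hashes against the live data; it assumes a tar archive whose top-level folder is named after the source directory, and the function names are hypothetical:

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path
from typing import Dict


def file_hashes(root: Path) -> Dict[str, str]:
    """Map each file's path relative to root to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def restore_matches(archive: Path, source_dir: Path) -> bool:
    """Extract archive to a scratch directory and compare it with source_dir.

    Assumes the archive is trusted (it is your own backup) and that it was
    created with source_dir's name as its top-level folder.
    """
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive) as tar:
            tar.extractall(scratch)
        restored = Path(scratch) / source_dir.name
        return file_hashes(restored) == file_hashes(source_dir)
```

    Data that changed after the backup was taken will show up as a mismatch, so a check like this is most meaningful right after the backup runs, or against a snapshot.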

    • uhoh-itsmaciek 1 hour ago
      Right. Backups don't matter. Restores matter.
  • INTPenis 41 minutes ago
    Me and all my friends are active homelabbers and selfhosters.

    Recently one had their first baby, so they migrated from Fedora to RHEL, just to spend less time on upgrades. :D I thought that was cute. Like RHEL is so stable, even a first time parent can use it.

  • 28304283409234 2 hours ago
    Is it still a lab then? Or selfhosted services on auto-pilot?
  • zf00002 1 hour ago
    I used to think of my home setup as a homelab but I've realized for about 4 years now I barely do anything with it beyond what it's supposed to do. I just have some basic services running; most of the time spent over the last 4 years has to do with Broadcom acquiring VMware so I had to switch to Proxmox, and then just moving houses and having to set up again.
  • Gigachad 1 hour ago
    Same here, I've just kept it simple with Immich and Nextcloud. Automatic updates set up on debian and automatic docker pull to update the apps. With a nightly backup to both a local hard drive and encrypted backups to google drive.

    After I set it up and stopped fiddling with it it's just run flawlessly for the last 6 months.

  • jmbwell 2 hours ago
    I’m so almost here. The thing holding me back is projects that don’t do their own migrations reliably. Through no fault of their own, perhaps, though at this point I would argue LLMs should eliminate any good reason not to have alembic integrated or something. And even Home Assistant is bizarrely averse to fully automated system wide updates. Updating system and core and addons all independently is bonkers. But yes, the simplest implementation is often the best
  • Havoc 2 hours ago
    tbh most of my time is making active changes and trying new things. Or, say, moving from LXC to Kubernetes

    Don’t super care about updates. If it isn’t too ancient and not internet facing then it’s probably ok

  • meindnoch 2 hours ago
    Yeah, and what happens when every now and then upstream changes break your config? Like when Debian removed systemd-resolved, breaking mDNS.
    • rcxdude 2 hours ago
      Then you spend a bit of time fixing it. With the right stack, these things are rare and not often difficult to resolve.
    • szszrk 1 hour ago
      It breaks. So you fix it and go back to previous mode.

      I'm not sure what's here to talk about. Things break. We don't have to overthink this. But if you want more predictability, stable distros exist.

  • s_ting765 2 hours ago
    Same here. Even though my homelab runs on a VPS. https://github.com/rhee876527/expert-octo-robot
  • JoelMcCracken 1 hour ago
    I always love coming across a new site and it screams “org-mode”.
  • bittumenEntity 2 hours ago
    Definitely opened this thinking it would be a story of handing the keys to AI. Refreshing, simple and to the point
  • _pdp_ 2 hours ago
    I wrote a small agent (single go binary) that does all the monitoring and maintenance for me. Possibly overkill but it is amusing to think there is a little ghost in the machine.
    • endre 1 hour ago
      please elaborate
  • owaislone 2 hours ago
    Interesting. For me if I want to keep my lab stable, I have to ensure I pin all images and components to a specific version. I rarely but deliberately upgrade them (2-3 months). I feel putting things on auto-update is bound to break stuff and force you to spend time on it at the worst possible times.
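
    One way to keep yourself honest about pinning is a small check over the image references in your compose files. A Python sketch, where "pinned" is assumed to mean a digest or any explicit tag other than `latest` (the function name and that definition are not from the comment):

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if a container image reference names a fixed version.

    Pinned here means a digest (name@sha256:...) or an explicit tag other
    than "latest". A bare name resolves to "latest", so it is not pinned.
    """
    if "@" in image_ref:
        return True
    # Only look after the last "/", so a registry port is not read as a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")
```

    Tags are mutable, so a tag pin only holds as long as the publisher does not move it; a digest is the only reference that cannot change underneath you.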
  • pshirshov 2 hours ago
    > UniFi supports automatic and scheduled updates,

    Yeah, right until the moment it bricks after an update.

    • Arainach 2 hours ago
      I've had automatic updates on for a decade without issue.
  • cheschire 2 hours ago
    I suspect my approach is even more controversial… I just open Claude Code and type /routine-maintenance and it reads the skill file, logs into all my systems on my home network, runs updates, validates that backups are still healthy, updates any docker images, checks SMART stats, reviews some logs, and then fires off an email using brevo to tell me any future maintenance concerns I might have.

    Edit: zero minutes old already downvoted.

    • aleksiy123 1 hour ago
      Manual type? No cron job?

      Practically Luddite

    • bilekas 2 hours ago
      But using AI is not the point of the article.
      • cheschire 2 hours ago
        The headline: “It's true. I don't maintain my homelab… it maintains itself.”

        So using AI is not the point of the article but neither was it mine.

        My point was I also attempt to implement homelab automation rather than manual maintenance, and I listed a few things that are onerous to do regularly by hand just like the article.

        But I totally expected people to just skim my message, see “AI” and dismiss it, so I’m not terribly upset.

  • PunchyHamster 2 hours ago
    Debian + unattended-upgrades package (+ some setup like telling it at which time it can reboot itself) is essentially "forget for 2 years then do dist-upgrade and forget for another 2 years" setup
  • endre 2 hours ago
    that's cruise control for supply chain attacks, at the bare minimum
    • nicomt 2 hours ago
      I think if you set cooldowns and stick to more reputable sources, it might be okay. I do pin my versions and do manual updates in my home lab, but that's more for stability and so it increases the chances I'll catch update issues while I'm already there. I don't pretend that gives me any extra security, though, because I don't have the time to review updates beyond surface-level changelogs. I don't think the solution to supply chain issues is for every developer to be paranoid at all times. I think we need better systems built on top of existing package managers to check provenance and integrity, and to allow security researchers and automated tools to vet releases before they're distributed more broadly.
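
      A cooldown is simple to express: only consider releases that have been public for some minimum time. A Python sketch, assuming you already have publish timestamps per version (the function name, the 7-day default, and the input shape are all hypothetical):

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def newest_after_cooldown(
    releases: Dict[str, datetime],
    now: datetime,
    cooldown: timedelta = timedelta(days=7),
) -> Optional[str]:
    """Return the most recently published version that has aged past cooldown.

    releases maps a version string to its publish time. Returns None if
    every release is still inside the cooldown window.
    """
    eligible = {v: t for v, t in releases.items() if now - t >= cooldown}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)
```

      This picks by publish time rather than by version number, so a backported patch to an older series could win over a newer major release; whether that is desirable depends on the project.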
    • KaiserPro 2 hours ago
      I mean is it really, more than any other update technique?
  • skydhash 1 hour ago
    I don’t have a “homelab”, just an old mac mini that runs jellyfin, gonic and calibre (content server), and on which I do try some linux things. It runs Debian and the actual maintenance is mostly “apt update && apt upgrade”.

    I don’t use docker, I’d rather create my own packages. And if a project is too trigger happy about requiring new dependency version, I drop them.

  • atoav 1 hour ago
    I run the mediatech department at a university. My tech requirement for any infrastructure boils down to three basic questions:

      1. How often do I have to touch it during the next ten years?

      2. How many of the times that I have to touch it are because I decided to do so?

      3. How much pain is it to fix and understand if I had my mind erased?
      
    This often works out in favour of dead simple solutions.
  • NBJack 2 hours ago
    Damn. That was boring. Putting all updates on autopilot is certainly a choice. But, hey, it's their homelab.
  • colordrops 1 hour ago
    I'm working on an all-in-one box that has OTA updates, requiring virtually zero maintenance after setup. It's currently at the pre-alpha stage. It bundles a router/firewall, app server, and NAS. Not trying to be everything to everyone, but covers the basic functionality most people would need. Automatically handles DDNS, TLS certs, backups, and SSO wiring. Entire config is in a single JSON file, but the system can be extended using plugins. It's based on NixOS but doesn't require the user to know that.

    https://HomeFree.host

    Longer term goal is a sleek plug-and-play box anyone can connect to their ISP modem with minimal technical knowledge.

    I'm currently running it on an Aoostar WTR Max NAS with my AT&T connection. Got another NUC connected to a Spectrum modem. My goal is to be able to flip back and forth between the two with a backup bundle within minutes.

    Considering breaking up the router and app server functionality so they can be run separately. Another idea is to use a custom 3D printed case with Framework laptop motherboard and battery, switch, and wifi AP to make a true all-in-one box. I currently need an external switch, backup battery, and wifi access point.

    Once the system feels mature, next steps would be things like federated tailnets with friends and family for things like distributed backups, compute/GPU, CDN, social networking, etc. Hoping that decentralized model training is cracked by someone at some point.

    From a coding perspective I'm hoping to modularize everything (since it's NixOS) and add thorough testing and hardening. It's already relatively modularized considering it's built on Nix flakes.

  • botfriendsarent 2 hours ago
    This "home lab" stuff is kind of nice hobbyist talk. I wish we had fancy words like that back in the 80s.

    Technology has come a long way. But I think that in tech we should be careful to not fall prey to monkey see monkey do.

    We should not be deploying technology in our homes to "mimic our employers"

    Remember they are miserable for a reason.

    • itomato 2 hours ago
      We did. We just didn’t classify it with a hashtag.

      Frankenstein couldn’t build a monster without influence. Same thing here.

      “CCNA? I’ll show you CCNA…”

      • botfriendsarent 2 hours ago
        I had a friend back in the 90s who referred to his desktop computer as "his mainframe" lol
  • cyberjunkie 2 hours ago
    No slop. Love it.