A native graphical shell for SSH

(probablymarcus.com)

208 points | by mrcslws 6 hours ago

37 comments

  • hatradiowigwam 3 hours ago
    This appears to me like a solution in search of a problem, like many others before it...the quote below seems relevant to this effort.

    "Those who do not understand Unix are condemned to reinvent it, poorly." ~Henry Spencer

    • hughw 1 hour ago
      I hired a programmer and after giving him his Linux laptop let him set up a few things. A couple hours later he asked me where he could get PuTTY for it, and I recognized a huge gap in my interview coverage.
    • forgot_old_user 2 hours ago
      that seems a little harsh. I think there is a real usability gap which this takes a crack at.

      Some ideas, like viewing a Linux dir over _ssh_ using native UI components, seem cool.

      I do agree, some of these do seem like they have already been solved in other ways (like an sshfs mount).

    • hedgehog 1 hour ago
      This resembles Plan9 more than UNIX. I wouldn't put UNIX up on a pedestal.
      • projektfu 1 hour ago
        Plan9 is funny because it's what UNIX might look like if the people working on UNIX understood UNIX, i.e. everything is a file and simple primitives are composed into complex systems.
        • lstodd 0 minutes ago
          yea, as sibling said. p9 was not possible on pdp11. what was possible there was .. v7 and bsd2. see https://github.com/felipenlunkes/run-ancient-unix

          p9 was done when "current state of unix" was already fixed in form of aix, sysv and bsds, it suffered the same fate as say beos.

        • hedgehog 37 minutes ago
          They had the benefit of hindsight and bigger hardware, but UNIX got too popular and now we're struggling to move past it. It would have been interesting to see what the fourth try would be like (though looking at Go I would probably not completely like it).
    • Modified3019 2 hours ago
      > "Those who do not understand Unix

      Funny enough, that right there is the actual fundamental problem here.

      I am reminded of a post or blog long ago that talked about programmable thermostats and how awful they are for most people to use despite how powerfully in the weeds one can get with them. Basically summarizing the issue as something like “People do not want to learn your arcane system, they just want the benefit it’s advertising”. A good UI knows how to minimize that gap.

      • XorNot 19 minutes ago
        I mean that's true but the number of UIs which simply don't add access to necessary features in the name of "simplicity" is enormous.

        The poster child of this is the Microsoft Office ribbon.

    • whatever1 1 hour ago
      No. It’s just that the more people use Linux, the more the UX decisions made 40 years ago will be questioned.

      Almost all dev facing machines have ssh server installed and accessible.

      Why does an ssh terminal have to look like character-only trash from the 1960s? Why is a TUI the best thing we pipe through ssh? Why can't I watch a 4k movie in the terminal, or browse the web using pinch to zoom?

    • aslihana 1 hour ago
      I think this is a case of `There’s no such thing as bad publicity`.
  • trashb 5 hours ago
    I like the idea of separating the frontend and backend of a graphical app. But I feel like this is hardly a novel idea, maybe I'm missing something.

    I take it you don't know about "X11Forwarding yes" or "html5 web app"

      For browsers, capabilities like connecting to Unix sockets have been dismissed as extremely niche
    
    That is a security concern; that's why it isn't implemented, at least for raw Unix sockets. You can have WebSockets and other ports, but only limited to HTTP.
    • mrcslws 4 hours ago
      Quick response regarding security:

      On various Mozilla forums that I saw, the discussion was basically: 1. We can't just allow the browser to connect to any socket, since many either explicitly don't want browsers connecting to them, or are oblivious to browsers. 2. ...so we need to also add some sort of allow list 3. ...this is getting too complicated for such a niche feature.

      So I think the nicheness was the high-order bit here.

      (FYI, Outer Loop does add an allow-list: https://outerloop.sh/unix-domain-sockets/)
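
      For readers wondering what "connecting to a Unix socket" means at the HTTP level, here is a minimal sketch in Python, assuming a POSIX system. It is not how Outer Loop or any browser does it; `UnixHTTPConnection` and `get_over_unix_socket` are hypothetical names, and the socket path is whatever the backend chose to listen on.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that dials a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        # The host is only used for the Host header; "localhost" is a placeholder.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        # Replace the usual TCP connect with an AF_UNIX stream socket.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def get_over_unix_socket(socket_path, url_path="/"):
    """Issue a GET over the socket and return (status, body)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", url_path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

      Access control here is just the file permissions on the socket path, which is why an explicit allow-list on top of it is a separate design decision.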

  • guhcampos 3 hours ago
    Author apparently has never heard about Cockpit.

    Everything they mention as "missing", or "novel" has been part of Cockpit for over a decade, from socket-based web server connection, backend-frontend separation for server apps and the whole idea of a server console with shell access itself.

    To answer them: "Isn’t it weird that this doesn’t already exist?" - No, it's not, because it has existed for ages.

    • gurjeet 1 hour ago
      > Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

      Sincerely, HN Guidelines Police :-)

      https://news.ycombinator.com/newsguidelines.html

      • guhcampos 1 hour ago
        I get it, but if the author of the article uses a biased and loaded language, I think it's fair game to do the same in the comments.
        • gurjeet 58 minutes ago
          I don't believe in that kind of response. Anything that one can say in rage or anger can be communicated in a calm and measured response.
    • jng 1 hour ago
      If I'm not mistaken cockpit is web UI and doesn't run native code, important differences.
      • mrcslws 49 minutes ago
        Thanks for pointing this out. I'm not hating on Cockpit, but Outer Loop (with Outer Shell) has solved a lot more of the stack. Cockpit accepts the constraints of living in existing browsers, so it requires exposing a port to the internet or using some SSH port forwarding tool. Whereas I built a dedicated browser to push capabilities so that users can get a "Just point me to a server" flow.

        This thread has been useful -- I think Cockpit will also work great in Outer Loop. And it will be easy to add it as an app in Outer Shell.

      • guhcampos 48 minutes ago
        It's a very, very thin web layer on top of native code:

        https://cockpit-project.org/guide/latest/features.html

        To the author's defense: Cockpit is Linux only, and they seem to intend on making this also available on Windows and Mac.

        Still, I don't see the appeal they seem to see, especially since it relies so much on SSH. The biggest use case I can think of for something like this in the real world is something like first-time setup or MDM, and in both situations setting up SSH to begin with has the same level of friction they're trying to remove.

        • XorNot 11 minutes ago
          Windows has quite a lot of remote admin tools that work pretty transparently over the network though.

          The issue is that they're historically never turned on or heavily restricted.

          Where the user is involved though RDP is a world class remote desktop never exceeded by Linux anywhere.

          If someone wants to impress me, point Claude at Wayland and get it so I can seamlessly open remote RDP from somewhere else, lock the local user session and resume it on the remote desktop, then walk back to the original terminal and continue working in that same user session. This worked perfectly over 20 years ago.

    • NooneAtAll3 2 hours ago
      I never heard of cockpit either

      what is it?

  • purplehat_ 5 hours ago
    i'm trying to understand how outer shell works here. on the website you give the following as your motivation:

    > Apps like Jupyter and Tensorboard are not typically visible to standard web browsers if they’re running on remote servers, because it would be terribly unsafe to let the whole internet touch this app. Instead, they run on a local port on the server, which your computer can’t access directly.

    > Classically, to get access to these, you had to open a new terminal and run:

    > ssh -L 24601:localhost:8889 mrcslws@lambda4.mycompany.com &

    > ssh -L 24602:localhost:6006 mrcslws@lambda4.mycompany.com &

    is this true? isn't the normal thing just to do this ssh forwarding for prototyping, then for deployment, you set up a website like myjupyternotebook.com, and then set up auth so that others can't access it. HTTP basic auth is not too much work.

    if you want SSH, not HTTP, to be what's publicly exposed, there's other options too, like putting it behind a VPN or tunnel.

    all this to say, outer loop is super cool, but I don't get it. I must be missing something about why you built it, so could you help me understand?
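
    As a side note on the quoted commands: `ssh` accepts several `-L` options in one invocation, so the two forwards can share a session. A small sketch that builds such a command line, assuming the ports from the quote; `build_forward_command` is a hypothetical helper, not part of any tool discussed here.

```python
import shlex


def build_forward_command(user_host, forwards):
    """Return an ssh argv that forwards several local ports in one session.

    forwards is a list of (local_port, remote_port) pairs; each remote port
    is assumed to listen on the server's own localhost.
    """
    argv = ["ssh", "-N"]  # -N: no remote command, just hold the forwards open
    for local_port, remote_port in forwards:
        argv += ["-L", f"{local_port}:localhost:{remote_port}"]
    argv.append(user_host)
    return argv


cmd = build_forward_command(
    "mrcslws@lambda4.mycompany.com", [(24601, 8889), (24602, 6006)]
)
print(shlex.join(cmd))
```

    The argv list can be handed to `subprocess.Popen` directly; `shlex.join` is only used to show it as a shell line.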

    • mrcslws 4 hours ago
      I think there are different clusters of people who use servers, SSH, etc.

      I'm closer to the cluster that uses them for deep learning experiments, GPU kernel optimization, robot development (a robot is just a server that moves!)... use cases where you are explicitly using a remote computer.

      For this cluster of people, I think this tool feels more intuitive than the flow you suggest. But maybe I'm projecting!

      And, to me, this just feels like one of the fundamental things that could exist; it's like a graphical operating system, but remote-first.

    • _def 5 hours ago
      I guess it saves you the hassle of dealing with reverse proxies and TLS certs if your use case is "userbase is 1 person and it is me, and i only access services from a desktop os"
      • KomoD 4 hours ago
        Ever since I started using Caddy, doing that has been soooo easy.

        Download the binary, make a Caddyfile

          myservice.example.com {
              basic_auth {
                  admin some_password_hash_here
              }
              reverse_proxy :3000
          }
        
        And then just "./caddy start"
        • gizzlon 3 hours ago
          Caddy can also proxy to unix sockets !
        • Natfan 4 hours ago
          does this work with multiple caddy servers? ie can you bind multiple caddy servers to port 80/443?
          • KomoD 4 hours ago
            You can have multiple configs in a single Caddyfile and reload when you make changes, and it'll just route them as you wish, e.g.

            domain1.com -> service on port 1234

            domain2.com -> service on port 5678

            domain3.com -> serving a file directory.

            And then you still access domain1.com, domain2.com, domain3.com on port 80/443

          • tcoff91 4 hours ago
            You set up multiple services behind a single caddy reverse proxy
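
            The idea in these two replies is name-based routing: one listener on 80/443 picks a backend from the request's Host header. A rough sketch of that lookup, assuming the three example domains above; this is not Caddy's implementation, and `ROUTES`, `route`, and the `/srv/www` directory are made up for illustration.

```python
# Hypothetical routing table mirroring the three example domains above.
ROUTES = {
    "domain1.com": ("proxy", "127.0.0.1:1234"),
    "domain2.com": ("proxy", "127.0.0.1:5678"),
    "domain3.com": ("files", "/srv/www"),
}


def route(host_header):
    """Pick a backend from the Host header, ignoring case and any port."""
    host = host_header.strip().lower()
    # Drop a trailing ":port" such as "domain1.com:443".
    # (IPv6 literals are not handled in this sketch.)
    if ":" in host:
        host = host.rsplit(":", 1)[0]
    return ROUTES.get(host)
```

            An unknown host returns `None`, which a real proxy would turn into an error response or a default site.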
    • procaryote 3 hours ago
      Btw, if you find yourself sending a lot of ports over ssh, you can also consider the option of having ssh start a socks5 proxy

      ssh -D 4711 -q -C -N user@host

      sets localhost:4711 up as a socks5 proxy you can tell your browser to use

      ...

      A wireguard VPN is better of course; among other things because ssh is multiplexing over a single TCP connection and will encounter head of line blocking (where one dropped packet blocks all forwarded traffic until resent)
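
      For the curious, this is roughly what a client says to that SOCKS5 proxy once it is up. The sketch below only builds the first two client messages as described in RFC 1928 (no-auth greeting, then a CONNECT by domain name); it does not open a connection, and the function names are hypothetical.

```python
import struct


def socks5_greeting():
    """Client greeting offering only the 'no authentication' method."""
    # VER=5, NMETHODS=1, METHODS=[0x00 (no authentication)]
    return bytes([0x05, 0x01, 0x00])


def socks5_connect_request(hostname, port):
    """CONNECT request that lets the proxy resolve the hostname itself."""
    name = hostname.encode("idna")
    if not 1 <= len(name) <= 255:
        raise ValueError("hostname must be 1..255 bytes")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name), then length byte
    header = bytes([0x05, 0x01, 0x00, 0x03, len(name)])
    # Port is a 16-bit big-endian integer.
    return header + name + struct.pack("!H", port)
```

      With the `ssh -D 4711` command above, these bytes would be written to a TCP connection to localhost:4711, and name resolution then happens on the remote side of the tunnel.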

  • tammer 16 minutes ago
    I think the approach here where interfacing with a device is considered from first principles is one that is rarely taken on, and this is a thought provoking implementation. Kudos.
  • calmbonsai 3 hours ago
    Do not do this. There are many, many excellent long-standing security and "web control plane isolation" reasons browsers are not permitted generic socket permissions.

    The closest mechanical analog that comes to mind is why 3-wheeled ATVs are a bad idea.

    • mrcslws 2 hours ago
      I think it's okay as long as:

        - sockets are blocked by default, until they are added to an allow-list explicitly on the server side
        - True sudo awareness ensures root sockets aren't reachable without the sudo password. (This capability is important, because otherwise you create an incentive for people to run root backends with user-accessible sockets.)
      
      More here: https://outerloop.sh/security/
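
      The first bullet, deny by default with an explicit allow-list, can be sketched in a few lines. This is only an illustration of the policy, not Outer Loop's actual format or code; `ALLOWED_SOCKETS`, `is_socket_allowed`, and the example path are assumptions.

```python
import posixpath

# Hypothetical allow-list; the real Outer Loop format is not shown in this thread.
ALLOWED_SOCKETS = {"/run/user/1000/jupyter.sock"}


def is_socket_allowed(requested_path, allowed=ALLOWED_SOCKETS):
    """Deny by default; permit only exact matches after normalization."""
    # Collapse "." and ".." so path tricks cannot sneak past the comparison.
    normalized = posixpath.normpath(requested_path)
    return normalized in allowed
```

      A real check would probably also resolve symlinks (for example with `os.path.realpath`) before comparing, since `normpath` works on the string alone.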
  • smusamashah 30 minutes ago
    Feedback: The home pages of Outer Loop, Outer Frame and Outer Shell should each contain a basic intro of the others instead of just a link redirecting to them. By the time I click the link and land on the new Outer X, I have already forgotten which Outer X I came from and what it meant.
  • abnercoimbre 4 hours ago
    Lovely writeup! I'll bookmark this for my own research.

    My terminal's "clickity clackity" features [0] are local to the machine so I lose graphical-ness as soon as we remote in somewhere.

    That's starting to change a bit with offline replay [1] where the native GUI and TUI work in tandem to unlock some rewind. But there's quite a road ahead and I love seeing others experiment properly. (Terminals are massively underserved.)

    [0] https://terminal.click

    [1] https://terminal.click/posts/2026/06/tui-stability/#:~:text=...

  • cloudfudge 2 hours ago
    This reminds me of an idea that I built a PoC of many years ago (maybe 2013 if I recall) that I always felt was the nugget of a useful idea. You would SSH into a server and processes on the other end would emit data which was then displayed in a webapp that was served from a localhost port, with a local backend that consumed the data. So for example a short-lived web-based remote 'top'. I did it as part of a company-internal hackathon and thought it was really cool, but nobody else was impressed with it. It was a very half-baked idea, and this looks like a fully-baked version of it. I'll check it out.
  • flying_sheep 5 hours ago
    That's an interesting idea. If we put it into the CLI with some ANSI escape code, that may become something real. Imagine a normal terminal app that just renders part of its UI in web tech and communicates over a UNIX socket. While doing the fancy UI, everything is still controllable with the keyboard, and optionally with the mouse. The UI will fall back to a text UI on older terminals.
    • jerf 3 hours ago
      If your UI is not fully controllable with a keyboard, the same forces that made that happen will eventually make a mouse mandatory for this hypothetical tech stack too.

      The terminal has no Platonic quality of being keyboard only. It is an accident of history and the limitations it has had. Remove the limitations and remove the accident of history and you will just end up drawn into the strange attractor of GUIs, warts and all.

      There could be a brief honeymoon where the tech stack looks like some of you are imagining in your heads, but it would only last as long as it wasn't used by very many people. Google "gemini protocol" for a similar situation. That protocol has basically a cap on how popular it could possibly get before it just turned into HTTP B as the rest of the world forcibly upgraded it regardless of what the core project thinks. They exist in the shadow of HTTP, as the terminal exists in the shadow of GUIs. This is not a bad thing. It is what lets them be what they are. The shadows of GUIs or HTTP is large and there is plenty of space to be. Trying to give the terminal more GUI capabilities is like trying to give Gemini more web capabilities; you'll just end up in the same place, only with less refinement.

    • ori_b 5 hours ago
      So, uh... X11? VNC? RDP?
      • flying_sheep 5 hours ago
        No no, not something on top of the UI stack. They also need framebuffer support, so they are a big headache to set up on a headless server.

        What I mean is that we can bring some web tech to the terminal natively. We don't even need a separate shell. Security and bi-directional communication are built in by default because of the UNIX socket. But we still need to think about how to handle stuff like cookies, local storage, external CSS / JS, ...

  • toenail 5 hours ago
    Interesting, kind of like a more fancy web shell. Haven't really ever seen the need for those, mostly because terminals work better than browsers.
    • dboreham 5 hours ago
      Sometimes the browser is the only "computing platform" you have available (e.g. on some mobile devices, hotel kiosks).
  • v3ss0n 23 minutes ago
    UI/UX is very bad. Why would we need it over Warp / Wave Terminal?
  • saltamimi 5 hours ago
    One of the more interesting pieces of Microsoft software is the Windows Admin Center where it's a web app to configure a Windows Server. Ideally, it was made for core installs where there's no GUI but it's there as a viable web management panel.

    The tool from OP and WAC are pretty similar in terms of functionality and use case. Why would you want this? Well, imagine your team needs to perform server functions, but the work falls to less technical team members, which is very often the case in big places. Most people are familiar with the web browser, and having a website for these sorts of actions makes it easier to get things done in one place without having a lot of tools like Remote Desktop, SSH, WinRM, etc. configured.

    • jon-wood 4 minutes ago
      At the risk of being considered a snob I don’t want someone who can’t deal with SSH or RDP configuring servers within my company. If you can’t work out how to SSH into the server you sure as hell aren’t going to work out how to safely expose network services on it.
  • dwb 5 hours ago
    Just had a quick look but I like the look so far. I’ve been thinking along similar lines for ages but never quite got around to making something. I very much support any effort to make remoting less dependent on the archaic character grid.
  • tom1337890 5 hours ago
    Lovely video and ingenious implementation. Kudos!

    As someone managing various servers, both at home and at work, I see how this can be really useful. I see it not in the production space yet but rather in the experimenting, using a Linux machine as a second compute device!

    So regarding your last point, I'm convinced. I think it is useful! The one fact that is bugging me is that now it requires a client specific app, with GUI, on my PC and I wonder if using ssh port forwarding could reduce the surface. I mean I wonder if either having a rich client that executes commands via ssh or a rich server (including Web Server) with ssh port wouldn't suffice, so that I can avoid installing stuff on the server AND on my computer.

  • tjohnell 4 hours ago
    I’m good with just tailscale and self-hosted web-apps. Seems the main selling point is either native UX or reduced barriers to entry security-wise. I like barriers to entry.
  • Tepix 3 hours ago
    It's a cool video and I like the idea in general. The author mentions that the code runs in a sandbox. I'm surprised that WASM hasn't come up. You want the code to be platform agnostic anyway (it should run whether you start Outer Shell on Linux, macOS or whatever on different CPU architectures).
  • xuhu 4 hours ago
    Being able to initiate a shell app from a regular remote ssh CLI prompt (like "ApacheConfig myhost.com" or "Editor ~/myrepo") might improve integration with people's existing CLI workflows.

    It does need an agent that starts with every X or Wayland session and waits for requests from remote SSH sessions to start an app.

  • bobajeff 4 hours ago
    I don't really know what Outer Frame is. I tried to understand from the video and the blog but I'm still not sure what it is. Is it like a web browser but instead of DOM, HTML and JS you have Swift and SwiftUI running in a sandbox?

    If so how would that work on non Apple devices? Also how much will that sandbox protect you?

  • torm 5 hours ago
    I can’t make up my mind if I love it or hate it. On one hand this is like SSHapi on the other there’s no structure, no contract… i had similar doubts with Cockpit.
  • akshayKMR 5 hours ago
    This is cool. Though I don't see why someone would want to do extra design work on custom GUI rendering for a custom renderer (your viewer app)?
  • nativeit 5 hours ago
    I thought this looks interesting, but was a little confused with what appears to be MacOS-only support at https://outerloop.sh/? I'm running Ubuntu 24.04, I kind of assumed from context that it'd be something I could spin up in a few minutes just to give it a go?
    • nativeit 5 hours ago
      Also worth noting, my decision to give it a go relied mostly on the fact that I couldn't quite work out what the product is. Having "Outer Shell" and "Outer Loop" described as distinct-but-connected entities is a little confusing, IMO, which do I need to install, on what, and in what order?

      Cool idea anyway, no shade here.

      • al_borland 1 hour ago
        I have also been having trouble grasping the difference between Outer Loop and Outer Shell. I thought maybe one was the desktop browser app for macOS and the other was something running locally on the Pi to create the socket. However, after bouncing between the links for the two, I don't think that assumption was correct.
  • abtinf 4 hours ago
    I wrote an early version of the Cylance AV desktop client. The UI side was a web app that talked to its windows service backend using HTTP over windows pipes. This was surprisingly easy to do using WCF.
  • myaccountonhn 5 hours ago
    I am not sure I'd use this over exposing websites with wireguard as those will automatically work across platforms. But it looks like you could create some really cool experiences with it, and I'm happy people are exploring this space.
  • setheron 5 hours ago
    I'm confused -- does this compile it live when the server ships code? How do we resolve dependencies, toolset etc.. Is the idea to just pick an old enough platform toolchain you expect to be present?
    • mrcslws 4 hours ago
      In all cases, the code is pre-compiled. A user never waits for anything to compile. When Outer Loop installs Outer Shell, it downloads pre-compiled binaries to the server. For Linux these are compiled against a manylinux ABI. Ditto for when Outer Shell installs one of the bundled apps. When a backend serves a native "web" app over HTTP it sends already-compiled ARM (or x86) code to the client.

      Dependencies are less of a concern for the frontend binaries. For backends, I use a dependency-light approach, static-linking anything that's needed. Of course, people are welcome to do backends however they want, and just tell Outer Shell about the systemd/launchd units via the API. I used this no-dependency approach to keep everything lightweight and to keep install steps trivial, but admittedly it pushes me in certain directions (for example, using custom binary formats rather than sqlite).

  • fnordpiglet 4 hours ago
    I prefer hytelnet and MUDs but I don’t count, I’m just too old.
  • syngrog66 23 minutes ago
    not sure what problem this solves. smells like setup for a security exploit. tho my most generous take is it's CV-ware
  • Panzerschrek 4 hours ago
    > every app is a small HTTP server

    This adds unnecessary overhead for communication. Using web and web-like approaches on a desktop system is a terrible idea.

  • wolvoleo 3 hours ago
    So a bit like X-forwarding used to do? Cool.
  • IshKebab 1 hour ago
    I'm actually way more interested in option 2 - the VNC-like experience.

    TUI apps are convenient over SSH because they're right there in your terminal. But they suck because they're restricted to shitty monospaced character grids. Why can't we have something more like VNC over SSH? Like, `top` and `micro` but with good graphics?

    I did try doing something like that with the Kitty graphics protocol and you can get kind of close..ish, but it's really restricted by having to send everything as PNGs.

    Anyway upvote for not being blinkered and thinking terminals are just for CLI stuff and must be forever.
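
    For context on the "send everything as PNGs" restriction: in the Kitty graphics protocol, an image travels as base64 inside APC escape sequences, split into chunks of at most 4096 bytes. A minimal sketch of that framing, assuming direct transmission with `a=T` (transmit and display) and `f=100` (PNG); `kitty_png_escape_chunks` is a hypothetical helper name.

```python
import base64


def kitty_png_escape_chunks(png_bytes, chunk_size=4096):
    """Yield escape sequences that transmit and display a PNG.

    f=100 marks the payload as PNG, a=T means transmit and display,
    and m=1/m=0 flags whether more chunks follow.
    """
    payload = base64.standard_b64encode(png_bytes)
    first = True
    while payload:
        chunk, payload = payload[:chunk_size], payload[chunk_size:]
        more = 1 if payload else 0
        # Only the first chunk carries the full control data.
        control = f"a=T,f=100,m={more}" if first else f"m={more}"
        yield b"\x1b_G" + control.encode("ascii") + b";" + chunk + b"\x1b\\"
        first = False
```

    Writing the yielded bytes to `sys.stdout.buffer` in a terminal that supports the protocol should draw the image; every frame of a "graphical `top`" would have to be encoded and sent this way, which is where the overhead comes from.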

  • Asooka 2 hours ago
    In general I would like to see a web browser escape sequence for console applications. Just send a command to the terminal to connect a web browser to your stdin/out and present any UI you want over html. The terminal can then open a regular socket listening on localhost and act as a CGI server. For security the terminal should pick a random IP in the localhost range and a random URL. Technically that is security by obscurity, but guessing a cryptographically secure URL should be hard enough for attackers. The reasons to do it as an escape sequence and not just have the application open a socket and start the browser are: To enable remote GUI; To avoid the complexity of each application implementing networking; To enable better desktop integration, since the terminal itself is part of the Desktop Environment, so it can start a DE-specific browser, preferably in single-application mode. Also, it should be possible to automatically put the application in the background so you basically just run GUI applications like normal.
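
    The "random IP in the localhost range and a random URL" part could look something like the sketch below. It assumes the OS lets a process bind any address in 127.0.0.0/8 (Linux does by default; other systems may need extra configuration), and `pick_loopback_endpoint` is a hypothetical name rather than anything an existing terminal provides.

```python
import ipaddress
import secrets


def pick_loopback_endpoint():
    """Return a random 127.0.0.0/8 address and an unguessable URL path."""
    # Skip the network (127.0.0.0) and broadcast (127.255.255.255) addresses.
    offset = 1 + secrets.randbelow(2**24 - 2)
    base = int(ipaddress.IPv4Address("127.0.0.0"))
    address = ipaddress.IPv4Address(base + offset)
    # 32 random bytes from the OS CSPRNG, URL-safe base64 encoded.
    token = secrets.token_urlsafe(32)
    return str(address), f"/{token}/"
```

    Most of the protection comes from the token; the random address adds comparatively little, since the loopback range is small enough to scan.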
  • arnefm 5 hours ago
    Heresy!
  • whalesalad 1 hour ago
    this exists as https://xpipe.io/
  • PunchyHamster 5 hours ago
    > Isn’t it weird that this doesn’t already exist?

    It does. MobaXterm has a bunch of it already: a file manager on the side and the ability to pass X11.

  • CamperBob2 5 hours ago
    Edit: withdrawing this objection, had no idea that right-clicking allowed the speed to be adjusted.
    • mrcslws 5 hours ago
      Sure, I just added a YouTube mirror link to the post: https://youtu.be/e40PLLuZ5KI

      (The one on the website is the standard browser video player, not custom.)

      • CamperBob2 5 hours ago
        Thanks (and to pelzatessa as well), TIL about the right-click menu on these. That'll come in handy.
    • pelzatessa 5 hours ago
      but it's just a standard <video> element; in Firefox I can even right-click to change the speed to 2x. It's certainly better privacy-wise.
  • mad182 3 hours ago
    Cool, I hate it.
  • supertroop 5 hours ago
    Defeats the purpose of the shell. The shell is for CLI interaction.
    • hnlmorg 5 hours ago
      No. A shell is any user interface. Windows shell is explorer.exe and it used to be possible to change that via a config line in a system INI file.

      SSH protocol also isn’t just for CLI work. It supports file transport (eg SFTP), TCP/IP forwarding and even SOCKS HTTP proxying.

      You also used to be able to run GUI applications over SSH via X11.

      • supertroop 5 hours ago
        You have a very loose definition of a shell that conflicts with about 40 years of history.
        • nativeit 5 hours ago
          I don't have a dog in this fight, and anyway dogfighting is bad, but the intro to the Wikipedia article[0] reads:

          > An operating system shell is a computer program that provides relatively broad and direct access to the system on which it runs. The term shell refers to how it is a relatively thin layer around an operating system.

          > Most shells are command-line interface (CLI) programs. Some graphical user interfaces (GUI) also include shells.

          The last line I think supports the notion that the term "shell" at least implies a CLI, but I can understand both positions.

          ---

          0. https://en.wikipedia.org/wiki/Shell_(computing)

          Edit: I'm shite at formatting on HN

        • thaumaturgy 5 hours ago
          The earliest versions of MacOS, all the way up through 9, had a ROM call at 0xA9F4 which was labeled `_exitToShell`. In the days before pre-emptive multitasking, this instruction's job was to force the current application to close and return the user to the MacOS desktop (the Finder). The "shell" in this context being the desktop user interface.

          Just FYI.

        • projektfu 57 minutes ago
        • mrcslws 5 hours ago
          I wondered if this would be controversial. It all depends where you grew up.

          > Cairo, like Chicago, had a new shell (Microsoft’s favorite word for the user interface for launching programs and managing files) and a new file system

          https://hardcoresoftware.learningbyshipping.com/p/020-innova...

          When I worked at Microsoft 2010 - 2014, the word "shell" was still used in this way. I decided to say "graphical shell", to make it clearer.

        • steve1977 1 hour ago
          Why do you think GNOME Shell is called GNOME Shell?

          https://gitlab.gnome.org/GNOME/gnome-shell

          (just as one example)

        • hnlmorg 5 hours ago
          Not really no. I’ve been using shells and authoring new ones for around 40 years across a variety of platforms. The term has always been pretty loosely defined because as technology evolved the term “shell” was borrowed. So like I said, a shell can refer to a graphical core just as much as a text-based one. You can get web shells too.

          The original intent was that a shell is a thin wrapper on top of the OS to expose the host's capabilities. But that hasn’t been an apt description for most of those 40 years.

    • metalliqaz 5 hours ago
      command line shell vs graphical shell. My first experience with a graphical shell was dosshell[1]. For a while we called the Windows 3.1 interface "the shell". I guess the terminology has changed since that time.

      [1] https://en.wikipedia.org/wiki/DOS_Shell