Ask HN: Anyone Using a Mac Studio for Local AI/LLM?
Curious to hear about your experience running local LLMs on a well-specced Mac Studio (M3 Ultra or M4 Max). I don't see much discussion of the Mac Studio for local LLMs, but it seems like you could fit big models in memory thanks to the shared VRAM. I assume token generation would be slow, but you might get higher-quality results because you can load larger models.
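The "can I fit it?" question above comes down to simple arithmetic: weights take roughly params × bits-per-weight / 8 bytes, plus some overhead for the KV cache and runtime. A minimal sketch, where the 70B model size, 4-bit quantization, 20% overhead, and 75% usable-memory figures are all illustrative assumptions rather than measured numbers:

```python
# Back-of-envelope check: does a quantized model fit in unified memory?
# All constants below are rough assumptions for illustration, not benchmarks.

def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB: params * bits / 8."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# Hypothetical 70B model at 4-bit quantization:
weights = model_size_gb(70, 4.0)   # 35.0 GB of weights
overhead = weights * 0.2           # crude allowance for KV cache, runtime, etc.
total = weights + overhead         # 42.0 GB

for ram in (36, 64, 192, 512):     # common unified-memory configurations
    usable = ram * 0.75            # assume macOS keeps a chunk for the system
    verdict = "fits" if total <= usable else "does not fit"
    print(f"{ram} GB machine: {verdict}")
```

By this estimate a 36 GB machine can't hold a 4-bit 70B model, a 64 GB machine can, and the 512 GB Mac Studio has room for models several times larger.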
Not a Mac Studio, but I use a basic MacBook Pro with 24 GB of RAM (about 16 GB usable as VRAM) and I can run a number of models on it at decent speed. My main bottleneck is context window size, but for single-purpose questions it's fine.
I do! I have an M3 Ultra with 512 GB. A couple of opencode sessions running at once work well. Currently running GLM 4.7, but was on Kimi K2.5 before; both are great. Excited for more efficiency gains to make their way to LLMs in general.
There are some people on r/LocalLlama using it [0]. The consensus seems to be that while it offers far more unified memory for holding models (up to half a terabyte), token generation can be slow enough that an Nvidia or AMD machine might be the better buy.
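The slow-generation point has a simple explanation: decode is memory-bandwidth bound, since each generated token has to stream the active weights from memory. A rough sketch of the ceiling, assuming an ~800 GB/s bandwidth figure for the M3 Ultra and a hypothetical dense model whose active weights occupy 120 GB (MoE models touch only a fraction of their weights per token, so they do better than this bound suggests):

```python
# Bandwidth-bound decode ceiling: tok/s <= memory bandwidth / bytes read per token.
# The 800 GB/s and 120 GB figures are assumptions for illustration.

def max_tokens_per_sec(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Upper bound on decode speed if every token streams all active weights."""
    return bandwidth_gb_s / active_weights_gb

ceiling = max_tokens_per_sec(800, 120)
print(f"ceiling: ~{ceiling:.1f} tok/s")
```

So even with the model fully resident, a large dense model on this hardware tops out in the single-digit tokens per second, which matches the "fits, but slow" experience people report.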
[0] https://old.reddit.com/r/LocalLLaMA/search?q=mac+studio&rest...