A guide to running the trillion-parameter Kimi K2.5 locally on a cluster of Framework Desktop systems. No mention of RDMA over Converged Ethernet (RoCE), but I’ll bet the cluster would get much higher tokens/sec if configured that way. Pretty cool to get it usable locally in any case! https://www.amd.com/en/developer/resources/technical-articles/2026/how-to-run-a-one-trillion-parameter-llm-locally-an-amd.html