Social Wire
Tweeted
Oct 15, 2025
EXO Labs
@exolabs Full blog post and more details about EXO 1.0: https://t.co/2ZdPUJe4iR Thanks @NVIDIA for early access to two DGX Sparks. #SparkSomethingBig
Tweeted
Oct 15, 2025
EXO Labs
@exolabs But the KV cache is created for each transformer layer. By sending each layer’s KV cache after it’s computed, we overlap communication with computation. We stream the KV cache and hide the network delay. We achieve a 4x speedup in prefill & 3x in decode, with 0 network delay.
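The layer-by-layer overlap described in the tweet can be sketched with a background sender thread: as soon as one layer's KV block is computed, it is queued for transmission while the next layer runs. This is a toy stand-in for EXO's actual pipeline; the layer count and the `compute_layer`/`send_kv` callables are illustrative assumptions.

```python
import queue
import threading

def run_prefill_with_streaming(num_layers, compute_layer, send_kv):
    """Overlap per-layer KV computation with network sends.

    compute_layer(i) -> KV block for layer i
    send_kv(i, kv)   -> transmits one layer's KV cache
    """
    q = queue.Queue()

    def sender():
        while True:
            item = q.get()
            if item is None:          # sentinel: all layers queued
                break
            layer_idx, kv = item
            send_kv(layer_idx, kv)    # network send runs concurrently

    t = threading.Thread(target=sender)
    t.start()
    for i in range(num_layers):
        kv = compute_layer(i)         # compute layer i's KV cache
        q.put((i, kv))                # hand off immediately; don't wait
    q.put(None)
    t.join()

sent = []
run_prefill_with_streaming(4, lambda i: f"kv{i}", lambda i, kv: sent.append((i, kv)))
print(sent)  # all four layers streamed, in order
```

Because the sender drains a FIFO queue, layers arrive at the decode device in order while compute for later layers proceeds in parallel.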
Tweeted
Oct 15, 2025
EXO Labs
@exolabs We can run these two stages on different devices: Prefill: DGX Spark (high compute device, 4x compute) Decode: M3 Ultra (high memory-bandwidth device, 3x memory-bandwidth) However, now we need to transfer the KV cache over the network (10GbE). This introduces a delay.
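One way to see why the naive hand-off hurts: the KV cache for a long prompt can run to several gigabytes, and 10GbE moves roughly 1.25 GB/s. A rough estimate (the KV-cache size here is an illustrative assumption, not a measurement from the post):

```python
# Rough estimate of the KV-cache hand-off delay over 10GbE.
LINK_GBPS = 10                             # 10GbE link
LINK_BYTES_PER_SEC = LINK_GBPS * 1e9 / 8   # ~1.25 GB/s usable at best

kv_cache_bytes = 5e9                       # e.g. ~5 GB KV cache for a long prompt
delay_sec = kv_cache_bytes / LINK_BYTES_PER_SEC
print(round(delay_sec, 1))                 # ~4 s if sent as one blocking transfer
```

A multi-second stall between prefill finishing and decode starting is exactly the delay the per-layer streaming in the next tweet hides.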
Tweeted
Oct 15, 2025
EXO Labs
@exolabs LLM inference consists of a prefill and decode stage. Prefill processes the prompt, building a KV cache. It’s compute-bound so gets faster with more FLOPS. Decode reads the KV cache and generates tokens one by one. It’s memory-bound so gets faster with more memory bandwidth.
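A back-of-envelope sketch of why decode is memory-bound: each generated token requires reading roughly all model weights once, so decode throughput is bounded by memory bandwidth over model size. The model size below is an illustrative assumption, not a figure from the thread.

```python
# Decode tokens/sec is roughly bounded by
# memory_bandwidth / bytes_read_per_token (dominated by the weights).
def decode_tokens_per_sec(bandwidth_gb_per_s, model_size_gb):
    """Upper bound on decode throughput for a memory-bound workload."""
    return bandwidth_gb_per_s / model_size_gb

# Illustrative: a ~70 GB set of weights (e.g. a 70B model at 8-bit).
m3_ultra = decode_tokens_per_sec(819, 70)   # ~11.7 tok/s
dgx_spark = decode_tokens_per_sec(273, 70)  # ~3.9 tok/s
print(round(m3_ultra, 1), round(dgx_spark, 1))
```

The same bandwidth gap that makes the M3 Ultra the better decode device makes the FLOPS-rich DGX Spark the better prefill device.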
Tweeted
Oct 15, 2025
EXO Labs
@exolabs Clustering NVIDIA DGX Spark + M3 Ultra Mac Studio for 4x faster LLM inference. DGX Spark: 128GB @ 273GB/s, 100 TFLOPS (fp16), $3,999 M3 Ultra: 256GB @ 819GB/s, 26 TFLOPS (fp16), $5,599 The DGX Spark has 3x less memory bandwidth than the M3 Ultra but 4x more FLOPS. By running
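A quick check of the ratios quoted above, computed directly from the listed specs:

```python
# Ratios from the quoted spec sheet: the M3 Ultra has ~3x the memory
# bandwidth of the DGX Spark, which in turn has ~4x the fp16 FLOPS.
bandwidth_ratio = 819 / 273   # M3 Ultra vs DGX Spark, GB/s
flops_ratio = 100 / 26        # DGX Spark vs M3 Ultra, TFLOPS fp16
print(round(bandwidth_ratio, 1), round(flops_ratio, 1))
```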
Tweeted
Sep 20, 2025
EXO Labs
@exolabs EXO covers all UK visa costs and relocation costs. We have a 100% success rate, and it’s usually a fast process (~1 month) x.com/alexocheema/st…
Tweeted
Sep 2, 2025
EXO Labs
@exolabs A deep dive on KPOP at @Cohere_Labs ML efficiency group. KPOP is an optimizer designed specifically for the hardware constraints of Apple Silicon. We're doubling the number of Apple Silicon macs that can train together coherently every 2 months. In 12 months we'll have rebuilt x.com/alexocheema/st…
Tweeted
Aug 30, 2025
EXO Labs
@exolabs EXO Gym: simulate large-scale distributed training experiments on a single MacBook x.com/MattBeton/stat…
Tweeted
Aug 22, 2025
EXO Labs
@exolabs run massive models, add macs incrementally for linear scaling (no limit) x.com/MattBeton/stat…