
GitHub
Needle is a 26M-parameter open-source model from Cactus Compute, distilled from Gemini for single-shot function (tool) calling. It reaches 6,000 tokens/s prefill on edge devices and targets AI assistants on phones, watches, and glasses. It has 775 points on Hacker News and 2,400+ GitHub stars.
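
To make "single-shot function calling" concrete, here is a minimal sketch of what an application might do with the output of such a model: the model emits one structured call, and the app parses and dispatches it. This is not Needle's actual API. The tool names (`set_timer`, `get_weather`), the JSON shape, and the helper functions are all hypothetical, and the model output is a hard-coded stand-in. The `prefill_seconds` helper only illustrates the arithmetic behind the quoted 6,000 tokens/s prefill figure.

```python
import json

# Hypothetical tools an on-device assistant might expose (illustrative only).
def set_timer(minutes: int) -> str:
    return f"timer set for {minutes} min"

def get_weather(city: str) -> str:
    return f"weather requested for {city}"

TOOLS = {"set_timer": set_timer, "get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse a single JSON tool call and run the matching function."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

def prefill_seconds(prompt_tokens: int, tokens_per_second: float = 6000.0) -> float:
    """Time to process the prompt at a given prefill rate."""
    return prompt_tokens / tokens_per_second

# Stand-in for what a tool-calling model could emit for "set a 10 minute timer".
# The JSON format here is an assumption, not the model's documented output.
example_output = '{"name": "set_timer", "arguments": {"minutes": 10}}'
result = dispatch(example_output)
```

Under these assumptions, a 600-token prompt (tool definitions plus the user request) would take about 0.1 s to prefill at the quoted rate, which is the kind of latency that makes one-shot tool selection plausible on a watch or pair of glasses.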