{"id":334,"date":"2026-05-19T15:10:22","date_gmt":"2026-05-19T15:10:22","guid":{"rendered":"https:\/\/clvrclvr.com\/content\/?p=334"},"modified":"2026-05-19T15:10:22","modified_gmt":"2026-05-19T15:10:22","slug":"ai-agents-in-8gb-vram","status":"publish","type":"post","link":"https:\/\/clvrclvr.com\/content\/?p=334","title":{"rendered":"AI Agents in 8GB VRAM?"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">After spending a few days testing every local LLM and coding agent I could get my hands on with an RTX 4060 (8GB VRAM), here&#8217;s what I found. The new generation of small models (granite4.1:8b, qwen3.5:9b, and gemma4:e2b) is shockingly good, scoring perfect marks on both bug-finding and tool-calling benchmarks where last-gen models like phi4-mini and gemma3:4b fell flat. But the model is only half the story. The agent framework matters just as much, and most of them can&#8217;t actually use local models properly.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Out of 12 frameworks tested, only Aider, Cline, Goose, and Fabric could reliably find and fix bugs with local models. OpenHands, OpenCode, and Open Interpreter failed with every model I threw at them because their tool-calling interfaces are too complex for 7B-class models to drive. Aider was the most model-agnostic (it just works with everything), Goose went from completely broken to perfect once I swapped in qwen3.5:9b, and granite4.1:8b turned out to be the speed king for agent workflows at 0.2-0.8 seconds per tool call. 
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The bottom line: you can absolutely run a useful AI coding agent locally on a mid-range GPU today, but you have to pick the right model-framework pairing, and most of the flashy agent projects out there aren&#8217;t ready for local models yet.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>After spending a few days testing every local LLM and coding agent&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-334","post","type-post","status-publish","format-standard","hentry","category-post"],"_links":{"self":[{"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=\/wp\/v2\/posts\/334","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=334"}],"version-history":[{"count":1,"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=\/wp\/v2\/posts\/334\/revisions"}],"predecessor-version":[{"id":335,"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=\/wp\/v2\/posts\/334\/revisions\/335"}],"wp:attachment":[{"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=334"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=334"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/clvrclvr.com\/content\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=334"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org
\/{rel}","templated":true}]}}