Very timely, this article from Bloomberg also addresses the question of rising AI costs. Although the cost of compute and AI inference is going down as software and hardware improve, do you think overall costs per user will go down or continue to move up? If I had to personally pay $100/mo for Claude, I'm not sure I would do it. I think most people feel the same. Companies are really starting to incorporate AI into their processes, but for them to continue to increase spending, they would need to be able to quantify the impact on their bottom line.
I agree with you, there's definitely a cap for how much frontier models can change, but I'm not sure what that is and what that means for their margins.
https://www.bloomberg.com/opinion/articles/2026-05-26/ai-boom-bankers-love-of-claude-carries-a-heavy-price
Nice post! As someone who tried to run local models on a 16GB M4 Mac Mini (and got horrible performance), running Qwen 3.6 27B on 2022 video hardware is impressive. Wondering what the surrounding machine is and how much the hardware costs (i.e., did you custom build a Windows machine?).
I did upgrade to a new AMD chip, which did require DDR5 RAM, so some of the stuff is newer. Definitely not custom built just for this, though, since in theory I can do other things on it.
I dual booted into Ubuntu though and mostly run that. It’s just easier to manage running servers, including AI serving ones, on Linux.
For Mac, the great part with high memory (which 16 unfortunately isn’t) is that you can run frontier models at all even if a bit slow. An M3 512GB can run DeepSeek v4—and literally anything else.
But if software gets better, doesn't the demand for hardware go up? Jevons paradox, or am I misinterpreting it?
No, you’re not misinterpreting, and it’s what’s helping old M3 Mac Studios with high RAM configurations sell for $20k+ on the street and keeping GPU prices elevated!