Claude.ai adds throttling to max users but with no transparecy.
It's frustrating that Claude doesn't expose usage metrics to max plan users who still face rate limits.
The lack of transparency around actual usage vs. limits makes it impossible to work fully efficiently, having to just very roughly guestimate with no real metrics to go by.
The reality is:
- The API response headers with rate limit info exist but aren't accessible through Claude Code
- The /cost command is essentially useless for max plan users
- There's no API endpoint to query your usage
- You can't predict when you'll hit limits until you actually get throttled
This is a significant oversight in the product design. Users need visibility into their usage to plan their work effectively, especially when hitting limits means dropping to a lower-quality model and losing context.
I'll put in a feature request for this, because this is a big deal now that they are throttling "max" users. :(
https://github.com/anthropics/claude-code/issues
Here is my feature request submission:
"
I really appreciate what the folks at claude.ai are doing in their best efforts with claude.
however, could you please include transparency of usage? Especially now that you are throttling max users, it's frustrating that Claude doesn't expose usage metrics to max plan users who still face rate limits. The lack of transparency around actual usage vs. limits makes it impossible to work efficiently, having to rely on arbitrary guestimates without useful metrics to guide one's work planning.
- The API response headers with rate limit info supposedly exist but aren't accessible through Claude Code?
- The /cost command is essentially useless for max plan users - it only states "/cost - With your Claude Max subscription, no need to monitor cost — your subscription includes Claude Code usage", but I keep getting throttled a few times a day sometimes, even though I'm trying to be as conservative as possible my usage.
- There's no API endpoint to query usage?
- This makes it nearly impossible to predict when when a max-user will hit limits until actually get throttled - and then that wreaks havoc with work context.
This is a significant oversight in the product design. While it wasn't an issue when "max" was actually "max" and not throttled. Now that you have introduced your throttling policy, users need real-time (or as close to real-time as possible) visibility into their usage to plan their work effectively, especially when hitting limits means losing current chat context and/or dropping to a lower-quality model."
https://github.com/anthropics/claude-code/issues/4883