fosstodon.org is one of the many independent Mastodon servers you can use to participate in the fediverse.
Fosstodon is an invite only Mastodon instance that is open to those who are interested in technology; particularly free & open source software. If you wish to join, contact us for an invite.

Administered by:

Server stats:

10K
active users

New feature from Anthropic today: you can ask their Claude API to cache parts of your prompt, resulting in a large price discount and performance boost provided your app reuses the same prompt at least once every five minutes.

Blogged a few notes here: simonwillison.net/2024/Aug/14/

simonwillison.netPrompt caching with ClaudeThe Claude API now supports prompt caching, allowing you to mark reused portions of long prompts (like a large document provided as context). Claude will cache these for up to …
Alex Bradbury

@simon This is an exciting evolution! DeepSeek started offering this as well in the last couple of weeks, though there's no cost for storage and you just get a lower charge based on any hits in the cache platform.deepseek.com/api-docs This has advantages, but of course leaves your cache hit rate dependent on how long DeepSeek choose to keep the cache around.

For individual personal usage I'd probably prefer the DeepSeek "do your best and don't make me think about it" pricing model.

platform.deepseek.comDeepSeek API introduces Context Caching on Disk, cutting prices by an order of magnitude | DeepSeek API DocsIn large language model API usage, a significant portion of user inputs tends to be repetitive. For instance, user prompts often include repeated references, and in multi-turn conversations, previous content is frequently re-entered.

@asb wow their offering is so much simpler!