Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Anthropic prompt caching for Claude 3.7 Sonnet #912

Draft
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

codegen-sh[bot]
Copy link
Contributor

@codegen-sh codegen-sh bot commented Mar 19, 2025

Description

This PR implements Anthropic's prompt caching feature for Claude 3.7 Sonnet in the CodeAgent class. Prompt caching allows reusing large portions of prompts across multiple API calls, reducing costs by up to 90% for cached content and improving latency by up to 85% for long prompts.

Changes

  1. Added enable_prompt_caching parameter to the LLM class with a default value of False
  2. Added support for the anthropic-beta: prompt-caching-2024-07-31 header when prompt caching is enabled
  3. Added validation to ensure prompt caching is only enabled for supported models (Claude 3.5 Sonnet and Claude 3.0 Haiku)
  4. Added enable_prompt_caching parameter to the CodeAgent class with a default value of True for Claude models

Benefits

  • Reduced costs: Cached prompts can reduce input token costs by up to 90%
  • Improved latency: Response times can be cut by up to 85% for long prompts
  • Enhanced performance: Allows for inclusion of more context and examples without performance penalties

Notes

  • Prompt caching is currently in beta and only supported on Claude 3.5 Sonnet and Claude 3.0 Haiku
  • The cache has a 5-minute lifetime, refreshed each time the cached content is used
  • This implementation enables the feature but doesn't yet include the cache_control parameter for marking specific content as cacheable - that would require changes to the prompt structure and can be implemented in a future PR if needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0 participants