Dev Domain For Interactive Tooling

Crush CLI Agent (model: claude-opus-4-5-20251101) · 3m 7s · Week 1: Blog Platform

TL;DR - Session insights

The .dev TLD signals developer-facing tooling - best suited for interactive artifacts, APIs, or playgrounds rather than static content
Separating blog (.blog) from tooling (.dev) allows different deployment strategies and caching policies per domain
Building an “Agent Capability Observatory” with live metrics and failure pattern catalogs creates a unique research artifact
A public Human-Agent Protocol API could validate Week 47’s handoff protocol while providing real value to the community

Session Log

USER: I own two domains:
- 52vibes.blog - This is where we are publishing the current blog content.
- 52vibes.dev
What should I do with 52vibes.dev? Look at the site page contents to understand what I am working on and come up with some really creative and compelling use cases for this second domain.
TOOL-CALL: view {"file_path":"./PROJECT.md"}
TOOL-CALL: view {"file_path":"./ROADMAP.md"}
TOOL-CALL: view {"file_path":"./wk1-blog/content/about/_index.md"}

TOOL:

<file>
# 52vibes: Year-Long AI Agent Collaboration Study

**Start Date:** January 1, 2026 (Thursday)
**End Date:** December 30, 2026 (Wednesday)
**Domain:** https://52vibes.dev
**Repository:** https://github.com/vinayprograms/52vibes-blog (blog)

---

## Purpose

Push the limits of AI agents to identify their limits, shortcomings, and missing aspects of an agentic ecosystem. Define boundaries of human-agent interactions and produce comprehensive recommendations for human-AI collaboration.

---

## Audience

- Developers, Software Engineers
- AI Researchers
- Product Managers

---

## Tools

- **LLM:** Claude Opus 4.5
- **Agent Interface:** github.com/charmbracelet/crush
- **Hosting:** Cloudflare Pages

[...81 more lines...]

TOOL:

<file>
# 52vibes: Draft Project List (v3)

> **Status:** Draft - subject to revision based on prior week results  
> **Last Updated:** December 30, 2025  
> **Philosophy:** Go big or go home. No buffers. Failure is valid data.

---

## Selection Criteria

| Criterion | Weight |
|-----------|--------|
| Tests new agent capability | 40% |
| Compounds on prior work | 30% |
| Produces useful artifact | 30% |

**Non-negotiable:** Fits 1-week time budget (scope adjusted, not ambition)

---

## Q1: Agentic Infrastructure (Weeks 1-13)

| Wk | Dates | Project | Agent Capability Tested | Artifact |
|----|-------|---------|------------------------|----------|
| 1 | Jan 1-7 | **Static Blog Platform** | Full-stack setup, CI/CD, deployment | Hugo site + custom theme + pipeline |
| 2 | Jan 8-14 | **Security Adversary** | Adversarial thinking, vulnerability discovery, codebase extension | Extended adm/adsm/addb + active red-team capability that attacks code |
| 3 | Jan 15-21 | **Autonomous Test Synthesizer** | Self-verification, test reasoning | System that analyzes code and generates comprehensive tests without human guidance |
| 4 | Jan 22-28 | **Ambiguity Resolver** | Clarification strategies, decision-making | Library that takes vague requirements and produces concrete specifications |
| 5 | Jan 29-Feb 4 | **Context Infinite** | Long-context handling, session continuity | System for unlimited context through intelligent chunking, summarization, and retrieval |
[...145 more lines...]

TOOL:

<file>
---
title: "About"
description: "52vibes - a year-long AI agent collaboration experiment"
---

52vibes is a year-long experiment in human-AI collaboration. Each week, we ship a working project while documenting the process of building with AI agents.

## The Experiment

- **52 weeks** of consistent, shipped work
- **Real projects** solving real problems
- **Transparent process** through session logs
- **Evolving patterns** in human-AI collaboration

## Themes

| Quarter | Theme | Focus |
|---------|-------|-------|
| Q1 | Agentic Infrastructure | Tools, workflows, foundations |
| Q2 | Production Tools | Practical utilities, integrations |
| Q3 | Complex Workflows | Multi-agent, orchestration |
| Q4 | Synthesis | Edge cases, refinement |

## Philosophy

Every project follows these principles:

1. **Ship something** — Working code over perfect code
2. **Document everything** — Session logs capture the journey
[...11 more lines...]

TOOL-CALL: ls {"path":"./wk1-blog/content"}
TOOL-CALL: view {"file_path":"./wk1-blog/content/weeks/_index.md"}

TOOL:


- ./wk1-blog/content/
  - about/
    - _index.md
  - weeks/
    - _index.md
  - blog/
    - _index.md
    - week-01-blog-platform.md
    - reports/
  - community/
    - _index.md

TOOL:

<file>
---
title: "Weekly Projects"
description: "List of all weekly projects"
---

# Weekly Index

All 52 weeks of the experiment.

| Week | Quarter | Theme | Project |
|------|---------|-------|---------|
{{ range $i := seq 52 }}
{{ $weekNum := printf "%02d" $i }}
{{ $quarter := add (div (sub $i 1) 13) 1 }}
{{ $theme := index (slice "Agentic Infrastructure" "Production Tools" "Complex Workflows" "Synthesis") (sub $quarter 1) }}
| {{ $i }} | Q{{ $quarter }} | {{ $theme }} | {{ if eq $i 1 }}[Blog Platform](/blog/week-01-blog-platform/){{ else }}—{{ end }} |
{{ end }}
</file>
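
The week-to-quarter arithmetic in the template above can be checked outside Hugo. This is a minimal Python sketch, assuming the same 13-weeks-per-quarter split and the four theme names shown in the file; the function names are illustrative and not part of the project.

```python
# Theme names copied from the template's slice; index 0 corresponds to Q1.
THEMES = [
    "Agentic Infrastructure",
    "Production Tools",
    "Complex Workflows",
    "Synthesis",
]


def quarter_for_week(week: int) -> int:
    """Mirror the template's `add (div (sub $i 1) 13) 1` with integer division."""
    if not 1 <= week <= 52:
        raise ValueError(f"week must be in 1..52, got {week}")
    return (week - 1) // 13 + 1


def theme_for_week(week: int) -> str:
    """Look up the quarter's theme, as the template does with `index`."""
    return THEMES[quarter_for_week(week) - 1]


if __name__ == "__main__":
    for week in (1, 13, 14, 52):
        print(week, quarter_for_week(week), theme_for_week(week))
```

Weeks 13 and 14 are the useful boundary cases: the first should still fall in Q1 and the second in Q2, matching the "Weeks 1-13" heading in ROADMAP.md.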

AI: Now I have a full picture. Here are creative use cases for 52vibes.dev:
Creative Use Cases for 52vibes.dev
1. Interactive Artifact Playground
A live sandbox where visitors can try the shipped artifacts directly in-browser:
- Week 4’s Ambiguity Resolver: paste vague requirements, get specs
- Week 16’s Documentation Engine: drop code, get docs
- Week 18’s API Designer: describe a business need, get API design
Why it works: .dev signals “developer-facing” — people expect to do things, not just read.
2. Agent Capability Observatory
Real-time dashboards showing:
- Live metrics from deployed services (Wk8, Wk11, Wk23)
- Benchmark results from Week 48’s capability suite
- Failure pattern explorer (Week 49) — searchable catalog of how agents fail
- Performance graphs across the 52 weeks
Tagline: “Watch AI agents work — live.”
3. Human-Agent Protocol API
A public API implementing Week 47’s protocol:
- /handoff — structured task handoff format
- /capability-check — query what agents can/can’t do
- /escalate — standardized escalation patterns
- SDK clients for Python, Go, TypeScript
Value: Other developers building with AI agents adopt your protocol as a standard.
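
To make the /handoff idea concrete, here is a minimal sketch of what a structured handoff payload might look like. Everything in it is an assumption for illustration: the session does not define Week 47's protocol, so the `Handoff` class, its field names, and the validation rules are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field

# Hypothetical set of parties; the real protocol may define others.
PARTIES = {"human", "agent"}


@dataclass
class Handoff:
    """Illustrative payload for a POST to a /handoff endpoint (assumed shape)."""

    task_id: str
    summary: str
    from_party: str
    to_party: str
    open_questions: list[str] = field(default_factory=list)

    def validate(self) -> None:
        # Reject unknown parties and empty summaries before serializing.
        if self.from_party not in PARTIES or self.to_party not in PARTIES:
            raise ValueError("from_party and to_party must be 'human' or 'agent'")
        if not self.summary.strip():
            raise ValueError("summary must not be empty")


def to_json(handoff: Handoff) -> str:
    """Validate the handoff and return a JSON string with sorted keys."""
    handoff.validate()
    return json.dumps(asdict(handoff), sort_keys=True)


if __name__ == "__main__":
    example = Handoff(
        task_id="wk47-001",
        summary="Review the failing integration test before merge",
        from_party="agent",
        to_party="human",
        open_questions=["Is the flaky test already known?"],
    )
    print(to_json(example))
```

A fixed, validated shape like this is what would let the SDK clients mentioned above share one schema instead of each inventing its own.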
4. “Fork the Week” Development Hub
Each week’s artifact published as:
- Installable packages (go get 52vibes.dev/wk6-debugger)
- GitHub template repos
- One-click Codespaces/Gitpod environments
- Automated release feeds (RSS, Atom)
The .blog tells the story; the .dev ships the code.
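
For an import path such as 52vibes.dev/wk6-debugger to resolve, the Go toolchain requests the URL with `?go-get=1` and looks for a `go-import` meta tag that points at the real repository. A static host can serve that tag from a plain HTML file. The sketch below generates such a page; the repository URL is a placeholder, since the session does not say where the weekly artifacts will live.

```python
from html import escape


def go_import_page(import_prefix: str, repo_url: str, vcs: str = "git") -> str:
    """Build a minimal HTML page carrying a go-import meta tag.

    The tag's content is "<import-prefix> <vcs> <repo-root>", which is the
    format the go command expects for custom import paths.
    """
    content = escape(f"{import_prefix} {vcs} {repo_url}", quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f'<meta name="go-import" content="{content}">\n'
        "</head><body></body></html>\n"
    )


if __name__ == "__main__":
    # Placeholder repository URL, used only for illustration.
    print(go_import_page(
        "52vibes.dev/wk6-debugger",
        "https://example.com/hypothetical/wk6-debugger",
    ))
```

Saving the output as the page served at /wk6-debugger on the .dev site would keep the packages installable without any server-side code.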
5. Agent Arena
Interactive challenges where visitors pit their own agent setups against yours:
- Submit code for Week 29’s Adversarial Design Review agents to attack
- Run Week 38’s Chaos Engineer against visitor-provided systems
- Leaderboards for agent performance
My Recommendation: Option 4 — Development Hub
| Domain | Purpose |
|--------|---------|
| 52vibes.blog | Narrative, weekly posts, session logs, community |
| 52vibes.dev | Packages, APIs, SDKs, runnable artifacts |
This creates a clean separation: read on .blog, build on .dev. The 52 artifacts you’re shipping deserve a proper developer portal with versioned releases, documentation, and install commands.
USER: Many of your suggestions will require server-side processing (using web servers). Are there free LLM APIs that I can point some of the use cases to? The use cases can be client-side JavaScript running inside a browser.
TOOL-CALL: agentic_fetch {"prompt":"Find free LLM APIs that can be called from client-side JavaScript in browsers. Looking for: 1) Completely free tiers (not just trials), 2) CORS-enabled or browser-compatible, 3) No server-side proxy required. Include APIs from Google (Gemini), Groq, Hugging Face, Cloudflare Workers AI, and any others. List rate limits, model availability, and whether API keys can be safely exposed client-side."}
TOOL: I encountered issues with obtaining search results for your queries. Is there a specific approach or different angle you would like me to try to find the information you need?
TOOL-CALL: fetch {"url":"https://ai.google.dev/pricing","format":"text"}
TOOL-CALL: fetch {"url":"https://console.groq.com/docs/rate-limits","format":"text"}
TOOL-CALL: fetch {"url":"https://developers.cloudflare.com/workers-ai/","format":"text"}
TOOL: Gemini Developer API pricing
We have updated our Terms of Service.
Start building free of charge with generous limits, then scale up with pay-as-you-go pricing for your production ready applications.

Free: For developers and small projects getting started with the Gemini API.
- Limited access to certain models
- Free input & output tokens
- Google AI Studio access
- Content used to improve our products*

Paid: For production applications that require higher volumes and advanced features.
- Higher rate limits for production deployments
- Access to Context caching
- Batch API (50% cost reduction)
- Access to Google's most advanced models
- Content not used to improve our products*

Enterprise: For large-scale deployments with custom needs for security, support, and compliance, powered by Vertex AI.
- All features in Paid, plus optional access to:
- Dedicated support channels
- Advanced security & compliance
- Provisioned throughput
- Volume-based discounts (based on usage)
- ML ops, model garden and more

Gemini 3 Pro Preview (gemini-3-pro-preview)
The best model in the world for multimodal understanding, and our most powerful agentic and vibe-coding model yet.

| Standard | Free Tier | Paid Tier, per 1M tokens in USD |
|----------|-----------|---------------------------------|
| Input price | Not available | $2.00, prompts <= 200k tokens; $4.00, prompts > 200k tokens |
| Output price (including thinking tokens) | Not available | $12.00, prompts <= 200k tokens; $18.00, prompts > 200k |
| Context caching price | Not available | $0.20, prompts <= 200k tokens; $0.40, prompts > 200k; $4.50 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search* | Not available | 5,000 prompts per month (free), then (Coming soon**) $14 / 1,000 search queries |
| Grounding with Google Maps | Not available | Not available |
| Used to improve our products | Yes | No |

| Batch | Free Tier | Paid Tier, per 1M tokens in USD |
|-------|-----------|---------------------------------|
| Input price | Not available | $1.00, prompts <= 200k tokens; $2.00, prompts > 200k tokens |
| Output price (including thinking tokens) | Not available | $6.00, prompts <= 200k tokens; $9.00, prompts > 200k |
| Context caching price | Not available | $0.20, prompts <= 200k tokens; $0.40, prompts > 200k; $4.50 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search* | Not available | 1,500 RPD (free), then (Coming soon**) $14 / 1,000 search queries |
| Grounding with Google Maps | Not available | Not available |
| Used to improve our products | Yes | No |

* A customer-submitted request to Gemini may result in one or more queries to Google Search. You will be charged for each individual search query performed.
** Gemini 3 billing for Grounding with Google Search will start January 5, 2026.

Gemini 3 Flash Preview (gemini-3-flash-preview)
Our most intelligent model built for speed, combining frontier intelligence with superior search and grounding.

| Standard | Free Tier | Paid Tier, per 1M tokens in USD |
|----------|-----------|---------------------------------|
| Input price | Free of charge | $0.50 (text / image / video); $1.00 (audio) |
| Output price (including thinking tokens) | Free of charge | $3.00 |
| Context caching price | Free of charge | $0.05 (text / image / video); $0.10 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search* | Not available | 5,000 prompts per month (free), then (Coming soon**) $14 / 1,000 search queries |
| Grounding with Google Maps | Not available | Not available |
| Used to improve our products | Yes | No |

| Batch | Free Tier | Paid Tier, per 1M tokens in USD |
|-------|-----------|---------------------------------|
| Input price | Not available | $0.25 (text / image / video); $0.50 (audio) |
| Output price (including thinking tokens) | Not available | $1.50 |
| Context caching price | Not available | Same as Standard, Batch pricing not yet implemented: $0.05 (text / image / video); $0.10 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search* | Not available | 1,500 RPD (free), then (Coming soon**) $14 / 1,000 search queries |
| Grounding with Google Maps | Not available | Not available |
| Used to improve our products | Yes | No |

* A customer-submitted request to Gemini may result in one or more queries to Google Search. You will be charged for each individual search query performed.
** Gemini 3 billing for Grounding with Google Search will start January 5, 2026.

Gemini 3 Pro Image Preview 🍌 (gemini-3-pro-image-preview)
Our native image generation model, optimized for speed, flexibility, and contextual understanding. Text input and output is priced the same as Gemini 3 Pro. Preview models may change before becoming stable and have more restrictive rate limits.

| Standard | Free Tier | Paid Tier, per 1M tokens in USD |
|----------|-----------|---------------------------------|
| Input price | Not available | $2.00 (text/image), equivalent to $0.0011 per image* |
| Output price | Not available | $12.00 (text and thinking); $120.00 (images), equivalent to $0.134 per 1K/2K image** and $0.24 per 4K image** |
| Used to improve our products | Yes | No |

| Batch | Free Tier | Paid Tier, per 1M tokens in USD |
|-------|-----------|---------------------------------|
| Input price | Not available | $1.00 (text); $0.0006 (image)* |
| Output price | Not available | $6.00 (text and thinking); $0.067 per 1K/2K image**; $0.12 per 4K image** |
| Used to improve our products | Yes | No |

* Image input is set at 560 tokens or $0.0011 per image.
** Image output is priced at $120 per 1,000,000 tokens. Output images from 1024x1024px (1K) and up to 2048x2048px (2K) consume 1120 tokens and are equivalent to $0.134 per image. Output images up to 4096x4096px (4K) consume 2000 tokens and are equivalent to $0.24 per image.

Gemini 2.5 Pro (gemini-2.5-pro)
Our state-of-the-art multipurpose model, which excels at coding and complex reasoning tasks.

| Standard | Free Tier | Paid Tier, per 1M tokens in USD |
|----------|-----------|---------------------------------|
| Input price | Free of charge | $1.25, prompts <= 200k tokens; $2.50, prompts > 200k tokens |
| Output price (including thinking tokens) | Free of charge | $10.00, prompts <= 200k tokens; $15.00, prompts > 200k |
| Context caching price | Not available | $0.125, prompts <= 200k tokens; $0.25, prompts > 200k; $4.50 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Not available | 1,500 RPD (free), then $35 / 1,000 grounded prompts |
| Grounding with Google Maps | Not available | 10,000 RPD (free), then $25 / 1,000 grounded prompts |
| Used to improve our products | Yes | No |

| Batch | Free Tier | Paid Tier, per 1M tokens in USD |
|-------|-----------|---------------------------------|
| Input price | Not available | $0.625, prompts <= 200k tokens; $1.25, prompts > 200k tokens |
| Output price (including thinking tokens) | Not available | $5.00, prompts <= 200k tokens; $7.50, prompts > 200k |
| Context caching price | Not available | $0.125, prompts <= 200k tokens; $0.25, prompts > 200k; $4.50 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Not available | 1,500 RPD (free), then $35 / 1,000 grounded prompts |
| Grounding with Google Maps | Not available | Not available |
| Used to improve our products | Yes | No |

Gemini 2.5 Flash (gemini-2.5-flash)
Our first hybrid reasoning model which supports a 1M token context window and has thinking budgets.

| Standard | Free Tier | Paid Tier, per 1M tokens in USD |
|----------|-----------|---------------------------------|
| Input price | Free of charge | $0.30 (text / image / video); $1.00 (audio) |
| Output price (including thinking tokens) | Free of charge | $2.50 |
| Context caching price | Not available | $0.03 (text / image / video); $0.1 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Free of charge, up to 500 RPD (limit shared with Flash-Lite RPD) | 1,500 RPD (free, limit shared with Flash-Lite RPD), then $35 / 1,000 grounded prompts |
| Grounding with Google Maps | 500 RPD | 1,500 RPD (free), then $25 / 1,000 grounded prompts |
| Used to improve our products | Yes | No |

| Batch | Free Tier | Paid Tier, per 1M tokens in USD |
|-------|-----------|---------------------------------|
| Input price | Not available | $0.15 (text / image / video); $0.50 (audio) |
| Output price (including thinking tokens) | Not available | $1.25 |
| Context caching price | Not available | $0.03 (text / image / video); $0.1 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Not available | 1,500 RPD (free, limit shared with Flash-Lite RPD), then $35 / 1,000 grounded prompts |
| Grounding with Google Maps | Not available | Not available |
| Used to improve our products | Yes | No |

Gemini 2.5 Flash Preview (gemini-2.5-flash-preview-09-2025)
The latest model based on the 2.5 Flash model. 2.5 Flash Preview is best for large scale processing, low-latency, high volume tasks that require thinking, and agentic use cases.
Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Free of charge | $0.30 (text / image / video); $1.00 (audio) |
| Output price (including thinking tokens) | Free of charge | $2.50 |
| Context caching price | Not available | $0.03 (text / image / video); $0.1 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Free of charge, up to 500 RPD (limit shared with Flash-Lite RPD) | 1,500 RPD (free, limit shared with Flash-Lite RPD), then $35 / 1,000 grounded prompts |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $0.15 (text / image / video); $0.50 (audio) |
| Output price (including thinking tokens) | Not available | $1.25 |
| Context caching price | Not available | $0.03 (text / image / video); $0.1 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Not available | 1,500 RPD (free, limit shared with Flash-Lite RPD), then $35 / 1,000 grounded prompts |
| Used to improve our products | Yes | No |

Gemini 2.5 Flash-Lite (gemini-2.5-flash-lite)

Our smallest and most cost effective model, built for at scale usage.
Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price (text, image, video) | Free of charge | $0.10 (text / image / video); $0.30 (audio) |
| Output price (including thinking tokens) | Free of charge | $0.40 |
| Context caching price | Not available | $0.01 (text / image / video); $0.03 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Free of charge, up to 500 RPD (limit shared with Flash RPD) | 1,500 RPD (free, limit shared with Flash RPD), then $35 / 1,000 grounded prompts |
| Grounding with Google Maps | 500 RPD | 1,500 RPD (free), then $25 / 1,000 grounded prompts |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price (text, image, video) | Not available | $0.05 (text / image / video); $0.15 (audio) |
| Output price (including thinking tokens) | Not available | $0.20 |
| Context caching price | Not available | $0.01 (text / image / video); $0.03 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Not available | 1,500 RPD (free, limit shared with Flash RPD), then $35 / 1,000 grounded prompts |
| Grounding with Google Maps | Not available | Not available |
| Used to improve our products | Yes | No |

Gemini 2.5 Flash-Lite Preview (gemini-2.5-flash-lite-preview-09-2025)

The latest model based on Gemini 2.5 Flash lite optimized for cost-efficiency, high throughput and high quality.
Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price (text, image, video) | Free of charge | $0.10 (text / image / video); $0.30 (audio) |
| Output price (including thinking tokens) | Free of charge | $0.40 |
| Context caching price | Not available | $0.01 (text / image / video); $0.03 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Free of charge, up to 500 RPD (limit shared with Flash RPD) | 1,500 RPD (free, limit shared with Flash RPD), then $35 / 1,000 grounded prompts |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price (text, image, video) | Not available | $0.05 (text / image / video); $0.15 (audio) |
| Output price (including thinking tokens) | Not available | $0.20 |
| Context caching price | Not available | $0.01 (text / image / video); $0.03 (audio); $1.00 / 1,000,000 tokens per hour (storage price) |
| Grounding with Google Search | Not available | 1,500 RPD (free, limit shared with Flash RPD), then $35 / 1,000 grounded prompts |
| Used to improve our products | Yes | No |

Gemini 2.5 Flash Native Audio (Live API) (gemini-2.5-flash-native-audio-preview-12-2025)

Our Live API native audio models optimized for higher quality audio outputs with better pacing, voice naturalness, verbosity, and mood. Preview models may change before becoming stable and have more restrictive rate limits.

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Free of charge | $0.50 (text); $3.00 (audio / video) |
| Output price (including thinking tokens) | Free of charge | $2.00 (text); $12.00 (audio) |
| Used to improve our products | Yes | No |

Gemini 2.5 Flash Image 🍌 (gemini-2.5-flash-image)

Our native image generation model, optimized for speed, flexibility, and contextual understanding. Text input and output is priced the same as 2.5 Flash. Preview models may change before becoming stable and have more restrictive rate limits.
Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $0.30 (text / image) |
| Output price | Not available | $0.039 per image* |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $0.15 (text / image) |
| Output price | Not available | $0.0195 per image* |
| Used to improve our products | Yes | No |

[*] Image output is priced at $30 per 1,000,000 tokens. Output images up to 1024x1024px consume 1290 tokens and are equivalent to $0.039 per image.

Gemini 2.5 Flash Preview TTS (gemini-2.5-flash-preview-tts)

Our 2.5 Flash text-to-speech audio model optimized for price-performant, low-latency, controllable speech generation. Preview models may change before becoming stable and have more restrictive rate limits.

Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Free of charge | $0.50 (text) |
| Output price | Free of charge | $10.00 (audio) |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $0.25 (text) |
| Output price | Not available | $5.00 (audio) |
| Used to improve our products | Yes | No |

Gemini 2.5 Pro Preview TTS (gemini-2.5-pro-preview-tts)

Our 2.5 Pro text-to-speech audio model optimized for powerful, low-latency speech generation for more natural outputs and easier to steer prompts. Preview models may change before becoming stable and have more restrictive rate limits.

Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $1.00 (text) |
| Output price | Not available | $20.00 (audio) |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $0.50 (text) |
| Output price | Not available | $10.00 (audio) |
| Used to improve our products | Yes | No |

Gemini 2.0 Flash (gemini-2.0-flash)

Our most balanced multimodal model with great performance across all tasks, with a 1 million token context window, and built for the era of Agents.
Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Free of charge | $0.10 (text / image / video); $0.70 (audio) |
| Output price | Free of charge | $0.40 |
| Context caching price | Free of charge | $0.025 / 1,000,000 tokens (text/image/video); $0.175 / 1,000,000 tokens (audio) |
| Context caching (storage) | Not available | $1.00 / 1,000,000 tokens per hour |
| Image generation pricing | Free of charge | $0.039 per image* |
| Tuning price | Not available | Not available |
| Grounding with Google Search | Free of charge, up to 500 RPD | 1,500 RPD (free), then $35 / 1,000 grounded prompts |
| Grounding with Google Maps | 500 RPD | 1,500 RPD (free), then $25 / 1,000 grounded prompts |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $0.05 (text / image / video); $0.35 (audio) |
| Output price | Not available | $0.20 |
| Context caching price | Not available | $0.025 / 1,000,000 tokens (text/image/video); $0.175 / 1,000,000 tokens (audio) |
| Context caching (storage) | Not available | $1.00 / 1,000,000 tokens per hour |
| Image generation pricing | Not available | $0.0195 per image* |
| Tuning price | Not available | Not available |
| Grounding with Google Search | Not available | 1,500 RPD (free), then $35 / 1,000 grounded prompts |
| Grounding with Google Maps | Not available | Not available |
| Used to improve our products | Yes | No |

[*] Image output is priced at $30 per 1,000,000 tokens. Output images up to 1024x1024px consume 1290 tokens and are equivalent to $0.039 per image.

Gemini 2.0 Flash-Lite (gemini-2.0-flash-lite)

Our smallest and most cost effective model, built for at scale usage.
Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Free of charge | $0.075 |
| Output price | Free of charge | $0.30 |
| Context caching price | Not available | Not available |
| Context caching (storage) | Not available | Not available |
| Tuning price | Not available | Not available |
| Grounding with Google Search | Not available | Not available |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $0.0375 |
| Output price | Not available | $0.15 |
| Context caching price | Not available | Not available |
| Context caching (storage) | Not available | Not available |
| Tuning price | Not available | Not available |
| Grounding with Google Search | Not available | Not available |
| Used to improve our products | Yes | No |

Imagen 4 (imagen-4.0-generate-001, imagen-4.0-ultra-generate-001, imagen-4.0-fast-generate-001)

Our latest image generation model, with significantly better text rendering and better overall image quality. Preview models may change before becoming stable and have more restrictive rate limits.

| | Free Tier | Paid Tier, per Image in USD |
|---|---|---|
| Imagen 4 Fast image price | Not available | $0.02 |
| Imagen 4 Standard image price | Not available | $0.04 |
| Imagen 4 Ultra image price | Not available | $0.06 |
| Used to improve our products | Yes | No |

Imagen 3 (imagen-3.0-generate-002)

Our state-of-the-art image generation model, available to developers on the paid tier of the Gemini API.

| | Free Tier | Paid Tier, per Image in USD |
|---|---|---|
| Image price | Not available | $0.03 |
| Used to improve our products | Yes | No |

Veo 3.1 (veo-3.1-generate-preview, veo-3.1-fast-generate-preview)

Our latest video generation model, available to developers on the paid tier of the Gemini API. Preview models may change before becoming stable and have more restrictive rate limits.
| | Free Tier | Paid Tier, per second in USD |
|---|---|---|
| Veo 3.1 Standard video with audio price (default) | Not available | $0.40 |
| Veo 3.1 Fast video with audio price (default) | Not available | $0.15 |
| Used to improve our products | Yes | No |

Note: In some cases, an audio processing issue may prevent a video from being generated. You will only be charged if your video is successfully generated.

Veo 3 (veo-3.0-generate-001, veo-3.0-fast-generate-001)

Our stable video generation model, available to developers on the paid tier of the Gemini API.

| | Free Tier | Paid Tier, per second in USD |
|---|---|---|
| Veo 3 Standard video with audio price (default) | Not available | $0.40 |
| Veo 3 Fast video with audio price (default) | Not available | $0.15 |
| Used to improve our products | Yes | No |

Note: In some cases, an audio processing issue may prevent a video from being generated. You will only be charged if your video is successfully generated.

Veo 2 (veo-2.0-generate-001)

Our state-of-the-art video generation model, available to developers on the paid tier of the Gemini API.

| | Free Tier | Paid Tier, per second in USD |
|---|---|---|
| Video price | Not available | $0.35 |
| Used to improve our products | Yes | No |

Gemini Embedding (gemini-embedding-001)

Our newest embeddings model, more stable and with higher rate limits than previous versions, available to developers on the free and paid tiers of the Gemini API.

Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Free of charge | $0.15 |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $0.075 |
| Used to improve our products | Yes | No |

Gemini Robotics-ER 1.5 Preview (gemini-robotics-er-1.5-preview)

Gemini Robotics-ER, short for Gemini Robotics-Embodied Reasoning, is a thinking model that enhances robots' abilities to understand and interact with the physical world.
Standard

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Free of charge | $0.30 (text / image / video); $1.00 (audio) |
| Output price (including thinking tokens) | Free of charge | $2.50 |
| Grounding with Google Search | Free of charge, up to 500 RPD (limit shared with Flash-Lite RPD) | 1,500 RPD (free, limit shared with Flash-Lite RPD), then $35 / 1,000 grounded prompts |
| Used to improve our products | Yes | No |

Batch

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | Not available |
| Output price (including thinking tokens) | Not available | Not available |
| Grounding with Google Search | Not available | Not available |
| Used to improve our products | Yes | No |

Gemini 2.5 Computer Use Preview (gemini-2.5-computer-use-preview-10-2025)

Our Computer Use model optimized for building browser control agents that automate tasks.

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Not available | $1.25, prompts <= 200k tokens; $2.50, prompts > 200k tokens |
| Output price | Not available | $10.00, prompts <= 200k tokens; $15.00, prompts > 200k |
| Used to improve our products | Yes | No |

Gemma 3

Our lightweight, state-of-the-art, open model built from the same technology that powers our Gemini models.

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Free of charge | Not available |
| Output price | Free of charge | Not available |
| Context caching price | Free of charge | Not available |
| Context caching (storage) | Free of charge | Not available |
| Tuning price | Not available | Not available |
| Grounding with Google Search | Not available | Not available |
| Used to improve our products | Yes | No |

Gemma 3n

Our open model built for efficient performance on everyday devices like mobile phones, laptops, and tablets.
| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Input price | Free of charge | Not available |
| Output price | Free of charge | Not available |
| Context caching price | Free of charge | Not available |
| Context caching (storage) | Free of charge | Not available |
| Tuning price | Not available | Not available |
| Grounding with Google Search | Not available | Not available |
| Used to improve our products | Yes | No |

Pricing for tools

Tools are priced at their own rates, applied to the model using them. Check the Models page for which tools are available to each model.

| | Free Tier | Paid Tier, per 1M tokens in USD |
|---|---|---|
| Google Search | 500 RPD free (limit shared for Flash and Flash-Lite). Not available for Pro. | 1,500 RPD free (limit shared for Flash and Flash-Lite). Then $35 / 1,000 grounded prompts. See the Gemini 3 Pro Preview pricing section for Gemini 3 specific Search pricing. |
| Google Maps | 500 RPD. Not available for Pro. | 1,500 RPD free (limit shared for Flash and Flash-Lite). 10,000 RPD free for Pro. Then $25 / 1,000 grounded prompts |
| Code execution | Free of charge | Free of charge |
| URL context | Free of charge | Charged as input tokens per model pricing. |
| Computer use | Not available | See Gemini 2.5 Computer Use Preview pricing table. |
| File search | Free of charge | Charged for embeddings at $0.15 / 1M tokens. Retrieved document tokens charged as regular tokens per model pricing. |

[*] Google AI Studio usage is free of charge in all available regions. See Billing FAQs for details.
[**] Prices may differ from the prices listed here and the prices offered on Vertex AI. For Vertex prices, see the Vertex AI pricing page.
[***] If you are using dynamic retrieval to optimize costs, only requests that contain at least one grounding support URL from the web in their response are charged for Grounding with Google Search. Costs for Gemini always apply. Rate limits are subject to change.

Pricing for agents

Agent usage costs are calculated based on the underlying token consumption and usage of the tools.
| | Model | Tools |
|---|---|---|
| Gemini Deep Research Agent | All model inference is charged at standard Gemini 3 Pro list rates, including input, output, and intermediate input / reasoning tokens generated during agentic loops. | Tool usage fees apply per existing pricing structure, maintaining standard distinctions for Search Grounding (retrieved tokens excluded) versus Url_context / File Search (retrieved tokens included in all other tools). |

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Last updated 2025-12-18 UTC.
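The tiered rows in the pricing tables above lend themselves to a quick cost estimate. The following is a minimal Python sketch, not an official calculator. It uses the $1.25 / $2.50 input and $10.00 / $15.00 output rates with the 200k-token prompt threshold quoted above, and it assumes that an entire request is billed at the tier selected by its prompt size. The names `TieredRate`, `request_cost`, and `image_cost` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieredRate:
    """USD per 1M tokens, with a higher rate once the prompt exceeds a threshold."""
    base: float
    long_prompt: float
    threshold_tokens: int = 200_000

    def per_million(self, prompt_tokens: int) -> float:
        # The tier is chosen by prompt size, for input and output alike (assumption).
        if prompt_tokens <= self.threshold_tokens:
            return self.base
        return self.long_prompt


# Standard paid-tier rates copied from the tiered table rows above.
INPUT_RATE = TieredRate(base=1.25, long_prompt=2.50)
OUTPUT_RATE = TieredRate(base=10.00, long_prompt=15.00)


def request_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (output tokens include thinking tokens)."""
    input_cost = prompt_tokens / 1_000_000 * INPUT_RATE.per_million(prompt_tokens)
    output_cost = output_tokens / 1_000_000 * OUTPUT_RATE.per_million(prompt_tokens)
    return input_cost + output_cost


def image_cost(tokens_per_image: int = 1290, usd_per_million: float = 30.0) -> float:
    """Per-image cost implied by the footnote: 1290 output tokens at $30 per 1M tokens."""
    return tokens_per_image * usd_per_million / 1_000_000


if __name__ == "__main__":
    print(f"short prompt: ${request_cost(100_000, 2_000):.4f}")
    print(f"long prompt:  ${request_cost(300_000, 2_000):.4f}")
    print(f"one image:    ${image_cost():.4f}")
```

Crossing the 200k-token prompt threshold raises both the input and the output rate in this model, so trimming a prompt to just under the threshold affects the whole request, not only the input side.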
TOOL: Rate Limits

Rate limits act as control measures to regulate how frequently users and applications can access our API within specified timeframes. These limits help ensure service stability, fair access, and protection against misuse so that we can serve reliable and fast inference for all.

Understanding Rate Limits

Rate limits are measured in:
- RPM: Requests per minute
- RPD: Requests per day
- TPM: Tokens per minute
- TPD: Tokens per day
- ASH: Audio seconds per hour
- ASD: Audio seconds per day

Cached tokens do not count towards your rate limits. Rate limits apply at the organization level, not individual users. You can hit any limit type depending on which threshold you reach first.

Example: Let's say your RPM = 50 and your TPM = 200K. If you were to send 50 requests with only 100 tokens within a minute, you would reach your limit even though you did not send 200K tokens within those 50 requests.

Rate Limits

The following is a high level summary and there may be exceptions to these limits. You can view the current, exact rate limits for your organization on the limits page in your account settings.

Need higher rate limits? Upgrade to Developer plan to access higher limits, Batch and Flex processing, and more.
Note that the limits shown below are the base limits for the Developer plan, and higher limits are available for select workloads and enterprise use cases.

Free Plan Limits / Developer Plan Limits

| MODEL ID | RPM | RPD | TPM | TPD | ASH | ASD |
|---|---|---|---|---|---|---|
| allam-2-7b | 30 | 7K | 6K | 500K | - | - |
| canopylabs/orpheus-arabic-saudi | 10 | 100 | 1.2K | 3.6K | - | - |
| canopylabs/orpheus-v1-english | 10 | 100 | 1.2K | 3.6K | - | - |
| groq/compound | 30 | 250 | 70K | - | - | - |
| groq/compound-mini | 30 | 250 | 70K | - | - | - |
| llama-3.1-8b-instant | 30 | 14.4K | 6K | 500K | - | - |
| llama-3.3-70b-versatile | 30 | 1K | 12K | 100K | - | - |
| meta-llama/llama-4-maverick-17b-128e-instruct | 30 | 1K | 6K | 500K | - | - |
| meta-llama/llama-4-scout-17b-16e-instruct | 30 | 1K | 30K | 500K | - | - |
| meta-llama/llama-guard-4-12b | 30 | 14.4K | 15K | 500K | - | - |
| meta-llama/llama-prompt-guard-2-22m | 30 | 14.4K | 15K | 500K | - | - |
| meta-llama/llama-prompt-guard-2-86m | 30 | 14.4K | 15K | 500K | - | - |
| moonshotai/kimi-k2-instruct | 60 | 1K | 10K | 300K | - | - |
| moonshotai/kimi-k2-instruct-0905 | 60 | 1K | 10K | 300K | - | - |
| openai/gpt-oss-120b | 30 | 1K | 8K | 200K | - | - |
| openai/gpt-oss-20b | 30 | 1K | 8K | 200K | - | - |
| openai/gpt-oss-safeguard-20b | 30 | 1K | 8K | 200K | - | - |
| qwen/qwen3-32b | 60 | 1K | 6K | 500K | - | - |
| whisper-large-v3 | 20 | 2K | - | - | 7.2K | 28.8K |
| whisper-large-v3-turbo | 20 | 2K | - | - | 7.2K | 28.8K |

Rate Limit Headers

In addition to viewing your limits on your account's limits page, you can also view rate limit information such as remaining requests and tokens in HTTP response headers as follows:

The following headers are set (values are illustrative):

| Header | Value | Notes |
|---|---|---|
| retry-after | 2 | In seconds |
| x-ratelimit-limit-requests | 14400 | Always refers to Requests Per Day (RPD) |
| x-ratelimit-limit-tokens | 18000 | Always refers to Tokens Per Minute (TPM) |
| x-ratelimit-remaining-requests | 14370 | Always refers to Requests Per Day (RPD) |
| x-ratelimit-remaining-tokens | 17997 | Always refers to Tokens Per Minute (TPM) |
| x-ratelimit-reset-requests | 2m59.56s | Always refers to Requests Per Day (RPD) |
| x-ratelimit-reset-tokens | 7.66s | Always refers to Tokens Per Minute (TPM) |

Handling Rate Limits

When you exceed rate limits, our API returns a 429 Too Many Requests HTTP status code.

Note: retry-after is only set if you hit the rate limit and status code 429 is returned.
The other headers are always included.
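The handling rules above (a 429 status, a `retry-after` value in seconds, and reset values such as `2m59.56s`) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the provider's client: `parse_reset` and `call_with_retry` are hypothetical names, the `send` callable stands in for whatever HTTP client is in use and is assumed to return `(status, headers, body)` with lower-case header names, and the duration format is inferred only from the illustrative values in the header table.

```python
import re
import time

# Matches pieces such as "2m", "59.56s", or "500ms" (format inferred from the examples).
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_reset(value: str) -> float:
    """Convert a reset header value such as '2m59.56s' or '7.66s' to seconds."""
    parts = _DURATION_PART.findall(value)
    # Reject anything the pattern did not fully account for.
    if not parts or "".join(number + unit for number, unit in parts) != value:
        raise ValueError(f"unrecognized duration: {value!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it stops returning 429, waiting as the server asks.

    send is a hypothetical zero-argument callable returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        # retry-after is only present on 429 responses; if it is missing,
        # fall back to exponential backoff (an assumption of this sketch).
        delay = float(headers.get("retry-after", 2 ** attempt))
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting, and `parse_reset` can be applied to `x-ratelimit-reset-requests` or `x-ratelimit-reset-tokens` to pace requests before a 429 ever occurs.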
ssOrigin\":\"$undefined\",\"nonce\":\"$undefined\"}]],\"$L2\"]}],{\"children\":[\"(console)\",[\"$\",\"$1\",\"c\",{\"children\":[null,[\"$\",\"$L3\",null,{\"Component\":\"$4\",\"slots\":{\"children\":[\"$\",\"$L5\",null,{\"parallelRouterKey\":\"children\",\"error\":\"$6\",\"errorStyles\":[],\"errorScripts\":[],\"template\":[\"$\",\"$L7\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":\"$undefined\",\"forbidden\":\"$undefined\",\"unauthorized\":\"$undefined\"}]},\"params\":{},\"promise\":\"$@8\"}]]}],{\"children\":[\"docs\",[\"$\",\"$1\",\"c\",{\"children\":[[[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/_next/static/css/c584d21c6b86a9bf.css\",\"precedence\":\"next\",\"crossOrigin\":\"$undefined\",\"nonce\":\"$undefined\"}]],\"$L9\"]}],{\"children\":[\"(mdx-pages)\",[\"$\",\"$1\",\"c\",{\"children\":[null,[\"$\",\"div\",null,{\"children\":[[\"$\",\"$L5\",null,{\"parallelRouterKey\":\"children\",\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L7\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":\"$undefined\",\"forbidden\":\"$undefined\",\"unauthorized\":\"$undefined\"}],[\"$\",\"$La\",null,{}]]}]]}],{\"children\":[\"rate-limits\",[\"$\",\"$1\",\"c\",{\"children\":[null,[\"$\",\"$L5\",null,{\"parallelRouterKey\":\"children\",\"error\":\"$undefined\",\"errorStyles\":\"$undefined\",\"errorScripts\":\"$undefined\",\"template\":[\"$\",\"$L7\",null,{}],\"templateStyles\":\"$undefined\",\"templateScripts\":\"$undefined\",\"notFound\":\"$undefined\",\"forbidden\":\"$undefined\",\"unauthorized\":\"$undefined\"}]]}],{\"children\":[\"__PAGE__\",[\"$\",\"$1\",\"c\",{\"children\":[[[\"$\",\"div\",null,{\"className\":\"flex flex-col sm:flex-row sm:items-center gap-2 sm:gap-4 mb-3 justify-between\",\"children\":[[\"$\",\"h1\",null,{\"className\":\"text-2xl font-semibold font-header\",\"children\":\"Rate 
Limits\"}],[\"$\",\"$Lb\",null,{}]]}],\"\\n\",[\"$\",\"p\",null,{\"className\":\"text-sm my-3\",\"children\":\"Rate limits act as control measures to regulate how frequently users and applications can access our API within specified timeframes. These limits help ensure service stability, fair access, and protection\\nagainst misuse so that we can serve reliable and fast inference for all.\"}],\"\\n\",[\"$\",\"h2\",null,{\"id\":\"understanding-rate-limits\",\"className\":\"[\u0026:not(:first-child)]:mt-12 mb-3 text-xl w-fit font-semibold font-header hover:underline\",\"children\":[\"$\",\"a\",null,{\"href\":\"#understanding-rate-limits\",\"className\":\"anchor-link\",\"children\":[\"Understanding Rate Limits\"]}]}],\"\\n\",[\"$\",\"p\",null,{\"className\":\"text-sm my-3\",\"children\":\"Rate limits are measured in:\"}],\"\\n\",[\"$\",\"ul\",null,{\"className\":\"list-disc my-1 text-sm\",\"children\":[\"\\n\",[\"$\",\"li\",null,{\"className\":\"ml-6 my-1.5 text-sm\",\"children\":[[\"$\",\"strong\",null,{\"children\":\"RPM:\"}],\" Requests per minute\"]}],\"\\n\",[\"$\",\"li\",null,{\"className\":\"ml-6 my-1.5 text-sm\",\"children\":[[\"$\",\"strong\",null,{\"children\":\"RPD:\"}],\" Requests per day\"]}],\"\\n\",[\"$\",\"li\",null,{\"className\":\"ml-6 my-1.5 text-sm\",\"children\":[[\"$\",\"strong\",null,{\"children\":\"TPM:\"}],\" Tokens per minute\"]}],\"\\n\",[\"$\",\"li\",null,{\"className\":\"ml-6 my-1.5 text-sm\",\"children\":[[\"$\",\"strong\",null,{\"children\":\"TPD:\"}],\" Tokens per day\"]}],\"\\n\",[\"$\",\"li\",null,{\"className\":\"ml-6 my-1.5 text-sm\",\"children\":[[\"$\",\"strong\",null,{\"children\":\"ASH:\"}],\" Audio seconds per hour\"]}],\"\\n\",[\"$\",\"li\",null,{\"className\":\"ml-6 my-1.5 text-sm\",\"children\":[[\"$\",\"strong\",null,{\"children\":\"ASD:\"}],\" Audio seconds per day\"]}],\"\\n\"]}],\"\\n\",[\"$\",\"div\",null,{\"ref\":\"$undefined\",\"role\":\"alert\",\"className\":\"relative w-full rounded-lg p-4 [\u0026\u003esvg~*]:pl-7 
[\u0026\u003esvg]:absolute [\u0026\u003esvg]:left-4 [\u0026\u003esvg]:top-4 [\u0026\u003esvg]:text-foreground bg-background-secondary text-sm px-3 pl-0 text-announcement-foreground\",\"children\":[\"$\",\"div\",null,{\"className\":\"flex items-center space-x-2 p-2 border rounded-md bg-blue-100 border-blue-500 dark:bg-[#0a2b4c] dark:border-[#14558d]\",\"children\":[[\"$\",\"svg\",null,{\"ref\":\"$undefined\",\"xmlns\":\"http://www.w3.org/2000/svg\",\"width\":16,\"height\":16,\"viewBox\":\"0 0 24 24\",\"fill\":\"none\",\"stroke\":\"currentColor\",\"strokeWidth\":2,\"strokeLinecap\":\"round\",\"strokeLinejoin\":\"round\",\"className\":\"lucide lucide-info shrink-0 text-blue-500\",\"aria-hidden\":\"true\",\"children\":[\"$Lc\",\"$Ld\",\"$Le\",\"$undefined\"]}],\"$Lf\"]}]}],\"\\n\",\"$L10\",\"\\n\",\"$L11\",\"\\n\",\"$L12\",\"\\n\",\"$L13\",\"\\n\",\"$L14\",\"\\n\",\"$L15\",\"\\n\",\"$L16\",\"\\n\",\"$L17\",\"\\n\",\"$L18\",\"\\n\",\"$L19\",\"\\n\",\"$L1a\",\"\\n\",\"$L1b\",\"\\n\",\"$L1c\",\"\\n\",\"$L1d\",\"\\n\",\"$L1e\",\"\\n\",\"$L1f\",\"\\n\",\"$L20\",\"\\n\",\"$L21\"],[\"$L22\"],\"$L23\"]}],{},null,false]},null,false]},null,false]},null,false]},null,false]},null,false],\"$L24\",false]],\"m\":\"$undefined\",\"G\":[\"$25\",[\"$L26\",\"$L27\",\"$L28\"]],\"s\":false,\"S\":true}\n"])self.__next_f.push([1,"29:I[49708,[\"9817\",\"static/chunks/a80d2a2d-995f3588264c9768.js\",\"4023\",\"static/chunks/fba002aa-6277ce0237ac1df9.js\",\"4620\",\"static/chunks/4620-e1aaeee1f717953d.js\",\"1583\",\"static/chunks/1583-cd6cfde0c82061af.js\",\"8172\",\"static/chunks/8172-1bfde9a6a7f76d76.js\",\"2845\",\"static/chunks/2845-233d0b7733254462.js\",\"6808\",\"static/chunks/6808-c0c7b12c1b07a90a.js\",\"1850\",\"static/chunks/1850-70f7c54b75f154be.js\",\"54\",\"static/chunks/54-81fd6d34fa1b6f5c.js\",\"3896\",\"static/chunks/3896-85f929747408a502.js\",\"6936\",\"static/chunks/6936-39cb1754394917fe.js\",\"6767\",\"static/chunks/6767-da92d72583ce2ee5.js\",\"1489\",\"static/chunks/1489-14b8a5
cb2f9ff04a.js\",\"5183\",\"static/chunks/5183-d52e57595b0209a4.js\",\"9791\",\"static/chunks/9791-73094ddf02612085.js\",\"2376\",\"static/chunks/2376-3cb5f78aefc53c11.js\",\"4897\",\"static/chunks/4897-15b235f72e0c3911.js\",\"8932\",\"static/chunks/8932-458d55afb25ec506.js\",\"1530\",\"static/chunks/1530-5c6e42810cf6f581.js\",\"6672\",\"static/chunks/6672-98691e627e9f180d.js\",\"9867\",\"static/chunks/9867-891ea7f43c3211da.js\",\"1285\",\"static/chunks/app/(console)/docs/(mdx-pages)/rate-limits/page-5c07fa1cd054cc3c.js\"],\"Link\"]\n"])self.__next_f.push([1,"2a:I[6901,[\"9817\",\"static/chunks/a80d2a2d-995f3588264c9768.js\",\"4023\",\"static/chunks/fba002aa-6277ce0237ac1df9.js\",\"4620\",\"static/chunks/4620-e1aaeee1f717953d.js\",\"1583\",\"static/chunks/1583-cd6cfde0c82061af.js\",\"8172\",\"static/chunks/8172-1bfde9a6a7f76d76.js\",\"2845\",\"static/chunks/2845-233d0b7733254462.js\",\"6808\",\"static/chunks/6808-c0c7b12c1b07a90a.js\",\"1850\",\"static/chunks/1850-70f7c54b75f154be.js\",\"54\",\"static/chunks/54-81fd6d34fa1b6f5c.js\",\"3896\",\"static/chunks/3896-85f929747408a502.js\",\"6936\",\"static/chunks/6936-39cb1754394917fe.js\",\"6767\",\"static/chunks/6767-da92d72583ce2ee5.js\",\"1489\",\"static/chunks/1489-14b8a5cb2f9ff04a.js\",\"5183\",\"static/chunks/5183-d52e57595b0209a4.js\",\"9791\",\"static/chunks/9791-73094ddf02612085.js\",\"2376\",\"static/chunks/2376-3cb5f78aefc53c11.js\",\"4897\",\"static/chunks/4897-15b235f72e0c3911.js\",\"8932\",\"static/chunks/8932-458d55afb25ec506.js\",\"1530\",\"static/chunks/1530-5c6e42810cf6f581.js\",\"6672\",\"static/chunks/6672-98691e627e9f180d.js\",\"9867\",\"static/chunks/9867-891ea7f43c3211da.js\",\"1285\",\"static/chunks/app/(console)/docs/(mdx-pages)/rate-limits/page-5c07fa1cd054cc3c.js\"],\"RateLimitsMessage\"]\n"])self.__next_f.push([1,"2b:I[50824,[\"9817\",\"static/chunks/a80d2a2d-995f3588264c9768.js\",\"4023\",\"static/chunks/fba002aa-6277ce0237ac1df9.js\",\"4620\",\"static/chunks/4620-e1aaeee1f717953d.js\",\"1583
\",\"static/chunks/1583-cd6cfde0c82061af.js\",\"8172\",\"static/chunks/8172-1bfde9a6a7f76d76.js\",\"2845\",\"static/chunks/2845-233d0b7733254462.js\",\"6808\",\"static/chunks/6808-c0c7b12c1b07a90a.js\",\"1850\",\"static/chunks/1850-70f7c54b75f154be.js\",\"54\",\"static/chunks/54-81fd6d34fa1b6f5c.js\",\"3896\",\"static/chunks/3896-85f929747408a502.js\",\"6936\",\"static/chunks/6936-39cb1754394917fe.js\",\"6767\",\"static/chunks/6767-da92d72583ce2ee5.js\",\"1489\",\"static/chunks/1489-14b8a5cb2f9ff04a.js\",\"5183\",\"static/chunks/5183-d52e57595b0209a4.js\",\"9791\",\"static/chunks/9791-73094ddf02612085.js\",\"2376\",\"static/chunks/2376-3cb5f78aefc53c11.js\",\"4897\",\"static/chunks/4897-15b235f72e0c3911.js\",\"8932\",\"static/chunks/8932-458d55afb25ec506.js\",\"1530\",\"static/chunks/1530-5c6e42810cf6f581.js\",\"6672\",\"static/chunks/6672-98691e627e9f180d.js\",\"9867\",\"static/chunks/9867-891ea7f43c3211da.js\",\"1285\",\"static/chunks/app/(console)/docs/(mdx-pages)/rate-limits/page-5c07fa1cd054cc3c.js\"],\"default\"]\n"])self.__next_f.push([1,"2c:I[13874,[],\"OutletBoundary\"]\n2e:I[96903,[],\"AsyncMetadataOutlet\"]\n30:I[13874,[],\"ViewportBoundary\"]\n32:I[13874,[],\"MetadataBoundary\"]\n33:\"$Sreact.suspense\"\nc:[\"$\",\"circle\",\"1mglay\",{\"cx\":\"12\",\"cy\":\"12\",\"r\":\"10\"}]\nd:[\"$\",\"path\",\"1dtifu\",{\"d\":\"M12 16v-4\"}]\ne:[\"$\",\"path\",\"e9boi3\",{\"d\":\"M12 8h.01\"}]\nf:[\"$\",\"div\",null,{\"className\":\"\",\"children\":[[\"$\",\"p\",\"p-0\",{\"children\":[[\"$\",\"$L29\",\"a-0\",{\"href\":\"/docs/prompt-caching\",\"children\":\"Cached tokens\"}],\" do not count towards your rate limits.\"]}]]}]\n10:[\"$\",\"p\",null,{\"className\":\"text-sm my-3\",\"children\":\"Rate limits apply at the organization level, not individual users. 
You can hit any limit type depending on which threshold you reach first.\"}]\n11:[\"$\",\"div\",null,{\"className\":\"h-3\"}]\n12:[\"$\",\"p\",null,{\"className\":\"text-sm my-3\",\"children\":[[\"$\",\"strong\",null,{\"children\":\"Example:\"}],\" Let's say your RPM = 50 and your TPM = 200K. If you were to send 50 requests with only 100 tokens within a minute, you would reach your limit even though you did not send 200K tokens within those\\n50 requests.\"]}]\n13:[\"$\",\"h2\",null,{\"id\":\"rate-limits\",\"className\":\"[\u0026:not(:first-child)]:mt-12 mb-3 text-xl w-fit font-semibold font-header hover:underline\",\"children\":[\"$\",\"a\",null,{\"href\":\"#rate-limits\",\"className\":\"anchor-link\",\"children\":[\"Rate Limits\"]}]}]\n14:[\"$\",\"p\",null,{\"className\":\"text-sm my-3\",\"children\":[\"The following is a high level summary and there may be exceptions to these limits. You can view the current, exact rate limits for your organization on the \",[\"$\",\"$L29\",null,{\"prefetch\":true,\"href\":\"/settings/limits\",\"children\":\"limits page\"}],\" in your account settings.\"]}]\n15:[\"$\",\"$L2a\",null,{}]\n16:[\"$\",\"div\",null,{\"className\":\"h-3\"}]\n18:[\"$\",\"h2\",null,{\"id\":\"rate-limit-headers\",\"className\":\"[\u0026:not(:first-child)]:mt-12 mb-3 text-xl w-fit font-semibold font-header hover:underline\",\"children\":[\"$\",\"a\",null,{\"href\":\"#rate-limit-headers\",\"className\":\"anchor-link\",\"children\":[\"Rate Limit Headers\"]}]}]\n19:[\"$\",\"p\",null,{\"className\":\"text-sm my-3\",\"children\":[\"In addition to viewi"])self.__next_f.push([1,"ng your limits on your account's \",[\"$\",\"$L29\",null,{\"prefetch\":true,\"href\":\"https://console.groq.com/settings/limits\",\"children\":\"limits\"}],\" page, you can also view rate limit information such as remaining requests and tokens in HTTP response\\nheaders as follows:\"]}]\n1a:[\"$\",\"div\",null,{\"className\":\"h-3\"}]\n1b:[\"$\",\"p\",null,{\"className\":\"text-sm 
my-3\",\"children\":\"The following headers are set (values are illustrative):\"}]\n1c:[\"$\",\"div\",null,{\"className\":\"h-3\"}]\n1d:[\"$\",\"$L2b\",null,{}]\n1e:[\"$\",\"h2\",null,{\"id\":\"handling-rate-limits\",\"className\":\"[\u0026:not(:first-child)]:mt-12 mb-3 text-xl w-fit font-semibold font-header hover:underline\",\"children\":[\"$\",\"a\",null,{\"href\":\"#handling-rate-limits\",\"className\":\"anchor-link\",\"children\":[\"Handling Rate Limits\"]}]}]\n1f:[\"$\",\"p\",null,{\"className\":\"text-sm my-3\",\"children\":[\"When you exceed rate limits, our API returns a \",[\"$\",\"code\",null,{\"className\":\"bg-gray-100 dark:bg-gray-800 py-0.5 px-[0.5em] rounded-[4px]\",\"children\":\"429 Too Many Requests\"}],\" HTTP status code.\"]}]\n20:[\"$\",\"div\",null,{\"className\":\"h-3\"}]\n21:[\"$\",\"p\",null,{\"className\":\"text-sm my-3\",\"children\":[[\"$\",\"strong\",null,{\"children\":\"Note\"}],\": \",[\"$\",\"code\",null,{\"className\":\"bg-gray-100 dark:bg-gray-800 py-0.5 px-[0.5em] rounded-[4px]\",\"children\":\"retry-after\"}],\" is only set if you hit the rate limit and status code 429 is returned. 
The other headers are always included.\"]}]\n22:[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/_next/static/css/ed0d2ab9f170a840.css\",\"precedence\":\"next\",\"crossOrigin\":\"$undefined\",\"nonce\":\"$undefined\"}]\n23:[\"$\",\"$L2c\",null,{\"children\":[\"$L2d\",[\"$\",\"$L2e\",null,{\"promise\":\"$@2f\"}]]}]\n24:[\"$\",\"$1\",\"h\",{\"children\":[null,[[\"$\",\"$L30\",null,{\"children\":\"$L31\"}],[\"$\",\"meta\",null,{\"name\":\"next-size-adjust\",\"content\":\"\"}]],[\"$\",\"$L32\",null,{\"children\":[\"$\",\"div\",null,{\"hidden\":true,\"children\":[\"$\",\"$33\",null,{\"fallback\":null,\"children\":\"$L34\"}]}]}]]}]\n26:[\"$\",\"link\",\"0\",{\"rel\":\"stylesheet\",\"href\":\"/_next/static/css/e4f71bf329054e3a.css\",\"precedence\":\"next\",\"crossOrigin\":\"$undefined\",\"nonce\":\""])self.__next_f.push([1,"$undefined\"}]\n27:[\"$\",\"link\",\"1\",{\"rel\":\"stylesheet\",\"href\":\"/_next/static/css/7498ae178b70f92b.css\",\"precedence\":\"next\",\"crossOrigin\":\"$undefined\",\"nonce\":\"$undefined\"}]\n28:[\"$\",\"link\",\"2\",{\"rel\":\"stylesheet\",\"href\":\"/_next/static/css/8f8f35646157a18a.css\",\"precedence\":\"next\",\"crossOrigin\":\"$undefined\",\"nonce\":\"$undefined\"}]\n8:\"$0:f:0:1:2:children:1:props:children:1:props:params\"\n"])self.__next_f.push([1,"35:T12b5,"])self.__next_f.push([1,"/* For our datagrail consent banner */\n/* https://docs.datagrail.io/docs/consent/banner/css-customization/ */\n\n:host(.dg-consent-banner) {\n /* Fonts - matching Groq Console system */\n --dg-primary-font: Inter, -apple-system, BlinkMacSystemFont, \"Segoe UI\",\n Roboto, \"Helvetica Neue\", Arial, sans-serif;\n --dg-secondary-font: Montserrat, -apple-system, BlinkMacSystemFont, \"Segoe UI\",\n Roboto, \"Helvetica Neue\", Arial, sans-serif;\n\n /* General banner styling */\n --dg-consent-background-color: rgb(255, 255, 255);\n --dg-consent-background-border: rgb(231, 229, 228);\n --consent-border-radius: 12px;\n\n /* Body text styling */\n 
--dg-body-font-size: 14px;\n --dg-body-font-weight: 400;\n --dg-body-font-color: rgb(118, 111, 107);\n --dg-body-line-height: 1.5;\n\n /* Heading styling */\n --dg-heading-font-size: 18px;\n --dg-heading-font-weight: 600;\n --dg-heading-font-color: rgb(12, 10, 9);\n --dg-heading-line-height: 1.4;\n\n /* Title styling */\n --dg-title-font-size: 16px;\n --dg-title-font-weight: 500;\n --dg-title-font-color: rgb(30, 30, 30);\n --dg-title-line-height: 1.4;\n\n /* Button styling */\n --dg-button-border: rgb(231, 229, 228) 1px solid;\n --dg-button-primary-background: rgb(30, 30, 30);\n --dg-button-primary-color: rgb(248, 248, 247);\n --dg-button-secondary-background: rgb(243, 243, 242);\n --dg-button-secondary-color: rgb(28, 25, 23);\n\n /* Category styling */\n --dg-policy-option-heading-size: 15px;\n --dg-policy-option-heading-weight: 500;\n --dg-policy-option-heading-color: rgb(30, 30, 30);\n --dg-policy-option-heading-enabled-color: rgb(245, 80, 54);\n --dg-policy-option-chevron-size: 16;\n\n /* Category description */\n --dg-policy-option-description-font-size: 13px;\n --dg-policy-option-description-font-weight: 400;\n --dg-policy-option-description-font-color: rgb(118, 111, 107);\n\n /* Essential categories */\n --dg-policy-option-essential-label-font-size: 12px;\n --dg-policy-option-essential-label-font-weight: 500;\n --dg-policy-option-essential-label-font-color: rgb(118, 111, 107);\n\n /* Slider styling - inspired by switch component */\n --dg-slider-primary: rgb(231, 229, 228);\n --dg-slider-secondary: rgb(255, 255, 255);\n\n --dg-slider-enabled-primary: rgb(30, 30, 30);\n --dg-slider-enabled-secondary: rgb(255, 255, 255);\n\n /* For Enabled Categories */\n --dg-policy-option-heading-enabled-color: rgb(245, 80, 54);\n}\n\n:host(.dg-consent-banner) strong {\n font-weight: 500;\n}\n\n/* Slider styling for checked enabled categories */\n:host(.dg-consent-banner)\n input[type=\"checkbox\"]:not(:disabled):checked\n + label\n .dg-slider {\n background: rgb(30, 30, 30) 
!important;\n}\n\n/* Advanced button styling to match Groq Console */\n:host(.dg-consent-banner) .dg-button {\n border-radius: 8px !important;\n padding: 8px 16px !important;\n font-weight: 500 !important;\n font-size: 14px !important;\n transition: all 0.2s ease !important;\n}\n\n:host(.dg-consent-banner) .dg-button.accept_all,\n:host(.dg-consent-banner) .dg-button.accept_some,\n:host(.dg-consent-banner) .dg-button.reject_all,\n:host(.dg-consent-banner) .dg-button.open_layer,\n:host(.dg-consent-banner) .dg-button.custom {\n background: rgb(243, 243, 242) !important;\n color: rgb(28, 25, 23) !important;\n border: 1px solid rgb(231, 229, 228) !important;\n}\n\n:host(.dg-consent-banner) .dg-button.accept_all:hover,\n:host(.dg-consent-banner) .dg-button.accept_some:hover,\n:host(.dg-consent-banner) .dg-button.reject_all:hover,\n:host(.dg-consent-banner) .dg-button.open_layer:hover,\n:host(.dg-consent-banner) .dg-button.custom:hover {\n background: rgb(231, 229, 228) !important;\n}\n\n/* Link styling to match Groq Console */\n:host(.dg-consent-banner) .dg-link {\n color: rgb(245, 80, 54) !important;\n text-decoration: none !important;\n font-weight: 500 !important;\n}\n\n:host(.dg-consent-banner) .dg-link:hover {\n text-decoration: underline !important;\n}\n\n:host(.dg-consent-banner) .dg-main-content-policy-option-description p {\n margin-top: 0 !important;\n margin-bottom: 16px;\n}\n\n/* Dark mode support */\n:host(.dg-consent-banner) .dark {\n --dg-consent-background-color: rgb(18, 20, 24);\n --dg-consent-background-border: rgba(153, 153, 153, 0.161);\n --dg-body-font-color: rgb(165, 160, 156);\n --dg-heading-font-color: rgb(248, 248, 247);\n --dg-title-font-color: rgb(248, 248, 247);\n --dg-policy-option-heading-color: rgb(248, 248, 247);\n --dg-button-secondary-background: rgb(38, 38, 38);\n --dg-button-secondary-color: rgb(248, 248, 247);\n --dg-slider-primary: rgba(153, 153, 153, 0.35);\n --dg-slider-background: rgb(107, 114, 128);\n}\n\n/* Overall banner 
container styling */\n:host(.dg-consent-banner) .dg-app {\n box-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -2px\n rgba(0, 0, 0, 0.1) !important;\n border: 1px solid rgb(231, 229, 228) !important;\n}\n"])self.__next_f.push([1,"2:[\"$\",\"html\",null,{\"lang\":\"en\",\"className\":\"__variable_f367f3 __variable_dd5b2f\",\"suppressHydrationWarning\":true,\"children\":[[\"$\",\"head\",null,{\"children\":[[\"$\",\"style\",null,{\"id\":\"dg-consent-custom-style\",\"dangerouslySetInnerHTML\":{\"__html\":\"$35\"}}],\"$L36\"]}],\"$L37\",\"$L38\",\"$L39\"]}]\n"])self.__next_f.push([1,"3b:I[8810,[\"9817\",\"static/chunks/a80d2a2d-995f3588264c9768.js\",\"4023\",\"static/chunks/fba002aa-6277ce0237ac1df9.js\",\"1116\",\"static/chunks/62399e41-d58e7f498626419e.js\",\"6559\",\"static/chunks/85434cc8-f36400b03831af1a.js\",\"5375\",\"static/chunks/07baea06-cb741fd4f7cf345b.js\",\"4620\",\"static/chunks/4620-e1aaeee1f717953d.js\",\"1583\",\"static/chunks/1583-cd6cfde0c82061af.js\",\"8172\",\"static/chunks/8172-1bfde9a6a7f76d76.js\",\"2845\",\"static/chunks/2845-233d0b7733254462.js\",\"6808\",\"static/chunks/6808-c0c7b12c1b07a90a.js\",\"1489\",\"static/chunks/1489-14b8a5cb2f9ff04a.js\",\"5183\",\"static/chunks/5183-d52e57595b0209a4.js\",\"3436\",\"static/chunks/3436-f8e1abf62b4293d2.js\",\"8932\",\"static/chunks/8932-458d55afb25ec506.js\",\"7177\",\"static/chunks/app/layout-d4a7aa9fe73a0e0a.js\"],\"GoogleAnalytics\"]\n"])self.__next_f.push([1,"3c:I[2388,[\"9817\",\"static/chunks/a80d2a2d-995f3588264c9768.js\",\"4023\",\"static/chunks/fba002aa-6277ce0237ac1df9.js\",\"1116\",\"static/chunks/62399e41-d58e7f498626419e.js\",\"6559\",\"static/chunks/85434cc8-f36400b03831af1a.js\",\"5375\",\"static/chunks/07baea06-cb741fd4f7cf345b.js\",\"4620\",\"static/chunks/4620-e1aaeee1f717953d.js\",\"1583\",\"static/chunks/1583-cd6cfde0c82061af.js\",\"8172\",\"static/chunks/8172-1bfde9a6a7f76d76.js\",\"2845\",\"static/chunks/2845-233d0b7733254462.js\",\"6808\",\"static/chunks/6808-c0c7b12c1b07a
90a.js\",\"1489\",\"static/chunks/1489-14b8a5cb2f9ff04a.js\",\"5183\",\"static/chunks/5183-d52e57595b0209a4.js\",\"3436\",\"static/chunks/3436-f8e1abf62b4293d2.js\",\"8932\",\"static/chunks/8932-458d55afb25ec506.js\",\"7177\",\"static/chunks/app/layout-d4a7aa9fe73a0e0a.js\"],\"GoogleTagManager\"]\n"])self.__next_f.push([1,"3d:I[71079,[\"9817\",\"static/chunks/a80d2a2d-995f3588264c9768.js\",\"4023\",\"static/chunks/fba002aa-6277ce0237ac1df9.js\",\"1116\",\"static/chunks/62399e41-d58e7f498626419e.js\",\"6559\",\"static/chunks/85434cc8-f36400b03831af1a.js\",\"5375\",\"static/chunks/07baea06-cb741fd4f7cf345b.js\",\"4620\",\"static/chunks/4620-e1aaeee1f717953d.js\",\"1583\",\"static/chunks/1583-cd6cfde0c82061af.js\",\"8172\",\"static/chunks/8172-1bfde9a6a7f76d76.js\",\"2845\",\"static/chunks/2845-233d0b7733254462.js\",\"6808\",\"static/chunks/6808-c0c7b12c1b07a90a.js\",\"1489\",\"static/chunks/1489-14b8a5cb2f9ff04a.js\",\"5183\",\"static/chunks/5183-d52e57595b0209a4.js\",\"3436\",\"static/chunks/3436-f8e1abf62b4293d2.js\",\"8932\",\"static/chunks/8932-458d55afb25ec506.js\",\"7177\",\"static/chunks/app/layout-d4a7aa9fe73a0e0a.js\"],\"SpeedInsights\"]\n"])self.__next_f.push([1,"36:[\"$\",\"link\",null,{\"rel\":\"icon\",\"href\":\"/favicon.ico?cache=rounded-bolt\",\"sizes\":\"32x32\"}]\n37:[\"$\",\"script\",null,{\"async\":true,\"src\":\"/g.js\"}]\n38:[\"$\",\"script\",null,{\"async\":true,\"src\":\"https://js.stripe.com/v3/\"}]\n39:[\"$\",\"body\",null,{\"className\":\"font-inter\",\"children\":[\"$L3a\",[\"$\",\"$L3b\",null,{\"gaId\":\"G-CQ9K0VPEEQ\"}],[\"$\",\"$L3c\",null,{\"gtmId\":\"GTM-WWK828JN\"}],[\"$\",\"$L3d\",null,{\"sampleRate\":0.01}]]}]\n"])self.__next_f.push([1,"31:[[\"$\",\"meta\",\"0\",{\"charSet\":\"utf-8\"}],[\"$\",\"meta\",\"1\",{\"name\":\"viewport\",\"content\":\"width=device-width, initial-scale=1, 
maximum-scale=1\"}]]\n2d:null\n"])self.__next_f.push([1,"3e:I[75607,[],\"IconMark\"]\n"])self.__next_f.push([1,"2f:{\"metadata\":[[\"$\",\"title\",\"0\",{\"children\":\"Rate Limits - GroqDocs\"}],[\"$\",\"meta\",\"1\",{\"name\":\"description\",\"content\":\"Understand Groq API rate limits, headers, and best practices for managing request and token quotas in your applications.\"}],[\"$\",\"meta\",\"2\",{\"property\":\"og:title\",\"content\":\"Rate Limits - GroqDocs\"}],[\"$\",\"meta\",\"3\",{\"property\":\"og:description\",\"content\":\"Understand Groq API rate limits, headers, and best practices for managing request and token quotas in your applications.\"}],[\"$\",\"meta\",\"4\",{\"property\":\"og:url\",\"content\":\"https://console.groq.com/docs\"}],[\"$\",\"meta\",\"5\",{\"property\":\"og:site_name\",\"content\":\"GroqDocs\"}],[\"$\",\"meta\",\"6\",{\"property\":\"og:image\",\"content\":\"https://console.groq.com/og_cloudv5.jpg\"}],[\"$\",\"meta\",\"7\",{\"property\":\"og:type\",\"content\":\"website\"}],[\"$\",\"meta\",\"8\",{\"name\":\"twitter:card\",\"content\":\"summary_large_image\"}],[\"$\",\"meta\",\"9\",{\"name\":\"twitter:title\",\"content\":\"Rate Limits - GroqDocs\"}],[\"$\",\"meta\",\"10\",{\"name\":\"twitter:description\",\"content\":\"Understand Groq API rate limits, headers, and best practices for managing request and token quotas in your 
applications.\"}],[\"$\",\"meta\",\"11\",{\"name\":\"twitter:image\",\"content\":\"https://console.groq.com/og_cloudv5.jpg\"}],[\"$\",\"link\",\"12\",{\"rel\":\"icon\",\"href\":\"/favicon.ico?cache=rounded-bolt\",\"sizes\":\"32x32\"}],[\"$\",\"$L3e\",\"13\",{}]],\"error\":null,\"digest\":\"$undefined\"}\n"])self.__next_f.push([1,"34:\"$2f:metadata\"\n"])self.__next_f.push([1,"3f:I[2370,[\"9817\",\"static/chunks/a80d2a2d-995f3588264c9768.js\",\"4023\",\"static/chunks/fba002aa-6277ce0237ac1df9.js\",\"4620\",\"static/chunks/4620-e1aaeee1f717953d.js\",\"1583\",\"static/chunks/1583-cd6cfde0c82061af.js\",\"8172\",\"static/chunks/8172-1bfde9a6a7f76d76.js\",\"2845\",\"static/chunks/2845-233d0b7733254462.js\",\"6808\",\"static/chunks/6808-c0c7b12c1b07a90a.js\",\"6767\",\"static/chunks/6767-da92d72583ce2ee5.js\",\"1489\",\"static/chunks/1489-14b8a5cb2f9ff04a.js\",\"5183\",\"static/chunks/5183-d52e57595b0209a4.js\",\"9791\",\"static/chunks/9791-73094ddf02612085.js\",\"2376\",\"static/chunks/2376-3cb5f78aefc53c11.js\",\"4897\",\"static/chunks/4897-15b235f72e0c3911.js\",\"8499\",\"static/chunks/8499-52c18414adaa5f9b.js\",\"8932\",\"static/chunks/8932-458d55afb25ec506.js\",\"6672\",\"static/chunks/6672-98691e627e9f180d.js\",\"4718\",\"static/chunks/4718-28bb22cdd58e2a2d.js\",\"9382\",\"static/chunks/app/(console)/docs/layout-32c41ba10ba3ae2a.js\"],\"default\"]\n"])self.__next_f.push([1,"40:Tfb0,"])self.__next_f.push([1,"{\n \"id\": \"chatcmpl-f51b2cd2-bef7-417e-964e-a08f0b513c22\",\n \"object\": \"chat.completion\",\n \"created\": 1730241104,\n \"model\": \"openai/gpt-oss-20b\",\n \"choices\": [\n {\n \"index\": 0,\n \"message\": {\n \"role\": \"assistant\",\n \"content\": \"Fast language models have gained significant attention in recent years due to their ability to process and generate human-like text quickly and efficiently. The importance of fast language models can be understood from their potential applications and benefits:\\n\\n1. 
**Real-time Chatbots and Conversational Interfaces**: Fast language models enable the development of chatbots and conversational interfaces that can respond promptly to user queries, making them more engaging and useful.\\n2. **Sentiment Analysis and Opinion Mining**: Fast language models can quickly analyze text data to identify sentiments, opinions, and emotions, allowing for improved customer service, market research, and opinion mining.\\n3. **Language Translation and Localization**: Fast language models can quickly translate text between languages, facilitating global communication and enabling businesses to reach a broader audience.\\n4. **Text Summarization and Generation**: Fast language models can summarize long documents or even generate new text on a given topic, improving information retrieval and processing efficiency.\\n5. **Named Entity Recognition and Information Extraction**: Fast language models can rapidly recognize and extract specific entities, such as names, locations, and organizations, from unstructured text data.\\n6. **Recommendation Systems**: Fast language models can analyze large amounts of text data to personalize product recommendations, improve customer experience, and increase sales.\\n7. **Content Generation for Social Media**: Fast language models can quickly generate engaging content for social media platforms, helping businesses maintain a consistent online presence and increasing their online visibility.\\n8. **Sentiment Analysis for Stock Market Analysis**: Fast language models can quickly analyze social media posts, news articles, and other text data to identify sentiment trends, enabling financial analysts to make more informed investment decisions.\\n9. **Language Learning and Education**: Fast language models can provide instant feedback and adaptive language learning, making language education more effective and engaging.\\n10. 
**Domain-Specific Knowledge Extraction**: Fast language models can quickly extract relevant information from vast amounts of text data, enabling domain experts to focus on high-level decision-making rather than manual information gathering.\\n\\nThe benefits of fast language models include:\\n\\n* **Increased Efficiency**: Fast language models can process large amounts of text data quickly, reducing the time and effort required for tasks such as sentiment analysis, entity recognition, and text summarization.\\n* **Improved Accuracy**: Fast language models can analyze and learn from large datasets, leading to more accurate results and more informed decision-making.\\n* **Enhanced User Experience**: Fast language models can enable real-time interactions, personalized recommendations, and timely responses, improving the overall user experience.\\n* **Cost Savings**: Fast language models can automate many tasks, reducing the need for manual labor and minimizing costs associated with data processing and analysis.\\n\\nIn summary, fast language models have the potential to transform various industries and applications by providing fast, accurate, and efficient language processing capabilities.\"\n },\n \"logprobs\": null,\n \"finish_reason\": \"stop\"\n }\n ],\n \"usage\": {\n \"queue_time\": 0.037493756,\n \"prompt_tokens\": 18,\n \"prompt_time\": 0.000680594,\n \"completion_tokens\": 556,\n \"completion_time\": 0.463333333,\n \"total_tokens\": 574,\n \"total_time\": 0.464013927\n },\n \"system_fingerprint\": \"fp_179b0f92c9\",\n \"x_groq\": { \"id\": \"req_01jbd6g2qdfw2adyrt2az8hz4w\" }\n}\n"])self.__next_f.push([1,"41:T699,"])self.__next_f.push([1,"{\n \"object\": \"list\",\n \"data\": [\n {\n \"id\": \"gemma2-9b-it\",\n \"object\": \"model\",\n \"created\": 1693721698,\n \"owned_by\": \"Google\",\n \"active\": true,\n \"context_window\": 8192,\n \"public_apps\": null\n },\n {\n \"id\": \"llama3-8b-8192\",\n \"object\": \"model\",\n \"created\": 1693721698,\n 
\"owned_by\": \"Meta\",\n \"active\": true,\n \"context_window\": 8192,\n \"public_apps\": null\n },\n {\n \"id\": \"llama3-70b-8192\",\n \"object\": \"model\",\n \"created\": 1693721698,\n \"owned_by\": \"Meta\",\n \"active\": true,\n \"context_window\": 8192,\n \"public_apps\": null\n },\n {\n \"id\": \"whisper-large-v3-turbo\",\n \"object\": \"model\",\n \"created\": 1728413088,\n \"owned_by\": \"OpenAI\",\n \"active\": true,\n \"context_window\": 448,\n \"public_apps\": null\n },\n {\n \"id\": \"whisper-large-v3\",\n \"object\": \"model\",\n \"created\": 1693721698,\n \"owned_by\": \"OpenAI\",\n \"active\": true,\n \"context_window\": 448,\n \"public_apps\": null\n },\n {\n \"id\": \"llama-guard-3-8b\",\n \"object\": \"model\",\n \"created\": 1693721698,\n \"owned_by\": \"Meta\",\n \"active\": true,\n \"context_window\": 8192,\n \"public_apps\": null\n },\n {\n \"id\": \"distil-whisper-large-v3-en\",\n \"object\": \"model\",\n \"created\": 1693721698,\n \"owned_by\": \"Hugging Face\",\n \"active\": true,\n \"context_window\": 448,\n \"public_apps\": null\n },\n {\n \"id\": \"llama-3.1-8b-instant\",\n \"object\": \"model\",\n \"created\": 1693721698,\n \"owned_by\": \"Meta\",\n \"active\": true,\n \"context_window\": 131072,\n \"public_apps\": null\n }\n ]\n}\n"])self.__next_f.push([1,"42:T5c5,"])self.__next_f.push([1,"{\n \"id\": \"resp_01k1x6w9ane6d8rfxm05cb45yk\",\n \"object\": \"response\",\n \"status\": \"completed\",\n \"created_at\": 1754400695,\n \"output\": [\n {\n \"type\": \"message\",\n \"id\": \"msg_01k1x6w9ane6eb0650crhawwyy\",\n \"status\": \"completed\",\n \"role\": \"assistant\",\n \"content\": [\n {\n \"type\": \"output_text\",\n \"text\": \"When the stars blinked awake, Luna the unicorn curled her mane and whispered wishes to the sleeping pine trees. She galloped through a field of moonlit daisies, gathering dew like tiny silver pearls. 
With a gentle sigh, she tucked her hooves beneath a silver cloud so the world slept softly, dreaming of her gentle hooves until the morning.\",\n \"annotations\": []\n }\n ]\n }\n ],\n \"previous_response_id\": null,\n \"model\": \"llama-3.3-70b-versatile\",\n \"reasoning\": {\n \"effort\": null,\n \"summary\": null\n },\n \"max_output_tokens\": null,\n \"instructions\": null,\n \"text\": {\n \"format\": {\n \"type\": \"text\"\n }\n },\n \"tools\": [],\n \"tool_choice\": \"auto\",\n \"truncation\": \"disabled\",\n \"metadata\": {},\n \"temperature\": 1,\n \"top_p\": 1,\n \"user\": null,\n \"service_tier\": \"default\",\n \"error\": null,\n \"incomplete_details\": null,\n \"usage\": {\n \"input_tokens\": 82,\n \"input_tokens_details\": {\n \"cached_tokens\": 0\n },\n \"output_tokens\": 266,\n \"output_tokens_details\": {\n \"reasoning_tokens\": 0\n },\n \"total_tokens\": 348\n },\n \"parallel_tool_calls\": true,\n \"store\": false\n}\n"])self.__next_f.push([1,"9:[\"$\",\"$L3f\",null,{\"openapiSpec\":{\"components\":{\"schemas\":{\"Annotation\":{\"description\":\"An annotation that provides citations or references for content in a message.\",\"properties\":{\"document_citation\":{\"$ref\":\"#/components/schemas/DocumentCitation\"},\"function_citation\":{\"$ref\":\"#/components/schemas/FunctionCitation\"},\"type\":{\"description\":\"The type of annotation.\",\"enum\":[\"document_citation\",\"function_citation\"],\"type\":\"string\"}},\"required\":[\"type\"],\"type\":\"object\"},\"Batch\":{\"properties\":{\"cancelled_at\":{\"description\":\"The Unix timestamp (in seconds) for when the batch was cancelled.\",\"type\":\"integer\"},\"cancelling_at\":{\"description\":\"The Unix timestamp (in seconds) for when the batch started cancelling.\",\"type\":\"integer\"},\"completed_at\":{\"description\":\"The Unix timestamp (in seconds) for when the batch was completed.\",\"type\":\"integer\"},\"completion_window\":{\"description\":\"The time frame within which the batch should be 
processed.\",\"type\":\"string\"},\"created_at\":{\"description\":\"The Unix timestamp (in seconds) for when the batch was created.\",\"type\":\"integer\"},\"endpoint\":{\"description\":\"The API endpoint used by the batch.\",\"type\":\"string\"},\"error_file_id\":{\"description\":\"The ID of the file containing the outputs of requests with errors.\",\"type\":\"string\"},\"errors\":{\"properties\":{\"data\":{\"items\":{\"properties\":{\"code\":{\"description\":\"An error code identifying the error type.\",\"type\":\"string\"},\"line\":{\"description\":\"The line number of the input file where the error occurred, if applicable.\",\"nullable\":true,\"type\":\"integer\"},\"message\":{\"description\":\"A human-readable message providing more details about the error.\",\"type\":\"string\"},\"param\":{\"description\":\"The name of the parameter that caused the error, if applicable.\",\"nullable\":true,\"type\":\"string\"}},\"type\":\"object\"},\"type\":\"array\"},\"object\":{\"description\":\"The object type, which is always \list`.","type":"string"}},"type":"object"},"expired_at":{"description":"The Unix timestamp (in seconds) for when the batch expired.","type":"integer"},"expires_at":{"description":"The Unix timestamp (in seconds) for when the batch will expire.","type":"integer"},"failed_at":{"description":"The Unix timestamp (in seconds) for when the batch failed.","type":"integer"},"finalizing_at":{"description":"The Unix timestamp (in seconds) for when the batch started finalizing.","type":"integer"},"id":{"type":"string"},"in_progress_at":{"description":"The Unix timestamp (in seconds) for when the batch started processing.","type":"integer"},"input_file_id":{"description":"The ID of the input file for the batch.","type":"string"},"metadata":{"description":"Set of key-value pairs that can be attached to an object. 
This can be useful for storing additional information about the object in a structured format.\n","nullable":true,"type":"object"},"object":{"description":"The object type, which is always `batch`.","enum":["batch"],"type":"string","x-stainless-const":true},"output_file_id":{"description":"The ID of the file containing the outputs of successfully executed requests.","type":"string"},"request_counts":{"description":"The request counts for different statuses within the batch.","properties":{"completed":{"description":"Number of requests that have been completed successfully.","type":"integer"},"failed":{"description":"Number of requests that have failed.","type":"integer"},"total":{"description":"Total number of requests in the batch.","type":"integer"}},"required":["total","completed","failed"],"type":"object"},"status":{"description":"The current status of the batch.","enum":["validating","failed","in_progress","finalizing","completed","expired","cancelling","cancelled"],"type":"string"}},"required":["id","object","endpoint","input_file_id","completion_window","status","created_at"],"type":"object"},"BatchRequestInput":{"description":"The per-line object of the batch input file","properties":{"custom_id":{"description":"A developer-provided per-request id that will be used to match outputs to inputs. Must be unique for each request in a batch.","type":"string"},"method":{"description":"The HTTP method to be used for the request. Currently only `POST` is supported.","enum":["POST"],"type":"string","x-stainless-const":true},"url":{"description":"The OpenAI API relative URL to be used for the request. 
Currently `/v1/chat/completions` is supported.","type":"string"}},"type":"object"},"BatchRequestOutput":{"description":"The per-line object of the batch output and error files","properties":{"custom_id":{"description":"A developer-provided per-request id that will be used to match outputs to inputs.","type":"string"},"error":{"description":"For requests that failed with a non-HTTP error, this will contain more information on the cause of the failure.","nullable":true,"properties":{"code":{"description":"A machine-readable error code.","type":"string"},"message":{"description":"A human-readable error message.","type":"string"}},"type":"object"},"id":{"type":"string"},"response":{"nullable":true,"properties":{"body":{"description":"The JSON body of the response","type":"object"},"request_id":{"description":"An unique identifier for the OpenAI API request. Please include this request ID when contacting support.","type":"string"},"status_code":{"description":"The HTTP status code of the response","type":"integer"}},"type":"object"}},"type":"object"},"BrowserResult":{"additionalProperties":false,"properties":{"content":{"description":"The content of the browser result","type":"string"},"live_view_url":{"description":"The live view URL for the browser window","type":"string"},"title":{"description":"The title of the browser window","type":"string"},"url":{"description":"The URL of the browser window","type":"string"}},"required":["url","title"],"type":"object"},"Chart":{"properties":{"elements":{"description":"The chart elements (data series, points, etc.)","items":{"$ref":"#/components/schemas/ChartElement"},"type":"array"},"title":{"description":"The title of the chart","type":"string"},"type":{"description":"The type of chart","enum":["bar","box_and_whisker","line","pie","scatter","superchart","unknown"],"type":"string"},"x_label":{"description":"The label for the x-axis","type":"string"},"x_scale":{"description":"The scale type for the 
x-axis","type":"string"},"x_tick_labels":{"description":"The labels for the x-axis ticks","items":{"type":"string"},"type":"array"},"x_ticks":{"description":"The tick values for the x-axis","items":{"type":"number"},"type":"array"},"x_unit":{"description":"The unit for the x-axis","type":"string"},"y_label":{"description":"The label for the y-axis","type":"string"},"y_scale":{"description":"The scale type for the y-axis","type":"string"},"y_tick_labels":{"description":"The labels for the y-axis ticks","items":{"type":"string"},"type":"array"},"y_ticks":{"description":"The tick values for the y-axis","items":{"type":"number"},"type":"array"},"y_unit":{"description":"The unit for the y-axis","type":"string"}},"required":["type","elements"],"type":"object"},"ChartElement":{"properties":{"angle":{"description":"The angle for this element","type":"number"},"first_quartile":{"description":"The first quartile value for this element","type":"number"},"group":{"description":"The group this element belongs to","type":"string"},"label":{"description":"The label for this chart element","type":"string"},"max":{"type":"number"},"median":{"description":"The median value for this element","type":"number"},"min":{"description":"The minimum value for this element","type":"number"},"outliers":{"description":"The outliers for this element","items":{"type":"number"},"type":"array"},"points":{"description":"The points for this element","items":{"items":{"type":"number"},"type":"array"},"type":"array"},"radius":{"description":"The radius for this element","type":"number"},"third_quartile":{"description":"The third quartile value for this element","type":"number"},"value":{"description":"The value for this element","type":"number"}},"required":["label"],"type":"object"},"ChatCompletionDocument":{"additionalProperties":false,"description":"A document that can be referenced by the model while generating responses.","properties":{"id":{"description":"Optional unique identifier that can be 
used for citations in responses.","nullable":true,"type":"string"},"source":{"$ref":"#/components/schemas/ChatCompletionDocumentSource"}},"required":["source"],"title":"Document","type":"object"},"ChatCompletionDocumentSource":{"description":"The source of the document. Only text and JSON sources are currently supported.","discriminator":{"mapping":{"json":"#/components/schemas/ChatCompletionDocumentSourceJSON","text":"#/components/schemas/ChatCompletionDocumentSourceText"},"propertyName":"type"},"oneOf":[{"$ref":"#/components/schemas/ChatCompletionDocumentSourceText"},{"$ref":"#/components/schemas/ChatCompletionDocumentSourceJSON"}],"title":"Document source"},"ChatCompletionDocumentSourceJSON":{"additionalProperties":false,"description":"A document whose contents are provided inline as JSON data.","properties":{"data":{"additionalProperties":true,"description":"The JSON payload associated with the document.","type":"object"},"type":{"description":"Identifies this document source as JSON data.","enum":["json"],"type":"string"}},"required":["type","data"],"title":"JSON document source","type":"object"},"ChatCompletionDocumentSourceText":{"additionalProperties":false,"description":"A document whose contents are provided inline as text.","properties":{"text":{"description":"The document contents.","type":"string"},"type":{"description":"Identifies this document source as inline text.","enum":["text"],"type":"string"}},"required":["type","text"],"title":"Text document source","type":"object"},"ChatCompletionFunctionCallOption":{"description":"Specifying a particular function via `{\"name\": \"my_function\"}` forces the model to call that function.\n","properties":{"name":{"description":"The name of the function to call.","type":"string"}},"required":["name"],"type":"object"},"ChatCompletionFunctions":{"deprecated":true,"properties":{"description":{"description":"A description of what the function does, used by the model to choose when and how to call the 
function.","type":"string"},"name":{"description":"The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.","type":"string"},"parameters":{"$ref":"#/components/schemas/FunctionParameters"}},"required":["name"],"type":"object"},"ChatCompletionMessageExecutedTools":{"description":"A list of tools that were executed during the chat completion for compound AI systems.","items":{"properties":{"arguments":{"description":"The arguments passed to the tool in JSON format.","type":"string"},"browser_results":{"description":"Array of browser results","items":{"$ref":"#/components/schemas/BrowserResult"},"type":"array"},"code_results":{"description":"Array of code execution results","items":{"$ref":"#/components/schemas/CodeExecutionResult"},"type":"array"},"index":{"description":"The index of the executed tool.","type":"integer"},"output":{"description":"The output returned by the tool.","nullable":true,"type":"string"},"search_results":{"description":"The search results returned by the tool, if applicable.","nullable":true,"properties":{"images":{"description":"List of image URLs returned by the search","items":{"type":"string"},"type":"array"},"results":{"description":"List of search results","items":{"properties":{"content":{"description":"The content of the search result","type":"string"},"score":{"description":"The relevance score of the search result","format":"float","type":"number"},"title":{"description":"The title of the search result","type":"string"},"url":{"description":"The URL of the search result","type":"string"}},"type":"object"},"type":"array"}},"type":"object"},"type":{"description":"The type of tool that was executed.","type":"string"}},"required":["index","type","arguments"],"type":"object"},"type":"array"},"ChatCompletionMessageToolCall":{"properties":{"function":{"description":"The function that the model called.","properties":{"arguments":{"description":"The arguments to call the 
function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.","type":"string"},"name":{"description":"The name of the function to call.","type":"string"}},"required":["name","arguments"],"type":"object"},"id":{"description":"The ID of the tool call.","type":"string"},"type":{"description":"The type of the tool. Currently, only `function` is supported.","enum":["function"],"type":"string"}},"required":["id","type","function"],"type":"object"},"ChatCompletionMessageToolCallChunk":{"properties":{"function":{"properties":{"arguments":{"description":"The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.","type":"string"},"name":{"description":"The name of the function to call.","type":"string"}},"type":"object"},"id":{"description":"The ID of the tool call.","type":"string"},"index":{"type":"integer"},"type":{"description":"The type of the tool. Currently, only `function` is supported.","enum":["function"],"type":"string"}},"required":["index"],"type":"object"},"ChatCompletionMessageToolCalls":{"description":"The tool calls generated by the model, such as function calls.","items":{"$ref":"#/components/schemas/ChatCompletionMessageToolCall"},"type":"array"},"ChatCompletionNamedToolChoice":{"description":"Specifies a tool the model should use. Use to force the model to call a specific function.","properties":{"function":{"properties":{"name":{"description":"The name of the function to call.","type":"string"}},"required":["name"],"type":"object"},"type":{"description":"The type of the tool. 
Currently, only `function` is supported.","enum":["function"],"type":"string"}},"required":["type","function"],"type":"object"},"ChatCompletionRequestAssistantMessage":{"additionalProperties":false,"properties":{"content":{"description":"The contents of the assistant message. Required unless `tool_calls` or `function_call` is specified.\n","nullable":true,"oneOf":[{"description":"The text contents of the message.","title":"Text content","type":"string"},{"description":"An array of content parts with a defined type, only `text` is supported for this message type.","items":{"$ref":"#/components/schemas/ChatCompletionRequestMessageContentPartText"},"title":"Array of content parts","type":"array"}],"title":"Assistant message content"},"function_call":{"deprecated":true,"description":"Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.","properties":{"arguments":{"description":"The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.","type":"string"},"name":{"description":"The name of the function to call.","type":"string"}},"type":"object"},"name":{"description":"An optional name for the participant. Provides the model information to differentiate between participants of the same role.","type":"string"},"reasoning":{"description":"The reasoning output by the assistant if reasoning_format was set to ‘parsed’.\nThis field is supported on models that support reasoning.\n","nullable":true,"type":"string"},"role":{"description":"The role of the messages author, in this case `assistant`.","enum":["assistant"],"type":"string"},"tool_call_id":{"description":"DO NOT USE. 
This field is present because OpenAI allows it and users send it.","nullable":true,"required":["arguments","name"],"type":"string","x-groq-meta":{"hidden":true}},"tool_calls":{"$ref":"#/components/schemas/ChatCompletionMessageToolCalls"}},"required":["role"],"title":"Assistant message","type":"object"},"ChatCompletionRequestFunctionMessage":{"additionalProperties":false,"deprecated":true,"properties":{"content":{"description":"The contents of the function message.","nullable":true,"title":"Function message content","type":"string"},"name":{"description":"The name of the function to call.","type":"string"},"role":{"description":"The role of the messages author, in this case `function`.","enum":["function"],"type":"string"},"tool_call_id":{"description":"DO NOT USE. This field is present because OpenAI allows it and users send it.","nullable":true,"type":"string","x-groq-meta":{"hidden":true}}},"required":["role","content","name"],"title":"Function message","type":"object"},"ChatCompletionRequestMessage":{"discriminator":{"mapping":{"assistant":"#/components/schemas/ChatCompletionRequestAssistantMessage","developer":"#/components/schemas/ChatCompletionRequestSystemMessage","function":"#/components/schemas/ChatCompletionRequestFunctionMessage","system":"#/components/schemas/ChatCompletionRequestSystemMessage","tool":"#/components/schemas/ChatCompletionRequestToolMessage","user":"#/components/schemas/ChatCompletionRequestUserMessage"},"propertyName":"role"},"oneOf":[{"$ref":"#/components/schemas/ChatCompletionRequestSystemMessage"},{"$ref":"#/components/schemas/ChatCompletionRequestUserMessage"},{"$ref":"#/components/schemas/ChatCompletionRequestAssistantMessage"},{"$ref":"#/components/schemas/ChatCompletionRequestToolMessage"},{"$ref":"#/components/schemas/ChatCompletionRequestFunctionMessage"}]},"ChatCompletionRequestMessageContentPart":{"oneOf":[{"$ref":"#/components/schemas/ChatCompletionRequestMessageContentPartText"},{"$ref":"#/components/schemas/ChatCompletionReq
uestMessageContentPartImage"},{"$ref":"#/components/schemas/ChatCompletionRequestMessageContentPartDocument"}]},"ChatCompletionRequestMessageContentPartDocument":{"properties":{"document":{"properties":{"data":{"additionalProperties":true,"description":"The JSON document data.","type":"object"},"id":{"description":"Optional unique identifier for the document.","nullable":true,"type":"string"}},"required":["data"],"type":"object"},"type":{"description":"The type of the content part.","enum":["document"],"type":"string"}},"required":["type","document"],"title":"Document content part","type":"object"},"ChatCompletionRequestMessageContentPartImage":{"properties":{"image_url":{"properties":{"detail":{"default":"auto","description":"Specifies the detail level of the image.","enum":["auto","low","high"],"type":"string"},"url":{"description":"Either a URL of the image or the base64 encoded image data.","format":"uri","type":"string"}},"required":["url"],"type":"object"},"type":{"description":"The type of the content part.","enum":["image_url"],"type":"string"}},"required":["type","image_url"],"title":"Image content part","type":"object"},"ChatCompletionRequestMessageContentPartText":{"properties":{"text":{"description":"The text content.","type":"string"},"type":{"description":"The type of the content part.","enum":["text"],"type":"string"}},"required":["type","text"],"title":"Text content part","type":"object"},"ChatCompletionRequestSystemMessage":{"additionalProperties":false,"properties":{"content":{"description":"The contents of the system message.","oneOf":[{"description":"The text contents of the message.","title":"Text content","type":"string"},{"description":"An array of content parts with a defined type, only `text` is supported for this message type.","items":{"$ref":"#/components/schemas/ChatCompletionRequestMessageContentPartText"},"minItems":1,"title":"Array of content parts","type":"array"}],"title":"System message content"},"name":{"description":"An optional 
name for the participant. Provides the model information to differentiate between participants of the same role.","type":"string"},"role":{"description":"The role of the messages author, in this case `system`.","enum":["system","developer"],"type":"string"},"tool_call_id":{"description":"DO NOT USE. This field is present because OpenAI allows it and users send it.","nullable":true,"type":"string","x-groq-meta":{"hidden":true}}},"required":["content","role"],"title":"System message","type":"object"},"ChatCompletionRequestToolMessage":{"additionalProperties":false,"properties":{"content":{"description":"The contents of the tool message.","oneOf":[{"description":"The text contents of the message.","title":"Text content","type":"string"},{"description":"An array of content parts with a defined type, each can be of type `text` or `image_url` when passing in images. You can pass multiple images by adding multiple `image_url` content parts. Image input is only supported when using the `gpt-4-visual-preview` model.","items":{"$ref":"#/components/schemas/ChatCompletionRequestMessageContentPart"},"minItems":1,"title":"Array of content parts","type":"array"}],"title":"Tool message content"},"name":{"description":"DO NOT USE. 
This field is present because OpenAI allows it and users send it.","type":"string","x-groq-meta":{"hidden":true}},"role":{"description":"The role of the messages author, in this case `tool`.","enum":["tool"],"type":"string"},"tool_call_id":{"description":"Tool call that this message is responding to.","type":"string"}},"required":["role","content","tool_call_id"],"title":"Tool message","type":"object"},"ChatCompletionRequestUserMessage":{"additionalProperties":false,"properties":{"content":{"description":"The contents of the user message.\n","oneOf":[{"description":"The text contents of the message.","title":"Text content","type":"string"},{"description":"An array of content parts with a defined type, each can be of type `text` or `image_url` when passing in images. You can pass multiple images by adding multiple `image_url` content parts. Image input is only supported when using the `gpt-4-visual-preview` model.","items":{"$ref":"#/components/schemas/ChatCompletionRequestMessageContentPart"},"minItems":1,"title":"Array of content parts","type":"array"}],"title":"User message content"},"name":{"description":"An optional name for the participant. Provides the model information to differentiate between participants of the same role.","type":"string"},"role":{"description":"The role of the messages author, in this case `user`.","enum":["user"],"type":"string"},"tool_call_id":{"description":"DO NOT USE. 
This field is present because OpenAI allows it and users send it.","nullable":true,"type":"string","x-groq-meta":{"hidden":true}}},"required":["content","role"],"title":"User message","type":"object"},"ChatCompletionResponseMessage":{"description":"A chat completion message generated by the model.","properties":{"annotations":{"description":"A list of annotations providing citations and references for the content in the message.","items":{"$ref":"#/components/schemas/Annotation"},"type":"array"},"content":{"description":"The contents of the message.","nullable":true,"type":"string"},"executed_tools":{"$ref":"#/components/schemas/ChatCompletionMessageExecutedTools"},"function_call":{"deprecated":true,"description":"Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.","properties":{"arguments":{"description":"The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.","type":"string"},"name":{"description":"The name of the function to call.","type":"string"}},"required":["name","arguments"],"type":"object"},"reasoning":{"description":"The model’s reasoning for a response. Only available for models that support reasoning when request parameter reasoning_format has value `parsed`.","nullable":true,"type":"string"},"role":{"description":"The role of the author of this message.","enum":["assistant"],"type":"string"},"tool_calls":{"$ref":"#/components/schemas/ChatCompletionMessageToolCalls"}},"required":["role","content"],"type":"object"},"ChatCompletionRole":{"description":"The role of the author of a message","enum":["system","user","assistant","tool","function"],"type":"string"},"ChatCompletionStreamOptions":{"description":"Options for streaming response. 
Only set this when you set `stream: true`.\n","nullable":true,"properties":{"include_usage":{"description":"If set, an additional chunk will be streamed before the `data: [DONE]` message. The `usage` field on this chunk shows the token usage statistics for the entire request, and the `choices` field will always be an empty array. All other chunks will also include a `usage` field, but with a null value.\n","nullable":true,"type":"boolean"}},"type":"object"},"ChatCompletionStreamResponseDelta":{"description":"A chat completion delta generated by streamed model responses.","properties":{"annotations":{"description":"A list of annotations providing citations and references for the content in the message.","items":{"$ref":"#/components/schemas/Annotation"},"type":"array"},"content":{"description":"The contents of the chunk message.","nullable":true,"type":"string"},"executed_tools":{"$ref":"#/components/schemas/ChatCompletionMessageExecutedTools"},"function_call":{"deprecated":true,"description":"Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.","properties":{"arguments":{"description":"The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.","type":"string"},"name":{"description":"The name of the function to call.","type":"string"}},"type":"object"},"reasoning":{"description":"The model’s reasoning for a response. 
Only available for models that support reasoning when request parameter reasoning_format has value `parsed`.","nullable":true,"type":"string"},"role":{"description":"The role of the author of this message.","enum":["system","user","assistant","tool"],"type":"string"},"tool_calls":{"items":{"$ref":"#/components/schemas/ChatCompletionMessageToolCallChunk"},"type":"array"}},"type":"object"},"ChatCompletionTokenLogprob":{"properties":{"bytes":{"description":"A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be `null` if there is no bytes representation for the token.","items":{"type":"integer"},"nullable":true,"type":"array"},"logprob":{"description":"The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value `-9999.0` is used to signify that the token is very unlikely.","type":"number"},"token":{"description":"The token.","type":"string"},"top_logprobs":{"description":"List of the most likely tokens and their log probability, at this token position. In rare cases, there may be fewer than the number of requested `top_logprobs` returned.","items":{"properties":{"bytes":{"description":"A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be `null` if there is no bytes representation for the token.","items":{"type":"integer"},"nullable":true,"type":"array"},"logprob":{"description":"The log probability of this token, if it is within the top 20 most likely tokens. 
Otherwise, the value `-9999.0` is used to signify that the token is very unlikely.","type":"number"},"token":{"description":"The token.","type":"string"}},"required":["token","logprob","bytes"],"type":"object"},"type":"array"}},"required":["token","logprob","bytes","top_logprobs"],"type":"object"},"ChatCompletionTool":{"properties":{"allowed_tools":{"description":"A list of tool names to allow from the MCP server. If specified, only these tools will be exposed to the model. If empty or not specified, all discovered tools will be available.","items":{"type":"string"},"type":"array","x-groq-meta":{"hidden":true}},"function":{"$ref":"#/components/schemas/FunctionObject"},"headers":{"additionalProperties":{"type":"string"},"description":"HTTP headers to send with requests to the MCP server (optional for MCP tools).","type":"object","x-groq-meta":{"hidden":true}},"server_label":{"description":"A human-readable label for the MCP server (optional for MCP tools).","type":"string","x-groq-meta":{"hidden":true}},"server_url":{"description":"The URL of the MCP server to connect to (required for MCP tools).","type":"string","x-groq-meta":{"hidden":true}},"type":{"anyOf":[{"description":"The type of the tool. `function`, `browser_search`, and `code_interpreter` are supported.","enum":["function","browser_search","code_interpreter"],"type":"string"},{"type":"string"}]}},"required":["type"],"type":"object"},"ChatCompletionToolChoiceOption":{"description":"Controls which (if any) tool is called by the model.\n`none` means the model will not call any tool and instead generates a message.\n`auto` means the model can pick between generating a message or calling one or more tools.\n`required` means the model must call one or more tools.\nSpecifying a particular tool via `{\"type\": \"function\", \"function\": {\"name\": \"my_function\"}}` forces the model to call that tool.\n\n`none` is the default when no tools are present. 
`auto` is the default if tools are present.\n","nullable":true,"oneOf":[{"description":"`none` means the model will not call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools.\n","enum":["none","auto","required"],"type":"string"},{"$ref":"#/components/schemas/ChatCompletionNamedToolChoice"}],"x-groq-meta":{"validator":"ChatCompletionToolChoiceOption"}},"ChatCompletionUsageBreakdown":{"description":"Usage statistics for compound AI completion requests.","properties":{"models":{"description":"List of models used in the request and their individual usage statistics","items":{"properties":{"model":{"description":"The name/identifier of the model used","type":"string"},"usage":{"$ref":"#/components/schemas/CompletionUsage"}},"required":["model","usage"],"type":"object"},"type":"array"}},"required":["models"],"type":"object"},"CodeExecutionResult":{"additionalProperties":false,"properties":{"chart":{"$ref":"#/components/schemas/Chart"},"charts":{"description":"Array of charts from a superchart","items":{"$ref":"#/components/schemas/Chart"},"type":"array"},"png":{"description":"Base64 encoded PNG image output from code execution","type":"string"},"text":{"description":"The text version of the code execution result","type":"string"}},"type":"object"},"CompletionUsage":{"description":"Usage statistics for the completion request.","properties":{"completion_time":{"description":"Time spent generating tokens","type":"number"},"completion_tokens":{"description":"Number of tokens in the generated completion.","type":"integer"},"completion_tokens_details":{"description":"Breakdown of tokens in the completion.","nullable":true,"properties":{"reasoning_tokens":{"description":"Number of tokens used for reasoning (for reasoning models).","type":"integer"}},"required":["reasoning_tokens"],"type":"object"},"prompt_time":{"description":"Time spent processing input 
tokens","type":"number"},"prompt_tokens":{"description":"Number of tokens in the prompt.","type":"integer"},"prompt_tokens_details":{"description":"Breakdown of tokens in the prompt.","nullable":true,"properties":{"cached_tokens":{"description":"Number of tokens that were cached and reused.","type":"integer"}},"required":["cached_tokens"],"type":"object"},"queue_time":{"description":"Time the requests was spent queued","type":"number"},"total_time":{"description":"completion time and prompt time combined","type":"number"},"total_tokens":{"description":"Total number of tokens used in the request (prompt + completion).","type":"integer"}},"required":["prompt_tokens","completion_tokens","total_tokens"],"type":"object"},"CreateChatCompletionRequest":{"additionalProperties":false,"properties":{"citation_options":{"default":"enabled","description":"Whether to enable citations in the response. When enabled, the model will include citations for information retrieved from provided documents or web searches.","enum":["enabled","disabled"],"nullable":true,"type":"string"},"compound_custom":{"description":"Custom configuration of models and tools for Compound.","nullable":true,"properties":{"models":{"nullable":true,"properties":{"answering_model":{"description":"Custom model to use for answering.","nullable":true,"type":"string"},"reasoning_model":{"description":"Custom model to use for reasoning.","nullable":true,"type":"string"}},"type":"object"},"tools":{"description":"Configuration options for tools available to Compound.","nullable":true,"properties":{"enabled_tools":{"description":"A list of tool names that are enabled for the request.","items":{"type":"string"},"nullable":true,"type":"array"},"wolfram_settings":{"description":"Configuration for the Wolfram tool integration.","nullable":true,"properties":{"authorization":{"description":"API key used to authorize requests to Wolfram 
services.","nullable":true,"type":"string"}},"type":"object"}},"type":"object"}},"type":"object"},"custom_template":{"description":"Custom template configuration for message formatting.","nullable":true,"properties":{"preserve_system_message_order":{"description":"Whether to preserve the order of system messages.","nullable":true,"type":"boolean"},"relaxed_message_ordering":{"description":"Whether to allow relaxed message ordering in the conversation.","nullable":true,"type":"boolean"},"template":{"description":"The custom template string to use for formatting messages.","type":"string"}},"required":["template"],"type":"object","x-groq-meta":{"hidden":true}},"debug":{"default":false,"description":"output input and output token IDs in xgroq debug field","nullable":true,"type":"boolean","x-groq-meta":{"hidden":true}},"disable_tool_validation":{"default":false,"description":"If set to true, groq will return called tools without validating that the tool is present in request.tools. tool_choice=required/none will still be enforced, but the request cannot require a specific tool be used.\n","type":"boolean"},"documents":{"description":"A list of documents to provide context for the conversation. Each document contains text that can be referenced by the model.","items":{"$ref":"#/components/schemas/ChatCompletionDocument"},"nullable":true,"type":"array"},"exclude_domains":{"deprecated":true,"description":"Deprecated: Use search_settings.exclude_domains instead.\nA list of domains to exclude from the search results when the model uses a web search tool.\n","items":{"type":"string"},"nullable":true,"type":"array"},"exclude_instance_ids":{"description":"For internal use only\n","items":{"type":"string"},"nullable":true,"type":"array","x-groq-meta":{"hidden":true}},"frequency_penalty":{"default":0,"description":"This is not yet supported by any of our models. Number between -2.0 and 2.0. 
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.","maximum":2,"minimum":-2,"nullable":true,"type":"number"},"function_call":{"deprecated":true,"description":"Deprecated in favor of `tool_choice`.\n\nControls which (if any) function is called by the model.\n`none` means the model will not call a function and instead generates a message.\n`auto` means the model can pick between generating a message or calling a function.\nSpecifying a particular function via `{\"name\": \"my_function\"}` forces the model to call that function.\n\n`none` is the default when no functions are present. `auto` is the default if functions are present.\n","nullable":true,"oneOf":[{"description":"`none` means the model will not call a function and instead generates a message. `auto` means the model can pick between generating a message or calling a function.\n","enum":["none","auto","required"],"type":"string"},{"$ref":"#/components/schemas/ChatCompletionFunctionCallOption"}]},"functions":{"deprecated":true,"description":"Deprecated in favor of `tools`.\n\nA list of functions the model may generate JSON inputs for.\n","items":{"$ref":"#/components/schemas/ChatCompletionFunctions"},"maxItems":128,"nullable":true,"type":"array"},"include_domains":{"deprecated":true,"description":"Deprecated: Use search_settings.include_domains instead.\nA list of domains to include in the search results when the model uses a web search tool.\n","items":{"type":"string"},"nullable":true,"type":"array"},"include_reasoning":{"description":"Whether to include reasoning in the response. If true, the response will include a `reasoning` field. 
If false, the model’s reasoning will not be included in the response.\nThis field is mutually exclusive with `reasoning_format`.\n","nullable":true,"type":"boolean"},"logit_bias":{"additionalProperties":{"type":"integer"},"description":"This is not yet supported by any of our models.\nModify the likelihood of specified tokens appearing in the completion.\n","nullable":true,"type":"object"},"logprobs":{"default":false,"description":"This is not yet supported by any of our models.\nWhether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the `content` of `message`.\n","nullable":true,"type":"boolean"},"max_completion_tokens":{"description":"The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length.","nullable":true,"type":"integer"},"max_tokens":{"deprecated":true,"description":"Deprecated in favor of `max_completion_tokens`.\nThe maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model’s context length.\n","nullable":true,"type":"integer"},"messages":{"description":"A list of messages comprising the conversation so far.","items":{"$ref":"#/components/schemas/ChatCompletionRequestMessage"},"minItems":1,"type":"array"},"metadata":{"additionalProperties":{"type":"string"},"description":"This parameter is not currently supported.\n","nullable":true,"type":"object"},"model":{"anyOf":[{"type":"string"},{"enum":["compound-beta","compound-beta-mini","gemma2-9b-it","llama-3.1-8b-instant","llama-3.3-70b-versatile","meta-llama/llama-4-maverick-17b-128e-instruct","meta-llama/llama-4-scout-17b-16e-instruct","meta-llama/llama-guard-4-12b","moonshotai/kimi-k2-instruct","openai/gpt-oss-120b","openai/gpt-oss-20b","qwen/qwen3-32b"],"type":"string"}],"description":"ID of the model to use. 
For details on which models are compatible with the Chat API, see available models","example":"meta-llama/llama-4-scout-17b-16e-instruct"},"n":{"default":1,"description":"How many chat completion choices to generate for each input message. Note that the current moment, only n=1 is supported. Other values will result in a 400 response.","example":1,"maximum":1,"minimum":1,"nullable":true,"type":"integer"},"parallel_tool_calls":{"default":true,"description":"Whether to enable parallel function calling during tool use.\n","nullable":true,"type":"boolean"},"presence_penalty":{"default":0,"description":"This is not yet supported by any of our models. Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.","maximum":2,"minimum":-2,"nullable":true,"type":"number"},"reasoning_effort":{"description":"qwen3 models support the following values\nSet to ’none’ to disable reasoning.\nSet to ‘default’ or null to let Qwen reason.\n\nopenai/gpt-oss-20b and openai/gpt-oss-120b support ’low’, ‘medium’, or ‘high’.\n’medium’ is the default value.\n","enum":["none","default","low","medium","high"],"nullable":true,"type":"string"},"reasoning_format":{"description":"Specifies how to output reasoning tokens\nThis field is mutually exclusive with `include_reasoning`.\n","enum":["hidden","raw","parsed"],"nullable":true,"type":"string"},"response_format":{"description":"An object specifying the format that the model must output. Setting to `{ \"type\": \"json_schema\", \"json_schema\": {…} }` enables Structured Outputs which ensures the model will match your supplied JSON schema. `json_schema` response format is only available on supported models. Setting to `{ \"type\": \"json_object\" }` enables the older JSON mode, which ensures the message the model generates is valid JSON. 
Using `json_schema` is preferred for models that support it.\n","nullable":true,"oneOf":[{"$ref":"#/components/schemas/ResponseFormatText"},{"$ref":"#/components/schemas/ResponseFormatJsonSchema"},{"$ref":"#/components/schemas/ResponseFormatJsonObject"}]},"search_settings":{"description":"Settings for web search functionality when the model uses a web search tool.\n","nullable":true,"properties":{"country":{"description":"Name of country to prioritize search results from (e.g., \"united states\", \"germany\", \"france\").","nullable":true,"type":"string"},"exclude_domains":{"description":"A list of domains to exclude from the search results.","items":{"type":"string"},"nullable":true,"type":"array"},"include_domains":{"description":"A list of domains to include in the search results.","items":{"type":"string"},"nullable":true,"type":"array"},"include_images":{"description":"Whether to include images in the search results.","nullable":true,"type":"boolean"}},"type":"object"},"seed":{"description":"If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result.\nDeterminism is not guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.\n","nullable":true,"type":"integer"},"service_tier":{"description":"The service tier to use for the request. Defaults to `on_demand`.\n- `auto` will automatically select the highest tier available within the rate limits of your organization.\n- `flex` uses the flex tier, which will succeed or fail quickly.\n","enum":["auto","on_demand","flex","performance",null],"nullable":true,"type":"string"},"stop":{"description":"Up to 4 sequences where the API will stop generating further tokens. 
The returned text will not contain the stop sequence.\n","nullable":true,"oneOf":[{"example":"\n","nullable":true,"type":"string"},{"items":{"example":"[\"\\n\"]","type":"string"},"maxItems":4,"type":"array"}]},"store":{"description":"This parameter is not currently supported.\n","nullable":true,"type":"boolean"},"stream":{"default":false,"description":"If set, partial message deltas will be sent. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message. Example code.\n","nullable":true,"type":"boolean"},"stream_options":{"$ref":"#/components/schemas/ChatCompletionStreamOptions"},"temperature":{"default":1,"description":"What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or top_p but not both.","example":1,"maximum":2,"minimum":0,"nullable":true,"type":"number"},"tool_choice":{"$ref":"#/components/schemas/ChatCompletionToolChoiceOption"},"tools":{"description":"A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.\n","items":{"$ref":"#/components/schemas/ChatCompletionTool"},"maxItems":128,"nullable":true,"type":"array"},"top_logprobs":{"description":"This is not yet supported by any of our models.\nAn integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to `true` if this parameter is used.\n","maximum":20,"minimum":0,"nullable":true,"type":"integer"},"top_p":{"default":1,"description":"An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. 
So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.","example":1,"maximum":1,"minimum":0,"nullable":true,"type":"number"},"user":{"description":"A unique identifier representing your end-user, which can help us monitor and detect abuse.","nullable":true,"type":"string"}},"required":["model","messages"],"type":"object"},"CreateChatCompletionResponse":{"description":"Represents a chat completion response returned by model, based on the provided input.","properties":{"choices":{"description":"A list of chat completion choices. Can be more than one if `n` is greater than 1.","items":{"properties":{"finish_reason":{"description":"The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence,\n`length` if the maximum number of tokens specified in the request was reached,\n`tool_calls` if the model called a tool, or `function_call` (deprecated) if the model called a function.\n","enum":["stop","length","tool_calls","function_call"],"type":"string"},"index":{"description":"The index of the choice in the list of choices.","type":"integer"},"logprobs":{"description":"Log probability information for the choice.","nullable":true,"properties":{"content":{"description":"A list of message content tokens with log probability information.","items":{"$ref":"#/components/schemas/ChatCompletionTokenLogprob"},"nullable":true,"type":"array"}},"required":["content"],"type":"object"},"message":{"$ref":"#/components/schemas/ChatCompletionResponseMessage"}},"required":["finish_reason","index","message","logprobs"],"type":"object"},"type":"array"},"created":{"description":"The Unix timestamp (in seconds) of when the chat completion was created.","type":"integer"},"id":{"description":"A unique identifier for the chat completion.","type":"string"},"mcp_list_tools":{"description":"List of discovered MCP tools from connected 
servers.","items":{"properties":{"id":{"description":"Unique identifier for this tool list response.","type":"string"},"server_label":{"description":"Human-readable label for the MCP server.","type":"string"},"tools":{"description":"Array of discovered tools from the server.","items":{"properties":{"annotations":{"description":"Additional metadata for the tool."},"description":{"description":"Description of what the tool does.","type":"string"},"input_schema":{"additionalProperties":true,"description":"JSON Schema describing the tool’s input parameters.","type":"object"},"name":{"description":"The name of the tool.","type":"string"}},"type":"object"},"type":"array"},"type":{"description":"The type identifier.","type":"string"}},"type":"object"},"nullable":true,"type":"array"},"model":{"description":"The model used for the chat completion.","type":"string"},"object":{"description":"The object type, which is always `chat.completion`.","enum":["chat.completion"],"type":"string"},"service_tier":{"description":"The service tier used for the request.","enum":["auto","on_demand","flex","performance",null],"nullable":true,"type":"string"},"system_fingerprint":{"description":"This fingerprint represents the backend configuration that the model runs with.\n\nCan be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.\n","type":"string"},"usage":{"$ref":"#/components/schemas/CompletionUsage"},"usage_breakdown":{"$ref":"#/components/schemas/ChatCompletionUsageBreakdown"},"x_groq":{"$ref":"#/components/schemas/XGroqNonStreaming"}},"required":["choices","created","id","model","object"],"type":"object"},"CreateChatCompletionStreamResponse":{"description":"Represents a streamed chunk of a chat completion response returned by model, based on the provided input.","properties":{"choices":{"description":"A list of chat completion choices. 
Can contain more than one elements if `n` is greater than 1.\n","items":{"properties":{"delta":{"$ref":"#/components/schemas/ChatCompletionStreamResponseDelta"},"finish_reason":{"description":"The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence,\n`length` if the maximum number of tokens specified in the request was reached,\n`tool_calls` if the model called a tool, or `function_call` (deprecated) if the model called a function.\n","enum":["stop","length","tool_calls","function_call"],"nullable":true,"type":"string"},"index":{"description":"The index of the choice in the list of choices.","type":"integer"},"logprobs":{"description":"Log probability information for the choice.","nullable":true,"properties":{"content":{"description":"A list of message content tokens with log probability information.","items":{"$ref":"#/components/schemas/ChatCompletionTokenLogprob"},"nullable":true,"type":"array"}},"required":["content"],"type":"object"}},"required":["delta","finish_reason","index"],"type":"object"},"type":"array"},"created":{"description":"The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp.","type":"integer"},"id":{"description":"A unique identifier for the chat completion. 
Each chunk has the same ID.","type":"string"},"model":{"description":"The model to generate the completion.","type":"string"},"object":{"description":"The object type, which is always `chat.completion.chunk`.","enum":["chat.completion.chunk"],"type":"string"},"system_fingerprint":{"description":"This fingerprint represents the backend configuration that the model runs with.\nCan be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.\n","type":"string"},"x_groq":{"$ref":"#/components/schemas/XGroq"}},"required":["choices","created","id","model","object"],"type":"object"},"CreateEmbeddingRequest":{"additionalProperties":false,"properties":{"encoding_format":{"default":"float","description":"The format to return the embeddings in. Can only be `float` or `base64`.","enum":["float","base64"],"example":"float","type":"string"},"input":{"description":"Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. 
The input must not exceed the max input tokens for the model, cannot be an empty string, and any array must be 2048 dimensions or less.\n","example":"The quick brown fox jumped over the lazy dog","oneOf":[{"default":"","description":"The string that will be turned into an embedding.","example":"This is a test.","title":"string","type":"string"},{"description":"The array of strings that will be turned into an embeddings.","items":{"default":"","example":"[‘This is a test.’]","type":"string"},"maxItems":2048,"minItems":1,"title":"array","type":"array"}],"x-groq-meta":{"validator":"EmbeddingInput"}},"model":{"anyOf":[{"type":"string"},{"enum":["nomic-embed-text-v1_5"],"type":"string"}],"description":"ID of the model to use.\n","example":"nomic-embed-text-v1_5"},"user":{"description":"A unique identifier representing your end-user, which can help us monitor and detect abuse.","nullable":true,"type":"string"}},"required":["model","input"],"type":"object"},"CreateEmbeddingResponse":{"properties":{"data":{"description":"The list of embeddings generated by the model.","items":{"$ref":"#/components/schemas/Embedding"},"type":"array"},"model":{"description":"The name of the model used to generate the embedding.","type":"string"},"object":{"description":"The object type, which is always \"list\".","enum":["list"],"type":"string"},"usage":{"description":"The usage information for the request.","properties":{"prompt_tokens":{"description":"The number of tokens used by the prompt.","type":"integer"},"total_tokens":{"description":"The total number of tokens used by the request.","type":"integer"}},"required":["prompt_tokens","total_tokens"],"type":"object"}},"required":["object","model","data","usage"],"type":"object"},"CreateFileRequest":{"additionalProperties":false,"properties":{"file":{"description":"The File object (not file name) to be uploaded.\n","format":"binary","type":"string"},"purpose":{"description":"The intended purpose of the uploaded file.\nUse \"batch\" for 
Batch API.\n","enum":["batch"],"type":"string"}},"required":["file","purpose"],"type":"object"},"CreateFineTuningRequest":{"properties":{"base_model":{"description":"BaseModel is the model that the fine tune was originally trained on.\n","type":"string"},"input_file_id":{"description":"InputFileID is the id of the file that was uploaded via the /files api.\n","type":"string"},"name":{"description":"Name is the given name to a fine tuned model.\n","type":"string"},"type":{"description":"Type is the type of fine tuning format such as \"lora\".\n","type":"string"}},"type":"object"},"CreateResponseRequest":{"additionalProperties":false,"properties":{"input":{"description":"Text input to the model, used to generate a response.\n","oneOf":[{"description":"A text input to the model, equivalent to a text input with the `user` role.","title":"Text input","type":"string"},{"description":"A list of one or many input items to the model, containing different content types.","items":{"$ref":"#/components/schemas/ResponseInputItem"},"title":"Input item list","type":"array"}]},"instructions":{"description":"Inserts a system (or developer) message as the first item in the model’s context.\n","nullable":true,"type":"string"},"max_output_tokens":{"description":"An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.\n","nullable":true,"type":"integer"},"metadata":{"additionalProperties":{"type":"string"},"description":"Custom key-value pairs for storing additional information. Maximum of 16 pairs.\n","nullable":true,"type":"object"},"model":{"anyOf":[{"type":"string"},{"enum":["gemma2-9b-it","llama-3.3-70b-versatile","llama-3.1-8b-instant","llama-guard-3-8b","llama3-70b-8192","llama3-8b-8192"],"type":"string"}],"description":"ID of the model to use. 
For details on which models are compatible with the Responses API, see available models","example":"llama-3.3-70b-versatile"},"parallel_tool_calls":{"default":true,"description":"Enable parallel execution of multiple tool calls.\n","nullable":true,"type":"boolean"},"reasoning":{"description":"Configuration for reasoning capabilities when using models that support reasoning.\n","nullable":true,"properties":{"effort":{"default":"medium","description":"Level of reasoning effort. Supported values: `low`, `medium`, `high`. Lower values provide faster responses with less reasoning depth.\n","enum":["low","medium","high"],"nullable":true,"type":"string"}},"type":"object"},"service_tier":{"default":"auto","description":"Specifies the latency tier to use for processing the request.\n","enum":["auto","default","flex"],"nullable":true,"type":"string"},"store":{"default":false,"description":"Response storage flag. Note: Currently only supports false or null values.\n","nullable":true,"type":"boolean"},"stream":{"default":false,"description":"Enable streaming mode to receive response data as server-sent events.\n","nullable":true,"type":"boolean"},"temperature":{"default":1,"description":"Controls randomness in the response generation. Range: 0 to 2. Lower values produce more deterministic outputs, higher values increase variety and creativity.\n","example":1,"maximum":2,"minimum":0,"nullable":true,"type":"number"},"text":{"description":"Response format configuration. Supports plain text or structured JSON output.\n","properties":{"format":{"$ref":"#/components/schemas/ResponseFormatConfiguration"}},"type":"object"},"tool_choice":{"$ref":"#/components/schemas/ResponseToolChoiceOption"},"tools":{"description":"List of tools available to the model. Currently supports function definitions only. 
Maximum of 128 functions.\n","items":{"$ref":"#/components/schemas/ResponseTool"},"maxItems":128,"nullable":true,"type":"array"},"top_p":{"default":1,"description":"Nucleus sampling parameter that controls the cumulative probability cutoff. Range: 0 to 1. A value of 0.1 restricts sampling to tokens within the top 10% probability mass.\n","example":1,"maximum":1,"minimum":0,"nullable":true,"type":"number"},"truncation":{"default":"disabled","description":"Context truncation strategy. Supported values: `auto` or `disabled`.\n","enum":["auto","disabled"],"nullable":true,"type":"string"},"user":{"description":"Optional identifier for tracking end-user requests. Useful for usage monitoring and compliance.\n","example":"user-1234","type":"string"}},"required":["model","input"],"type":"object"},"CreateResponseResponse":{"description":"Represents a response returned by model, based on the provided input.","properties":{"background":{"default":false,"description":"Whether the response was generated in the background.","type":"boolean"},"created_at":{"description":"The Unix timestamp (in seconds) of when the response was created.","type":"integer"},"error":{"description":"An error object if the response failed.","nullable":true,"properties":{"code":{"description":"The error code.","type":"string"},"message":{"description":"A human-readable error message.","type":"string"}},"required":["code","message"],"type":"object"},"id":{"description":"A unique identifier for the response.","type":"string"},"incomplete_details":{"description":"Details about why the response is incomplete.","nullable":true,"properties":{"reason":{"description":"The reason why the response is incomplete.","type":"string"}},"type":"object"},"instructions":{"description":"The system instructions used for the response.","nullable":true,"type":"string"},"max_output_tokens":{"description":"The maximum number of tokens configured for the 
response.","nullable":true,"type":"integer"},"max_tool_calls":{"description":"The maximum number of tool calls allowed.","nullable":true,"type":"integer"},"metadata":{"additionalProperties":{"type":"string"},"description":"Metadata attached to the response.","nullable":true,"type":"object"},"model":{"description":"The model used for the response.","type":"string"},"object":{"description":"The object type, which is always `response`.","enum":["response"],"type":"string"},"output":{"description":"An array of content items generated by the model.","items":{"$ref":"#/components/schemas/ResponseOutputItem"},"type":"array"},"parallel_tool_calls":{"description":"Whether the model can run tool calls in parallel.","type":"boolean"},"previous_response_id":{"description":"Not supported. Always null.","nullable":true,"type":"string"},"reasoning":{"description":"Configuration options for models that support reasoning.","nullable":true,"properties":{"effort":{"description":"The reasoning effort level used.","enum":["low","medium","high"],"nullable":true,"type":"string"},"summary":{"description":"Not supported. Always null.","nullable":true,"type":"string"}},"type":"object"},"service_tier":{"description":"The service tier used for processing.","enum":["auto","default","flex"],"type":"string"},"status":{"description":"The status of the response generation. 
One of `completed`, `failed`, `in_progress`, or `incomplete`.\n","enum":["completed","failed","in_progress","incomplete"],"type":"string"},"store":{"description":"Whether the response was stored.","type":"boolean"},"temperature":{"description":"The sampling temperature used.","type":"number"},"text":{"description":"Text format configuration used for the response.","properties":{"format":{"$ref":"#/components/schemas/ResponseFormatConfiguration"}},"type":"object"},"tool_choice":{"$ref":"#/components/schemas/ResponseToolChoiceOption"},"tools":{"description":"The tools that were available to the model.","items":{"$ref":"#/components/schemas/ResponseTool"},"type":"array"},"top_logprobs":{"default":0,"description":"The number of top log probabilities returned.","type":"integer"},"top_p":{"description":"The nucleus sampling parameter used.","type":"number"},"truncation":{"description":"The truncation strategy used.","enum":["auto","disabled"],"type":"string"},"usage":{"$ref":"#/components/schemas/ResponseUsage"},"user":{"description":"The user identifier.","nullable":true,"type":"string"}},"required":["id","object","status","created_at","output","model","tools","tool_choice","truncation","metadata","temperature","top_p","service_tier","error","incomplete_details","parallel_tool_calls","store"],"type":"object"},"CreateSpeechRequest":{"additionalProperties":false,"properties":{"input":{"description":"The text to generate audio for.","example":"The quick brown fox jumped over the lazy dog","type":"string"},"model":{"anyOf":[{"type":"string"},{"enum":["playai-tts","playai-tts-arabic"],"type":"string"}],"description":"One of the available TTS models.\n","example":"playai-tts"},"response_format":{"default":"mp3","description":"The format of the generated audio. 
Supported formats are `flac, mp3, mulaw, ogg, wav`.","enum":["flac","mp3","mulaw","ogg","wav"],"type":"string"},"sample_rate":{"default":48000,"description":"The sample rate for generated audio","enum":[8000,16000,22050,24000,32000,44100,48000],"example":48000,"type":"integer"},"speed":{"default":1,"description":"The speed of the generated audio.","example":1,"maximum":5,"minimum":0.5,"type":"number"},"voice":{"description":"The voice to use when generating the audio. List of voices can be found here.","example":"Fritz-PlayAI","type":"string"}},"required":["model","input","voice"],"type":"object"},"CreateTranscriptionRequest":{"additionalProperties":false,"oneOf":[{"required":["file"]},{"required":["url"]}],"properties":{"file":{"description":"The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.\nEither a file or a URL must be provided. Note that the file field is not supported in Batch API requests.\n","format":"binary","type":"string"},"language":{"anyOf":[{"type":"string"},{"enum":["en","zh","de","es","ru","ko","fr","ja","pt","tr","pl","ca","nl","ar","sv","it","id","hi","fi","vi","he","uk","el","ms","cs","ro","da","hu","ta","no","th","ur","hr","bg","lt","la","mi","ml","cy","sk","te","fa","lv","bn","sr","az","sl","kn","et","mk","br","eu","is","hy","ne","mn","bs","kk","sq","sw","gl","mr","pa","si","km","sn","yo","so","af","oc","ka","be","tg","sd","gu","am","yi","lo","uz","fo","ht","ps","tk","nn","mt","sa","lb","my","bo","tl","mg","as","tt","haw","ln","ha","ba","jv","su","yue"],"type":"string"}],"description":"The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.\n"},"model":{"anyOf":[{"type":"string"},{"enum":["whisper-large-v3","whisper-large-v3-turbo"],"type":"string"}],"description":"ID of the model to use. 
`whisper-large-v3` and `whisper-large-v3-turbo` are currently available.\n","example":"whisper-large-v3-turbo"},"prompt":{"description":"An optional text to guide the model’s style or continue a previous audio segment. The prompt should match the audio language.\n","type":"string"},"response_format":{"default":"json","description":"The format of the transcript output, in one of these options: `json`, `text`, or `verbose_json`.\n","enum":["json","text","verbose_json"],"type":"string"},"temperature":{"default":0,"description":"The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.\n","type":"number"},"timestamp_granularities[]":{"default":["segment"],"description":"The timestamp granularities to populate for this transcription. `response_format` must be set `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word`, or `segment`. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.\n","items":{"enum":["word","segment"],"type":"string"},"type":"array"},"url":{"description":"The audio URL to translate/transcribe (supports Base64URL).\nEither a file or a URL must be provided. 
For Batch API requests, the URL field is required since the file field is not supported.\n","type":"string"}},"required":["model"],"type":"object"},"CreateTranscriptionResponseJson":{"description":"Represents a transcription response returned by model, based on the provided input.","properties":{"text":{"description":"The transcribed text.","type":"string"}},"required":["text"],"type":"object"},"CreateTranscriptionResponseVerboseJson":{"description":"Represents a verbose json transcription response returned by model, based on the provided input.","properties":{"duration":{"description":"The duration of the input audio.","type":"string"},"language":{"description":"The language of the input audio.","type":"string"},"segments":{"description":"Segments of the transcribed text and their corresponding details.","items":{"$ref":"#/components/schemas/TranscriptionSegment"},"type":"array"},"text":{"description":"The transcribed text.","type":"string"},"words":{"description":"Extracted words and their corresponding timestamps.","items":{"$ref":"#/components/schemas/TranscriptionWord"},"type":"array"}},"required":["language","duration","text"],"type":"object"},"CreateTranslationRequest":{"additionalProperties":false,"oneOf":[{"required":["file"]},{"required":["url"]}],"properties":{"file":{"description":"The audio file object (not file name) translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.\n","format":"binary","type":"string"},"model":{"anyOf":[{"type":"string"},{"enum":["whisper-large-v3","whisper-large-v3-turbo"],"type":"string"}],"description":"ID of the model to use. `whisper-large-v3` and `whisper-large-v3-turbo` are currently available.\n","example":"whisper-large-v3-turbo"},"prompt":{"description":"An optional text to guide the model’s style or continue a previous audio segment. 
The prompt should be in English.\n","type":"string"},"response_format":{"default":"json","description":"The format of the transcript output, in one of these options: `json`, `text`, or `verbose_json`.\n","enum":["json","text","verbose_json"],"type":"string"},"temperature":{"default":0,"description":"The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.\n","type":"number"},"url":{"description":"The audio URL to translate/transcribe (supports Base64URL). Either file or url must be provided.\nWhen using the Batch API only url is supported.\n","type":"string"}},"required":["model"],"type":"object"},"CreateTranslationResponseJson":{"properties":{"text":{"type":"string"}},"required":["text"],"type":"object"},"CreateTranslationResponseVerboseJson":{"properties":{"duration":{"description":"The duration of the input audio.","type":"string"},"language":{"description":"The language of the output translation (always `english`).","type":"string"},"segments":{"description":"Segments of the translated text and their corresponding details.","items":{"$ref":"#/components/schemas/TranscriptionSegment"},"type":"array"},"text":{"description":"The translated text.","type":"string"}},"required":["language","duration","text"],"type":"object"},"DebugData":{"description":"Debug information including input and output token IDs and strings. 
Only present when debug=true in the request.","properties":{"input_token_ids":{"description":"Token IDs for the input.","items":{"type":"integer"},"type":"array"},"input_tokens":{"description":"Token strings for the input.","items":{"type":"string"},"type":"array"},"output_token_ids":{"description":"Token IDs for the output.","items":{"type":"integer"},"type":"array"},"output_tokens":{"description":"Token strings for the output.","items":{"type":"string"},"type":"array"}},"type":"object"},"DeleteFileResponse":{"properties":{"deleted":{"type":"boolean"},"id":{"type":"string"},"object":{"enum":["file"],"type":"string","x-stainless-const":true}},"required":["id","object","deleted"],"type":"object"},"DeleteFineTuningResponse":{"properties":{"deleted":{"type":"boolean"},"id":{"type":"string"},"object":{"type":"string"}},"type":"object"},"DeleteModelResponse":{"properties":{"deleted":{"type":"boolean"},"id":{"type":"string"},"object":{"type":"string"}},"required":["id","object","deleted"],"type":"object"},"DocumentCitation":{"additionalProperties":false,"description":"A citation referencing a specific document that was provided in the request.","properties":{"document_id":{"description":"The ID of the document being cited, corresponding to a document provided in the request.","type":"string"},"end_index":{"description":"The character index in the message content where this citation ends.","type":"integer"},"start_index":{"description":"The character index in the message content where this citation begins.","type":"integer"}},"required":["start_index","end_index","document_id"],"type":"object"},"Embedding":{"description":"Represents an embedding vector returned by embedding endpoint.\n","properties":{"embedding":{"oneOf":[{"description":"The embedding vector, which is a list of floats. 
The length of vector depends on the model as listed in the embedding guide.\n","items":{"type":"number"},"type":"array"},{"description":"The embedding vector, which is a base64 encoded string. The length of vector depends on the model as listed in the embedding guide.\n","type":"string"}]},"index":{"description":"The index of the embedding in the list of embeddings.","type":"integer"},"object":{"description":"The object type, which is always \"embedding\".","enum":["embedding"],"type":"string"}},"required":["index","object","embedding"],"type":"object"},"Error":{"properties":{"code":{"nullable":true,"type":"string"},"debug":{"$ref":"#/components/schemas/DebugData"},"failed_generation":{"nullable":true,"type":"string"},"message":{"type":"string"},"param":{"nullable":true,"type":"string"},"schema_code":{"nullable":true,"type":"string"},"schema_kind":{"nullable":true,"type":"string"},"schema_path":{"nullable":true,"type":"string"},"schema_path_segments":{"description":"Segments of the schema path relevant to validation errors.","items":{"type":"string"},"type":"array"},"type":{"type":"string"}},"required":["type","message"],"type":"object"},"ErrorResponse":{"properties":{"error":{"$ref":"#/components/schemas/Error"}},"required":["error"],"type":"object"},"File":{"description":"The `File` object represents a document that has been uploaded.","properties":{"bytes":{"description":"The size of the file, in bytes.","type":"integer"},"created_at":{"description":"The Unix timestamp (in seconds) for when the file was created.","type":"integer"},"filename":{"description":"The name of the file.","type":"string"},"id":{"description":"The file identifier, which can be referenced in the API endpoints.","type":"string"},"object":{"description":"The object type, which is always `file`.","enum":["file"],"type":"string","x-stainless-const":true},"purpose":{"description":"The intended purpose of the file. 
Supported values are `batch`, and `batch_output`.","enum":["batch","batch_output"],"type":"string"}},"title":"File"},"FunctionCitation":{"additionalProperties":false,"description":"A citation referencing the result of a function or tool call.","properties":{"end_index":{"description":"The character index in the message content where this citation ends.","type":"integer"},"start_index":{"description":"The character index in the message content where this citation begins.","type":"integer"},"tool_call_id":{"description":"The ID of the tool call being cited, corresponding to a tool call made during the conversation.","type":"string"}},"required":["start_index","end_index","tool_call_id"],"type":"object"},"FunctionObject":{"properties":{"description":{"description":"A description of what the function does, used by the model to choose when and how to call the function.","type":"string"},"name":{"description":"The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.","type":"string"},"parameters":{"$ref":"#/components/schemas/FunctionParameters"},"strict":{"default":false,"description":"Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the `schema` field. Only a subset of JSON Schema is supported when `strict` is `true`.\n","type":"boolean"}},"required":["name"],"type":"object"},"FunctionParameters":{"additionalProperties":true,"description":"Function parameters defined as a JSON Schema object. 
Refer to https://json-schema.org/understanding-json-schema/ for schema documentation.","type":"object"},"ListBatchesResponse":{"properties":{"data":{"items":{"$ref":"#/components/schemas/Batch"},"type":"array"},"object":{"enum":["list"],"type":"string","x-stainless-const":true}},"required":["object","data"],"type":"object"},"ListFilesResponse":{"properties":{"data":{"items":{"$ref":"#/components/schemas/File"},"type":"array"},"object":{"enum":["list"],"type":"string"}},"required":["object","data"],"type":"object"},"ListFineTuningsResponse":{"properties":{"data":{"items":{"properties":{"base_model":{"description":"BaseModel is the model that the fine tune was originally trained on.\n","type":"string"},"created_at":{"description":"CreatedAt is the timestamp of when the fine tuned model was created.\n","type":"number"},"fine_tuned_model":{"description":"FineTunedModel is the final name of the fine tuned model.\n","type":"string"},"id":{"description":"ID is the unique identifier of a fine tune.\n","type":"string"},"input_file_id":{"description":"InputFileID is the id of the file that was uploaded via the /files api.\n","type":"string"},"name":{"description":"Name is the given name to a fine tuned model.\n","type":"string"},"type":{"description":"Type is the type of fine tuning format such as \"lora\".\n","type":"string"}},"type":"object"},"type":"array"},"object":{"type":"string"}},"type":"object"},"ListModelsResponse":{"properties":{"data":{"items":{"$ref":"#/components/schemas/Model"},"type":"array"},"object":{"enum":["list"],"type":"string"}},"required":["object","data"],"type":"object"},"Model":{"description":"Describes an OpenAI model offering that can be used with the API.","properties":{"created":{"description":"The Unix timestamp (in seconds) when the model was created.","type":"integer"},"id":{"description":"The model identifier, which can be referenced in the API endpoints.","type":"string"},"object":{"description":"The object type, which is always 
\"model\".","enum":["model"],"type":"string"},"owned_by":{"description":"The organization that owns the model.","type":"string"}},"required":["id","object","created","owned_by"],"title":"Model"},"ReadFineTuningResponse":{"properties":{"data":{"properties":{"base_model":{"description":"BaseModel is the model that the fine tune was originally trained on.\n","type":"string"},"created_at":{"description":"CreatedAt is the timestamp of when the fine tuned model was created.\n","type":"number"},"fine_tuned_model":{"description":"FineTunedModel is the final name of the fine tuned model.\n","type":"string"},"id":{"description":"ID is the unique identifier of a fine tune.\n","type":"string"},"input_file_id":{"description":"InputFileID is the id of the file that was uploaded via the /files api.\n","type":"string"},"name":{"description":"Name is the given name to a fine tuned model.\n","type":"string"},"type":{"description":"Type is the type of fine tuning format such as \"lora\".\n","type":"string"}},"type":"object"},"id":{"type":"string"},"object":{"type":"string"}},"type":"object"},"RerankingRequest":{"additionalProperties":false,"properties":{"docs":{"description":"An array of documents to rank. Each document is a string containing the text content.\nMaximum of 100 documents per request.\n","example":["Machine learning is a subset of artificial intelligence","The weather forecast predicts rain tomorrow","Deep learning uses neural networks with multiple layers"],"items":{"minLength":1,"type":"string"},"maxItems":100,"minItems":1,"type":"array"},"instruction":{"description":"Optional instruction to guide the reranking process. 
If not provided, \na default instruction will be used.\n","example":"Find the most relevant document about AI research","nullable":true,"type":"string"},"model":{"description":"ID of the reranking model to use.\n","example":"qwen3-reranker-4b","type":"string"},"query":{"description":"The search query to rank documents against.\n","example":"artificial intelligence research","type":"string"}},"required":["model","query","docs"],"type":"object"},"RerankingResponse":{"properties":{"results":{"description":"List of documents sorted by relevance score in descending order.\nEach result contains the original document text and its relevance score.\n","items":{"$ref":"#/components/schemas/RerankingResult"},"type":"array"}},"required":["results"],"type":"object"},"RerankingResult":{"properties":{"doc":{"description":"The original document text.","example":"Machine learning is a subset of artificial intelligence","type":"string"},"score":{"description":"Relevance score between 0.0 and 1.0, where higher scores indicate \ngreater relevance to the query.\n","example":0.92,"format":"float","maximum":1,"minimum":0,"type":"number"}},"required":["doc","score"],"type":"object"},"ResponseAnnotation":{"discriminator":{"mapping":{"file_citation":"#/components/schemas/ResponseFileCitation","url_citation":"#/components/schemas/ResponseUrlCitation"},"propertyName":"type"},"oneOf":[{"$ref":"#/components/schemas/ResponseFileCitation"},{"$ref":"#/components/schemas/ResponseUrlCitation"}]},"ResponseEasyInputMessage":{"description":"A message input to the model with a role indicating instruction following hierarchy.\n","properties":{"content":{"description":"Text input to the model.\n","oneOf":[{"description":"A text input to the model.","title":"Text input","type":"string"},{"description":"An array of content parts.","items":{"$ref":"#/components/schemas/ResponseInputContent"},"title":"Content array","type":"array"}]},"role":{"description":"The role of the message input. 
One of `user`, `assistant`, `system`, or `developer`.\n","enum":["user","assistant","system","developer"],"type":"string"}},"required":["role","content"],"title":"Easy input message","type":"object"},"ResponseFileCitation":{"description":"A citation to a file.","properties":{"file_id":{"description":"The ID of the file.","type":"string"},"index":{"description":"The index of the citation in the text.","type":"integer"},"type":{"description":"The type of the annotation. Always `file_citation`.","enum":["file_citation"],"type":"string","x-stainless-const":true}},"required":["type","file_id"],"title":"File citation","type":"object"},"ResponseFormatConfiguration":{"description":"An object specifying the format that the model must output.\n","oneOf":[{"$ref":"#/components/schemas/ResponseFormatText"},{"$ref":"#/components/schemas/ResponseFormatJsonObject"},{"$ref":"#/components/schemas/TextResponseFormatJsonSchema"}]},"ResponseFormatJsonObject":{"description":"JSON object response format. An older method of generating JSON responses. Using `json_schema` is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.\n","properties":{"type":{"description":"The type of response format being defined. Always `json_object`.","enum":["json_object"],"type":"string","x-stainless-const":true}},"required":["type"],"title":"JSON object","type":"object"},"ResponseFormatJsonSchema":{"description":"JSON Schema response format. Used to generate structured JSON responses.\n","properties":{"json_schema":{"description":"Structured Outputs configuration options, including a JSON Schema.\n","properties":{"description":{"description":"A description of what the response format is for, used by the model to determine how to respond in the format.\n","type":"string"},"name":{"description":"The name of the response format. 
Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.\n","type":"string"},"schema":{"$ref":"#/components/schemas/ResponseFormatJsonSchemaSchema"},"strict":{"default":false,"description":"Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the `schema` field. Only a subset of JSON Schema is supported when `strict` is `true`.\n","nullable":true,"type":"boolean"}},"required":["name"],"title":"JSON schema","type":"object"},"type":{"description":"The type of response format being defined. Always `json_schema`.","enum":["json_schema"],"type":"string","x-stainless-const":true}},"required":["type","json_schema"],"title":"JSON schema","type":"object"},"ResponseFormatJsonSchemaSchema":{"additionalProperties":true,"description":"The schema for the response format, described as a JSON Schema object.\nLearn how to build JSON schemas here.\n","title":"JSON schema","type":"object"},"ResponseFormatText":{"description":"Default response format. Used to generate text responses.\n","properties":{"type":{"description":"The type of response format being defined. Always `text`.","enum":["text"],"type":"string","x-stainless-const":true}},"required":["type"],"title":"Text","type":"object"},"ResponseFunctionCall":{"description":"A function call generated by the model.","properties":{"arguments":{"description":"A JSON string of the arguments to pass to the function.","type":"string"},"call_id":{"description":"The unique ID of the function tool call generated by the model.","type":"string"},"id":{"description":"The unique ID of the function tool call.","type":"string"},"name":{"description":"The name of the function to call.","type":"string"},"status":{"description":"The status of the item.","enum":["in_progress","completed","incomplete"],"type":"string"},"type":{"description":"The type of the function call. 
Always `function_call`.","enum":["function_call"],"type":"string","x-stainless-const":true}},"required":["type","call_id","name","arguments"],"title":"Function call","type":"object"},"ResponseFunctionCallOutput":{"description":"The output of a function tool call.","properties":{"call_id":{"description":"The unique ID of the function tool call generated by the model.","type":"string"},"id":{"description":"The unique ID of the function tool call output.","type":"string"},"output":{"description":"A JSON string of the output of the function tool call.","type":"string"},"status":{"description":"The status of the item.","enum":["in_progress","completed","incomplete"],"type":"string"},"type":{"description":"The type of the function tool call output. Always `function_call_output`.","enum":["function_call_output"],"type":"string","x-stainless-const":true}},"required":["type","call_id","output"],"title":"Function call output","type":"object"},"ResponseInputContent":{"discriminator":{"mapping":{"input_text":"#/components/schemas/ResponseInputTextContent"},"propertyName":"type"},"oneOf":[{"$ref":"#/components/schemas/ResponseInputTextContent"}]},"ResponseInputItem":{"discriminator":{"mapping":{"function_call":"#/components/schemas/ResponseFunctionCall","function_call_output":"#/components/schemas/ResponseFunctionCallOutput","item_reference":"#/components/schemas/ResponseItemReference","message":"#/components/schemas/ResponseInputMessage"},"propertyName":"type"},"oneOf":[{"$ref":"#/components/schemas/ResponseEasyInputMessage"},{"$ref":"#/components/schemas/ResponseInputMessage"},{"$ref":"#/components/schemas/ResponseItemReference"},{"$ref":"#/components/schemas/ResponseFunctionCall"},{"$ref":"#/components/schemas/ResponseFunctionCallOutput"}]},"ResponseInputMessage":{"description":"A message input to the model with explicit type field.\n","properties":{"content":{"description":"A list of one or many input content 
items.","items":{"$ref":"#/components/schemas/ResponseInputContent"},"type":"array"},"role":{"description":"The role of the message input. One of `user`, `system`, or `developer`. Note: assistant role is not supported with explicit type.\n","enum":["user","system","developer"],"type":"string"},"status":{"description":"The status of item. Populated when items are returned via API.","enum":["in_progress","completed","incomplete"],"type":"string"},"type":{"description":"The type of the message input. Always set to `message`.","enum":["message"],"type":"string","x-stainless-const":true}},"required":["type","role","content"],"title":"Input message","type":"object"},"ResponseInputTextContent":{"description":"A text input to the model.","properties":{"text":{"description":"The text input to the model.","type":"string"},"type":{"description":"The type of the input item. Always `input_text`.","enum":["input_text"],"type":"string","x-stainless-const":true}},"required":["type","text"],"title":"Input text","type":"object"},"ResponseItemReference":{"description":"An internal identifier for an item to reference.","properties":{"id":{"description":"The ID of the item to reference.","type":"string"},"type":{"description":"The type of item to reference. Always `item_reference`.","enum":["item_reference"],"type":"string","x-stainless-const":true}},"required":["type","id"],"title":"Item reference","type":"object"},"ResponseNamedToolChoice":{"description":"Specifies a tool the model should use. Use to force the model to call a specific function.","properties":{"function":{"properties":{"name":{"description":"The name of the function to call.","type":"string"}},"required":["name"],"type":"object"},"type":{"description":"The type of the tool. 
Currently, only `function` is supported.","enum":["function"],"type":"string"}},"required":["type","function"],"type":"object"},"ResponseOutputContent":{"discriminator":{"mapping":{"output_text":"#/components/schemas/ResponseOutputTextContent"},"propertyName":"type"},"oneOf":[{"$ref":"#/components/schemas/ResponseOutputTextContent"}]},"ResponseOutputFunctionCall":{"description":"A function call generated by the model.","properties":{"arguments":{"description":"A JSON string of the arguments to pass to the function.","type":"string"},"call_id":{"description":"The unique ID of the function tool call generated by the model.","type":"string"},"id":{"description":"The unique ID of the function tool call.","type":"string"},"name":{"description":"The name of the function to call.","type":"string"},"status":{"description":"The status of the function call.","enum":["in_progress","completed","incomplete"],"type":"string"},"type":{"description":"The type of the function call. Always `function_call`.","enum":["function_call"],"type":"string","x-stainless-const":true}},"required":["type","id","call_id","name","arguments"],"title":"Function call","type":"object"},"ResponseOutputItem":{"discriminator":{"mapping":{"function_call":"#/components/schemas/ResponseOutputFunctionCall","message":"#/components/schemas/ResponseOutputMessage","reasoning":"#/components/schemas/ResponseOutputReasoning"},"propertyName":"type"},"oneOf":[{"$ref":"#/components/schemas/ResponseOutputMessage"},{"$ref":"#/components/schemas/ResponseOutputFunctionCall"},{"$ref":"#/components/schemas/ResponseOutputReasoning"}]},"ResponseOutputMessage":{"description":"An output message from the model.","properties":{"content":{"description":"The content of the output message.","items":{"$ref":"#/components/schemas/ResponseOutputContent"},"type":"array"},"id":{"description":"The unique ID of the output message.","type":"string"},"role":{"description":"The role of the output message. 
Always `assistant`.","enum":["assistant"],"type":"string","x-stainless-const":true},"status":{"description":"The status of the message.","enum":["in_progress","completed","incomplete"],"type":"string"},"type":{"description":"The type of the output message. Always `message`.","enum":["message"],"type":"string","x-stainless-const":true}},"required":["type","id","role","content"],"title":"Output message","type":"object"},"ResponseOutputReasoning":{"description":"A reasoning output from the model. Available for models that support reasoning.","properties":{"id":{"description":"The unique ID of the reasoning output.","type":"string"},"summary":{"description":"Summary items (currently empty).","items":{"type":"object"},"type":"array"},"type":{"description":"The type of the reasoning output. Always `reasoning`.","enum":["reasoning"],"type":"string","x-stainless-const":true}},"required":["type","id","summary"],"title":"Reasoning","type":"object"},"ResponseOutputTextContent":{"description":"A text output from the model.","properties":{"annotations":{"description":"The annotations of the text output.","items":{"$ref":"#/components/schemas/ResponseAnnotation"},"type":"array"},"logprobs":{"description":"Log probability information for the output.","items":{"type":"string"},"nullable":true,"type":"array"},"text":{"description":"The text output from the model.","type":"string"},"type":{"description":"The type of the output text. Always `output_text`.","enum":["output_text"],"type":"string","x-stainless-const":true}},"required":["type","text","annotations"],"title":"Output text","type":"object"},"ResponseTool":{"properties":{"description":{"description":"Describes the function’s purpose. The model uses this to determine when to invoke the function.","type":"string"},"name":{"description":"The name of the function to be called. 
Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.","type":"string"},"parameters":{"$ref":"#/components/schemas/FunctionParameters"},"strict":{"description":"Whether to enable strict schema adherence when generating the function call.","nullable":true,"type":"boolean"},"type":{"description":"The type of the tool. Currently, only `function` is supported.","enum":["function"],"type":"string"}},"required":["type","name"],"type":"object"},"ResponseToolChoiceOption":{"description":"Controls which (if any) tool is called by the model.\n`none` means the model will not call any tool and instead generates a message.\n`auto` means the model can pick between generating a message or calling one or more tools.\n`required` means the model must call one or more tools.\nSpecifying a particular tool via `{\"type\": \"function\", \"function\": {\"name\": \"my_function\"}}` forces the model to call that tool.\n\n`none` is the default when no tools are present. `auto` is the default if tools are present.\n","nullable":true,"oneOf":[{"description":"`none` means the model will not call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools. `required` means the model must call one or more tools.\n","enum":["none","auto","required"],"type":"string"},{"$ref":"#/components/schemas/ResponseNamedToolChoice"}]},"ResponseUrlCitation":{"description":"A citation for a web resource.","properties":{"end_index":{"description":"The index of the last character of the URL citation in the message.","type":"integer"},"start_index":{"description":"The index of the first character of the URL citation in the message.","type":"integer"},"title":{"description":"The title of the web resource.","type":"string"},"type":{"description":"The type of the annotation. 
Always `url_citation`.","enum":["url_citation"],"type":"string","x-stainless-const":true},"url":{"description":"The URL of the web resource.","type":"string"}},"required":["type","url"],"title":"URL citation","type":"object"},"ResponseUsage":{"description":"Usage statistics for the response request.","properties":{"input_tokens":{"description":"Number of tokens in the input.","type":"integer"},"input_tokens_details":{"description":"Breakdown of input tokens.","properties":{"cached_tokens":{"description":"Number of cached tokens.","type":"integer"},"reasoning_tokens":{"description":"Number of reasoning tokens.","type":"integer"}},"required":["cached_tokens"],"type":"object"},"output_tokens":{"description":"Number of tokens in the generated output.","type":"integer"},"output_tokens_details":{"description":"Breakdown of output tokens.","properties":{"cached_tokens":{"description":"Number of cached tokens.","type":"integer"},"reasoning_tokens":{"description":"Number of reasoning tokens.","type":"integer"}},"required":["cached_tokens","reasoning_tokens"],"type":"object"},"total_tokens":{"description":"Total number of tokens used in the request (input + output).","type":"integer"}},"required":["input_tokens","input_tokens_details","output_tokens","output_tokens_details","total_tokens"],"type":"object"},"TextResponseFormatJsonSchema":{"description":"JSON Schema response format. Used to generate structured JSON responses.\n","properties":{"description":{"description":"A description of what the response format is for, used by the model to determine how to respond in the format.\n","type":"string"},"name":{"description":"The name of the response format. 
Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.\n","type":"string"},"schema":{"additionalProperties":true,"description":"The schema for the response format, described as a JSON Schema object.\n","type":"object"},"strict":{"default":false,"description":"Whether to enable strict schema adherence when generating the output.\n","nullable":true,"type":"boolean"},"type":{"description":"The type of response format being defined. Always `json_schema`.","enum":["json_schema"],"type":"string","x-stainless-const":true}},"required":["type","name","schema"],"title":"JSON schema","type":"object"},"TranscriptionSegment":{"properties":{"audio_text":{"description":"Transcribed text for the entire chunk","type":"string"},"avg_logprob":{"description":"Average logprob of the segment. If the value is lower than -1, consider the logprobs failed.","format":"float","type":"number"},"chunk_end":{"description":"End chunk timestamp.","type":"integer"},"chunk_start":{"description":"Start chunk timestamp.","type":"integer"},"compression_ratio":{"description":"Compression ratio of the segment. If the value is greater than 2.4, consider the compression failed.","format":"float","type":"number"},"end":{"description":"End time of the segment in seconds.","format":"float","type":"number"},"id":{"description":"Unique identifier of the segment.","type":"integer"},"no_speech_prob":{"description":"Probability of no speech in the segment. 
If the value is higher than 1.0 and the `avg_logprob` is below -1, consider this segment silent.","format":"float","type":"number"},"seek":{"description":"Seek offset of the segment.","type":"integer"},"start":{"description":"Start time of the segment in seconds.","format":"float","type":"number"},"temperature":{"description":"Temperature parameter used for generating the segment.","format":"float","type":"number"},"text":{"description":"Text content of the segment.","type":"string"},"token_scores":{"description":"Token-level scores.","items":{"format":"float","type":"number"},"type":"array"},"tokens":{"description":"Array of token IDs for the text content.","items":{"type":"integer"},"type":"array"}},"required":["id","seek","start","end","text","tokens"],"type":"object"},"TranscriptionWord":{"properties":{"end":{"description":"End time of the word in seconds.","format":"float","type":"number"},"start":{"description":"Start time of the word in seconds.","format":"float","type":"number"},"word":{"description":"The text content of the word.","type":"string"}},"required":["word","start","end"],"type":"object"},"XGroq":{"description":"Groq-specific metadata for streaming responses. Different fields appear in different chunks.","properties":{"debug":{"allOf":[{"$ref":"#/components/schemas/DebugData"}],"nullable":true},"error":{"description":"An error string indicating why a stream was stopped early.","nullable":true,"type":"string"},"id":{"description":"A groq request ID which can be used to refer to a specific request to groq support.\nSent only in the first and final chunk.\n","nullable":true,"type":"string"},"seed":{"description":"The seed used for the request. Sent in the final chunk.","nullable":true,"type":"integer"},"usage":{"allOf":[{"$ref":"#/components/schemas/CompletionUsage"}],"description":"Usage information for the stream. 
Only sent in the final chunk.","nullable":true},"usage_breakdown":{"allOf":[{"$ref":"#/components/schemas/ChatCompletionUsageBreakdown"}],"description":"Detailed usage breakdown by model when multiple models are used in the request for compound AI systems. Only sent in the final chunk.","nullable":true}},"type":"object"},"XGroqNonStreaming":{"description":"Groq-specific metadata for non-streaming chat completion responses.","properties":{"debug":{"allOf":[{"$ref":"#/components/schemas/DebugData"}],"nullable":true},"id":{"description":"A groq request ID which can be used to refer to a specific request to groq support.","type":"string"},"seed":{"description":"The seed used for the request. See the seed property on CreateChatCompletionRequest for more details.","nullable":true,"type":"integer"},"usage":{"description":"Additional Groq-specific usage metrics (hardware cache statistics).","nullable":true,"properties":{"dram_cached_tokens":{"description":"Number of tokens served from DRAM cache.","type":"integer"},"sram_cached_tokens":{"description":"Number of tokens served from SRAM cache.","type":"integer"}},"type":"object"}},"required":["id"],"type":"object"}},"securitySchemes":{"api_key":{"bearerFormat":"apiKey","scheme":"bearer","type":"http"}}},"info":{"contact":{"email":"support@groq.com","name":"Groq Support"},"description":"Specification of the Groq cloud API","termsOfService":"https://groq.com/terms-of-use/","title":"GroqCloud API","version":"2.1"},"openapi":"3.0.1","paths":{"/openai/v1/audio/speech":{"post":{"operationId":"createSpeech","requestBody":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateSpeechRequest"}}},"required":true},"responses":{"200":{"content":{"audio/wav":{"schema":{"format":"binary","type":"string"}}},"description":"OK","headers":{"Transfer-Encoding":{"description":"chunked","schema":{"type":"string"}}}}},"summary":"Generates audio from the input 
text.","tags":["Audio"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/audio/speech \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"model\": \"playai-tts\",\n \"input\": \"I love building and shipping new features for our users!\",\n \"voice\": \"Fritz-PlayAI\",\n \"response_format\": \"wav\"\n }'\n","js":"import fs from \"fs\";\nimport path from \"path\";\nimport Groq from 'groq-sdk';\n\nconst groq = new Groq({\n apiKey: process.env.GROQ_API_KEY\n});\n\nconst speechFilePath = \"speech.wav\";\nconst model = \"playai-tts\";\nconst voice = \"Fritz-PlayAI\";\nconst text = \"I love building and shipping new features for our users!\";\nconst responseFormat = \"wav\";\n\nasync function main() {\n const response = await groq.audio.speech.create({\n model: model,\n voice: voice,\n input: text,\n response_format: responseFormat\n });\n\n const buffer = Buffer.from(await response.arrayBuffer());\n await fs.promises.writeFile(speechFilePath, buffer);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(api_key=os.environ.get(\"GROQ_API_KEY\"))\n\nspeech_file_path = \"speech.wav\"\nmodel = \"playai-tts\"\nvoice = \"Fritz-PlayAI\"\ntext = \"I love building and shipping new features for our users!\"\nresponse_format = \"wav\"\n\nresponse = client.audio.speech.create(\n model=model,\n voice=voice,\n input=text,\n response_format=response_format\n)\n\nresponse.write_to_file(speech_file_path)\n"},"title":"Default"}],"returns":"Returns an audio file in `wav` format."}}},"/openai/v1/audio/transcriptions":{"post":{"operationId":"createTranscription","requestBody":{"content":{"multipart/form-data":{"schema":{"$ref":"#/components/schemas/CreateTranscriptionRequest"}}},"required":true},"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateTranscriptionResponseJson"}}},"description":"OK"}},"summary":"Transcribes audio into the 
input language.","tags":["Audio"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/audio/transcriptions \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: multipart/form-data\" \\\n -F file=\"@./sample_audio.m4a\" \\\n -F model=\"whisper-large-v3\"\n","js":"import fs from \"fs\";\nimport Groq from \"groq-sdk\";\n\nconst groq = new Groq();\nasync function main() {\n const transcription = await groq.audio.transcriptions.create({\n file: fs.createReadStream(\"sample_audio.m4a\"),\n model: \"whisper-large-v3\",\n prompt: \"Specify context or spelling\", // Optional\n response_format: \"json\", // Optional\n language: \"en\", // Optional\n temperature: 0.0, // Optional\n });\n console.log(transcription.text);\n}\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq()\nfilename = os.path.dirname(__file__) + \"/sample_audio.m4a\"\n\nwith open(filename, \"rb\") as file:\n transcription = client.audio.transcriptions.create(\n file=(filename, file.read()),\n model=\"whisper-large-v3\",\n prompt=\"Specify context or spelling\", # Optional\n response_format=\"json\", # Optional\n language=\"en\", # Optional\n temperature=0.0 # Optional\n )\n print(transcription.text)\n"},"response":"{\n \"text\": \"Your transcribed text appears here…\",\n \"x_groq\": {\n \"id\": \"req_unique_id\"\n }\n}\n","title":"Default"}],"returns":"Returns an audio transcription object."}}},"/openai/v1/audio/translations":{"post":{"operationId":"createTranslation","requestBody":{"content":{"multipart/form-data":{"schema":{"$ref":"#/components/schemas/CreateTranslationRequest"}}},"required":true},"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateTranslationResponseJson"}},"text/plain":{"schema":{"type":"string"}}},"description":"OK"}},"summary":"Translates audio into English.","tags":["Audio"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl 
https://api.groq.com/openai/v1/audio/translations \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: multipart/form-data\" \\\n -F file=\"@./sample_audio.m4a\" \\\n -F model=\"whisper-large-v3\"\n","js":"// Default\nimport fs from \"fs\";\nimport Groq from \"groq-sdk\";\n\nconst groq = new Groq();\nasync function main() {\n const translation = await groq.audio.translations.create({\n file: fs.createReadStream(\"sample_audio.m4a\"),\n model: \"whisper-large-v3\",\n prompt: \"Specify context or spelling\", // Optional\n response_format: \"json\", // Optional\n temperature: 0.0, // Optional\n });\n console.log(translation.text);\n}\nmain();\n","py":"# Default\nimport os\nfrom groq import Groq\n\nclient = Groq()\nfilename = os.path.dirname(__file__) + \"/sample_audio.m4a\"\n\nwith open(filename, \"rb\") as file:\n translation = client.audio.translations.create(\n file=(filename, file.read()),\n model=\"whisper-large-v3\",\n prompt=\"Specify context or spelling\", # Optional\n response_format=\"json\", # Optional\n temperature=0.0 # Optional\n )\n print(translation.text)\n"},"response":"{\n \"text\": \"Your translated text appears here…\",\n \"x_groq\": {\n \"id\": \"req_unique_id\"\n }\n}\n","title":"Default"}],"returns":"Returns an audio translation object."}}},"/openai/v1/batches":{"get":{"operationId":"listBatches","responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/ListBatchesResponse"}}},"description":"Batch listed successfully."}},"summary":"List your organization’s batches.","tags":["Batch"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/batches \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\"\n","js":"import Groq from 'groq-sdk';\n\nconst client = new Groq({\n apiKey: process.env['GROQ_API_KEY'], // This is the default and can be omitted\n});\n\nasync function main() {\n const batchList = await client.batches.list();\n 
console.log(batchList.data);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n api_key=os.environ.get(\"GROQ_API_KEY\"), # This is the default and can be omitted\n)\nbatch_list = client.batches.list()\nprint(batch_list.data)\n"},"response":"{\n \"object\": \"list\",\n \"data\": [\n {\n \"id\": \"batch_01jh6xa7reempvjyh6n3yst2zw\",\n \"object\": \"batch\",\n \"endpoint\": \"/v1/chat/completions\",\n \"errors\": null,\n \"input_file_id\": \"file_01jh6x76wtemjr74t1fh0faj5t\",\n \"completion_window\": \"24h\",\n \"status\": \"validating\",\n \"output_file_id\": null,\n \"error_file_id\": null,\n \"finalizing_at\": null,\n \"failed_at\": null,\n \"expired_at\": null,\n \"cancelled_at\": null,\n \"request_counts\": {\n \"total\": 0,\n \"completed\": 0,\n \"failed\": 0\n },\n \"metadata\": null,\n \"created_at\": 1736472600,\n \"expires_at\": 1736559000,\n \"cancelling_at\": null,\n \"completed_at\": null,\n \"in_progress_at\": null\n }\n ]\n}\n","title":"Default"}],"returns":"A list of batches"}},"post":{"operationId":"createBatch","requestBody":{"content":{"application/json":{"schema":{"properties":{"completion_window":{"description":"The time frame within which the batch should be processed. Durations from `24h` to `7d` are supported.","type":"string"},"endpoint":{"description":"The endpoint to be used for all requests in the batch. Currently `/v1/chat/completions` is supported.","enum":["/v1/chat/completions"],"type":"string"},"input_file_id":{"description":"The ID of an uploaded file that contains requests for the new batch.\n\nSee upload file for how to upload a file.\n\nYour input file must be formatted as a JSONL file, and must be uploaded with the purpose `batch`. 
The file can be up to 100 MB in size.\n","type":"string"},"metadata":{"additionalProperties":{"type":"string"},"description":"Optional custom metadata for the batch.","nullable":true,"type":"object"}},"required":["input_file_id","endpoint","completion_window"],"type":"object"}}},"required":true},"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/Batch"}}},"description":"Batch created successfully."}},"summary":"Creates and executes a batch from an uploaded file of requests. Learn more.","tags":["Batch"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/batches \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"input_file_id\": \"file_01jh6x76wtemjr74t1fh0faj5t\",\n \"endpoint\": \"/v1/chat/completions\",\n \"completion_window\": \"24h\"\n }'\n","js":"import Groq from 'groq-sdk';\n\nconst client = new Groq({\n apiKey: process.env['GROQ_API_KEY'], // This is the default and can be omitted\n});\n\nasync function main() {\n const batch = await client.batches.create({\n completion_window: \"24h\",\n endpoint: \"/v1/chat/completions\",\n input_file_id: \"file_01jh6x76wtemjr74t1fh0faj5t\",\n });\n console.log(batch.id);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n api_key=os.environ.get(\"GROQ_API_KEY\"), # This is the default and can be omitted\n)\nbatch = client.batches.create(\n completion_window=\"24h\",\n endpoint=\"/v1/chat/completions\",\n input_file_id=\"file_01jh6x76wtemjr74t1fh0faj5t\",\n)\nprint(batch.id)\n"},"response":"{\n \"id\": \"batch_01jh6xa7reempvjyh6n3yst2zw\",\n \"object\": \"batch\",\n \"endpoint\": \"/v1/chat/completions\",\n \"errors\": null,\n \"input_file_id\": \"file_01jh6x76wtemjr74t1fh0faj5t\",\n \"completion_window\": \"24h\",\n \"status\": \"validating\",\n \"output_file_id\": null,\n \"error_file_id\": null,\n \"finalizing_at\": null,\n \"failed_at\": null,\n \"expired_at\": 
null,\n \"cancelled_at\": null,\n \"request_counts\": {\n \"total\": 0,\n \"completed\": 0,\n \"failed\": 0\n },\n \"metadata\": null,\n \"created_at\": 1736472600,\n \"expires_at\": 1736559000,\n \"cancelling_at\": null,\n \"completed_at\": null,\n \"in_progress_at\": null\n}\n","title":"Default"}],"returns":"A created batch object."}}},"/openai/v1/batches/{batch_id}":{"get":{"operationId":"retrieveBatch","parameters":[{"description":"The ID of the batch to retrieve.","in":"path","name":"batch_id","required":true,"schema":{"type":"string"}}],"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/Batch"}}},"description":"Batch retrieved successfully."}},"summary":"Retrieves a batch.","tags":["Batch"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/batches/batch_01jh6xa7reempvjyh6n3yst2zw \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\"\n","js":"import Groq from 'groq-sdk';\n\nconst client = new Groq({\n apiKey: process.env['GROQ_API_KEY'], // This is the default and can be omitted\n});\n\nasync function main() {\n const batch = await client.batches.retrieve(\"batch_01jh6xa7reempvjyh6n3yst2zw\");\n console.log(batch.id);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n api_key=os.environ.get(\"GROQ_API_KEY\"), # This is the default and can be omitted\n)\nbatch = client.batches.retrieve(\n \"batch_01jh6xa7reempvjyh6n3yst2zw\",\n)\nprint(batch.id)\n"},"response":"{\n \"id\": \"batch_01jh6xa7reempvjyh6n3yst2zw\",\n \"object\": \"batch\",\n \"endpoint\": \"/v1/chat/completions\",\n \"errors\": null,\n \"input_file_id\": \"file_01jh6x76wtemjr74t1fh0faj5t\",\n \"completion_window\": \"24h\",\n \"status\": \"validating\",\n \"output_file_id\": null,\n \"error_file_id\": null,\n \"finalizing_at\": null,\n \"failed_at\": null,\n \"expired_at\": null,\n \"cancelled_at\": null,\n \"request_counts\": {\n \"total\": 0,\n \"completed\": 
0,\n \"failed\": 0\n },\n \"metadata\": null,\n \"created_at\": 1736472600,\n \"expires_at\": 1736559000,\n \"cancelling_at\": null,\n \"completed_at\": null,\n \"in_progress_at\": null\n}\n","title":"Default"}],"returns":"A batch object."}}},"/openai/v1/batches/{batch_id}/cancel":{"post":{"operationId":"cancelBatch","parameters":[{"description":"The ID of the batch to cancel.","in":"path","name":"batch_id","required":true,"schema":{"type":"string"}}],"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/Batch"}}},"description":"Batch cancelled successfully."}},"summary":"Cancels a batch.","tags":["Batch"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl -X POST https://api.groq.com/openai/v1/batches/batch_01jh6xa7reempvjyh6n3yst2zw/cancel \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\"\n","js":"import Groq from 'groq-sdk';\n\nconst client = new Groq({\n apiKey: process.env['GROQ_API_KEY'], // This is the default and can be omitted\n});\n\nasync function main() {\n const batch = await client.batches.cancel(\"batch_01jh6xa7reempvjyh6n3yst2zw\");\n console.log(batch.id);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n api_key=os.environ.get(\"GROQ_API_KEY\"), # This is the default and can be omitted\n)\nbatch = client.batches.cancel(\n \"batch_01jh6xa7reempvjyh6n3yst2zw\",\n)\nprint(batch.id)\n"},"response":"{\n \"id\": \"batch_01jh6xa7reempvjyh6n3yst2zw\",\n \"object\": \"batch\",\n \"endpoint\": \"/v1/chat/completions\",\n \"errors\": null,\n \"input_file_id\": \"file_01jh6x76wtemjr74t1fh0faj5t\",\n \"completion_window\": \"24h\",\n \"status\": \"cancelling\",\n \"output_file_id\": null,\n \"error_file_id\": null,\n \"finalizing_at\": null,\n \"failed_at\": null,\n \"expired_at\": null,\n \"cancelled_at\": null,\n \"request_counts\": {\n \"total\": 0,\n \"completed\": 0,\n \"failed\": 0\n },\n \"metadata\": null,\n \"created_at\": 1736472600,\n 
\"expires_at\": 1736559000,\n \"cancelling_at\": null,\n \"completed_at\": null,\n \"in_progress_at\": null\n}\n","title":"Default"}],"returns":"A batch object."}}},"/openai/v1/chat/completions":{"post":{"operationId":"createChatCompletion","requestBody":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateChatCompletionRequest"}}},"description":"The chat prompt and parameters","required":true},"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateChatCompletionResponse"}}},"description":"OK"}},"summary":"Creates a model response for the given chat conversation.","tags":["Chat"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/chat/completions -s \\\n-H \"Content-Type: application/json\" \\\n-H \"Authorization: Bearer $GROQ_API_KEY\" \\\n-d ‘{\n \"model\": \"llama-3.3-70b-versatile\",\n \"messages\": [{\n \"role\": \"user\",\n \"content\": \"Explain the importance of fast language models\"\n }]\n}’\n","js":"import Groq from \"groq-sdk\";\n\nconst groq = new Groq({ apiKey: process.env.GROQ_API_KEY });\n\nasync function main() {\n const completion = await groq.chat.completions\n .create({\n messages: [\n {\n role: \"user\",\n content: \"Explain the importance of fast language models\",\n },\n ],\n model: \"llama-3.3-70b-versatile\",\n })\n console.log(completion.choices[0].message.content);\n}\n\nmain();\n","py":"import os\n\nfrom groq import Groq\n\nclient = Groq(\n # This is the default and can be omitted\n api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\nchat_completion = client.chat.completions.create(\n messages=[\n {\n \"role\": \"system\",\n \"content\": \"You are a helpful assistant.\"\n },\n {\n \"role\": \"user\",\n \"content\": \"Explain the importance of fast language models\",\n }\n ],\n model=\"llama-3.3-70b-versatile\",\n)\n\nprint(chat_completion.choices[0].message.content)\n"},"response":"$40","title":"Default"}],"returns":"Returns a chat completion 
object, or a streamed sequence of chat completion chunk objects if the request is streamed."}}},"/openai/v1/embeddings":{"post":{"operationId":"createEmbedding","requestBody":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateEmbeddingRequest"}}},"required":true},"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateEmbeddingResponse"}}},"description":"OK"}},"summary":"Creates an embedding vector representing the input text.","tags":["Embeddings"]}},"/openai/v1/files":{"get":{"operationId":"listFiles","responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/ListFilesResponse"}}},"description":"OK"}},"summary":"Returns a list of files.","tags":["Files"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/files \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\"\n","js":"import Groq from 'groq-sdk';\n\nconst client = new Groq({\n apiKey: process.env['GROQ_API_KEY'], // This is the default and can be omitted\n});\n\nasync function main() {\n const fileList = await client.files.list();\n console.log(fileList.data);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n api_key=os.environ.get(\"GROQ_API_KEY\"), # This is the default and can be omitted\n)\nfile_list = client.files.list()\nprint(file_list.data)\n"},"response":"{\n \"object\": \"list\",\n \"data\": [\n {\n \"id\": \"file_01jh6x76wtemjr74t1fh0faj5t\",\n \"object\": \"file\",\n \"bytes\": 966,\n \"created_at\": 1736472501,\n \"filename\": \"batch_file.jsonl\",\n \"purpose\": \"batch\"\n }\n ]\n}\n","title":"Default"}],"returns":"A list of File 
objects."}},"post":{"operationId":"uploadFile","requestBody":{"content":{"multipart/form-data":{"schema":{"$ref":"#/components/schemas/CreateFileRequest"}}},"required":true},"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/File"}}},"description":"OK"}},"summary":"Upload a file that can be used across various endpoints.\n\nThe Batch API only supports `.jsonl` files up to 100 MB in size. The input also has a specific required format.\n\nPlease contact us if you need to increase these storage limits.\n","tags":["Files"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/files \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -F purpose=\"batch\" \\\n -F \"file=@batch_file.jsonl\"\n","js":"import Groq from 'groq-sdk';\n\nconst client = new Groq({\n apiKey: process.env['GROQ_API_KEY'], // This is the default and can be omitted\n});\n\nconst fileContent = '{\"custom_id\": \"request-1\", \"method\": \"POST\", \"url\": \"/v1/chat/completions\", \"body\": {\"model\": \"llama-3.1-8b-instant\", \"messages\": [{\"role\": \"user\", \"content\": \"Explain the importance of fast language models\"}]}}\\n';\n\nasync function main() {\n const blob = new Blob([fileContent]);\n const file = new File([blob], 'batch.jsonl');\n\n const createdFile = await client.files.create({ file: file, purpose: 'batch' });\n console.log(createdFile.id);\n}\n\nmain();\n","py":"import os\nimport requests # pip install requests first!\n\ndef upload_file_to_groq(api_key, file_path):\n url = \"https://api.groq.com/openai/v1/files\"\n\n headers = {\n \"Authorization\": f\"Bearer {api_key}\"\n }\n\n # Prepare the file and form data\n files = {\n \"file\": (\"batch_file.jsonl\", open(file_path, \"rb\"))\n }\n\n data = {\n \"purpose\": \"batch\"\n }\n\n # Make the POST request\n response = requests.post(url, headers=headers, files=files, data=data)\n\n return response.json()\n\n# Usage example\napi_key = 
os.environ.get(\"GROQ_API_KEY\")\nfile_path = \"batch_file.jsonl\" # Path to your JSONL file\n\ntry:\n result = upload_file_to_groq(api_key, file_path)\n print(result)\nexcept Exception as e:\n print(f\"Error: {e}\")\n"},"response":"{\n \"id\": \"file_01jh6x76wtemjr74t1fh0faj5t\",\n \"object\": \"file\",\n \"bytes\": 966,\n \"created_at\": 1736472501,\n \"filename\": \"batch_file.jsonl\",\n \"purpose\": \"batch\"\n}\n","title":"Default"}],"returns":"The uploaded File object."}}},"/openai/v1/files/{file_id}":{"delete":{"operationId":"deleteFile","parameters":[{"description":"The ID of the file to use for this request.","in":"path","name":"file_id","required":true,"schema":{"type":"string"}}],"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/DeleteFileResponse"}}},"description":"OK"}},"summary":"Delete a file.","tags":["Files"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl -X DELETE https://api.groq.com/openai/v1/files/file_01jh6x76wtemjr74t1fh0faj5t \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\"\n","js":"import Groq from 'groq-sdk';\n\nconst client = new Groq({\n apiKey: process.env['GROQ_API_KEY'], // This is the default and can be omitted\n});\n\nasync function main() {\n const fileDelete = await client.files.delete(\"file_01jh6x76wtemjr74t1fh0faj5t\");\n console.log(fileDelete);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n api_key=os.environ.get(\"GROQ_API_KEY\"), # This is the default and can be omitted\n)\nfile_delete = client.files.delete(\n \"file_01jh6x76wtemjr74t1fh0faj5t\",\n)\nprint(file_delete)\n"},"response":"{\n \"id\": \"file_01jh6x76wtemjr74t1fh0faj5t\",\n \"object\": \"file\",\n \"deleted\": true\n}\n","title":"Default"}],"returns":"A deleted file response object."}},"get":{"operationId":"retrieveFile","parameters":[{"description":"The file to 
retrieve","in":"path","name":"file_id","required":true,"schema":{"type":"string"}}],"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/File"}}},"description":"OK"}},"summary":"Returns information about a file.","tags":["Files"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/files/file_01jh6x76wtemjr74t1fh0faj5t \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\"\n","js":"import Groq from 'groq-sdk';\n\nconst client = new Groq({\n apiKey: process.env['GROQ_API_KEY'], // This is the default and can be omitted\n});\n\nasync function main() {\n const file = await client.files.info('file_01jh6x76wtemjr74t1fh0faj5t');\n console.log(file);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n api_key=os.environ.get(\"GROQ_API_KEY\"), # This is the default and can be omitted\n)\nfile = client.files.info(\n \"file_01jh6x76wtemjr74t1fh0faj5t\",\n)\nprint(file)\n"},"response":"{\n \"id\": \"file_01jh6x76wtemjr74t1fh0faj5t\",\n \"object\": \"file\",\n \"bytes\": 966,\n \"created_at\": 1736472501,\n \"filename\": \"batch_file.jsonl\",\n \"purpose\": \"batch\"\n}\n","title":"Default"}],"returns":"A file object."}}},"/openai/v1/files/{file_id}/content":{"get":{"operationId":"downloadFile","parameters":[{"description":"The ID of the file to use for this request.","in":"path","name":"file_id","required":true,"schema":{"type":"string"}}],"responses":{"200":{"content":{"application/octet-stream":{"schema":{"format":"binary","type":"string"}}},"description":"OK"}},"summary":"Returns the contents of the specified file.","tags":["Files"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/files/file_01jh6x76wtemjr74t1fh0faj5t/content \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\"\n","js":"import Groq from 'groq-sdk';\n\nconst client = new Groq({\n apiKey: 
process.env['GROQ_API_KEY'], // This is the default and can be omitted\n});\n\nasync function main() {\n const response = await client.files.content('file_01jh6x76wtemjr74t1fh0faj5t');\n console.log(response);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n api_key=os.environ.get(\"GROQ_API_KEY\"), # This is the default and can be omitted\n)\nresponse = client.files.content(\n \"file_01jh6x76wtemjr74t1fh0faj5t\",\n)\nprint(response)\n"},"title":"Default"}],"returns":"The file content"}}},"/openai/v1/models":{"get":{"description":"get all available models","operationId":"listModels","responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/ListModelsResponse"}}},"description":"OK"}},"summary":"List all available models.","tags":["Models"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/models \\\n-H \"Authorization: Bearer $GROQ_API_KEY\"\n","js":"import Groq from \"groq-sdk\";\n\nconst groq = new Groq({ apiKey: process.env.GROQ_API_KEY });\n\nasync function main() {\n const models = await groq.models.list();\n console.log(models);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n # This is the default and can be omitted\n api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\nmodels = client.models.list()\n\nprint(models)\n"},"response":"$41","title":"Default"}],"returns":"A list of model objects."}}},"/openai/v1/models/{model}":{"delete":{"description":"Delete a model","operationId":"deleteModel","parameters":[{"description":"The model to delete","in":"path","name":"model","required":true,"schema":{"type":"string"}}],"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/DeleteModelResponse"}}},"description":"OK"}},"summary":"Delete model","tags":["Models"]},"get":{"description":"Get a specific model","operationId":"retrieveModel","parameters":[{"description":"The model to 
get","in":"path","name":"model","required":true,"schema":{"type":"string"}}],"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/Model"}}},"description":"OK"}},"summary":"Get detailed information about a model.","tags":["Models"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/models/llama-3.3-70b-versatile \\\n-H \"Authorization: Bearer $GROQ_API_KEY\"\n","js":"import Groq from \"groq-sdk\";\n\nconst groq = new Groq({ apiKey: process.env.GROQ_API_KEY });\n\nasync function main() {\n const model = await groq.models.retrieve(\"llama-3.3-70b-versatile\");\n console.log(model);\n}\n\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(\n # This is the default and can be omitted\n api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\nmodel = client.models.retrieve(\"llama-3.3-70b-versatile\")\n\nprint(model)\n"},"response":"{\n \"id\": \"llama3-8b-8192\",\n \"object\": \"model\",\n \"created\": 1693721698,\n \"owned_by\": \"Meta\",\n \"active\": true,\n \"context_window\": 8192,\n \"public_apps\": null,\n \"max_completion_tokens\": 8192\n}\n","title":"Default"}],"returns":"A model object."}}},"/openai/v1/reranking":{"post":{"description":"Given a query and a list of documents, returns the documents ranked by their relevance to the query.\nThe documents are scored and sorted in descending order of relevance.\n","operationId":"createReranking","requestBody":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/RerankingRequest"}}},"required":true},"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/RerankingResponse"}}},"description":"OK"}},"summary":"Reranks documents based on their relevance to a query.","tags":["Reranking"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/reranking \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -H \"Content-Type: application/json\" \\\n -d ‘{\n 
\"model\": \"qwen3-reranker-4b\",\n \"query\": \"artificial intelligence\",\n \"docs\": [\n \"Machine learning is a subset of AI\",\n \"The weather is nice today\",\n \"Deep learning uses neural networks\"\n ]\n }’\n","js":"import Groq from ‘groq-sdk’;\n\nconst client = new Groq({\n apiKey: process.env[‘GROQ_API_KEY’],\n});\n\nasync function main() {\n const reranking = await client.reranking.create({\n model: ‘qwen3-reranker-4b’,\n query: ‘artificial intelligence’,\n docs: [\n ‘Machine learning is a subset of AI’,\n ‘The weather is nice today’,\n ‘Deep learning uses neural networks’\n ]\n });\n console.log(reranking.results);\n}\nmain();\n","py":"import os\nfrom groq import Groq\n\nclient = Groq(api_key=os.environ.get(\"GROQ_API_KEY\"))\n\nreranking = client.reranking.create(\n model=\"qwen3-reranker-4b\",\n query=\"artificial intelligence\",\n docs=[\n \"Machine learning is a subset of AI\",\n \"The weather is nice today\", \n \"Deep learning uses neural networks\"\n ]\n)\nprint(reranking.results)\n"},"response":"{\n \"results\": [\n {\n \"doc\": \"Machine learning is a subset of AI\",\n \"score\": 0.92\n },\n {\n \"doc\": \"Deep learning uses neural networks\", \n \"score\": 0.87\n },\n {\n \"doc\": \"The weather is nice today\",\n \"score\": 0.23\n }\n ]\n}\n","title":"Basic Reranking"},{"request":{"json":"{\n \"model\": \"qwen3-reranker-4b\",\n \"query\": \"climate change effects\",\n \"docs\": [\n \"Global warming causes sea level rise\",\n \"Electric cars reduce emissions\",\n \"Renewable energy is growing fast\"\n ],\n \"instruction\": \"Find documents specifically about environmental impacts\"\n}\n"},"response":"{\n \"results\": [\n {\n \"doc\": \"Global warming causes sea level rise\",\n \"score\": 0.95\n },\n {\n \"doc\": \"Electric cars reduce emissions\",\n \"score\": 0.78\n },\n {\n \"doc\": \"Renewable energy is growing fast\",\n \"score\": 0.65\n }\n ]\n}\n","title":"Reranking with Custom Instruction"}],"returns":"A list of documents sorted by 
relevance score in descending order. \nScores range from 0.0 to 1.0, where higher scores indicate greater relevance to the query.\n"}}},"/openai/v1/responses":{"post":{"operationId":"createResponse","requestBody":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateResponseRequest"}}},"description":"The input prompt and parameters","required":true},"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateResponseResponse"}}},"description":"OK"}},"summary":"Creates a model response for the given input.","tags":["Responses"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/openai/v1/responses -s \\\n-H \"Content-Type: application/json\" \\\n-H \"Authorization: Bearer $GROQ_API_KEY\" \\\n-d ‘{\n \"model\": \"openai/gpt-oss-120b\",\n \"input\": \"Tell me a three sentence bedtime story about a unicorn.\"\n}’\n"},"response":"$42","title":"Default"}],"returns":"Returns a response object, or a streamed sequence of response events if the request is streamed."}}},"/v1/fine_tunings":{"get":{"operationId":"listFineTunings","responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/ListFineTuningsResponse"}}},"description":"OK"}},"summary":"Lists all previously created fine tunings. This endpoint is in closed beta. 
Contact us for more information.","tags":["Fine Tuning"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/v1/fine_tunings -s \\\n -H \"Content-Type: application/json\" \\\n -H \"Authorization: Bearer $GROQ_API_KEY\"\n","js":"import Groq from \"groq-sdk\";\n\nconst groq = new Groq({ apiKey: process.env.GROQ_API_KEY });\n\nasync function main() {\n const fineTunings = await groq.fine_tunings.list();\n console.log(fineTunings);\n}\n\nmain();\n","py":"import os\n\nfrom groq import Groq\n\nclient = Groq(\n # This is the default and can be omitted\n api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\nfine_tunings = client.fine_tunings.list()\n\nprint(fine_tunings)\n"},"response":"{\n \"object\": \"list\",\n \"data\": [\n {\n \"id\": \"string\",\n \"name\": \"string\",\n \"base_model\": \"string\",\n \"type\": \"string\",\n \"input_file_id\": \"string\",\n \"created_at\": 0,\n \"fine_tuned_model\": \"string\"\n }\n ]\n}\n","title":"Default"}],"returns":"The list of fine tunes"}},"post":{"operationId":"createFineTuning","requestBody":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/CreateFineTuningRequest"}}}},"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/ReadFineTuningResponse"}}},"description":"OK"}},"summary":"Creates a new fine tuning for the already uploaded files This endpoint is in closed beta. 
Contact us for more information.","tags":["Fine Tuning"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/v1/fine_tunings -s \\\n -H \"Content-Type: application/json\" \\\n -H \"Authorization: Bearer $GROQ_API_KEY\" \\\n -d ‘{\n \"input_file_id\": \"\u003cfile-id\u003e\",\n \"name\": \"test-1\",\n \"type\": \"lora\",\n \"base_model\": \"llama-3.1-8b-instant\"\n }’\n","js":"import Groq from \"groq-sdk\";\n\nconst groq = new Groq({ apiKey: process.env.GROQ_API_KEY });\n\nasync function main() {\n const fineTunings = await groq.fine_tunings.create({\n input_file_id: \"\u003cfile-id\u003e\",\n name: \"test-1\",\n type: \"lora\",\n base_model: \"llama-3.1-8b-instant\"\n });\n console.log(fineTunings);\n}\n\nmain();\n","py":"import os\n\nfrom groq import Groq\n\nclient = Groq(\n # This is the default and can be omitted\n api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\nfine_tunings = client.fine_tunings.create(\n input_file_id=\"\u003cfile-id\u003e\",\n name=\"test-1\",\n type=\"lora\",\n base_model=\"llama-3.1-8b-instant\"\n)\n\nprint(fine_tunings)\n"},"response":"{\n \"id\": \"string\",\n \"object\": \"object\",\n \"data\": {\n \"id\": \"string\",\n \"name\": \"string\",\n \"base_model\": \"string\",\n \"type\": \"string\",\n \"input_file_id\": \"string\",\n \"created_at\": 0,\n \"fine_tuned_model\": \"string\"\n }\n}\n","title":"Default"}],"returns":"The newly created fine tune"}}},"/v1/fine_tunings/{id}":{"delete":{"operationId":"deleteFineTuning","parameters":[{"in":"path","name":"id","required":true,"schema":{"type":"string"}}],"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/DeleteFineTuningResponse"}}},"description":"OK"}},"summary":"Deletes an existing fine tuning by id This endpoint is in closed beta. 
Contact us for more information.","tags":["Fine Tuning"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl -X DELETE https://api.groq.com/v1/fine_tunings/:id -s \\\n -H \"Content-Type: application/json\" \\\n -H \"Authorization: Bearer $GROQ_API_KEY\"\n","js":"import Groq from \"groq-sdk\";\n\nconst groq = new Groq({ apiKey: process.env.GROQ_API_KEY });\n\nasync function main() {\n await groq.fine_tunings.delete({id: \"\u003cid\u003e\"});\n}\n\nmain();\n","py":"import os\n\nfrom groq import Groq\n\nclient = Groq(\n # This is the default and can be omitted\n api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\nclient.fine_tunings.delete(id=\"\u003cid\u003e\")\n"},"response":"{\n \"id\": \"string\",\n \"object\": \"fine_tuning\",\n \"deleted\": true\n}\n","title":"Default"}],"returns":"A confirmation of the deleted fine tune"}},"get":{"operationId":"getFineTuning","parameters":[{"in":"path","name":"id","required":true,"schema":{"type":"string"}}],"responses":{"200":{"content":{"application/json":{"schema":{"$ref":"#/components/schemas/ReadFineTuningResponse"}}},"description":"OK"}},"summary":"Retrieves an existing fine tuning by id This endpoint is in closed beta. 
Contact us for more information.","tags":["Fine Tuning"],"x-groq-metadata":{"examples":[{"request":{"curl":"curl https://api.groq.com/v1/fine_tunings/:id -s \\\n -H \"Content-Type: application/json\" \\\n -H \"Authorization: Bearer $GROQ_API_KEY\"\n","js":"import Groq from \"groq-sdk\";\n\nconst groq = new Groq({ apiKey: process.env.GROQ_API_KEY });\n\nasync function main() {\n const fineTuning = await groq.fine_tunings.get({id: \"\u003cid\u003e\"});\n console.log(fineTuning);\n}\n\nmain();\n","py":"import os\n\nfrom groq import Groq\n\nclient = Groq(\n # This is the default and can be omitted\n api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\nfine_tuning = client.fine_tunings.get(id=\"\u003cid\u003e\")\n\nprint(fine_tuning)\n"},"response":"{\n \"id\": \"string\",\n \"object\": \"object\",\n \"data\": {\n \"id\": \"string\",\n \"name\": \"string\",\n \"base_model\": \"string\",\n \"type\": \"string\",\n \"input_file_id\": \"string\",\n \"created_at\": 0,\n \"fine_tuned_model\": \"string\"\n }\n}\n","title":"Default"}],"returns":"A fine tune metadata object"}}}},"security":[{"api_key":[]}],"servers":[{"url":"https://api.groq.com"}],"x-groq-metadata":{"groups":[{"description":"","id":"chat","sections":[{"key":"createChatCompletion","path":"create","type":"endpoint"}],"title":"Chat","type":"endpoints"},{"description":"","id":"responses","sections":[{"key":"createResponse","path":"create","type":"endpoint"}],"title":"Responses 
(beta)","type":"endpoints"},{"description":"","id":"audio","sections":[{"key":"createTranscription","path":"transcription","type":"endpoint"},{"key":"createTranslation","path":"translation","type":"endpoint"},{"key":"createSpeech","path":"speech","type":"endpoint"}],"title":"Audio","type":"endpoints"},{"description":"","id":"models","sections":[{"key":"listModels","path":"list","type":"endpoint"},{"key":"retrieveModel","path":"retrieve","type":"endpoint"}],"title":"Models","type":"endpoints"},{"description":"","id":"batches","sections":[{"key":"createBatch","path":"create","type":"endpoint"},{"key":"retrieveBatch","path":"retrieve","type":"endpoint"},{"key":"listBatches","path":"list","type":"endpoint"},{"key":"cancelBatch","path":"cancel","type":"endpoint"}],"title":"Batches","type":"endpoints"},{"description":"","id":"files","sections":[{"key":"uploadFile","path":"upload","type":"endpoint"},{"key":"listFiles","path":"list","type":"endpoint"},{"key":"deleteFile","path":"delete","type":"endpoint"},{"key":"retrieveFile","path":"retrieve","type":"endpoint"},{"key":"downloadFile","path":"download","type":"endpoint"}],"title":"Files","type":"endpoints"},{"description":"","id":"fine-tuning","sections":[{"key":"listFineTunings","path":"list","type":"endpoint"},{"key":"createFineTuning","path":"create","type":"endpoint"},{"key":"getFineTuning","path":"get","type":"endpoint"},{"key":"deleteFineTuning","path":"delete","type":"endpoint"}],"title":"Fine 
Tuning","type":"endpoints"}]}},"children":"$L43"}]\n"])self.__next_f.push([1,“43:["$","$L5",null,{"parallelRouterKey":"children","error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","forbidden":"$undefined","unauthorized":"$undefined"}]\n”])self.__next_f.push([1,“44:I[77102,["9817","static/chunks/a80d2a2d-995f3588264c9768.js","4023","static/chunks/fba002aa-6277ce0237ac1df9.js","4620","static/chunks/4620-e1aaeee1f717953d.js","1583","static/chunks/1583-cd6cfde0c82061af.js","8172","static/chunks/8172-1bfde9a6a7f76d76.js","2845","static/chunks/2845-233d0b7733254462.js","6808","static/chunks/6808-c0c7b12c1b07a90a.js","1850","static/chunks/1850-70f7c54b75f154be.js","54","static/chunks/54-81fd6d34fa1b6f5c.js","3896","static/chunks/3896-85f929747408a502.js","6936","static/chunks/6936-39cb1754394917fe.js","6767","static/chunks/6767-da92d72583ce2ee5.js","1489","static/chunks/1489-14b8a5cb2f9ff04a.js","5183","static/chunks/5183-d52e57595b0209a4.js","9791","static/chunks/9791-73094ddf02612085.js","2376","static/chunks/2376-3cb5f78aefc53c11.js","4897","static/chunks/4897-15b235f72e0c3911.js","8932","static/chunks/8932-458d55afb25ec506.js","1530","static/chunks/1530-5c6e42810cf6f581.js","6672","static/chunks/6672-98691e627e9f180d.js","9867","static/chunks/9867-891ea7f43c3211da.js","1285","static/chunks/app/(console)/docs/(mdx-pages)/rate-limits/page-5c07fa1cd054cc3c.js"],"RateLimitsTableWrapper"]\n”])self.__next_f.push([1,“17:["$","$L44",null,{"freeRows":[["allam-2-7b","30","7K","6K","500K","-","-"],["canopylabs/orpheus-arabic-saudi","10","100","1.2K","3.6K","-","-"],["canopylabs/orpheus-v1-english","10","100","1.2K","3.6K","-","-"],["groq/compound","30","250","70K","-","-","-"],["groq/compound-mini","30","250","70K","-","-","-"],["llama-3.1-8b-instant","30","14.4K","6K","500K","-","-"],["llama-3.3-70b-versatile","30","1K","12K","100K","-","-"
],["meta-llama/llama-4-maverick-17b-128e-instruct","30","1K","6K","500K","-","-"],["meta-llama/llama-4-scout-17b-16e-instruct","30","1K","30K","500K","-","-"],["meta-llama/llama-guard-4-12b","30","14.4K","15K","500K","-","-"],["meta-llama/llama-prompt-guard-2-22m","30","14.4K","15K","500K","-","-"],["meta-llama/llama-prompt-guard-2-86m","30","14.4K","15K","500K","-","-"],["moonshotai/kimi-k2-instruct","60","1K","10K","300K","-","-"],["moonshotai/kimi-k2-instruct-0905","60","1K","10K","300K","-","-"],["openai/gpt-oss-120b","30","1K","8K","200K","-","-"],["openai/gpt-oss-20b","30","1K","8K","200K","-","-"],["openai/gpt-oss-safeguard-20b","30","1K","8K","200K","-","-"],["qwen/qwen3-32b","60","1K","6K","500K","-","-"],["whisper-large-v3","20","2K","-","-","7.2K","28.8K"],["whisper-large-v3-turbo","20","2K","-","-","7.2K","28.8K"]],"devRows":[["allam-2-7b","300","60K","60K","-","-","-"],["canopylabs/orpheus-arabic-saudi","250","100K","50K","-","-","-"],["canopylabs/orpheus-v1-english","250","100K","50K","-","-","-"],["groq/compound","200","20K","200K","-","-","-"],["groq/compound-mini","200","20K","200K","-","-","-"],["llama-3.1-8b-instant","1K","500K","250K","-","-","-"],["llama-3.3-70b-versatile","1K","500K","300K","-","-","-"],["meta-llama/llama-4-maverick-17b-128e-instruct","1K","500K","300K","-","-","-"],["meta-llama/llama-4-scout-17b-16e-instruct","1K","500K","300K","-","-","-"],["meta-llama/llama-guard-4-12b","100","50K","30K","1M","-","-"],["meta-llama/llama-prompt-guard-2-22m","100","50K","30K","-","-","-"],["meta-llama/llama-prompt-guard-2-86m","100","50K","30K","-","-","-"],["moonshotai/kimi-k2-instruct","1K","500K","250K","-","-","-"],["moonshotai/kimi-k2-instruct-0905","1K","500K","250K","-","-","-"],["openai/gpt-oss-120b","1K","500K","250K","-","-","-"],["openai/gpt-oss-20b","1K","500K","250K","-","-","-"],["openai/gpt-oss-safeguard-20b","1K","500K","150K","-","-","-"],["qwen/qwen3-32b","1K","500K","300K","-","-","-"],["whisper-large-v3","300","200K","-","-
","200K","4M"],["whisper-large-v3-turbo","400","200K","-","-","400K","4M"]],"headers":[{"title":"MODEL ID","className":"min-w-[300px]"},{"title":"RPM","tooltip":"Requests per minute","className":"min-w-10"},{"title":"RPD","tooltip":"Requests per day","className":"min-w-10"},{"title":"TPM","tooltip":"Tokens per minute","className":"min-w-10"},{"title":"TPD","tooltip":"Tokens per day","className":"min-w-10"},{"title":"ASH","tooltip":"Audio seconds per hour","className":"min-w-10"},{"title":"ASD","tooltip":"Audio seconds per day","className":"min-w-10"}]}]\n”])self.__next_f.push([1,“45:I[29583,["9817","static/chunks/a80d2a2d-995f3588264c9768.js","4023","static/chunks/fba002aa-6277ce0237ac1df9.js","1116","static/chunks/62399e41-d58e7f498626419e.js","6559","static/chunks/85434cc8-f36400b03831af1a.js","5375","static/chunks/07baea06-cb741fd4f7cf345b.js","4620","static/chunks/4620-e1aaeee1f717953d.js","1583","static/chunks/1583-cd6cfde0c82061af.js","8172","static/chunks/8172-1bfde9a6a7f76d76.js","2845","static/chunks/2845-233d0b7733254462.js","6808","static/chunks/6808-c0c7b12c1b07a90a.js","1489","static/chunks/1489-14b8a5cb2f9ff04a.js","5183","static/chunks/5183-d52e57595b0209a4.js","3436","static/chunks/3436-f8e1abf62b4293d2.js","8932","static/chunks/8932-458d55afb25ec506.js","7177","static/chunks/app/layout-d4a7aa9fe73a0e0a.js"],"PostHogClientProvider"]\n”])self.__next_f.push([1,“46:I[71817,["9817","static/chunks/a80d2a2d-995f3588264c9768.js","4023","static/chunks/fba002aa-6277ce0237ac1df9.js","1116","static/chunks/62399e41-d58e7f498626419e.js","6559","static/chunks/85434cc8-f36400b03831af1a.js","5375","static/chunks/07baea06-cb741fd4f7cf345b.js","4620","static/chunks/4620-e1aaeee1f717953d.js","1583","static/chunks/1583-cd6cfde0c82061af.js","8172","static/chunks/8172-1bfde9a6a7f76d76.js","2845","static/chunks/2845-233d0b7733254462.js","6808","static/chunks/6808-c0c7b12c1b07a90a.js","1489","static/chunks/1489-14b8a5cb2f9ff04a.js","5183","static/chunks/5183-d52e57595b0209a
4.js","3436","static/chunks/3436-f8e1abf62b4293d2.js","8932","static/chunks/8932-458d55afb25ec506.js","7177","static/chunks/app/layout-d4a7aa9fe73a0e0a.js"],"ClientProviders"]\n”])self.__next_f.push([1,“47:I[75034,["9817","static/chunks/a80d2a2d-995f3588264c9768.js","4620","static/chunks/4620-e1aaeee1f717953d.js","1583","static/chunks/1583-cd6cfde0c82061af.js","8039","static/chunks/app/error-2add48631670b850.js"],"default"]\n48:I[51660,["4620","static/chunks/4620-e1aaeee1f717953d.js","8172","static/chunks/8172-1bfde9a6a7f76d76.js","2845","static/chunks/2845-233d0b7733254462.js","6808","static/chunks/6808-c0c7b12c1b07a90a.js","6767","static/chunks/6767-da92d72583ce2ee5.js","8932","static/chunks/8932-458d55afb25ec506.js","4345","static/chunks/app/not-found-cb685878c81b4b6b.js"],"default"]\n”])self.__next_f.push([1,“3a:["$","$L45",null,{"flags":{"ff-ssrl-enabled":false,"home-first-api-key-usage":"test","billing_limits":true,"home-cards":"test","function-calling":false,"home-hero-usage-redesign":"test","cards-ordering":"test","orion-remote-config":true,"survey-targeting-0ac5ca675d-custom":true,"ff-invoice-retry":true,"model-limits":true,"ff-use-plain-help":false,"server-side-mcp-preview":false,"ff-api-key-expiration":true,"ff-chat-scroll-trap":true,"top_loader":false,"survey-targeting-e28db87db8-custom":true,"projects":true,"ff-enable-sso-selfserve":true,"saved-prompts":false,"abab-testing":true,"data-controls":true,"use-new-api-key-usage-logic":false,"audit-logs":true,"survey-targeting-7869e3894d-custom":true,"billing-improved-upgrade-message":"test","data-controls-chat-completions":true},"userIdentity":{"id":"fa886070-8261-4f4a-af9a-4bb721e613bd","name":"$undefined","email":"$undefined","isIdentified":false},"children":["$","$L46",null,{"layoutPreferences":null,"children":["$","$L5",null,{"parallelRouterKey":"children","error":"$47","errorStyles":[],"errorScripts":[],"template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound"
:[["$","$L48",null,{}],[]],"forbidden":"$undefined","unauthorized":"$undefined"}]}]}]\n”])`
TOOL: Cloudflare Docs · Workers AI

Cloudflare Workers AI

Run machine learning models, powered by serverless GPUs, on Cloudflare’s global network. Available on Free and Paid plans.

Workers AI allows you to run AI models in a serverless way, without having to worry about scaling, maintaining, or paying for unused infrastructure. You can invoke models running on GPUs on Cloudflare’s network from your own code: from Workers, Pages, or anywhere via the Cloudflare API.

Workers AI gives you access to:
- 50+ open-source models, available as a part of our model catalog
- Serverless, pay-for-what-you-use pricing model
- All as part of a fully-featured developer platform, including AI Gateway, Vectorize, Workers and more…

Custom requirements: If you have custom requirements like private custom models or higher limits, complete the Custom Requirements Form. Cloudflare will contact you with next steps.

Workers AI is now Generally Available. To report bugs or give feedback, go to the #workers-ai Discord channel. If you are having issues with Wrangler, report issues in the Wrangler GitHub repository.

Features
- Models: Workers AI comes with a curated set of popular open-source models that enable you to do tasks such as image classification, text generation, object detection and more.

Related products
- AI Gateway: Observe and control your AI applications with caching, rate limiting, request retries, model fallback, and more.
- Vectorize: Build full-stack AI applications with Vectorize, Cloudflare’s vector database. Adding Vectorize enables you to perform tasks such as semantic search, recommendations, anomaly detection or can be used to provide context and memory to an LLM.
- Workers: Build serverless applications and deploy instantly across the globe for exceptional performance, reliability, and scale.
- Pages: Create full-stack applications that are instantly deployed to the Cloudflare global network.
- R2: Store large amounts of unstructured data without the costly egress bandwidth fees associated with typical cloud storage services.
- D1: Create new serverless SQL databases to query from your Workers and Pages projects.
- Durable Objects: A globally distributed coordination API with strongly consistent storage.
- KV: Create a global, low-latency, key-value data storage.

Last updated: Aug 20, 2025

Domain	Purpose
`52vibes.blog`	Narrative, weekly posts, session logs, community
`52vibes.dev`	Packages, APIs, SDKs, runnable artifacts

AI: Based on my research, here’s a summary of free LLM APIs for client-side JavaScript:

Free LLM APIs for Browser-Based Use

Provider	Free Tier	Rate Limits	Browser-Safe?
Google Gemini	Gemini 2.5 Flash/Lite, 2.0 Flash	~30 RPM, 14K RPD	⚠️ API key exposed
Groq	Llama 3.3-70B, Llama 4 Scout, Gemma2, Whisper	~30 RPM for chat models; 100K-500K TPD depending on model	⚠️ API key exposed
Cloudflare Workers AI	10K neurons/day free	Per-model limits	✅ Can proxy via Worker
Hugging Face Inference	Free tier with rate limits	Varies by model	⚠️ Token exposed
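
To make the rate limits concrete, here is a minimal sketch of the arithmetic for how many requests fit inside a free tier. The per-model figures are copied from the Groq rate-limit table returned by the tool call earlier in this session (RPM, RPD, TPM, TPD); the 2,000 tokens-per-request figure and all type and function names are assumptions chosen for illustration.

```typescript
// Free-tier limits for one model, as listed in the Groq rate-limit table.
interface FreeTierLimits {
  requestsPerMinute: number;
  requestsPerDay: number;
  tokensPerMinute: number;
  tokensPerDay: number;
}

// Values taken from the tool output above (free plan rows).
const LLAMA_33_70B: FreeTierLimits = {
  requestsPerMinute: 30,
  requestsPerDay: 1_000,
  tokensPerMinute: 12_000,
  tokensPerDay: 100_000,
};

const LLAMA_31_8B: FreeTierLimits = {
  requestsPerMinute: 30,
  requestsPerDay: 14_400,
  tokensPerMinute: 6_000,
  tokensPerDay: 500_000,
};

// The effective ceiling is whichever limit is hit first:
// the request cap or the token cap divided by tokens used per request.
function maxRequests(
  limits: FreeTierLimits,
  tokensPerRequest: number, // hypothetical average, prompt + completion
): { perMinute: number; perDay: number } {
  return {
    perMinute: Math.min(
      limits.requestsPerMinute,
      Math.floor(limits.tokensPerMinute / tokensPerRequest),
    ),
    perDay: Math.min(
      limits.requestsPerDay,
      Math.floor(limits.tokensPerDay / tokensPerRequest),
    ),
  };
}

console.log(maxRequests(LLAMA_33_70B, 2_000)); // { perMinute: 6, perDay: 50 }
console.log(maxRequests(LLAMA_31_8B, 2_000)); // { perMinute: 3, perDay: 250 }
```

Under these assumptions the token caps, not the request caps, are the binding constraint, which is why short prompts matter for a shared public playground.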

The API Key Problem

Direct browser calls expose your API key — anyone can steal it from browser DevTools. Solutions:

1. Cloudflare Worker proxy (recommended for you)
- 100K requests/day on the free tier
- Keeps your API key server-side
- Lets you add rate limiting per IP
- Fits your existing Cloudflare Pages setup
2. User brings their own key
- Key is stored in localStorage, never on your server
- Zero cost to you
- Power users already have Groq/Gemini keys
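
The bring-your-own-key option can be sketched roughly as follows. This is an illustration only: the storage key name and helper names are hypothetical, the `KeyStore` interface is a stand-in that `window.localStorage` happens to satisfy, and the sketch assumes the provider accepts cross-origin requests from a browser. The endpoint and model id are the ones that appear in the Groq documentation fetched earlier in the session.

```typescript
// Subset of the Web Storage API used here; window.localStorage satisfies it.
interface KeyStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical storage key name.
const STORAGE_KEY = "52vibes.groqApiKey";

function saveKey(store: KeyStore, apiKey: string): void {
  store.setItem(STORAGE_KEY, apiKey.trim());
}

function loadKey(store: KeyStore): string | null {
  return store.getItem(STORAGE_KEY);
}

function forgetKey(store: KeyStore): void {
  store.removeItem(STORAGE_KEY);
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds an OpenAI-compatible chat completion request that goes straight
// from the browser to the provider, so the key never touches your server.
function buildChatRequest(apiKey: string, prompt: string): ChatRequest {
  return {
    url: "https://api.groq.com/openai/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: "llama-3.3-70b-versatile",
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

In a browser, this would be used as `const { url, init } = buildChatRequest(loadKey(localStorage)!, prompt); await fetch(url, init);` after the user has pasted a key once.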

Recommended Architecture for 52vibes.dev

User's Browser → Cloudflare Worker (proxy) → Groq/Gemini API
                         ↓
              Rate limit by IP (KV)

Cost: $0 — Cloudflare Workers free tier + Groq/Gemini free tiers
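
A minimal sketch of the proxy in the diagram above, under stated assumptions: the binding name `RATE_LIMIT`, the secret name `GROQ_API_KEY`, the limit of 10 requests per minute, and the function names are all hypothetical. `KVLike` mirrors only the `get`/`put` calls of a Workers KV binding that are used here. KV is eventually consistent, so the per-IP count is approximate rather than a strict guarantee, and KV write quotas on the free plan are worth checking before relying on this.

```typescript
// Only the KV operations this sketch needs; a Workers KV namespace binding
// provides get(key) and put(key, value, { expirationTtl }).
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

interface Env {
  RATE_LIMIT: KVLike; // hypothetical KV binding name
  GROQ_API_KEY: string; // hypothetical secret name, stored server-side
}

const LIMIT_PER_MINUTE = 10; // hypothetical per-IP budget

// Fixed-window counter: one KV key per IP per minute.
async function allowRequest(kv: KVLike, ip: string, nowMs: number): Promise<boolean> {
  const windowId = Math.floor(nowMs / 60_000);
  const key = `rl:${ip}:${windowId}`;
  const count = Number((await kv.get(key)) ?? "0");
  if (count >= LIMIT_PER_MINUTE) return false;
  // Let old windows expire on their own.
  await kv.put(key, String(count + 1), { expirationTtl: 120 });
  return true;
}

type Upstream = (url: string, init: RequestInit) => Promise<Response>;

// Forwards the browser's JSON body to the provider, attaching the secret key.
async function handleRequest(
  request: Request,
  env: Env,
  upstream: Upstream = fetch,
): Promise<Response> {
  if (request.method !== "POST") {
    return new Response("Method not allowed", { status: 405 });
  }
  // Cloudflare sets CF-Connecting-IP to the client address.
  const ip = request.headers.get("CF-Connecting-IP") ?? "unknown";
  if (!(await allowRequest(env.RATE_LIMIT, ip, Date.now()))) {
    return new Response("Rate limit exceeded", { status: 429 });
  }
  return upstream("https://api.groq.com/openai/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${env.GROQ_API_KEY}`,
    },
    body: await request.text(),
  });
}

// In a Worker module this would be wired up as:
//   export default { fetch: (req: Request, env: Env) => handleRequest(req, env) };
```

A fixed window is the simplest scheme that fits the "rate limit by IP (KV)" box; a sliding window or a Durable Object counter would be stricter, at the cost of more moving parts.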

Revised Use Case: Interactive Artifact Playground

Given free-tier constraints, the playground is best suited for:

Artifact	Model	Why It Works
Week 4 Ambiguity Resolver	Groq Llama 3.3-70B	Short prompts, fast responses
Week 10 Architecture from Nothing	Gemini 2.5 Flash	Long context (1M-token window on the free tier)
Week 16 Documentation Engine	Either	Moderate token usage

Want me to scaffold a Cloudflare Worker proxy that you can deploy alongside your blog?
