OpenAI launches Flex processing for cheaper, slower AI tasks
Flex processing: A cost-effective AI solution
OpenAI has introduced Flex processing, a new API option designed to provide more affordable AI model usage at the expense of slower response times. This strategic move helps the company better compete with rivals like Google in the increasingly cost-sensitive AI market.
How Flex processing works
Currently in beta for OpenAI’s o3 and o4-mini reasoning models, Flex processing targets non-critical workloads including:
- Model evaluations
- Data enrichment
- Asynchronous processing tasks
Beyond slower response times, the trade-off includes potential “occasional resource unavailability,” which makes Flex better suited to development and testing than to production workloads.
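Because Flex requests are slower and may occasionally fail to find capacity, a reasonable pattern is to widen the client timeout and retry with backoff. A minimal sketch of that retry wrapper — the commented-out OpenAI SDK call and its `service_tier="flex"` parameter are assumptions based on the beta described above, not verified API details:

```python
import time

def call_with_retry(make_request, max_attempts=3, base_delay=1.0):
    """Retry a Flex request that may hit 'occasional resource unavailability'.

    make_request: zero-argument callable issuing the API call; it should raise
    TimeoutError (or the provider's timeout exception) when capacity is short.
    """
    for attempt in range(max_attempts):
        try:
            return make_request()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the failure to the caller
            # Exponential backoff before retrying the slower Flex tier.
            time.sleep(base_delay * 2 ** attempt)

# Hypothetical usage with the OpenAI Python SDK (names and parameters assumed):
# from openai import OpenAI
# client = OpenAI(timeout=900.0)  # Flex calls can run long; widen the timeout
# resp = call_with_retry(lambda: client.chat.completions.create(
#     model="o4-mini",
#     messages=[{"role": "user", "content": "Label these records..."}],
#     service_tier="flex",  # opt in to the cheaper, slower tier
# ))
```

For batch jobs like evaluations or data enrichment, this kind of wrapper lets occasional unavailability degrade into latency rather than into failed runs.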
Significant cost reductions
Flex processing cuts API costs by exactly 50% for both supported reasoning models:
o3 model pricing
- Standard: $10/million input tokens, $40/million output tokens
- Flex: $5/million input tokens, $20/million output tokens
o4-mini model pricing
- Standard: $1.10/million input tokens, $4.40/million output tokens
- Flex: $0.55/million input tokens, $2.20/million output tokens
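The halved rates compound directly with job size. A quick sketch of the arithmetic, using the prices from the tables above (the example token counts are illustrative):

```python
# Per-million-token prices in USD: (input, output), from the tables above.
PRICES = {
    "o3":      {"standard": (10.00, 40.00), "flex": (5.00, 20.00)},
    "o4-mini": {"standard": (1.10, 4.40),   "flex": (0.55, 2.20)},
}

def job_cost(model, tier, input_tokens, output_tokens):
    """Return the USD cost of a job from its token counts."""
    in_price, out_price = PRICES[model][tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a batch evaluation on o3 with 2M input and 0.5M output tokens.
standard = job_cost("o3", "standard", 2_000_000, 500_000)  # $40.00
flex = job_cost("o3", "flex", 2_000_000, 500_000)          # $20.00
```

Since both input and output rates are halved, the savings hold at 50% regardless of the input/output mix.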
Competitive AI landscape
The launch comes as:
- Frontier AI costs continue rising
- Competitors release budget-friendly alternatives
Google recently unveiled Gemini 2.5 Flash, a reasoning model that reportedly matches DeepSeek’s R1 in performance at a lower input token cost.
New verification requirements
OpenAI has implemented additional access controls:
- Developers in tiers 1-3 must complete ID verification for o3 model access
- Reasoning summaries and streaming API support now require verification
The company states these measures aim to prevent policy violations by bad actors while maintaining platform security.
This pricing tier signals OpenAI’s intent to stay competitive on cost in a rapidly evolving AI market. For cost-conscious teams running non-time-sensitive workloads, Flex processing offers a straightforward way to halve their API spend.