OpenAI Releases GPT-5.5, Targeting Agentic Coding and Scientific Research

The model delivers state-of-the-art benchmark results in software engineering and data analysis while matching its predecessor's per-token serving latency.

OpenAI on April 23 released GPT-5.5, its newest large language model, making it available to paying ChatGPT and Codex subscribers and positioning it as the company’s most capable model for extended, multi-step tasks.

The company said the model is designed to handle agentic workflows that require planning, tool use, error checking, and sustained execution without step-by-step human guidance. GPT-5.5 is available to Plus, Pro, Business, and Enterprise subscribers in ChatGPT and Codex; API access, priced at $5 per million input tokens and $30 per million output tokens, went live on April 24.
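At those rates, per-request cost is simple to estimate. The sketch below is illustrative only; the token counts and the `estimate_cost` helper are hypothetical, not part of any OpenAI tooling:

```python
# Illustrative cost estimate under the stated GPT-5.5 API pricing:
# $5 per million input tokens, $30 per million output tokens.
INPUT_RATE = 5.00 / 1_000_000    # dollars per input token
OUTPUT_RATE = 30.00 / 1_000_000  # dollars per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request at the published rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical request: 120,000 input tokens, 8,000 output tokens.
print(f"${estimate_cost(120_000, 8_000):.2f}")  # $0.84
```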

On Terminal-Bench 2.0, which evaluates complex command-line workflows, GPT-5.5 scored 82.7%, compared with 75.1% for GPT-5.4. On SWE-Bench Pro, which measures the resolution of real GitHub issues, it reached 58.6%. OpenAI said the model achieves those results while matching GPT-5.4’s per-token latency and using fewer tokens on equivalent Codex tasks. 

The gains carry into professional and knowledge-work settings. On GDPval, a benchmark assessing structured output quality across 44 occupations, GPT-5.5 scored 84.9%. On OSWorld-Verified, which tests autonomous computer operation, it reached 78.7%. On Tau2-bench Telecom, which simulates complex customer-service workflows, it scored 98.0% without prompt tuning, up from 92.8% for GPT-5.4.

Michael Truell, co-founder and CEO of Cursor, said in a statement on OpenAI’s blog that GPT-5.5 “is noticeably smarter and more persistent than GPT-5.4, with stronger coding performance and more reliable tool use.” 

OpenAI said teams inside the company are already running the model in production. The finance team used GPT-5.5 in Codex to process 24,771 K-1 tax forms spanning 71,637 pages. The company’s communications group used it to build a scoring framework for speaking requests and automate low-risk approvals via a Slack agent.

On scientific benchmarks, GPT-5.5 scored 80.5% on BixBench, which evaluates real-world bioinformatics analysis, and 25.0% on GeneBench, a multi-stage genetics evaluation where GPT-5.4 had scored 19.0%. OpenAI said a version of the model, trained with a custom research setup, produced a new proof of a result on off-diagonal Ramsey numbers in combinatorics, which was later verified using the Lean proof assistant. It’s unclear how broadly that kind of mathematical contribution will generalize across disciplines at this stage.

Separately, OpenAI classified GPT-5.5’s cybersecurity and biological or chemical capabilities as “High” under its Preparedness Framework, one level below “Critical.” The company said it deployed stricter automated classifiers for cybersecurity-related requests at launch and is offering a Trusted Access for Cyber pathway at chatgpt.com/cyber for verified security professionals who require expanded access.

In Codex, GPT-5.5 ships with a 400,000-token context window across Plus, Pro, Business, Enterprise, Edu, and Go plans. Fast mode, which generates tokens 1.5 times faster, is available at 2.5 times the standard cost. GPT-5.5 Pro, priced at $30 per million input tokens and $180 per million output tokens in the API, is rolling out to Pro, Business, and Enterprise users.
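As a rough sketch of how those tiers compare, here is an illustrative calculation under the pricing stated above; the workload figures and the `monthly_cost` helper are hypothetical:

```python
# Illustrative tier comparison: standard GPT-5.5, fast mode (2.5x the
# standard cost), and GPT-5.5 Pro, at the published per-token rates.
RATES = {  # dollars per million tokens: (input, output)
    "gpt-5.5":     (5.00, 30.00),
    "gpt-5.5-pro": (30.00, 180.00),
}
FAST_MODE_MULTIPLIER = 2.5  # fast mode is billed at 2.5x the standard rate

def monthly_cost(model: str, input_millions: float, output_millions: float,
                 fast: bool = False) -> float:
    """Dollar cost for a month's usage, given token volumes in millions."""
    in_rate, out_rate = RATES[model]
    cost = input_millions * in_rate + output_millions * out_rate
    return cost * FAST_MODE_MULTIPLIER if fast else cost

# Hypothetical workload: 50M input and 5M output tokens per month.
print(monthly_cost("gpt-5.5", 50, 5))             # 400.0
print(monthly_cost("gpt-5.5", 50, 5, fast=True))  # 1000.0
print(monthly_cost("gpt-5.5-pro", 50, 5))         # 2400.0
```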

OpenAI said GPT-5.5 was co-designed with and served on NVIDIA GB200 and GB300 NVL72 systems. The company added that Codex helped its infrastructure team analyze production traffic and write load-balancing heuristics, improving token generation speeds by more than 20%.

NN Desk
