OpenAI Launches GPT-5.4, Its Most Advanced Work Model Yet

GPT-5.4 introduces native computer-use abilities, stronger reasoning for professional tasks, and improved factual accuracy, as OpenAI pushes deeper into AI-driven workplace automation.

OpenAI has introduced GPT-5.4, the latest model in its GPT-5 series, expanding its capabilities for professional knowledge work, software interaction, and AI agents that can carry out multi-step digital tasks. The model is being deployed across ChatGPT, OpenAI’s API, and Codex as part of the company’s effort to make AI systems more useful for everyday work.

According to OpenAI, the new model improves significantly on GPT-5.2 in areas such as reasoning, reliability, and the ability to complete complex tasks that involve documents, spreadsheets, and web-based workflows.

On the company’s GDPval benchmark, which evaluates AI performance across tasks representing dozens of occupations, GPT-5.4 matched or outperformed human professionals in 83 percent of comparisons. That figure is up from 70.9 percent for GPT-5.2, suggesting improvements in how the model handles real-world knowledge work.

The model also showed stronger performance in financial modeling tests designed to simulate the work of junior investment banking analysts. GPT-5.4 scored 87.3 percent on spreadsheet modeling tasks compared with 68.4 percent for its predecessor.

OpenAI says users also preferred GPT-5.4 outputs in tests involving documents, presentations, and spreadsheets. Human reviewers selected GPT-5.4 responses 68 percent of the time, citing clearer structure and improved formatting.

One of the major changes in the new model is its ability to interact directly with computer interfaces. It can interpret screenshots and issue keyboard or mouse commands, allowing developers to build AI agents that perform tasks across software applications and operating systems.
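
The screenshot-in, action-out cycle described above can be pictured as a simple loop. The Python sketch below is illustrative only: `FakeDesktop`, `Action`, `run_agent`, and `scripted_policy` are hypothetical names invented for this example, and the policy is a scripted stand-in for the model rather than a call to any OpenAI API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step chosen by the agent (hypothetical schema)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeDesktop:
    """Stand-in for a real screen; it only records what the agent did."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        # A real harness would return image bytes of the current screen.
        return f"screen-after-{len(self.log)}-actions".encode()

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click({action.x},{action.y})")
        elif action.kind == "type":
            self.log.append(f"type({action.text!r})")
        else:
            raise ValueError(f"unknown action kind: {action.kind}")


def run_agent(desktop: FakeDesktop,
              choose_action: Callable[[bytes], Action],
              max_steps: int = 10) -> bool:
    """Observe, decide, act; stop when the policy says 'done' or steps run out."""
    for _ in range(max_steps):
        action = choose_action(desktop.screenshot())
        if action.kind == "done":
            return True
        desktop.apply(action)
    return False


def scripted_policy() -> Callable[[bytes], Action]:
    """Placeholder for the model: replays a fixed list of actions."""
    steps = iter([
        Action("click", 120, 48),
        Action("type", text="Q3 report"),
        Action("done"),
    ])
    return lambda _screenshot: next(steps)
```

In a real agent, `choose_action` would send the screenshot to the model and parse its reply; the `max_steps` cap is a design choice in this sketch that keeps a confused agent from looping indefinitely.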

In tests measuring how AI systems navigate digital environments, GPT-5.4 achieved a 75 percent success rate on the OSWorld-Verified benchmark. That result surpassed GPT-5.2’s score of 47.3 percent and slightly exceeded the human baseline of 72.4 percent.

GPT-5.4 also performed better at executing web-based tasks. On the WebArena-Verified benchmark, it achieved a success rate of 67.3 percent using both screenshot-based interaction and direct page structure analysis.

OpenAI says GPT-5.4 also produces fewer factual errors. In evaluations based on user-flagged mistakes, and compared with GPT-5.2, individual claims generated by the model were 33 percent less likely to be incorrect, while full responses were 18 percent less likely to contain false information.
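
These are relative reductions, not absolute error rates, so their practical effect depends on a baseline the release summary does not give. The short Python sketch below shows the arithmetic; the 10 percent baseline is a made-up value chosen purely for illustration.

```python
def apply_relative_reduction(baseline_rate: float, reduction: float) -> float:
    """Return the new error rate after a relative reduction.

    baseline_rate: prior error rate as a fraction (0.10 means 10 percent).
    reduction: relative reduction as a fraction (0.33 means 33 percent less).
    """
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_rate * (1.0 - reduction)


# Hypothetical baseline: assume 10 percent of claims were wrong before.
HYPOTHETICAL_BASELINE = 0.10
new_claim_rate = apply_relative_reduction(HYPOTHETICAL_BASELINE, 0.33)
print(f"{new_claim_rate:.3f}")  # about 0.067, i.e. roughly 6.7 percent
```

Under that assumed baseline, a 33 percent relative reduction lowers the error rate by about 3.3 percentage points, not by 33.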

The company also expanded the model’s visual reasoning capabilities. GPT-5.4 scored 81.2 percent on the MMMU-Pro benchmark, which measures multimodal reasoning across text and images. The model can also process high-resolution images up to 10.24 million pixels in full-fidelity mode.
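
For a sense of scale, 10.24 million pixels is exactly the area of a 3200 by 3200 image. The Python sketch below checks an image against that budget and computes a downscaled size that fits. It assumes the limit applies to total pixel count (width times height), and the downscaling step is this example's own choice; the article does not say how larger images are handled.

```python
import math
from typing import Tuple

MAX_PIXELS = 10_240_000  # 10.24 million, the figure quoted in the article


def fits_pixel_budget(width: int, height: int,
                      max_pixels: int = MAX_PIXELS) -> bool:
    """True if width * height is within the pixel budget."""
    return width * height <= max_pixels


def downscale_to_budget(width: int, height: int,
                        max_pixels: int = MAX_PIXELS) -> Tuple[int, int]:
    """Shrink both sides by the same factor so the image fits the budget.

    Images already within the budget are returned unchanged.
    """
    if fits_pixel_budget(width, height, max_pixels):
        return width, height
    scale = math.sqrt(max_pixels / (width * height))
    # Rounding down keeps the result at or under the budget.
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))
```

For example, a 4000 by 3000 photo has 12 million pixels and would need to shrink by a factor of about 0.92 per side to fit.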

GPT-5.4 will be available in several configurations. A reasoning-focused version called GPT-5.4 Thinking will be accessible to ChatGPT Plus, Team, and Pro users, while developers will be able to access the model through the OpenAI API and Codex tools. The API version supports context windows of up to one million tokens, enabling longer documents and complex multi-step tasks.

OpenAI said GPT-5.4 will carry higher per-token pricing than GPT-5.2 through its API, though improved efficiency means many tasks may require fewer tokens overall.
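
The trade-off comes down to simple arithmetic: the cost of a task is the per-token price multiplied by the tokens used, so a price increase is offset only if token usage falls by at least the same proportion. The Python sketch below illustrates this; all prices and token counts in it are hypothetical, since the article gives no actual figures.

```python
def task_cost(price_per_million_tokens: float, tokens_used: int) -> float:
    """Cost of one task, given a price quoted per million tokens."""
    return price_per_million_tokens * tokens_used / 1_000_000


def break_even_token_ratio(old_price: float, new_price: float) -> float:
    """Fraction of the old token count at which the new model costs the same.

    If the new model uses fewer tokens than this fraction of the old
    usage, the task is cheaper despite the higher per-token price.
    """
    if old_price <= 0 or new_price <= 0:
        raise ValueError("prices must be positive")
    return old_price / new_price


# Hypothetical numbers: the price rises from 1.00 to 1.25 per million tokens.
ratio = break_even_token_ratio(1.00, 1.25)
print(f"Break-even at {ratio:.0%} of the old token usage")
```

With these made-up prices, the newer model would need to finish a task in 80 percent or less of the tokens its predecessor used for the bill to stay flat or fall.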

The release reflects a broader shift in how advanced AI models are being positioned. Instead of focusing only on conversational abilities, companies are now building systems designed to perform structured work across digital tools and software environments.

GPT-5.4, in that sense, represents OpenAI’s attempt to move AI beyond chat interfaces and toward systems capable of planning, executing, and verifying complex tasks across the workplace.

Disclaimer: This news article is based on a release published by OpenAI.

NN Desk
