latentbrief

OpenAI · GPT-5

GPT-5.4

OpenAI's flagship — broadest modality and ecosystem coverage.

GPT-5.4 is a multimodal model in OpenAI's GPT-5 series, released in 2025. It supports text, image, and file inputs, balancing cost and capability for developers, businesses, and technical teams. The model features a 1,050,000-token context window, which lets it process and maintain a coherent understanding of very long inputs, making it well suited to complex workflows.

Technically, GPT-5.4 is built on OpenAI's enhanced transformer architecture. It employs optimizations for long-context reasoning and multimodal comprehension, enabling smooth handling of mixed-media tasks, long-form texts, and high-accuracy outputs. This positions it as a reliable choice for a variety of advanced applications across industries, bridging high performance and computational efficiency.

GPT-5.4 is a mid-tier 'workhorse' model in the GPT-5 family, positioned between cost-conscious variants and high-end flagship options. It offers a significant improvement in multimodal processing and context window size compared to its predecessors, allowing for superior performance and broader task handling at a balanced price point.

Background

GPT-5 is a multimodal large language model developed by OpenAI and the fifth in its series of generative pre-trained transformer (GPT) foundation models. Preceded in the series by GPT-4, it was launched on August 7, 2025. It is publicly accessible to users of the chatbot products ChatGPT and Microsoft Copilot as well as to developers through the OpenAI API.

Wikipedia

Specs

Context window
1.05M tokens
Max output
128K tokens
Input ($/1M tokens)
$2.50
Output ($/1M tokens)
$15.00
Modalities
Text · Image · File
Weights
Closed

Pricing last synced Apr 27, 2026 via OpenRouter. Confirm against official docs before committing.
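The listed rates make per-request cost a simple calculation. A minimal sketch using the synced figures above ($2.50 per 1M input tokens, $15.00 per 1M output tokens); since pricing changes, treat these constants as a snapshot, not ground truth:

```python
# Rough cost estimate from the synced rates above. These are the
# Apr 2026 OpenRouter figures; confirm against the official pricing
# page before relying on them.

INPUT_RATE_PER_M = 2.50    # USD per 1M input tokens
OUTPUT_RATE_PER_M = 15.00  # USD per 1M output tokens

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD at the rates above."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# e.g. a 100K-token prompt with a 2K-token reply:
# 100_000 * 2.50/1M + 2_000 * 15.00/1M = 0.25 + 0.03 = 0.28 USD
```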

Capabilities

  • Tool use
  • Vision
  • Extended thinking
  • Prompt caching

What it excels at

  • Large context window

    Supports inputs up to 1,050,000 tokens for extended coherence and long-form processing.

  • Multimodal capabilities

    Handles text, image, and file inputs seamlessly, enabling cross-media tasks.

  • Advanced coherence management

    Maintains high narrative and logical coherence across extended content or dialogue.

  • Versatile performance

    Proficient across diverse workflows, from summarization to creative generation.
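The large-context claim above can be sanity-checked before sending a request. A minimal sketch, assuming a rough 4-characters-per-token heuristic for English text (an approximation only; exact counts require the model's tokenizer):

```python
# Pre-flight check that a long document fits the 1,050,000-token
# context window, leaving headroom for the 128K-token max output.
# The 4-chars-per-token ratio is a rough English-text heuristic,
# not an exact count.

CONTEXT_WINDOW = 1_050_000
CHARS_PER_TOKEN = 4  # rough heuristic

def fits_in_context(text: str, reserved_output_tokens: int = 128_000) -> bool:
    """Estimate input tokens and check they leave room for the reply."""
    estimated_tokens = len(text) // CHARS_PER_TOKEN
    return estimated_tokens + reserved_output_tokens <= CONTEXT_WINDOW
```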

When to use this model

  • Long-form content generation: processes extensive narratives or technical documents with sustained coherence.
  • Multimodal workflows: integrates and analyzes text, image, and file inputs for seamless cross-media functionality.
  • Comprehensive document summarization: efficiently condenses and synthesizes large-scale, complex documents.
  • Dialogue systems in complex domains: enables sophisticated virtual assistants with nuanced understanding and context management.

Analysis synthesized from gpt-4o, llama-4-maverick, phi-4, etc.

API model id

gpt-5

Vendor docs: platform.openai.com/docs
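The API model id above is what goes in the `model` field of a request. A minimal sketch of the request body, following OpenAI's Chat Completions format; actually sending it (via the official SDK or an HTTP POST) is left out so the example stays offline:

```python
import json

def build_request(prompt: str, model_id: str = "gpt-5") -> str:
    """Assemble a Chat Completions-style request body as a JSON string."""
    payload = {
        "model": model_id,  # the API model id from the listing above
        "messages": [
            {"role": "user", "content": prompt},
        ],
    }
    return json.dumps(payload)
```

Check the vendor docs for the full parameter set (temperature, tool definitions, image and file attachments) before building on this shape.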
