OpenAI releases new “reasoning” models and a coding agent

Jeremy Kahn
2025-04-18

The company has released two new AI “reasoning” models, o3 and o4-mini, as it tries to maintain its lead in AI.

OpenAI cofounder and CEO Sam Altman. Image credit: Taylor Hill/FilmMagic

Translator: Liu Jinlong

Reviewer: Wang Hao

OpenAI has released two AI “reasoning” models that it says are its most capable yet as well as an open-source AI agent that helps computer programmers code, as the company seeks to gain a lead over its rivals.

The open-source coding agent, called Codex CLI, marks the first time since 2019 that OpenAI has introduced a significant open-source tool.

The other new models are the full-scale version of its o3 model, which OpenAI says is its most advanced AI system, as well as a smaller, but more efficient model called o4-mini.

“These are the first models where top scientists tell us they produce legitimately good and useful novel ideas,” OpenAI president Greg Brockman said in announcing the new products on Wednesday.

The models will be immediately available to users of its paid ChatGPT Plus and Pro services, as well as organizations that use its enterprise-focused Teams and API products.

The release of the new models comes at a time when OpenAI faces pressure to show it remains at the forefront of AI development. Earlier this year, China’s DeepSeek upended conventional wisdom about the technological edge U.S. AI labs such as OpenAI enjoyed for years. DeepSeek’s R1 mimicked the “chain of thought” reasoning that OpenAI’s o-series models offer. The fact that DeepSeek’s R1 was also an open model—meaning people could download it for free and customize it easily—has tilted many enterprises in favor of deploying such open-source models. Most of OpenAI’s models, in contrast, can only be accessed on a paid basis through a proprietary application programming interface (API).

At the same time, OpenAI has also faced increased competition from other proprietary model providers. In February, AI company Anthropic became the first to offer a model that combines quick, intuition-like answers with the ability to also perform “chain of thought” step-by-step reasoning if a prompt requires it. The ability to decide when reasoning is required and when a faster answer will do is a trick OpenAI has yet to match. Then, last month, Google unveiled its Gemini 2.5 Pro model, a reasoning model that beat OpenAI’s o3-mini model on numerous benchmarks.

On Wednesday, OpenAI moved to try to retake the lead in reasoning models. The company says its o3 and o4-mini models now top various benchmarks—although none of those results has yet been independently verified. It also says the models have the ability to autonomously use other software tools, such as web browsing and coding environments, without having to be specifically prompted to do so by a user.

In a demo of o3’s capabilities that OpenAI livestreamed Wednesday, AI researchers showed o3 analyzing a photo of a physics research poster from 2015 and then searching the web autonomously to find more recent relevant research and comparing the results. They also showed it autonomously deciding to run Python code to solve various math and coding challenges.

OpenAI said o3 and o4-mini have the ability to reason directly about visual information, such as sketches, diagrams, or photos—even ones that might be blurry or of poor quality. The company said the models also knew how to manipulate photos as part of their reasoning process.

Meanwhile, the new Codex CLI coding agent is designed to run on a user’s device. It taps a cloud-based connection to OpenAI’s o3 and o4-mini models for reasoning and can also use other software tools deployed locally. Codex CLI doesn’t just suggest lines of code; it can autonomously decide to use a variety of tools to complete a task.

The company said Codex CLI would also soon be able to tap the capabilities of the GPT-4.1 model that it released earlier this week.

To encourage developers to experiment with Codex CLI, OpenAI said it had set up a $1 million fund that will disburse $25,000 grants in API credits to promising projects.

OpenAI said training o3 took about 10 times as much computing power as training o1, its previous best reasoning model.

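Intellectual property rights in the content published on FortuneChina.com are owned or held exclusively by Fortune Media IP Limited and/or the relevant rights holders. Reproduction, excerpting, copying, mirroring, or any other use without permission is prohibited.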