OpenAI Deep Research: How it Compares to Perplexity and Gemini

Lina Lam · February 15, 2025

💡 Latest OpenAI update

As of April 2, Deep Research is available to all ChatGPT Plus, Team, Edu, and Enterprise users. It is rumored to roll out to Free users in the near future.

OpenAI's release of Deep Research came just as the AI community was processing the impact of DeepSeek R1 and its advancements. This timing led many to view it as OpenAI's response to the growing threat of open-source tools like DeepSeek.

Interestingly, OpenAI wasn't the first company to enter the research automation space: Google had already introduced Gemini Deep Research shortly before.

Moreover, an open-source alternative to OpenAI's Deep Research emerged just 12 hours after its release and was well received by the developer community. Less than a month later, Perplexity Deep Research also launched.

In an ocean filled with "deep researchers," OpenAI has introduced a heavyweight contender, one that researches "deeper" than most.

This article takes a close look at OpenAI's Deep Research, examining how it works, its strengths, limitations, benchmark results, and comparisons with the similarly named Gemini Deep Research and Perplexity Deep Research. Let's see if it truly delivers on its promises.

What is OpenAI Deep Research?

Deep Research is an AI-powered automated research agent designed for users who need in-depth analysis of complex topics. Unlike standard LLM outputs that rely on pre-trained knowledge, Deep Research:

  • ✅ Accesses and synthesizes real-time web data through online source browsing
  • ✅ Conducts multi-step reasoning to answer queries requiring deeper context
  • ✅ Generates long-form reports with citations and detailed explanations

Who should use Deep Research?

Deep Research is designed for professionals in fields that require extensive information retrieval, including:

  • Finance: Competitive market analysis, investment research
  • Science & Engineering: Research synthesis, literature reviews
  • Policy & Law: Legal case studies, policy analysis
  • Business & E-Commerce: Product comparisons, consumer insights

💡 Track your LLM while you wait for API access

While Deep Research isn't API-accessible yet, you can monitor LLMs from all major providers. Get visibility into your app's costs, performance, and usage patterns built with standard LLM APIs.

How OpenAI Deep Research Works

Deep Research is built on an OpenAI o3 model that has been optimized for web browsing, data analysis, and multi-step reasoning.

It uses end-to-end reinforcement learning to handle complex search and synthesis tasks, effectively combining LLM reasoning capabilities with real-time web browsing.

Here is how it works:

  1. Query Interpretation & Clarification
    • Deep Research analyzes the user's query and requests clarifications when needed (e.g., specific location for price comparisons)
  2. Web Scraping & Data Extraction
    • Retrieves and processes top-ranked search results
    • Extracts relevant information from multiple sources
  3. Analysis & Synthesis
    • Summarizes findings and identifies patterns
    • Performs multi-document summarization with citation tracking
    • Analyzes and visualizes tabular data and figures using Python
  4. Report Generation
    • Creates structured reports with proper citations
    • Incorporates generated images, tables, and charts
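
The four stages above can be sketched as a toy pipeline. This is a minimal, offline illustration and not OpenAI's implementation: the function names (`clarify`, `retrieve`, `synthesize`, `render_report`), the `Source` type, and the in-memory `CORPUS` that stands in for the live web are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Source:
    url: str
    text: str


# Hypothetical in-memory "web"; a real agent would browse live pages.
CORPUS = [
    Source("https://example.com/a", "Model A scores 26.6 on the exam."),
    Source("https://example.com/b", "Model B scores 21.1 on the exam."),
    Source("https://example.com/c", "Unrelated page about cooking."),
]


def clarify(query: str) -> list[str]:
    """Stage 1: return follow-up questions if the query is underspecified."""
    if "price" in query.lower():
        return ["Which region should prices be compared in?"]
    return []


def retrieve(query: str, corpus: list[Source]) -> list[Source]:
    """Stage 2: keep sources that share at least one keyword with the query."""
    keywords = {w for w in query.lower().split() if len(w) > 3}
    return [
        s for s in corpus
        if keywords & set(s.text.lower().rstrip(".").split())
    ]


def synthesize(sources: list[Source]) -> list[tuple[str, int]]:
    """Stage 3: pair each finding with its citation number."""
    return [(s.text, i) for i, s in enumerate(sources, start=1)]


def render_report(query: str, findings: list[tuple[str, int]], sources: list[Source]) -> str:
    """Stage 4: assemble a plain-text report with numbered citations."""
    lines = [f"Report: {query}", ""]
    lines += [f"- {text} [{n}]" for text, n in findings]
    lines += ["", "Sources:"]
    lines += [f"[{i}] {s.url}" for i, s in enumerate(sources, start=1)]
    return "\n".join(lines)


def deep_research(query: str) -> str:
    """Run the four stages in order, stopping early if clarification is needed."""
    questions = clarify(query)
    if questions:
        return "Clarification needed: " + " ".join(questions)
    sources = retrieve(query, CORPUS)
    return render_report(query, synthesize(sources), sources)


print(deep_research("Which model scores highest on the exam"))
```

The keyword match and one-line "synthesis" are placeholders for the model-driven browsing and reasoning described above; the point is only the order of the stages and the way citations are carried through to the report.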

OpenAI Deep Research Benchmark Results

According to OpenAI's official results, the Deep Research model significantly outperforms previous models on key benchmarks.

Humanity's Last Exam

Humanity's Last Exam (HLE) is a rigorous AI benchmark that evaluates LLMs across a broad range of expert-level academic subjects.

It spans disciplines including classics, ecology, law, and mathematics, testing how well AI systems handle questions that challenge even seasoned domain experts.

This benchmark measures accuracy on "Expert-Level" questions across more than 100 subjects.

| Model | Accuracy (%) |
| --- | --- |
| OpenAI Deep Research | 26.6 |
| Perplexity Deep Research | 21.1 |
| OpenAI o3-mini (high) | 13.0 |
| DeepSeek-R1 | 9.4 |
| OpenAI o1 | 9.1 |
| Gemini Thinking | 6.2 |
| Claude 3.5 Sonnet | 4.3 |
| Grok-2 | 3.8 |
| GPT-4o | 3.3 |

GAIA Benchmark

GAIA (General AI Assistant Benchmark) is a comprehensive evaluation framework designed to assess AI assistants on real-world problem-solving tasks. GAIA evaluates an AI system's capabilities in complex reasoning, multimodal input processing, web browsing, and tool utilization.

| Model Configuration | Level 1 | Level 2 | Level 3 | Avg. Accuracy |
| --- | --- | --- | --- | --- |
| Previous top results | 67.92 | 67.44 | 42.31 | 63.64 |
| Deep Research | 78.66 | 73.21 | 58.03 | 72.57 |

Unlike traditional AI benchmarks that focus on professional skill-based evaluations, GAIA challenges AI systems with tasks that humans find straightforward but remain challenging for current models.

For example, while GPT-4 with plugins achieves only 15% accuracy, human respondents score 92%, highlighting a significant gap in AI performance on practical, reasoning-based tasks.
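
To put the GAIA table in perspective, the gains can be computed directly from the reported figures. The short sketch below uses only the numbers from the table; the variable and function names are illustrative.

```python
# Accuracy (%) taken from the GAIA table above.
levels = ["Level 1", "Level 2", "Level 3", "Average"]
previous_best = [67.92, 67.44, 42.31, 63.64]
deep_research = [78.66, 73.21, 58.03, 72.57]


def gains(old, new):
    """Return (absolute gain in points, relative gain in %) for each column."""
    return [
        (round(n - o, 2), round((n - o) / o * 100, 1))
        for o, n in zip(old, new)
    ]


for name, (points, relative) in zip(levels, gains(previous_best, deep_research)):
    print(f"{name}: +{points} points ({relative}% relative)")
```

The largest absolute gain is on Level 3, the hardest tier, where Deep Research improves on the previous top result by roughly 15.7 points.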

OpenAI Deep Research: Strengths & Limitations

| Strengths | Limitations |
| --- | --- |
| ✔️ Detailed Summarization: Effectively extracts and synthesizes complex concepts | 🆇 Hallucinations: May fabricate sources, misinterpret data, or cite incorrect facts, which can be hidden in lengthy reports |
| ✔️ Data Accuracy: References are consistently accurate, especially for structured data | 🆇 Information Consistency: Can produce contradictory information, show bias, or reference outdated data |
| ✔️ Query Processing: Effectively refines and processes multi-step queries | 🆇 Original Analysis: Struggles with generating novel hypotheses or interpreting nuanced academic discussions |
| ✔️ Time Efficiency: Reduces hours of manual research to minutes with high-quality sources | |

OpenAI Deep Research vs. Gemini Deep Research vs. Perplexity Deep Research

Let's compare OpenAI Deep Research with Gemini Deep Research and Perplexity Deep Research (launched in February 2025).

TL;DR

  • OpenAI Deep Research offers the most powerful capabilities but comes at a premium price; ideal for technical and academic research
  • Google's Deep Research provides a more affordable option but is susceptible to SEO-driven biases and citation inaccuracies
  • Perplexity Deep Research delivers the fastest results and includes a free tier, making it perfect for quick, structured research with reliable citations

Detailed Comparison: OpenAI Deep Research Pricing & Features

| | OpenAI | Google | Perplexity |
| --- | --- | --- | --- |
| Cost | $200/month | $20/month | Free (5 queries/day) or $20/month |
| Level of Detail | Comprehensive, detailed reports | Concise reports | Structured, concise summaries |
| Search Sources | Websites and academic papers | Primarily website content | Academic papers and real-time data |
| Accuracy | High accuracy with occasional errors | More susceptible to SEO bias | High accuracy, slightly below OpenAI |
| Citation Quality | Generally reliable with some errors | Occasionally cites unrelated sources | Consistently reliable citations |
| Best Use Cases | Technical and academic research | General web research | Research, journalism, real-time analysis |
| Input Support | Text, images, PDFs, spreadsheets | Primarily text-based | Text queries, limited file handling |
| Output Format | Detailed reports with sources and visuals | Reports with key findings | Concise summaries with inline citations |
| Process Transparency | Shows detailed reasoning steps | Uses predetermined research paths | Displays reasoning and search progression |
| Processing Time | 5-30 minutes per query | Under 15 minutes | 2-4 minutes per query |

OpenAI Deep Research is the most capable and feature-packed of the three, but so far all of these tools struggle with reliability. Whichever one you choose, you need to understand its limitations and be prepared to work around them.

Is OpenAI's Deep Research Worth It?

The value of OpenAI's Deep Research depends on your specific needs.

| Recommended for | Not Recommended for |
| --- | --- |
| ✔️ Researchers working on complex, niche topics | 🆇 Simple fact-checking queries (standard GPT-4o suffices) |
| ✔️ Projects requiring synthesis of scattered data | 🆇 Financial, legal, or medical reports requiring absolute accuracy |
| ✔️ Tasks needing comprehensive topic reports rather than quick answers | |

While OpenAI's Deep Research commands a premium price, it's now available to Plus users, with plans to extend access to Free tier users in the near future.

When will Deep Research be available to Free and Plus users?

As of February 25, 2025, OpenAI has announced that Deep Research is available to all ChatGPT Pro, Plus, Team, Edu, and Enterprise users. Plus, Team, Enterprise, and Edu users receive 10 deep research queries per month, while Pro users get 120 queries per month.

OpenAI has indicated plans to extend Deep Research to Free users in the near future. Please check OpenAI's website for the most current updates.
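
The monthly quotas above translate into a rough cost per query. The sketch below is a back-of-the-envelope calculation: the Pro price and both quotas come from this article, while the $20/month Plus price is an assumption based on ChatGPT's public pricing. It also attributes the whole subscription to Deep Research, which overstates the cost if you use the plan's other features.

```python
def cost_per_query(monthly_price_usd: float, queries_per_month: int) -> float:
    """Effective cost of one query if the whole monthly quota is used."""
    if queries_per_month <= 0:
        raise ValueError("queries_per_month must be positive")
    return monthly_price_usd / queries_per_month


# (monthly price in USD, Deep Research queries per month)
# The Plus price is an assumption, not a figure from this article.
plans = {
    "Pro": (200.0, 120),
    "Plus": (20.0, 10),
}

for name, (price, quota) in plans.items():
    print(f"{name}: ${cost_per_query(price, quota):.2f} per query")
```

Under these assumptions the per-query cost is similar on both plans, so the choice comes down to how many reports you need each month rather than the unit price.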

Free Alternatives to OpenAI Deep Research

For those who find the $200/month price tag too steep, several free alternatives to OpenAI Deep Research are available:

  1. HuggingFace has developed an open-source Deep Research implementation
  2. The Open Deep Research project has gained over 10,000 GitHub stars
  3. Perplexity's Deep Research offers a free tier with limited queries, while Pro users get unlimited access

Open Deep Research vs. OpenAI Deep Research

Open Deep Research (the second option listed above) is an AI-powered research assistant that performs iterative, deep research by leveraging search engines, web scraping, and large language models (LLMs).

Unlike OpenAI's Deep Research, it is designed as a lightweight and highly customizable tool for developers who need full control over their research pipeline.

Key features include:

  • Iterative Research: Generates search queries, processes results, and refines research direction over time.
  • Intelligent Query Generation: Uses LLMs to produce targeted search queries based on research goals.
  • Depth & Breadth Control: Users can configure how deep (iterations) and broad (query diversity) the research expands.
  • Smart Follow-ups: Dynamically generates follow-up questions to refine research insights.
  • Comprehensive Reports: Produces structured markdown reports containing key findings and sources.
  • Concurrent Processing: Handles multiple searches simultaneously for increased efficiency.
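
The interplay of depth, breadth, follow-ups, and concurrency listed above can be illustrated with a small recursive loop. This is a sketch under stated assumptions, not the project's actual code: `generate_queries` and `search` are hypothetical stubs standing in for LLM and search-engine calls, and halving the breadth at each level is a choice made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_queries(topic: str, breadth: int) -> list[str]:
    """Hypothetical stub for an LLM call that proposes `breadth` search queries."""
    return [f"{topic} / angle {i}" for i in range(1, breadth + 1)]


def search(query: str) -> str:
    """Hypothetical stub for a search-and-scrape call; returns one finding."""
    return f"finding for: {query}"


def research(topic: str, depth: int, breadth: int) -> list[str]:
    """Collect findings recursively, narrowing the breadth at each level."""
    if depth == 0 or breadth == 0:
        return []
    queries = generate_queries(topic, breadth)
    # Concurrent processing: run this level's searches in parallel.
    with ThreadPoolExecutor(max_workers=breadth) as pool:
        findings = list(pool.map(search, queries))
    # Follow-ups: dig one level deeper into each query with fewer branches.
    for query in queries:
        findings += research(query, depth - 1, breadth // 2)
    return findings


def to_markdown(topic: str, findings: list[str]) -> str:
    """Render the findings as a simple markdown report."""
    return "\n".join([f"# {topic}", ""] + [f"- {item}" for item in findings])


print(to_markdown("solid-state batteries", research("solid-state batteries", depth=2, breadth=2)))
```

Raising `depth` adds more rounds of follow-up questions, while raising `breadth` widens the set of queries explored in each round, so the number of searches grows quickly when both are increased.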

Learn how to set up and use Open Deep Research via the official docs.

Track Research Models with Helicone 💡

While OpenAI and Gemini Deep Research are unavailable via API, Helicone can help you monitor and optimize other research models like Open Deep Research.

Conclusion

OpenAI Deep Research represents an ambitious step toward automated AI-driven research. While its high cost and occasional factual inconsistencies suggest it won't replace human researchers anytime soon, many users report finding it to be a powerful research assistant. If this aligns with your needs, it may be worth exploring!

Frequently Asked Questions

How long does Deep Research take to generate a report?

Deep Research typically takes 5-30 minutes per query, with processing time varying based on topic complexity and data volume.

What kind of data can Deep Research access?

Deep Research can browse the open web and analyze uploaded files, but it currently cannot access private, subscription-based, or internal resources. Support for these sources is under development.

When should I use Deep Research vs. Search?

Use Search for quick facts, news, weather, or summaries (instant results). Use Deep Research for comprehensive analysis requiring multiple sources and structured reports (longer processing time).

How do I use Deep Research?

In ChatGPT, select 'Deep Research' and enter your query. You can attach files, images, or spreadsheets for additional context. Deep Research may ask clarifying questions before processing your request in the background to create a structured report.

Can I use Helicone to track Deep Research usage?

Currently, there is no OpenAI Deep Research API, so direct tracking is not possible. However, Helicone can track other AI-powered research models, including Open Deep Research, OpenAI's API-based models, and self-hosted LLMs.

