Google released Gemini 1.0 on 6 December 2023 in three variants: Gemini Ultra, Gemini Pro, and Gemini Nano.

Gemini Ultra outperforms GPT-4 on 30 of 32 standard benchmarks, leads coding benchmarks such as HumanEval and Natural2Code, and is the first model to score above human experts on MMLU. It also accepts audio and video input in addition to images and text.

Here you will find accurate information about all the Gemini models, their capabilities, and how they compare with other AIs such as GPT-4, ChatGPT 3.5, and Claude, with official sources.

Capabilities

Note: Although this is Google’s official demo, it was edited to run faster, and some pre-prompts are not shown. This post has the full prompts.

Google Gemini is a powerful and versatile AI model with a range of capabilities.

  1. Multimodal understanding: Gemini can understand, operate across, and combine different types of information, including text, code, audio, images, and video (a minimal API sketch follows this list).
  2. Advanced reasoning: Gemini can distinguish between relevant and irrelevant information in large datasets, such as scientific papers.
  3. Improved code generation: Gemini generates code more efficiently and effectively; Google’s AlphaCode 2 system, built on Gemini, performs better than an estimated 85% of competition participants, up from roughly 50% for the original AlphaCode.
  4. Powering various AI services: Gemini Pro powers Google’s chatbot Bard, while Gemini Nano is designed for specific tasks and mobile devices.
  5. Enhancing user experience: Gemini will be integrated into Google Search, Chrome, Duet AI, and Ads. Early tests show Gemini reducing Search Generative Experience (SGE) latency by 40%.
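
To make the multimodal point in item 1 concrete, here is a minimal sketch of sending an image plus a text instruction to Gemini through the `google-generativeai` Python SDK. The package, the `gemini-pro-vision` model name, and the placeholder API key and file name are assumptions based on the launch-time developer API, not something shown in Google’s demo.

```python
# Minimal sketch (assumption: the launch-time google-generativeai SDK and the
# "gemini-pro-vision" model id). Illustrates a mixed image + text prompt.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # hypothetical key from Google AI Studio

model = genai.GenerativeModel("gemini-pro-vision")  # multimodal (image + text) variant
image = Image.open("chart.png")                     # hypothetical local image file

# The prompt is a list that freely mixes modalities: an image and a text instruction.
response = model.generate_content([image, "Summarize what this chart shows."])
print(response.text)
```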

Gemini vs GPT-4

[Chart: Gemini Ultra vs GPT-4 benchmark comparison]
  • Gemini Ultra has surpassed GPT-4 in reasoning, math, and code-related text-based benchmarks.
  • In the MMLU benchmark, Gemini Ultra achieved a score of 90.0%. This not only surpasses GPT-4’s score of 86.4% but also marks the first time a model has exceeded human expert performance in this benchmark.
  • MMLU tests a combination of subjects to assess world knowledge and problem-solving abilities.
  • In image, video, and audio tests, Gemini Ultra also leads GPT-4, achieving state-of-the-art results on few-shot video captioning and zero-shot video question answering tasks (see the table below).
  • Gemini Ultra performed well without relying on external OCR systems to process images, indicating strong native multimodal capabilities.
| Task | Gemini Ultra | Gemini Pro | Few-shot state of the art |
| --- | --- | --- | --- |
| VATEX (test): English video captioning | 62.7 | 57.4 | 56.0 (DeepMind Flamingo, 4-shot) |
| VATEX ZH (test): Chinese video captioning | 51.3 | 50.0 | — |
| YouCook2 (val): English cooking video captioning | 135.4 | 123.2 | 74.5 (DeepMind Flamingo, 4-shot) |
| NextQA (test): Video question answering | 29.9 | 28.0 | 26.7 (DeepMind Flamingo, 0-shot) |
| ActivityNet-QA (test): Video question answering | 52.2 | 49.8 | 45.3 (Video-LLaVA, 0-shot) |
| Perception Test MCQA (test): Video question answering | 54.7 | 51.1 | 46.3 (SeViLA, Yu et al. 2023, 0-shot) |

Gemini vs ChatGPT 3.5

Bard uses Gemini Pro, which offers more advanced reasoning, planning, and writing than GPT-3.5.

“In blind evaluations with our third-party raters, Bard is now the most preferred free chatbot compared to leading alternatives.”

Source: Google
  • “Bard with Gemini Pro” is a tuned version of Gemini Pro with enhanced reasoning, planning, and writing capabilities.
  • It exceeds GPT-3.5 in six out of eight benchmarks, including MMLU and GSM8K.
  • Bard with Gemini Pro is claimed to have made the single biggest quality improvement since Bard’s launch.

Gemini vs PaLM 2

The instruction-tuned Gemini Pro models have shown significant advancements across various capabilities when compared to the PaLM 2 model API.

  • In creative writing tasks, Gemini Pro outperformed PaLM 2 65.0% of the time.
  • For following instructions, Gemini Pro’s win-rate was 59.2%.
  • Notably, Gemini Pro achieved a 68.5% win-rate for providing safer responses.

Gemini Benchmarks vs Popular LLMs

| Task | Gemini Ultra | Gemini Pro | GPT-4 | GPT-3.5 | PaLM 2-L | Claude 2 | Inflection-2 | Grok 1 | LLaMA-2 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MMLU (Multiple-choice questions in 57 subjects) | 90.04% | 79.13% | 87.29% | 70% | 78.4% | 78.5% | 79.6% | 73.0% | 68.0% |
| GSM8K (Grade-school math) | 94.4% | 86.5% | 92.0% | 57.1% | 80.0% | 88.0% | 81.4% | 62.9% | 56.8% |
| MATH (Math problems across 5 difficulty levels & 7 subdisciplines) | 53.2% | 32.6% | 52.9% | 50.3% | 34.1% | 34.4% | 34.8% | 23.9% | 13.5% |
| BIG-Bench-Hard (Subset of hard BIG-bench tasks) | 83.6% | 75.0% | 83.1% | 66.6% | 77.7% | — | — | — | 51.2% |
| HumanEval (Python coding tasks) | 74.4% | 67.7% | 67.0% | 48.1% | — | 70.0% | 44.5% | 63.2% | 29.9% |
| Natural2Code (Python code generation) | 74.9% | 69.6% | 73.9% | 62.3% | — | — | — | — | — |
| DROP (Reading comprehension & arithmetic; F1 score) | 82.4 | 74.1 | 80.9 | 64.1 | 82.0 | — | — | — | — |
| HellaSwag (validation set) | 87.8% | 84.7% | 95.3% | 85.5% | 86.8% | — | 89.0% | — | 80.0% |
| WMT23 (Machine translation; BLEURT score) | 74.4 | 71.7 | 73.8 | — | 72.7 | — | — | — | — |
Source: Gemini technical report

Gemini Pro vs Gemini Ultra vs Gemini Nano

| Feature | Gemini Pro | Gemini Ultra | Gemini Nano |
| --- | --- | --- | --- |
| Description | Best for scaling across a wide range of tasks requiring multimodality | Largest and most capable model, for highly complex tasks requiring advanced reasoning | Most efficient model, for on-device tasks |
| Multimodal capabilities | Yes | Yes | Yes |
| Availability | Available now | Coming early next year | Available for Android developers |
| Use case | Wide range of multimodal tasks | Highly complex tasks | On-device tasks |
| Benchmark performance | Surpasses GPT-4 in some areas | Exceeds the capability of all existing AI models | Not specified |

How to Access Gemini?

  1. Bard with Gemini Pro is becoming available in English in more than 170 countries and territories. The UK and Europe will get access soon.
  2. Bard Advanced is a new service that will offer early access to the most sophisticated Gemini models, including Gemini Ultra.
  3. Access for Developers and Businesses: Starting December 13, developers and business customers can use Gemini Pro through the Gemini API on two Google platforms: Google AI Studio and Google Cloud Vertex AI (a minimal code sketch follows this list).
  4. Release of Gemini Ultra: Gemini Ultra, the most advanced version, will be available to developers and business customers early next year.
  5. Gemini Nano for Android Developers: Android developers can use Gemini Nano, which is tailored for specific tasks and mobile devices.
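
For item 3, here is a minimal sketch of calling Gemini Pro through the Gemini API as exposed in Google AI Studio. The `google-generativeai` package, the `gemini-pro` model id, and the method names reflect the launch-time Python SDK and are assumptions rather than material from the article; Vertex AI offers a similar generative-model interface through the Google Cloud SDK.

```python
# Minimal sketch (assumption: the launch-time google-generativeai SDK and the
# "gemini-pro" model id). Sends a plain text prompt to Gemini Pro.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # hypothetical key created in AI Studio

model = genai.GenerativeModel("gemini-pro")  # text-focused Gemini Pro endpoint
response = model.generate_content(
    "In two sentences, explain what the MMLU benchmark measures."
)
print(response.text)
```

The same SDK also exposes a multi-turn chat interface (`start_chat` / `send_message`), which is closer to how a Bard-style conversation would be wired up.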

Closing Thoughts

With Gemini, Google has finally introduced a worthy rival to OpenAI’s GPT-4.

Gemini Ultra consistently outperforms other models in various benchmarks, particularly in tasks involving language understanding, video captioning, and question answering. This showcases its advanced capabilities in handling multimodal data and complex reasoning tasks.

Gemini Pro, while slightly trailing behind Gemini Ultra, also demonstrates robust performance, often surpassing Few-shot State-of-the-Art (SoTA) models. Its strengths are particularly evident in video-related tasks, underscoring Google’s strides in multimodal AI research.