Is Google Bard Catching Up to Bing and GPT-4? Insights Revealed!
Written on
Chapter 1: The Rise of Google Bard
Recently, I published an article highlighting how GPT-4 had a significant advantage over Bard in mathematical reasoning. Back then, even GPT-3.5 outperformed Bard. However, given Google's extensive investment and reputation for cutting-edge technology, it’s clear they’re not backing down. Google aims to maintain its position as the leading source for diverse information, from DIY home projects to complex math and quantum physics.
As expected, Google has been heavily investing in Bard’s development.
“Bard is built on the Pathways Language Model 2 (PaLM 2), a large language model (LLM) launched in late 2022. This model is a more powerful successor to LaMDA, which Bard was initially based on. PaLM 2 boasts 1.56 trillion parameters, significantly outpacing LaMDA's 137 billion parameters. Google is already developing a follow-up, PaLM 3, anticipated to have 10 trillion parameters, effectively doubling PaLM 2’s capabilities.” — bard.google.com
I was astonished to learn that by July, just months after its debut, Bard could process image inputs! Even more impressive was its ability to understand and solve a geometry problem from a photo I submitted.
Google Bard showcases impressive problem-solving skills by interpreting geometry images!
For comparison, ChatGPT requires a Plus subscription for image uploads, priced at $22 per month, while Bing, which utilizes GPT-4, remains free and allows image submissions. I found that Bing successfully tackled the same 50-cent coin problem.
Chapter 2: Image Recognition Capabilities
To further assess the performance of Bard and Bing, I conducted a few tests focusing on image recognition.
Here’s an image prompt I presented to both Bing and Bard, along with Bing’s analysis.
Bing accurately identified the image as "children’s toys scattered on a carpeted floor," even pinpointing a Spiderman toy, though it did misclassify a train as an airplane. Bard, however, provided an even more comprehensive description.
It not only recognized Spiderman but also identified the specific toy as the “Spidey and his Amazing Friends Web Strike 2-in-1 Vehicle.” Bard benefits from years of Google’s research into image recognition through Google Lens, which connects to structured online product data. Nonetheless, Bard did make more inaccuracies than Bing, claiming to see a “green and yellow helicopter, stuffed animals, and a ball” that weren’t present.
Section 2.1: Evaluating Math and Problem-Solving Skills
My primary interest lies in assessing these AI systems' abilities in mathematical problem-solving. I prefer to challenge them with math competition questions, as these tests require creative thinking beyond memorization.
Here’s one example:
Bard provided a solid answer, correctly noting that there are 8C4 = 70 ways to select 4 vertices from a cube. While it acknowledged that any of the 6 faces would yield 4 coplanar points, it overlooked the possibility of the 4 points being on opposite diagonals.
This opens up an additional 6 potential planes, leading to a total probability of 12/70 = 6/35. Bing, on the other hand, rightly identified the diagonal planes but misrepresented another scenario involving a tetrahedron inscribed within the cube.
Another example involves a counting problem with an interesting twist.
Bard correctly counted the 31 multiples of 7 up to 218 but made an error while trying to tally numbers containing the digit 7. Intriguingly, Bard also attempted to solve the problem by generating a Python script.
I ran the code, and it indeed produced the correct answer of 153 days! It appears Bard fabricated the answer to fit its flawed reasoning. Bing, however, performed poorly, suggesting a total exceeding the original count of 218.
Section 2.2: Bard vs Bing — Who’s Ahead?
From the limited examples I’ve shared, Bard has matched or even exceeded Bing and GPT-4 in performance. Yet, these are just a few instances, and a single prompt can yield varied responses. Bard does offer a unique feature of allowing users to view three different draft responses.
In my experience, while Bing still holds an advantage in generating English text, solving math problems, and coding, it’s evident that Bard is rapidly catching up. The pace of advancement in both systems is truly remarkable!
Thanks for your time, and I invite you to share your thoughts in the comments.
This video compares ChatGPT, Google Bard, and Bing AI, assessing their capabilities in a head-to-head format.
In this episode, the discussion revolves around Google Bard, GitHub Copilot X, GPT-4, and the implications for job security as AI evolves.