
OpenAI ⚫ vs Google AI 🔴: The Summer of AI is Here! And It's Getting Hotter 🔥

Nanobits Product Spotlight

EDITOR’S NOTE

Hey Nanobits Readers!

Did you catch the AI fireworks this week? 🔥

OpenAI and Google just dropped some major bombshells 💣 that are going to change how we chat, create, and even think about AI.

Buckle up, because today we're breaking down the biggest announcements around GPT-4o (the "o" stands for "omni", and it's cooler than it sounds), the jaw-dropping highlights from Google I/O, and what they all mean for you, the reader!

OPENAI’s GPT-4o

Image Credits: OpenAI

OpenAI's latest model, GPT-4o, is a significant leap towards seamless human-machine interaction. We covered the major feature updates in our last edition on Tuesday.

GPT-4o can process various inputs, including text, audio, images, and video, and respond with text, audio, or image outputs (for the developers among you, there's a quick code sketch after the list below). Compared to its predecessors, this model:

  • Responds faster, with latency comparable to human conversation

  • Outperforms previous models in non-English languages

  • Shows superior vision and audio comprehension
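
For the curious builders among you, here is a minimal sketch of what a multimodal GPT-4o request looks like with OpenAI's Python SDK. The prompt and image URL are illustrative placeholders of our own, not something from OpenAI's demos:

    # Minimal sketch: sending text + an image to GPT-4o via the OpenAI Python SDK.
    # Assumes the OPENAI_API_KEY environment variable is set; the prompt and
    # image URL below are illustrative placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What's happening in this picture?"},
                    {"type": "image_url", "image_url": {"url": "https://example.com/dog.jpg"}},
                ],
            }
        ],
    )

    print(response.choices[0].message.content)  # GPT-4o's text reply

Note that the real-time voice features were demoed in the ChatGPT apps; this sketch covers only the text-and-vision slice of the multimodal story.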

Why is this relevant?

  • Eliminates Latency: Offers near-instantaneous responses in voice conversations, enhancing interactivity.

  • Understands Emotions: Detects nuanced emotions like gasps and breaths, enabling more realistic interactions.

  • Expresses Emotions: Replicates emotions through voice, even mimicking robotic and singing voices.

  • Interacts Visually: Processes real-time video feeds, understanding and responding to visual cues.

AI as a companion 👥: With the launch of this new model, we move towards a world where AI is our companion and not just a mere tool, whether for education, chatting, coding, or many other areas.

Free for All: GPT-4o has been made free for all users, so everyone can experience an advanced, high-quality model that is way [we can't stress the "way" enough!] better than the previous free model, GPT-3.5.

GOOGLE I/O UPDATE

Image Credits: Google

On the morning of May 14th, Google kicked off its annual developer conference, Google I/O 2024, or as CEO Sundar Pichai put it, "Google's version of the Eras Tour, but with fewer costume changes 🕺"

Google went all out and unveiled a wave of AI-powered advancements [apparently, by their own count, AI was mentioned 120 times during the keynote. Now that's a lot of AI for 110 minutes 😅]. By now, most of you will have already read about the new feature updates. Here is a summary of everything they announced:

  • New AI Models: Introduction of Gemini and Gemma models to enhance various applications.

  • AI-Infused Android: Integration of AI features into Android for smarter user experiences.

  • AI Voice Assistant: A new voice assistant powered by AI for intuitive interactions.

  • Text-to-Video: Innovative tool for generating videos from text input.

If you're curious for more, here is a list of 100 things that Google announced during its keynote session!

Why is this relevant?

Google I/O 2024 showcased the company's commitment to AI integration, promising transformative experiences for all its users.

  • Catching Up: Demonstrates Google's accelerated efforts to match OpenAI's advancements in AI.

  • Leveraging Strengths: Utilizes Google's vast experience and resources in search and real-time data to integrate AI into various products.

  • Maintaining Dominance: Aims to solidify Google's position as a leader in the AI-driven tech landscape and prevent market share loss.

  • User-Centric Innovation: Introduces AI-powered tools and features to enhance user experiences across Google's ecosystem.

This signals a significant shift in the tech landscape, with AI becoming increasingly central to our digital interactions and tools.

NANOBITS OPINION

This week has been a significant one in the history of AI. The sheer volume of innovation that both OpenAI and Google showcased is going to massively change the way people interact with AI. I have already started thinking about how I can use these features in my day-to-day work and personal life to make things easier.

A few things from both launches that stood out to me:

  • Project Astra by Google & Be My Eyes by OpenAI - both showcasing the future of visual AI assistants, opening up possibilities for people with visual impairments

  • NotebookLM by Google & AI Tutor by OpenAI - personalized tutoring assistance for users across various subjects

  • Gemini 1.5 Pro in Workspace Labs & Meeting AI by OpenAI - summarize meeting minutes for all the professionals out there

  • Responsible AI progress (finally!!)

    1. Google prioritizes proactive security measures like "AI-Assisted Red Teaming" and expanding SynthID for media authentication.

    2. OpenAI focuses on safety by design through data filtering and post-training refinement, along with extensive external red teaming to identify and mitigate risks.

  • Both launches used cute dogs 🐶 to make the demos even cuter (to give a smile to all you dog lovers!)

    Image Credits: Google’s Project Astra Demo

Noteworthy Google AI capabilities:

  • Gemini 1.5 Flash is a lighter-weight model designed to be fast and efficient to serve at scale

  • Both 1.5 Pro and 1.5 Flash are available in public preview with a 1-million-token context window (the largest among all global models) on Google AI Studio & Vertex AI - there's a quick code sketch after this list

  • Trillium, the sixth generation of Google's custom AI accelerator (TPU)

  • Imagen 3, Google’s highest-quality image generation model yet

    Image Credits: Google’s Imagen

  • Veo, Google’s most capable video generation model yet

  • Starting with Pixel later this year, Gemini Nano — Android’s built-in, on-device foundation model — will have multimodal capabilities.

  • A new, opt-in scam protection feature that will use Gemini Nano’s on-device AI to help detect scam phone calls in a privacy-preserving way.

  • A new experimental feature in Google Photos called Ask Photos makes it even easier to look for specific memories or recall information included in your gallery.

Image Credits: Google

  • PaliGemma, Google’s first vision-language open model optimized for visual Q&A and image captioning.
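
If you want to try that 1-million-token context window yourself once you have access, here is a minimal sketch using the google-generativeai Python SDK against Google AI Studio. The API key and prompt are placeholders of our own:

    # Minimal sketch: calling Gemini 1.5 Flash via the google-generativeai SDK
    # (Google AI Studio). Assumes you have an API key from Google AI Studio;
    # the key and prompt below are illustrative placeholders.
    import google.generativeai as genai

    genai.configure(api_key="YOUR_AI_STUDIO_KEY")  # placeholder key

    # Swap in "gemini-1.5-pro" for the larger model in the same public preview.
    model = genai.GenerativeModel("gemini-1.5-flash")

    response = model.generate_content(
        "Summarize this week's AI announcements in two sentences."
    )
    print(response.text)

The same generate_content call accepts very long inputs (think whole codebases or hour-long transcripts), which is where that 1-million-token window earns its keep.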

Other notable OpenAI capabilities:

  • OpenAI's models can now understand most major Indian languages, like Gujarati, Tamil, Telugu, Marathi, and Hindi

  • Coding Assistant - It’s like having a co-worker right on your screen helping you code and debug

  • 3D reconstruction from different AI-generated images

  • Improved text in image generation

Image Credits: OpenAI

What could have been better? 😏

  • OpenAI's offering for its Plus users was disappointing. Apart from extended capacity and a desktop app for Mac, there was no significant new offering.

  • Most of Google's products, unlike OpenAI's, are yet to be released for public use. As an early adopter, I would have loved the chance to try out the new features immediately.

  • OpenAI's live demonstration in front of an audience was a masterstroke of GTM strategy, versus Google's gated AI sandbox trials

  • OpenAI’s real-time product demonstration vs. Google’s pre-recorded videos

In general, there was an apparent dip in the enthusiasm I felt across X (formerly Twitter!) and other social media after the Google I/O keynote session 😐. I am not sure whether it was because viewers were no longer surprised by what AI can do [OpenAI had dazzled everyone just two days earlier] or because they could not get their hands dirty trying out the new products.

Nevertheless, both events were groundbreaking!
And, as a regular viewer and an ardent supporter of AI, I will be looking forward to trying out the new features soon.

Did you watch these launches? Which were your favorite demos? Are you ready for this AI revolution?

I'd love to hear about your experiences and how you plan to integrate AI into your life – drop your thoughts in the comments or reply to this email!

AI JOB OPPORTUNITIES

Click here to browse more AI jobs in India and outside

LATEST NEWS IN AI

OpenAI co-founder Ilya Sutskever exits, replaced by Jakub Pachocki

The Senate unveils a $32 billion AI regulation roadmap, focusing on workforce training, content safety, privacy, energy costs, and funding for research

DST and BharatGPT collaborate to create language models in Indian languages, aiming for widespread adoption by Indian startups across various sectors

Instagram co-founder Mike Krieger joins Anthropic as Chief Product Officer, focusing on developing intuitive AI products like the Claude app, targeting workplace interactions

AWS launches Amazon Bedrock GenAI service in AWS Asia Pacific (Mumbai) Region, enabling Indian organizations to build and scale generative AI applications securely and efficiently

Share the love ❤️ Tell your friends!

If you liked our newsletter, share this link with your friends and request them to subscribe too.

Check out our website to get the latest updates in AI
