Google Gemini vs ChatGPT (For Day-To-Day Tasks)

Welcome to this week’s edition of Architecture Insights.

Last week, Google released a major update to their large language model called Gemini. Gemini, which was formerly known as Bard, has always been considered the top competitor of ChatGPT.

We will assess and compare the features of both LLMs and their relevance to daily practices in architecture and design.

As always, here is this week’s latest news in AI for architects and designers.

News & Updates

1. OpenAI shares a preview of their new text-to-video AI.

Going by the name Sora, it is currently being tested to make sure it doesn't produce harmful or inappropriate content. It is not yet available to all users; OpenAI has granted access only to a select few visual artists, designers, and filmmakers to gather feedback before releasing it to the public. Nevertheless, Sora is a massive improvement in the text-to-video space.

Watch the AI-generated video from the image below.

Source: Google

2. ChatGPT rolls out new memory features.

Their objective is to enable ChatGPT to retain information from previous conversations, thereby reducing the need to repeat information frequently. The initial version of ChatGPT's memory can automatically remember important details as you converse with it, but you can also instruct it to memorize specific information.

The memory feature is entirely optional and can be turned on or off as you prefer. Like Sora, this feature is only being tested with a limited number of users.

Gemini vs ChatGPT

Most of us are familiar with ChatGPT (especially if you have been reading this newsletter), and maybe a little less familiar with Gemini (previously Bard), which is owned by Google.

Although there are many similarities between the two, there are also some key differences that lead to preferences in the practice of architecture.

Let’s go through the basics first. Currently, ChatGPT and Gemini are each available in two versions: free and paid. Both paid plans cost $20 USD per month.

The free version of ChatGPT runs on GPT-3.5, whereas the paid version runs on GPT-4. Similarly, the free version of Gemini runs on Pro 1.0, and the paid version runs on Ultra 1.0.

Here are some comparisons of the free versions.

Image Generation

ChatGPT is particularly good at generating images from text descriptions. DALL-E will translate most prompts into a photorealistic image that looks like what you had imagined. While it won’t be preferred over Midjourney, it is still great for quick and simple images.

On the other hand, Gemini, which generates images with Google's Imagen 2 model, leans toward sketches and artistic interpretations of concepts. While these sketches may not be as photorealistic as DALL-E's images, they capture more of the detail in a prompt and can foster rapid iteration and exploration. Gemini's "Ultra" version is particularly good at this.

Document Analysis

ChatGPT is great for summarizing key points, extracting relevant information, and answering specific questions about your documents. However, its summaries can sometimes lack nuance and require careful verification.

With a Plus subscription, you can access plug-ins which offer more specific features such as summarizing and analyzing PDFs.

Gemini's document handling is not as advanced as ChatGPT's plug-ins. When we asked whether it can summarize PDFs, it acknowledged the following:

“Direct Summarization (With Guidance): If you can provide the PDF, I can try to generate a summary of the key points. Be aware that my ability to do this effectively depends on the PDF's length, complexity, and how well-structured it is. Here's how to get the best results: Give me instructions: Specify the desired summary length (e.g., a few sentences, a paragraph) and any particular aspects you want to focus on.”

You can still achieve similar results with Gemini but it will require more work on your part to guide it.

Proposal Writing

Both models have their advantages. ChatGPT is skilled at creating persuasive arguments that can effectively showcase the value proposition of your design. However, it lacks technical expertise which may result in inaccuracies in proposal details.

On the other hand, Gemini is proficient at generating precise and technically sound content. Its ability to personalize proposals for individual clients and incorporate design elements from previous projects adds to its appeal.

Project Documentation

Neither model directly generates Excel spreadsheets or PowerPoint presentations in its free version.

ChatGPT wins when it comes to data analysis, report writing, and summarizing varying types of information into digestible formats. This is valuable when looking at cost estimations and feasibility studies.

Gemini has similar capabilities in this area and takes a slight advantage when it comes to generating accurate text descriptions for diagrams and visuals.

Data Access

ChatGPT offers real-time data updates with GPT-4, but GPT-3.5 only has information up to 2023. Its data accuracy can sometimes be questionable, so cross-checking is needed in some cases.

Gemini focuses on verified and reliable data sources, which helps protect the integrity of your decisions. It even features a button at the bottom of each answer that lets you cross-check the response against the sources it drew from.

Conclusion

As AI language models continue to evolve, it might be beneficial to leverage the power of both models. The features of each large language model are always changing, and the best way to find the right fit for you is through your own experimentation.

AI Image of the week

Thank you for reading this week’s issue. Check past issues here. Share this newsletter with colleagues, friends, or anyone interested in the combined world of architecture and artificial intelligence.

Until next Friday,

A.I.
