What Is This?

Have you ever wanted to just put your phone down and talk? AI assistants have started gaining live modes that let you hold a spoken, back-and-forth conversation. Our goal is to test each one's capabilities and give it an overall grade from 1 to 10. Each test has three parts: a music test, where I hum “Rewrite the Stars” and see whether the AI can identify it; a look at any special features the live mode offers; and a conversation in which I talk to it as if I were packing for a trip. We will be testing ChatGPT, Gemini, Perplexity, Meta, and Grok. All tests are done in the phone app to keep things consistent.

ChatGPT

Starting with ChatGPT, live mode is available both on the web and in the phone app. Its live mode lets you mute and change voices, but nothing else. In live mode, I told it what I was going to do and then sang the lyrics. Once the chat ends, it transcribes what it heard. The problem is that the transcription is not very accurate, as you can see below. Another oddity is that it sounds as if it is rushing through what it is saying. When I asked it about a 4-day trip to NYC, it gave me a good list of items to bring, but once again it seemed to race through the list. Overall, I give ChatGPT’s live mode a 7/10: it works, but it lacks features that the other AIs have and doesn’t sound great.

ChatGPT Song Transcription ChatGPT Packing List

Gemini

Moving on to Gemini, its live mode is only available in the phone app as of now. Even so, it has more features than ChatGPT. For example, it lets you pause the conversation, whereas ChatGPT only lets you mute. It also offers camera and screen-share options. One feature it lacks is a proper transcription of what is said. What it captures of your speech behaves like voice-to-text, dropping some words and changing others entirely, yet Gemini still manages to understand what is being said. When I entered live mode and told it what it was going to do, it said it was up for the challenge, unlike ChatGPT before it. It also gave a quick and correct answer. When I asked if it was sure, it gave me the lyrics to prove itself. Gemini’s voice sounds much more human-like, mainly in the way it flows through a sentence. When I went on to ask about the NYC trip, it gave me a weather summary as well as a couple of must-have clothing items. While the list was not as detailed, it told me what I needed to pack, which is what I asked for. I give Gemini a 9.5/10: it works well, but in live mode it cannot access Google account information such as email or calendar, although it can access general weather information. This is planned to change in the coming months, and I should post an update when it does.

Gemini Song Transcription Gemini Packing List

Perplexity

Next up is Perplexity. It is not as well-known as the previous two AIs, but it is good nonetheless. Like Gemini, it offers video and screen share. Unlike Gemini, however, it transcribes what it is saying under the live icon. The only downside is that, unlike ChatGPT, which transcribes the audio, it takes only a little of what is said and turns it into the prompt, as you can see in the Perplexity Song Transcription. When I started live mode and told it what I was going to do, it said that it couldn’t find a song's name from singing, but when I sang it anyway, it found the correct song and movie and gave citations for where it found them. When I asked what I should pack for a 4-day trip to NYC, it gave me fewer items than ChatGPT and Gemini. However, it gave me a great starting point for packing, and if you're like me, you will follow up by asking how you are doing. Overall, I give Perplexity an 8.5/10: it has a good voice and speed, but it struggles to understand everything you say and doesn’t like to be interrupted. Also, unlike in Gemini, some of the camera and screen-share features stop working once it starts talking.

Perplexity Song Transcription Perplexity Packing List

Meta

Up next is Meta, which has many cool features. Like ChatGPT, Meta can be talked to live in both the phone app and on the website. It lacks camera and screen-share features, although it is built into Meta AI glasses. Meta is also testing a full-duplex demo live mode, which I did not use. Starting in live mode, I told it what the plan was, and it said that it was ready. While I talked, it showed a small caption of what I was saying, and it captioned its own speech as well. Overall, the transcription was one of the best of all the ones we had tested by this point. It correctly identified the song and the show it was from. When I asked what to pack for the NYC trip, it understood that the month was June and gave me a good list of items to pack. Overall, I give Meta AI a 7.5/10, because its interactions are not quite human-like, and it tends to interrupt you and forget the conversation when you keep talking.

Meta Song Transcription Meta Packing List

Grok

Finally, Grok. Grok took the longest to connect to live mode, but it has a lot of different “modes” you can use. Grok's transcription was better than the others', but it still fell short at times: it duplicated my singing and still got the song wrong. Grok also has only two voice options, which is limiting for people who are sensitive to high or low voices. While Grok is planned to be implemented in many different areas of the tech industry, it doesn’t quite seem ready for that. When I asked Grok about my NYC trip, it gave a very brief and broad description of what I should pack. Overall, I give Grok a 3.5/10, as it struggles to understand you and to check its answers, and it took the longest to connect.

Grok Song Transcription Grok Packing List

Wrapping Up

Overall, Gemini comes first (9.5/10), followed by Perplexity (8.5/10), Meta (7.5/10), ChatGPT (7/10), and lastly Grok (3.5/10). Thanks for reading!