Gemini Podcast Creator!
Is this a feature anyone even asked for?
I recently received a notification on my phone about Gemini, Google’s AI model. Like ChatGPT, Gemini has many different applications, from integrating with a phone as a personal assistant to simply answering questions. The notification highlighted several new features, but the one that immediately caught my attention, and the focus of this article, was podcast creation.
At first, I thought the feature would simply create an informative voiceover from the text I provided, reorganising the information into a monologue. To test it out, I uploaded an article I had previously written titled "Is Truth Subjective?"
After Gemini had spent a few minutes processing, I listened to the five-minute audio file and realised I was wrong. Instead of the informative monologue I expected, Gemini had created a discussion of the topic between two different voices, and had even given the file an interesting podcast episode title.
For reference, here’s the article I gave Gemini:
And here is a link to the Gemini chat where you can actually listen to what it created: https://g.co/gemini/share/621f4f354ac5
While this feature is impressive, it's not perfect. Although AI is linguistically sophisticated, many idiosyncrasies of AI-created speech can still be easily heard. Here’s a list of good and bad points about this feature.
The Good: Intelligent Interpretation
Overall, Gemini understood the core ideas of my single-viewpoint article and managed to create a surprisingly interesting two-person conversation out of the material. Even the conclusion at the end of the podcast was accurate and presented in an appropriate, conversational way.
Even more surprising, Gemini added external information to support points in the article, showing that it can look beyond the source material. For example, in my article I mention that dark matter and dark energy make up most of the universe. Gemini expanded on this by adding a specific detail: "we calculate that 85% of the universe’s mass is missing, based on gravity".
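(A side note on accuracy: I checked this number. At first it seemed that 95% was closer to the correct figure, but the two numbers may be measuring different things: dark matter alone is usually estimated at about 85% of the universe's matter, while dark matter and dark energy together account for roughly 95% of its total mass-energy. So the 85% is not necessarily an AI "hallucination", but it is not quite the figure that supports my original statement either. This does raise an interesting concern about using AI models in this way.)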
The Bad: Conversational Flaws and Source Reliance
Despite Gemini's interpretive intelligence, which is not specific to the podcast creation feature, the podcast had quite a few noticeable issues when it came to human speech patterns.
1. Bad Conversational Flow
• Reliance on the "Source": The podcast referred to the "sources" (my article) too often. It sounded like it was expecting the listener to have already read the source material, which is not how a podcast should be structured. It felt less like a conversation created for a third-party listener and more like a simple conversion of an article into dialogue.
• Misinterpretation of Context: Gemini seemed to misunderstand the context of an example I provided about perspective (is it a 6 or a 9 painted on the floor?). I then extended the example with a farm and some sheep, which seemed to confuse Gemini even more. While the “host” was discussing this example, the dialogue felt like it was missing chunks of sentences. For instance, the line "… on the variables like: are sheep missing? did a wolf get them? are they just over the hill? Without knowing these things" made no sense on its own, and would not have made sense even if the listener had read the article beforehand.
2. Unnatural Speech and Cadence
The most obvious issues, since this is a podcast meant to be listened to, were the tone of voice and cadence of speech. At various points, the tone at the end of a sentence would suggest there was more to be said, but the speech would suddenly cut off. Certain emphases and pauses came at the wrong time, even in the middle of a sentence, for no apparent reason, making the artificial nature of the voice quite clear.
3. Taking the Source as Objective Truth
While Gemini showed it could look outside the source for factual data (like the dark matter percentage I mentioned above), it ironically treated the subjective parts of my article as objective truth. For example, it accepted my statement that the phrase "this is my truth" is popular. In reality, it could be that this phrase is only popular locally to me (I hadn’t done any research on this particular phrase, so it’s possible that it wasn’t popular at the time).
When all is said and done, this seems like a feature with great potential, but it needs some tweaking.
My experiment suggests that giving it a full article as source material is probably not the best way to go about this. It might be better to provide brief bullet points listing core arguments. I had considered trying to give it a script, but seeing how it altered and reorganised the article, giving it a script would be pointless since Gemini can and will alter the material to fit a podcast format.
Speech generation, I believe, is the main aspect of this feature that needs looking into, because humans can easily spot an AI-generated voice… at least for the moment.
Do you think this feature is good? Do you ask yourself, like I do, who on earth asked for this feature? And do you think AI is encroaching into expressive media a little too much with this feature? I would love to hear your thoughts on this.
---
Author's note: I fully wrote the article, but then put it in Gemini and asked it to reorganise and fix the flow of the article. It did that quite well, but I noticed it added and changed some things, and in some places it seemed like it was praising the podcast feature a little too much (I had edited the article after putting it through Gemini of course). Was Gemini intentionally putting itself in a more positive light, or was it just trying to connect paragraphs better?
About the Creator
Mo Darasi
I write fiction, poetry and occasional articles about interesting topics.
Finding interesting ways to write a poem or hide messages within them seems to be my main interest in writing now, and it's been fun.


