谷歌大杀器终于来了!最大规模Gemini发布:真超GPT4,手机直接可用
谷歌大杀器终于来了!最大规模Gemini发布:真超GPT4,手机直接可用

Google has ushered in a new era of large models with the official launch of Gemini 1.0 on December 6, as announced by CEO Sundar Pichai. Gemini, a native multimodal model, marks Google's first step into a new era of large models, offering three variants: the powerful Gemini Ultra, multitasking-oriented Gemini Pro, and task-specific and on-device Gemini Nano.
Google's ChatGPT-like application, Bard, has been upgraded to Gemini Pro, showcasing advanced capabilities in inference, planning, and understanding while remaining free. Google plans to introduce "Bard Advanced" in early next year, leveraging the capabilities of Gemini Ultra.
Gemini, a model Google claims to be the largest and most powerful, had been in the works since March, with rumors circulating about its trillion parameters and five times the training power of GPT-4. Google shifted focus from PaLM 2 to Gemini, merging Google Brain and DeepMind in April to tackle the project.
Gemini's capabilities have been a subject of curiosity, and though it outperforms in benchmarks and even surpasses human performance, Google's Eli Collins admitted they are still exploring the full extent of Gemini Ultra's abilities.
Sundar Pichai expressed excitement about the transformative potential of artificial intelligence, believing it to be a more profound shift than previous transitions like mobile or the internet. Gemini's release comes after rigorous testing, with Gemini Ultra excelling in various tasks, surpassing GPT-4 in most benchmarks.
Gemini's features include native multimodal support, seamless understanding and manipulation of diverse information types such as text, code, audio, images, and video. Gemini offers three versions: Ultra for complex tasks, Pro for multitasking, and Nano for on-device efficiency.
Gemini Ultra demonstrated outstanding performance, exceeding SOTA results in 30 out of 32 academic benchmark tests. In the MMLU dataset, Gemini achieved a score of 90.0%, surpassing human experts. The model's new approach in MMLU testing allowed for more thoughtful reasoning, resulting in significant improvements.
Gemini's native multimodal design enables it to comprehend and reason across different modalities efficiently. Its complex multimodal reasoning capabilities make it proficient in extracting insights from vast amounts of data, aiding breakthroughs in fields like science and finance.
Gemini 1.0 excels in simultaneous understanding of text, images, audio, and more, making it adept at reasoning through complex topics. The model's advanced coding abilities, supporting languages like Python, Java, C++, and Go, position it as a leading coding foundation.
Google used internally designed Tensor Processing Units (TPU) v4 and v5e for large-scale training of Gemini 1.0, ensuring reliability and efficiency. The newly launched Cloud TPU v5p is expected to accelerate Gemini's development, enabling faster training of large generative AI models.
Google plans to integrate Gemini across its products, enhancing capabilities in search, ads, Chrome, and Duet AI. Bard, powered by Gemini Pro, receives its most significant upgrade since launch, offering advanced inference, planning, and understanding.
Developers and businesses can access Gemini Pro through Google AI Studio or Google Cloud Vertex AI starting December 13. Android developers, starting with Pixel 8 Pro devices, can utilize Gemini Nano through AICore, a new system service in Android 14.
Google hints at the upcoming release of Gemini Ultra, currently undergoing trust and safety checks, with early access provided to selected clients, developers, and partners. Bard Advanced, utilizing Gemini Ultra, is set to launch in early next year.
Google envisions expanding Gemini's capabilities, including advancements in planning and memory, and increasing the context window to handle more information for improved responses.


Comments
There are no comments for this story
Be the first to respond and start the conversation.