AI inference is the process of using a trained machine learning model to make predictions on new, unseen data by applying learned patterns. This course is designed for developers, data scientists, and ML engineers who want to quickly deploy AI inference services on Cloud Run. It is useful for those familiar with cloud-based serverless application deployment, but who may not have experience running AI inference with Google Cloud serverless products. The course includes examples that deploy a model for AI inference with GPUs and integrate gen AI apps with data storage services.
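To make the definition above concrete, here is a minimal sketch of inference with a linear model: learned parameters are applied to a data point the model never saw during training. The weight values are hypothetical stand-ins for parameters a real training run would produce.

```python
def predict(features, weights, bias):
    """Apply learned weights to an unseen feature vector (linear model)."""
    return sum(f * w for f, w in zip(features, weights)) + bias

# Parameters a training step might have produced (assumed values).
trained_weights = [0.4, 1.1]
trained_bias = -0.2

# Inference: applying the learned patterns to a new, unseen data point.
new_sample = [2.0, 3.0]
prediction = predict(new_sample, trained_weights, trained_bias)
print(prediction)  # 0.4*2.0 + 1.1*3.0 - 0.2 = 3.9
```

Serving a real model on Cloud Run follows the same pattern, with the trained model loaded at container startup and `predict` exposed behind an HTTP endpoint.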
In this course, you learn how Gemini, a generative AI-powered collaborator from Google Cloud, helps you use Google products and services to develop, test, deploy, and manage applications. With help from Gemini, you learn how to build a web application, fix errors in the application, develop tests, and query data. Through a hands-on lab, you experience how Gemini improves the software development lifecycle (SDLC). Note: Duet AI was renamed to Gemini, our next-generation model.
This course equips you with the knowledge and tools to address the unique challenges MLOps teams face when deploying and managing generative AI models, and shows how Vertex AI helps AI teams streamline MLOps processes and succeed in generative AI projects.