
SANTA CLARA, Calif., March 21, 2023 (GLOBE NEWSWIRE) - GTC - NVIDIA today launched four inference platforms optimized for a diverse set of rapidly emerging generative AI applications - helping developers quickly build specialized, AI-powered applications that can deliver new services and insights.

The platforms combine NVIDIA's full stack of inference software with the latest NVIDIA Ada, Hopper and Grace Hopper processors - including the NVIDIA L4 Tensor Core GPU and the NVIDIA H100 NVL GPU, both launched today. Each platform is optimized for in-demand workloads, including AI video, image generation, large language model deployment and recommender inference.

"The rise of generative AI is requiring more powerful inference computing platforms," said Jensen Huang, founder and CEO of NVIDIA. "The number of applications for generative AI is infinite, limited only by human imagination. Arming developers with the most powerful and flexible inference computing platform will accelerate the creation of new services that will improve our lives in ways not yet imaginable."

Accelerating Generative AI's Diverse Set of Inference Workloads

Each of the platforms contains an NVIDIA GPU optimized for specific generative AI inference workloads as well as specialized software:

NVIDIA L4 for AI Video can deliver 120x more AI-powered video performance than CPUs, combined with 99% better energy efficiency. Serving as a universal GPU for virtually any workload, it offers enhanced video decoding and transcoding capabilities, video streaming, augmented reality, generative AI video and more.

NVIDIA L40 for Image Generation is optimized for graphics and AI-enabled 2D, video and 3D image generation. The L40 platform serves as the engine of NVIDIA Omniverse™, a platform for building and operating metaverse applications in the data center, delivering 7x the inference performance for Stable Diffusion and 12x Omniverse performance over the previous generation.

NVIDIA H100 NVL for Large Language Model Deployment is ideal for deploying massive LLMs like ChatGPT at scale. The new H100 NVL, with 94GB of memory and Transformer Engine acceleration, delivers up to 12x faster inference performance on GPT-3 compared to the prior-generation A100 at data center scale.

NVIDIA Grace Hopper for Recommendation Models is ideal for graph recommendation models, vector databases and graph neural networks. With the 900 GB/s NVLink®-C2C connection between CPU and GPU, Grace Hopper can deliver 7x faster data transfers and queries compared to PCIe Gen 5.

The platforms' software layer features the NVIDIA AI Enterprise software suite, which includes NVIDIA TensorRT™, a software development kit for high-performance deep learning inference, and NVIDIA Triton Inference Server™, open-source inference-serving software that helps standardize model deployment.

Google Cloud is a key cloud partner and an early customer of NVIDIA's inference platforms. It is integrating L4-powered G2 virtual machines into its machine learning platform, Vertex AI, and is the first cloud service provider to offer L4 instances, with a private preview of its G2 virtual machines launching today.
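To give a sense of how Triton Inference Server standardizes model deployment, a model is described to the server with a small `config.pbtxt` file in its model repository. The sketch below is illustrative only; the model name, backend choice and tensor names/shapes are hypothetical:

```
# Hypothetical Triton model configuration (config.pbtxt).
# Declares a TensorRT engine with one image input and one
# classification output; Triton uses this to route and batch requests.
name: "example_classifier"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

Because every model, regardless of framework, is declared through the same configuration schema, clients can query any deployed model over the same HTTP/gRPC interface.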
