AI-powered image generation has progressed at a remarkable pace — from early examples of models creating images of humans with too many fingers to now producing strikingly photorealistic visuals. Even with such leaps, one challenge remains: achieving creative control.

Creating scenes from text has gotten easier: prompts no longer need complex descriptions, and models follow them more closely. But describing finer details like composition, camera angles and object placement with text alone is hard, and making adjustments is even harder. Advanced workflows using ControlNets, tools that enhance image generation by providing greater control over the output, offer solutions, but their setup complexity limits broader accessibility.

To help overcome these challenges and fast-track access to advanced AI capabilities, NVIDIA announced the NVIDIA AI Blueprint for 3D-guided generative AI for RTX PCs at the CES trade show earlier this year. This sample workflow includes everything needed to start generating images with full composition control. Users can download the new Blueprint today.

Harness 3D to Control AI-Generated Images

The NVIDIA AI Blueprint for 3D-guided generative AI controls image generation with a draft 3D scene in Blender. The scene provides a depth map to the image generator, FLUX.1-dev from Black Forest Labs, which combines it with the user’s prompt to generate the desired images.

The depth map helps the image model understand where things should be placed. The advantage of this technique is that it doesn’t require highly detailed objects or high-quality textures, since the scene is reduced to a grayscale depth map in which only distance from the camera matters. And because the scenes are in 3D, users can easily move objects around and change camera angles.
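To make the grayscale conversion concrete, here is a minimal sketch of how per-pixel depths could be mapped to 8-bit gray levels. It is not the blueprint's actual code: the function name, the near and far clipping distances, and the "nearer is brighter" convention are all assumptions chosen for illustration.

```python
from typing import List


def depth_to_grayscale(
    depth_rows: List[List[float]], near: float, far: float
) -> List[List[int]]:
    """Map camera-space depths to 8-bit gray levels.

    Hypothetical helper: assumes depths are distances from the camera and
    that nearer surfaces should be brighter (255) and farther ones darker (0).
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    span = far - near
    gray_rows = []
    for row in depth_rows:
        gray_row = []
        for depth in row:
            # Clamp to the [near, far] range so out-of-range values saturate.
            clamped = min(max(depth, near), far)
            # Linear map: near -> 255, far -> 0.
            gray_row.append(round(255 * (far - clamped) / span))
        gray_rows.append(gray_row)
    return gray_rows


if __name__ == "__main__":
    # A tiny 2x3 "image" of depths in scene units (made-up sample values).
    sample = [[1.0, 4.0, 10.0], [0.5, 7.0, 20.0]]
    for row in depth_to_grayscale(sample, near=1.0, far=10.0):
        print(row)
```

Color and texture never enter the calculation, which is why rough placeholder geometry is enough for the draft scene.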

Under the hood of the blueprint is ComfyUI, a powerful tool that allows creators to chain generative AI models in interesting ways. For example, the ComfyUI Blender plug-in lets users connect Blender to ComfyUI. Plus, an NVIDIA NIM microservice lets users deploy the FLUX.1-dev model and run it at the best performance on GeForce RTX GPUs, tapping into the NVIDIA TensorRT software development kit and optimized formats like FP4 and FP8. The AI Blueprint for 3D-guided generative AI requires an NVIDIA GeForce RTX 4080 GPU or higher.

A Prebuilt Foundation for Generative AI Workflows

The blueprint for 3D-guided generative AI includes everything necessary for getting started with an advanced image generation workflow: Blender, ComfyUI, the Blender plug-ins to connect the two, the FLUX.1-dev NIM microservice and the ComfyUI nodes required to run it. For AI artists, it also comes with an installer and detailed deployment instructions.

The blueprint offers a structured way to dive into image generation, providing a working pipeline that can be tailored to specific needs. Step-by-step documentation, sample assets and a preconfigured environment provide a solid foundation that makes the creative process more manageable and the results more powerful.

For AI developers, the blueprint can act as a foundation for building similar pipelines or expanding existing ones. It comes with source code, sample data, documentation and a working sample for getting started.

Real-Time Generation Powered by RTX AI 

AI Blueprints run on NVIDIA RTX AI PCs and workstations, harnessing recent performance breakthroughs from the NVIDIA Blackwell architecture.

The FLUX.1-dev NIM microservice included in the blueprint for 3D-guided generative AI is optimized with TensorRT and quantized to FP4 precision for Blackwell GPUs, enabling inference more than twice as fast as native PyTorch FP16.

For users on NVIDIA Ada Lovelace generation GPUs, the FLUX.1-dev NIM microservice comes with FP8 variants, also accelerated by TensorRT. These improvements make high-performance workflows more accessible for rapid iteration and experimentation. Quantization also helps run models with less VRAM. With FP4, for instance, the model shrinks to less than half its FP16 size.
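The VRAM savings follow from simple arithmetic on bits per weight. The sketch below estimates the weight-only footprint of a model at each precision. The 12-billion parameter count is an assumption used for illustration, and the estimate ignores text encoders, the VAE, activations and any per-block scaling factors, so it is an idealized bound rather than a measured figure.

```python
# Bits used to store one weight in each numeric format.
BITS_PER_WEIGHT = {"FP16": 16, "FP8": 8, "FP4": 4}

# Assumed parameter count for illustration only; not taken from the article.
ASSUMED_PARAMS = 12_000_000_000


def weight_footprint_gib(num_params: int, fmt: str) -> float:
    """Return the idealized weight storage in GiB for a given format."""
    bits = BITS_PER_WEIGHT[fmt]
    return num_params * bits / 8 / 2**30


if __name__ == "__main__":
    baseline = weight_footprint_gib(ASSUMED_PARAMS, "FP16")
    for fmt in BITS_PER_WEIGHT:
        size = weight_footprint_gib(ASSUMED_PARAMS, fmt)
        print(f"{fmt}: {size:.1f} GiB ({baseline / size:.0f}x smaller than FP16)")
```

The ideal ratio between FP16 and FP4 is 4x. Quantized checkpoints typically store extra scaling data and may keep some layers at higher precision, so real-world savings tend to land below that ideal, which is consistent with the "more than 2x" reduction cited above.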

Customize and Create With RTX AI

There are 10 NIM microservices currently available for RTX, supporting use cases ranging from image and language generation to speech AI and computer vision, with more blueprints and services on the way.

Available now at https://build.nvidia.com/nvidia/genai-3d-guided, AI Blueprints and NIM microservices provide powerful foundations for those ready to create, customize and push the boundaries of generative AI on RTX PCs and workstations.

Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations.

Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.

Follow NVIDIA Workstation on LinkedIn and X.

See notice regarding software product information.
