
NeMo Framework Container
The NVIDIA NeMo™ framework supports enterprise development of LLMs and other generative AI models with automated data processing, model training techniques, and flexible deployment options.
Vendor
NVIDIA
Product details
NVIDIA NeMo™ is an end-to-end platform for developing custom generative AI models. Designed for enterprise use, it covers the complete workflow: automated, distributed data processing; training of large-scale custom models with 3D parallelism (data, tensor, and pipeline parallelism combined); and deployment for large-scale inference, including retrieval-augmented generation, on the infrastructure of your choice, whether on-premises or in the cloud.
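To make the 3D parallelism idea concrete, the arithmetic behind it can be sketched as follows. This is a minimal illustration, not NeMo's API: the function names, the 64-GPU example, and the rank layout (tensor-parallel index varying fastest) are assumptions chosen for clarity. The one relationship that holds generally is that the GPU count equals the product of the data-, tensor-, and pipeline-parallel sizes.

```python
def data_parallel_size(world_size: int, tensor_parallel: int, pipeline_parallel: int) -> int:
    """Return how many model replicas fit on `world_size` GPUs.

    Each replica occupies tensor_parallel * pipeline_parallel GPUs, so
    world_size = data_parallel * tensor_parallel * pipeline_parallel.
    """
    if min(world_size, tensor_parallel, pipeline_parallel) < 1:
        raise ValueError("all sizes must be positive integers")
    gpus_per_replica = tensor_parallel * pipeline_parallel
    if world_size % gpus_per_replica != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tensor_parallel * pipeline_parallel = {gpus_per_replica}"
        )
    return world_size // gpus_per_replica


def rank_to_coords(rank: int, tensor_parallel: int, pipeline_parallel: int) -> tuple[int, int, int]:
    """Map a global rank to (data, pipeline, tensor) indices.

    Assumed layout for illustration only: the tensor index varies fastest,
    then the pipeline index, then the data index. Real frameworks may
    order their process groups differently.
    """
    tp_idx = rank % tensor_parallel
    pp_idx = (rank // tensor_parallel) % pipeline_parallel
    dp_idx = rank // (tensor_parallel * pipeline_parallel)
    return dp_idx, pp_idx, tp_idx


# Hypothetical cluster: 64 GPUs, tensor parallel 4, pipeline parallel 2.
print(data_parallel_size(64, 4, 2))  # 8 replicas
print(rank_to_coords(13, 4, 2))      # (1, 1, 1)
```

In this hypothetical setup each model replica spans 8 GPUs, and the 8 replicas each process a different slice of every global batch.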
Features
- Parallelism Techniques: Data Parallelism, Fully Sharded Data Parallelism (FSDP), Tensor Parallelism, Pipeline Parallelism, Sequence Parallelism, Expert Parallelism, Context Parallelism.
- Memory-Saving Techniques: Selective Activation Recompute (SAR), CPU offloading (Activation, Weights), Flash Attention (FA), Grouped Query Attention (GQA), Multi-Query Attention (MQA), Sliding Window Attention (SWA).
- Multimodal Training: Supports language and multimodal models, including Llama 2, Falcon, CLIP, Stable Diffusion, and LLaVA, as well as text-based generative AI architectures such as GPT, T5, BERT, MoE, and RETRO.
- Pretrained Models: Includes pretrained models for Computer Vision (CV), Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text to Speech (TTS).
- Customization Options: Offers techniques for adapting pretrained LLMs to specialized use cases, including p-tuning, LoRA, supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and SteerLM.
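As a rough illustration of why a customization technique such as LoRA is cheaper than full fine-tuning, the sketch below compares trainable parameter counts for a single weight matrix. LoRA freezes the original matrix of shape (d_out, d_in) and trains two low-rank factors of shapes (d_out, r) and (r, d_in). The dimensions and rank used here are hypothetical and do not come from any NeMo configuration; the function names are illustrative only.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole (d_out x d_in) matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    if rank < 1:
        raise ValueError("rank must be a positive integer")
    return rank * (d_in + d_out)


# Hypothetical projection layer: 4096 x 4096, adapted with rank 8.
d_in, d_out, rank = 4096, 4096, 8
full = full_finetune_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)

print(full)                 # 16777216
print(lora)                 # 65536
print(f"{lora / full:.2%}")  # 0.39%
```

For this assumed layer, the low-rank factors hold 1/256 of the parameters of the full matrix, which is where the savings in optimizer state and gradient memory come from.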
Benefits
- Efficiency Gains: Spreads computation and memory across GPUs and nodes so that available hardware is used efficiently.
- Reduced Training Time: Significantly reduces training time with seamless multi-node and multi-GPU training.
- Flexibility: Supports multiple model families, customization techniques, and on-premises or cloud deployment to meet varying business requirements.
- Enhanced Productivity: Improves productivity through advanced parallelism and memory-saving techniques.
- Enterprise Support: Supported by NVIDIA AI Enterprise for a production-grade, secure, end-to-end software platform.