Apache Ozone is a scalable, distributed object store designed for big data, analytics, and cloud-native applications. It supports billions of objects, S3-compatible APIs, and Hadoop integration, offering strong consistency, flexible durability, and secure, efficient storage for hybrid and large-scale environments.

Vendor

The Apache Software Foundation

Product details

Apache Ozone

Apache Ozone is a scalable, distributed object store designed for big data, analytics, and cloud-native applications. Originating from the Hadoop ecosystem, it supports billions of objects and exabytes of capacity. Ozone provides strong consistency, flexible durability, and multi-protocol access, making it suitable for lakehouse workloads, AI/ML pipelines, and hybrid cloud deployments.

Features

  • S3-compatible REST API and Hadoop-compatible file system (OFS)
  • Strong consistency via Raft-based replication
  • Configurable durability with replication and erasure coding
  • Transparent Data Encryption (TDE) and TLS/SSL support
  • Kerberos authentication and ACL-based authorization
  • Integration with Apache Ranger for policy management
  • Efficient small file handling and metadata management
  • Snapshot support for buckets
  • CLI and Java client API for administration and access
  • Recon web UI for monitoring and management

Capabilities

  • Handles both small and large files across billions of objects
  • Separates metadata and data layers for independent scaling
  • Supports hybrid cloud scenarios with on-prem S3 compatibility
  • Enables streaming and low-latency access patterns
  • Manages hierarchical namespaces with volumes, buckets, and keys
  • Offers quota management and data rebalancing
  • Operates on commodity hardware with open-source licensing
  • Supports rolling upgrades and node decommissioning
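
The volume / bucket / key hierarchy listed above can be made concrete with a small sketch. It assumes the rooted OFS path layout ofs://<om-host-or-service-id>/<volume>/<bucket>/<key>; the OzonePath type and the parse_ofs_path helper are hypothetical names introduced for illustration and are not part of any Ozone client library.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class OzonePath(NamedTuple):
    """Hypothetical container for the parts of an OFS path."""
    authority: str  # Ozone Manager host or service ID
    volume: str
    bucket: str
    key: str        # empty string when the path names only a bucket


def parse_ofs_path(path: str) -> OzonePath:
    """Split an ofs:// path into authority, volume, bucket, and key.

    Assumes the layout ofs://<authority>/<volume>/<bucket>/<key...>.
    """
    parsed = urlparse(path)
    if parsed.scheme != "ofs":
        raise ValueError(f"expected an ofs:// path, got: {path!r}")

    # Drop empty segments produced by leading or repeated slashes.
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 2:
        raise ValueError("path must contain at least a volume and a bucket")

    volume, bucket, *key_parts = parts
    # Everything after the bucket is the key; keys may contain slashes.
    return OzonePath(parsed.netloc, volume, bucket, "/".join(key_parts))


if __name__ == "__main__":
    # The host, volume, and bucket names below are made up for the example.
    print(parse_ofs_path("ofs://om-service/sales/raw/2024/01/events.parquet"))
```

The key keeps its embedded slashes, which is how a flat object key can still be presented as a directory-like path to file system clients.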

Benefits

  • Scales seamlessly for enterprise-grade storage needs
  • Reduces storage costs with erasure coding
  • Enhances security and compliance with integrated controls
  • Simplifies integration with existing Hadoop and cloud-native tools
  • Improves performance for diverse workloads
  • Enables flexible deployment in Kubernetes and YARN environments
  • Facilitates unified storage across analytics and operational systems
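
The storage cost claim for erasure coding can be checked with simple arithmetic. The sketch below compares raw capacity needed under three-way replication with Reed-Solomon layouts of k data blocks plus m parity blocks. The 6+3 and 10+4 layouts and the 100 TB figure are illustrative assumptions, not values taken from this listing; check which schemes your Ozone release supports. The estimate ignores metadata and the padding overhead of small objects.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per logical byte with full replication."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies)


def ec_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Raw bytes stored per logical byte with a k+m erasure-coded layout."""
    if data_blocks < 1 or parity_blocks < 0:
        raise ValueError("need at least one data block and non-negative parity")
    return (data_blocks + parity_blocks) / data_blocks


def raw_capacity_tb(logical_tb: float, overhead: float) -> float:
    """Raw capacity required to hold logical_tb of user data."""
    return logical_tb * overhead


if __name__ == "__main__":
    logical_tb = 100.0  # assumed amount of user data, for illustration only
    layouts = {
        "3x replication": replication_overhead(3),
        "RS 6+3 (assumed layout)": ec_overhead(6, 3),
        "RS 10+4 (assumed layout)": ec_overhead(10, 4),
    }
    for name, overhead in layouts.items():
        raw = raw_capacity_tb(logical_tb, overhead)
        print(f"{name}: {overhead:.2f}x overhead, {raw:.0f} TB raw")
```

Under these assumptions a 6+3 layout needs half the raw capacity of three-way replication while still tolerating the loss of any three blocks in a stripe; the trade-off is extra CPU and network work on writes and on reconstruction.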