
Apache ORC
Apache ORC is a columnar storage format optimized for big data processing. It provides efficient compression, indexing, and fast data access, making it ideal for analytics workloads in Hadoop-based systems. ORC supports complex types and is designed for high performance and scalability.
Vendor
The Apache Software Foundation



Product details
Apache ORC
Apache ORC (Optimized Row Columnar) is a high-performance columnar storage format designed for big data processing in the Hadoop ecosystem. It provides efficient compression, fast data access, and rich indexing capabilities, making it well suited to large-scale analytics workloads. ORC is self-describing and type-aware, supports complex data types, and is optimized for large streaming reads.
Features
- Columnar storage format optimized for Hadoop
- Built-in indexes including min/max values and bloom filters
- Support for complex types: structs, lists, maps, and unions
- Lightweight metadata for fast schema discovery
- Predicate pushdown for efficient query filtering
- ACID transaction support with snapshot isolation (when used with Apache Hive)
- Stripe-based file structure for parallel processing
- Advanced compression techniques for reduced storage footprint
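To illustrate how min/max indexes and predicate pushdown work together, here is a minimal sketch in Python. It is a toy model, not the ORC library API: the `Stripe` class and `scan_greater_than` function are hypothetical names invented for this example, and real ORC files keep these statistics in file and stripe metadata rather than computing them on the fly.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stripe:
    """Toy stand-in for a stripe: one integer column plus its min/max statistics."""
    values: List[int]

    @property
    def min_value(self) -> int:
        return min(self.values)

    @property
    def max_value(self) -> int:
        return max(self.values)


def scan_greater_than(stripes: List[Stripe], threshold: int) -> Tuple[List[int], int]:
    """Evaluate `value > threshold`, returning (matches, number of stripes read)."""
    matches: List[int] = []
    stripes_read = 0
    for stripe in stripes:
        # Pushdown step: if the stripe's max is not above the threshold,
        # no row in it can match, so the stripe is skipped without reading it.
        if stripe.max_value <= threshold:
            continue
        stripes_read += 1
        matches.extend(v for v in stripe.values if v > threshold)
    return matches, stripes_read


stripes = [Stripe([1, 5, 9]), Stripe([10, 14, 18]), Stripe([20, 25, 30])]
matches, stripes_read = scan_greater_than(stripes, 15)
print(matches, stripes_read)
```

In this example the first stripe is skipped entirely because its maximum value is below the filter threshold, so only two of the three stripes are read.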
Capabilities
- Seamless integration with Apache Hive, Spark, and other Hadoop tools
- Efficient read and write operations for large datasets
- Type-aware encoding for optimal performance
- Supports schema evolution and backward compatibility
- Enables distributed processing with independent file stripes
- Java and C++ libraries for reading and writing ORC files directly
- Designed for high-throughput and low-latency analytics
- Supports vectorized query execution for faster performance
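Type-aware encoding means the encoding is chosen per column type; integer columns in ORC, for example, use run-length encoding. The sketch below shows only the basic idea in Python. It is a simplified illustration, not ORC's actual encoder, which uses more elaborate run-length variants, and the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from typing import List, Tuple


def rle_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            # Same value as the current run: extend the run by one.
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            # New value: start a new run of length 1.
            runs.append((v, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into the original sequence."""
    return [v for v, n in runs for _ in range(n)]


column = [7, 7, 7, 3, 3, 9]
encoded = rle_encode(column)
print(encoded)
```

Columns with long runs of repeated values, such as sorted keys or status flags, shrink the most under this kind of encoding.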
Benefits
- Reduces storage costs through advanced compression
- Accelerates query performance with built-in indexing
- Improves scalability for big data applications
- Supports transactional updates with ACID guarantees when used with Apache Hive
- Simplifies data management with self-describing files
- Open source, released under the Apache License 2.0, and actively maintained
- Used by large organizations such as Facebook and Yahoo for petabyte-scale data