Storage optimized for analytics workloads

Vendor

Amazon Web Services (AWS)

Product details

Optimize query performance and cost as your data lake scales

Store tabular data at scale in S3

Amazon S3 Tables deliver the first cloud object store with built-in Apache Iceberg support and streamline storing tabular data at scale. Continual table optimization automatically scans and rewrites table data in the background, achieving up to 3x faster query performance compared to unmanaged Iceberg tables. These performance optimizations will continue to improve over time. Additionally, S3 Tables include optimizations specific to Iceberg workloads that deliver up to 10x higher transactions per second compared to Iceberg tables stored in general purpose S3 buckets. Because S3 Tables support the Apache Iceberg standard, you can query your tabular data with popular AWS and third-party query engines. Use S3 Tables to store tabular data such as daily purchase transactions, streaming sensor data, or ad impressions as an Iceberg table in S3, and use automatic table maintenance to optimize performance and cost as your data evolves.

Benefits

Scalability

Simplify data lakes at any scale, whether you’re just getting started or managing thousands of tables in your Iceberg environment.

Enhanced performance

Get up to 3x faster query performance through continual table optimization compared to unmanaged Iceberg tables, and up to 10x higher transactions per second compared to Iceberg tables stored in general purpose S3 buckets.

Fully managed

S3 Tables continually perform table maintenance tasks such as compaction, snapshot management, and unreferenced file removal, automatically optimizing query efficiency and costs over time.

Seamless integration

Access advanced Iceberg analytics capabilities and query data using familiar AWS services like Amazon Athena, Redshift, and EMR through the S3 Tables integration with Amazon SageMaker Lakehouse. Additionally, you can use Iceberg REST-compatible third-party applications such as Apache Spark, Apache Flink, Trino, DuckDB, and PyIceberg to read and write data in S3 Tables.
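
As a rough illustration of what connecting an Iceberg REST client might involve, the sketch below assembles the catalog properties a client such as PyIceberg would typically need. The helper name `s3_tables_rest_catalog_properties`, the account ID, and the bucket name are hypothetical, and the endpoint layout and SigV4 property names are assumptions that should be checked against the current S3 Tables and PyIceberg documentation.

```python
from typing import Dict


def s3_tables_rest_catalog_properties(region: str, table_bucket_arn: str) -> Dict[str, str]:
    """Build REST catalog properties for a table bucket (hypothetical helper).

    Assumptions: the Iceberg REST endpoint follows the pattern
    https://s3tables.<region>.amazonaws.com/iceberg and requests are
    SigV4-signed with the signing name "s3tables".
    """
    return {
        "type": "rest",
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        "warehouse": table_bucket_arn,  # the table bucket acts as the warehouse
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }


# Placeholder account ID and bucket name, for illustration only.
props = s3_tables_rest_catalog_properties(
    region="us-east-1",
    table_bucket_arn="arn:aws:s3tables:us-east-1:111122223333:bucket/example-table-bucket",
)

# With PyIceberg installed and AWS credentials configured, the properties
# could then be passed to the catalog loader:
#   from pyiceberg.catalog import load_catalog
#   catalog = load_catalog("s3tables", **props)
```

Keeping the properties in a plain dictionary makes the same settings easy to reuse across environments or to translate into another engine's configuration format.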

Simplified security

Create tables as first-class AWS resources and apply permissions to easily govern access to them.
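
To make table-level permissions more concrete, here is a minimal sketch of a resource-based policy that grants one principal read access to a single table. The function name `table_read_policy`, the role, the account ID, and the table identifier are hypothetical, and the action names and ARN layout are assumptions to verify against the IAM documentation for S3 Tables before use.

```python
import json
from typing import Dict


def table_read_policy(principal_arn: str, table_arn: str) -> Dict:
    """Return a read-only policy document for one table (illustrative only)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowTableReads",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                # Assumed action names for reading table data and metadata.
                "Action": [
                    "s3tables:GetTableData",
                    "s3tables:GetTableMetadataLocation",
                ],
                "Resource": table_arn,
            }
        ],
    }


# Placeholder identifiers, for illustration only.
policy = table_read_policy(
    principal_arn="arn:aws:iam::111122223333:role/example-analyst-role",
    table_arn=(
        "arn:aws:s3tables:us-east-1:111122223333:"
        "bucket/example-table-bucket/table/example-table-id"
    ),
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Scoping `Resource` to a single table ARN is what makes the grant table-level rather than bucket-wide.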

How it works

S3 Tables provide purpose-built S3 storage for storing structured data in the Apache Parquet format. Within a table bucket, you can create tables as first-class resources directly in S3. These tables can be secured with table-level permissions defined in either identity- or resource-based policies and are accessible by applications or tooling that support the Apache Iceberg standard. When you create a table in your table bucket, the underlying data in S3 is stored as Parquet data. S3 then maintains the metadata necessary to make that Parquet data queryable by your applications. Table buckets include a client library that query engines use to navigate and update the Iceberg metadata of tables in your table bucket. This library, in conjunction with updated S3 APIs for table operations, allows multiple clients to safely read and write data to your tables. Over time, S3 automatically optimizes the underlying Parquet data by rewriting, or "compacting," your objects. Compaction optimizes your data on S3 to improve query performance and minimize costs.
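
The idea behind compaction can be sketched with a toy planner that groups many small files into a few rewrite batches near a target size. This is not the algorithm S3 uses, which is not described here; the function `plan_compaction`, the first-fit-decreasing strategy, and the 512 MB target are all assumptions chosen for illustration.

```python
from typing import List


def plan_compaction(file_sizes_mb: List[int], target_mb: int = 512) -> List[List[int]]:
    """Group file sizes into batches no larger than target_mb.

    Uses first-fit decreasing: take files from largest to smallest and put
    each one into the first batch that still has room for it.
    """
    groups: List[List[int]] = []
    for size in sorted(file_sizes_mb, reverse=True):
        for group in groups:
            if sum(group) + size <= target_mb:
                group.append(size)
                break
        else:
            # No existing batch has room, so start a new one.
            groups.append([size])
    return groups


# Eight small Parquet files (sizes in MB, made up for this example).
small_files = [64, 32, 200, 16, 128, 300, 8, 256]
plan = plan_compaction(small_files, target_mb=512)

for batch in plan:
    print(f"rewrite {len(batch)} files into one object of about {sum(batch)} MB")
```

Fewer, larger objects mean a query engine opens fewer files and reads less per-file metadata for the same data, which is where the performance and cost benefit of compaction comes from.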
