
Vendor
Alibaba Cloud


Overview
MaxCompute (previously known as ODPS) is a general-purpose, fully managed, multi-tenant data processing platform for large-scale data warehousing. It supports a range of data import solutions and distributed computing models, so that users can query massive datasets efficiently, reduce production costs, and keep data secure.
Benefits
- Large-scale computing and storage: Supports EB-level data storage and computing.
- Multiple computational models: Supports the SQL, MapReduce, and Graph computational models, as well as Message Passing Interface (MPI) iterative algorithms.
- Reliable data security measures: Has provided stable offline analysis services for more than seven years, with multi-level sandbox protection and monitoring.
- Cost-effective: Provides more efficient computing and storage services than an enterprise private cloud, and reduces the purchase cost by 20% to 30%.
Features
Data channel
Supports multiple data tunnels, historical data tunnels, and incremental data tunnels.
- Multiple data tunnels and historical data tunnels: MaxCompute uses tunnels to transmit data. Tunnels are scalable and can import and export PB-level data every day. You can import a full dataset or historical data through multiple tunnels. The tunnel service provides a Java SDK. You can also use commands in the MaxCompute client to exchange files and data with the cloud.
- Real-time incremental data tunnels: MaxCompute provides the DataHub service for uploading real-time data. The service has low latency, is easy to use, and is well suited to importing incremental data. DataHub supports multiple data transmission plug-ins, such as Logstash, Flume, Fluentd, and Sqoop. You can also use Log Service to ship logs to MaxCompute, and then use big data development kits for log analysis and mining.
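As a rough illustration of how a bulk import can be spread across several tunnels, the sketch below cuts a dataset into blocks, uploads the blocks concurrently, and then commits them in order. It does not use the MaxCompute Tunnel SDK: `InMemoryTable`, `upload_block`, `commit`, and `bulk_upload` are hypothetical names, and an in-memory dictionary stands in for the destination table.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class InMemoryTable:
    """Hypothetical stand-in for a destination table (not a MaxCompute API)."""

    def __init__(self):
        self._blocks = {}
        self._lock = Lock()

    def upload_block(self, block_id, records):
        # Each block is written independently, so blocks can arrive in any order.
        with self._lock:
            self._blocks[block_id] = list(records)

    def commit(self, block_ids):
        # Assemble the table only from the blocks that are explicitly committed.
        return [row for block_id in block_ids for row in self._blocks[block_id]]


def split_into_blocks(records, block_size):
    """Cut the input into fixed-size blocks, keyed by block id."""
    return {
        i // block_size: records[i:i + block_size]
        for i in range(0, len(records), block_size)
    }


def bulk_upload(table, records, block_size=1000, tunnels=4):
    """Upload all blocks over `tunnels` concurrent workers, then commit."""
    blocks = split_into_blocks(records, block_size)
    with ThreadPoolExecutor(max_workers=tunnels) as pool:
        futures = [
            pool.submit(table.upload_block, block_id, rows)
            for block_id, rows in blocks.items()
        ]
        for future in futures:
            future.result()  # re-raise any upload error
    return table.commit(sorted(blocks))
```

Because every block carries its own id, a failed block could be retried on its own without resending the rest, which is one reason block-based uploads scale well.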
Data storage in a two-dimensional table
MaxCompute stores all data in tables and hides the underlying file system. Data is kept in compressed columnar storage, whose high compression ratio greatly reduces storage costs. MaxCompute provides a compression ratio of 5.
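To make the compression ratio concrete, the sketch below treats it as raw size divided by stored size and also measures the ratio for one sample column. The helper names are illustrative, and `zlib` is only a stand-in codec: the source does not say which compression algorithm MaxCompute uses.

```python
import zlib


def stored_size(raw_bytes, compression_ratio=5.0):
    """Size on disk for a given raw size, where ratio = raw size / stored size."""
    return raw_bytes / compression_ratio


def measured_ratio(values):
    """Compress one column of values with zlib and return raw / compressed size."""
    raw = "\n".join(str(v) for v in values).encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


# A column holds values of a single type, often with few distinct values,
# which is what makes column-wise compression effective.
city_column = ["Hangzhou", "Beijing", "Shanghai", "Shenzhen"] * 2500
```

At a ratio of 5, for example, 10 TB of raw data would occupy about 2 TB of storage. The ratio actually achieved depends on the data, so `measured_ratio` will differ from column to column.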
Computational models
Supports multiple computational models, such as SQL, MapReduce, and Graph.
- SQL: MaxCompute SQL follows standard SQL syntax and Hive syntax. The combined syntax is similar to Hive Query Language (HQL), so SQL or HQL programmers can pick up MaxCompute SQL easily. To run the SQL computational model, MaxCompute provides a computing framework that is more efficient than a common MapReduce model. However, MaxCompute SQL does not support transactions, indexes, or update and delete operations.
- MapReduce: MaxCompute provides a Java MapReduce programming model. MaxCompute has no file API, so you must read data from and write data to tables in the system. As a result, the MapReduce model in MaxCompute differs from the MapReduce model in the open-source community. The modified model can be less flexible (for example, you cannot customize the sorting and hashing algorithms), but it simplifies development. More importantly, MaxCompute provides the Extended MapReduce (MR²) model, in which multiple Reduce operations can follow a single Map operation.
- Graph: In complex iterative computation scenarios, such as K-Means and PageRank, MapReduce takes a long time to complete the tasks. MaxCompute therefore provides the Graph model to run these tasks efficiently.
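The idea behind Extended MapReduce, one Map stage followed by more than one Reduce stage, can be sketched in a few lines of plain Python. This is a conceptual model only, not the MaxCompute Java MapReduce API. All function names are hypothetical, and records are read from and written to in-memory lists rather than tables.

```python
from collections import defaultdict


def run_map(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    return [pair for record in records for pair in mapper(record)]


def run_reduce(pairs, reducer):
    """Group pairs by key (the shuffle), then apply the reducer to each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return [pair for key in sorted(groups) for pair in reducer(key, groups[key])]


def tokenize(line):
    # Map: emit (word, 1) for every word in the line.
    return [(word, 1) for word in line.split()]


def count_words(word, ones):
    # Reduce 1: total per word, re-keyed by that total for the next stage.
    return [(sum(ones), word)]


def words_per_count(count, words):
    # Reduce 2: how many distinct words occurred `count` times.
    return [(count, len(words))]


def word_frequency_histogram(lines):
    """Map -> Reduce -> Reduce, with no second Map stage in between."""
    pairs = run_map(lines, tokenize)
    pairs = run_reduce(pairs, count_words)
    return run_reduce(pairs, words_per_count)
```

In a classic MapReduce setup, the second aggregation would typically need a separate job with its own pass-through Map stage. Chaining the second Reduce directly onto the first avoids that extra stage.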
Security
MaxCompute is a multi-tenant computing platform. By default, tenants are isolated from one another and do not share data. However, a tenant can use MaxCompute to grant permissions on specific data to other members of the same project group.
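The default-isolated, explicitly shared model described above can be pictured with a small access-control sketch. Everything here is hypothetical, including the `Project` class, its methods, and the rule that only a table's owner may grant access. It illustrates the idea and does not reproduce MaxCompute's actual authorization commands.

```python
class Project:
    """Hypothetical project model with default-deny access to tables."""

    def __init__(self, name, members):
        self.name = name
        self.members = set(members)
        self._owners = {}     # table name -> owning member
        self._grants = set()  # (table, member, action) triples

    def create_table(self, owner, table):
        if owner not in self.members:
            raise PermissionError(f"{owner} is not a member of {self.name}")
        self._owners[table] = owner

    def grant(self, owner, table, member, action):
        # Assumption: only the table owner grants, and only to project members.
        if self._owners.get(table) != owner:
            raise PermissionError(f"{owner} does not own {table}")
        if member not in self.members:
            raise PermissionError(f"{member} is not a member of {self.name}")
        self._grants.add((table, member, action))

    def can(self, member, table, action):
        # Owners have full access; everyone else needs an explicit grant.
        if self._owners.get(table) == member:
            return True
        return (table, member, action) in self._grants
```

In this sketch, a grant covers a single action on a single table, so sharing read access does not implicitly allow any other operation.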