IBM Analytics Engine

Analytics Engine is a combined Apache Spark and Apache Hadoop service for creating analytics applications.

Vendor

IBM

Company Website

Product details

What is Analytics Engine?

IBM Analytics Engine provides Apache Spark environments as a service that decouples the compute and storage tiers to control costs and achieve analytics at scale. Instead of a permanent cluster formed of dual-purpose nodes, IBM Analytics Engine enables users to store data in an object storage layer such as IBM Cloud Object Storage and spins up clusters of compute nodes when needed. For added flexibility and cost predictability, usage-based consumption is available for Apache Spark environments.

Features

  • **Leverage open-source power:** Build on an ODPi-compliant stack that pairs pioneering data science tools with the broader Apache Spark ecosystem.
  • **Spin up and scale on demand:** Define clusters based on your application's requirements. Choose the appropriate software pack, version, and cluster size. Use the cluster for as long as required and delete it as soon as the application finishes its jobs.
  • **Customize and configure analytics:** Configure clusters with third-party analytics libraries and packages as well as IBM’s own enhancements. Deploy workloads from IBM Cloud services, such as machine learning.

Benefits

  • **Compute and storage are no longer bound:** Spin up compute-only clusters on demand. Because no data is stored in the cluster, clusters never need to be upgraded.
  • **I/O-heavy clusters are more cost-effective:** Provision more IBM Cloud Object Storage (or other data stores) on demand, with no extra costs for unused compute cycles.
  • **Clusters are more elastic:** Data nodes can be added and removed based on live demand via REST APIs. Overhead costs also remain low because no data is stored in the compute cluster.
  • **Security is more cost-effective:** A multilayered approach significantly simplifies the security implementation of each individual cluster while enabling access management at a more granular level.
  • **Vendor lock-in is avoided:** Clusters are spun up to meet the needs of the job rather than forcing jobs to conform to a single software package or version. Different versions of software can run in different clusters.
  • **Control costs with Serverless Spark:** If you’re working with Apache Spark but are unsure how many resources you need, provision a Serverless Spark instance that consumes compute resources only while an app is running. Pay only for what you use.
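The pay-per-use benefit can be illustrated with some simple arithmetic. The sketch below compares an always-on cluster with usage-based billing over one period. The hourly rate, period length, job durations, and function names are hypothetical values chosen only for illustration; they are not IBM pricing or part of any IBM API.

```python
def always_on_cost(hourly_rate: float, hours_in_period: float) -> float:
    """Cost of a cluster that stays up for the whole billing period."""
    return hourly_rate * hours_in_period


def usage_based_cost(hourly_rate: float, job_hours: list[float]) -> float:
    """Cost when compute is billed only while applications are running."""
    return hourly_rate * sum(job_hours)


# Hypothetical figures, assumed for this example only.
HOURLY_RATE = 2.0                 # cost units per hour of compute
HOURS_IN_PERIOD = 720.0           # a 30-day period
JOB_HOURS = [1.5, 0.5, 3.0, 2.0]  # duration of each Spark application run

permanent = always_on_cost(HOURLY_RATE, HOURS_IN_PERIOD)
on_demand = usage_based_cost(HOURLY_RATE, JOB_HOURS)
print(f"always-on: {permanent:.2f}, usage-based: {on_demand:.2f}")
```

With these assumed numbers, the workload runs for 7 hours out of 720, so usage-based billing charges for 7 hours of compute rather than the full period. Real savings depend on actual rates and on how often jobs run.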