Google Cloud Launches Managed Spark Service

October 12, 2021

During its Cloud Next event, Google announced the availability of Spark on Google Cloud as a fully managed service.

This development positions the widely used open-source data processing engine as a premium offering within the Google Cloud ecosystem.

Simplified Spark Experience

Gerrit Kazmaier, Google’s VP & GM for Database, Data Analytics & Looker, stated that this innovation brings Spark into the realm of cloud-native solutions.

It empowers data engineers and scientists to utilize Spark without the complexities of cluster configuration and management.

The service is seamlessly integrated with Google Cloud’s existing data services, enabling direct launch from platforms like BigQuery, Vertex AI, and Dataplex.

This integration streamlines Spark usage, allowing customers to leverage familiar frameworks and toolkits in a cloud-native environment.

A New Approach to Spark

Google claims this is the “world’s first autoscaling and serverless Spark service” for its data platform.

However, given Spark’s widespread adoption, numerous other companies already offer managed Spark services.

Spark is also at the core of the platform from Databricks, a company founded by Spark’s original creators.

Spark on Google Cloud vs. Dataproc

A common question arises: doesn’t Google Cloud already offer a managed Spark service through Dataproc?

Kazmaier clarified that these are distinct services designed for different user groups.

Dataproc caters to organizations already invested in Spark, Hadoop, MapReduce, Presto, and similar systems, offering them as managed services.

Focus on Simplicity

The new Spark service prioritizes simplicity, particularly for organizations beginning their data initiatives.

Kazmaier emphasized that companies shouldn’t need to build foundational systems like storage and metadata management when starting their data journey.

“Instead of focusing on infrastructure, users can simply initiate Spark with a single command: ‘go.’”

This serverless approach eliminates the need for complex setup and allows teams to concentrate on data analysis and insights.
