Property portal halves execution time with new data pipeline

A European real estate marketplace wanted to upgrade its Azure-based data operations. Facing performance challenges in its existing systems, it launched a pilot to migrate its Google Analytics 4 (GA4) data processing to Databricks. Our solution cut runtimes by an average of 48%, enabling more cost-effective operation.

48%

reduction in end-to-end execution time

Challenge

The client’s previous data pipelines ran on Azure Synapse Analytics and Azure Data Factory (ADF) and processed Google Analytics 4 (GA4) data stored in BigQuery. The system had reached its performance limits, and as the environment scaled, this resulted in slow queries and rising costs.

Solution

We proposed a proof-of-concept (PoC) migration of part of the client’s data operations to Databricks.

As part of this, we replaced the previous Synapse and ADF-based orchestration and transformations with Databricks Jobs. During the migration, we redesigned the processes to take full advantage of the Databricks platform, achieving an average 48% reduction in runtime.

The new architecture also enables more cost-effective operation thanks to optimized resource utilization, and leaves room to expand the project’s scope to other parts of the data pipeline.

Service

Data

Industries

Real Estate

Technologies

Databricks

Google Analytics