Project Overview
This project demonstrates the design and implementation of a locally hosted, end-to-end analytics data platform, replicating how data moves through a modern organisation — from raw ingestion to executive reporting.
Rather than focusing on exploratory analysis alone, the emphasis is on data architecture, orchestration, transformation, and consumption, using a real-world e-commerce dataset as the input source.
The result is a fully functional analytics pipeline that mirrors an Azure-style cloud workflow, built and operated locally.
The goal of this project was to move beyond isolated dashboards and instead demonstrate:
How raw data is ingested and stored reliably
How workflows are orchestrated and automated
How raw data is transformed into analytics-ready models
How business intelligence tools consume structured data
How technical components work together as a system
This reflects how data teams operate in production environments, rather than in one-off analysis tasks.
Key outcomes:
Implemented a local analytics platform replicating real-world architecture
Built automated, repeatable data workflows rather than manual processes
Modelled complex relational data into clean analytical structures
Produced business-facing dashboards backed by governed data models
Gained hands-on experience debugging orchestration, storage, and schema issues
The platform was built locally using containerised services to replicate a cloud-style analytics stack.
High-level architecture:
Object storage layer for raw data ingestion
Workflow orchestration to manage pipelines
Analytics database for structured data
Transformation layer to model business entities
BI layer for reporting and insight delivery
Each component was configured, connected, and tested as part of a single integrated system.
Technology stack:
Docker Desktop (containerised local environment)
Object storage (S3-style via MinIO)
Workflow orchestration (Apache Airflow)
Analytics database (PostgreSQL)
Data transformation (dbt)
Business intelligence & visualisation (Power BI)
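The containerised stack above can be sketched as a minimal docker-compose file. The image tags, ports, and credentials here are illustrative assumptions, not the project's actual configuration:

```yaml
services:
  minio:                              # S3-style object storage layer
    image: minio/minio
    command: server /data --console-address ":9001"
    ports:
      - "9000:9000"                   # S3 API
      - "9001:9001"                   # web console
    environment:
      MINIO_ROOT_USER: admin          # illustrative credentials only
      MINIO_ROOT_PASSWORD: admin12345

  postgres:                           # analytics database
    image: postgres:16
    environment:
      POSTGRES_USER: analytics
      POSTGRES_PASSWORD: analytics
      POSTGRES_DB: warehouse
    ports:
      - "5432:5432"

  airflow:                            # workflow orchestration
    image: apache/airflow:2.9.0
    command: standalone               # single-container mode for local use
    ports:
      - "8080:8080"
    depends_on:
      - postgres
      - minio
```

dbt runs as a CLI against PostgreSQL and Power BI connects from the host, so neither needs its own container.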
The platform uses a multi-table e-commerce dataset, sourced from Kaggle, as its raw input. The dataset provides realistic complexity (multiple entities, relationships, and time-based behaviour), making it suitable for modelling a production-style analytics workflow.
The dataset itself is not the focus of the project; it serves as a representative source to support system design, transformation logic, and reporting outputs.
Data Pipeline & Workflow
Raw CSV files are ingested into object storage, simulating a data lake or blob storage layer commonly used in cloud environments.
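One common convention for this landing layer is to write each CSV under a date-partitioned object key, so reruns are idempotent and lineage is visible from the path alone. The layout below is a hypothetical sketch, not the project's documented structure:

```python
from datetime import date
from pathlib import Path

def raw_object_key(csv_path: str, ingest_date: date, dataset: str = "ecommerce") -> str:
    """Build a date-partitioned object key for a raw CSV landing in the bucket.

    Assumed layout: raw/<dataset>/<YYYY>/<MM>/<DD>/<filename>
    """
    name = Path(csv_path).name
    return f"raw/{dataset}/{ingest_date:%Y/%m/%d}/{name}"

# The actual upload would hand this key to the MinIO (or boto3) client
# pointed at the local endpoint, e.g.:
#   client.fput_object(bucket, raw_object_key(path, date.today()), path)
```

For example, `raw_object_key("data/orders.csv", date(2024, 5, 1))` yields `raw/ecommerce/2024/05/01/orders.csv`.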
Automated workflows manage ingestion and processing steps, ensuring tasks execute in the correct order and can be monitored and debugged when failures occur.
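Airflow expresses such a workflow as a DAG of tasks and guarantees that each task runs only after its upstream dependencies succeed. The ordering it enforces can be illustrated with a minimal pure-Python sketch (the task names are hypothetical, not the project's actual DAG):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline tasks mapped to their upstream dependencies,
# mirroring how an Airflow DAG wires ingestion before transformation.
DEPENDENCIES = {
    "ingest_raw_csvs": set(),
    "load_to_postgres": {"ingest_raw_csvs"},
    "run_dbt_models": {"load_to_postgres"},
    "refresh_dashboards": {"run_dbt_models"},
}

def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """Return tasks in a dependency-respecting order (what the scheduler guarantees)."""
    return list(TopologicalSorter(deps).static_order())
```

In the real platform Airflow also handles retries, scheduling, and a UI for monitoring failed runs; this sketch only shows the ordering guarantee.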
Raw data is transformed into analytics-ready fact and dimension tables, applying consistent naming, relationships, and business logic.
This layer is designed to support:
Reusability
Clear lineage from raw to curated data
BI-friendly schemas
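The fact/dimension split can be sketched in miniature, using sqlite3 in place of PostgreSQL and hand-written SQL in place of dbt models. Table and column names here are illustrative assumptions, not the project's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Raw layer: one wide, denormalised table as it lands from ingestion.
    CREATE TABLE raw_orders (
        order_id TEXT, customer_id TEXT, customer_city TEXT,
        product_id TEXT, product_category TEXT,
        order_date TEXT, price REAL
    );
    INSERT INTO raw_orders VALUES
        ('o1', 'c1', 'London', 'p1', 'toys',  '2024-01-05', 19.99),
        ('o2', 'c1', 'London', 'p2', 'books', '2024-01-06', 9.50);

    -- Dimension: one row per customer, deduplicated from the raw feed.
    CREATE TABLE dim_customer AS
    SELECT DISTINCT customer_id, customer_city FROM raw_orders;

    -- Fact: one row per order line, keyed to the dimension.
    CREATE TABLE fct_orders AS
    SELECT order_id, customer_id, product_id, product_category,
           order_date, price AS revenue
    FROM raw_orders;
""")
```

In dbt each `CREATE TABLE ... AS SELECT` would live in its own model file, which is what gives the layer its reusability and raw-to-curated lineage.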
The transformed data is consumed by a BI tool to produce executive dashboards covering:
Sales and revenue performance
Product and category insights
Customer and geographic trends
Delivery and operational health