Analyzing Global E-commerce with BigQuery and Power BI

GlobalMart: A Global E-commerce Powerhouse ๐ŸŒ

GlobalMart is a multinational e-commerce company operating across four major global regions: the Americas (AMER), Europe, Middle East, and Africa (EMEA), Asia-Pacific (APAC), and a distinct segment for its Home Market (HQ). Our mission is to provide a seamless shopping experience for millions of customers by offering a diverse range of products, including electronics, apparel, home goods, and media.

As a high-growth, globally distributed business, GlobalMart generates millions of transactional records daily. This immense volume of data presents both an opportunity and a challenge: to harness this information to understand customer behavior, optimize inventory, and ultimately drive profitability across varied international markets.

Analytical Objectives for the Case Study ๐ŸŽฏ

This case study is designed to showcase the complete data pipeline, from raw data generation to final business insights, with a focus on defining and calculating key metrics necessary for effective e-commerce management.

** The dataset we will use was generated synthetically previously with python

 << here >>

The primary analytical goals are to determine:

Revenue and Sales Performance: Calculate Total Revenue, Total Units Sold, and CLV (Customer Lifetime Value), broken down by product Category and Country. This provides a clear picture of what is selling and where.

Customer Behavior and Segmentation: Analyze Customer Acquisition and Retention rates. Specifically, we aim to identify the proportion of New Customers versus Repeat Customers to understand the health of our customer base.

Geographic and Time-Series Trends: Evaluate sales performance over time (monthly) and compare the performance of the four major regions to identify high-growth areas and underperforming markets.

Data Transformation Proficiency: Demonstrate the use of BigQuery SQL to clean, join, and aggregate the raw transactional data, preparing it into summary tables ready for visualization in Power BI.

BigQuery 

1. Upload the CVS files into BigQuery as a dataset in saparated tables.

Customers, Products, orders and order_items.



2. Fact Sales table:
First we should create a single table with all the needed information, in this case the name of our project in big query is mindful-baton-430418-a9, the dataset name is GlobalMart and the table will be named fact_sales.

See query

CREATE TABLE `mindful-baton-430418-a9.GlobalMart.fact_sales` AS
SELECT
    o.order_id,
    o.order_date,
    o.shipping_status,
    c.customer_id,
    c.country,
    c.city,
    p.product_id,
    p.product_name,
    p.category,
    oi.quantity,
    oi.price AS total_item_price
FROM
    `mindful-baton-430418-a9.GlobalMart.orders` AS o
JOIN
    `mindful-baton-430418-a9.GlobalMart.order_items` AS oi
    ON o.order_id = oi.order_id
JOIN
    `mindful-baton-430418-a9.GlobalMart.products` AS p
    ON oi.product_id = p.product_id
JOIN
    `mindful-baton-430418-a9.GlobalMart.customers` AS c
    ON o.customer_id = c.customer_id
WHERE
    o.shipping_status = 'Delivered'
    

3. Total Month Income:
Make some calculations and totalization is always useful to campare and split in parts to get context.

See query

SELECT
    FORMAT_DATE('%Y-%m', order_date) AS sales_month,
    SUM(total_item_price) AS total_revenue
FROM
    `mindful-baton-430418-a9.GlobalMart.fact_sales`
GROUP BY
    sales_month
ORDER BY
    sales_month;
    

4. Top 10 products by quantity sold:
One main part of a good sales analysis is identify how exactly the sales have gone, the star products give us much more context.

See query

SELECT
    product_name,
    category,
    SUM(quantity) AS total_quantity_sold
FROM
    `mindful-baton-430418-a9.GlobalMart.fact_sales`
GROUP BY
    product_name,
    category
ORDER BY
    total_quantity_sold DESC
LIMIT 10;
    

5. Average CLV by Country:
The ratios are all the time a good help to analyse data, in this case the Customer Lifetime Value let us know how valuable they are for our business in the differents places they are located.

See query

CREATE TABLE `mindful-baton-430418-a9.GlobalMart.clv_average_by_country`  AS
SELECT
    country,
    AVG(customer_lifetime_value) AS average_clv_by_country
FROM (
    SELECT
        customer_id,
        country,
        SUM(total_item_price) AS customer_lifetime_value
    FROM
        `mindful-baton-430418-a9.GlobalMart.fact_sales`
    GROUP BY
        customer_id,
        country
)
GROUP BY
    country
ORDER BY
    average_clv_by_country DESC;

6. Average CLV by Category:
This one help us determine how profitable each category are.

See query

CREATE TABLE `mindful-baton-430418-a9.GlobalMart.clv_average_by_category`  AS
SELECT
    category,
    AVG(customer_lifetime_value) AS average_clv_by_category
FROM (
    SELECT
        customer_id,
        category,
        SUM(total_item_price) AS customer_lifetime_value
    FROM
        `mindful-baton-430418-a9.GlobalMart.fact_sales`
    GROUP BY
        customer_id,
        category
)
GROUP BY
    category
ORDER BY
    average_clv_by_category DESC;
    

7. Repurchase Rate:
In witch proportion customers repurchase between 90 days

See query

CREATE TABLE `mindful-baton-430418-a9.GlobalMart.repurchase_90_days`  AS
WITH CustomerFirstPurchase AS (
  SELECT
    customer_id,
    MIN(order_date) AS first_purchase_date
  FROM
    `mindful-baton-430418-a9.GlobalMart.fact_sales`
  GROUP BY
    customer_id
),
CustomerSecondPurchase AS (
  SELECT
    fs.customer_id,
    fs.order_date
  FROM
    `mindful-baton-430418-a9.GlobalMart.fact_sales` AS fs
  JOIN
    CustomerFirstPurchase AS cfp
    ON fs.customer_id = cfp.customer_id
  WHERE
    fs.order_date > cfp.first_purchase_date
),
RepeatCustomers AS (
  SELECT
    DISTINCT customer_id
  FROM
    CustomerSecondPurchase
  WHERE
    EXISTS (
      SELECT 1
      FROM CustomerFirstPurchase AS cfp
      WHERE
        CustomerSecondPurchase.customer_id = cfp.customer_id
        AND DATE_DIFF(CustomerSecondPurchase.order_date, cfp.first_purchase_date, DAY) <= 90
    )
)
SELECT
  (COUNT(DISTINCT rp.customer_id) * 1.0 / COUNT(DISTINCT cfp.customer_id)) AS repurchase_rate_90_days
FROM
  CustomerFirstPurchase AS cfp
LEFT JOIN
  RepeatCustomers AS rp
  ON cfp.customer_id = rp.customer_id;
    

8. New customers per Month:
So we can estimates how the sales would be.

See query

CREATE TABLE `mindful-baton-430418-a9.GlobalMart.new_customers_month`  AS
SELECT
    FORMAT_DATE('%Y-%m', first_purchase_date) AS cohort_month,
    COUNT(customer_id) AS new_customers
FROM (
    SELECT
        customer_id,
        MIN(order_date) AS first_purchase_date
    FROM
        `mindful-baton-430418-a9.GlobalMart.fact_sales`
    GROUP BY
        customer_id
)
GROUP BY
    cohort_month
ORDER BY
    cohort_month;


Power BI 

Charge Data




----- Dashboard ----



<<  Download .pbix >>