Technical Documentation for Data Science & AI Projects
View the Project on GitHub AlassaneDialDiop/browns_data_science
Marketing team only see the last touchpoint before purchase, missing the complete customer journey. This leads to:
3-step modular SQL design that tracks all customer touchpoints:
graph TD
%% Source Tables
subgraph Sources["<b>Source Tables (DAG Dependencies)</b>"]
S1[📊 UNIFIED_LINEITEMS_WITH_SFCC<br/><i>Order details</i>]
S2[🔍 SESSION_ORDERS<br/><i>Truth sessions</i>]
S3[👥 SESSIONS<br/><i>All web sessions</i>]
S4[📧 DS_CUSTOMER_INFO<br/><i>Email mapping</i>]
end
%% Step A: Session Collection
subgraph StepA["<b>Step A: Session Collection</b><br/><i>Granularity: order_id × session_id × window</i>"]
A1["Output: MULTITOUCH_ORDERED_SESSIONS<br/><b>Keys:</b> order_id, session_id, attribution_window_days<br/><b>Rows:</b> ~2.5M YTD(all sessions for all orders)"]
end
%% Step B: Attribution Logic
subgraph StepB["<b>Step B: Attribution & Aggregation</b><br/><i>Granularity: date × method × journey</i>"]
B1["Apply 5 Attribution Methods"]
B2["<b>AGGREGATE:</b> Sessions → Daily Journeys<br/><b>Keys:</b> order_date, journey_method,<br/>province, Steps 1-4, attribution_window<br/><b>Rows:</b> ~1M YTD (unique journey patterns)"]
B1 --> B2
end
%% Step C: Final Output
subgraph StepC["<b>Step C: Production Format</b><br/><i>Granularity: date × channel × region × method × window</i>"]
C1["Output: Final Attribution Table<br/><b>Keys:</b> calendar_date, channel, region,<br/>journey_method, attribution_window_days<br/><b>Rows:</b> ~500K YTD(all combinations)"]
end
%% Connections with table names
S1 --> A1
S2 --> A1
S3 --> A1
S4 --> A1
A1 --> |"3M rows → 50K patterns"| B2
B2 --> |"Uses prep table"| C1
S1 --> |"Sales redistribution"| C1
%% Styling
classDef sourceStyle fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef stepStyle fill:#f9f9f9,stroke:#333,stroke-width:2px
classDef aggregateStyle fill:#fff3e0,stroke:#f57c00,stroke-width:3px
class S1,S2,S3,S4 sourceStyle
class StepA,StepB,StepC stepStyle
class B2 aggregateStyle
File: multitouch_stepA_ordered_sessions.sql
Captures the complete customer journey by:
Key Idea: Each order appears 3 times (once per attribution window), allowing flexible analysis without reprocessing.
File: multitouch_stepB_prep.sql
Applies 5 different attribution models:
Output: Pre-aggregated daily journeys by strategy, ready for analysis.
File: multitouch_stepC_daily_agg.sql
Transforms attribution results to match existing table structure:
journey_method
column for attribution strategyattribution_window_days
column-- Step A Dependencies
PROD_ANALYZE.WEB_TRAFFIC.UNIFIED_LINEITEMS_WITH_SFCC -- Orders & sales
PROD_ANALYZE.WEB_TRAFFIC.SESSION_ORDERS -- Order-session mapping
PROD_ANALYZE.WEB_TRAFFIC.SESSIONS -- All web sessions
PROD_SANDBOX.ADIOP.DS_CUSTOMER_INFO -- Customer emails
PROD_SANDBOX.ADIOP.CUSTOMER_JOURNEY_TIMELINE -- New/existing flags
-- Step C Additional Dependencies
PROD_ANALYZE.GENERAL.FISCAL_CALENDAR_EXTENDED -- Date dimensions
PROD_ANALYZE.WEB_TRAFFIC.MISSING_SESSIONS_MARGIN_MONTHLY -- Corrections
-- Step A Output (Session-level)
PROD_SANDBOX.ADIOP.MULTITOUCH_ORDERED_SESSIONS -- 3M rows
-- Step B Output (Daily aggregated)
PROD_SANDBOX.ADIOP.MULTITOUCH_DAILY_JOURNEYS_PREP -- 50K rows
-- Step C Output (Production format)
[Custom table name based on use case] -- 500K rows
Last updated: 2025-07-31 v1.0 Built with Claude