Attribution Queries
Attribution analysis in BigQuery gives you full control over the model, lookback window, and key event definition. GA4’s UI now supports only Data-driven and Last-click attribution models. For other multi-touch attribution approaches (first-touch, linear, time-decay, position-based), BigQuery SQL implementations are your primary option. BigQuery attribution can span the full user lifetime and credit any event or channel you define.
Foundation: building the touchpoint sequence
Section titled “Foundation: building the touchpoint sequence”All attribution models start from the same foundation — a table of sessions with their traffic source, ordered by time, for each user who eventually converted.
-- Step 1: Get all sessions with source informationWITH sessions AS ( SELECT user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS session_id, MIN(TIMESTAMP_MICROS(event_timestamp)) AS session_start_time, MAX(CASE WHEN event_name = 'session_start' THEN traffic_source.source END) AS source, MAX(CASE WHEN event_name = 'session_start' THEN traffic_source.medium END) AS medium, MAX(CASE WHEN event_name = 'session_start' THEN traffic_source.name END) AS campaign, COUNTIF(event_name = 'purchase') AS purchases, SUM(CASE WHEN event_name = 'purchase' THEN ecommerce.purchase_revenue ELSE 0 END) AS revenue FROM `project.analytics_PROPERTY_ID.events_*` WHERE _TABLE_SUFFIX BETWEEN '20231001' AND '20240131' -- 90-day lookback window AND (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') IS NOT NULL GROUP BY user_pseudo_id, session_id),
-- Step 2: Identify converting users and their conversion timeconversions AS ( SELECT user_pseudo_id, MIN(session_start_time) AS first_conversion_time, SUM(revenue) AS total_revenue FROM sessions WHERE purchases > 0 GROUP BY user_pseudo_id),
-- Step 3: Build conversion paths — all sessions before conversion, within lookbackconversion_paths AS ( SELECT s.user_pseudo_id, s.session_id, s.session_start_time, s.source, s.medium, s.campaign, c.first_conversion_time, c.total_revenue, ROW_NUMBER() OVER ( PARTITION BY s.user_pseudo_id ORDER BY s.session_start_time ) AS touchpoint_number, COUNT(*) OVER (PARTITION BY s.user_pseudo_id) AS total_touchpoints FROM sessions s INNER JOIN conversions c USING (user_pseudo_id) WHERE s.session_start_time <= c.first_conversion_time -- before conversion AND s.session_start_time >= TIMESTAMP_SUB(c.first_conversion_time, INTERVAL 90 DAY) -- within lookback AND s.source != '(direct)' -- exclude direct (consistent with GA4's last non-direct click))
SELECT *FROM conversion_pathsORDER BY user_pseudo_id, touchpoint_numberLast-touch attribution
Section titled “Last-touch attribution”WITH conversion_paths AS ( -- ... (CTE from above)),
last_touch AS ( SELECT * FROM conversion_paths WHERE touchpoint_number = total_touchpoints -- only the last touchpoint)
SELECT source, medium, campaign, COUNT(DISTINCT user_pseudo_id) AS conversions, SUM(total_revenue) AS attributed_revenueFROM last_touchGROUP BY source, medium, campaignORDER BY attributed_revenue DESCFirst-touch attribution
Section titled “First-touch attribution”WITH first_touch AS ( SELECT * FROM conversion_paths WHERE touchpoint_number = 1 -- only the first touchpoint)
SELECT source, medium, campaign, COUNT(DISTINCT user_pseudo_id) AS conversions, SUM(total_revenue) AS attributed_revenueFROM first_touchGROUP BY source, medium, campaignORDER BY attributed_revenue DESCLinear attribution
Section titled “Linear attribution”Each touchpoint gets an equal share of conversion credit:
SELECT source, medium, campaign, COUNT(DISTINCT user_pseudo_id) AS conversion_paths, SUM(1.0 / total_touchpoints) AS attributed_conversions, SUM(total_revenue / total_touchpoints) AS attributed_revenueFROM conversion_pathsGROUP BY source, medium, campaignORDER BY attributed_revenue DESCTime-decay attribution
Section titled “Time-decay attribution”More recent touchpoints get more credit. The weight decreases exponentially with time before conversion:
WITH paths_with_weights AS ( SELECT *, -- Time decay: half-life of 7 days -- Weight = 2^(days_before_conversion / 7) -- Normalized so all weights sum to 1 per conversion path TIMESTAMP_DIFF(first_conversion_time, session_start_time, DAY) AS days_before_conversion FROM conversion_paths),
paths_with_decay AS ( SELECT *, POW(2, -1.0 * days_before_conversion / 7.0) AS raw_weight FROM paths_with_weights),
paths_with_normalized_weights AS ( SELECT *, raw_weight / SUM(raw_weight) OVER (PARTITION BY user_pseudo_id) AS weight FROM paths_with_decay)
SELECT source, medium, campaign, SUM(weight) AS attributed_conversions, SUM(total_revenue * weight) AS attributed_revenueFROM paths_with_normalized_weightsGROUP BY source, medium, campaignORDER BY attributed_revenue DESCPosition-based (U-shaped) attribution
Section titled “Position-based (U-shaped) attribution”40% to first touch, 40% to last touch, 20% equally distributed across middle touches:
WITH paths_with_position_weights AS ( SELECT *, CASE WHEN total_touchpoints = 1 THEN 1.0 -- single touchpoint gets 100% WHEN total_touchpoints = 2 AND touchpoint_number = 1 THEN 0.5 -- first of two WHEN total_touchpoints = 2 AND touchpoint_number = 2 THEN 0.5 -- last of two WHEN touchpoint_number = 1 THEN 0.4 -- first touch WHEN touchpoint_number = total_touchpoints THEN 0.4 -- last touch ELSE 0.2 / GREATEST(total_touchpoints - 2, 1) -- middle touches share 20% END AS weight FROM conversion_paths)
SELECT source, medium, campaign, SUM(weight) AS attributed_conversions, SUM(total_revenue * weight) AS attributed_revenueFROM paths_with_position_weightsGROUP BY source, medium, campaignORDER BY attributed_revenue DESCMulti-model comparison
Section titled “Multi-model comparison”Compare all models side by side:
WITH all_models AS ( SELECT source, medium, campaign, 'Last Touch' AS model, COUNT(DISTINCT user_pseudo_id) AS conversions, SUM(total_revenue) AS revenue FROM conversion_paths WHERE touchpoint_number = total_touchpoints GROUP BY source, medium, campaign
UNION ALL
SELECT source, medium, campaign, 'First Touch', COUNT(DISTINCT user_pseudo_id), SUM(total_revenue) FROM conversion_paths WHERE touchpoint_number = 1 GROUP BY source, medium, campaign
UNION ALL
SELECT source, medium, campaign, 'Linear', SUM(1.0 / total_touchpoints), SUM(total_revenue / total_touchpoints) FROM conversion_paths GROUP BY source, medium, campaign)
SELECT source, medium, model, ROUND(conversions, 1) AS attributed_conversions, ROUND(revenue, 2) AS attributed_revenueFROM all_modelsORDER BY source, medium, modelAttribution with channel grouping
Section titled “Attribution with channel grouping”Apply the channel UDF to the attribution output for cleaner reporting:
SELECT `project.analytics_derived.classify_channel`(source, medium, campaign) AS channel, SUM(total_revenue / total_touchpoints) AS linear_attributed_revenueFROM conversion_pathsGROUP BY channelORDER BY linear_attributed_revenue DESCCommon mistakes
Section titled “Common mistakes”Forgetting to exclude direct traffic
Section titled “Forgetting to exclude direct traffic”GA4’s standard attribution excludes direct from the conversion path (last non-direct click). If you include direct sessions in your path, they dominate as the last touchpoint because users often return directly to complete their purchase. Decide whether to include or exclude direct, be consistent, and document your choice.
Not applying a lookback window
Section titled “Not applying a lookback window”Without a lookback window limit, a user’s first organic search visit 18 months ago might receive first-touch attribution for a recent purchase. Set a realistic lookback window (30-90 days for most businesses; 90+ days for high-consideration B2B purchases).
Using event_date instead of event_timestamp for ordering
Section titled “Using event_date instead of event_timestamp for ordering”event_date is a string in YYYYMMDD format. For touchpoint ordering, use event_timestamp (microseconds since epoch) to get the correct chronological order within the same day.
Counting the same conversion multiple times
Section titled “Counting the same conversion multiple times”If a user has multiple purchase events, the foundation query above attributes all revenue to a single “first conversion.” For multi-purchase attribution (attributing each order independently), adjust the conversions CTE to be per-transaction, not per-user.