Running Totals with
SUM() OVER()
- Tracks cumulative balances per user over time.
- Great for generating account statements.
Previous Transaction Comparison with
LAG()
- Helps detect anomalies or sudden changes in transaction behavior.
Daily Ranking with
RANK()
- Useful for leaderboards, fraud detection, or incentive programs.
First Transaction Date with
FIRST_VALUE()
- Identifies onboarding or customer lifecycle start points.
💡 Ideas for Further Exploration
Would you like to explore any of these next?
- Monthly Averages or Trends using
AVG() OVER()
andDATE_TRUNC()
- Detecting Consecutive Days of Activity using
LAG()
and date arithmetic - Flagging First and Last Transactions using
ROW_NUMBER()
orRANK()
- Rolling 7-Day or 30-Day Sums using
ROWS BETWEEN
with date logic - Visualizing these results in a dashboard or chart (e.g., using Python or Power BI)
Let me know if you'd like to dive deeper into any of these or if you want to try this on your own dataset!
Real-World Problem: Track Account Activity and Rank Customers Imagine you manage a digital banking app and need to: Track each user’s balance changes across multiple days Identify customers with the highest daily transaction volume Compare a user’s transaction with their previous one Calculate a running total for account statements All of these can be done with window functions, cleanly and efficiently. Sample Input Table: Transactions +-------------+----------+------------+---------+------------+ | Transaction | User_ID | Txn_Date | Amount | Type | +-------------+----------+------------+---------+------------+ | T1001 | 101 | 2025-05-10 | 500.00 | Credit | | T1002 | 102 | 2025-05-10 | 1000.00 | Credit | | T1003 | 101 | 2025-05-11 | -200.00 | Debit | | T1004 | 103 | 2025-05-11 | 1500.00 | Credit | | T1005 | 101 | 2025-05-12 | -100.00 | Debit | | T1006 | 102 | 2025-05-12 | -500.00 | Debit | | T1007 | 103 | 2025-05-12 | -300.00 | Debit | | T1008 | 102 | 2025-05-13 | 2000.00 | Credit | +-------------+----------+------------+---------+------------+ 1. Running Total (Balance Over Time) Using SUM() OVER SELECT User_ID, Txn_Date, Amount, SUM(Amount) OVER(PARTITION BY User_ID ORDER BY Txn_Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Running_Balance FROM Transactions; Output: +----------+------------+---------+-----------------+ | User_ID | Txn_Date | Amount | Running_Balance | +----------+------------+---------+-----------------+ | 101 | 2025-05-10 | 500.00 | 500.00 | | 101 | 2025-05-11 | -200.00 | 300.00 | | 101 | 2025-05-12 | -100.00 | 200.00 | | 102 | 2025-05-10 | 1000.00 | 1000.00 | | 102 | 2025-05-12 | -500.00 | 500.00 | | 102 | 2025-05-13 | 2000.00 | 2500.00 | | 103 | 2025-05-11 | 1500.00 | 1500.00 | | 103 | 2025-05-12 | -300.00 | 1200.00 | +----------+------------+---------+-----------------+ 2. Compare with the Previous Transaction Using LAG() SELECT User_ID, Txn_Date, Amount, LAG(Amount, 1, 0) OVER(PARTITION BY User_ID ORDER BY Txn_Date) AS Prev_Amount, Amount - LAG(Amount, 1, 0) OVER(PARTITION BY User_ID ORDER BY Txn_Date) AS Change_From_Last FROM Transactions; Output: +----------+------------+---------+-------------+-----------------+ | User_ID | Txn_Date | Amount | Prev_Amount | Change_From_Last| +----------+------------+---------+-------------+-----------------+ | 101 | 2025-05-10 | 500.00 | 0.00 | 500.00 | | 101 | 2025-05-11 | -200.00 | 500.00 | -700.00 | | 101 | 2025-05-12 | -100.00 | -200.00 | 100.00 | | 102 | 2025-05-10 | 1000.00 | 0.00 | 1000.00 | | 102 | 2025-05-12 | -500.00 | 1000.00 | -1500.00 | | 102 | 2025-05-13 | 2000.00 | -500.00 | 2500.00 | | 103 | 2025-05-11 | 1500.00 | 0.00 | 1500.00 | | 103 | 2025-05-12 | -300.00 | 1500.00 | -1800.00 | +----------+------------+---------+-------------+-----------------+ 3. Ranking Users by Daily Transaction Volume Using RANK() SELECT Txn_Date, User_ID, SUM(ABS(Amount)) AS Daily_Volume, RANK() OVER(PARTITION BY Txn_Date ORDER BY SUM(ABS(Amount)) DESC) AS Daily_Rank FROM Transactions GROUP BY Txn_Date, User_ID; Output: +------------+----------+--------------+-------------+ | Txn_Date | User_ID | Daily_Volume | Daily_Rank | +------------+----------+--------------+-------------+ | 2025-05-10 | 102 | 1000.00 | 1 | | 2025-05-10 | 101 | 500.00 | 2 | | 2025-05-11 | 103 | 1500.00 | 1 | | 2025-05-11 | 101 | 200.00 | 2 | | 2025-05-12 | 102 | 500.00 | 1 | | 2025-05-12 | 103 | 300.00 | 2 | | 2025-05-12 | 101 | 100.00 | 3 | | 2025-05-13 | 102 | 2000.00 | 1 | +------------+----------+--------------+-------------+ This is useful for leaderboard-like dashboards or fraud detection systems. 4. First Transaction Date Using FIRST_VALUE() SELECT User_ID, Txn_Date, Amount, FIRST_VALUE(Txn_Date) OVER(PARTITION BY User_ID ORDER BY Txn_Date) AS First_Txn_Date FROM Transactions; Output: +----------+------------+---------+----------------+ | User_ID | Txn_Date | Amount | First_Txn_Date | +----------+------------+---------+----------------+ | 101 | 2025-05-10 | 500.00 | 2025-05-10 | | 101 | 2025-05-11 | -200.00 | 2025-05-10 | | 101 | 2025-05-12 | -100.00 | 2025-05-10 | | 102 | 2025-05-10 | 1000.00 | 2025-05-10 | | 102 | 2025-05-12 | -500.00 | 2025-05-10 | | 102 | 2025-05-13 | 2000.00 | 2025-05-10 | | 103 | 2025-05-11 | 1500.00 | 2025-05-11 | | 103 | 2025-05-12 | -300.00 | 2025-05-11 | +----------+------------+---------+----------------+
No comments:
Post a Comment