Pages

Monday, September 26, 2011

Running totals in SQL Server

One of the most asked question of the age is the "running total problem" of SQL. When you have to generate a running total of a SQL Server table.


The Problem:
The problem which arrives is this that the user wants to calculate the running totals for a particular column in the database to be either used in a report or for reference to be stored in a column and used later on while querying the database.


Let us suppose the user has this kind of table....



ID
Date
Net Payment Due
Running Total
1
2011-08-19 12:23
10.17

2
2011-08-19 12:32
45.78

3
2011-09-14 15:03
6.80

4
2011-09-14 15:07
6.77

5
2011-09-17 11:14
0.00

6
2011-09-17 13:51
6.75

7
2011-09-19 10:27
4.25




The fourth column "RunningTotal" is what the user wants. He wants a running total of the corresponding payments defined over a period of time, so that he can easily see the Running Total of the Payments over a period of time.


The Solution:
There could be many solutions for this issue.
  • Modern day reporting tools give us the flexibility to calculate the running totals within the reports. This way the user won't have to calculate the running totals in the database.
  • Keep running totals in a Computed Column in a database and simply query that column whenever required.
  • Calculate the Running Totals in the same query whenever required.
First and the third scenarios are used when the user (in this case, the developer) only wants the running total to be calculated for a very few times and he would like to calculate the running totals only when is needed.

One should use the second case only when there is frequent need of the Running Total. The running total, thus would be stored in the database and the developer could access the running totals through a simple query of the column. As we all know, that computed columns have their drawbacks. I would strongly recommend to use the third solution.

The Options:
There are different options available for us to use when calculating running totals using SQL Server....

  1. Nested Sub Query
  2. Self Join
  3. Cursor
Let us see those options one by one...

1. Using Nested Sub Query:
We will use a sub query to calculate sub totals for a result set while fetching the result set in parallel. The query would be like this...

SELECT  ID ,
DataEntryDateTime ,
NetPaymentDue ,
NetPaymentDue + COALESCE((SELECT    SUM(NetPaymentDue)
 FROM      dbo.Payment b
 WHERE     b.id < a.ID
), 0) AS RunningTotal
FROM    dbo.Payment a

Here, the key to the solution is the little 'WHERE' clause, which is restricting only the previous rows to be fetched from the database. 


Note: If your ID is not the primary key, You need to add an 'ORDER BY' at the end of the outer query to sort out the results as well.


2. Using Self Join:
In this method we will use a 'JOIN' statement to join to the same table and will produce the same results. Here is how...




SELECT  a.ID ,
a.NetPaymentDue ,
SUM(b.NetPaymentDue)
FROM    dbo.Payment a
INNER JOIN dbo.Payment b ON (b.ID <= a.ID)
GROUP BY a.ID , a.NetPaymentDue
ORDER BY a.ID , a.NetPaymentDue


If you want the 'Date' column to appear in the select list as well, you will have to put the above query in the sub-query and join to the same table one more time to do it...


3. Using Cursor:
The third option is to use the cursor, which (to me) seems to be the best possible solution for this kind of problem. This also proves the point that cursors are not always slow... :-)



DECLARE @TempTbl TABLE
(
ID SMALLINT ,
NetPaymentDue MONEY ,
RunningTotal MONEY
)
 
DECLARE @ID SMALLINT ,
@NetPaymentDue MONEY ,
@RunningTotal MONEY
 
SET @RunningTotal = 0
 
DECLARE rt_cursor CURSOR
FOR
SELECT ID, NetPaymentDue
FROM dbo.Payment
ORDER BY ID
 
OPEN rt_cursor
 
FETCH NEXT FROM rt_cursor INTO @ID, @NetPaymentDue
 
WHILE @@FETCH_STATUS = 0 
BEGIN
SET @RunningTotal = @RunningTotal + @NetPaymentDue
INSERT  @TempTbl
VALUES  (@ID, @NetPaymentDue, @RunningTotal)
FETCH NEXT FROM rt_cursor INTO @ID, @NetPaymentDue
END
 
CLOSE rt_cursor
DEALLOCATE rt_cursor
 
SELECT  *
FROM    @TempTbl
Like the post.... Give it a Kick...... kick it on DotNetKicks.com

2 comments:

  1. Good! That's how it would be done if running total is not saved in db but on a very large database i think it makes more sense to keep it pre-calculated in another column in the same table.

    ReplyDelete
  2. This is directed at anonymous :) Nice post though!

    I would really like to see how you would keep the running total in a pre calculated column on the same table without causing a real performance headache with updates.

    ReplyDelete