Saturday, December 18, 2010

Deconstructing an SQL snippet

Recently, I came across the following snippet in a Teradata SQL script that came my way:

select calendar_date
from sys_calendar.calendar
where Calendar_Date Between
(((ADD_MONTHS(DATE,-1)) /100) *100 +1) (DATE)
AND (((((ADD_MONTHS(DATE,-0)) /100 ) *100 +1) (DATE)) -1) (DATE);

I knew that Teradata, just like Excel, internally saves dates as integers but did not remember the conversion scheme. A quick search later, I had the formula:

Integer Date = (YEAR – 1900) * 10000 + (MONTH * 100) + DAY

So, Teradata internally saves today’s date, i.e. 2010-12-18, as 1101218. Here, year = 110 (110 years since 1900), month = 12 and day = 18.

Now, it was just a matter of taking the expressions apart piece by piece, starting from the innermost expressions.

In the line “(((ADD_MONTHS(DATE,-1)) /100 ) *100 +1) (DATE)”, the function ADD_MONTHS(DATE, -1) is merely adding -1 month to today’s date. Since today is 2010-12-18, the function returns 2010-11-18. It’s internally saved as 1101118.

The result of ADD_MONTHS(), 1101118, is divided by 100. This is an ‘integer’ division and results in 11011. The division removes the day part. If one tries to cast this number to date, Teradata complains, predictably.

In the next step, 11011 is multiplied by 100 to restore the day part, resulting in 1101100. However, there is something seriously wrong with 1101100 as it has ‘00’ for day part, which is illegal. This is why 1 is added to the result of (ADD_MONTHS(DATE,-1)) /100 ) *100 to give the first day of the previous month. The final '(DATE)' merely casts the integer to date.

So, all that the expression “(((ADD_MONTHS(DATE,-1)) /100 ) *100 +1) (DATE)” is doing is calculating the first day of the previous month. Similarly, the expression “( ( (((ADD_MONTHS(DATE,-0)) /100 ) *100 +1) (DATE) ) -1) (DATE)” calculates the last day of the previous month. It first calculates the first day of the current month and subtracts 1 day from it, which results in the last day of the previous month. (The ADD_MONTHS() function is redundant in the second expression as it is basically doing nothing except confound users).

Coincidentally, I had previously written a more intuitive way of achieving the same result. It has the virtue of not using magic numbers:


SELECT CALENDAR_DATE
FROM SYS_CALENDAR.CALENDAR
WHERE CALENDAR_DATE BETWEEN
ADD_MONTHS(CURRENT_DATE, -1) - EXTRACT(DAY FROM CURRENT_DATE) + 1
AND CURRENT_DATE - EXTRACT(DAY FROM CURRENT_DATE);

No comments:

Post a Comment