Last Function
There are lots of interesting functions in MicroStrategy’s Analytical Engine, but every once in awhile I stumble across one that really saves my day (though often long after the day needed saving). I work with a Type 2 Model aka. Slowly Changing Dimensions aka. As-is vs As-was, and unfortunately it can cause a lot of hoops for me to jump through. In this model, we store historical values for lots of non-aggregatable metrics. The traditional way of handling them is to set their Dimensionality to Ending (Fact) and then not have to worry about summing where it shouldn’t. I’ve never been a fan of that method since it causes a lot of extra joins and subqueries, but the other day I found a nifty Analytical Engine function that seems to provide an alternative.
insert into ZZNB00
select a11.DATEKEY,
sum(a11.FACT) METRIC
from FACT_TABLE a11
group by a11.DATEKEY
insert into ZZMB01
select max(pc11.DATEKEY) DATEKEY
from ZZNB00 pc11
select pa11.METRIC
from ZZNB00 pa11
join ZZMB01 pa12
on (pa11.DATEKEY = pa12.DATEKEY)
select pa11.METRIC
from (select a11.DATEKEY,
sum(a11.FACT) METRIC
from FACT_TABLE a11
group by a11.DATEKEY
) pa11
join (select max(pc11.DATEKEY) DATEKEY
from (select a11.DATEKEY DATEKEY,
sum(a11.FACT) METRIC
from FACT_TABLE a11
group by a11.DATEKEY
) pc11
) pa12
on (pa11.DATEKEY = pa12.DATEKEY)
Compare that to defining the same metric by using Last(FACT) instead of Sum(Fact) (~, Date):
select a11.DATEKEY,
a11.FACT METRIC
from FACT_TABLE a11
[Analytical SQL calculated by the Analytical Engine:
select Last(METRIC)
from [previous pass]
]
Where I like to use this is when I have a metric that I know should only return a single value and only be used for a single value. For example, if I had a report that showed the Stock Price for a given day, I may limit the report to only be able to run for 1 day, or I would expect users to never pass a date range. Expecting that the amount of data coming back will always be 1 row anyway, but still wanting to play it safe, I think a Last(Fact) metric makes more sense and will perform better. Add to the fact that the SQL is far easier to debug when mixed with other objects, and this trick can both ease some complexity in your system as well as provide a little boost in performance in the right scenarios.
As mentioned it’s not a “one size fits all” approach, but it’s always good to know that alternate methods exist out there.