Last Function

There are lots of interesting functions in MicroStrategy’s Analytical Engine, but every once in awhile I stumble across one that really saves my day (though often long after the day needed saving).  I work with a Type 2 Model aka. Slowly Changing Dimensions aka. As-is vs As-was, and unfortunately it can cause a lot of hoops for me to jump through.  In this model, we store historical values for lots of non-aggregatable metrics.  The traditional way of handling them is to set their Dimensionality to Ending (Fact) and then not have to worry about summing where it shouldn’t.  I’ve never been a fan of that method since it causes a lot of extra joins and subqueries, but the other day I found a nifty Analytical Engine function that seems to provide an alternative.

The function is called the Last() function, and while usually used in subtotals or on the fly calculations, I’ve actually found that in some cases, it can really help out those non-aggregatable metrics directly.  
In order for the normal method of Ending (Fact) Dimensionality to work, it first has to find the ending value, and then join to it.  The resulting SQL looks something like this:


insert into ZZNB00
select a11.DATEKEY,
sum(a11.FACT)  METRIC
from FACT_TABLE a11
group by a11.DATEKEY

insert into ZZMB01
select max(pc11.DATEKEY)  DATEKEY
from ZZNB00 pc11

select pa11.METRIC
from ZZNB00 pa11
join ZZMB01 pa12
 on (pa11.DATEKEY = pa12.DATEKEY)

You can already see it’s a little messy and has some temp table juggling.  Of course, my database platform is optimized for subqueries over temp tables, so in my environment the SQL comes out looking like this:


select pa11.METRIC
from (select a11.DATEKEY,
sum(a11.FACT)  METRIC
from FACT_TABLE a11
group by a11.DATEKEY
) pa11
join (select max(pc11.DATEKEY)  DATEKEY
from (select a11.DATEKEY  DATEKEY,
sum(a11.FACT)  METRIC
from FACT_TABLE a11
group by a11.DATEKEY
) pc11
) pa12
 on (pa11.DATEKEY = pa12.DATEKEY)

This is even worse because it makes two trips to the Fact Table.  Both of the above SQL queries were generated by simply placing one metric on a template with no attributes and no filters.

Compare that to defining the same metric by using Last(FACT) instead of Sum(Fact) (~, Date):


select a11.DATEKEY,
a11.FACT  METRIC
from FACT_TABLE a11

[Analytical SQL calculated by the Analytical Engine:
select Last(METRIC)
from [previous pass]
]

Here you can see that I get a simple pull from the Fact Table one time, and the Analytical Engine will handle the filter for me.  This method *could* result in better or worse performance, and really depends on the nature of your model and your database.  I don’t use it for all metrics, because the obvious disadvantage is that the Last() Analytical Engine Function will pull back all of the data, which could potentially be a lot, whereas the second join in the Dimensionality Ending (Fact) method will be handled in the database.

Where I like to use this is when I have a metric that I know should only return a single value and only be used for a single value.  For example, if I had a report that showed the Stock Price for a given day, I may limit the report to only be able to run for 1 day, or I would expect users to never pass a date range.  Expecting that the amount of data coming back will always be 1 row anyway, but still wanting to play it safe, I think a Last(Fact) metric makes more sense and will perform better.  Add to the fact that the SQL is far easier to debug when mixed with other objects, and this trick can both ease some complexity in your system as well as provide a little boost in performance in the right scenarios.

As mentioned it’s not a “one size fits all” approach, but it’s always good to know that alternate methods exist out there.

You may also like...