Aggregate Tables

Arguably the most powerful feature of MicroStrategy is that it is Aggregate Aware.  Since it writes ROLAP SQL, you as the Architect have the ability to simply drop in an Aggregate Table and MicroStrategy will automatically use it when appropriate.  The amount of development time required to increase the performance of your reports across the board is minimal.

Today, I’l talk about what Aggregate Tables are and identify some effective use cases.  Ultimately, the use of Aggregate Tables is a balance between pushing the calculations from Run Time to ETL, and their structures and quantity will depend on your scenarios and platforms.  One thing I can promise you though is that if you aren’t using Aggregate Tables today, you’re missing out on huge performance gains.


MicroStrategy is aggregate aware which means that it will always choose the best table to run the report from.  The tricky part is that the definition of “the best table” isn’t really intuitive.  MicroStrategy uses a ranking algorithm to score every table in the system.  This algorithm actually has nothing to do with the number of rows in the table, and instead is based on the sum of the cardinality of each attribute in the table.  The cardinality is calculated by taking each attribute’s place in the hierarchy, where the top level attribute is 1 and the bottom level attribute is X (where X is the number of attributes between itself and the top).

Example:
Year -> Month -> Day
   1    ->    2      ->   3

A table that contains Year, Revenue would have a value of 1.
A table that contains Month, Revenue would have a value of 2.
A table that contains Day, Revenue would have a value of 3. 

MicroStrategy will choose the table with the lowest value that can resolve the query.

The total count of all Attributes (Facts don’t count) is called the Logical Size of the Table.  At run time, MicroStrategy will look for tables that can answer the query.  If more than one are discovered, the one with the lowest Logical Size is chosen.

Say you have the following schema:

Detail_Fact_Table
————————-
Zip Code
Employee
Revenue

Dim_Zip_Code
———————
Zip_Code
City_ID

Dim_City
——————–
City_ID
City_Desc

Your Detail_Fact_Table may be pretty large, so you may want to introduce a City/Revenue aggregate table.  All you need to do is add that table to your database:

City_Agg_Table
———————-
City_ID
Revenue

Add it to your project (via Warehouse Catalog or Architect tool) and map the City attribute to it.  Now if you run a report with City and Revenue, it’ll come out of the City_Agg_Table.  If you drill down to Zip_Code, it will now come out of Detail_Fact_Table.  MicroStrategy will always choose the best table to use.  The best part is that you can introduce Aggregate Tables when you need them.  If you didn’t need or consider them at the start of the project, you can simply add this new table to the schema and everything you’ve ever built can now use it!

There may be situations where you’ll want to manually set the order that tables should be used.  For this, you can override the Logical Size yourself by editing the table and changing the value at the bottom.  Be sure to check the box “Preserve Logical Size” because the Logical Size is recalculated every time you update the schema.  You may want to do this if your Aggregate Table isn’t a true aggregate.  For example, maybe it only contains a subset of data.  MicroStrategy always assumes Aggregate Tables contain complete roll ups of each other.  So if you have 10yrs of data in your Detial and 2yrs of data in your Aggregate, then MicroStrategy isn’t going to know that it needs to run a 5yr old query out of the Detail table instead of the Aggregate, so structure your tables carefully.  (By the way, to accomplish that last scenario, you would want to use Metadata Partitioning, which I’ll talk about in a future post).

You may also like...

12 Responses

  1. potu says:

    Very helpful….

  2. Analytics says:

    Hi Bryan,
    I have one question, not exactly related to this.
    I have two table
    Table 1(Hour,X,Y,Z)
    Table 2(Day,X,Y,Z)
    As you can see difference above is day and hour object.
    Attributes are Hour,Day,X,Y
    Metrics= Z
    System Hierarchy=X,Y,Z,Hour,Day( no relationship is defined in this hierarchy)
    Report has
    Hour,X,Z Reports shows data
    Now when I drill from hour to Day, earlier it was giving error that ” Fact(Z) doesn’t exist at Day level”.
    Now this error is not coming I don’t know why? I didn’t change anything

    My another question is since hour and day object are from different table, how can I add relationship between them ??. In hierarchy tab, I am not allowed to add relationship between hour and day as they are from different table.
    Secondly, When I right click on Hour in report, it says drill->other directions->Y(attribute-not dragged in report),Day(attribute-not dragged in report).
    Now when I select Day, report successfully shows data.

    Earlier when it was showing level error(as stated above) it says drill down/drill up, but now it saying drill->other directions and its working fine.

    I don’t understand how it is working now.
    Secondly do I need to create any relationship between time objects, if yes then how I can?
    How I can show Drill>up or Drill->down instead of Drill-> other directions.
    Do I need to set drill map(options comes when I edit hour attribute, and it ask for drill up and drill down)

    Any suggestion will be great help!!!!
    Thanks.

    • Bryan says:

      The key to that definition is that Fact Z is marked in both of those fact tables. You would end up with Attributes Hour and Day in their respective tables only, and X,Y,Z as marked in both tables. Then the aggregation will work properly, otherwise you get the “Fact doesn’t exist at a level..” error.

      There’s a difference between a Parent/Child relationship, which is used for joins, and a Drilling Hierarchy which is used for default drill paths. In this schema, you would create a new Hierarchy that contains Day and Hour and link one to the other. Check the box at the bottom to define it as a Drill Hierarchy, and now on reports Day -> Hour will be a direct drill option. The way you’re currently doing it is a cross drill. They achieve the same ends, it’s just a matter of convenience.

      The reason Day and Hour can’t be defined as a Parent/Child is because they don’t share a table in common. Those such relationships should be handled in lookup/relationship tables and not fact tables. However, there isn’t really a direct relationship between Date and Hour traditionally. It doesn’t matter at a high level whether “4pm” is from “1/1/12″ or “2/1/12″ until you start talking about facts. A traditional schema would have Date, Hour, X,Y,Z in the “Hour table”, and Date, X,Y,Z in the “Date table”. I still wouldn’t create a parent/child between date because there’s no need (the drilling it handled by the Drill Hierarchy). In this way though, a report run for “1/1/12 4pm” can be resolved out of the “Hour table”. As your current schema stands, that query isn’t possible.

    • Hi Bryan,

      I have a question. It’s similar to your post.
      I need to have a complex calculation in metric. So in short: it should have an aggregates on each level of Time dimension and higher level aggregate is not equal to sum of lower level aggregations. For example Year level value <> Sum of Quarter level values.

      I have 5 separate levels in Time hierarchy: Year, Semester, Quarter, Month, Week.
      In DWH I created 5 separate fact views with aggregated values for each level and one common fact on top of these facts. I plan to use this fact on metric level.
      Whether its possible to implement a graph report with one complex metric, which takes aggregated value from the current level of hierarchy? This graph should have a drilling option, so I can drill down for example from Year to Quarter and see the metric value from the current level accordingly.

      Thank you in advance.

    • Bryan says:

      Yes, that’s the intention of how Aggregate Tables work. The most common scenario is count(distinct) at each of those levels, which isn’t aggregatable. As long as you define a single fact that exists in each table, then MicroStrategy will choose the correct table based on the attribute on the report. For example, if you start with Year, Metric, it’ll run the query out of the fact_year table which you’ve pre-aggregated to the proper level. If you then drill from Year to Quarter, then it’ll switch to using the fact_quarter table. You’ll also need a dim_date table to give context for the date hierarchy if you want to be able to still show the Year and Quarter side by side. Your subtotals won’t add up (I don’t think they support non-aggregatable subtotals yet), but the rows will be correct.

    • Bryan,

      First of all I haven’t told you that I have a one more relationship in Fact table – it’s Status dimension. So the aggregates values should be grouped by this dimension values as well.
      The thing is that Microstrategy shows incorrect rows when I add all attributes from dim_time to the report. If look to SQL view I see that in this case it takes aggregates from the lowest level of hierarchy. For example I added attributes: Year, Qarter and Month to the report (to be able to drill within this level from the graph on the dashboard). So on the Year level I see aggregates from Month level.
      My metric is simply MAX(Common_Fact_Count), where Common_Fact_Count is a common fact for all levels of Time hierarchy. MAX used here because metric cannot be creates without any evaluation.
      If I use either one attribute from Time or hierarchy itself, it takes aggregated values from correct level. But this approach excludes possibility of use drilling within graph scope (apparently it generates new report, for e.g. I drill from Year to Quarter and it generates Year->Quarter report out of my dashboard scope).

      Summary: how I can force Microstrategy to use current hierarchy level here (fact_year instead of lowest level fact_month)?

      Thanks.

    • Bryan says:

      Yeah, that was the last part of my previous response. MicroStrategy doesn’t merge multiple passes in separate rows, only separate columns. What you could look into is creating a custom subtotal and you may be able to arrive at the desired result, but those are very trial & error in my experience. You can go to Report Data Options -> Drilling and uncheck “Keep the Parent” and then drilling from Year -> Quarter will result in only Quarter being on the report, if that helps.

    • Antony says:

      Hi Bryan

      I have a question from your above post, but, not sure either Eugene had asked similar question.
      Lets take an example where i have 2 aggregate fact tables for year and quarter, as below.

      Year Count(sales)
      2009 18
      2010 14

      Quarter Count(sales)
      Q1 8
      Q2 10
      Q3 6
      Q4 8

      In my report, if i drag and drop both Year and Quarter and Count(Sales) fact, what will be the output?

      Note:
      I am new to to aggregate table, and just i w’d like to understand from the above posts. Also, correct and suggest me if my understanding of aggregates is wrong.

      Thanks
      Antony

    • Bryan says:

      It’ll be the output of the Quarter table but also grouped by Year (assuming that there is a relationship between Quarter and Year or Year is also in the Quarter table). If neither of those are true, you’ll get an error that the fact doesn’t exist at a level that supports the report.

  3. Raju says:

    Hello Bryan,

    My question is something looks simple for me, but i couldnt able to resolve it since long time.I have 5 tables and all are related with different IDs(heterogeneous mappings).

    1)How to set look up table when we create forms?. How can we identify which one is going to take as lookup when the join is like this T1.cust_id=T2.user_id.

    2)How to giev detail report on pie chart? Is it possible to create a detaail report( Not drill down), when i click pie chart it has to navigate to me to detail report.

    3)And i am doing reporting out of OLTP system, as there are no dimension tables and fact tables.so Is it necessacery to define relation between columns available in the table

    for ex:
    I have a table called Main_profile
    columns

    Lmp_le_id
    Lmp_long_name
    Lmp_segmnt_value etc…

    I queried them at db level, they retrived data like each id has single long name and segment value.Should i necessarily join all the columns available in tables.

    4) lets say i have T1.col1,col2,col3 from Table1,
    T2.col4,col5,col6

    if i want to create a report like col1,col2,col5 from T1,T2 where T1.col3=T2.col6

    if i take col1,col2,col5 to create report from two tables, and if i didnt provide internal joins in between cols within table, will the above query gives the result or nott?

    I am sorry if i hesitates you, i am new to MSTR and i am directly pushed into project. Could you please help me on the topics please

    Thanks
    Venkat

    • Bryan says:

      1) Create the attribute like normal based on the table you want to be the lookup (ie, Customer). Then edit the ID form and there is an option to create a new Form Expression. Now browse to the User table and choose the user_id column. You’ll have 2 items in the Form Expression window, and MicroStrategy will now treat these as the same Attribute with different column names in different tables. The lookup will be the table that is bold (there can be only 1 per attribute).

      2) The feature is called Links, and I believe you can’t do this from graphs in 9.x, not sure about 9.3.

      3) Define the objects in MicroStrategy and it’ll handle the joins for you. If the columns are the same in both tables, MicroStrategy will join on them. Generally when reporting on OLTP, you have to use Views to manipulate the data into a format more MicroStrategy-friendly, which is what you may need to do here too.

      4) This question is too generic to directly answer, but it pretty much follows question 3 above.

  4. Mariana says:

    I’m very new to Microstrategy and have a problem where a sub-query does a “select in” on the incorrect column. Is there a way to change this?

    I have tables:
    Customer_fct
    Account_fct => holds account level information – open/close date; account status; product description; joins to Customer_fct on gCust_ID
    Product_Dim => holds the product roll up hierarchy; highest level being Division; joins to Account_fct on gProductKey
    Segments_Dim => holds the customer segment hierarchy; joins to Customer_Fct on Segment_Key

    I want to sum some metrics on the Customer_fct and group by Segment based on Customers that holds specific products. I want the query to only sum the customers based on an IN statement from the Account_fct.
    I have set up parent/child relationships and a simple query works fine, but when I try to do a more complex metric based on specific indicators in the customer, the SQL generated does the IN statement by joining all the related tables, and the IN on the wrong column:

    select a12.Segment_ Description Segment_ Description,
    (sum(a11.Cust_Active_Flag) – sum((a11.Cust_Active_Flag * a11.Customer_Local_Flag))) WJXBFS1,
    sum(a11.Cust_Active_Flag) WJXBFS2
    into #ZZMD01
    from Customer_fct a11
    join segments_dim a12
    on (a11.Segment_key = a12.Segment_key)
    where ((a12.Segment_ Description)
    in (select c24.Segment_ Description
    from product_dim c21
    join Account_fct c22
    on (c21.gProductKey = c22.gProductKey)
    join Customer_fct c23
    on (c22.gCust_ID = c23.gCust_ID)
    join segments_dim c24
    on (c23.Segment_key = c24.Segment_key)
    where c21.Division in (N’DivisionName’)))
    group by a12.Segment_Level_2

    The query must select on the customers who have products in a specific selection – in this case where the Division in the product rollup hierarchy is something specific, like this:

    select a12.Segment_ Description Segment_ Description ,
    (sum(a11.Cust_Active_Flag) – sum((a11.Cust_Active_Flag * a11.Customer_Local_Flag))) WJXBFS1,
    sum(a11.Cust_Active_Flag) WJXBFS2
    into #ZZMD01
    from Customer_fct a11
    join segments_dim a12
    on (a11.Segment_key = a12.Segment_key)
    where ((a11.gCust_ID)
    in (select c22.gCust_ID
    from product_dim c21
    join Account_fct c22
    on (c21.gProductKey = c22.gProductKey)
    join Customer_fct c23
    on (c22.gCust_ID = c23.gCust_ID)
    where c21.Division in (N’DivisionName’)
    group by c22.gCust_ID
    ))
    group by a12.Segment_Description

    Any direction will be highly appreciated!

Leave a Reply

Your email address will not be published. Required fields are marked *

CAPTCHA Image

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Current day month ye@r *