A Star Schema is a design that contains only one lookup table for each hierarchy in the data model instead of having separate lookup tables for each attribute. With only a single lookup table for each hierarchy, the IDs and descriptions of all attributes in the hierarchy are stored in the same table. This type of structure involves a great degree of redundancy. As such, star schema are always completely denormalized. Let’s review the star schema above based on the MicroStrategy Tutorial data model.
The schema contains only two lookup tables, one for each hierarchy. LU_LOCATION stores the data for all of the attributes in the Location hierarchy, while LU_CUSTOMER stores the data for all of the attributes in the Customer hierarchy. As a result, star schemas contain very few lookup tables-one for each hierarchy present in the data model. Each lookup table contains the IDs and descriptions (if they exist) for all of the attribute levels in the hierarchy.
Even though you have fewer tables in a star schema than a snowflake, the tables can be much larger because each one stores all of the information for an entire hierarchy. When you need to query information from the fact table and join it to information in the lookup tables, only a single join is necessary in the SQL to achieve the desired result.
Joins in a Star Schema
As an example, if you run the same report to display customer state sales, only one join between the lookup and fact table is required to obtain the result set as illustrated below.
To join the Customer State description (Cust_State_Desc) to the Sales metric (calculated from Sales_Amt) requires only one join between tables since the Customer State ID and description are both stored in the LU_CUSTOMER table. As a result, the query has to access only one lookup table to obtain all of the necessary information for the report.
Even though achieving this result set requires only a single join, star schemas do not necessarily equate to better performance. Depending on the volume of data in any one hierarchy, you may be joining a very large lookup table to a very large fact table. In such cases, more joins between smaller tables can yield better performance.
Characteristics of a Star Schema
The following is a list of characteristics of a star schema.
- Contains fewer tables (one per hierarchy)
- Contains very large tables (much larger than some forms of snowflake schemas due to storing all attribute ID and description columns)
- Store the IDs and descriptions of all the attributes in a hierarchy in a single table
- Requires only a single join when querying fact data regardless of the attribute level at which you are querying data
Source: MicroStrategy University, MicroStrategy Advanced Data Warehousing, Course Guide, Version: ADVDW-931-Sep13-CG.
In my last blog post, I blogged about the new MicroStrategy Community. Jaime Perez, VP of Worldwide Customer Services, and his crew have come up with a better way for us to engage with MicroStrategy as well as his team.
Speaking of Jaime, last June, he posted this great tip on the MicroStrategy Knowledgebase site as a TechNote. I am reblogging it since it is one of the most frequent questions I get asked and I find it an extremely useful Tip & Trick. Also, this will give you an idea of the great stuff being posted in the MicroStrategy Community.
MicroStrategy and Cross Joins
In some scenarios, one may encounter cross joins in the SQL View of a standard, SQL Report in MicroStrategy. Cross joins appear when two tables do not have any common key attributes between them in which they can inner join. As a result, the two tables essentially combine together to create one table that has all the data from both tables, but this results in poorer performance with a common effect of increased execution times. Sometimes these execution times, and performance hits, can be very severe. Therefore, it is important to understand some simple steps that can be performed to resolve a cross join, as well as some steps to understand why it may be appearing in the SQL View of the report.
One common occurrence of a cross join is when a report contains at least two unrelated attributes in the grid, and no metrics are present in order to relate the unrelated attributes via a fact table. Such a occurrence can be resolved in a few ways:
- Create a relationship filter, set the output level as the unrelated attributes (or the entire report level), and then relate these by a Logical Table object
- Create a relationship filter, set the output level as the unrelated attributes (or the entire report level), and then relate these by a Fact object
- Add a metric to the report that uses a fact from a table in which both attributes can inner join to
This provides a pathway from the fact table to the lookup tables in which the unrelated attributes are sourced from. The result is an inner join between the fact table and the lookup tables, which resolves the cross join between the two unrelated lookup tables.
Options 1 and 2 provide a means in which the report template can remain as only attributes, whereas Option 3 would have a metric on the report. Option 3 may not be desired if a metric does not want to be placed on the report. Keep in mind that other techniques can also be employed to have the metric on the report, but formatted to be hidden from display.
More common scenarios include cross joins between a fact table and a lookup table, and are typically surprising to a developer. These situations can be a bit more tricky to troubleshoot and resolve, but here are a few techniques that can be employed to try to resolve the issue:
- In SQL View look at where the cross join appears, and between which tables the cross join appears
- Open up those tables in the Table Editor by navigating to the Schema Objects\Tables folder, and double-clicking the tables
- Select the Logical View Tab of both tables to see all the logical objects mapped to the table
- Take note of which attributes have a key icon beside them
- These key attributes denote attributes at the lowest level of their hierarchy presently mapped to the table and/or attributes that are in their own hierarchy (meaning they have no parents or children)
- The SQL Engine will join 2 tables on common key attributes only, so if none of the key attributes on either table exist on both tables, then a cross join should appear
This means that just because a Region attribute exists on Table_A and a Region attribute exists on Table_B does not necessarily mean that the SQL Engine will join on Region. If Region has its child attribute on the table, then that attribute should be the key as it is the lowest level attribute of its particular hierarchy mapped to the table. If Region exists on both tables, and is also a key attribute on both tables, then an inner join should take place on Region.
This essentially means that one can find a cross join, investigate the tables in which it appears, and verify if at least 1 common key attribute exists between the tables. If not, then that should be the first path to investigate because a cross join is correct in that scenario.
You can find a detailed video on how this issue is reproduced and resolved here: Tech Note 71019 . Steps to reproduce and resolve
MicroStrategy Technical Support can assist with resolving cross joins in a specific report, however caution should be taken when resolving such issues. In some scenarios, the cross join is resolved through modifications to the schema objects, which can have a ripple effect to all other reports in an environment. For example, if a relationship is changed in the Region attribute to resolve a cross join in one report, this change will be reflected in all other reports that use Region, and potentially the hierarchy in which Region belongs. As a result, the SQL View of one report will have the cross join resolved, but the SQL may have changed in other reports using Region or its related attributes. This may or may not be desired. MicroStrategy Technical Support may not be able to fully understand the impact of such a schema change to the data model, so before a change is made to the data model the consequences of such a change should be fully understood by the developer, and any changes made to the schema should be recorded.
 Jaime Perez, TN47356: How to troubleshoot cross joins in SQL Reports for the SQL Generation Engine 9.x, MicroStrategy Community, 06/24/2014, http://community.microstrategy.com/t5/Architect/TN47356-How-to-troubleshoot-cross-joins-in-SQL-Reports-for-the/ta-p/196989.
 MicroStrrategy Knowledgebase, Tech Note 71019 . Steps to reproduce and resolve,
With the release of MicroStrategy Analytics Enterprise 9.4.1, the Analytical Engine logic has been enhanced with respect to joining data from multiple datasets in a Report Services Document. One of the features that is available with this release is the ability to use objects (e.g., attributes, metrics) from multiple datasets in a single grid in a document.
If an attribute on a grid has elements that can be obtained from multiple datasets used in the document, the elements displayed will be from the global lookup table. Additionally, if one or more of the datasets containing the attribute has missing attribute form data or has different attribute form from the other datasets, the Analytical Engine will follow the rules noted below to compose the final output:
If there is attribute form with null value, the Analytical Engine will use the non-null form value from other datasets instead of the null form.
If several datasets have different attribute form information for the attribute element, the Analytical Engine will use the attribute form from the biggest dataset.
If several datasets have different attribute form information for the attribute element, and those datasets have same number of rows, the Analytical Engine will use the first dataset in the document for the attribute form value (according to the dataset adding sequence).
NOTE: Users should note that the rules are applied for each individual attribute element in the result at the row level rather than at the dataset level.
Users may consider the following datasets – C01 is a dataset with Customer City, Customer and Order:
C02 is a dataset with Customer, Order and a profit metric. Users may note that the Customer attribute is missing the DESC form in the second dataset:
If a Report Services Document is built with both these datasets, and the attributes are placed on a grid, the following results may be seen. As noted in Rule 1, the Analytical Engine will display the non-Null values from C01 for the Customer attribute elements:
Now users may consider a different dataset as C02 – similar to the initial dataset, but here the Customer name (DESC) form contains values instead of NULLs. This time the values for the attributes are not consistent – see that Customer ID ‘1’ has different values for the DESC form for different Orders (1 & 6).
|Customer Name||Customer ID||Order||Profit|
If a report is built for this dataset users will observe that the first attribute element value in the dataset is used as as the DESC form for the Orders 1 & 6 even if the value is different in subsequent rows (this is the same as previous Analytical Engine behavior).
When these datasets are used in the grid in a Report Services Document, the Analytical Engine will choose the attribute element values from dataset C02 to display in the attribute element values from. This is because of Rule 2 explained above.
Consider the following dataset:
|Customer Name||Customer ID||Order||Profit|
A report built off this dataset appears as follows:
After replacing the dataset ‘C02‘ from the previous example with the new dataset, the following results are seen. As noted in Rule 3, because both C01 an C02 have the same number of rows, the elements displayed for the Customer attribute will be filled from from the first dataset to be added to the document – in this case C01. However for the first row in the results, where there is no corresponding customer in the dataset C01, Rule 1 will be applied and instead of a NULL value, the non-null Customer Name field ‘Customer G’ is picked from C02. (Rules are applied at the individual element level).
Next: Why are some metric values blank in documents using multiple datasets in MicroStrategy Analytics Enterprise 9.4.1
 MicroStrategy Knowledgebase, Engine behavior for grids on a Report Services Document or dashboard with multiple datasets where some attribute forms are missing or have different values the datasets in MicroStrategy Analytics Enterprise 9.4.1 and newer releases, TN Key: 45463, 03/13/2014, https://resource.microstrategy.com/support/mainsearch.aspx.
NOTE: You may need to register to view MicroStrategy’s Knowledgebase.