A Star Schema is a design that contains only one lookup table for each hierarchy in the data model instead of having separate lookup tables for each attribute. With only a single lookup table for each hierarchy, the IDs and descriptions of all attributes in the hierarchy are stored in the same table. This type of structure involves a great degree of redundancy. As such, star schema are always completely denormalized. Let’s review the star schema above based on the MicroStrategy Tutorial data model.
The schema contains only two lookup tables, one for each hierarchy. LU_LOCATION stores the data for all of the attributes in the Location hierarchy, while LU_CUSTOMER stores the data for all of the attributes in the Customer hierarchy. As a result, star schemas contain very few lookup tables-one for each hierarchy present in the data model. Each lookup table contains the IDs and descriptions (if they exist) for all of the attribute levels in the hierarchy.
Even though you have fewer tables in a star schema than a snowflake, the tables can be much larger because each one stores all of the information for an entire hierarchy. When you need to query information from the fact table and join it to information in the lookup tables, only a single join is necessary in the SQL to achieve the desired result.
Joins in a Star Schema
As an example, if you run the same report to display customer state sales, only one join between the lookup and fact table is required to obtain the result set as illustrated below.
To join the Customer State description (Cust_State_Desc) to the Sales metric (calculated from Sales_Amt) requires only one join between tables since the Customer State ID and description are both stored in the LU_CUSTOMER table. As a result, the query has to access only one lookup table to obtain all of the necessary information for the report.
Even though achieving this result set requires only a single join, star schemas do not necessarily equate to better performance. Depending on the volume of data in any one hierarchy, you may be joining a very large lookup table to a very large fact table. In such cases, more joins between smaller tables can yield better performance.
Characteristics of a Star Schema
The following is a list of characteristics of a star schema.
- Contains fewer tables (one per hierarchy)
- Contains very large tables (much larger than some forms of snowflake schemas due to storing all attribute ID and description columns)
- Store the IDs and descriptions of all the attributes in a hierarchy in a single table
- Requires only a single join when querying fact data regardless of the attribute level at which you are querying data
Source: MicroStrategy University, MicroStrategy Advanced Data Warehousing, Course Guide, Version: ADVDW-931-Sep13-CG.