menu

Friday, May 27, 2011

How To Define Measure Group Partitions in SQL Server Analysis

ProblemWe are anticipating building some very large cubes and would like to get a handle on how to partition our measure groups to provide flexibility in processing and administering our cubes.   Can you guide us through the steps to define multiple partitions for a measure group and provide some explanations?
SolutionThe default behavior in SSAS is to create a single partition for a measure group.  Creating multiple partitions for a measure group is definitely a best practice when you have very large fact tables.  The following are the advantages of multiple partitions:
  • Improved Performance - SSAS can query a smaller amount of data by isolating one or more partitions to query rather than the entire measure group.  SSAS can also query multiple partitions in parallel.
  • Flexible Configuration - each partition has its own storage mode; e.g. MOLAP, HOLAP or ROLAP and aggregation design.  You can pick the optimal storage mode and aggregation based on how often the data is queried, how much data there is, how often the data changes, etc.
  • Flexible Processing - a partition can be processed separately or in parallel; you can drop a partition without having to process the cube.  If you have a partition for the latest week, month, quarter, etc. you can process just the rows from the fact table that belong in the partition as opposed to processing the entire fact table.
  • Multiple Source Tables - each partition can optionally specify its own fact table, allowing the flexibility to physically split large fact tables into multiple tables.
  • Improved Scalability - you can take advantage of processing a partition on another SSAS server.
In this tip we will walk through the steps to define multiple partitions for a measure group.   As a starting point we will use the Adventure Works DW Standard Edition sample project that comes with SQL Server 2005.  This project was chosen because the Enterprise edition sample already has multiple partitions, while the Standard edition sample has only one partition per measure group.  We'll define a single partition for each year of data in the Internet Sales measure group.
When defining multiple partitions for a measure group there is one critical point that should be emphasized.  You must specify a query for each partition such that every row from the fact table gets included in one and only one partition.  The simplest way of doing this is to partition based on the Time dimension which is what we will do in the example. 
Note that multiple partitions are only supported in the Enterprise Edition of SQL Server 2005.  You can also use the Developer Edition in your development environment.
Creating Multiple Partitions
Business Intelligence Development Studio (BIDS) is the integrated development environment that comes with SQL Server 2005.  Launch it from the Microsoft SQL Server 2005 program group then open the Adventure Works DW Standard Edition project (default location is C:\Program Files\Microsoft SQL Server\90\Tools\Samples\AdventureWorks Analysis Services Project\Standard).
Step 1: Edit the project properties by clicking Project, Properties from the top-level menu; set the Deployment Server Edition to Enterprise.  Multiple partitions are only available in the Enterprise Edition of SQL Server 2005 so you need to specify the Deployment Server Edition as Enterprise.
Step 2: Double click the Adventure Works cube in the Solution Explorer then click the Partitions tab to display the existing partitions (only the first partition is shown below). 
Click on Internet_Sales and edit the Partition Name to Internet_Sales_2001.
Step 3: Click the Internet_Sales partition (you should see the entire row highlighted as shown above) to display the partition properties.  Click the button in the Source property to display the binding:
Step 4: Change the Binding type from Table binding to Query binding and add the WHERE clause as shown below:
The idea is to change this partition so that it only has orders that were shipped in the year 2001.  The ShipDateKey column references the TimeKey column in the DimTime table (Time dimension) in the AdventureWorksDW database.  Run the following query to determine the ShipDateKey for the date 2001-12-31 and you will see that 184 is the TimeKey value:
SELECT TimeKey FROM DimTime
WHERE FullDateAlternateKey = '2001-12-31'
Step 5: Click New Partition (see hyperlink in Step 2 above) to add another partition to the Internet Sales measure group.  This will launch the Partition Wizard.  Select the dbo.FactInternetSales table under Available tables:
Step 6: Click Specify a query to restrict rows and add the WHERE clause as shown below:
The idea is to set this partition so that it only has orders that were shipped in the year 2002.  If you query the DimTime table you will find that the TimeKey value for 2002-12-31 is 549.
Step 7: Accept the defaults on the Processing and Storage Locations dialog:
Note that selecting Remote Analysis Services data source for the Processing location is a scale-out feature where you could actually process this partition on another SSAS server.
Step 8: Complete the Partition wizard; fill in the dialog as shown below:
Repeat steps 5 through 8 to create and configure partitions for the years 2003 and 2004.  
Step 9: Process the cube; right click on the Adventure Works cube in the Solution Explorer and select Process from the context menu. Accept the defaults on the Process Cube dialog; click Run (Run button not shown below):
You can also define measure group partitions using SQL Server Management Studio.  Connect to an Analysis Server, select a database and a cube, then right click on a measure group.  The context menu will have a New Partition option that launches the same Partition Wizard we used above.  There are also options to Merge Partitions and work with Aggregations.
Next Steps
  • If you don't already have the AdventureWorks SSAS sample projects and databases available, you can download them here to get the starting point for this tip.  Click the AdventureWorksBICI.msi link.  Also click on the Release Notes link for the details on attaching the relational database.  The default install location for the project is C:\Program Files\Microsoft SQL Server\90\Tools\Samples\AdventureWorks Analysis Services Project; you will see Enterprise and Standard folders.  We used the project in the Standard folder.
  • Keep in mind that multiple partitions for a measure group provides several advantages as discussed; a good time to think about a partitioning strategy is while you're developing a cube.  You can change the partitions after deployment; you may want to use SQL Server Management Studio in this case.
  • You can download the modified version of the AdventureWorks Standard SSAS project used in this tip here.

No comments:

Post a Comment