Posts Categorized / Analysis Services SSAS

  • May 13 / 2009
Analysis Services SSAS, dbDigger, Performance Tuning and Optimization

Performance Configuration Issues for SQL Server Analysis Services

From an operational perspective, you can improve Analysis Services performance by keeping partition sizes reasonable, setting partition data slices, defining appropriate aggregations on all partitions, and running the Optimize Schema command in the cube editor.

Partition Size

If you are using SQL Server 2000 Enterprise Edition (or SQL Server 7.0 Enterprise Edition), you can divide a cube into multiple partitions to increase query and processing performance. You should consider increasing the number of partitions if a partition file exceeds 5 GB or 20 million records. To determine the size of each partition, review the partition files in the Data folder.
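
As a quick way to check the 20-million-record guideline before processing, you can count source rows per month. The following is only a sketch: FactSales and OrderDate are hypothetical names standing in for your own fact table and date column.

    -- Count source rows per month to see which monthly partitions are
    -- approaching the 20 million record guideline.
    -- FactSales and OrderDate are hypothetical names; substitute your own.
    SELECT DATEPART(yyyy, OrderDate) AS [Year],
           DATEPART(mm, OrderDate)   AS [Month],
           COUNT(*)                  AS [FactRows]
    FROM FactSales
    GROUP BY DATEPART(yyyy, OrderDate), DATEPART(mm, OrderDate)
    ORDER BY [Year], [Month]

Partition file sizes themselves are still best verified directly against the files in the Data folder, as noted above.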

Data Slice

When you partition a cube, you should define the data slice for each partition using the Partition Wizard. The wizard does not require you to set the data slice when you create a partition, so it is possible (even easy) to create a partition without one; if you simply click through the Partition Wizard pages accepting the default values, you end up with a partition that has no data slice. You should verify that each partition of each cube in each Analysis Services database has a data slice defined. The only exception is a cube with a single partition; in that case the data slice should not be set, because you want all of the cube data to be placed in that one partition. To verify that a data slice is defined, edit the partition in Analysis Manager and step through the Partition Wizard pages.

The data slice identifies the subset of data contained in each partition, which enables Analysis Services to quickly eliminate irrelevant partitions from query processing. Unless Analysis Services knows the range of data contained in each partition, it must query every partition, which negates much of the query performance benefit of partitioning. To draw an analogy with SQL Server, creating a partition without a data slice is like creating a partitioned view without the CHECK clause: you can do it, but you force the query optimizer to scan all of the partitions in the view because you have not given it enough metadata to determine which partition to access when a query is issued. The Analysis Services runtime engine does not use a relational query optimizer (it has its own component that performs a similar role), but it uses the data slice in roughly the same way: as metadata that tells it which partitions to scan when an aggregate cannot be used or is not available. If you partition a cube by month and have 36 months of data in 36 partitions, then without data slices the runtime engine must scan all 36 partitions to answer a query. With data slices, it may need to scan only 1/36th of that data, with an obvious improvement in performance.

Setting a data slice also causes Analysis Services to add a join and a WHERE clause to the SQL statement used to retrieve data from the source database during processing. The WHERE clause limits the data retrieved to the data that belongs in the data slice. For example, if a partition’s data slice is June 2003, Analysis Services adds a join to the time dimension and a WHERE clause such as WHERE <Month> = ‘June’ AND <Year> = ‘2003’, or whatever the appropriate member and level names are. If you do not define a data slice and you have multiple partitions, Analysis Services does not restrict the data that is retrieved from the source database; if July 2003 data happens to be present in the source for the June partition, it is loaded into that partition as well and can be double-counted when the cube is queried. By specifying the data slice, you let the system add the JOIN and WHERE clauses that help maintain the integrity of the data.

It is possible to disable the automatic generation of this WHERE clause; if Analysis Services loads the data for each partition from a separate table in the source database, doing so may yield some performance benefit. However, you should not disable the automatic generation of the WHERE clause unless you are totally sure that the relational database partitioning is 100% correct when the data is loaded. You must be 100% certain of the correctness of the data in all partitions of all cubes on the server in order to use this setting safely. Without the WHERE clause for your protection, data may be double-counted (or counted many times over), and processing inconsistent data could even lead to server crashes. If you disable the generation of the WHERE clause, you assume all responsibility for enforcing data integrity between the data source and the data slice.
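
As an illustration of the join and WHERE clause described above, the statement below is only a sketch of what Analysis Services might generate while processing a partition whose data slice is June 2003. FactSales, DimTime, TimeKey, MonthName, and CalendarYear are hypothetical names, and the real statement selects the specific measure and key columns of the cube rather than every column.

    -- Hypothetical source schema: FactSales joined to DimTime on TimeKey.
    -- With the data slice set, processing retrieves only June 2003 rows.
    SELECT f.*
    FROM FactSales AS f
         INNER JOIN DimTime AS t ON f.TimeKey = t.TimeKey
    WHERE t.MonthName = 'June'
      AND t.CalendarYear = 2003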

Aggregations

The most effective technique you can use to improve overall query responsiveness (assuming that the Analysis Services computer has sufficient memory and hard disk resources) is to design effective aggregations of fact data in each cube partition. However, too many aggregations unnecessarily increase processing time without significantly improving query performance. When you use multiple partitions to increase query and processing performance, it is possible to deploy a new partition with no aggregations. While deploying partitions with different aggregation designs is a common optimization technique, deploying a partition with no aggregations generally indicates an error in deployment that can cause performance problems. You should verify that a minimum number of aggregations exists in each partition. You can quickly determine whether aggregations are defined on a partition by looking at the combined sizes of the .agg.flex.data and .agg.rigid.data files for each partition. Designing too many aggregations slows processing, and designing too few aggregations slows querying. Ensure that all partitions have a minimum number of aggregations – perhaps 10%.

Schema Optimization

Running the Optimize Schema tool on a cube eliminates unnecessary joins between dimension tables and fact tables, provided certain conditions are met. By default, when you first create a cube, Analysis Services constructs a SQL query against the fact table that is an “N+1”-way join (where N is the number of dimensions). From the join, the Analysis Services query extracts the lowest-level key, and from that key Analysis Services begins the aggregation process. A six-way join (a cube with five dimensions, for example) is typically not a major performance problem in most modern relational database systems. However, if your cube has 15 or 20 dimensions, the resulting multi-table join may suffer significant performance problems. Regardless of the number of dimensions in your cube, if you eliminate some of these joins, the Analysis Services query to the relational database is resolved faster and the data flows into Analysis Services more quickly during processing. Fortunately, there is a common design technique that can greatly help the situation. Many star or snowflake schemas are constructed in such a way that the foreign key that points from the fact table to the lowest-level dimension table is not some random number but is the member key itself. If that is true, Analysis Services can “optimize away the join” and pull the member key directly from the fact table instead of joining to the lowest-level dimension table. The conditions that are checked for join removal are:

  • The dimension must be a shared dimension.
  • The dimension must have been processed before you optimize the cube schema.
  • The member key column for the lowest level of the dimension must contain the keys that relate the fact table and the dimension table, and this must be the only key necessary to relate the fact table to the dimension table.
  • The keys in the member key column for the lowest level of the dimension must be unique.
  • The lowest level of the dimension must be represented in the cube; that is, the level’s Disabled property must be set to No. The level can be hidden.

If these conditions are met with respect to a dimension used in a cube, and the cube’s schema is optimized using the Optimize Schema command, Analysis Services composes a query that does not contain a join to that dimension table when processing the cube. If the conditions are met for all dimensions in the cube, the Analysis server needs to read only the fact table to process the cube, and the reduction in processing time can be substantial. Cube schema optimization applies to all partitions of the cube, whether the partitions are processed independently or as a group.

As a general rule, after you have designed the schema for a cube, you should run the Optimize Schema command; it removes the joins that meet the foregoing conditions. Next, determine which dimensions were not eliminated from the join, and then determine how to meet the required conditions so that those dimension tables can be eliminated as well. If you have partitioned your cube and specified the data slice, the dimension table used for the data slice cannot be eliminated; that join is kept to protect you, so that no data outside the data slice is included in the partition.

If you do optimize away a dimension, be aware that the inner join you have just eliminated had a side effect that may have masked problems with your source data: an inner join to the dimension table discards fact table records that do not have matching dimension table records. When you remove the inner join and start using the fact table member keys directly, you may start seeing processing errors that you were not getting before, because Analysis Services generates an error when it processes a fact table record that has no corresponding entry in the appropriate dimension table.

If you re-create a cube, add a dimension to a cube, or remove and then re-add a dimension, you must rerun the Optimize Schema command to re-optimize the cube. New dimensions are always added unoptimized.
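
To make the effect concrete, here is a sketch of the before-and-after processing queries, assuming hypothetical FactSales and DimProduct tables in which the fact table column ProductKey already holds the member key of the lowest Product level:

    -- Before schema optimization: the processing query joins to the
    -- dimension table just to pick up the lowest-level member key.
    SELECT p.ProductKey, f.SalesAmount
    FROM FactSales AS f
         INNER JOIN DimProduct AS p ON f.ProductKey = p.ProductKey

    -- After Optimize Schema (the conditions above are met): the member key
    -- is read directly from the fact table and the join is removed.
    SELECT f.ProductKey, f.SalesAmount
    FROM FactSales AS f

Note that the first form silently discards fact rows that have no matching DimProduct row, while the second form surfaces them as processing errors, as described above.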

  • May 13 / 2009
Analysis Services SSAS, dbDigger, Monitoring and Analysis

Logging and Error Reporting for SQL Server Analysis Services

Analysis Services records a query log to enable you to analyze query patterns and improve your aggregation design. You can configure the properties of this query log. You can also enable a processing log and enable Analysis Services error reporting.

Query Log

To enable the Usage-Based Optimization Wizard to design aggregations based on past usage patterns, and to enable the Usage Analysis Wizard to generate reports analyzing query usage, Analysis Services records the levels touched by every Nth query in a query log. By default, every tenth query is logged. The default location for the query log is C:\Program Files\Microsoft Analysis Services\Bin\msmdqlog.mdb. This file, like any log file, should be secured from unauthorized access.
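
Because the query log is an Access (.mdb) database, you can query it directly to see which level combinations are requested most often. The following is only a sketch and assumes the default QueryLog table with columns such as MSOLAP_Cube and Dataset; verify the actual table and column names in your own msmdqlog.mdb before relying on them.

    -- Run against msmdqlog.mdb (for example, through an Access/ODBC
    -- connection or a linked server). Counts the most frequently
    -- requested level combinations (Dataset) per cube.
    SELECT MSOLAP_Cube, Dataset, COUNT(*) AS Hits
    FROM QueryLog
    GROUP BY MSOLAP_Cube, Dataset
    ORDER BY COUNT(*) DESC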

  • May 13 / 2009
Analysis Services SSAS, dbDigger

Source Data Access for SQL Server Analysis Services

If trusted connections are used to access the source data, the MSSQLServerOLAPService service account must also have a login with permissions to access the source data in the source database, because with trusted connections the MSSQLServerOLAPService service account is used to connect to the data source. If trusted connections are not used, you can specify a user name and password. The permissions required when connecting to a data source also depend on the type of storage structure used for the Analysis Services partitions. When MOLAP storage is used, the service account must have at least SELECT permissions on the source database. If ROLAP or HOLAP storage is used, the service account must have at least SELECT and CREATE TABLE permissions on the source database.
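
As a sketch of the minimum grants described above, assuming a hypothetical DOMAIN\OLAPSvc Windows account (already added as a SQL Server login) is used by the MSSQLServerOLAPService service:

    -- Run in the source database. DOMAIN\OLAPSvc is a hypothetical name
    -- for the MSSQLServerOLAPService service account; substitute your own.
    EXEC sp_grantdbaccess 'DOMAIN\OLAPSvc'                    -- add the login as a database user
    EXEC sp_addrolemember 'db_datareader', 'DOMAIN\OLAPSvc'   -- MOLAP: SELECT on the source tables
    GRANT CREATE TABLE TO [DOMAIN\OLAPSvc]                    -- ROLAP/HOLAP: create aggregation tables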

  • May 13 / 2009
Analysis Services SSAS, dbDigger, Security and Permissions, SQL Server Agent scheduled Jobs

Service Accounts for SQL Server Analysis Services

To understand the permissions required for the MSSQLServerOLAPService and SQL Server Agent service accounts, you need to understand the security context in which various operations are executed. Certain tasks are performed in the context of the logged-on user, and other tasks are performed in the security context of the MSSQLServerOLAPService service account. When you request that Analysis Manager create a new object or browse for an existing object, the task is executed in the security context of the interactive user account of the user performing the task. When Analysis Services processes dimensions, partitions, and mining models, this task is performed in the security context of the MSSQLServerOLAPService service account. You must grant sufficient rights to this service account to ensure that processing will succeed. It is not uncommon for users to assume that if they can create objects, then they can process them. You must ensure that the MSSQLServerOLAPService and the SQL Server Agent service accounts have sufficient permissions for the tasks they need to perform. At a minimum, the service account must be a member of the OLAP Administrators group. This permission is needed for any user (or service that is acting on behalf of a user) that manages an Analysis Services server.

  • May 13 / 2009
Analysis Services SSAS, dbDigger

Data Source Configuration for SQL Server Analysis Services

When you create a new database within an Analysis Services instance, one of your first tasks is to define the data source for the database. A data source contains the information necessary to access the source data for the database objects; the term “data source” actually refers to the data source object that is created, not to the source data itself. When you define the data source in Analysis Manager, the default name given to the object is derived from the names of the source server and database (for example, <servername> - <databasename>). To eliminate confusion when the database is moved to another server, you should change this default naming convention by giving the data source a logical name unrelated to the name of the original computer on which the database was initially created.

To create a logical name for the data source object in Analysis Manager, create the data source object, then copy the new data source object and paste it into the same Analysis Services database. You are then prompted to define a new name for the data source object; the name you choose should reflect the logical type of data, such as Sales Data or Personal Data. After you define the new logical name, delete the original data source object. Thereafter, when you move an Analysis Services database between computers, you can simply change the underlying server and database in the connection string by modifying the properties of the data source object in Analysis Manager (or in your script).

In addition to renaming your data sources to logical rather than physical names, you should ensure that your deployment computers use the same names. If your development computer has a data source named Sales Data, then your QA computer and your production computer should also have their data sources named Sales Data. Using consistent names across the development, QA, and production computers makes migrating individual pieces easier, because you can cut and paste between the Analysis Services databases. If you do not change the name of the data source object before you create objects in the database, you will not be able to change the name of the data source object without using third-party utilities.
