User Tools

Site Tools


statistics_indexes

SQL Server

Statistics

Windows URL: Statistics

Statistics determine how the query optimizer chooses to create a query plan. A wrong query plan leads to unacceptable query execution times.

By default, SQL Server will update statistics automatically. The statistics are updated based on how much data is inserted, updated or deleted from a table relative to how much data is already in the table. This means that even though data in a table changed over time the statistics might not be updated. This calls for manual updates.

In SQL Server there are two ways to update statistics manually. The built-in function sp_updatestats or by calling a specific update statement on an object (update statistics <object>).

sp_updatestats

Pro:
Can be used on a database in one statement
Con:
By default, it only uses a part of the data to create statistics. So it can still provide bad input to the query optimizer.

Example:

USE qpdb
GO

EXEC sp_updatestats

UPDATE STATISTICS

Pro:
Has many properties to set
Con:
Used per object, so you have some extra work to update an entire schema/database
By default, it only uses a part of the data to create statistics. So it can still provide bad input to the query optimizer. You have to use “WITH FULLSCAN” to use all data, which will in turn take longer to finish.

Example:

USE qpdb
GO

UPDATE STATISTICS dbo.qp_schema WITH FULLSCAN

Tip: Use the following:
USE qpdb
GO

EXEC sp_MSForEachTable 'UPDATE STATISTICS ? WITH FULLSCAN;'

Tip: Use WITH NORECOMPUTE to stop automatic updates of the statistics after data is loaded and not going to change (often / a lot).

Recommended It is recommended to use UPDATE STATISTICS WITH FULLSCAN. Often, it is better to update the statistics less often, but do it well than too often with only a subset of the data.

Indices

Indices help the query optimizer by providing help with finding data. Indices can be clustered and non-clustered.

Clustered index

A clustered index means that data in a table is stored sorted by the fields in the index, so the query optimizer can find data as fast as possible. There can only be one clustered index. If a primary key exists in a table, that will be the clustered index.

Non-clustered index

A non-clustered index means that a file is created with the fields in the index and pointers to the location of the data in the table. In other words, it requires extra hard drive space, some overhead to keep the index up-to-date, but in turn it provides an extra way to read from a table faster.

Keeping the indices up-to-date can be done by means of REORGANIZE or REBUILD. The REORGANIZE option tries to find how much the data changed in a table (relatively) and will update the index if needed. The REBUILD option is the same as dropping the index and creating it again.

Statistics vs indexes

Tip: use one of the many scripts available on the internet.
Ola Hallengren

statistics_indexes.txt · Last modified: 2017/10/10 14:17 by markmeteenk