Why getting the wrong estimated number of rows, using filtered index in SQL query?

543    Asked by ChloeBurgess in SQL Server , Asked on Apr 23, 2021

I have a table with a nonclustered index on a datetime2 field. On that same table i have a field (char[1]) which is used to logically delete records and can have 2 distinct values : A(active) or D(deleted). There are 451047 records with the datetime2 field set at NULL but only 7095 are marked as A. Every query in the application looks for active records only, so, every query that's looking for NULLs in the datetime field was getting really bad estimates and, with that, bad execution plans. Simple example: 

I decided then to create a filtered non clustered index but estimates were still incorrect: Plan using a filtered index, still getting wrong estimatesPlan without filtered index: look at estimated rows vs actual rows

It looks like i still get the old estimates, even if the query is using properly the filtered index. Does anyone know the reason of this behaviour?

These are the new statistics of the filtered index: Filtered index statistics

Table definition:

CREATE TABLE [TYDATPRD].[HAND00F]( [STDRECSTS] [char](1) NULL, [HDHAND] [numeric](14, 0) IDENTITY(1,1) NOT NULL, [HDCHKINDT] [datetime2](7) NULL, --lots of other columns which I don't think are needed CONSTRAINT [PK_TYDATPRD_HAND00F] PRIMARY KEY CLUSTERED ( [HDHAND] ASC ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [DATA] ) ON [DATA]

Nonclustered filtered index definition:

CREATE NONCLUSTERED INDEX [IX_HAND00F_HDCHKINDT] ON [TYDATPRD].[HAND00F] ( [HDCHKINDT] ASC ) WHERE [STDRECSTS]='A' WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [INDEXES] GO

What is SQL Server filtered index?











Answered by Chloe Burgess

SQL Server filtered index: A filtered index is an optimized nonclustered index especially suited to cover queries that select from a well-defined subset of data. It uses a filter predicate to index a portion of rows in the table. This is not intuitive but try adding the char[1] column to the filtered index definition:

  CREATE NONCLUSTERED INDEX [IX_HAND00F_HDCHKINDT] ON [TYDATPRD].[HAND00F] ( [HDCHKINDT], STDRECSTS ) WHERE [STDRECSTS] = 'A' WITH (DROP_EXISTING = ON) ON [INDEXES];

In my testing, this led to the right estimates. As an aside, I noticed that the plans in your question were different - one was properly parameterized and the other had a constant. Be very careful about how you're doing local testing with local variables and typed constants, especially if in the real world this will be done by parameters to stored procedures.

  Also, be very careful about case - literals in the where clause like = 'a' and = 'A' will generate different plans if they don't get parameterized, because the query text is different. Never mind that in some collations they won't yield the same results. (And does that char(1) column need to allow NULL?)


Your Answer

Interviews

Parent Categories