T-Sql Programmability Features in Sql server 2008 - Part 5 - SQL Programmers

T-Sql Programmability Features in Sql server 2008 - Part 5

09/09/2010

This is Part 5 of Tsql Features in Sql 2008. In this article we discussed about Sparse columns, Filtered Indexes, Object dependencies, Collation alignment and Depreciation.

 

Before reading this we suggest to read

T-Sql Programmability Features in Sql server 2008 - Part 1
T-Sql Programmability Features in Sql server 2008 - Part 2
T-Sql Programmability Features in Sql server 2008 - Part 3
T-Sql Programmability Features in Sql server 2008 - Part 4

Sparse columns

Sparse columns are ordinary columns that have an optimized storage for null values. Sparse columns reduce the space requirements for null values, at the cost of more overhead to retrieve nonnull values. Consider using sparse columns when the space saved is at least 20 percent to 40 percent. Sparse columns and column sets are defined by using the CREATE TABLE or ALTER TABLE statements. Sparse columns can be used with column sets and filtered indexes.

Properties of Sparse Columns:

  • Storing a null in a sparse column takes up no space at all.
  • To any external application the column will behave the same.
  • Sparse columns work really well with filtered indexes as you will only want to create an index to deal with the non-empty attributes in the column.
  • You can create a column set over the sparse column that returns an xml clip of all of the non-null data from columns covered by the set.  The column set behaves like a column itself. Note: you can only have one column set per table.
  • Change Data Capture and Transactional replication both work, but not the column sets feature.
  • Catalog views for a table that has sparse columns are the same as for a typical table. The sys.columns catalog view contains a row for each column in the table and includes a column set if one is defined.
  • When a sparse column is explicitly updated, the corresponding bit for that sparse column is set to 1, and the bit for the column set is set to 1.
  • When a column set is explicitly updated, the bit for the column set is set to 1, and the bits for all the sparse columns in that table are set to 1.
  • For insert operations, all bits are set to 1.

Example

CREATE TABLE dbo.Student
(
  ID INT NOT NULL PRIMARY KEY,
  Name VARCHAR(100) NOT NULL,
  Mark1 INT  SPARSE NULL,
  Mark2 INT SPARSE NULL,
  Mark3 INT SPARSE NULL
);

INSERT INTO dbo.Student(ID, Name, Mark1,Mark2) VALUES(1, 'XXXX', 80,65);
INSERT INTO dbo.Student(ID, Name, Mark3) VALUES(2, 'YYYY', 90);

SELECT ID, Name, Mark1, Mark2, Mark3 FROM dbo.Student; 

Result

Sparse Column

Filtered indexes

SQL Server 2008 introduces a new feature called Filtered Index. A Filtered Index is an optimized form of non clustered index, especially suited to cover queries that select from a well-defined subset of data. It is basically useful to cover those queries which return a small percentage of data from a well defined subset of data within a table. A well-designed filtered index can improve query performance, reduce index maintenance costs, and reduce index storage costs compared with full-table indexes.

In the earlier versions of SQL Server, Developers and Database Administrators have always faced performance issues when dealing with large amount of data in the tables. In order to get the best performance for your queries in the previous SQL Server Versions you might have to partition the table as in SQL Server 2005 or archive the data periodically in the earlier SQL Server Releases.

Advantages of Filtered Indexes

Improved query performance and plan quality

A very well designed filtered index will help you improve query performance on very large tables and this will also generate a better execution plan because, it is smaller than a full-table nonclustered index and has filtered statistics. The filtered statistics are more accurate than full-table statistics because it will only cover those rows available in the filtered index WHERE clause.

Reduced index maintenance costs

An index is maintained only when data manipulation language (DML) statements affect the data in the index. A filtered index reduces index maintenance costs compared with a full-table nonclustered index because it is smaller and is only maintained when the data in the index is affected. It is better to have large number of filtered index, especially in scenarios when you know that the data on which the filtered index is created changes less frequently. Similarly, if a filtered index contains only the frequently affected data, the smaller size of the filtered index reduces the cost of updating the statistics.

Reduced index storage costs

Another major advantage of creating a Filtered Index is that it will reduce the disk storage space for non clustered indexes when a full table index is not required. We can replace a full-table nonclustered index with multiple filtered indexes without significantly increasing the disk storage space for the indexes.

Example

Create table OrderDetails(OID INT,ItemID VARCHAR(100),Amount MONEY)

INSERT INTO OrderDetails(OID, ItemID, Amount) VALUES(1,1,1500)
INSERT INTO OrderDetails(OID, ItemID, Amount) VALUES(1,1,500)
INSERT INTO OrderDetails(OID, ItemID, Amount) VALUES(1,1,800)
INSERT INTO OrderDetails(OID, ItemID, Amount) VALUES(1,1,3000)

The following code creates a nonclustered index on the Amount column, filtering rows where the Amount is greater than or equal to 1000.

CREATE NONCLUSTERED INDEX IDX_ORDER_1000_OR_MORE
ON OrderDetails(Amount) WHERE Amount >= 1000;

SELECT * FROM OrderDetails WHERE Amount >= 1000;

Filtered indexes can also be defined as UNIQUE and have an INCLUDE clause as with regular nonclustered indexes. The optimizer considers using an index when a subinterval of the index filtered interval is requested in the query filter.

Object dependencies

Some database objects have dependencies upon other database objects. For example, views and stored procedures depend upon the existence of tables that contain the data returned by the view or procedure. The Object Dependencies for the current object lists both the database objects that must be present for the object to function properly and the objects that depend upon the selected object. An object that references another object in its definition and that definition is stored in the system catalog. This is called a referencing entity. An object that is referred to by another object is called a referenced entity.

SQL Server 2008 introduces three new objects that provide object dependency information

  • sys.sql_expression_dependencies
  • sys.dm_sql_referenced_entities
  • ys.dm_sql_referencing_entities


sys.sql_expression_dependencies

This view holds one record for each dependency on a user-defined object in the current database.  These user-defined objects can be objects stored in the current database, objects stored in a different database and referenced using a three part naming convention (databasename.schemaname.objectname), objects on a different server and referenced via a linked server using a four part naming convention (servername.databasename.schemaname.objectname), and objects that do not exist at the time the object is created (known as deferred objects).

Example

GO
CREATE TABLE [dbo].[OrderDetails](
    [OID] [int] NULL,
    [ItemID] [varchar](100) NULL,
    [Sales] [money] NULL
) ON [PRIMARY]

GO

CREATE PROCEDURE usp_GetOrderDetails
AS
SELECT COUNT(ItemID) AS ItemID, SUM(SaleS) AS SalesAmount FROM dbo.[OrderDetails]

SELECT ReferencingObject = sysobj.name ,ReferencedObject = dep.referenced_entity_name FROM sys.sql_expression_dependencies dep join sys.objects sysobj on dep.referencing_id = sysobj.object_id WHERE sysobj.name = 'usp_GetOrderDetails'

Result

object dependencies

sys.dm_sql_referenced_entities

This function returns one row for each user defined object referenced by name in the definition of the specified referencing entity.

Example

SELECT ReferencedEntityName = sysobj.name, obj.referenced_entity_name, referenced_minor_name FROM sys.objects sysobj JOIN sys.schemas syssch on sysobj.schema_id = syssch.schema_id CROSS APPLY sys.dm_sql_referenced_entities(syssch.name + '.' + sysobj.name, 'OBJECT') obj WHERE referenced_entity_name = 'OrderDetails'

Result

object dependencies

sys.dm_sql_referencing_entities

This function returns one record for each user defined object in the current database that references another user defined object by name. 

Example

SELECT ReferencingObject = referencing_entity_name,ReferencedObject = sysobj.name FROM sys.objects sysobj JOIN sys.schemas syssch on sysobj.schema_id = syssch.schema_id CROSS APPLY sys.dm_sql_referencing_entities(syssch.name + '.' + sysobj.name, 'OBJECT') obj WHERE     referencing_entity_name = 'usp_GetOrderDetails'

Result

 object dependencies

Dependency Type

It displays the type of dependency between two objects, which can be one of the following:

Schema-bound dependency

It is a relationship between two objects that prevents the referenced object from being dropped or modified as long as the referencing object exists. A schema-bound dependency is created when a view or user-defined function is created by using the WITH SCHEMABINDING clause, or when a table references another object through a CHECK or DEFAULT constraint or in the definition of a computed column.

Non-schema-bound dependency

It is a relationship between two objects that does not prevent the referenced object from being dropped or modified.

Not available

It indicates the dependency type that cannot be determined. This occurs only when the selected object is on an instance of SQL Server that is earlier than SQL Server 2008.

Change data capture

Change data capture is a new mechanism in SQL Server 2008 that enables you to easily track data changes in a table. Change data capture is designed to capture insert, update, and delete activity applied to SQL Server tables, and to make the details of the changes available in an easily consumed relational format. The changes are read by a capture process from the transaction log and recorded in change tables. The change tables used by change data capture contain columns that mirror the column structure of a tracked source table, along with the metadata needed to understand the changes that have occurred.

Before CDC we might simply query a last updated DATETIME column in our source system tables to determine what rows have changed.  While this is simple, and pretty effective, it is of no use in determining any rows that were physically deleted.  In addition we can't determine what was changed when; we can only access the current state of a row that has changed.  CDC provides a configurable solution that addresses these requirements and more.

Change data capture is only available in the Enterprise, Developer, and Enterprise Evaluation editions.

Example

CREATE TABLE dbo.Customer
(
  CustomerID  INT         NOT NULL,
  CustomerName   VARCHAR(30) NOT NULL, 
);

INSERT INTO dbo.Customer(CustomerID, CustomerName) VALUES(1, 'Customer1')

The following code enables the table for change data capture.

EXECUTE sys.sp_cdc_enable_table
    @source_schema = N'dbo'
  , @source_name = N'Customer'
  , @role_name = N'cdc_Admin';

Collation alignment

SQL Server 2008 aligns support for collations with Microsoft Windows Server® 2008, Windows Vista®, Windows Server 2003, and Windows® XP Home Edition. The new SQL Server version adds new collations and revises existing collation versions. You can recognize the new and revised collations by finding the number 100 (the internal version number of SQL Server 2008) in their names.

SQL Server 2008 is fully aligned with the collations in Windows Server 2008.

To this end, it has 80 new collations which are denoted by *_100 version references. The total number of variations for these collations exceeds 1400.

The purpose of this enhancement is to provide users with the most up-to-date and linguistically accurate cultural sorting conventions.

In order to see the new collations along with their variations in SQL Server 2008 you can use the following T-SQL query

SELECT [name], [description] FROM sys.fn_helpcollations() WHERE [name] LIKE '%100%'

Deprecation

A deprecated feature is one that will not be supported in a future version of SQL Server. These old features can still be used, but you need to start modifying code now to ease the transition when you upgrade.

The list of deprecated features can be viewed in books online, where they are divided into those due to be removed in the next version, and those that will removed after that.

The following query will return all SQL Server 2008 deprecated features that have been used since the last service restart, together the number of occurrences.

select instance_name, cntr_value from sys.dm_os_performance_counters where [object_name] like '%Deprecated%' and cntr_value > 0