Gathering Extended Properties into your Management Data Warehouse database

Originally Posted 8/27/2014

In the first post on the Management Data Warehouse we looked at how to set it up and start collecting default data using Data Collection.  In this post, we will see how to create a custom collector to gather all Extended Properties at the database level on our Instances configured for Data Collection.   This shows how easy it is to create a custom collection set and collect information about our SQL Server instances.

Before we dive into the process, lets discuss a little more about what we are trying to accomplish.  If you have been following along with the Froogle DBA posts, then you know that I recommend you document your database inside the databases themselves using Extended Properties.  By doing so, you can use Central Management Servers to report on your environment.  But sometimes it is just easier and faster to query this information from a single source.  Since we are already leveraging Management Data Warehouse why don’t we use it to consolidate this information from across our environment.  We are going to gather all Extended Properties set on databases, once a day and store them in the Management Data Warehouse database.  We are going to decide we are not going to keep a lot of history on this information, so we will just keep one weeks worth of data.

With this information let’s start working on building our collector.  To start with we need to understand that the way data collection works is that a collector is going to run on each of our instances and report back to the Management Data Warehouse.  This is a distributed collection model and means we will eventually need to “install” this collector on each of our instances.  The implication of this is we will only need to worry about writing a collector to gather information for one instance and then deploy that collector to all of our instances.  We can also further simplify the process of creating the collector by using the data collection’s ability to run a script against all user database on an instance.  With this we only need to write a query to gather the database level Extended Properties for a given database.  The query we will use is shown below.


WITH NULLRECORD AS (
SELECT 0 AS class
)
SELECT
sys.extended_properties.name
, sys.extended_properties.value
FROM NULLRECORD
LEFT JOIN sys.extended_properties ON NULLRECORD.class = sys.extended_properties.class;
The reason we perform the left join is to ensure that we get at least one row per database.  The default collectors that come with SQL Server are gathering data at the instance level and not the database level.  We could use the query below, but we would not be able to answer the question in MDW of which databases are missing Extended Properties.

SELECT
sys.extended_properties.name
, sys.extended_properties.value
FROM sys.extended_properties
WHERE sys.extended_properties.class = 0;

So now that we have our query let’s dive into the process of creating a Data Collection Set.  A Data Collection Set is made up of a Collection Schedule, and a series of Collection Items.  We will have just the one collection item which will gather the Extended Properties.  If we had other things we would like to collect once a day it is best practice to add them as additional items to a single Collection Set vs. having individual Collection Sets for each.  This greatly improves the scalability of Management Data Warehouse.  So let’s address each of these one at a time.  We’ll tackle them in reverse order starting with the Collection Set.

How we build the Collection Set is determined by the Data Collector Type we wish to use.  There are four Data Collector Types provided by SQL Server; Generic T-SQL Query, Generic SQL Trace, Performance Counters, and Query
Activity.  Since the information we wish to gather is based on a T-SQL query we will be using the Generic T-SQL Query Collector Type.  Each collector type has a different set of information it needs which is provided in an XML segment to the Data Collection engine.

In this case we need to tell it the Query to run, the Table to load the data into, and what database(s) to run the query
against.  You can look at the Generic T-SQL Query Collector Type link for more specifics, but the XML segment we want to use is as follows.

<ns:TSQLQueryCollector xmlns:ns=”DataCollectorType”>
<Query>
<Value>
WITH NULLRECORD AS (
SELECT 0 AS class
)
SELECT
sys.extended_properties.name
, sys.extended_properties.value
FROM NULLRECORD
LEFT JOIN sys.extended_properties ON NULLRECORD.class = sys.extended_properties.class
</Value>
<OutputTable>ExtendedProperties</OutputTable>
</Query>
<Databases
UseUserDatabases=”true”/>
</ns:TSQLQueryCollector>

This segment says to run the query against all user databases (<Databases UseUserDatabases=”true” />) and store the results in a table in the custom_snapshots schema in the Management Data Warehouse called “ExtendedProperties” (<OutputTable>ExtendedProperties</OutputTable>).  We will need to wrap this XML in a stored procedure call later, but before we put this together we need to address the issue of scheduling.

SQL Server ships with 6 schedules for Data Collection Sets.  They are all designed to run multiple times a day.  There are collection schedules available for collecting at 5, 10, 15 and 30 minutes.  There are also collection schedules available to collect at 1 and 6 hours.  We could just use the 6 hour collection and collect multiple times a day.  But since we don’t have a business need to do so we will take advantage of the fact that we can create our own schedules. All of the schedules created for data collection all start with “CollectorSchedule_”.  To avoid confusion (and the chance that SQL Server adds a daily collection schedule in the future) we will avoid using that convention.  We’ll use “Collector_Schedule_” instead.

In order to create a new schedule we need to use the sp_add_schedule stored procedure in the msdb databases.  If you are interested in all the options available you can refer to the information in BOL.  But for our purposes to create a schedule to run daily at 01:00 we would use the following script.  We will intentionally not include the time of day in the name of the schedule, because as our environment grows (or if you have a large environment already) we may need to stagger the times that instances report into Management Data Warehouse to improve performance.

USE msdb;

IF NOT EXISTS(
SELECT schedule_uid
FROM [msdb].[dbo].[sysschedules]
WHERE name = N’Collector_Schedule_Daily’
)
EXEC sp_add_schedule @schedule_name =  ‘Collector_Schedule_Daily’,
@enabled = 1
, @freq_type = 4
, @freq_interval = 1
, @freq_subday_type = 1
, @freq_recurrence_factor = 1
, @active_start_time = 010000
, @owner_login_name = ‘sa’;

So now that we have the xml segment for the collector and the schedule, lets put it all together into a single script which will create the schedule if needed, drop the collection set if it exists, create the collection set, and then finally start up the collection set.  This is the script that we can use with Central Management Server to get all of our Instances collecting Extended Properties and reporting them to the Management Data Warehouse.

USE msdb;

DECLARE @collectorname VARCHAR(128);
SET @collectorname = ‘Database Extended Properties’;

IF NOT EXISTS(
SELECT schedule_uid
FROM [msdb].[dbo].[sysschedules]
WHERE name = N’Collector_Schedule_Daily’
)
EXEC sp_add_schedule @schedule_name =  N’Collector_Schedule_Daily’,
@enabled = 1
, @freq_type = 4
, @freq_interval = 1
, @freq_subday_type = 1
, @freq_recurrence_factor = 1
, @active_start_time = 010000
, @owner_login_name = ‘sa’;

IF EXISTS(
SELECT name
FROM [syscollector_collection_items_internal]
WHERE name = @collectorname

)
EXEC sp_syscollector_delete_collection_set @name = @collectorname;

DECLARE @collector_type UNIQUEIDENTIFIER;
SELECT @collector_type = collector_type_uid
FROM [msdb].[dbo].[syscollector_collector_types]
WHERE name = N’Generic T-SQL Query Collector Type’;

DECLARE @schedule UNIQUEIDENTIFIER;
SELECT @schedule = schedule_uid
FROM [msdb].[dbo].[sysschedules]
WHERE name = N’Collector_Schedule_Daily’;

DECLARE @collectionsetid INT;
DECLARE @collectionsetuid UNIQUEIDENTIFIER;
EXEC [msdb].[dbo].[sp_syscollector_create_collection_set]
@name=@collectorname,
@collection_mode=1, — Non-cached mode
@description=N’Records Database Extended Properties’,
@days_until_expiration=7,  — 1 week
@schedule_uid=@schedule,
@collection_set_id=@collectionsetid OUTPUT,
@collection_set_uid=@collectionsetuid OUTPUT;

DECLARE @parameters XML;
SET @parameters=‘<ns:TSQLQueryCollector xmlns:ns=”DataCollectorType”>
<Query>
<Value>
WITH NULLRECORD AS (
SELECT 0 AS class
)
SELECT
sys.extended_properties.name
, sys.extended_properties.value
FROM NULLRECORD
LEFT JOIN sys.extended_properties ON NULLRECORD.class = sys.extended_properties.class
</Value>
<OutputTable>ExtendedProperties</OutputTable>
</Query>
<Databases
UseUserDatabases=”true”/>
</ns:TSQLQueryCollector> ‘
;

DECLARE @collectionitem INT;
EXEC [msdb].[dbo].[sp_syscollector_create_collection_item]
@name=@collectorname,
@parameters=@parameters,
@collection_item_id=@collectionitem OUTPUT,
@collection_set_id=@collectionsetid,
@collector_type_uid=@collector_type;

EXEC sp_syscollector_start_collection_set @name = @collectorname;

So now we have a collector we can use to collect Database Extended Properties for all our user databases.  Once we deploy this to all of our SQL Server instances we will have a central repository of this information we can use.  There are many other collectors we could write to gather information on our environment.  We will save those for another time.

I hope you have been enjoying the Froogle DBA series.  I would love to hear any feedback you might have.

Leave a comment