Data Dictionary part I
Having a data dictionary is an invaluable resource for not only for the DBA but for anyone, report writers or developers who need to work on a database that you have to manage or have developed yourself. A good amount of time can be saved from answering the same questions over and over again if you can point to documentation to answer any initial questions. Microsoft provides a way to keep a this information in each database without having to change the structure of any of the objects that already exist. Extended Properties are a valuable resource in keeping metadata close to the heart of the system where it is needed. In this series I will show you my data dictionary frame work. I do create one table data dictionary that contains information that will work in conjunction with MS extended properties system functions, procedures, and views. I have a concept of "data domains" which is a way to relate different columns in the same table or different tables that contain the same logical information that may not support a foreign key relationship. An example of this is state in a data warehouse or master data management system, some tables may use the USPS two letter code, other tables from a different system may use an ISO 3 digit numeric code, but in each of these cases they represent the same information. Connecting these two different data elements in the database can be incredibly useful to a developer especially if they are new to your environment. To facilitate this I create an extended property every column in the database that is "domain" and populate it with the name of the data domain, in the example above the extended properties "domain" would get a value of "state" in each of the two different tables. I then populate my dataDictionary table with the domain "state" and a description for this domain. I have simple stored procedure that will read my dataDictionary table and place all of the domain descriptions in a extended propriety of "MS_Description" for all columns that have a matching value in the "domain" extended property. I then have two simple views that extended the sys.extended_properites system view (SQL 2005 object). Below I have listed the code that will generate the aforementioned objects and a quick sample to show what it can do. (This code requires a metadata schema to exist)
IF EXISTS (SELECT * FROM sys.views
WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedProperties]'))
DROP VIEW [metadata].[columnsExtendedProperties]
GO
IF EXISTS (SELECT * FROM sys.views
WHERE object_id = OBJECT_ID(N'[metadata].[columnsDataDictionary]'))
DROP VIEW [metadata].[columnsDataDictionary]
GO
IF EXISTS (SELECT * FROM sys.objects
WHERE object_id = OBJECT_ID(N'[metadata].[dataDictionary]') AND type in (N'U'))
DROP TABLE [metadata].[dataDictionary]
GO
IF EXISTS (SELECT * FROM sys.objects
WHERE object_id = OBJECT_ID(N'[dbo].[testExtendedProperties]') AND type in (N'U'))
DROP TABLE [dbo].[testExtendedProperties]
GO
IF EXISTS (SELECT * FROM sys.objects
WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedPropertiesRepopulate]') AND type in (N'P', N'PC'))
DROP PROCEDURE [metadata].[columnsExtendedPropertiesRepopulate]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [metadata].[dataDictionary](
[domainName] [varchar](50) NOT NULL,
[domainDesc] [varchar](1000) NOT NULL,
PRIMARY KEY CLUSTERED
(
[domainName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF
, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
go
exec sp_addextendedproperty 'domain', 'domainName'
, 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainName'
exec sp_addextendedproperty 'domain', 'domainDesc'
, 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainDesc'
insert into metadata.dataDictionary
(domainName,domainDesc)
values('domainName','Short text name for the classification of a businessdata type or attribute')
insert into metadata.dataDictionary
(domainName,domainDesc)
values('domainDesc','Description of the a business data type or attribute')
GO
create view [metadata].[columnsExtendedProperties]
as
----------------------------------------------
--Steve Fibich
--1.11.2007
--This view is used to match columns to their extended
--properties. This is an extension of sys.extended_properties
--system view.
--
----------------------------------------------
select
top 100 percent
s.name as schemaName
,t.name as TableName
,c.name as columnName
,e.name as EName
,e.value as EDesc
,c.column_id as columnID
from
sys.extended_properties e
inner join
sys.tables t
on
t.object_id=e.major_id
inner join
sys.columns c
on
e.minor_id=c.column_id
and
e.major_id=c.object_id
inner join
sys.schemas s
on
t.schema_id=s.schema_id
where
class=1
order by
s.name
,t.name
,c.column_id
,e.name
GO
create view [metadata].[columnsDataDictionary]
as
----------------------------------------------
--Steve Fibich
--1.11.2007
--This view is used to match columns to their extended
--properties. It does this for particular extended properties
--"domain" and "MS_Description. It then pivots these results for a
--more useful layout.
--
----------------------------------------------
select
top 100 percent
schemaName
,tableName
,columnName
,[domain]
,[MS_Description]
from
(
select
schemaName
,tableName
,columnName
,columnID
,EName
,EDesc
from
[metadata].[columnsExtendedProperties]
)p
PIVOT
(
min(EDesc)
for EName IN
([domain],[MS_Description])
)as pvt
order by
tableName
,columnID
GO
go
create proc metadata.columnsExtendedPropertiesRepopulate
----------------------------------------------
--Steve Fibich
--1.11.2007
--This proc removes any 'MS_Description' extended properties
--from any columns of any tables that have matching entires in the
--metadata.datadictionary table and that have an extended propertie of 'domain'
--This is to try to make keeping 'MS_Description' descripions upto date eaiser
--
----------------------------------------------
@debug bit=0
as
--if @debug=0
-- BEGIN
-- select @debug=debug from metadata.logging where schemaName='metadata'
-- END
declare @schemaName sysname
declare @tableName sysname
declare @columnName sysname
declare @domainDesc varchar(1000)
declare extProp_cursor cursor for
select
e.schemaName
,e.tableName
,e.columnName
,dd.domainDesc
from
metadata.dataDictionary dd
inner join
[metadata].[columnsExtendedProperties] e
on
dd.domainName=edesc
where
ename='domain'
open extProp_cursor
Fetch next from extProp_cursor into
@schemaName,@tableName,@columnName,@domainDesc
while @@fetch_status=0
BEGIN
if @debug=1
BEGIN
select @schemaName,@tableName,@columnName,@domainDesc
END
if exists (select * from ::fn_listextendedproperty('MS_Description'
, 'schema', @schemaName, 'table', @tableName, 'column', @columnName))
BEGIN
execute sp_dropextendedproperty 'MS_Description','schema'
,@schemaName,'table',@tableName,'column',@columnName
END
execute sp_addextendedproperty 'MS_Description',@domainDesc,'schema'
,@schemaName,'table',@tableName,'column',@columnName
Fetch next from extProp_cursor into
@schemaName,@tableName,@columnName,@domainDesc
END
close extProp_cursor
deallocate extProp_cursor
-----------------------------------------
--Example Table and Extended Properties--
-----------------------------------------
go
create table dbo.testExtendedProperties
(
column1 int
,column2 int
,column3 int
,column4 int
,column5 int
,column6 int
)
exec sp_addextendedproperty 'domain', 'domainName'
, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column1'
exec sp_addextendedproperty 'domain', 'domainDesc'
, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column2'
exec sp_addextendedproperty 'domain', '3rdDomainDesc'
, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column3'
exec sp_addextendedproperty 'domain', '4thDomainDesc'
, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column4'
exec sp_addextendedproperty 'domain', '3rdDomainDesc'
, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column5'
exec sp_addextendedproperty 'domain', '4thDomainDesc'
, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column6'
insert into metadata.dataDictionary
(domainName,domainDesc)
values('3rdDomainDesc','This is the 3rd domain description that we have it should be on column3 and column5')
insert into metadata.dataDictionary
(domainName,domainDesc)
values('4thDomainDesc','This is the 4th domain description that we have it should be on column4 and column6')
execute metadata.columnsExtendedPropertiesRepopulate
select * from [metadata].[columnsDataDictionary]
I will be revising this data dictionary system in future posts, I already have an issue with the way the columnsDataDictionary view functions. As you may have noticed the columnsExtendedPropertiesRepopulate overwrites any already existing "MS_Description" values already in place. This will be enhanced in future versions.