Welcome to VSTSC's Community Server Sign in | Join | Help

space_used_sp vs. sp_spaceUsed

Ok I'm not one to re-write standard system stored procedures for SQL Server.  That being said I have been working a good deal with files and file groups within SQL Server over the past few days moving data and indexes to files and file groups other than the default PRIMARY file group.  Sp_spaceUsed is great procedure for giving you overall storage utilization information for a database or database objects (table, index view, or queue).  As I have been moving indexes and tables over to different file groups I wanted to see how this was splitting my data as far as size and row counts was going and how it was being disturbed across the file system.  Unfortunately sp_spaceUsed does not give you this level of detail and I could not find a system view or other system procedure to give me all of the information that I wanted. 

sp_spaceUsed

So I set out to create my own and now that I have it working and figured I'll share it with everyone.  I ended up creating two objects, one a view sp_space_used_view and the second a procedure space_used_sp. I’m a database guy so I love organization so I put both of these objects into a schema named metadata that is owned by DBO.  The view displays the space used by all objects in the database with rollup levels being displayed as ' -db-level', ' -schema-level', and ' -table-level'.  The view relies upon the sys.objects (due to the fact that system tables are not in sys.tables so I was losing the space and row count information for these objects.), sys.indexes, sys.partions, sys.database_files, sys.schemas, sys.dm_db_partion_stats, sys.internal_tables, and sys.index_columns catalog views; all of which have detail coverage in BOL.  The view utilizes these system catalog views to show rows counts, reserved space, data space, index space, and unused space in the database files broken down by table.  I figure I would want to see all of the detailed data for a given database sometimes, rollups others, and individual objects separately as well so encapsulation of the select logic into a view allows re-use of this code in other objects.  The procedure utilizes the view to show you information for a particular table or for the overall database.  I hope this code is useful to others as well.  One of the main differences between sp_spaceUsed and my space_used_sp is that sp_spaceUsed allows for the updating of system information which my procedure does not, it only returns data to the screen.  The second difference is that sp_spaceUsed requires the schema name to be passed as part of the table name if you want to view information on an object outside of your default schema, space_used_sp does not require this and displays all objects with the name you provide giving you the schema name as part of the output.

space_used_sp vs. sp_spaceUsed

Here it is:  (Note:everything is created in a schema call metadata)

IF  EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[metadata].[space_used_vw]'))
DROP VIEW [metadata].[space_used_vw]
go
create view metadata.space_used_vw
-------------------------------------------
--SFIBICH
--7/7/2008
--The purpose of this view is to show space used broken
--down by SQL Server data files.  Currently is does not calculate
--space for fulltext indexes and or XML indexes.
--
-------------------------------------------
as
select
top 100 PERCENT
coalesce([schema_name],' -db level-') schemaName
,coalesce(table_name,' -schema level-') tableName
,coalesce(data_file_name,' -table level-') dataFileName
,[rowCount]
,reserved*8 reservedKB
,data*8 dataKB
,case when used_page_count>data then (used_page_count-data)*8 else 0 end indexKB
,case when reserved>used_page_count then (reserved-used_page_count)*8 else 0 end unusedKB
from
(
 SELECT  
 t.name table_name
 ,df.name data_file_name
 ,s.name [schema_name]
 ,SUM( 
    CASE 
  WHEN (p.index_id < 2) THEN row_count 
  ELSE 0 
    END 
 )  [rowCount]
 ,SUM(reserved_page_count)+coalesce(max(IT_reserved_page_count),0) reserved
 ,SUM( 
    CASE 
  WHEN (p.index_id < 2) THEN (in_row_data_page_count + lob_used_page_count + row_overflow_used_page_count) 
  ELSE lob_used_page_count + row_overflow_used_page_count 
    END 
 )data 
 ,SUM(used_page_count)+coalesce(max(IT_used_page_count),0) used_page_count
  FROM sys.dm_db_partition_stats p
 inner join
 sys.objects t
 on
 p.object_id=t.object_id
 inner join
 sys.schemas s
 on
 t.schema_id=s.schema_id
 inner join
 sys.indexes i
 on
 p.object_id=i.object_id
 and
 p.index_id=i.index_id
 inner join
 sys.database_files df
 on
 i.data_space_id=df.data_space_id
 left outer join
 (--Internal tables are placed on the same filegroup as the parent entity. (BOL)
  select
  pp.object_id
  ,min(pp.index_id) index_id
  ,sum(p.reserved_page_count) IT_reserved_page_count
  ,sum(p.used_page_count)  IT_used_page_count
  FROM
  sys.dm_db_partition_stats p
  inner join
  sys.internal_tables it
  on
  p.object_id = it.object_id 
  inner join
  sys.dm_db_partition_stats pp
  ON
  it.parent_object_id=pp.object_id
  WHERE
  it.internal_type IN (202,204) 
  group by
  pp.object_id
 )it
 on
 p.object_id=it.object_id
 and
 p.index_id=it.index_id
 group by
 s.name
 ,t.name
 ,df.name
 with rollup
) dataTable
order by
[schema_name]
,table_name
,[rowCount] desc
,data_file_name asc
go
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[metadata].[space_used_sp]') AND type in (N'P', N'PC'))
DROP PROCEDURE [metadata].[space_used_sp]
go
create procedure metadata.space_used_sp
-------------------------------------------
--SFIBICH
--7/7/2008
--This is a replacement for sp_spaceused, it utlizes the view
--[metadata].[space_used_vw].  Currently it only reports on tables
--and the total for tables in a schema, or database level broken
--down by database files.
--
-------------------------------------------
@tableName sysname =null
as
if @tableName is null
begin
--------------------
--
--database, schema info
--
--------------------
 select
 ' -db level-' schemaName
 ,' -schema level-' tableName
 ,dataFileName
 ,sum([rowCount]) [rowCount]
 ,sum(reservedKB) reservedKB
 ,sum(dataKB) dataKB
 ,sum(indexKB) indexKB
 ,sum(unusedKB) unusedKB
 from
 [metadata].[space_used_vw]
 where
 dataFileName <>' -table level-'
 or
 (
 dataFileName =' -table level-'
 and
 schemaName=' -db level-'
 and
 tableName =' -schema level-'
 )
 group by
 dataFileName
 order by [rowCount] desc,dataFileName
end
else
begin
--------------------
--
--particular table
--
--------------------
 select
 schemaName
 ,tableName
 ,dataFileName
 ,[rowCount]
 ,reservedKB
 ,dataKB
 ,indexKB
 ,unusedKB
 from
 [metadata].[space_used_vw]
 where
 tableName =@tableName
end

go

Ordering SSIS packages in Visual Studio 2005 (part II)

Ok the first post on Ordering SSIS packages in Visual Studio received a large number of hits...no comments but a good number of hits so I figured a follow up was due.  After my first post I found myself ordering all of my SSIS packages with the order-ssisProj.ps1 code but I found it to be a bother since I had to put in the location of each .dtproj file.  So with this second post I will introduce a script I wrote a few days ago that orders all of my SSIS packages.  It finds any .dtproj file in the default USERPROFILE + \my documents\ and orders them.  Before I get into the powershell script a quick intro about running a powershell script.  I'm not going to go into great detail other than to say you have to enable scripting in powershell.  Second you have to take the code below and put it into a text file and save it with a .ps1 extension.  (As always you use this script at your own risk...and you should always understand what code you grab off a website is doing)

order-SSISproj2.ps1

 #-----------------------------------------------------------------------------
#
#SFIBICH
#5/18/08
#Version 1.0
#order-SSISproj.ps1
#This script saves a backup file as a .bak
#This script will search the USERPROFILE enviorment variable + My documents
#as its starting place for .dtproj files
#This script allows args that will replace the default search path
#Use at your own risk, no guarantees
#-----------------------------------------------------------------------------
#
#$ars[0] - relative path to the file name
#$a = path to this file when executed
#$c = counter variable
#$i = counter for progress bar
#$x = xml object varaible
#$y = list of all dtproj files
#$z = list of dtspackages node items inside the xml document
#
#-----------------------------------------------------------------------------
$a=$^
$i=0
if ($args[0].length -gt 0) {
 $fileLocation=$args[0]
 if ($fileLocation -eq '?') {
  get-content $a | ? {$_.length -gt 0} | ? {$_.substring(0,1) -eq '#' }
  exit
 }
 else {
  date;$y=gci -path ($fileLocation) -include *.dtproj -recurse;date
 }
}else{
 write-warning('starting search for *.dtproj files at '+$x.value+'\My Documents\   this is a recursive search and may take a while')
 date;$y=gci -path ($x.value+'\My Documents\') -include *.dtproj -recurse;date
}
$y | % {
 $i+=1
 write-progress "re-ordering dtproj files- progress" "% complete" -perc ($i/$y.length*100)
 $_.Name
 [xml]$x = get-content $_.FullName
  $x.save($_.FullName+'.bak')
  $z=$x.project.dtspackages.dtspackage | sort-object -property name
  $c=0
  $z| % {if ($c -eq 0) {$c+=1} else {$x.project.dtspackages.RemoveChild($_)}} | out-null
  $z| % {if ($c -eq 1) {$c+=1} else {$x.project.dtspackages.AppendChild($_)}} | out-null
  $x.save($_.FullName)
}

 

order-SSISproj2.ps1 (the walk thru)

Ok the first few lines 1 thru 22 are standard documentation.  This will be use later and is useful to anyone who wants to update or modify the code in the future.  Lines 23 grabs the first token of the previous command line (The file name in this case) and assigns it to the variable $a. Line 24 sets $i to a default value of 0.  Lines 24 thru 34 parse the arguments passed to the script to figure out first if there are any arguments and if they are what are they.  If there where arguments passed to the script and they happen to be '?' (the help command) then the first argument of the previous commands contents are read. (basically go read the file that was entered into the command) Sending on the lines with comments to the screen for ouptut.  If the arguments passed to the script are not '?' then it is assumed that it is a starting location for a search for *.dtproj files.  That directory and every directory inside it are searched for proj files and assigned to the array $y.  Else the default search path is searched for *.dtproj files and those locations are assinged to the array $y. Lines 38 thru 49 is where the bulk of the work happen.  Line 38 executes a for each on the $y array looping through the code block line 39 to 49.  Line 39 increments the $i variable so we can show progress.  Line 40 brings up the progress bar in powershell and shows the percent complete incrementing with each loop through this code block. Line 41 writes the name of the file the script is working on to standard output.  Line 42 gets the content of the dtproj file for this loop and places it into a XML casted variable $x.  Now that line of code is not mine, I can't remember where I saw it but it is very slick!  Now I have an XML object to work with full of data!  First thing I do is to save the xml file with the same file name and a.bak extension just in case I mess something up.  Line 43 grabs all of the nodes in the project.dtspackages node and places them sorted into the $x variable.  In line 45 I set the $c variable to 0 so I can check this later to know how many times I have gone thru inner loops.  Line 46 removes all except for the first dtspackage node objects, based on the ordered set in $x.  I had to do this because when I removed all of the nodes the parent dtspackages node no longer existed and I couldn't add anything back to it. Line 47 adds all of the dtspackage nodes back in the sorted order. Line 48 saves the file and then line 49 you loop back to start the next file.  That’s it short and sweet!

Ok and here is the before and after:

Before (notice the XX_XXX_DX packages at the bottom)

After (notice that the XX_XXX_DX packages are in the correct location)

Once again I am not any way a powershell expert but I wanted to share a useful script that I developed for my own use with anyone who is working in an environment with a large number of SSIS packages in a single project or solution.  Let me know your thoughts?  If you have an improvement to the script send it to me or post it!

 

 

Ordering SSIS packages in Visual Studio 2005

Problem: SSIS Packages in Microsoft Visual Studio 2005 are ordered in the order of package creation by default, or more precisely the order they have been added to that solutions/projects folder.  What this means is that if you have an SSIS project that has more than a few SSIS packages let’s say 70+ it becomes an issue when you go to look for a particular package.  You can't just have a good naming scheme for your packages and hope to find them quickly (though that does help). Because the packages are not ordered in alphabetical order and MSVS 2005 does not give you a way to change the default order easily you are stuck searching through what is pretty much an un-ordered list.  Ok so if you have encountered this problem you know it’s real pain to deal with.  

The Answer:  Visual Studio uses metadata in an xml file to know which packages are in your project and what you want these packages to be called.  Note:  The file names and the package names in the project do not have to be the same but they are by default.  This file is the [ProjectName].dtproj file located in the root of the project folder for any given SSIS project. 

My Solution: Powershell to the rescue

#------------------------------------------
#
#SFIBICH
#5/16/08
#Version 1.0
#order-SSISproj.ps1
#This script saves a backup file as a .bak
#------------------------------------------
#$ars[0] - relative path to the file name
#$c = counter variable
#$x = xml object varaible
#------------------------------------------
if ($args[0].length -gt 0) {
 $fileLocation=$args[0]
 [xml]$x = get-content $fileLocation
 $x.save($fileLocation+'.bak')
 $y=$x.project.dtspackages.dtspackage | sort-object -property name
 $c=0
 $y| % {if ($c -eq 0) {$c+=1} else {$x.project.dtspackages.RemoveChild($_)}}
 $y| % {if ($c -eq 1) {$c+=1} else {$x.project.dtspackages.AppendChild($_)}}
 $x.save($fileLocation)
}else{
 write-warning('Please enter the location of a SSIS project .dtproj file (NO WORK EXECUTED)') 
}

 

Thats it I hope some of you find this useful!

trap [exception] { #insert code here} .... -ea stop

First off I'm not The PowerShellGuy (if you want tons of powershell information go there first)...I'm just getting started with PowerShell but I can already see how it can be extremely useful to a database developer and database administrator.  I plan on posting on PowerShell from time to time in this blog when I find something particularly useful to either of the previously mentioned rolls but for now I'm going to go over the TRAP object in PowerShell more to the point of what I have learned on how to use it.  The trap object is pretty straight forward and is similar to a onError event.  The main thing I want to explain in this post is that any cmdlet that you want to trap an error needs its ErrorAction or -ea property set to stop.  That’s it, that short, that sweat, but I had a hard time finding anything on the web or in three different PS books (Not mentioned to protect the guilty) on how to properly execute a trap.  Below I give a simple example:


$NoSuchDirectory="c:\noSuchDirectory"

#nothing happens as expected....script continues
function noTrap
{
 get-childItem $NoSuchDirectory=
}
noTrap


#nothing happens as expected(?)....script continues
function withTrap
{
 trap {
  #insert trap info here
  "Trapped Error! kind of"
 }
 get-childItem $NoSuchDirectory=
}
withTrap

#error is trapped....script stops then continues
function withTrap_2
{
 trap {
  #insert trap info here
  "Trapped Error! kind of"
  continue;
 }
 get-childItem $NoSuchDirectory= -ea stop

 #notice I've chanaged the ErrorAction on GCI to STOP from
 #the default continue, this is what allows the script to
 #trap the error.  If you change it to continue the trap
 #never fires
}
withTrap_2
"After the Trap"

 

 Now I just need to figure out how to get the SQL 2008 powershell snapins to port over to a machine that doesn't have SQL 2008 installed on it.

Posted by steve | 0 Comments
Filed under:

theSystem part X

Well one of my topics, "theSystem" listed in a previous post TwoProjects has moved over to SQL Server Central.  I may have some more background information on this project from time to time but the majority of the writing will take place at SQL Server Central.  If you missed the second article here is a link..."theSystem".

I'm still working on SOAK but its taken a backseat to the ramping up of a large project at my job, developing some SQL Server for iSereis programers training, and a speaking engagement. (Roanoke Code Camp)

Posted by steve | 0 Comments
Filed under:

Connecting to an iSeries for SSIS Users (this pertains to any .Net connection)

There are a number of useful pieces of information to be gleaned from reams and reams of documentation on both sides of the house when it comes to connect SSIS to an IBM iSeries.  I will try to present as much of it as I can in an easy to read fashion here.  I am going to start off with the basics- Terminology.

General iSeries Terminology:

¨  iSeries – name for the actual hardware; re-branded to System I

¨  i5/OS – Native operating system for AS/400 – previous called OS 400 operating system running on System I hardware

¨  i5-Processor for the System I; System I –i5

¨  DB2/400 – IBM’s DB database running on the OS 400 operating system.  This is different from DB2 UDB (IBM is not exactly clear on this sometimes)

¨  System I can run i5/OS(OS 400), AIX, UNIX(other tan AIX), LINUX, Windows (through the use of add on cards)

i5/OS (AS400) Terminology:

¨  DBFS – Database File System; Native AS/400 file system everything is an object in a database.

¨  IFS – Integrated File System; Windows compatible file system, folders, files.

¨  LPAR- Logical Partition- this is similar to a virtual server running

¨  iASP – integrated Asynchronous Storage Pools; This allows the AS/400 to run multiple DB instances similar to a named SQL Server instance.  Not very common on this platform.  Also known as a Catalog

¨  Library – This matches to a schema on MS SQL Server.  Libraries are heavily used on the AS/400 much more common that schema use on MS SQL Server

¨  Files – Tables

¨  Logical Files – Logics – MS SQL Server equivalent is an indexed view.

¨  Indexes – MS SQL Server equivalent to indexes

¨  Journals and Journal Receivers – This is how the 400 handles commitment control.  (Basically a log file can be set at the object level)

¨  QGPL –General Purpose Library-Lots of things get dumped here by default including SQL Packages

¨  SQL Packages-Not 100% sure but it is required to run a disturbed SQL Program and handles the file system access plan.

 

Well thats it for now, I'll give more specific examples of the IBM OLE and ODBC drivers as well as MS DB2 OLE drivers in future posts.

Posted by steve | 0 Comments

Data Dictionary Part II

Ok, after putting up my first post on the Data Dictionary I had posters remorse.  There is a simple change that makes the script shown in the previous post a little more user friendly.  First the view metadata.columnsDataDictionary need to be changed from a SQL Select using a pivot to just a self join from the metadata.columnsExtendedProperties to allow for columns that shared the same domain to easily display different MS_Description values.  The second change was to allow the procedure metadata.columnsExtendedPropertiesRepopulate to allow for an overall overwrite or just to add new values and retain old MS_Description extended propertie values.  I don't think I mentioned this earlier the extended property of 'MS_Description' will be picked up by a number of development applications so it is useful to stick your data dictionary value in there but not to overwrite one if it came supplied with say a third part application installation or if you are allowing a third party applicaiton manage your descriptioins in the database.

Well thats it here is the code with a simple example:

USE [steves]

GO

IF EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedProperties]'))

DROP VIEW [metadata].[columnsExtendedProperties]

GO

IF EXISTS (SELECT * FROM sys.views

WHERE object_id = OBJECT_ID(N'[metadata].[columnsDataDictionary]'))

DROP VIEW [metadata].[columnsDataDictionary]

GO

IF EXISTS (SELECT * FROM sys.objects

WHERE object_id = OBJECT_ID(N'[metadata].[dataDictionary]') AND type in (N'U'))

DROP TABLE [metadata].[dataDictionary]

GO

IF EXISTS (SELECT * FROM sys.objects

WHERE object_id = OBJECT_ID(N'[dbo].[testExtendedProperties]') AND type in (N'U'))

DROP TABLE [dbo].[testExtendedProperties]

GO

IF EXISTS (SELECT * FROM sys.objects

WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedPropertiesRepopulate]') AND type in (N'P', N'PC'))

DROP PROCEDURE [metadata].[columnsExtendedPropertiesRepopulate]

GO

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

SET ANSI_PADDING ON

GO

CREATE TABLE [metadata].[dataDictionary](

[domainName] [varchar](50) NOT NULL,

[domainDesc] [varchar](1000) NOT NULL,

PRIMARY KEY CLUSTERED

(

[domainName] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF

, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]

GO

SET ANSI_PADDING OFF

go

exec sp_addextendedproperty 'domain', 'domainName'

, 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainName'

exec sp_addextendedproperty 'domain', 'domainDesc'

, 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainDesc'

insert into metadata.dataDictionary

(domainName,domainDesc)

values('domainName','Short text name for the classification of a businessdata type or attribute')

insert into metadata.dataDictionary

(domainName,domainDesc)

values('domainDesc','Description of the a business data type or attribute')

GO

create view [metadata].[columnsExtendedProperties]

as

----------------------------------------------

--Steve Fibich

--1.11.2007

--This view is used to match columns to their extended

--properties. This is an extension of sys.extended_properties

--system view.

--

----------------------------------------------

select

top 100 percent

s.name as schemaName

,t.name as TableName

,c.name as columnName

,e.name as EName

,e.value as EDesc

,c.column_id as columnID

from

sys.extended_properties e

inner join

sys.tables t

on

t.object_id=e.major_id

inner join

sys.columns c

on

e.minor_id=c.column_id

and

e.major_id=c.object_id

inner join

sys.schemas s

on

t.schema_id=s.schema_id

where

class=1

order by

s.name

,t.name

,c.column_id

,e.name

GO

create view [metadata].[columnsDataDictionary]

as

----------------------------------------------

--Steve Fibich

--1.11.2007

--This view is used to match columns to their extended

--properties. It does this for particular extended properties

--"domain" and "MS_Description. It then pivots these results for a

--more useful layout.

--

----------------------------------------------

select

top 100 percent

cep1.schemaName

,cep1.tableName

,cep1.columnName

,cep1.eDesc as domain

,cep2.eDesc as MS_Description

from

(

select

schemaName

,tableName

,columnName

,eDesc

,eName

,columnId

from [metadata].[columnsExtendedProperties]

where

EName='domain'

) cep1

full outer join

(

select

schemaName

,tableName

,columnName

,eDesc

,eName

,columnId

from

[metadata].[columnsExtendedProperties]

where

EName='MS_Description'

) cep2

on

cep1.schemaName=cep2.schemaName

and

cep1.tableName=cep2.tableName

and

cep1.columnID=cep2.columnID

order by

cep1.schemaName

,cep1.tableName

,cep1.columnID

GO

go

create proc metadata.columnsExtendedPropertiesRepopulate

----------------------------------------------

--Steve Fibich

--1.11.2007

--This proc removes any 'MS_Description' extended properties

--from any columns of any tables that have matching entires in the

--metadata.datadictionary table and that have an extended propertie of 'domain'

--This is to try to make keeping 'MS_Description' descripions upto date eaiser

--

--@overwrite=0 turns off overwriting so existing MS_Descriptions will be left alone

--@overwrite=1 existing MS_Description values will be deleted and re-created

--@debug =0 sets additional logging off

--@debug=1 sets additional looging on

--

----------------------------------------------

@overwrite bit=0

,@debug bit=0

as

--if @debug=0

-- BEGIN

-- select @debug=debug from metadata.logging where schemaName='metadata'

-- END

declare @schemaName sysname

declare @tableName sysname

declare @columnName sysname

declare @domainDesc varchar(1000)

declare extProp_cursor cursor for

select

e.schemaName

,e.tableName

,e.columnName

,dd.domainDesc

from

metadata.dataDictionary dd

inner join

[metadata].[columnsExtendedProperties] e

on

dd.domainName=edesc

where

ename='domain'

open extProp_cursor

Fetch next from extProp_cursor into

@schemaName,@tableName,@columnName,@domainDesc

while @@fetch_status=0

BEGIN

if @debug=1

BEGIN

select @schemaName,@tableName,@columnName,@domainDesc

END

if (exists (select * from ::fn_listextendedproperty('MS_Description', 'schema'

, @schemaName, 'table', @tableName, 'column', @columnName)) and (@overwrite=0))

BEGIN

execute sp_dropextendedproperty 'MS_Description','schema'

,@schemaName,'table',@tableName,'column',@columnName

END

if not exists (select * from ::fn_listextendedproperty('MS_Description', 'schema'

, @schemaName, 'table', @tableName, 'column', @columnName))

BEGIN

execute sp_addextendedproperty 'MS_Description',@domainDesc,'schema'

,@schemaName,'table',@tableName,'column',@columnName

END

Fetch next from extProp_cursor into

@schemaName,@tableName,@columnName,@domainDesc

END

close extProp_cursor

deallocate extProp_cursor

-----------------------------------------

--Example Table and Extended Properties--

-----------------------------------------

go

create table dbo.testExtendedProperties

(

column1 int

,column2 int

,column3 int

,column4 int

,column5 int

,column6 int

)

exec sp_addextendedproperty 'domain', 'domainName'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column1'

exec sp_addextendedproperty 'domain', 'domainDesc'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column2'

exec sp_addextendedproperty 'domain', '3rdDomainDesc'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column3'

exec sp_addextendedproperty 'domain', '4thDomainDesc'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column4'

exec sp_addextendedproperty 'domain', '3rdDomainDesc'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column5'

exec sp_addextendedproperty 'domain', '4thDomainDesc'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column6'

insert into metadata.dataDictionary

(domainName,domainDesc)

values('3rdDomainDesc','This is the 3rd domain description that we have it should be on column3 and column5')

insert into metadata.dataDictionary

(domainName,domainDesc)

values('4thDomainDesc','This is the 4th domain description that we have it should be on column4 and column6')

execute metadata.columnsExtendedPropertiesRepopulate

select * from [metadata].[columnsDataDictionary]

select * from [metadata].[columnsExtendedProperties]

 

Data Dictionary part I

Having a data dictionary is an invaluable resource for not only for the DBA but for anyone, report writers or developers who need to work on a database that you have to manage or have developed yourself.  A good amount of time can be saved from answering the same questions over and over again if you can point to documentation to answer any initial questions.   Microsoft provides a way to keep a this information in each database without having to change the structure of any of the objects that already exist.  Extended Properties are a valuable resource in keeping metadata close to the heart of the system where it is needed.  In this series I will show you my data dictionary frame work.  I do create one table data dictionary that contains information that will work in conjunction with MS extended properties system functions, procedures, and views.  I have a concept of "data domains" which is a way to relate different columns in the same table or different tables that contain the same logical information that may not support a foreign key relationship.  An example of this is state in a data warehouse or master data management system, some tables may use the USPS two letter code, other tables from a different system may use an ISO 3 digit numeric code, but in each of these cases they represent the same information.  Connecting these two different data elements in the database can be incredibly useful to a developer especially if they are new to your environment.  To facilitate this I create an extended property every column in the database that is "domain" and populate it with the name of the data domain, in the example above the extended properties "domain" would get a value of "state" in each of the two different tables.  I then populate my dataDictionary table with the domain "state" and a description for this domain.  I have simple stored procedure that will read my dataDictionary table and place all of the domain descriptions in a extended propriety of "MS_Description" for all columns that have a matching value in the "domain" extended property.  I then have two simple views that extended the sys.extended_properites system view (SQL 2005 object).  Below I have listed the code that will generate the aforementioned objects and a quick sample to show what it can do.   (This code requires a metadata schema to exist)

IF EXISTS (SELECT * FROM sys.views

WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedProperties]'))

DROP VIEW [metadata].[columnsExtendedProperties]

GO

IF EXISTS (SELECT * FROM sys.views

WHERE object_id = OBJECT_ID(N'[metadata].[columnsDataDictionary]'))

DROP VIEW [metadata].[columnsDataDictionary]

GO

IF EXISTS (SELECT * FROM sys.objects

WHERE object_id = OBJECT_ID(N'[metadata].[dataDictionary]') AND type in (N'U'))

DROP TABLE [metadata].[dataDictionary]

GO

IF EXISTS (SELECT * FROM sys.objects

WHERE object_id = OBJECT_ID(N'[dbo].[testExtendedProperties]') AND type in (N'U'))

DROP TABLE [dbo].[testExtendedProperties]

GO

IF EXISTS (SELECT * FROM sys.objects

WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedPropertiesRepopulate]') AND type in (N'P', N'PC'))

DROP PROCEDURE [metadata].[columnsExtendedPropertiesRepopulate]

GO

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

SET ANSI_PADDING ON

GO

CREATE TABLE [metadata].[dataDictionary](

[domainName] [varchar](50) NOT NULL,

[domainDesc] [varchar](1000) NOT NULL,

PRIMARY KEY CLUSTERED

(

[domainName] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF

, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]

GO

SET ANSI_PADDING OFF

go

exec sp_addextendedproperty 'domain', 'domainName'

, 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainName'

exec sp_addextendedproperty 'domain', 'domainDesc'

, 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainDesc'

insert into metadata.dataDictionary

(domainName,domainDesc)

values('domainName','Short text name for the classification of a businessdata type or attribute')

insert into metadata.dataDictionary

(domainName,domainDesc)

values('domainDesc','Description of the a business data type or attribute')

GO

create view [metadata].[columnsExtendedProperties]

as

----------------------------------------------

--Steve Fibich

--1.11.2007

--This view is used to match columns to their extended

--properties. This is an extension of sys.extended_properties

--system view.

--

----------------------------------------------

select

top 100 percent

s.name as schemaName

,t.name as TableName

,c.name as columnName

,e.name as EName

,e.value as EDesc

,c.column_id as columnID

from

sys.extended_properties e

inner join

sys.tables t

on

t.object_id=e.major_id

inner join

sys.columns c

on

e.minor_id=c.column_id

and

e.major_id=c.object_id

inner join

sys.schemas s

on

t.schema_id=s.schema_id

where

class=1

order by

s.name

,t.name

,c.column_id

,e.name

GO

create view [metadata].[columnsDataDictionary]

as

----------------------------------------------

--Steve Fibich

--1.11.2007

--This view is used to match columns to their extended

--properties. It does this for particular extended properties

--"domain" and "MS_Description. It then pivots these results for a

--more useful layout.

--

----------------------------------------------

select

top 100 percent

schemaName

,tableName

,columnName

,[domain]

,[MS_Description]

from

(

select

schemaName

,tableName

,columnName

,columnID

,EName

,EDesc

from

[metadata].[columnsExtendedProperties]

)p

PIVOT

(

min(EDesc)

for EName IN

([domain],[MS_Description])

)as pvt

order by

tableName

,columnID

GO

go

create proc metadata.columnsExtendedPropertiesRepopulate

----------------------------------------------

--Steve Fibich

--1.11.2007

--This proc removes any 'MS_Description' extended properties

--from any columns of any tables that have matching entires in the

--metadata.datadictionary table and that have an extended propertie of 'domain'

--This is to try to make keeping 'MS_Description' descripions upto date eaiser

--

----------------------------------------------

@debug bit=0

as

--if @debug=0

-- BEGIN

-- select @debug=debug from metadata.logging where schemaName='metadata'

-- END

declare @schemaName sysname

declare @tableName sysname

declare @columnName sysname

declare @domainDesc varchar(1000)

declare extProp_cursor cursor for

select

e.schemaName

,e.tableName

,e.columnName

,dd.domainDesc

from

metadata.dataDictionary dd

inner join

[metadata].[columnsExtendedProperties] e

on

dd.domainName=edesc

where

ename='domain'

open extProp_cursor

Fetch next from extProp_cursor into

@schemaName,@tableName,@columnName,@domainDesc

while @@fetch_status=0

BEGIN

if @debug=1

BEGIN

select @schemaName,@tableName,@columnName,@domainDesc

END

if exists (select * from ::fn_listextendedproperty('MS_Description'

, 'schema', @schemaName, 'table', @tableName, 'column', @columnName))

BEGIN

execute sp_dropextendedproperty 'MS_Description','schema'

,@schemaName,'table',@tableName,'column',@columnName

END

execute sp_addextendedproperty 'MS_Description',@domainDesc,'schema'

,@schemaName,'table',@tableName,'column',@columnName

Fetch next from extProp_cursor into

@schemaName,@tableName,@columnName,@domainDesc

END

close extProp_cursor

deallocate extProp_cursor

-----------------------------------------

--Example Table and Extended Properties--

-----------------------------------------

go

create table dbo.testExtendedProperties

(

column1 int

,column2 int

,column3 int

,column4 int

,column5 int

,column6 int

)

exec sp_addextendedproperty 'domain', 'domainName'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column1'

exec sp_addextendedproperty 'domain', 'domainDesc'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column2'

exec sp_addextendedproperty 'domain', '3rdDomainDesc'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column3'

exec sp_addextendedproperty 'domain', '4thDomainDesc'

, 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column4'

exec sp_addextendedproperty 'domain', '3rdDomainDesc'

, 'schema', 'dbo', 'table',