SSIS Design Pattern - Collect Enterprise SQL Server Database Metadata With SSIS
Introduction
A few months back I received an email from a nice lady who was beginning to work with SSIS. She was trying to use SSIS to automate metadata collection in her enterprise and she had made a good start in the package she attached to the email. She had a few questions so I opened the package and began poking around some to answer them. I realized it would take me a lot longer to type out answers to her questions than to simply build the SSIS package for her (or at least make a good start). So I did. This post is about that package.
I've built similar stuff in the past to monitor the schemas of data warehouse sources. "Why would you do such a thing, Andy?" I know this will come as a shock to you: Sometimes developers and DBAs make changes to the schemas of data warehouse sources and don't tell the data warehouse people. Ok, I made that part up - that never happens. But think about how you could address the issue if it ever happened. Hypothetically.
First Things First
There are a couple approaches to collecting SQL Server database metadata. My preference is to automate the process to the point that I can drop it almost anywhere and get results with a minimal amount of tinkering and tweaking. For me, this means my development process is iterative because I never get things right the first time. Ever.
This Version
The current version of this package is striped 0.4. The version I write about here is version 0.1. It serves to introduce the principles and functionality in a less complex way - allowing you the Reader to take the project in any direction you deem cool. But that means there's stuff missing from this SSIS package that I would normally include. Stuff like error handling and logging, for example.
A Place For My Data
Before I started I decided to store the data in a SQL Server database. You don't need the database for the SSIS part of the project described in this post. You can simply skip the tasks that write the metadata to the database. But if you skip them and I decide to write about version 0.2, you will come back and want the database scripts. You can download them here (free registration required).
In this post, I am not going to focus on the database I designed. Like the SSIS package, this is version 0.1 of the database. There is no attempt at referential integrity in this version. It is a simple, single-pass effort at designing a landing zone for this data.
A. The Servers
Servers? Raise Your Hands Please
Create a new SSIS project and name it EnterpriseDBMetadata. Name the package EnterpriseDBMetadata.dtsx.
Create the following package-scoped variables (in alphabetical order):
- ColumnDataType (String)
- ColumnName (String)
- ColumnNames (Object)
- ColumnObjectID (Int32)
- DatabaseId (Int32)
- DatabaseName (String)
- DatabaseNames (Object)
- DatabaseObjectId (Int32)
- saSQLServers (Object)
- SchemaId (Int32)
- SchemaName (String)
- SchemaNames (Object)
- SchemaObjectId (Int32)
- SQLServerId (Int32)
- SQLServerInstanceName (String)
- TablesAndViewsId (Int32)
- TablesAndViewsName (String)
- TablesAndViewsNames (Object)
- TablesAndViewsObjectId (Int32)
- TablesAndViewsType (String)
Add a Script Task to the Control Flow and rename it Enumerate SQL Server Instances. Editing the Script Task, add saSQLServers to the ReadWriteVariables property:
Click the Design Script button to open the Visual Studio for Applications (VSA) script designer and, in Project Explorer, right-click References to add references to the Microsoft.SqlServer.Smo, Microsoft.SqlServer.SmoEnum, and Microsoft.SqlServer.SqlEnum assemblies:
Next add the following VB.Net code to the script designer:
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports Microsoft.SqlServer.Management.Smo
Imports Microsoft.SqlServer.Management.Common
Public Class ScriptMain
Public Sub Main()
Dim oSmo As New SmoApplication
Dim dtSQLServers As DataTable = oSmo.EnumAvailableSqlServers()
Dim iSQLServerCount As Integer = dtSQLServers.Rows.Count - 1
Dim i As Integer = 0
Dim saSQLServers() As String
Dim sSQLServerInstanceName As String
ReDim saSQLServers(iSQLServerCount)
For Each row As DataRow In dtSQLServers.Rows
sSQLServerInstanceName = row("Name").ToString
'MsgBox(sSQLServerInstanceName)
saSQLServers(i) = sSQLServerInstanceName
i += 1
Next
Dts.Variables("saSQLServers").Value = saSQLServers
Dts.TaskResult = Dts.Results.Success
End Sub
End Class
This is a handy piece of code. It basically creates a list of all the SQL Server instances it can locate on your network and shoves that list into an SSIS variable called saSQLServers.
Shredding The List
Steve Jones - my friend, fellow SQL Server MVP, and editor of SQL Server Central - was kind enough to publish a recent article I wrote about using the Foreach Loop Container to shred object variables. You can read more about that here. Shredding is just a fancy word (we call them "$3 words" here in Farmville) for reading individual items in a collection.
Next drop a Foreach Loop Container onto the Control Flow and connect the Script Task to it with a Success Precedence Constraint:
Double-click the Foreach Loop Container to open the editor. On the General page rename it ForEach SQL Server Instance. Select the Foreach From Variable Enumerator on the Collection page, and then select the saSQLServers variable:
On the Variable Mappings page, map the SQLServerInstanceName variable to Index 0. There's only one column of items in my saSQLServers variable - it's an SSIS object variable, but it contains a single-dimension String array:
Click the Ok button to close the Foreach Loop Container editor.
Let's take a moment to talk about what we did. We have a variable named saSQLServers that is populated by the Script Task with a list of all the SQL Server instances it can identify on the network. The Foreach Loop Container shreds this variable, walking through each value - each SQL Server instance name, in this case - one at a time. Each pass, it pushes the value of one of those SQL Server instance names into a variable called SQLServerInstanceName.
In and of itself, this is pretty cool.
The Destination Connection
Now we need to connect our SSIS package to SQL Server so we can store the name of the servers we can reach. Right-click inside the Connection Managers window at the bottom of the Control Flow and select New OLE DB Connection:
When the Configure OLE DB Connection Manager form displays, click the New button to display the Connection Manager editor. In the Server Name dropdown, enter the name of the SQL Server instance to which you deployed the EnterpriseDBMetadata scripts. In the "Select or enter a database name" dropdown, select EnterpriseDBMetadata:
I always recommend testing the connection. Click the Test Connection button to verify connectivity, then click the Ok button to close the Connection Manager editor. Then click the Ok button again to close the Configure OLE DB Connection Manager.
Rename this connection "EnterpriseDBMetadata.OLEDB".
The Source Connection
I can hear you thinking "Gosh Andy, you did that backwards! Most people start at the source and then talk about the destination." Yeah, I know. But I'm an engineer at heart and engineers are not normal people.
How many source Connection Managers do we need?
"That's a trick question Andy, I know it! But I think we need one for each SQL Server instance out there."
You are partially correct. It is a trick question - you're right about that. We do need one source Connection Manager for each SQL Server instance, but we are only going to be connected to one SQL Server instance at a time as we iterate through our Foreach Loop Container. So we can get away with one source Connection Manager if we can only find a way to aim it at the individual SQL Server instance we wish, when we wish it.
So let's do that.
Create another OLE DB Connection Manager as you did before. This time configure it to connect to a local or developement instance to which you can connect using Windows Authentication. Configure the database name as before and click the Ok buttons to end the Connection Manager creation.
Rename this Connection Manager "SQLServerInstance.OLEDB".
Right-click the SQLServerInstance.OLEDB Connection Manager and select Properties. Click on the Expressions Property and then click the ellipsis in the Value textbox for the Expressions Property:
When the Property Expressions form displays, select the ServerName property from the Property dropdown:
Click on the ellipsis on the Expression textbox to display the Expression Builder form:
Expand the Variables list and drag the SQLServerInstanceName variable into the Expression textbox as shown above. I recommend always checking the value of an expression. To do so, click the Evaluate Expression button. Click the Ok button to close the Expression Builder, then click the Ok button again to close the Property Expressions editor.
You've just done something else that's pretty cool, so let's reflect: We have a variable named saSQLServers that is populated by the Script Task with a list of all the SQL Server instances it can identify on the network. The Foreach Loop Container shreds this variable, walking through each value - each SQL Server instance name, in this case - one at a time. Each pass, it pushes the value of one of those SQL Server instance names into a variable called SQLServerInstanceName. And each time it changes the value of the SQLServerInstanceName variable, the SQLServerInstance.OLEDB Connection Manager's connection properties are updated to now point to that instance of SQL Server.
How cool is that?
Show Me!
I love lamp. And I like Script Tasks. I use Script Tasks a lot (lamps too!). One cool use of them is to show me the value of a variable that's being changed by a Foreach Loop Container. Since we have a Foreach Loop Container that's changing a variable, allow me to demonstrate.
Drag a Script Task into the Foreach Loop Container and rename it "Show SQL Server Instance Name". Open the editor and add SQLServerInstanceName to the ReadOnlyVariables property:
Add these two lines of code to Sub Main():
Dim sSQLServerInstanceName As String = Dts.Variables("SQLServerInstanceName").Value.ToString
MsgBox("SQL Server Instance Name: " & sSQLServerInstanceName)
All this script does is read the value currently stored in SQLServerInstanceName and display it in a message box:
Close the VSA editor and click the Ok button to close the Script Task editor.
Test Run Time!
You can run the package now to see which servers it finds. I did. That's how I generated the last image.
Store It
Next, drag an Execute SQL Task into the ForEach SQL Server Instance Container and connect the Script Task to it using a Success precedence constraint. Name the Execute SQL Task "Add the SQL Server name" and double-click it to open the editor.
Select the EnterpriseDBMetadata.OLEDB connection in the Connection property dropdown. Click the SQLStatement property and then the ellipsis in the Value. When the Enter SQL Query form displays, paste this T-SQL into the textbox:
If Not Exists(Select SQLServerInstanceName
from dbo.SQLServerInstances
where SQLServerInstanceName = ?)
begin
Insert Into dbo.SQLServerInstances
(SQLServerInstanceName)
Output inserted.Id
Values(?)
end
Else
begin
Select Id
from dbo.SQLServerInstances
where SQLServerInstanceName = ?
end
See the question marks? This is a parameterized query. The question marks are placeholders for parameters. The collection of parameters comprises an array, and the array is zero-based. This means the first question mark - that one in the first If statement - maps to parameter 0. The second maps to parameter 1, the third to parameter 2. There's a pattern emerging here... You get the picture.
To map these parameters in the Execute SQL Task, click on the Parameter Mapping page:
Map parameters by clicking the Add button to add a parameter mapping. Since we have three question marks in our T-SQL, you need three parameters. The first parameter maps to the SQLServerInstanceName variable, so I select that variable from the Variable Name dropdown. I am supplying the variable to the query, so it's an Input (Direction). It's a String data type and Strings map to the VARCHAR Data Type in the OLEDB provider I'm using in the ConnectionType property. I supply the ordinal of the parameter to which I am mapping for Parameter Name. In other words, this is the first question mark, so the Parameter Name is 0. I leave the Parameter Size set to the default (-1).
Now it just so happens that all three of our question marks map to the same SSIS variable value: SQLServerInstanceName. That isn't always the case. But since it is here, I configure the remaining two parameters identically to the first, the only exception is the Parameter Name --> ordinal values.
The T-SQL in this Execute SQL Task returns a value. If the dbo.SQLServerInstances table does not contain a value for the current SQLServerInstanceName, the current SQLServerInstanceName is added. Did you catch the Output clause in the Insert statement? Output inserted.Id returns the value of the newly inserted Id, which is an Identity(1,1) column in the dbo.SQLServerInstances table. The Output clause rocks. It's new in T-SQL 2005 and allows us to write a single statement to perform the insert and return the inserted row's identity value. Note that if the SQLServerInstanceName exists in the dbo.SQLServerInstances table, the Id for that row is returned.
Either way, you get an Id value back from this statement. What do you do with it? Return to the General page of the Execute SQL Task editor and change the ResultSet Property value from "None" to "Single row". This tells the Execute SQL Task to expect a row back from the T-SQL statement.
We want to capture this Id value for later (patience! You'll see...). Now that we've told the Execute SQL Task to expect a single row from the statement's execution, click on the Result Set page to configure it.
Similar to the parameters collection, single-row resultsets are a zero-based array and use ordinals to map various columns in the row to SSIS variables. Click the Add button to add a new ResultSet. Change the Result Name to 0 and select the SQLServerId variable:
Let's review: We configured the Execute SQL Task to find or insert a SQLServerInstanceName into the dbo.SQLServerInstances table and return the Id of the row to us. We used a parameterized query for this. Why? Because it's inside a Foreach Loop Container and the SQLServerInstanceName value is going to change with each iteration (pass) through the loop. We want to add them all to the dbo.SQLServerInstances table, one at a time. Make sense? Good. We also want to store the Id value for some reason Andy won't tell us yet. Bad Andy.
The General page of the configured Execute SQL Task looks like this:
Click the Ok button to close the Execute SQL Task editor.
A Pattern Emerges...
What we've built so far is actually the template for everything else in this package. The package is a lot more complex than this, but it will help you immensely if you realize that this piece of functionality is merely repeated with minor tweaking to build the remainder of the package.
Now. You can copy and paste a lot here if you want to. And I don't mind if you do so long as you promise to build really good tests. You should build really good tests anyway but testing SSIS's a topic for another post. My point is simple: You can introduce a host of errors - some of them very interesting (which is another way of saying "hard to find") - cutting and pasting. So be careful.
The pattern looks like this:
That's it really. Looks simple when you look at it this way. That's because it is simple. It just takes work to implement this in SSIS. Or in any other language or platform. Surprise! We call this "software development".
B. The Databases
Next drag another Execute SQL Task into the ForEach SQL Server Instance Loop Container. Connect the "Add the SQL Server name" Execute SQL Task to the new one using a Success precedence constraint and rename the new Execute SQL Task "Get DatabaseNames":
Double-click the new Execute SQL Task to open the editor. Set the Connection property to the SQLServerInstance.OLEDB Connection Manager. Set the SQLStatement property to the following T-SQL statement:
select name, Database_Id
from master.sys.databases
where name not in ('master','msdb','model','tempdb')
This statement queries the master.sys.databases table for a list of databases and their Database_Id values, explicitly excluding system databases by name. Now there are other ways to explicitly exclude system databases and you are welcome to add them to the comments for this post. I love learning new stuff - teach me something! (In Farmville, we'd say "Learn me something". We talk funny. I know.)
We do not use any parameters in this statement. It is dynamically aimed at the current instance of SQL Server by the dynamic expression we defined for the ServerName property or the SQLServerInstance.OLEDB Connection Manager. So this query will return a list of database on each SQL Server Instance identified by the "Enumerate SQL Server Instances" Script Task, as saSQLServers is shredded by the "ForEach SQL Server Instance" Loop Container, pushing new values into the SQLServerInstanceName variable, re-aiming the SQLServerInstance.OLEDB Connection Manager. Got it? Good.
What to do with this list of databases... I know - let's push them into a variable.
Unlike last time, we're returning a couple columns and several rows. The Single Row ResultSet simply will not do - we need a "Full result set" so set the ResultSet property of the Execute SQL Task to this value. On the Reuslt Set page, click the Add button and name the new Result 0. Select DatabaseNames from the Variable Name column:
So now, the list of database names and IDs are going to be pushed into the DatabaseNames object variable.
Object variables are cool. You can use them for anything. They are like Variants in old VB. They hold scalars or collections, or collections of collections. In this case we're pushing a full ADO.Net Dataset into the DatabaseNames object variable.
How cool is that?
Click the Ok button to close the Execute SQL Task editor.
A-Shreddin' We Will Go
Next drag a Foreach Loop Container into the "ForEach SQL Server Instance" Loop Container. Connect the Get DatabaseNames Execute SQL Task to the new Foreach Loop Container using a Success precedence constraint and rename the new Foreach Loop Container "Foreach Database".
Double-click the Foreach Database Loop Container and navigate to the Collection page. Select the Foreach ADO Enumerator and the DatabaseNames variable in the "ADO object source variable" dropdown. ADO.Net datasets can hold multpile tables but ours only has one table, so accept the default Enumeration mode of "Rows in the first table":
Click the Variable Mappings page and add a couple variable mappings.
Remember: the DatabaseNames object variable contains an ADO.Net dataset that has one DataTable. That table is populated with the results of the T-SQL query in the "Get DatabaseNames" Execute SQL Task. The T-SQL shaped the DataTable contained inside the DatabaseNames variable. That query had two columns in the Select clause. We map the data in the DatabaseNames variable to scalar SSIS variables using the ordinal of the columns in that T-SQL Select clause. Since it read select name, Database_Id, we map name to the DatabaseName SSIS variable using ordinal 0 and Database_Id to the DatabaseObjectId SSIS variable using ordinal 1. Make sense? Good.
Click the Ok button to close the Foreach Loop Container editor.
Let's look at what we have here. Once we've established a connection to a particular instance of SQL Server, we query the master database for a list of database names and Ids. We then shred that list in a nested Foreach Loop Container.
Before we move on, we need to add some more dynamic functionality to our SQLServerInstance.OLEDB Connection Manager. So right click the SQLServerInstance.OLEDB Connection Manager and select Properties. Click the Expressions property and then click the ellipsis in the Value textbox. When the Property Expressions Editor displays select InitialCatalog in the Property dropdown. Click the ellipsis in the Expression textbox:
In the Expression Builder, add the DatabaseName variable to the Expression:
This makes the SQLServerInstance.OLEDB Connection Manager even more dynamic. It now is aimed at individual databases as the Foreach Database Loope Container iterates.
Show Me The Databases!
Drag a Script Task into the Foreach Database Loop Container. Rename it "Show Database Name" and double-click to open the editor. On the Script page add the SQLServerInstanceName, DatabaseName, DatabaseObjectId variables to the ReadOnlyVariables property:
Click the Design Script button and add the following code to Sub Main():
Dim sSQLServerInstanceName As String = Dts.Variables("SQLServerInstanceName").Value.ToString
Dim sDatabaseName As String = Dts.Variables("DatabaseName").Value.ToString
Dim iDatabaseObjectId As Integer = Convert.ToInt32(Dts.Variables("DatabaseObjectId").Value)
Dim sMsg As String = "SQL Server Instance Name: " & sSQLServerInstanceName & vbCrLf & _
"Database Name: " & sDatabaseName & vbCrLf & _
"Database Id: " & iDatabaseObjectId.ToString
MsgBox(sMsg)
Close the VSA editor and click the Ok button to close the Script Task editor.
This script builds a string containing the values of the SQLServerInstanceName, DatabaseName, and DatabaseObjectId variables; then displays them in a message box:
Store The Database Metadata
Drag an Execute SQL Task into the Foreach Database Loop Container. Connect the Show Database Name Script Task to it using a Success precedence constraint and rename the Execute SQL Task "Add the Database Name". Double-click it to open the editor. Set the following properties:
- ResultSet: Single row
- Connection: EnterpriseDBMetadata.OLEDB
- SQLStatement:
If Not Exists(Select DatabaseName
from dbo.Databases
where DatabaseName = ?
and SQLServerInstancesTableId = ?)
begin
Insert Into dbo.Databases
(DatabaseObjectId
,SQLServerInstancesTableId
,DatabaseName)
output inserted.Id
Values(?, ?, ?)
end
Else
begin
Select Id
from dbo.Databases
where DatabaseName = ?
and SQLServerInstancesTableId = ?
end
There are a few more parameters this time. Map them as shown in the following list and image:
-
DatabaseName, Input, VARCHAR, 0
-
SQLServerId, Input, LONG, 1
-
DatabaseObjectId, Input, LONG, 2
-
SQLServerId, Input, LONG, 3
-
DatabaseName, Input, VARCHAR, 4
-
DatabaseName, Input, VARCHAR, 5
-
SQLServerId, Input, LONG, 6
Click the Result Set page and add a new resultset. Set the Result Name to 0 and assign the result to the variable DatabaseId:
Click the Ok button to close the Execute SQL Task editor.
Is it starting to come together now? Can you see where we're headed? Do you smell what Andy's cooking?
C. The Schemas
Add another Execute SQL Task to the Foreach Database Loop Container and rename it "Get SchemaNames". Add a Success precedence constraint from the "Add the Database Name" Execute SQL Task to the Get SchemaNames Execute SQL Task, and double-click Get SchemaNames to open the editor. Set the following properties:
- ResultSet: Full result set
- Connection: SQLServerInstance.OLEDB
- SQLStatement:
select name, Schema_Id
from sys.schemas
where schema_id < 16384
and name not in ('guest', 'INFORMATION_SCHEMA', 'sys')
This statement returns dbo and user schemas and their respective Ids.
Configure the Result Set page with the following resultset properties:
- Result Name: 0
- Variable Name: SchemaNames
Click the Ok button to close the Execute SQL Task editor.
Shred The Schemas
Drag a new Foreach Loop Container into the Foreach Database Loop Container and rename it "Foreach Schema". Connect the "Get SchemaNames" Execute SQL Task to it using a Success precedence constraint. Double-click Foreach Schema to open the editor and set the following properties:
- Collection page, Enumerator: Foreach ADO Enumerator
- Collection page, Ado object source variable: SchemaNames
- Collection page, Enumeration mode: Rows in first table
- Variable Mappings page
-
- Variable: SchemaName; Index: 0
- Variable: SchemaObjectId; Index: 1
Click the Ok button to close the Foreach Loop Container editor.
Show Me The Schemas!
Drag a Script Task into the Foreach Schema Loop Container and rename it "Show Schema Name". Double-click to open the editor and add the following SSIS variables to the ReadOnlyVariables property: SQLServerInstanceName, DatabaseName, DatabaseObjectId, SchemaName, SchemaObjectId. Click the Design Script button to open the VSA editor and add the following code to Sub Main():
Dim sSQLServerInstanceName As String=Dts.Variables("SQLServerInstanceName").Value.ToString
Dim sDatabaseName As String = Dts.Variables("DatabaseName").Value.ToString
Dim iDatabaseObjectId As Integer = Convert.ToInt32(Dts.Variables("DatabaseObjectId").Value)
Dim sSchemaName As String = Dts.Variables("SchemaName").Value.ToString
Dim iSchemaObjectId As Integer = Convert.ToInt32(Dts.Variables("SchemaObjectId").Value)
Dim sMsg As String = "SQL Server Instance Name: " & sSQLServerInstanceName & vbCrLf & _
"Database Name: " & sDatabaseName & vbCrLf & _
"Database Id: " & iDatabaseObjectId.ToString & vbCrLf & _
"Schema Name: " & sSchemaName & vbCrLf & _
"Schema Id: " & iSchemaObjectId.ToString
MsgBox(sMsg)
Close the VSA editor and click the Ok button to close the Script Task editor.
Store Schema Metadata
Drag an Execute SQL Task into the Foreach Schema Loop Container and connect the "Show Schema Name" Script Task to it using a Success precedence constraint. Rename the Execute SQL Task "Add the Schema Name" and double-click it to open the editor. Set the following properties:
- ResultSet: Single row
- Connection: EnterpriseDBMetadata.OLEDB
- SQLStatement:
If Not Exists(Select SchemaName
from dbo.Schemas
where SchemaName = ?
and DatabasesTableId = ?)
begin
Insert Into dbo.Schemas
(SchemaObjectId
,DatabasesTableId
,SchemaName)
output inserted.Id
Values(?, ?, ?)
end
Else
begin
Select Id
from dbo.Schemas
where SchemaName = ?
and DatabasesTableId = ?
end
Map the parameters as shown in the following list and image:
-
SchemaName, Input, VARCHAR, 0
-
DatabaseId, Input, LONG, 1
-
SchemaObjectId, Input, LONG, 2
-
DatabaseId, Input, LONG, 3
-
SchemaName, Input, VARCHAR, 4
-
SchemaName, Input, VARCHAR, 5
-
DatabaseId, Input, LONG, 6
Click the Result Set page and add a new resultset. Set the Result Name to 0 and assign the result to the variable SchemaId:
Click the Ok button to close the Execute SQL Task editor.
D. The Tables and Views
Add another Execute SQL Task to the Foreach Schema Loop Container and rename it "Get Table and View Names". Add a
Success precedence constraint from the "Add the Schema Name" Execute SQL Task to the "Get Table and View Names" Execute SQL Task, and double-click Get SchemaNames to open the editor. Set the following properties:
- ResultSet: Full result set
- Connection: SQLServerInstance.OLEDB
- SQLStatement:
select name, Object_Id, type_desc
from sys.tables
where schema_id = ?
union
select name, Object_Id, type_desc
from sys.views
where schema_id = ?
This statement returns tables and views, their respective Ids, and type descriptions (table or view).
Map the following parameters:
- SchemaObjectId, Input, LONG, 0, -1
- SchemaObjectId, Input, LONG, 1, -1
Configure the Result Set page with the following resultset properties:
- Result Name: 0
- Variable Name: TablesAndViewsNames
Click the Ok button to close the Execute SQL Task editor.
Shred The Tables and Views
Drag a new Foreach Loop Container into the Foreach Schema Loop Container and rename it "Foreach Table and View". Connect the "Get Table and View Names" Execute SQL Task to it using a Success precedence constraint. Double-click Foreach Table and View to open the editor and set the following properties:
- Collection page, Enumerator: Foreach ADO Enumerator
- Collection page, Ado object source variable: TableAndViewsNames
- Collection page, Enumeration mode: Rows in first table
- Variable Mappings page
-
- Variable: TableAndViewsName; Index: 0
- Variable: TableAndViewsObjectId; Index: 1
- Variable: TableAndViewsType; Index: 2
Click the Ok button to close the Foreach Loop Container editor.
Show Me The Table And Views!
Drag a Script Task into the Foreach Table and View Loop Container and rename it "Show Table and View Name". Double-click to open the editor and add the following SSIS variables to the ReadOnlyVariables property: SQLServerInstanceName, DatabaseName, DatabaseObjectId, SchemaName, SchemaObjectId, TablesAndViewsName, TablesAndViewsObjectId, TablesAndViewsType. Click the Design Script button to open the VSA editor and add the following code to Sub Main():
Dim sSQLServerInstanceName As String = Dts.Variables("SQLServerInstanceName").Value.ToString
Dim sDatabaseName As String = Dts.Variables("DatabaseName").Value.ToString
Dim iDatabaseObjectId As Integer = Convert.ToInt32(Dts.Variables("DatabaseObjectId").Value)
Dim sSchemaName As String = Dts.Variables("SchemaName").Value.ToString
Dim iSchemaObjectId As Integer = Convert.ToInt32(Dts.Variables("SchemaObjectId").Value)
Dim sTableAndViewName As String = Dts.Variables("TablesAndViewsName").Value.ToString
Dim iTableAndViewObjectId As Integer = Convert.ToInt32(Dts.Variables("TablesAndViewsObjectId").Value)
Dim sTableAndViewType As String = Dts.Variables("TablesAndViewsType").Value.ToString
Dim sMsg As String = "SQL Server Instance Name: " & sSQLServerInstanceName & vbCrLf & _
"Database Name: " & sDatabaseName & vbCrLf & _
"Database Id: " & iDatabaseObjectId.ToString & vbCrLf & _
"Schema Name: " & sSchemaName & vbCrLf & _
"Schema Id: " & iSchemaObjectId.ToString & vbCrLf & _
"Table / View Name: " & sTableAndViewName & vbCrLf & _
"Table / View Id: " & iTableAndViewObjectId.ToString & vbCrLf & _
"Table / View Type: " & sTableAndViewType
MsgBox(sMsg)
Close the VSA editor and click the Ok button to close the Script Task editor.
Store Tables And Views Metadata
Drag an Execute SQL Task into the Foreach Tables and Views Loop Container and connect the "Show Table and View Name" Script Task to it using a Success precedence constraint. Rename the Execute SQL Task "Add the Table and View Name" and double-click it to open the editor. Set the following properties:
- ResultSet: Single row
- Connection: EnterpriseDBMetadata.OLEDB
- SQLStatement:
If Not Exists(Select ObjectName
from dbo.TablesAndViews
where ObjectName = ?
and ObjectType = ?
and SchemasTableId = ?)
begin
Insert Into dbo.TablesAndViews
(TablesAndViewsObjectId
,SchemasTableId
,ObjectName
,ObjectType)
output inserted.Id
Values(?, ?, ?, ?)
end
Else
begin
Select Id
from dbo.TablesAndViews
where ObjectName = ?
and ObjectType = ?
and SchemasTableId = ?
end
Map the parameters as shown in the following list and image:
-
TablesAndViewsName, Input, VARCHAR, 0
-
TablesAndViewsType, Input, VARCHAR, 1
-
SchemaId, Input, LONG, 2
-
TablesAndViewsObjectId, Input, LONG, 3
-
SchemaId, Input, LONG, 4
-
TablesAndViewsName, Input, VARCHAR, 5
-
TablesAndViewsType, Input, VARCHAR, 6
-
TablesAndViewsName, Input, VARCHAR, 7
-
TablesAndViewsType, Input, VARCHAR, 8
-
DatabaseId, Input, LONG, 9
Click the Result Set page and add a new resultset. Set the Result Name to 0 and assign the result to the variable TablesAndViewsId:
Click the Ok button to close the Execute SQL Task editor.
E. The Tables and Views
Add another Execute SQL Task to the Foreach Table and View Loop Container and rename it "Get Column Names and Metadata". Add a Success precedence constraint from the "Add the Table and View Name" Execute SQL Task to the "Get Column Names and Metadata" Execute SQL Task, and double-click "Get Column Names and Metadata" to open the editor. Set the following properties:
This statement returns columns, their respective Ids, and type names.
Map the following parameters:
- TablesAndViewsObjectId, Input, LONG, 0, -1
Configure the Result Set page with the following resultset properties:
- Result Name: 0
- Variable Name: ColumnNames
Click the Ok button to close the Execute SQL Task editor.
Shred The Columns
Drag a new Foreach Loop Container into the Foreach Table and View Loop Container and rename it "Foreach Column". Connect the "Get Column Names and Metadata" Execute SQL Task to it using a Success precedence constraint. Double-click Foreach Column to open the editor and set the following properties:
- Collection page, Enumerator: Foreach ADO Enumerator
- Collection page, Ado object source variable: ColumnNames
- Collection page, Enumeration mode: Rows in first table
- Variable Mappings page
-
- Variable: ColumnName; Index: 0
- Variable: ColumnObjectId; Index: 1
- Variable: ColumnDataType; Index: 2
Click the Ok button to close the Foreach Loop Container editor.
Show Me The Columns!
Drag a Script Task into the Foreach Column Loop Container and rename it "Show Column Name". Double-click to open the editor and add the following SSIS variables to the ReadOnlyVariables property: SQLServerInstanceName, DatabaseName, DatabaseObjectId, SchemaName, SchemaObjectId, TablesAndViewsName, TablesAndViewsObjectId, TablesAndViewsType, ColumnName, ColumnObjectId, ColumnDataType. Click the Design Script button to open the VSA editor and add the following code to Sub Main():
Dim sSQLServerInstanceName As String = Dts.Variables("SQLServerInstanceName").Value.ToString
Dim sDatabaseName As String = Dts.Variables("DatabaseName").Value.ToString
Dim iDatabaseObjectId As Integer = Convert.ToInt32(Dts.Variables("DatabaseObjectId").Value)
Dim sSchemaName As String = Dts.Variables("SchemaName").Value.ToString
Dim iSchemaObjectId As Integer = Convert.ToInt32(Dts.Variables("SchemaObjectId").Value)
Dim sTableAndViewName As String = Dts.Variables("TablesAndViewsName").Value.ToString
Dim iTableAndViewObjectId As Integer = Convert.ToInt32(Dts.Variable
("TablesAndViewsObjectId").Value)
Dim sTableAndViewType As String = Dts.Variables("TablesAndViewsType").Value.ToString
Dim sColumnName As String = Dts.Variables("ColumnName").Value.ToString
Dim iColumnObjectId As Integer = Convert.ToInt32(Dts.Variables("ColumnObjectId").Value)
Dim sColumnDataType As String = Dts.Variables("ColumnDataType").Value.ToString
Dim sMsg As String = "SQL Server Instance Name: " & sSQLServerInstanceName & vbCrLf & _
"Database Name: " & sDatabaseName & vbCrLf & _
"Database Id: " & iDatabaseObjectId.ToString & vbCrLf & _
"Schema Name: " & sSchemaName & vbCrLf & _
"Schema Id: " & iSchemaObjectId.ToString & vbCrLf & _
"Table / View Name: " & sTableAndViewName & vbCrLf & _
"Table / View Id: " & iTableAndViewObjectId.ToString & vbCrLf & _
"Table / View Type: " & sTableAndViewType & vbCrLf & _
"Column Name: " & sColumnName & vbCrLf & _
"Column Id: " & iColumnObjectId.ToString & vbCrLf & _
"Column DataType: " & sColumnDataType
MsgBox(sMsg)
Close the VSA editor and click the Ok button to close the Script Task editor.
Store Columns And Metadata
Drag an Execute SQL Task into the Foreach Column Loop Container and connect the "Show Column Name" Script Task to it using a Success precedence constraint. Rename the Execute SQL Task "Add the Column Name" and double-click it to open the editor. Set the following properties:
- ResultSet: Single row
- Connection: EnterpriseDBMetadata.OLEDB
- SQLStatement:
If Not Exists(Select ColumnName
from dbo.Columns
where ColumnName = ?
and TablesAndViewsTableId = ?)
begin
Insert Into dbo.Columns
(ColumnsObjectId
,TablesAndViewsTableId
,ColumnName
,ColumnDataType)
Values(?, ?, ?, ?)
end
Map the parameters as shown in the following list and image:
-
ColumnName, Input, VARCHAR, 0
-
TablesAndViewsId, Input, LONG, 1
-
ColumnObjectId, Input, LONG, 2
-
TablesAndViewsId, Input, LONG, 3
-
ColumnName, Input, VARCHAR, 4
-
ColumnDataType, Input, VARCHAR, 5
Click the Ok button to close the Execute SQL Task editor.
Conclusion
You're done!
When you're done your SSIS package should look something like this:
You can get the SSIS package here (free registration required).
:{> Andy