
Report Delivery Database (SSIS and Reporting Services Web Service part 1a)

Ok, time to get sidetracked, sort of.  In my last post I talked about inspecting the Reporting Services web service to gather information about reports, and possibly even building that out to execute reports.  In order to drive the automatic execution of reports we need a place to hold report names and locations, along with parameters for those reports.  In this post I will go over a database I designed to handle feeding this data to an SSIS package that executes reports.  It's worth noting that this structure can also be used to feed the data driven subscription engine in SQL Server Reporting Services Enterprise Edition (you will still be lacking the event driven ability).  I'm sure I'll also manage to sneak in a little of my database development and design philosophy, and possibly a PowerShell script or two I use to automate the build process.  Ok, let's start out at the end:

 Here is the DB that we will end up with.

To do this we need to create 4 tables: one to hold the report name and its location, one to hold parameter names but not values, one to hold the linkage between which reports get which parameters, and finally one to hold the actual data for a given scheduled run of a report, tying this data all the way back to an individual report and its parameter information.

Here are the scripts, I'll talk a little bit about each one.

Script 0000

use master

go

if @@SERVERNAME = 'YOURDEVELOPMENTDBINSTANCE!'

if exists (select * from sys.databases where name ='steves')

begin

print 'killing connections!'

declare @sqlString varchar(100)

while (exists (select spid from sys.sysprocesses s inner join sys.databases db on db.database_id=s.dbid where db.name='steves'))

begin

select @sqlString='kill '+cast(max(spid) as varchar(10)) from sys.sysprocesses s inner join sys.databases db on db.database_id=s.dbid where db.name='steves'

print @sqlString

execute(@sqlString)

end

begin

print 'I hope you know what you are doing!'

drop database steves

print 'steves database dropped!'

end

end

Ok, first off you may wonder why I'm dropping the DB in this first script; it's because this is my development methodology.  I guess it's almost test driven database development, but with a few differences.  I like to start from scratch, testing all of my build scripts that get me to the current production version; then I like to run any of the scripts that add metadata to the system; next I run any data loading scripts or SSIS packages necessary to get the system into an active state similar to production; finally I like to add any of the new development scripts to the system.  To accomplish this I normally use a combination of SubVersion, PowerShell, and SQL Server Agent.  (Maybe in a future post I'll go into my PowerShell, SubVersion, and SQL Server interactions; to be honest I would like to move this into Visual Studio and SourceSafe but haven't had the time to do a lot of detailed looking around.)  Ok, long story longer: if I'm in the right environment I kill the connections and drop the DB, then create a new DB and schema.  Otherwise this script does nothing.

Script 0001

use master

if not exists (select * from sys.databases where name ='Steves')

begin

create database steves

print 'database "steves" created!'

end

else

begin

print 'database "steves" already exists'

end

go

use steves

go

if not exists(select * from sys.schemas where name = 'RD')

begin

execute ('create schema rd authorization dbo')

print 'schema "rd" created!'

end

else

begin

print 'schema "rd" already exists!'

end

Simple enough: create a new database and a new schema if they don't exist. 

Script 0002

use steves

go

if not exists (select * from sys.tables where name = 'reports' and SCHEMA_NAME(schema_id)='rd')

begin

create table rd.reports

(

report_id int identity(1,1) not null

,report_name varchar(150) not null

,report_location varchar(255) not null

,constraint pk_reports_01 primary key (report_id)

)

print 'table rd.reports created!'

end

else

begin

print 'table rd.reports already exists!'

end

Ok, this script just creates a table, nothing special.  It does keep up the theme of providing information back based on the success of the script.  (I get enough information back from a failure, and if we have a need to handle failures in a special manner we can introduce try/catch blocks.)  This is the report table; the report name is a user defined name for a report, and it can match the actual report name but does not need to.  The report_location field will hold the location off of the ReportServer URL, including the name of the actual rdl file.

Script 0003

use steves

go

if not exists (select * from sys.tables where name = 'parameters' and SCHEMA_NAME(schema_id)='rd')

begin

create table rd.parameters

(

param_id int identity(1,1) not null

,param_name varchar(20) not null

,param_data_type tinyint not null

,constraint pk_parameters_01 primary key (param_id)

)

print 'table rd.parameters created!'

end

else

begin

print 'table rd.parameters already exists!'

end

Once again a pretty straightforward table create with a few print statements.  This table holds a name for each parameter; this name should match the name of the parameter in the reports.  This requires a fairly high level of standardization and correlation between report writing and parameter entry into this system.  We are not locked into this; if we need to specify a different name for the same parameter we can do so at the report/parameter cross reference level, but this is going to be the default name used.  You also see param_data_type, which holds the data type that this parameter represents; because all parameter data is going to be held in a varchar field it is useful to know what the data type should be.  The last thing I want to point out deals with database standards: the primary key constraint is named with pk to signify that the constraint is a primary key, followed by the table name and then the ordinal position of the column(s) that are in the primary key.  The reason I am pointing this out is that I do not believe good database design happens by chance; in my opinion it takes forethought and planning to come up with a good design.

Script 0004

use steves

go

if not exists (select * from sys.tables where name = 'reportsParametersXrefs' and SCHEMA_NAME(schema_id)='rd')

begin

create table rd.reportsParametersXrefs

(

xref_id int identity(1,1) not null

,report_id int not null

,param_id int not null

,alternate_param_name varchar(40) null

,constraint pk_reportsParametersXrefs_01 primary key (xref_id)

,constraint un_reportsParametersXrefs_02_03 unique (report_id,param_id)

,constraint fk_reportsParametersXrefs_02 foreign key (report_id) references rd.reports(report_id)

,constraint fk_reportsParametersXrefs_03 foreign key (param_id) references rd.parameters(param_id)

)

print 'table rd.reportsParametersXrefs created!'

end

else

begin

print 'table rd.reportsParametersXrefs already exists!'

end

Alright, now we are getting somewhere.  First I want to point out the continuation of the constraint naming; once again I am not saying this is the only way to do it, I'm just saying think about everything in your database in the planning stages and it will make its use and maintenance much easier.  This table basically just links reports to parameters.  It also gives you a chance to rename the parameter to one that matches the value in the report, which will be key in the overall Report Delivery Engine we are building.

Script 0005

use steves

go

if not exists (select * from sys.tables where name = 'schedules' and SCHEMA_NAME(schema_id)='rd')

begin

create table rd.schedules

(

schedule_num int not null

,xref_id int not null

,param_value varchar(255) not null

,constraint pk_schedules_01 primary key (schedule_num,xref_id)

,constraint fk_schedules_02 foreign key (xref_id) references rd.reportsParametersXrefs(xref_id)

)

print 'table rd.schedules created!'

end

else

begin

print 'table rd.schedules already exists!'

end

Finally we are ready to pair the data with the report/parameter information.  Notice how the parameter value is held in a varchar field.  This is done to make this system as flexible as possible.  I toyed with the idea of creating a parameter value column for each data type, making those fields nullable, and then enforcing the business logic that requires one of the parameter values to be filled with data via a trigger, but this seemed overly complex and didn't truly buy me any additional data integrity.  I can still enforce domain integrity via a trigger that looks back to the parameter table; I'm not sure it's necessary, but I will keep that idea ready just in case.
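To make the relationships concrete, here is a rough sketch of loading one report with one scheduled parameter value into these tables.  The report path matches the ConnectionInfo example used later in this series; the parameter name, the value, and the use of 61 (the sys.types system_type_id for datetime) are only illustrative.

use steves
go
declare @report_id int, @param_id int, @xref_id int

insert into rd.reports (report_name, report_location)
values ('Connection Info', '/StevesReportingExamples/ConnectionInfo')
set @report_id = scope_identity()

insert into rd.parameters (param_name, param_data_type)
values ('BeginDate', 61) --61 = datetime in sys.types
set @param_id = scope_identity()

insert into rd.reportsParametersXrefs (report_id, param_id, alternate_param_name)
values (@report_id, @param_id, 'StartDate')
set @xref_id = scope_identity()

insert into rd.schedules (schedule_num, xref_id, param_value)
values (1, @xref_id, '2009-05-01') --all parameter values are stored as varchar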

Script 0006

use steves

go

create view rd.reportSchedules

as

select

r.report_name

,r.report_location

,p.param_name

,t.name

,rpx.alternate_param_name

,s.param_value

,s.schedule_num

,r.report_id

,p.param_id

,rpx.xref_id

,t.system_type_id

from

rd.reports r

inner join

rd.reportsParametersXrefs rpx

on

r.report_id=rpx.report_id

inner join

rd.parameters p

on

p.param_id=rpx.param_id

inner join

rd.schedules s

on

s.xref_id=rpx.xref_id

inner join

sys.types t

on

t.system_type_id =p.param_data_type

go

if (@@ERROR=2714)

begin

print 'view rd.reportSchedules already exists!'

end

else if exists (select * from sys.views where name = 'reportSchedules' and SCHEMA_NAME(schema_id)='rd')

print 'view rd.reportSchedules created!'

else

print 'error look above'

Finally we have a view to make all of these identity-filled normalized tables provide some useful data to the end user.  This view just shows all of the value fields along with only one copy of each of the identities across all four tables.  This script is also in a different format, where I create the object and then wait to see what the error value is, if any.  The reason for this is that the create view statement must be the only statement in its batch.  I'm not exactly sure why this is, but it is what it is.

What does all this get us?  Well, nothing, but wasn't that a fun 30 minutes.  Seriously, this gives us a building block, and it gives us a system to record and retrieve parameters for multiple reports and multiple schedules for any given report.  There are a few more scripts that can make this a solution for providing data to the Data Driven Subscription area in the Enterprise Edition of Reporting Services, but I'll save those for another post.  (Hint: 1 proc for retrieving data, 1 proc for updating key parameters like begin date and end date, 1 proc for dynamically feeding data to this set of tables prior to a report running.)
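As a rough sketch of the first of those procedures (the name and shape here are my own, not one of the scripts above), the retrieval piece can be little more than a thin wrapper over the view:

use steves
go
create procedure rd.getScheduleParameters
 @schedule_num int
as
begin
 select
  report_name
  ,report_location
  ,coalesce(alternate_param_name, param_name) param_name
  ,param_value
 from
  rd.reportSchedules
 where
  schedule_num = @schedule_num
end
go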

 

Ok, as a little bonus here is the PowerShell script I use to automate my build process.  Basically you pass it a directory and a SQL Server instance, and it spins through the directory executing, in order from lowest to highest, any script whose name starts with four numeric characters.  I normally schedule this to run from SQL Agent at a set time every week.

#-----------------------------------------------------------------------------
#
#SFIBICH
#7/2/08
#Version 1.0
#versionControlSQLScript.ps1
#This script looks for files in the directory and executes them in order
#based on the script having an .SQL file extension and the first four characters of the
#name being numeric.  It will execute the scripts in ascending order.
#
#-----------------------------------------------------------------------------
#
#$args[0] - relative path to the folder
#$args[1] - the name of the server that the code will be executed on.
#$a = path to this file when executed
#serverName - the name of the server this code will be executed on, defaults to localhost
#
#-----------------------------------------------------------------------------
$a=$MyInvocation.InvocationName
$serverName ="localhost"
if ($args[1].length -gt 0) {
 $serverName=$args[1]
}
"ServerName:"+$serverName
if ($args[0].length -gt 0) {
 $fileLocation=$args[0]
 if ($fileLocation -eq '?') {
  get-content $a | ? {$_.length -gt 0} | ? {$_.substring(0,1) -eq '#' }
  exit
 }
 else {
  [reflection.assembly]::LoadWithPartialName("Microsoft.VisualBasic")
  gci $fileLocation | ? {
   $_.name -match "sql" -and [Microsoft.VisualBasic.Information]::isNumeric($_.name.trim().substring(0,4)) -eq "TRUE"} |
   sort-object @{Expression={[Int](([String]$_).trim().substring(0,4))};Ascending=$true} |
   % {$_.FullName; sqlcmd -S $serverName -E -e -i $_.FullName}
 }
}else{
 write-warning('Default Directory required - no work executed')
}
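
Calling it looks something like this (the folder and instance names are placeholders):

#run every nnnn*.sql script in .\buildScripts, in order, against a named instance
.\versionControlSQLScript.ps1 .\buildScripts 'MYSERVER\DEV2008'

#pass ? as the first argument to dump the comment header as help
.\versionControlSQLScript.ps1 '?'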

 

 

SSIS and Reporting Services Web Services

This post covers accessing web services in a SQL Server 2008 SSIS package, in particular the web service provided by Reporting Services 2005, which is available in both MSRS 2005 and MSRS 2008.  I have a need to generate reports on a schedule and/or on an event driven basis from Reporting Services based on a number of outside conditions.  I could use Reporting Services 2008 Enterprise Edition, which has data driven subscriptions, but that falls short in the event driven category and will run the report based on a schedule only.  This post is more or less a walkthrough of the ReportService web service as applied to a SQL Server 2008 SSIS script task.  The information below works in both Data Flow and Control Flow script tasks.  (Read: there is nothing earth-shattering going on here, but you need to start somewhere and being familiar with the basics is going to help us in the long run.)

Reporting Services exposes a number of web service classes in both SQL Server 2005 and SQL Server 2008 Reporting Services.  For this post we will be looking into the ReportService2005 web service namespace.  For future reference, there are also ReportExecution, ReportService2006, and ReportServiceAuthentication web service namespaces available for use in both the 2005 and 2008 versions of Reporting Services.

Before we can get into building our report delivery engine we need to get a feel for how to interact with Reporting Services programmatically.  We will do this by setting up a simple SSIS package that outputs basic properties from a given report on a Reporting Services environment.  This is not as simple as dragging the web service component from the toolbox onto the control flow tab and typing in the URL of the web server; unfortunately the SSIS web service task is incompatible with the level of control we need to successfully accomplish our goal.

Step 1 – Drag the Script Task onto the control flow.  Note most of what I am going to show you can be done exactly the same inside a data flow task as a data source, transformation, or destination.

Step 2 – Ok, not really a step, but I want to make the point that the script language drop-down now shows and supports C#.

 

Step 3 – Once the script is open, click on the References section and choose Add Web Reference.  Note the namespace of this particular script task; you will need to reference it later in the using statement to properly reference the web service.

Step 4 – Put in the URL of your Reporting Services server followed by reportservice2005.asmx?wsdl to get the WSDL for the ReportService2005 web service class.  Then rename the web reference to something meaningful to you; ReportService2005 is a good choice (and follows along with the MSDN documentation example if you need a really good reason).

WSDL URL Example: http://FQDN:8080/ReportServer_SQL2008/reportservice2005.asmx?wsdl

Note: I did have issues referencing the https version of the WSDL from my local machine inside visual studio, so I went with the http reference. 

 

Step 5 – Reference the namespace by adding a using statement with your script task's namespace followed by the name of the web reference.

Step 6 – Code Explanation:

Ok, here is a little detail on what we are doing.  First we declare a Boolean fireAgainFlag for use later in a FireInformation event.  Next we declare a new instance of the ReportingService2005 class.  Once we have created an instance we set its Credentials and Url properties.  The URL is set to the instance of Reporting Services running on my local machine, pointing to the reportservice2005.asmx page.  Next we create two new instances of the Property class, set their Name properties to Description and Name, and add them to an array of Property objects.  The GetProperties method takes an array of Property objects whose Name values say which properties you want, and hands back those properties with their Value filled in.  (So basically we have an array that has two objects; each object is a Property object, and the Property object has two properties of its own, Name and Value.  I only mention this because the word property is used a whole lot in the above few sentences and reads a little funny to me, so I expect it may be a little confusing for everyone else.)  After the Property array is created and set with the appropriate values it is used in the call to the ReportingService2005 class's GetProperties method.  This method returns the property information for a given report, in this example the ConnectionInfo report in the StevesReportingExamples folder off the root of the Report Server directory.  We then loop through the property array and show the name and value of each property we have retrieved data for.
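In code, the whole thing boils down to something like the sketch below.  The web reference name (ReportService2005), the script task namespace, and the server URL are assumptions; adjust them to your own environment.

        // minimal sketch of the Main() body described above; assumes the web
        // reference was named ReportService2005 and that a using statement for
        // <YourScriptTaskNamespace>.ReportService2005 was added as in Step 5
        bool fireAgainFlag = true;

        ReportingService2005 rs = new ReportingService2005();
        rs.Credentials = System.Net.CredentialCache.DefaultCredentials;
        rs.Url = "http://localhost:8080/ReportServer_SQL2008/ReportService2005.asmx";

        Property description = new Property();
        description.Name = "Description";
        Property name = new Property();
        name.Name = "Name";
        Property[] properties = new Property[] { description, name };

        // ask the server for those two properties of the example report
        Property[] results = rs.GetProperties("/StevesReportingExamples/ConnectionInfo", properties);

        foreach (Property p in results)
        {
            Dts.Events.FireInformation(0, "ReportProperties",
                p.Name + ": " + p.Value, string.Empty, 0, ref fireAgainFlag);
        }

        Dts.TaskResult = (int)ScriptResults.Success;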

 

Step 7 – Running the package, you see that I have a number of property names and their values on the right-hand side, output by the FireInformation event.  I have a few more than those shown in the code screen shot above because I added a number of additional Property objects to the property array.  The code for the additional properties is below; it's worth noting there are a few more properties specific to reports, folders, and objects that are available but not shown.

Additional valid property objects names and values:

        //Item(Generic) Properties

        Property[] properties = new Property[12];

        Property name = new Property();

        Property description = new Property();

        Property CreatedBy = new Property();

        Property CreationDate = new Property();

        Property Hidden = new Property();

        Property ID = new Property();

        Property ModifiedBy = new Property();

        Property ModifiedDate = new Property();

        Property Path = new Property();

        Property Size = new Property();

        Property Type = new Property();

        Property VirtualPath = new Property();

        properties[0] = name;

        properties[1] = description;

        properties[2] = CreatedBy;

        properties[3] = CreationDate;

        properties[4] = Hidden;

        properties[5] = ID;

        properties[6] = ModifiedBy;

        properties[7] = ModifiedDate;

        properties[8] = Path;

        properties[9] = Size;

        properties[10] = Type;

        properties[11] = VirtualPath;

 

        //Report Specific Properties

        Property[] Reportproperties = new Property[8];

        Property Language = new Property();

        Property ReportProcessingTimeout = new Property();

        Property ExecutionDate = new Property();

        Property CanRunUnattended = new Property();

        Property HasParameterDefaultValues = new Property();

        Property HasDataSourceCredentials = new Property();

        Property IsSnapshotExecution = new Property();

        Property HasScheduleReadyDataSources = new Property();

        Reportproperties[0] = Language;

        Reportproperties[1] = ReportProcessingTimeout;

        Reportproperties[2] = ExecutionDate;

        Reportproperties[3] = CanRunUnattended;

        Reportproperties[4] = HasParameterDefaultValues;

        Reportproperties[5] = HasDataSourceCredentials;

        Reportproperties[6] = IsSnapshotExecution;

        Reportproperties[7] = HasScheduleReadyDataSources;

 

        //Folder Specific Properties

        Property[] Folderproperties = new Property[1];

        Property Reserved = new Property();

        Folderproperties[0] = Reserved;

 

        //Resource Specific Properties

        Property[] Resourceproperties = new Property[1];

        Property MimeType = new Property();

        Resourceproperties[0] = MimeType;

 

        //Item(Generic) Properties

        name.Name = "name";

        description.Name = "description";

        CreatedBy.Name = "CreatedBy";

        CreationDate.Name = "CreationDate";

        Hidden.Name = "Hidden";

        ID.Name = "ID";

        ModifiedBy.Name = "ModifiedBy";

        ModifiedDate.Name = "ModifiedDate";

        Path.Name = "Path";

        Size.Name = "Size";

        Type.Name = "Type";

        VirtualPath.Name = "VirtualPath";

 

        //Report Specific Properties

        Language.Name = "Language";

        ReportProcessingTimeout.Name = "ReportProcessingTimeOut";

        ExecutionDate.Name = "ExecutionDate";

        CanRunUnattended.Name = "CanRunUnattended";

        HasParameterDefaultValues.Name = "HasParameterDefaultValues";

        HasDataSourceCredentials.Name = "HasDataSourceCredentials";

        IsSnapshotExecution.Name = "IsSnapshotExecution";

        HasScheduleReadyDataSources.Name = "HasScheduleReadyDataSources";

 

        //Folder Specific Properties

        Reserved.Name = "Reserved";

 

        //Resource Specific Properties

        MimeType.Name = "MimeType";
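
Once the arrays and names are set up, each array goes through its own GetProperties call against the item of interest, roughly like this (rs is the same proxy object from the earlier sketch, and the item path is the same example report):

        Property[] itemValues = rs.GetProperties("/StevesReportingExamples/ConnectionInfo", properties);
        Property[] reportValues = rs.GetProperties("/StevesReportingExamples/ConnectionInfo", Reportproperties);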

 

I hope that’s helpful and I’m pretty sure I’ll post a few more entries around this topic.

PowerShell Certification, why I think we need one.

I normally do not write opinion pieces; I feel that if I can contribute anything worthwhile to the community my efforts are best served explaining or demonstrating a technical area where I have some level of expertise.  I do however have one subject that I have been mulling over for the past few days that is an exception to this rule, and that is certification.  Not just any certification but a PowerShell certification, which does not currently exist.  I think we (the community) need a PowerShell certification for a number of reasons (the following are my personal reasons, which I hope some of you share). 

·         First, trying to obtain a certification drives the individual to a certain competency in a given product or area.  

·         Second, it gives the individual an idea of what given competency level in a particular product or area is and what is required to have that competency. 

·         Third and simplest reason,  it’s a goal that one can drive towards. 

I have taken a number of different certifications in my career and I find the process of preparing for the certification the most valuable part of the certification process.  This is not to say that certification doesn't help in getting you a job; certification has at the least gotten my foot in the door at one employer.  Employers look to certification as a sign post on an employee's career saying, yes, I have interest in this subject, I have dedicated some of my time outside of my job to show you what I can do.  This does not mean that if you take a certification you are necessarily an expert, or that people who have certifications are better than those IT professionals that do not, but it is a sign of sorts of dedication or interest in an area going beyond the standard 9-5 (or 7:45-5, plus off-hours support and project go-live/implementation weekends :) ).  It does offer a potential employer something to differentiate two otherwise equally qualified individuals for the same position.

So back to the original point: we (or at least I think the community, and if nothing else I would like) need a PowerShell certification.  Let me give you a little history as to why I am asking for this certification to exist.  I am not a complete newbie to PowerShell; I've been using it for about a year or so in production environments, and I've read a few books, one in particular Professional Windows PowerShell (a Wrox book by Andrew Watt) which was the first book I read and was great and helped me come up to speed.  Recently I started to read what I am beginning to consider the bible for PowerShell, Windows PowerShell in Action by Bruce Payette.  This is a fantastic book, but it has also made me feel like I know little to nothing about PowerShell.  That is why I am asking for a PowerShell certification; until I started reading Bruce Payette's book I thought that I had a pretty good handle on PowerShell.  Yes, there were a few things I didn't think I knew, but overall I thought I had this PowerShell thing down.  Now I feel like a newbie in this area and I have to go back and re-examine everything I've written, thinking how poor the implementation is.  This is a good thing, though it is a little humbling of an experience.  (I have some examples of scripts that should be re-evaluated on this site, but I'm going to leave them up as hopefully I'll be able to look back and see my skills improve over time, and if nothing else they work.)  For a while now I have known of the three different CTPs of PowerShell V2 and its upcoming release in Windows 7/Windows 2k8 R2.  Last night however I found a post (from May 1st 2008) from the PowerShell Team detailing some of the new upcoming features in PowerShell V2.  That got me thinking: I just barely have a handle on V1, V2 already has three CTPs out and I have no idea what is going on in the newest release.  I need something to help guide me, or at least something I can use as a sign post to make sure I'm on the right track with PowerShell V1, so I have a chance of hanging on for V2.

If you think about it, a PowerShell certification is not straying too far outside the certifications that are already offered by Microsoft.  Currently Microsoft offers certifications in operating systems, server applications, and different parts of the .NET Framework.  So in closing, if anyone who reads this blog has any interest in a PowerShell certification, show your support by putting in a comment or sending me an email.  I would like to know if I am a lone wolf on this one.  If anyone who reads this blog has any influence over certification tracks at Microsoft, please do what you can to make this certification a reality.

Posted by steve | 1 Comments

PrimeFinder

Alright, everyone has written the obligatory "Hello World!" program when they start to learn a new language.  I have my own variety of that practice and it's PrimeFinder.  Whenever I start to learn a new language I write a program that finds primes.  I know this may sound a little strange, but it normally allows me to practice a number of new skills and techniques, as well as see if there is a way I can improve both my prime number finding logic and find out if the new language has anything to offer that is unique for this task.  In this post I'll examine PrimeFinder written in PowerShell, C#, and T-SQL.  I will discuss each and the surprises I found along the way.  Feel free to take this as a challenge to find more efficient ways in these languages or any others; I would like to see any different solutions that are more efficient.

There are some rules though, here they are:

1.     The solution must be robust enough to find all primes ranging from 1 to 2,147,483,647 (signed 4 byte int limit)

2.     The solution must output the prime numbers in some manner which is recordable. (printing to standard out is fine since that can be pushed into a text file)

3.     The solution must be restartable, or easily changed to a state that allows it to be restartable, in that it can produce a certain number of primes, be stopped, and pick up where it left off.  (This is the reason for requirement #2; I am very flexible on this requirement.)

4.     The solution must be timed

5.     The solution should find the first 10000 primes by default when executed

6.     The solution is allowed three primary constants of 1, 2, and 3.  These are allowed to remove any single use code that would be in place to find these 3 primes.

The first solution was written in C# and is the reason that I was looking into the problem to begin with.  This code is pretty much a brute force attack against prime numbers, but it seems to be very efficient.  I am running this code in the screen shot below, wrapped in a few lines of PowerShell code to produce an easy to read timer at the bottom of the screen shot.

7.48 seconds not bad!

Save the following code in a .cs file, compile it, and it should be executable.

using System;
namespace fibich.math
{
 class primeFinder
 {
  const int prime1=1;
  const int prime2=2;
  const int prime3=3;

  static void Main()
  {
   //Variable Decleration
   bool isPrime =true;
   int possiblePrime =prime3;
   int primeCount=0;
   
   //Const Output
   Console.WriteLine("Starting to find Primes");
   Console.WriteLine("prime:" + prime1.ToString());  
   Console.WriteLine("prime:" + prime2.ToString());  
   Console.WriteLine("prime:" + prime3.ToString());  
    

   //Finding Primes

   while (primeCount<10000)
   {
    possiblePrime+=prime2;
    for (int primeFinder =prime3; primeFinder <= Math.Sqrt(possiblePrime)+1; primeFinder+=prime2)
    {
     if ((possiblePrime % primeFinder)==0)
     {
      isPrime=false;
      break;
     }
     else
     {
      isPrime=true;
     }
    }
    if (isPrime==true)
    {
     primeCount++;
     Console.WriteLine("prime:"+possiblePrime.ToString()+" primeCount:"+primeCount.ToString());
    } 
    
   }
  return;
  }
 }
}
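
The timing wrapper itself is nothing fancy; something along these lines produces the same kind of readout (the executable name is whatever you compiled the class above to):

$start = get-date
.\primeFinder.exe
$end = get-date
"Elapsed Seconds: " + ($end - $start).TotalSeconds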

The second solution I offer up is written in PowerShell.  The reason I wrote this was that after I wrote the C# solution I noticed how similar the syntax was, and I figured why not, it only required a few changes to the code to make it executable as a PowerShell script.  Basically I added $ to all my variables, changed == to -eq and < to -lt, removed the Console.WriteLine from the output statements, and that was about it.  I wrapped this code in the same PowerShell statements to produce the exact same timer information.

277 seconds!  Wow that was over 37 times slower than the C# solution. 

I expected PowerShell to be slower, but man, I didn't expect that; I was thinking maybe 10 times worse at most.  Hey, PowerShell's power is not in number crunching; it's a scripting environment and gets its power from delivering more productivity per key stroke, empowering the developer, not squeezing every ounce of performance from a process, so I'm not disappointed.  Just a little surprised. 

Save the following code to a .ps1 file, enable scripts in PowerShell, and you should be good to go.

#Variable Decleration
$primeCount=1
$prime1=1;
$prime2=2;
$prime3=3;
$isPrime =$true;
$possiblePrime =$prime3;
$primeCount=0;
   
#Const Output
"Starting to find Primes";
"prime:" + $prime1.ToString();  
"prime:" + $prime2.ToString();  
"prime:" + $prime3.ToString();  
    
#Finding Primes
while ($primeCount -lt 10000)
{
 $possiblePrime+=$prime2;
 for ($primeFinder =$prime3; $primeFinder -le ([Math]::Sqrt($possiblePrime))+1;$primeFinder+=$prime2)
 {
  if (($possiblePrime % $primeFinder) -eq 0)
  {
   $isPrime=$false;
   break;
  }
   else
  {
   $isPrime=$true;
  }
 }
  if ($isPrime -eq $true)
  {
   $primeCount++;
   "prime:"+$possiblePrime.ToString()+" primeCount:"+$primeCount.ToString();
  } 
}

The final solution is something a little different.  It's a T-SQL solution, where the prime numbers that have already been found are used to find the next prime numbers.  I was hoping that this approach would be more efficient than the brute force used by the other two solutions, and it still may be at the higher end of the prime number range.  This solution also utilizes set based logic to calculate the modulus of each number, instead of trying one number at a time against the trial prime number.

No screen shot unfortunately, but this solution took 70 seconds, so it came in second in the trials to find the first 10,000 prime numbers.

This solution requires a database named dataHub, a schema named numbers, and a table named primes in that schema. 

use datahub

BEGIN

if not exists(select * from sys.schemas where name='numbers')

BEGIN

execute('Create Schema numbers authorization dbo')

print 'schema created:Numbers'

END

else

BEGIN

print 'schema already exists:Numbers'

END

END

use datahub

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Numbers].[primes]') AND type in (N'U'))

DROP TABLE [Numbers].[primes]

GO

create table numbers.primes (

prime bigint not null primary key

,version int not null

)

use dataHub

GO

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[Numbers].[primeFinder]') AND type in (N'P', N'PC'))

DROP PROCEDURE [Numbers].[primeFinder]

GO

create procedure numbers.primeFinder

@primeCount int = 10000

,@purge bit =0

as

declare @version int

declare @possiblePrime bigInt

declare @primeCounter int

set @primeCounter=0

if @purge=1

BEGIN

delete numbers.primes

set @version=1

--insert into numbers.primes

--(prime,version)

--values(1,@version)

--insert into numbers.primes

--(prime,version)

--values(2,@version)

insert into numbers.primes

(prime,version)

values(3,@version)

END

else

BEGIN

select @version = coalesce(max(version),0) from numbers.primes

END

select @possiblePrime = max(prime) from numbers.primes

while @primeCounter <@primeCount

BEGIN

set @possiblePrime=@possiblePrime+2

if not exists(select 1 from numbers.primes where prime<=sqrt(@possiblePrime) and @possiblePrime%prime=0)

BEGIN

insert into numbers.primes

(prime,version)

values(@possiblePrime,@version)

set @primeCounter=@primeCounter+1

END

END

go

Well that’s it; I look forward to hearing back from everyone and seeing any different or new takes on these solutions.  I already have an idea for a combination solution utilizing the CLR integration in SQL Server, but that will have to wait for another time.

Posted by steve | 0 Comments

SSIS Execute SQL Task and RaisError (Using a OLEDB Connection)

This is a relatively short post about raising an error in T-SQL and having SSIS not recognize that error.  If you have an SSIS package that calls an Execute SQL Task and that T-SQL for whatever reason ends up raising an error, the SSIS package that called the T-SQL may complete the step without error.  There may be two separate reasons why this is not working as expected. 

The first reason it may not be working has to do with an apparent bug in SQL Server 2005 SP2.  (It looks like it was a bug in 2008 as well.)  There are a number of good posts on this, but I believe you can fix it by installing the SP2 updates.  The first step is to make sure you know what version of SQL Server you are on.  The second step is to upgrade to a SQL Server build that has this error corrected.  I have chosen to install the most recent Cumulative Update, released on August 18th 2008; it will bring you to version 09.00.3282.00.

About the SP2 Error:

http://www.windows-tech.info/15/2c8831412be41b1f.php

 

The second reason has to do with the RaisError() call itself.  The error level set in the RaisError statement needs to be set to 11 or higher.  That's it, it's that simple.  Below is a simple example of testing this process.

Procedure:

Create proc [dbo].[failure_proc]

as

BEGIN

RaisError('Steves Error',18,1)

END

Unexpected Success (Error Level 11):

Expected Failure (Error Level 11):

 

Now I just need to integrate these changes into some of my previous packages and upgrade a few servers.

Posted by steve | 0 Comments

dtexec.exe what are you doing? (part II)

This is the second part in a very short series covering dtexec and how to see exactly what package it is executing.  Last time we covered how to see what dtexec was doing using PowerShell and WMI.  This was great for checking all the different packages that might be running on a server at any given time.  What we will cover this time is how to create a package that will not allow any work to be done if it is already running.  If all of your jobs are called through the SQL Server Agent this is not really a concern you need to have, as that job scheduler will not allow the same job to be called twice.  If you have the same package embedded in different SQL Server Agent jobs, or you are using a third party job scheduler (that will call the same job even if it is already running), or you are using the Windows scheduler calling dtexec.exe, or a mix of all 3, this should be a concern. 

Package Overview:

Alright, you will notice the first object in the SSIS package is the WMI Data Reader Task.  This is an underutilized task in my opinion and really opens up the possibilities of data collection for SSIS. 

 

I use the WMI Connection Manager to connect to the WMI object I would like to query (details below).  Its query is 'select * from WIN32_process', which selects all of the processes on the local machine where the package is being executed.  It dumps the result set into a package variable that is of type Object. 

The WMI Connection Manager object is the standard connection manager with the default NameSpace (\root\cimv2).  I set the Server name value to \\localhost and selected the use windows authentication check box.  All pretty basic options.

 

The forEach loop then iterates over the record set object and runs the script task once for each row, pulling out the Caption and CommandLine columns from the recordset and placing them into two variables, WMICaption and WMICommandLine.  (If you would like to see the entire record set to see what is available, configure the WMI task to output its data to a text file and run it once.)

 

The script task inside the forEach loop is where all of the real work is done.  Most of the work consists of string parsing.  First I grab the processes that have a Caption of DTEXEC; Caption is the process name.  Then, when I have a record with a DTEXEC caption value, I parse out the package name between the '/F(ile)' and the '.dtsx' values that will be there in a dtexec.exe call.  I also remove any directory information that was used in the call.  For this example I output this to the screen in a message box each time there is a dtexec process.  If the script detects the same package name in two different records, matching the system package variable 'PackageName' against a process with a Caption of DTEXEC, it then sets a package level variable dtexecFlag to True.  This is the indicator to run or not run the rest of the package.

 Public Sub Main()
  '
  ' Add your code here
        '
        '--------------------------------------------------------------
        '
        'SFIBICH
        '9/2/2008
        'This strips out the package name from a CommandLine column of a
        'WMI Record set returned from Win32_process object where the
        'Caption value is DTEXEC
        '
        '--------------------------------------------------------------
        Dim WMICaption As String
        Dim WMICommandLine As String
        Dim WMIPackageName As String
        Dim PackageName As String
        Dim IndexOfSlashF As Integer
        Dim IndexOfSpace As Integer

        WMICaption = Dts.Variables("WMICaption").Value.ToString
        WMICommandLine = Dts.Variables("WMICommandLine").Value.ToString
        PackageName = Dts.Variables("PackageName").Value.ToString

        If WMICaption.Trim.Length > 5 Then
            If (WMICaption.Trim.ToUpper.Substring(0, 6) = "DTEXEC") Then
                IndexOfslashF = WMICommandLine.ToUpper.IndexOf("/F")
                IndexOfSpace = WMICommandLine.Substring(IndexOfslashF).ToUpper.IndexOf(" ") + IndexOfslashF

                WMIPackageName = WMICommandLine.Substring(IndexOfSpace + 1, WMICommandLine.ToUpper.IndexOf(".DTSX") - IndexOfSpace - 1)
                WMIPackageName = WMIPackageName.Substring(WMIPackageName.LastIndexOf("\") + 1)
                WMIPackageName = WMIPackageName.Trim

                MsgBox("WMICaption:" + WMICaption + vbCrLf + "WMICommandLine:" + WMICommandLine + vbCrLf + _
                "Package Name:" + PackageName + vbCrLf + "WMIPackage Name:" + WMIPackageName + vbCrLf + _
                "IndexOfslashF:" + IndexOfSlashF.ToString + vbCrLf + "IndexOfSpace:" + IndexOfSpace.ToString)

                Dts.Variables("PackageCount").Value = CInt(Dts.Variables("PackageCount").Value) + 1

                If (WMIPackageName = PackageName And CInt(Dts.Variables("PackageCount").Value) > 1) Then
                    Dts.Variables("dtexecFlag").Value = True
                End If
            End If
        End If
        Dts.TaskResult = Dts.Results.Success
    End Sub
 

Precedence Constraints.

These are pretty straightforward: one is set to Success and dtexecFlag=FALSE, and the other is set to Success and dtexecFlag=TRUE.

Never Ending For Loop and Script task: This combination is to give me a way to have my example package run forever so I can test the WMI portion of this package.  The loop is set to 1==1 and the script task waits 5 seconds.

There are some things to remember.

1.) DTEXEC is not executed when running a package using Visual Studio BIDS 2005.  So this package will not find copies of itself running if you open two BIDS sessions and execute the package twice.

2.) Run the packages through command line to avoid the issue above and to allow the message boxes to PopUp.  If you run it through SQL Agent you will get an error when it comes to the Message boxes.

3.) Read the message boxes as they pop up, you will not get the "Shut Down" message box until you run the second package, and you will have to click through two boxes prior to that.

Here are the screen shots of the package in action.

Running in BIDS with a copy of the package running in the command line in the background.

 

Running through the command line:

Well that's it, and you can download the package here!  (I think registration is required, it redirects to a user group site sorry but I am having issues uploading to this site.)

Posted by steve | 1 Comments

dtexec.exe what are you doing?

dtexec.exe is the program that is executed to run dtsx packages in SSIS 2005.  This executable is called if you call an Integration Services package through SQL Server Agent or through the command line directly.  The problem I am trying to solve today is knowing which package dtexec.exe is running.  When you run an SSIS package you can look in Task Manager and see a dtexec.exe process executing.  If you only have one SSIS package executing at a time then you know which package dtexec is running.  If you have an environment where you have two or more packages on multiple schedules and some packages that run on demand, then you are left wondering which package a particular dtexec.exe process is running.  I have written a short (and I do mean short) PowerShell script to get this information.  I hope some of you find this useful.  (This script can also be modified to handle looking into a "job number/name" if you are running your SSIS packages in some sort of parent/child framework; more on that in a future post.)

 So for this example I have 2 SSIS packages, NeverEndingPackage.dtsx and NeverEndingPackage2.dtsx.  (NeverEndingPackage2.dtsx is a copy of NeverEndingPackage.dtsx.)  These packages are simply a for loop where 1==1, and they have a script task that waits for 5 seconds in between each loop.

I have called these packages through the command line, once each, and you will see two dtexec.exe processes in my Task Manager; but the question is, which one is which if I needed to stop one of these processes?

 

Now enter PowerShell (using WMI) to solve the issue.  I'll try and walk through the script below.

################################
#
# SFIBICH
# 8/21/2008
# This script gets the process of a given machine
# that are running dtexec.exe (SSIS executable)
# It then interrogates the CommandLine string of the
# Win32_process object to display the SSIS package
# that is running.
#
################################
#one liner
#gwmi Win32_process |?{$_.ProcessName -eq 'dtexec.exe'} | select-object -property processId, @{Name='PackageName'; Expression={$_.CommandLine.substring($_.CommandLine.IndexOf('/F')+2,($_.CommandLine.indexOf('.dtsx')-$_.CommandLine.IndexOf('/F')-2)).trim()+'.dtsx'}}, threadCount, WS,VM  | format-table

#multi liner
gwmi Win32_process |?{$_.ProcessName -eq 'dtexec.exe'} | select-object -property processId,`
@{Name='PackageName'; Expression={$_.CommandLine.substring($_.CommandLine.toUpper().IndexOf('/F')`
+2,($_.CommandLine.toUpper().indexOf('.DTSX')-$_.CommandLine.toUpper().IndexOf('/F')-2)).trim()+'.dtsx'}}, `
threadCount, WS,VM  | format-table -auto

Ok, the # are just comments so nothing there.

There really is only one line of code here I just broke it into different lines using the ` at the end of each line. 

Line 1 - This uses the Get-WmiObject command (shortened to gwmi) to get the Win32_process objects for this computer.  (Note: you can pass -ComputerName ComputerName to get the same information from a remote machine.)  It then passes the output of this to a where-object (?) PowerShell command, searching on ProcessName equal to dtexec.exe.  After that it passes the filtered output to a select-object command, where it selects the processId and some other things.

Line 2 - We create a custom property of PackageName that is a substring of the Win32_process property CommandLine, where I have substringed the CommandLine to get the information between the /F and the .dtsx portion of the line, which for a dtexec.exe process will be the name of the package.  (More trimming may be necessary if you are running the dtexec command from a directory that the SSIS packages are not in, as that will lengthen the file name that is passed to the command.)

Line 3 - Finishing up the custom property and adding .dtsx back to the name of the package since I cut it off.

Line 4 - adding a few other properties like WorkingSet  and Virtual Memory both in KB. Finally passing it to a format-table command to get the output in a manner that I like.

 That's it...here is the script in action:

You can see that the first package, NeverEndingPackage, is processId 312 and the second call, to NeverEndingPackage2, is 4484.  You can compare the process ID in the PowerShell output to that in the Task Manager.  You can now match a process in Task Manager to an SSIS package.  You can use the WMI Win32_process object to pull more details about each SSIS package execution.  Next time I'll show you how to use this type of information inside an SSIS package so that an automated job scheduler (other than SQL Server Agent) can't call your package while there is an instance of it already running.

That's it.

Posted by steve | 1 Comments

space_used_sp vs. sp_spaceUsed

Ok, I'm not one to re-write standard system stored procedures for SQL Server.  That being said, I have been working a good deal with files and filegroups within SQL Server over the past few days, moving data and indexes to files and filegroups other than the default PRIMARY filegroup.  sp_spaceUsed is a great procedure for giving you overall storage utilization information for a database or database objects (table, indexed view, or queue).  As I have been moving indexes and tables over to different filegroups I wanted to see how this was splitting my data as far as size and row counts go, and how it was being distributed across the file system.  Unfortunately sp_spaceUsed does not give you this level of detail, and I could not find a system view or other system procedure to give me all of the information that I wanted. 

sp_spaceUsed

So I set out to create my own, and now that I have it working I figured I'd share it with everyone.  I ended up creating two objects: a view, space_used_vw, and a procedure, space_used_sp.  I'm a database guy so I love organization, so I put both of these objects into a schema named metadata that is owned by dbo.  The view displays the space used by all objects in the database, with rollup levels displayed as ' -db level-', ' -schema level-', and ' -table level-'.  The view relies upon the sys.objects (due to the fact that system tables are not in sys.tables, so I was losing the space and row count information for those objects), sys.indexes, sys.partitions, sys.database_files, sys.schemas, sys.dm_db_partition_stats, sys.internal_tables, and sys.index_columns catalog views, all of which have detailed coverage in BOL.  The view utilizes these system catalog views to show row counts, reserved space, data space, index space, and unused space in the database files broken down by table.  I figure I would want to see all of the detailed data for a given database sometimes, rollups other times, and individual objects separately as well, so encapsulating the select logic into a view allows re-use of this code in other objects.  The procedure utilizes the view to show you information for a particular table or for the overall database.  I hope this code is useful to others as well.  One of the main differences between sp_spaceUsed and my space_used_sp is that sp_spaceUsed allows for the updating of system information, which my procedure does not; it only returns data to the screen.  The second difference is that sp_spaceUsed requires the schema name to be passed as part of the table name if you want to view information on an object outside of your default schema; space_used_sp does not require this and displays all objects with the name you provide, giving you the schema name as part of the output.

space_used_sp vs. sp_spaceUsed

Here it is.  (Note: everything is created in a schema called metadata.)

IF  EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[metadata].[space_used_vw]'))
DROP VIEW [metadata].[space_used_vw]
go
create view metadata.space_used_vw
-------------------------------------------
--SFIBICH
--7/7/2008
--The purpose of this view is to show space used broken
--down by SQL Server data files.  Currently is does not calculate
--space for fulltext indexes and or XML indexes.
--
-------------------------------------------
as
select
top 100 PERCENT
coalesce([schema_name],' -db level-') schemaName
,coalesce(table_name,' -schema level-') tableName
,coalesce(data_file_name,' -table level-') dataFileName
,[rowCount]
,reserved*8 reservedKB
,data*8 dataKB
,case when used_page_count>data then (used_page_count-data)*8 else 0 end indexKB
,case when reserved>used_page_count then (reserved-used_page_count)*8 else 0 end unusedKB
from
(
 SELECT  
 t.name table_name
 ,df.name data_file_name
 ,s.name [schema_name]
 ,SUM( 
    CASE 
  WHEN (p.index_id < 2) THEN row_count 
  ELSE 0 
    END 
 )  [rowCount]
 ,SUM(reserved_page_count)+coalesce(max(IT_reserved_page_count),0) reserved
 ,SUM( 
    CASE 
  WHEN (p.index_id < 2) THEN (in_row_data_page_count + lob_used_page_count + row_overflow_used_page_count) 
  ELSE lob_used_page_count + row_overflow_used_page_count 
    END 
 )data 
 ,SUM(used_page_count)+coalesce(max(IT_used_page_count),0) used_page_count
  FROM sys.dm_db_partition_stats p
 inner join
 sys.objects t
 on
 p.object_id=t.object_id
 inner join
 sys.schemas s
 on
 t.schema_id=s.schema_id
 inner join
 sys.indexes i
 on
 p.object_id=i.object_id
 and
 p.index_id=i.index_id
 inner join
 sys.database_files df
 on
 i.data_space_id=df.data_space_id
 left outer join
 (--Internal tables are placed on the same filegroup as the parent entity. (BOL)
  select
  pp.object_id
  ,min(pp.index_id) index_id
  ,sum(p.reserved_page_count) IT_reserved_page_count
  ,sum(p.used_page_count)  IT_used_page_count
  FROM
  sys.dm_db_partition_stats p
  inner join
  sys.internal_tables it
  on
  p.object_id = it.object_id 
  inner join
  sys.dm_db_partition_stats pp
  ON
  it.parent_object_id=pp.object_id
  WHERE
  it.internal_type IN (202,204) 
  group by
  pp.object_id
 )it
 on
 p.object_id=it.object_id
 and
 p.index_id=it.index_id
 group by
 s.name
 ,t.name
 ,df.name
 with rollup
) dataTable
order by
[schema_name]
,table_name
,[rowCount] desc
,data_file_name asc
go
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[metadata].[space_used_sp]') AND type in (N'P', N'PC'))
DROP PROCEDURE [metadata].[space_used_sp]
go
create procedure metadata.space_used_sp
-------------------------------------------
--SFIBICH
--7/7/2008
--This is a replacement for sp_spaceused, it utlizes the view
--[metadata].[space_used_vw].  Currently it only reports on tables
--and the total for tables in a schema, or database level broken
--down by database files.
--
-------------------------------------------
@tableName sysname =null
as
if @tableName is null
begin
--------------------
--
--database, schema info
--
--------------------
 select
 ' -db level-' schemaName
 ,' -schema level-' tableName
 ,dataFileName
 ,sum([rowCount]) [rowCount]
 ,sum(reservedKB) reservedKB
 ,sum(dataKB) dataKB
 ,sum(indexKB) indexKB
 ,sum(unusedKB) unusedKB
 from
 [metadata].[space_used_vw]
 where
 dataFileName <>' -table level-'
 or
 (
 dataFileName =' -table level-'
 and
 schemaName=' -db level-'
 and
 tableName =' -schema level-'
 )
 group by
 dataFileName
 order by [rowCount] desc,dataFileName
end
else
begin
--------------------
--
--particular table
--
--------------------
 select
 schemaName
 ,tableName
 ,dataFileName
 ,[rowCount]
 ,reservedKB
 ,dataKB
 ,indexKB
 ,unusedKB
 from
 [metadata].[space_used_vw]
 where
 tableName = @tableName
end

go
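
Once both objects exist, usage is straightforward (the table name below is just an example):

--database level rollup, broken down by data file
execute metadata.space_used_sp

--all objects with a given name, no schema prefix required
execute metadata.space_used_sp @tableName='primes'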

Ordering SSIS packages in Visual Studio 2005 (part II)

Ok, the first post on ordering SSIS packages in Visual Studio received a large number of hits...no comments, but a good number of hits, so I figured a follow-up was due.  After my first post I found myself ordering all of my SSIS packages with the order-SSISproj.ps1 code, but I found it to be a bother since I had to put in the location of each .dtproj file.  So with this second post I will introduce a script I wrote a few days ago that orders all of my SSIS packages.  It finds any .dtproj file under the default USERPROFILE + \my documents\ and orders them.  Before I get into the PowerShell script, a quick intro about running a PowerShell script: I'm not going to go into great detail other than to say you have to enable scripting in PowerShell.  Second, you have to take the code below, put it into a text file, and save it with a .ps1 extension.  (As always, you use this script at your own risk...and you should always understand what code you grab off a website is doing.)

order-SSISproj2.ps1

 #-----------------------------------------------------------------------------
#
#SFIBICH
#5/18/08
#Version 1.0
#order-SSISproj.ps1
#This script saves a backup file as a .bak
#This script will search the USERPROFILE environment variable + My documents
#as its starting place for .dtproj files
#This script allows args that will replace the default search path
#Use at your own risk, no guarantees
#-----------------------------------------------------------------------------
#
#$args[0] - relative path to the file name
#$a = path to this file when executed
#$c = counter variable
#$i = counter for progress bar
#$x = xml object variable
#$y = list of all dtproj files
#$z = list of dtspackages node items inside the xml document
#
#-----------------------------------------------------------------------------
$a=$^
$i=0
if ($args[0].length -gt 0) {
 $fileLocation=$args[0]
 if ($fileLocation -eq '?') {
  get-content $a | ? {$_.length -gt 0} | ? {$_.substring(0,1) -eq '#' }
  exit
 }
 else {
  date;$y=gci -path ($fileLocation) -include *.dtproj -recurse;date
 }
}else{
 write-warning('starting search for *.dtproj files at '+$env:USERPROFILE+'\My Documents\   this is a recursive search and may take a while')
 date;$y=gci -path ($env:USERPROFILE+'\My Documents\') -include *.dtproj -recurse;date
}
$y | % {
 $i+=1
 write-progress "re-ordering dtproj files- progress" "% complete" -perc ($i/$y.length*100)
 $_.Name
 [xml]$x = get-content $_.FullName
  $x.save($_.FullName+'.bak')
  $z=$x.project.dtspackages.dtspackage | sort-object -property name
  $c=0
  $z| % {if ($c -eq 0) {$c+=1} else {$x.project.dtspackages.RemoveChild($_)}} | out-null
  $z| % {if ($c -eq 1) {$c+=1} else {$x.project.dtspackages.AppendChild($_)}} | out-null
  $x.save($_.FullName)
}
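
Running it is simple; with no argument it searches the default My Documents location, or you can point it at a folder (the path below is a placeholder), and passing ? prints the comment block as help:

.\order-SSISproj2.ps1
.\order-SSISproj2.ps1 'C:\Projects\SSIS'
.\order-SSISproj2.ps1 '?'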

 

order-SSISproj2.ps1 (the walk thru)

Ok, the first few lines, 1 thru 22, are standard documentation.  This will be used later and is useful to anyone who wants to update or modify the code in the future.  Line 23 grabs the first token of the previous command line (the file name in this case) and assigns it to the variable $a.  Line 24 sets $i to a default value of 0.  Lines 24 thru 34 parse the arguments passed to the script to figure out first whether there are any arguments and, if there are, what they are.  If arguments were passed to the script and they happen to be '?' (the help command) then the contents of the file that was entered into the command are read (basically, go read the script file itself), sending the lines with comments to the screen for output.  If the argument passed to the script is not '?' then it is assumed to be a starting location for a search for *.dtproj files; that directory and every directory inside it are searched for .dtproj files, which are assigned to the array $y.  Otherwise the default search path is searched for *.dtproj files and those locations are assigned to the array $y.  Lines 38 thru 49 are where the bulk of the work happens.  Line 38 executes a for-each on the $y array, looping through the code block from line 39 to 49.  Line 39 increments the $i variable so we can show progress.  Line 40 brings up the progress bar in PowerShell and shows the percent complete, incrementing with each loop through this code block.  Line 41 writes the name of the file the script is working on to standard output.  Line 42 gets the content of the dtproj file for this loop and places it into an XML-cast variable $x.  Now that line of code is not mine, I can't remember where I saw it, but it is very slick!  Now I have an XML object to work with, full of data!  The first thing I do is save the xml file with the same file name and a .bak extension just in case I mess something up.  Line 43 grabs all of the nodes in the project.dtspackages node and places them, sorted, into the $z variable.  In line 45 I set the $c variable to 0 so I can check it later to know how many times I have gone through the inner loops.  Line 46 removes all except the first of the dtspackage node objects, based on the ordered set in $z.  I had to do this because when I removed all of the nodes the parent dtspackages node no longer existed and I couldn't add anything back to it.  Line 47 adds all of the dtspackage nodes back in sorted order, line 48 saves the file, and then on line 49 you loop back to start the next file.  That's it, short and sweet!

Ok and here is the before and after:

Before (notice the XX_XXX_DX packages at the bottom)

After (notice that the XX_XXX_DX packages are in the correct location)

Once again, I am not in any way a PowerShell expert, but I wanted to share a useful script that I developed for my own use with anyone who is working in an environment with a large number of SSIS packages in a single project or solution.  Let me know your thoughts!  If you have an improvement to the script, send it to me or post it!

 

 

Ordering SSIS packages in Visual Studio 2005

Problem: SSIS packages in Microsoft Visual Studio 2005 are ordered by package creation by default, or more precisely by the order in which they were added to the solution's/project's folder.  What this means is that if you have an SSIS project with more than a few SSIS packages, let's say 70+, it becomes an issue when you go to look for a particular package.  You can't just have a good naming scheme for your packages and hope to find them quickly (though that does help), because the packages are not in alphabetical order and MSVS 2005 does not give you an easy way to change the default order, so you are stuck searching through what is pretty much an un-ordered list.  Ok, so if you have encountered this problem, you know it's a real pain to deal with.

The Answer:  Visual Studio uses metadata in an xml file to know which packages are in your project and what you want these packages to be called.  Note:  The file names and the package names in the project do not have to be the same but they are by default.  This file is the [ProjectName].dtproj file located in the root of the project folder for any given SSIS project. 
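If you are curious what the script below will be working with, you can peek at a project file first; the path is a placeholder, and the project/dtspackages/dtspackage layout shown is simply the structure the script relies on:

#load a .dtproj file (it is just XML) and list the package entries it references, in their current order
[xml]$proj = get-content 'C:\Projects\SSIS\MyProject.dtproj'
$proj.project.dtspackages.dtspackage | % { $_.name }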

My Solution: Powershell to the rescue

#------------------------------------------
#
#SFIBICH
#5/16/08
#Version 1.0
#order-SSISproj.ps1
#This script saves a backup file as a .bak
#------------------------------------------
#$args[0] - relative path to the file name
#$c = counter variable
#$x = xml object variable
#------------------------------------------
if ($args[0].length -gt 0) {
 $fileLocation=$args[0]
 [xml]$x = get-content $fileLocation
 $x.save($fileLocation+'.bak')
 $y=$x.project.dtspackages.dtspackage | sort-object -property name
 $c=0
 $y| % {if ($c -eq 0) {$c+=1} else {$x.project.dtspackages.RemoveChild($_)}}
 $y| % {if ($c -eq 1) {$c+=1} else {$x.project.dtspackages.AppendChild($_)}}
 $x.save($fileLocation)
}else{
 write-warning('Please enter the location of a SSIS project .dtproj file (NO WORK EXECUTED)') 
}
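A sample call with a made-up project path; remember the script writes a copy of the original file with a .bak extension before it saves anything:

#re-order the packages in a single SSIS project file
.\order-SSISproj.ps1 '.\MyProject\MyProject.dtproj'

#if the result is not what you expected, the original is still sitting next to it:
#copy-item '.\MyProject\MyProject.dtproj.bak' '.\MyProject\MyProject.dtproj' -force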

 

That's it, I hope some of you find this useful!

trap [exception] { #insert code here} .... -ea stop

First off, I'm not The PowerShellGuy (if you want tons of PowerShell information go there first)...I'm just getting started with PowerShell, but I can already see how it can be extremely useful to a database developer and database administrator.  I plan on posting on PowerShell from time to time in this blog when I find something particularly useful to either of the previously mentioned roles, but for now I'm going to go over the TRAP statement in PowerShell, or more to the point what I have learned about how to use it.  The trap statement is pretty straightforward and is similar to an onError event.  The main thing I want to explain in this post is that any cmdlet whose errors you want to trap needs its ErrorAction (-ea) parameter set to stop.  That's it, that short, that sweet, but I had a hard time finding anything on the web or in three different PS books (not mentioned to protect the guilty) on how to properly execute a trap.  Below I give a simple example:


$NoSuchDirectory="c:\noSuchDirectory"

#nothing happens as expected....script continues
function noTrap
{
 get-childItem $NoSuchDirectory
}
noTrap


#nothing happens as expected(?)....script continues
function withTrap
{
 trap {
  #insert trap info here
  "Trapped Error! kind of"
 }
 get-childItem $NoSuchDirectory
}
withTrap

#error is trapped....script stops then continues
function withTrap_2
{
 trap {
  #insert trap info here
  "Trapped Error! kind of"
  continue;
 }
 get-childItem $NoSuchDirectory -ea stop

 #notice I've changed the ErrorAction on GCI to STOP from
 #the default continue, this is what allows the script to
 #trap the error.  If you change it to continue the trap
 #never fires
}
withTrap_2
"After the Trap"

 

 Now I just need to figure out how to get the SQL 2008 powershell snapins to port over to a machine that doesn't have SQL 2008 installed on it.

Posted by steve | 0 Comments

theSystem part X

Well, one of my topics, "theSystem", listed in a previous post (TwoProjects), has moved over to SQL Server Central.  I may post some more background information on this project here from time to time, but the majority of the writing will take place at SQL Server Central.  If you missed the second article, here is a link..."theSystem".

I'm still working on SOAK, but it's taken a backseat to the ramp-up of a large project at my job, developing some SQL Server training for iSeries programmers, and a speaking engagement (Roanoke Code Camp).

Posted by steve | 0 Comments

Connecting to an iSeries for SSIS Users (this pertains to any .Net connection)

There are a number of useful pieces of information to be gleaned from reams and reams of documentation on both sides of the house when it comes to connecting SSIS to an IBM iSeries.  I will try to present as much of it as I can in an easy to read fashion here.  I am going to start off with the basics: terminology.

General iSeries Terminology:

•  iSeries – name for the actual hardware; re-branded as System i
•  i5/OS – native operating system for the AS/400, previously called OS/400, running on System i hardware
•  i5 – the processor for the System i
•  DB2/400 – IBM's DB2 database running on the OS/400 (i5/OS) operating system.  This is different from DB2 UDB (IBM is not exactly clear on this sometimes)
•  System i can run i5/OS (OS/400), AIX, UNIX (other than AIX), Linux, and Windows (through the use of add-on cards)

i5/OS (AS400) Terminology:

•  DBFS – Database File System; the native AS/400 file system, where everything is an object in a database
•  IFS – Integrated File System; a Windows-compatible file system with folders and files
•  LPAR – Logical Partition; similar to a virtual server
•  iASP – Independent Auxiliary Storage Pool; allows the AS/400 to run multiple database instances, similar to a named SQL Server instance.  Not very common on this platform.  Also known as a catalog
•  Library – matches a schema on MS SQL Server.  Libraries are heavily used on the AS/400, much more so than schemas are on MS SQL Server
•  Files – tables
•  Logical Files ("logicals") – the MS SQL Server equivalent is an indexed view
•  Indexes – equivalent to MS SQL Server indexes
•  Journals and Journal Receivers – how the 400 handles commitment control (basically a log file that can be set at the object level)
•  QGPL – General Purpose Library; lots of things get dumped here by default, including SQL packages
•  SQL Packages – not 100% sure, but one is required to run a distributed SQL program and it handles the access plan

 

Well, that's it for now.  I'll give more specific examples of the IBM OLE DB and ODBC drivers, as well as the MS DB2 OLE DB driver, in future posts.

Posted by steve | 0 Comments

Data Dictionary Part II

Ok, after putting up my first post on the Data Dictionary I had poster's remorse.  A couple of simple changes make the script shown in the previous post a little more user friendly.  First, the view metadata.columnsDataDictionary needed to be changed from a SELECT using a pivot to a self join on metadata.columnsExtendedProperties, so that columns sharing the same domain can easily display different MS_Description values.  The second change was to give the procedure metadata.columnsExtendedPropertiesRepopulate the option of either an overall overwrite or just adding new values while retaining existing MS_Description extended property values.  I don't think I mentioned this earlier: the extended property 'MS_Description' is picked up by a number of development applications, so it is useful to stick your data dictionary value in there, but not to overwrite one that came supplied with, say, a third-party application installation, or one maintained by a third-party application that manages your descriptions in the database.

Well, that's it.  Here is the code with a simple example:

USE [steves]
GO
IF EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedProperties]'))
 DROP VIEW [metadata].[columnsExtendedProperties]
GO
IF EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[metadata].[columnsDataDictionary]'))
 DROP VIEW [metadata].[columnsDataDictionary]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[metadata].[dataDictionary]') AND type in (N'U'))
 DROP TABLE [metadata].[dataDictionary]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[testExtendedProperties]') AND type in (N'U'))
 DROP TABLE [dbo].[testExtendedProperties]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedPropertiesRepopulate]') AND type in (N'P', N'PC'))
 DROP PROCEDURE [metadata].[columnsExtendedPropertiesRepopulate]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [metadata].[dataDictionary](
 [domainName] [varchar](50) NOT NULL,
 [domainDesc] [varchar](1000) NOT NULL,
 PRIMARY KEY CLUSTERED ([domainName] ASC)
 WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
go
exec sp_addextendedproperty 'domain', 'domainName', 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainName'
exec sp_addextendedproperty 'domain', 'domainDesc', 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainDesc'
insert into metadata.dataDictionary (domainName,domainDesc)
values('domainName','Short text name for the classification of a business data type or attribute')
insert into metadata.dataDictionary (domainName,domainDesc)
values('domainDesc','Description of a business data type or attribute')
GO

create view [metadata].[columnsExtendedProperties]
as
----------------------------------------------
--Steve Fibich
--1.11.2007
--This view is used to match columns to their extended
--properties.  This is an extension of sys.extended_properties
--system view.
--
----------------------------------------------
select top 100 percent
 s.name as schemaName
 ,t.name as TableName
 ,c.name as columnName
 ,e.name as EName
 ,e.value as EDesc
 ,c.column_id as columnID
from
 sys.extended_properties e
 inner join sys.tables t
  on t.object_id=e.major_id
 inner join sys.columns c
  on e.minor_id=c.column_id
  and e.major_id=c.object_id
 inner join sys.schemas s
  on t.schema_id=s.schema_id
where
 class=1
order by
 s.name
 ,t.name
 ,c.column_id
 ,e.name
GO

create view [metadata].[columnsDataDictionary]
as
----------------------------------------------
--Steve Fibich
--1.11.2007
--This view is used to match columns to their extended
--properties.  It does this for the particular extended properties
--"domain" and "MS_Description", then joins the two result sets
--(full outer join) for a more useful layout.
--
----------------------------------------------
select top 100 percent
 cep1.schemaName
 ,cep1.tableName
 ,cep1.columnName
 ,cep1.eDesc as domain
 ,cep2.eDesc as MS_Description
from
 (
 select schemaName, tableName, columnName, eDesc, eName, columnId
 from [metadata].[columnsExtendedProperties]
 where EName='domain'
 ) cep1
 full outer join
 (
 select schemaName, tableName, columnName, eDesc, eName, columnId
 from [metadata].[columnsExtendedProperties]
 where EName='MS_Description'
 ) cep2
 on cep1.schemaName=cep2.schemaName
 and cep1.tableName=cep2.tableName
 and cep1.columnID=cep2.columnID
order by
 cep1.schemaName
 ,cep1.tableName
 ,cep1.columnID
GO

go
create proc metadata.columnsExtendedPropertiesRepopulate
----------------------------------------------
--Steve Fibich
--1.11.2007
--This proc removes any 'MS_Description' extended properties
--from any columns of any tables that have matching entries in the
--metadata.dataDictionary table and that have an extended property of 'domain'.
--This is to try to make keeping 'MS_Description' descriptions up to date easier.
--
--@overwrite=0 turns off overwriting so existing MS_Descriptions will be left alone
--@overwrite=1 existing MS_Description values will be deleted and re-created
--@debug=0 sets additional logging off
--@debug=1 sets additional logging on
--
----------------------------------------------
 @overwrite bit=0
 ,@debug bit=0
as
--if @debug=0
-- BEGIN
-- select @debug=debug from metadata.logging where schemaName='metadata'
-- END
declare @schemaName sysname
declare @tableName sysname
declare @columnName sysname
declare @domainDesc varchar(1000)

declare extProp_cursor cursor for
select
 e.schemaName
 ,e.tableName
 ,e.columnName
 ,dd.domainDesc
from
 metadata.dataDictionary dd
 inner join [metadata].[columnsExtendedProperties] e
  on dd.domainName=edesc
where
 ename='domain'

open extProp_cursor
Fetch next from extProp_cursor into @schemaName,@tableName,@columnName,@domainDesc
while @@fetch_status=0
BEGIN
 if @debug=1
 BEGIN
  select @schemaName,@tableName,@columnName,@domainDesc
 END
 --only drop an existing MS_Description when an overwrite has been requested
 if (exists (select * from ::fn_listextendedproperty('MS_Description', 'schema'
  , @schemaName, 'table', @tableName, 'column', @columnName)) and (@overwrite=1))
 BEGIN
  execute sp_dropextendedproperty 'MS_Description','schema'
   ,@schemaName,'table',@tableName,'column',@columnName
 END
 if not exists (select * from ::fn_listextendedproperty('MS_Description', 'schema'
  , @schemaName, 'table', @tableName, 'column', @columnName))
 BEGIN
  execute sp_addextendedproperty 'MS_Description',@domainDesc,'schema'
   ,@schemaName,'table',@tableName,'column',@columnName
 END
 Fetch next from extProp_cursor into @schemaName,@tableName,@columnName,@domainDesc
END
close extProp_cursor
deallocate extProp_cursor

-----------------------------------------
--Example Table and Extended Properties--
-----------------------------------------
go
create table dbo.testExtendedProperties
(
 column1 int
 ,column2 int
 ,column3 int
 ,column4 int
 ,column5 int
 ,column6 int
)
exec sp_addextendedproperty 'domain', 'domainName', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column1'
exec sp_addextendedproperty 'domain', 'domainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column2'
exec sp_addextendedproperty 'domain', '3rdDomainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column3'
exec sp_addextendedproperty 'domain', '4thDomainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column4'
exec sp_addextendedproperty 'domain', '3rdDomainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column5'
exec sp_addextendedproperty 'domain', '4thDomainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column6'
insert into metadata.dataDictionary (domainName,domainDesc)
values('3rdDomainDesc','This is the 3rd domain description that we have it should be on column3 and column5')
insert into metadata.dataDictionary (domainName,domainDesc)
values('4thDomainDesc','This is the 4th domain description that we have it should be on column4 and column6')
execute metadata.columnsExtendedPropertiesRepopulate
select * from [metadata].[columnsDataDictionary]
select * from [metadata].[columnsExtendedProperties]
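Since the procedure now takes parameters, here are a couple of follow-up calls against the example objects created above, showing the two overwrite behaviors described in the comments:

--re-run with extra logging; existing MS_Description values are left alone
execute metadata.columnsExtendedPropertiesRepopulate @overwrite=0, @debug=1

--force every matching MS_Description to be dropped and rebuilt from metadata.dataDictionary
execute metadata.columnsExtendedPropertiesRepopulate @overwrite=1

select * from metadata.columnsDataDictionary where tableName='testExtendedProperties'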

 

Posted by steve | 0 Comments
Attachment(s): DataDictionary.sql.txt

Data Dictionary part I

A data dictionary is an invaluable resource, not only for the DBA but for anyone, report writers or developers, who needs to work on a database that you have to manage or have developed yourself.  A good amount of time can be saved by not answering the same questions over and over again if you can point to documentation that covers the initial questions.  Microsoft provides a way to keep this information in each database without having to change the structure of any of the objects that already exist: extended properties are a valuable resource for keeping metadata close to the heart of the system, where it is needed.  In this series I will show you my data dictionary framework.  I do create one data dictionary table that contains information that works in conjunction with the MS extended properties system functions, procedures, and views.  I have a concept of "data domains", which is a way to relate columns in the same table or in different tables that contain the same logical information but may not support a foreign key relationship.  An example of this is state in a data warehouse or master data management system: some tables may use the USPS two letter code, other tables from a different system may use an ISO 3 digit numeric code, but in each case they represent the same information.  Connecting these two different data elements in the database can be incredibly useful to a developer, especially one who is new to your environment.  To facilitate this I create an extended property called "domain" on every column in the database and populate it with the name of the data domain; in the example above, the "domain" extended property would get a value of "state" in each of the two different tables.  I then populate my dataDictionary table with the domain "state" and a description for that domain.  I have a simple stored procedure that reads my dataDictionary table and places all of the domain descriptions in an extended property named "MS_Description" on every column that has a matching value in its "domain" extended property.  Finally, I have two simple views that extend the sys.extended_properties system view (a SQL 2005 object).  Below I have listed the code that will generate the aforementioned objects and a quick sample to show what it can do, followed by a short sketch of the "state" domain example.  (This code requires a metadata schema to exist.)

IF EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedProperties]'))
 DROP VIEW [metadata].[columnsExtendedProperties]
GO
IF EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[metadata].[columnsDataDictionary]'))
 DROP VIEW [metadata].[columnsDataDictionary]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[metadata].[dataDictionary]') AND type in (N'U'))
 DROP TABLE [metadata].[dataDictionary]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[testExtendedProperties]') AND type in (N'U'))
 DROP TABLE [dbo].[testExtendedProperties]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[metadata].[columnsExtendedPropertiesRepopulate]') AND type in (N'P', N'PC'))
 DROP PROCEDURE [metadata].[columnsExtendedPropertiesRepopulate]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [metadata].[dataDictionary](
 [domainName] [varchar](50) NOT NULL,
 [domainDesc] [varchar](1000) NOT NULL,
 PRIMARY KEY CLUSTERED ([domainName] ASC)
 WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
go
exec sp_addextendedproperty 'domain', 'domainName', 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainName'
exec sp_addextendedproperty 'domain', 'domainDesc', 'schema', 'metadata', 'table', 'dataDictionary', 'column', 'domainDesc'
insert into metadata.dataDictionary (domainName,domainDesc)
values('domainName','Short text name for the classification of a business data type or attribute')
insert into metadata.dataDictionary (domainName,domainDesc)
values('domainDesc','Description of a business data type or attribute')
GO

create view [metadata].[columnsExtendedProperties]
as
----------------------------------------------
--Steve Fibich
--1.11.2007
--This view is used to match columns to their extended
--properties.  This is an extension of sys.extended_properties
--system view.
--
----------------------------------------------
select top 100 percent
 s.name as schemaName
 ,t.name as TableName
 ,c.name as columnName
 ,e.name as EName
 ,e.value as EDesc
 ,c.column_id as columnID
from
 sys.extended_properties e
 inner join sys.tables t
  on t.object_id=e.major_id
 inner join sys.columns c
  on e.minor_id=c.column_id
  and e.major_id=c.object_id
 inner join sys.schemas s
  on t.schema_id=s.schema_id
where
 class=1
order by
 s.name
 ,t.name
 ,c.column_id
 ,e.name
GO

create view [metadata].[columnsDataDictionary]
as
----------------------------------------------
--Steve Fibich
--1.11.2007
--This view is used to match columns to their extended
--properties.  It does this for the particular extended properties
--"domain" and "MS_Description".  It then pivots these results for a
--more useful layout.
--
----------------------------------------------
select top 100 percent
 schemaName
 ,tableName
 ,columnName
 ,[domain]
 ,[MS_Description]
from
 (
 select schemaName, tableName, columnName, columnID, EName, EDesc
 from [metadata].[columnsExtendedProperties]
 ) p
PIVOT
(
 min(EDesc)
 for EName IN ([domain],[MS_Description])
) as pvt
order by
 tableName
 ,columnID
GO

go
create proc metadata.columnsExtendedPropertiesRepopulate
----------------------------------------------
--Steve Fibich
--1.11.2007
--This proc removes any 'MS_Description' extended properties
--from any columns of any tables that have matching entries in the
--metadata.dataDictionary table and that have an extended property of 'domain'.
--This is to try to make keeping 'MS_Description' descriptions up to date easier.
--
----------------------------------------------
 @debug bit=0
as
--if @debug=0
-- BEGIN
-- select @debug=debug from metadata.logging where schemaName='metadata'
-- END
declare @schemaName sysname
declare @tableName sysname
declare @columnName sysname
declare @domainDesc varchar(1000)

declare extProp_cursor cursor for
select
 e.schemaName
 ,e.tableName
 ,e.columnName
 ,dd.domainDesc
from
 metadata.dataDictionary dd
 inner join [metadata].[columnsExtendedProperties] e
  on dd.domainName=edesc
where
 ename='domain'

open extProp_cursor
Fetch next from extProp_cursor into @schemaName,@tableName,@columnName,@domainDesc
while @@fetch_status=0
BEGIN
 if @debug=1
 BEGIN
  select @schemaName,@tableName,@columnName,@domainDesc
 END
 if exists (select * from ::fn_listextendedproperty('MS_Description'
  , 'schema', @schemaName, 'table', @tableName, 'column', @columnName))
 BEGIN
  execute sp_dropextendedproperty 'MS_Description','schema'
   ,@schemaName,'table',@tableName,'column',@columnName
 END
 execute sp_addextendedproperty 'MS_Description',@domainDesc,'schema'
  ,@schemaName,'table',@tableName,'column',@columnName
 Fetch next from extProp_cursor into @schemaName,@tableName,@columnName,@domainDesc
END
close extProp_cursor
deallocate extProp_cursor

-----------------------------------------
--Example Table and Extended Properties--
-----------------------------------------
go
create table dbo.testExtendedProperties
(
 column1 int
 ,column2 int
 ,column3 int
 ,column4 int
 ,column5 int
 ,column6 int
)
exec sp_addextendedproperty 'domain', 'domainName', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column1'
exec sp_addextendedproperty 'domain', 'domainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column2'
exec sp_addextendedproperty 'domain', '3rdDomainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column3'
exec sp_addextendedproperty 'domain', '4thDomainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column4'
exec sp_addextendedproperty 'domain', '3rdDomainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column5'
exec sp_addextendedproperty 'domain', '4thDomainDesc', 'schema', 'dbo', 'table', 'testExtendedProperties', 'column', 'column6'
insert into metadata.dataDictionary (domainName,domainDesc)
values('3rdDomainDesc','This is the 3rd domain description that we have it should be on column3 and column5')
insert into metadata.dataDictionary (domainName,domainDesc)
values('4thDomainDesc','This is the 4th domain description that we have it should be on column4 and column6')
execute metadata.columnsExtendedPropertiesRepopulate
select * from [metadata].[columnsDataDictionary]
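To make the "domain" idea from the opening paragraph a little more concrete, here is the short sketch of the state example mentioned above; the table and column names are invented purely for illustration and assume the framework created by the script has already been built:

--two tables from different systems that store the same logical information in different codings
create table dbo.customerAddress (stateUSPS char(2))   --USPS two letter code
create table dbo.legacyOrders (stateISO char(3))       --ISO numeric code
go
exec sp_addextendedproperty 'domain', 'state', 'schema', 'dbo', 'table', 'customerAddress', 'column', 'stateUSPS'
exec sp_addextendedproperty 'domain', 'state', 'schema', 'dbo', 'table', 'legacyOrders', 'column', 'stateISO'
insert into metadata.dataDictionary (domainName,domainDesc)
values('state','State of the customer or order, regardless of the coding system used to store it')
go
--both columns now pick up the same MS_Description and show up together in the data dictionary view
execute metadata.columnsExtendedPropertiesRepopulate
select * from [metadata].[columnsDataDictionary] where [domain]='state'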

 

I will be revising this data dictionary system in future posts; I already have an issue with the way the columnsDataDictionary view functions.  As you may have noticed, columnsExtendedPropertiesRepopulate overwrites any "MS_Description" values already in place.  This will be enhanced in future versions.
