Whilst discussing the evolution of SQL Server Data Tools in my previous article, I touched on a new feature, Package Parts, which is available in SSDT 2015. This article takes an in-depth analysis of the Package Parts feature, particularly, as it relates to Data Warehouse development.
Background
There are various approaches to designing a reusable SSIS package, for instance, you could create your own package template, implementing parent-child package execution using Execute Package Task, or simply copy and paste an existing package. However, just because your SSIS solution is reusable doesn’t necessarily mean that it’s easily maintainable too. Sometimes as an ETL developer, you need to make changes to your SSIS solution because of structural changes from source, change of connection string etc. and depending on how you have designed your solution, this process of refactoring your ETL could range from bearable to agonising. In the following sections of this article, we discuss and illustrate how Package Parts can be used to create reusable and maintainable SSIS solutions that could later make your ETL refactoring experience, very pleasant.
The Traditional Package Template
In order to understand and appreciate the Package Part feature in SSIS 2016, I will demonstrate some of the limitations we encountered in the traditional package template. A package template allowed you to create a package with shared controls that can later be used as a template for creating other packages. Say for instance you have a requirement to setup a data mart that stores fruit-sales data. Such a data mart might contain the following fact and dimensions:
- Fruits dimension
- Customer dimension
- Method Of Payment dimension
- Fruit Category
- Fruit Sales Fact Table
Populating the abovementioned dimensions may involve similar processes (i.e. use Execute SQL Task to clear staging table, generate surrogate keys, send out error emails) in which case creating a package template would be useful. A typical package template that can be used to load the fruit-sales data mart is shown in Figure 1.
The next part involves saving and copying the package template developed into a drive location where Visual Studio is installed. The folder that you need to copy the template to is called DataTransformationItems and depending on your version of Visual Studio installed (in my case, I am using Visual Studio 2013), the path of the drive location should be as follows:
{Your Drive Letter}:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\PrivateAssemblies\ProjectItems \DataTransformationProject\DataTransformationItems
Once you have located the correct drive location, then copy and paste the package template into the folder as shown in Figure 2.
After you have closed and re-launched SSDT, the next time you add a new item into an SSIS project, the package template you added in Figure 2 will come up as one of the template options for your new item as shown in Figure 3.
Now that I have a template, we can use it to create the rest of the packages that will load our fact and dimensions for the fruit-sales data mart. The complete ETL for loading our data mart is shown in Figure 4.
The greatest limitation with the traditional package template comes in when we need to refactor our SSIS solution. Say for instance we later realise that for some reasons our staging database has been changed from AdvetureWorks2014 to ReportServer. Well, to ensure that any packages created using the template has the correct database reference, we will update our package template such that the OLE DB connection shows the correct database as shown in Figure 5.
To check that the ensure is effective, I created a new dimension called SomeNewDimension.dtsx and as it can be seen in Figure 6, SomeNewDimension.dtsx has ReportServer connection name instead of AdvetureWorks2014.
But, what happens to packages created with an older version of the template? Will they still have references to the old connection? The answer is, unfortunately, yes – as shown in Figure 7.
This means that in order to ensure that the old packages have the correct connection, we will have to open each individual package and update the connection (that’s where the pain comes in, as in a case of a solution with 100 packages would mean opening and editing each package). Another alternative would be to delete all old packages and recreate them using the new template – either way, you will have to manually work on those 100 packages!
Package Parts
Package Parts enable us to produce package templates that are easier to manage and reuse. As from SSDT 2015, every SSIS project has a new Package Parts node, which in turn, has a Control Flow sub-node, as shown in Figure 8.
It is in this Control Flow sub-node that we get to create Package Parts by either adding an existing file-based Package Part or simply create one from scratch, as per options shown in Figure 9.
Working on Package Parts is similar to the way you work on SSIS Packages except that Package Parts have their own file extension, dtsxp. In Figure 10, we have created a Package Part that mirrors the traditional package template that was created earlier on.
Now that we have created out Package Part, we have to add it in an SSIS Package. Unfortunately, you cannot just click and drag the Package Part from Solution Explorer into the SSIS Package’s Control Flow area, instead as soon as you have created and saved the Package Part, it will come up under Package Parts node in SSIS Toolbox, as shown in Figure 11 – wherein you can click and drag it onto the SSIS Package’s Control Flow area.
Figure 12 shows an SSIS Package that uses a Package Part. As it can be seen, Package Part’s component have a letter P on the right hand corner of that component’s name.
Now to demonstrate as to how changes to the Package Parts gets applied across packages referencing it, we will attempt to make changes to the Package Parts’ connection name. The existing version of the Package Part has the AdvetureWorks2014 connection name, as can be seen in Figure 13 from one of the SSIS Packages that are referencing our Package Part, PackageTemplate.dtsxp.
Once the connection name of PackageTemplate.dtsxp is changed to ReportServer, we can see that such a change has been applied on SSIS Packages referencing the Package Part as shown in Figure 14.
Limitations of Package Parts
One of the things you would have noticed when we replicated the traditional package template into Package Parts was that in the Package Part we had grouped all the executables into a Sequence Container. This is because, although you can get away with adding multiple executables into a package Part without using a Sequence Container, when you attempt to add them from SSIS Toolbox you run into an error message displayed in Figure 15 as apparently Package Parts with multiple executables cannot be added into an SSIS Package.
I can almost hear what you are thinking, “So Microsoft, why allow me to create such a Package Part with multiple executables outside the Sequence Container, if you are going to prevent me from using it?” Personally, I think this is a bug!
Remember how a change to the Package Part gets applied across all packages who references that Package Part, well, if one of the SSIS Packages that references the Package Part is opened at the time of editing the Package Part, then when you switch context back to that SSIS Package, you will run into a message displayed in Figure 16. It basically means that you will have to manually refresh all SSIS Packages that were opened during updating of the Package Part. So make sure that you close all packages prior to making changes to Package Parts.
The rest of the limitations of Package Parts include the following:
- Project connections are not visible to Package Parts, thus can’t be used
- You also cannot parameterise you connections within Package Parts
- Unless you use variables, some Package Parts’ executables are difficult to customize directly from SSIS Packages i.e. data flow task, script task etc.
- You cannot debug a Package Part from the designer
Conclusion
Package Parts introduces another alternative to creating reusable and maintainable SSIS solutions. Unfortunately, it has its own sets of shortcomings which I hope will be resolved in subsequent releases of SQL Server.