This article explores the SSIS Conditional Split Transform task to split data into multiple destinations based on the specified conditions.
Introduction
We apply different logics on data in SQL Server tables to fulfill the requirements of the end-users. We might apply different analytical, logical, arithmetic conditions and logic on data. Let’s say you have a requirement to split the data into multiple tables depending upon some conditions. You can use a T-SQL function to write, but it might take time to do the development work. You need to rework on the code in case there are logic changes.
SSIS always comes handy in such situations and provides a solution for us with minimal complexity. Further, we can use it to take inputs from multiple data sources such as Excel, CSV, OLE DB, ODBC, TXT, etc. We can also use it to prepare output in different formats without writing complex programming codes.
Environment Details
- We will use the AdventureWorks2017 sample database in this article
- You should have SQL Server Data Tools installed to prepare the SSIS package. You can go over here, Download and install SQL Server Data Tools (SSDT) for Visual Studio to install it
Requirement
We have Employee data in the AdventureWorks2017 database. It contains a view vEmployee to fetch employee records.
Execute the following query to view the sample data.
1 2 |
SELECT * FROM [AdventureWorks2017].[HumanResources].[vEmployee]; |
We want to split data based on the CountryRegionName column in different tables.
- For employees belonging to the United Kingdom, insert data into the table [United Kingdom]
- For employees belonging to the United States, insert data into the table [United States]
- For employees belonging to Germany, insert data into the table [Germany]
- The rest of the data (not satisfying above conditions) should go to the [default] table
Create an SSIS package to split data based on conditions
Add an SSIS Conditional Split Transformation to the Data Flow
Let’s use the SSIS package to satisfy the above conditions and split data in multiple tables.
-
Open SQL Server Data Tool and go to File->New -> Project
It opens the following New Project Window. Select the Integration Service project and assign an appropriate name to the project. You can also specify a location where the project will save
-
Click Ok, and it prepares the project and opens the following project window
-
Click on SSIS Toolbox (left-hand side) and drag the Data Flow Tasks to the Control Flow area
-
Double click on the Data Flow Tasks, and it moves to Data Flow Tab as per the following screenshot
-
In the Data Flow area, drag the OLE DB Source from the SSIS Toolbox. We use OLE DB Source because our sample data is in the SQL Server Database tables
-
Right-click on the OLE DB Source and rename the task with an appropriate name. You can skip this step if you do not want to rename the source
-
Right-click on the Source data and click on Edit. It opens the OLE DB Source Editor. In this editor, we need to define the SQL Server instance connection, database and object (table\view) details
We do not have any existing connections in this SSIS package. Click on the New, provide SQL Instance details, and select the database in which an object exists. You can also click on Test Connection to verify that connection is successful.
If the Test Connection is successful, you get the following message.
Select the table or view from the drop-down list.
Click Ok, and you can see the Source data task does not contain any cross icon. It shows that configuration is successful for this task.
Add a SSIS Conditional Split Transformation to the Data Flow
It is the central part of this article. We need to add a Conditional Split Transformation task in the SSIS package to split the data. The Conditional Split Transformation task checks for the specified condition. It moves the data to an appropriate destination depending upon the condition.
Drag the SSIS Conditional Split task from the SSIS Toolbox.
Now, Connect the Source Data (OLE DB Source) to the Conditional Split transformation task. To connect, drag the green arrow from the Source Data to the SSIS Conditional Split as shown in the following image.
Now, we need to define the conditions in the SSIS Conditional Split transformation task. Double click on the Conditional Split and it opens the following Conditional Split transformation Editor.
The Conditional Split transformation Editor has three sections.
- We can use system variables and query output columns in the expressions to split the data flow
- We can define the various functions such as Mathematical, Date & Time function, Logical function, Null function
- In this section, we define the conditions to define the split of data. We should be careful in defining the conditions and these conditions should not overlap with each other. If the data satisfy one, it should not satisfy other conditions
Specify the conditions in the SSIS Conditional Split transformation Editor. We need to use equal operator (==) in the split conditions and values in the double quotes as shown below.
The specified conditions are as follows:
-
For United Kingdom employees, insert data into the table [United Kingdom]
- CountryRegionName == “United Kingdom”
-
For United States employees, insert data into the table [United States]
- CountryRegionName == ” United States ”
-
For Germany employees, insert data into the table [Germany]
- CountryRegionName == ” Germany”
- The rest of the data (not satisfying above conditions) should go to the [default] table
Adding Data Flow Destinations to the Data Flow
After the SSIS Conditional Split Transformation task, we need to add a destination path for each of the split conditions data.
Drag the OLE DB Destination task in the data flow from the SSIS toolbox.
Drag the green arrow from the Conditional Split to OLE DB Destination. It opens the following window.
In the output, select the output name defined in the split conditional transformation task earlier.
For a better understanding, I renamed the OLE DB Destination as the United Kingdom.
You can still see a Red Cross icon in the destination United Kingdom.
Double click on the United Kingdom task and it opens the following OLE DB Destination editor.
If we have an existing table in which we want to insert data, we can select the table from the drop-down list. Otherwise, click on the New, and it shows the script as per the input data.
Click OK, and it shows the name of the destination table as shown in the following image.
Click on the Mappings, and you can view the mapping between the input and output columns. You can make a change in the column mappings if required.
Click Ok, and you can see that conditional task specifying condition 1.
You can follow similar steps and configure the OLE DB destination to satisfy other conditions as well.
The complete SSIS package looks as per the following screenshot.
Let’s execute the SSIS package and see how it works. For executing the package, click on the Start.
You can see a green tick on each task, and it shows the package is successful.
In the screenshot above, we also note the following.
- Total Input Rows: 290
- Conditional Split for the United Kingdom: 1
- Conditional Split for the United States: 284
- Conditional Split for Germany: 1
- Data that does not meet any specified condition: 4
Suppose we want to further split the data for the United States based on the column StateProvinceName column. The conditions on which we want to split United Data are as follows.
Split data for Washington, California, and others in a separate table.
In the following screenshot, we have another SSIS Conditional Split transformation task to further split the United States data as per the condition.
Execute the SSIS package now, and we can see that United States data (Row 284) further divides into the following values.
- California: 2
- Washington: 275
- Others: 7
Conclusion
In this article, we explored SSIS Conditional Split Transformation to split the data as per specified conditions. We can use it for the source and destination as SQL Server tables. We can use multiple inputs and output formats such as flat files, spreadsheets or any destination supported by SSIS. You should explore and be familiar with this task to do the data transformation without any complicated programming.
- Understanding PostgreSQL SUBSTRING function - September 21, 2024
- How to install PostgreSQL on Ubuntu - July 13, 2023
- How to use the CROSSTAB function in PostgreSQL - February 17, 2023