SSIS and PowerShell – A Powerful Combination

PowerShell is a powerful task automation tool from Microsoft. Although SSIS does not provide an Execute PowerShell Script Task out of the box, it does have an Execute Process Task that can be used to run PowerShell scripts just as easily.

On a recent project, I was tasked with downloading Federal Election Commission (FEC.GOV) Committee and Candidate election data and updating it into Dynamics GP vendor maintenance and eOne Solutions Extender tables. To accomplish this, I created a PowerShell script to download the zip files from the FEC.GOV web site, called that script from an Execute Process Task in an SSIS package, and then added tasks to un-zip the files and import them into custom SQL tables for use with a SmartConnect integration into Dynamics GP. The final SSIS package looked like the screen shot below.

SSIS Package Control Flow

Here are the detailed steps to create this SSIS package:

1.  Create the two custom tables

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[FEC_Candidate](
    [Candidate_ID] [varchar](9) NOT NULL,
    [Candidate_Name] [varchar](200) NULL,
    [Party_Affiliation] [varchar](3) NULL,
    [Year_of_Election] [numeric](4, 0) NULL,
    [Candidate_State] [varchar](2) NULL,
    [Candidate_Office] [varchar](1) NULL,
    [Candidate_District] [varchar](2) NULL,
    [Incumbent_Challenger_Status] [varchar](1) NULL,
    [Candidate_Status] [varchar](1) NULL,
    [Principal_Campaign_Committee] [varchar](9) NULL,
    [Street1] [varchar](35) NULL,
    [Street2] [varchar](35) NULL,
    [City] [varchar](35) NULL,
    [State] [varchar](2) NULL,
    [Zip] [varchar](9) NULL
) ON [PRIMARY]
GO

SET ANSI_PADDING OFF
GO

CREATE TABLE [dbo].[FEC_Committee](
    [Committee_ID] [varchar](50) NOT NULL,
    [Committee_Name] [varchar](200) NULL,
    [Treasurer_Name] [varchar](100) NULL,
    [Street1] [varchar](50) NULL,
    [Street2] [varchar](50) NULL,
    [City] [varchar](50) NULL,
    [State] [varchar](50) NULL,
    [Zip] [varchar](50) NULL,
    [Committee_Designation] [varchar](50) NULL,
    [Committee_Type] [varchar](50) NULL,
    [Committee_Party] [varchar](50) NULL,
    [Filing_Frequency] [varchar](50) NULL,
    [Interest_Grp_Category] [varchar](50) NULL,
    [Connected_Org_Name] [varchar](200) NULL,
    [Candidate_ID] [varchar](50) NULL
) ON [PRIMARY]
GO

SET ANSI_PADDING OFF
GO

2.  Delete Extracted FEC Files – Delete the text files extracted from the previous download

a.  Add a Script Task to the Control Flow

b.  Add C# script to delete the previously extracted files – Give the Script Task the name “Delete Extracted FEC Files”, then edit it and choose the script language (C# or VB.NET). I used C# for my solution.

Script Task Editor

C# delete files script

public void Main()
{
    // Requires "using System.IO;" at the top of the script for FileInfo
    string directoryPath = @"D:\KTL\FEC_Downloads\Extract";

    // Find the text files extracted during the previous run and delete them
    string[] oldFiles = System.IO.Directory.GetFiles(directoryPath, "*.txt");
    foreach (string currFile in oldFiles)
    {
        FileInfo currFileInfo = new FileInfo(currFile);
        currFileInfo.Delete();
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}

3.  Truncate FEC tables – Truncate the custom tables to clear out the data from the previous download (see the example below)
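An Execute SQL Task works well for this step. A minimal sketch of the statement it runs is below, assuming the two custom tables created in step 1 (adjust the database, schema and table names to match your environment):

-- Clear out the prior download before loading the new FEC files
TRUNCATE TABLE [dbo].[FEC_Candidate];
TRUNCATE TABLE [dbo].[FEC_Committee];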

4.  Download FEC files from Website – Use the PowerShell script below with an Execute Process Task to download the updated FEC data

a.  Create PowerShell script and save as a .ps1 file

Invoke-WebRequest ftp://ftp.fec.gov/FEC/2018/cm18.zip -OutFile D:\KTL\FEC_Downloads\Archive\Committee.zip
Invoke-WebRequest ftp://ftp.fec.gov/FEC/2018/cn18.zip -OutFile D:\KTL\FEC_Downloads\Archive\Candidate.zip
Invoke-WebRequest ftp://ftp.fec.gov/FEC/2018/ccl18.zip -OutFile D:\KTL\FEC_Downloads\Archive\Link.zip

b.  Add PowerShell.ps1 to SSIS Execute Process task – Add the SSIS Execute Process Task to the Control Flow, add PowerShell.exe to the Executable field and -F D:\KTL\FEC_Downloads\FEC_Download.ps1 to the Arguments field. The Arguments value will need to change based on your file location and script name.

Execute Process for PowerShell script
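Note: depending on the PowerShell execution policy on the server running the package, you may also need to include -ExecutionPolicy Bypass ahead of the -F argument (for example, -ExecutionPolicy Bypass -F D:\KTL\FEC_Downloads\FEC_Download.ps1) so that the script is allowed to run.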

5.  Foreach loop to Extract the zip files

a.  Add a Variable to the SSIS package – Add a string variable named “ZipFullPath” to the package. (The extraction script below also uses a package parameter named InputFolder, which should point to the folder the files are extracted to.)

SSIS Package variable

b.  Add a Foreach Loop to the Control Flow and then configure the Collection and Variable Mappings in the Foreach Loop Editor as shown below

Foreach Loop folder location

Foreach Loop variable
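In short, the Collection page should use the Foreach File Enumerator pointed at the folder the zip files were downloaded to (D:\KTL\FEC_Downloads\Archive in this example) with a *.zip file specification and the “Fully qualified” retrieve file name option, and the Variable Mappings page should map the User::ZipFullPath variable to index 0.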

6.   Add a Script Task inside the Foreach Loop

a.  Inside the Foreach Loop, add the Script Task and open the editor. Use C# and add the User::ZipFullPath variable and the $Package::InputFolder parameter to the ReadOnlyVariables property, as shown below.

C# Script Editor

b.  Click on the Edit Script and add the following script

public void Main()
{
    // Requires "using System.IO;" and "using System.IO.Compression;" plus a reference
    // to the System.IO.Compression.FileSystem assembly in the script project
    string zipfullpath = Dts.Variables["User::ZipFullPath"].Value.ToString();
    string inputfolder = Dts.Variables["$Package::InputFolder"].Value.ToString();

    // Extract every file in the current zip archive to the input folder
    using (ZipArchive arch = ZipFile.OpenRead(zipfullpath))
    {
        foreach (ZipArchiveEntry entry in arch.Entries)
        {
            entry.ExtractToFile(Path.Combine(inputfolder, entry.FullName));
        }
    }

    // Remove the zip file once its contents have been extracted
    File.Delete(zipfullpath);

    Dts.TaskResult = (int)ScriptResults.Success;
}

The final script should look like this:

C# zip extract script

7. Add a Data Flow task to the Control Flow – Add a Data Flow task with two data flows: one from the Committee text file and one from the Candidate text file, each loading into the corresponding custom table created in the first step.

Data Flow screen shot

Conclusion

I hope this helps show some of the possibilities that open up when you combine PowerShell with SSIS.


Sharing the Wealth – Sorry not $$$ just Knowledge

Recently, I created several Power BI visuals on Dynamics GP data and had a request to share the SQL scripts I used to import data into my Power BI model’s datasets. The previous Power BI datasets and visuals included:

  • General Ledger financial statement and analysis
  • Inventory reorder point analysis
  • Procurement analysis
  • Sales summary
  • Customer analysis
  • Sales Return (RMA) analysis

Most of the scripts were SQL views and can be used to import data into Power BI or with SSIS to load a data mart. Some SQL scripts will need to be changed to meet your needs or fit your environment. The zip file can be found here. Hope this helps with your quest for knowledge.

Real Time Data Warehouse ETL

Part 2 – Implementing Service Broker as a Real Time ETL Tool – The Code

To implement my Service Broker ETL API, I started with the basic code/framework provided by Eitan Blumin in this blog post: http://www.madeiradata.com/service-broker-asynchronous-triggers/. He provides a good basic framework for setting up Service Broker and an explanation of the installation scripts. To learn more about the basic framework and the scripts used to implement Service Broker, please refer to his post.

In this post, I’m going to provide his scripts for the basic installation, with some modifications to work with the Dynamics GP database and Solver’s BI360 data warehouse, plus my modified stored procedure, usp_AT_iGLETL, which provides the logic of the ETL process from Dynamics GP’s general ledger table, GL20000, to Solver’s BI360 staging table.

General Workflow

Eitan’s general Service Broker workflow looks like this:

1. The user performs an INSERT or UPDATE operation on a table. In my case, with the modifications in place, posting a General Ledger batch within Dynamics GP is the operation that inserts rows into the GL20000 table.

2. An AFTER INSERT trigger on the table is fired. This trigger compiles the contents of the INSERTED and DELETED tables into XML parameters, creates a Service Broker message and sends it to a queue. The original transaction returns immediately. The modifications to the AFTER INSERT trigger pass the name of the usp_AT_iGLETL stored procedure to the SB_AT_Fire_Trigger stored procedure (see the trigger sketch after this list).

3. The Service Broker service fires up and processes the messages in the queue independently of the original transaction. It opens a transaction that pulls the message out of the queue, executes the relevant stored procedure with the XML data previously taken from the INSERTED and DELETED tables, and applies the ETL logic, in this case the logic inside usp_AT_iGLETL.
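To make the workflow above concrete, here is a simplified sketch of the kind of AFTER INSERT trigger used on GL20000. The trigger name and the parameter names on SB_AT_Fire_Trigger are simplified placeholders here; refer to Eitan’s framework and the downloadable scripts below for the exact signature.

CREATE TRIGGER [dbo].[SB_AT_i_GL20000] ON [dbo].[GL20000]
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Capture the newly posted GL rows (and any deleted rows) as XML
    DECLARE @InsertedXML XML = (SELECT * FROM INSERTED FOR XML PATH('row'), ROOT('inserted'), TYPE);
    DECLARE @DeletedXML  XML = (SELECT * FROM DELETED  FOR XML PATH('row'), ROOT('deleted'),  TYPE);

    -- Queue a Service Broker message and name the stored procedure that will
    -- process it asynchronously (parameter names are simplified placeholders)
    EXEC dbo.SB_AT_Fire_Trigger
         @ProcedureName = N'usp_AT_iGLETL',
         @InsertedXML   = @InsertedXML,
         @DeletedXML    = @DeletedXML;
END
GO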

Service Broker Framework Installation and Usage

The installation scripts provided below implement a generic Service Broker framework, which allows you to use it for any table without any changes to the installed objects. The framework works like this:

1. First, run the “Step1_SB_AT_Installation” script on the Dynamics GP company database where you want the asynchronous triggers.

2. Run the “Step2_GP_GLTable_Trigger” script on the Dynamics GP company database to create the AFTER INSERT trigger on the GL20000 table.

3. Run the “Step3_GP_ETL_Logging” script to create the basic framework for logging the Service Broker ETL process and error logs.

4. Run the “Step4_BI360_f_Trans_GL_Staging_Table” script to create the staging table in the BI360DW database.

5. Run the “Step5_ETL_StoredProcedure” script on the Dynamics GP company database to create the stored procedure, usp_AT_iGLETL, with the ETL logic (a simplified sketch of that logic follows below).
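For reference, the heart of usp_AT_iGLETL simply shreds the INSERTED XML back into rows and loads them into the BI360 staging table. The sketch below is heavily simplified: the GL20000 source columns (JRNENTRY, ACTINDX, TRXDATE, DEBITAMT, CRDTAMNT) are standard Dynamics GP fields, but the staging table and its column names are placeholders, so map them to the table actually created by the Step4 script.

CREATE PROCEDURE [dbo].[usp_AT_iGLETL]
    @InsertedXML XML,
    @DeletedXML  XML
AS
BEGIN
    SET NOCOUNT ON;

    -- Shred the posted GL detail out of the XML and load the staging table
    -- (f_Trans_GL_Staging and its columns are placeholder names)
    INSERT INTO BI360DW.dbo.f_Trans_GL_Staging
        (JournalEntry, AccountIndex, TrxDate, DebitAmount, CreditAmount)
    SELECT
        r.value('(JRNENTRY)[1]', 'int'),
        r.value('(ACTINDX)[1]',  'int'),
        r.value('(TRXDATE)[1]',  'datetime'),
        r.value('(DEBITAMT)[1]', 'numeric(19,5)'),
        r.value('(CRDTAMNT)[1]', 'numeric(19,5)')
    FROM @InsertedXML.nodes('/inserted/row') AS x(r);
END
GO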

Download Code

You should now have a “Near Real Time ETL” process in place to handle the loading of data from Dynamics GP to the staging table within Solver’s BI360DW data warehouse database. You could then use the same logic and scripts above to handle the loading of the data from the staging table to the f_Trans_GL table within the BI360DW database. In my tests of the implementation outlined above, I was able to post a 350-line batch within Dynamics GP and load the staging table within five seconds.

Conclusion

Hope this helps you gain a basic understanding of installing and using Service Broker as an ETL tool for loading your data warehouse.

Using BI360 and SSIS to Automate a Daily Flash Report

Have you ever had to link multiple spreadsheets together to generate a “readable” or “useable” report? Back in a previous life/career, I was a Director of Finance at a casino, where I provided Daily Flash Reports to management. The reports were quite detailed, with slot, table games and restaurant sales data, and provided KPIs and a summary income statement for the day, month-to-date and year-to-date figures for the current year, previous year and budget (screen shot below).

Daily Flash Report screen shot

This was a huge undertaking that linked three detailed Excel spreadsheets (current year actuals, previous year actuals and current year budget) to one summary Excel spreadsheet. Each day’s revenue and expense numbers had to be keyed manually into the current year actuals spreadsheet, and entering the detailed transactions took hours to complete. Then more time was needed to review the final output, make sure links were not broken, confirm formulas were totaling correctly and check for data entry errors in the linked spreadsheets. In total, I would say we spent 3 to 5 man-hours every day producing the report before it was sent to management. At the end of the year, we would spend another 30 to 40 hours setting up the spreadsheets for the next year: updating the previous year actuals spreadsheet with the year that was about to become the previous year, adding in the next year’s budgeted numbers and clearing out the current year actuals spreadsheet. That’s a whopping 1,865 hours per year spent on producing one report!

Now I jump into the DeLorean with Marty McFly to get back to the present, and I find BI360 with its Data Warehouse capabilities. If I still worked at the casino, I would set up dimensions in the BI360 Data Warehouse for the slot machine data, table game data and restaurant data. Then I would populate the daily detail from the slot accounting system, restaurant POS system and other systems into a fact table with SSIS (SQL Server Integration Services), providing the same end result without the huge manual process of the past. This same slot machine data could also be used to meet gaming regulation requirements to analyze slot machine payout volatility or PAR percentage. (For those of you not from the casino industry, volatility or PAR percentage analysis is a way to statistically analyze the performance of the slot machine’s programmed customer win percentage to make sure it is performing within authorized manufacturer specs based on the slot machine’s current play life cycle.) After the initial work of creating the SSIS package and designing the Daily Operating Report (DOR), I estimate that I would probably spend one hour per day on the DOR, for a savings of roughly 1,500 hours per year. What a time saver that would have been, and it would have more than made up for the cost of the software!