Design Tip #107 Using the SQL MERGE Statement for Slowly Changing Dimension Processing
Most ETL tools provide some functionality for handling slowly changing dimensions. Every so often, when the tool isn’t performing as needed, the ETL developer will use the database to identify new and changed rows, and apply the appropriate inserts and updates. I’ve shown examples of this code in the Data Warehouse Lifecycle in Depth class using standard INSERT and UPDATE statements. A few months ago, my friend Stuart Ozer suggested the new MERGE command in SQL Server 2008 might be more efficient, both from a code and an execution perspective. His reference to a blog by Chad Boyd on MSSQLTips.com gave me some pointers on how it works. MERGE is a combination INSERT, UPDATE and DELETE that provides significant control over what happens in each clause. This example handles a simple customer dimension with two attributes: first name and last name. We are going to treat first name as a Type 1 and last name as a Type 2. Remember, Type 1 is where we handle a change in a dimension attribute...