Migrating Data to SQL Azure Using SSIS

Jump to: navigation, search

Contents

SQL Server Integration Services

SQL Server Integration Services (SSIS) is a data-integration and workflow-solutions platform, providing ETL (Extract, Transformation, Load) solutions for data warehousing as well as extractions and transformations. With its graphical tools and wizards, developers often find that SSIS is a quick solution for moving data between a source and destination. As such, it's a great choice for migrating data between a local database and a SQL Azure database. Notice, however, that the previous sentence says data. When you're using SSIS, the database and tables must already exist in SQL Azure.

If you're familiar at any level with SSIS, you're probably wondering why it has the limitation of only moving data. Several SSIS tasks can provide the functionality of moving objects as well data, such as the Transfer SQL Server Objects task. When asked about this task, Microsoft replied that SSIS relies on SMO (SQL Server Management Objects) for this task, and SMO doesn't currently support SQL Azure. In addition, some of the SSIS connection managers use SMO and therefore are limited when dealing with objects. Thus, the current solution is to create databases and tables using straight SQL and then use SSIS to do the actual data transfer. This article illustrates how to use SSIS to migrate your data from on-premise SQL to SQL Azure.

Creating an Integration Services Project

To create your project, follow these steps:

1. Fire up Business Intelligence Development Studio (BIDS) by choosing Programs ~TRA Microsoft SQL Server 2008 ~TRA Business Intelligence Development Studio.

2. When BIDS opens and the New Project dialog displays, select Business Intelligence Projects from the list of project types, and then select Integration Services Project, as shown in Figure 1. Click OK.

Figure 1 - Creating a new SSIS project

Figure1. Creating a new SSIS project

You now see an SSIS package designer surface. This surface has several tabs along the top: Control Flow, Data Flow, Event Handlers, and Package Explorer, shown in Figure 2. This example uses the Control Flow and Data Flow tabs.

Figure 2 - Control Flow Tab

Figure 2 - Control Flow Tab

In Visual Studio, select View Toolbox. The Toolbox contains a plethora of what are called tasks, which are control and data-flow elements that define units of work that are contained and preformed within a package. You use a few of these tasks to migrate the data from your local database to your SQL Azure database.

Building a Migration Package

Let's start building an SSIS package to migrate your data. Follow these steps:

1. In the SSIS package designer, select the Control Flow tab, and drag an Execute SQL task and three data flow tasks from the Toolbox onto the Control Flow designer.

2. Right-click Execute SQL task, and select Edit from the context menu. The Execute SQL Task Editor opens.

3. Change the task name to Clear Data, leave the Connection Type as OLE DB, and leave the SQLSourceType as Direct Input.

4. In the Connection property, select <New connection>, as shown in Figure 3. The Connection Manager dialog opens (see Figure 4).

Figure 3 - Adding a new connection

Figure 3 - Adding a new connection

Figure 4 - Entering the connection information

Figure 4 - Entering the connection information

5. In the Connection Manager dialog, enter the server name of your SQL Azure server, select Use SQL Authentication, and enter your SQL Azure username and password. The username must be in the format username@server where username is your Administrator username or a valid SQL Azure username and server is the first part of the server name (the piece prior to .database.windows.net).

6. In the "Select or enter a database name" field, the OLE DB provider doesn't return a list of databases. No big deal: type in the name TechBio.

7. Click the Test Connection button. If you've entered everything correctly and your firewall is set up, your test connection succeeds.

8. Click OK in the Connection Manager dialog.

9. Back in the Execute SQL Task Editor, click the ellipsis button in the SQLStatement property to display a small Enter SQL Query dialog in which to enter one or more T-SQL statements. Enter the following DELETE statements, which clear out the data from the previous example. (This isn't critical, but it gives you a clean slate to start with.)

DELETE FROM UserDocs
DELETE FROM Users
DELETE FROM Docs

10. Click OK in the Enter SQL Query dialog. The Execute SQL Task Editor dialog should look like Figure 5. Click OK.

Figure 5 - Execute SQL Task Editor

Figure 5 - Execute SQL Task Editor

11. Back on the Control Flow tab in the SSIS package designer, make sure the Clear Data task is highlighted. Notice the green arrow coming from the bottom of the Clear Data task: click anywhere on this green arrow, and drag it to the first data flow task. Doing so creates a connection from the Clear Data task to the first data flow task, signifying the flow of execution. You can see an example of this in Figure 8.When the Clear Data task has completed executing, the first data flow task will then execute.

12. Let's add logic to the first data flow task. Double-click the linked data flow task (or right-click and select Edit). Doing so takes you to the Data Flow tab.

13. Drag an OLE DB Source task and an OLE DB Destination task to the Data Flow designer surface. This is where the actions of pulling data from the source database (the local DB) and copying it to the destination database (SQL Azure) take place.

14. Right-click the OLE DB Source task, and click Edit. Doing so opens the OLE DB Source Editor, where you define a connection to your local database, such as the connection shown in Figure 5-10. You already have a connection to the SQL Azure database, but you need to create a connection to your local database that your tasks use to copy the data.

15. On the OLE DB Source Editor task, you see the connection to the SQL Azure database. Click the New button to open the Configure OLE DB Connection Manager dialog. Click New again to open the Connection Manager dialog you saw in Figure 4.

16. In this dialog, enter the information to connect to your local copy of the TechBio database.

17. Test the connection, and then click OK in the Connection Manager dialog.

18. Click OK in the Configure OLE DB Connection Manager dialog.

19. Back in the OLE DB Source Editor, click the Table/View drop-down, select the Docs table, and then click OK.

20. As you did for the control flow task, drag the green arrow from the OLE DB Source task to the OLE DB Destination task.

21. Double-click the OLE DB Source task to edit the task properties, which is where the data is going: the SQL Azure database. Because you've already created a connection to the SQL Azure database, you can use that connection. In the OLE DB Destination Editor, select the SQL Azure connection, and then select the Docs table from the drop-down list of tables. Oops-you get the error shown in Figure 6.

Figure 6 - OLE DB connection error

Figure 6 - OLE DB connection error

This is interesting, because you didn't get this error when configuring the Execute SQL task. The difference is that the two Connection Manager dialogs don't operate quite the same way. The Connection Manager dialog for the Execute SQL task let you type in the table name, whereas the Connection Manager dialog for the OLE DB Destination task required you to select from the list. But when you expanded the list, you received the error shown in Figure 6. The fix is to use an ADO.NET destination instead of the OLE DB destination. To do this, continue as follows:

22. Delete the OLE DB Destination task, and drag an ADO.NET Destination task onto the surface.

23. Connect to the two tasks, and then double-click the ADO.NET Destination task to configure it.

24. In the ADO.NET Destination Editor dialog, click the New button to configure a new ADO.NET connection.

25. Walk through the same steps as in the previous two connection configurations. This time, you're able to type the database name in the Connection Manager dialog.

26. Click OK in the all the dialogs until you're back to the ADO.NET Destination Editor.

27. Before you click OK in this dialog, click Mappings at left, as shown in Figure 7. Doing so ensures that the source table columns are appropriately mapped to the destination table columns. Click OK in the ADO.NET Destination Editor.

Figure 7 - ADO.NET Destination Editor with mappings

Figure 7 - ADO.NET Destination Editor with mappings

If you're new to SSIS, congratulations: you've just configured your first SSIS data flow. Your data flow should look like Figure 8-not very exciting, but useful nonetheless. If you aren't new to SSIS, you still deserve congratulations, because you've successfully configured a data flow to migrate data to the cloud.

Figure 8 - Data flow

Figure 8 - Data flow

Put down the root beer, though, because you aren't quite done. Continue with these steps:

28. Go back to the Control Flow tab, and connect the first data flow task to the second data flow task, and connect the second and third data flow tasks.

29. Double-click the second data flow task, and do the same thing you did for the first data flow beginning in step 12. You don't need to re-create the connections, but in the Source and Destination Editors for the second data flow, select the Users table.

30. Repeat the process for the third data flow task, this time selecting the UserDocs table.

When you're finished, your control flow task should look like Figure 9. The tasks aren't green (yet), but the flow should be similar.

Figure 9 - Successful execution of an SSIS package

Figure 9 - Successful execution of an SSIS package

Executing Your Migration Package

You're now ready to test your SSIS package. In Visual Studio, click the green arrow on the toolbar to begin package execution. Execution starts at the Clear Data task-which, as you recall, deletes all the data from the UserDocs, Users, and Docs tables. Execution next goes to the first data flow, which queries data from the local Docs table (source) and copies it to the TechBio database in the Docs table in SQL Azure (destination). Execution then goes the second and third data flow tasks.

When execution is complete, all the tasks are green, as shown in Figure 9, letting you know that they executed successfully. Any tasks that are yellow are currently executing. Red tasks are bad: that means an error occurred during the execution of the task, regardless of whether the task was in the control flow or data flow, and execution stops.

If your tasks are all green, you can go back to your root beer. Otherwise, the best place to start debugging is the Output window. All output, including errors, is written to this window. You can find errors easily by looking for any line that starts with Error: toward the end of the list. Errors you receive may be SSIS specific or SQL Azure specific. For example, did you define your connections correctly? Microsoft makes testing connections very simple, and this doesn't mean the Test Connection button. The Source Editors dialog-regardless if whether it's an OLE DB or ADO.NET Editor-includes a Preview button that provides a small preview of your data, up to 200 rows. This ensures that at least your source connection works correctly.

Author

Scott Klein

Scott Klein is co-founder of Blue Syntax, a consulting a training company specialized on the Azure platform. Scott has been working with SQL Server for nearly 20 years, working with companies large and small in fields such as medical, finance, and retail. Scott is a veteran author, having written a number of books including the soon to be released Pro SQL Azure. Scott is a SQL Server MVP and President of the South Florida SQL Server Users Group and the newly formed Azure PASS Virtual Chapter. Scott also organizes and runs the South Florida SQL Saturday and speaks frequently at SQL Saturday and .NET Code Camp events as well as user groups. He was lucky enough to speak at the 2008 European PASS conference and recently returned from Germany providing a 3-day deep dive training at the PASS Camp.

His online presences include: