Transaction Support in Integration Services
Excerpt by Don Kiely
A transaction is a core concept of relational database systems. It is one of the major mechanisms through which a database server protects the integrity of data, by making sure that the data remains internally consistent. If any part of a transaction fails, the entire set of operations within it can be rolled back, so that no changes are persisted to the database. SQL Server has always had rich support for transactions, and Integration Services hooks into that support.
A key concept in relational database transactions is the ACID test. To ensure predictable behavior, every transaction must have the four ACID properties:
- Atomic: A transaction must work as a unit, which is either fully committed or fully abandoned when complete.
- Consistent: All data must be in a consistent state when the transaction is complete. All data integrity rules must be enforced and all internal storage mechanisms must be correct when the transaction is complete.
- Isolated: Each transaction must be independent of the data operations of other concurrent transactions. A concurrent transaction sees data either as it was before another transaction changed it or as it is after that transaction completes, never in an intermediate state.
- Durable: After the transaction is complete, its effects are permanent, even in the event of system failure.
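The atomic property is the easiest of the four to see in action. The following is a minimal sketch, not Integration Services code: it uses Python's built-in sqlite3 module as a stand-in database, and the `orders` table and `run_package` helper are hypothetical names invented for the illustration. It shows a batch of inserts being abandoned as a unit when one of them fails.

```python
import sqlite3


def run_package(conn, rows):
    """Insert all rows in one transaction; any failure rolls back every insert."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if the block raises an exception.
        with conn:
            for order_id, amount in rows:
                conn.execute(
                    "INSERT INTO orders (id, amount) VALUES (?, ?)",
                    (order_id, amount),
                )
        return True
    except sqlite3.IntegrityError:
        return False


def count_orders(conn):
    """Return the number of rows currently persisted in the orders table."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
conn.commit()

# The second row violates NOT NULL, so the first insert is rolled back as well.
ok_first = run_package(conn, [(1, 10.0), (2, None)])
count_after_failure = count_orders(conn)

# With valid data, both rows are committed together.
ok_second = run_package(conn, [(1, 10.0), (2, 20.0)])
count_after_success = count_orders(conn)

print(ok_first, count_after_failure, ok_second, count_after_success)
```

After the failed run the table is still empty, even though the first insert had already executed; only the second run, in which every statement succeeds, leaves rows behind.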
Integration Services ensures reliable creation, updating, and insertion of rows through the use of ACID transactions. For example, if an error occurs in a package that uses transactions, the transaction rolls back the data that was previously inserted or updated, thereby preserving database integrity. This eliminates orphaned rows and restores updated data to its previous value, so that the data remains consistent. There is no partial success or failure when the tasks in a package have transactions enabled: they fail or succeed together. Tasks can join the parent container's transaction or create their own. The properties that are required to enable transactions are as follows:
- TransactionOption: Set this property of a task or container to enable transactions. The options are:
  - Required: The task or container enrolls in the transaction of the parent container if one exists; otherwise it creates a new transaction for its own use.
  - Supported: The task or container uses a parent's transaction if one is available, but does not start one itself. This is the default setting.
  - NotSupported: The task or container does not support and will not use a transaction, even if the parent is using one.
- IsolationLevel: This property determines the safety level, using the same scheme you can use in a SQL Server stored procedure. The options are:
  - Serializable: The most restrictive isolation level of all. It ensures that if a query is reissued inside the same transaction, existing rows won't look any different and new rows won't suddenly appear. It uses range locks that prevent other transactions from editing or inserting rows in the range being read until the transaction is completed.
  - Read Committed: Ensures that shared locks are issued when data is being read and prevents "dirty reads." A dirty read returns data that is in the process of being edited but has not yet been committed or rolled back. However, other transactions can change the data before the end of the transaction, resulting in nonrepeatable reads or phantom rows.
  - Read Uncommitted: The least restrictive isolation level and the opposite of Read Committed, in that it allows "dirty reads" of the data. It ignores locks that other operations may have issued and does not issue shared locks of its own. The read is "dirty" because it can return changes that another transaction has not yet committed and may still roll back.
  - Snapshot: Reads data as it was when the transaction started, ignoring any changes since then. As a result, it doesn't represent the current state of the data, but it represents a consistent state of the database as of the beginning of the transaction.
  - Repeatable Read: Prevents others from updating data that the transaction has read until the transaction is completed, but does not prevent others from inserting new rows. The inserted rows are known as phantom rows, because they were not there when the transaction first ran its query and appear only when the query is reissued. This is the minimum level of isolation required to prevent lost updates, which occur when two separate transactions select a row and then update it based on the selected data, so that one update silently overwrites the other.
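The TransactionOption rules listed above amount to a small decision procedure. The sketch below models it in plain Python purely as an illustration; it is not the Integration Services object model, and the `resolve_transaction` function and the `txn-N` identifiers are hypothetical names made up for the example.

```python
from enum import Enum
from itertools import count


class TransactionOption(Enum):
    REQUIRED = "Required"
    SUPPORTED = "Supported"
    NOT_SUPPORTED = "NotSupported"


# Hypothetical generator of ids for newly started transactions.
_txn_ids = count(1)


def resolve_transaction(option, parent_txn):
    """Return the id of the transaction a task runs in, or None for no transaction.

    parent_txn is the parent container's transaction id, or None if the
    parent has no transaction.
    """
    if option is TransactionOption.NOT_SUPPORTED:
        # Never participates, even when the parent has a transaction.
        return None
    if option is TransactionOption.SUPPORTED:
        # Joins the parent's transaction if there is one; never starts one.
        return parent_txn
    # REQUIRED: join the parent's transaction, or start a new one.
    if parent_txn is not None:
        return parent_txn
    return f"txn-{next(_txn_ids)}"


# A package set to Required with no parent starts a transaction.
package_txn = resolve_transaction(TransactionOption.REQUIRED, None)
# A child task left at the default (Supported) joins it.
child_default = resolve_transaction(TransactionOption.SUPPORTED, package_txn)
# A child set to NotSupported stays outside the transaction.
child_opt_out = resolve_transaction(TransactionOption.NOT_SUPPORTED, package_txn)
# Supported with no parent transaction means no transaction at all.
standalone = resolve_transaction(TransactionOption.SUPPORTED, None)

print(package_txn, child_default, child_opt_out, standalone)
```

The model makes one practical consequence visible: because Supported is the default and never starts a transaction, nothing in a package is transactional until some container in the hierarchy is set to Required.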
Integration Services supports two types of transactions. The first is Distributed Transaction Coordinator (DTC) transactions, which let you include multiple resources in the transaction. For example, you might have a single transaction that involves data in a SQL Server database, an Oracle database, and an Access database. This type of transaction can span connections, tasks, and packages. The downside is that it requires the DTC service to be running and tends to be considerably slower than a native transaction. The other type is a native transaction, which uses SQL Server's built-in support for transactions within its own databases. This uses a single connection to a database and T-SQL commands to manage the transaction.
Integration Services supports a great deal of flexibility with transactions. It supports a variety of scenarios, such as a single transaction within a package, multiple independent transactions in a single package, transactions that span packages, and others. You'll be hard pressed to find a scenario that you can't implement with a bit of careful thought using Integration Services transactions.
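The native style, in which one connection carries explicit begin, commit, and rollback commands, can be sketched outside Integration Services as well. The example below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module rather than SQL Server, SQLite's `BEGIN`/`COMMIT`/`ROLLBACK` in place of the T-SQL `BEGIN TRANSACTION`/`COMMIT TRANSACTION`/`ROLLBACK TRANSACTION` statements, and a hypothetical `audit` table and `run_in_native_transaction` helper. It also illustrates the "multiple independent transactions in a single package" scenario.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# explicit BEGIN/COMMIT/ROLLBACK statements are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE audit (step TEXT NOT NULL)")

INSERT_STEP = "INSERT INTO audit (step) VALUES (?)"


def run_in_native_transaction(conn, statements):
    """Run the statements on one connection inside one explicit transaction."""
    conn.execute("BEGIN")
    try:
        for sql, params in statements:
            conn.execute(sql, params)
    except sqlite3.Error:
        # Undo everything this transaction did, then report the failure.
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True


# First transaction: both statements succeed and are committed together.
first_ok = run_in_native_transaction(
    conn, [(INSERT_STEP, ("extract",)), (INSERT_STEP, ("load",))]
)

# Second, independent transaction: the NULL violates NOT NULL, so "cleanup"
# is rolled back, while the first transaction's rows are unaffected.
second_ok = run_in_native_transaction(
    conn, [(INSERT_STEP, ("cleanup",)), (INSERT_STEP, (None,))]
)

steps = [row[0] for row in conn.execute("SELECT step FROM audit ORDER BY rowid")]
print(first_ok, second_ok, steps)
```

The important detail is that every statement in a transaction goes through the same connection object. A native transaction belongs to its connection, so statements sent over a different connection would not be part of it.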
This post is an excerpt from the online courseware for our Microsoft SQL Server 2008 Integration Services course written by expert Don Kiely.