Tuesday, August 31, 2010

Oracle to Teradata Migration Project Tips

Introduction:
The decision to move from an OLTP system such as Oracle, with multiple data marts, to a Teradata Enterprise Data Warehouse is not complicated to start with, but it needs the right skills and planning, and it gives the business the edge of a 3NF physical model, linear scalability, and mixed-workload capacity in an active and challenging day-to-day business life.

Bird's-eye view agenda:
• Gather from the client the business requirements and anything that is necessary for, or related to, the migration.
• Settle on an approach for designing the physical database: tables, indexes, views, and other database objects.
• Identify all business-related log and historic information involved in the data migration.
• Plan the incremental load conversion.


Detailed Technical and Business Artifacts for Migration Implementation

Client-related Infrastructure:
• Business rules and triggers embedded in the data.
• Data model of the existing business, and the model for the tables that need to be moved.
• All the software and hardware compatible with the Teradata release that is in place.
• Make sure the requirements are frozen, at least during the port operations, on the sending side (Oracle/DB2).
• If multiple data marts are being ported, do one at a time, and then determine the names of the production database and the view database.
• Design a strategy for creating profiles and users.
• As part of the data movement and migration, make sure we know what gets moved and how: are we landing the data onto disk first, or doing a direct forklift?
• Determine the right PI during table design by talking to the ETL and front-end reporting folks.
• Design the initial load jobs with the assumption that the target tables are already created.
• Determine the type of slowly changing dimension for each table and design accordingly (the DDL sketch after this list illustrates both the PI choice and SCD Type 2 columns).
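
As a concrete illustration, here is a minimal Teradata DDL sketch that combines a PI choice with SCD Type 2 history columns. The database, table, and column names are hypothetical, not from any particular client model:

    CREATE MULTISET TABLE edw_t.customer_dim
    ( customer_id      INTEGER      NOT NULL,  -- natural key from the source system
      customer_name    VARCHAR(100),
      customer_segment VARCHAR(30),
      eff_start_dt     DATE NOT NULL,          -- SCD Type 2: when this version became effective
      eff_end_dt       DATE NOT NULL,          -- SCD Type 2: '9999-12-31' marks the current version
      current_flag     CHAR(1) NOT NULL        -- 'Y' for the active row
    )
    PRIMARY INDEX (customer_id);               -- NUPI: all versions of a customer land on one AMP

The PI drives both row distribution across the AMPs and join performance, which is why the ETL and reporting teams should agree on it before the DDL is frozen.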

Approach for Physical DB design and conversion:
• The maximum length of a Teradata database object name is 30 characters, so always keep the naming convention in mind: DB2 OS/390 allows a maximum of 18 characters and DB2 on Unix allows 128, while Oracle poses no problem since its limit is also 30 characters.
• Ask whether the client has a naming standard; if so, follow it, so that a mapping sheet is created taking the longer names (e.g., DB2's 128-character names) down to 30 characters, and have the business sit with us to review the sheet and agree upon valid names.
• For example, to have all the DDL and indexes converted from DB2:
o Use the DB2 system catalog as input to a conversion script.
o Use a data modeling tool like ERwin and do the reverse engineering.
o Use the db2look command and an editor.
o The same goes for the indexes: follow the Teradata index-creation criteria, and use the column and index catalog tables unloaded from DB2.
o Converting UDFs to Teradata needs thorough testing, whether the source is DB2 or any other database, because the language and features are different. The easy way to handle this is to convert UDFs into macros or stored procedures, and UDTs into data containers within stored procedures (a macro sketch follows this list).
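
As a minimal sketch of the UDF-to-macro idea: a source-side UDF whose logic amounts to a parameterized query can often be rewritten as a Teradata macro. All object names below are made up for illustration:

    CREATE MACRO edw_t.get_orders_since (p_order_dt DATE) AS (
      SELECT order_id,
             order_dt,
             order_amt
      FROM   edw_t.orders
      WHERE  order_dt >= :p_order_dt;   -- macro parameters are referenced with a colon
    );

    -- A caller then simply executes:
    EXEC edw_t.get_orders_since (DATE '2010-01-01');

UDF logic that computes and returns values row by row is usually a better fit for a stored procedure, which is where the UDT-to-data-container conversion mentioned above comes in.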


Log and Historic migration:

• Determine the migration approach: is an ETL tool going to do the pull and push, or is the source data put into flat files that are then loaded into Teradata using the load utilities? Since the target tables are empty, FastLoad can be used to do the loading (a sample script follows this list).
• The SLA determines how to load the data: either move from the legacy system through the legacy ETL and then the Teradata ETL methodology into production, or use a scripting tool to generate the load and unload scripts to get the data in place. The second option is an automated process that loads and unloads at the same time (using NDM technology).
• Have all the rules next to you, for instance the flat-file format rules for the transfer to Teradata: is the file delimited, fixed-width, or both?
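
For reference, a minimal FastLoad script for a pipe-delimited flat file might look like the sketch below; the logon string, file path, and object names are placeholders:

    LOGON tdpid/loaduser,password;
    DATABASE edw_t;

    SET RECORD VARTEXT "|";                     /* pipe-delimited input file          */

    DEFINE customer_id      (VARCHAR(10)),      /* VARTEXT fields arrive as character */
           customer_name    (VARCHAR(100)),
           customer_segment (VARCHAR(30))
    FILE = /staging/customer.dat;

    BEGIN LOADING edw_t.customer_stg
          ERRORFILES edw_t.cust_err1, edw_t.cust_err2;

    INSERT INTO edw_t.customer_stg (customer_id, customer_name, customer_segment)
    VALUES (:customer_id, :customer_name, :customer_segment);

    END LOADING;
    LOGOFF;

FastLoad requires the target table to be empty and to have no secondary indexes, which is exactly the initial-load situation described above.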


Incremental Load conversion:

• The requirement can be ported completely to Teradata and the loads run there, or the data could still be loaded into the DB2/Oracle database during the port operations; there are ways to handle this:
o Either stamp the rows with a date so that the migration moves the data from a point in time, also called incremental load synchronization (see the sketch after this list).
o The other way is a refresh: the data is moved at the very end of the migration process, after the source system finishes its last run and before the first run of the new system. If this is not possible for business-continuity reasons, then a custom solution needs to be implemented to capture the changes.
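
A minimal sketch of the date-stamp approach, assuming the source rows carry a last-update timestamp and a small control table records the last synchronization point (both the staging table and sync_control are hypothetical names):

    /* Pull only the rows changed since the previous sync point. */
    INSERT INTO edw_t.orders
    SELECT s.order_id, s.order_dt, s.order_amt, s.last_upd_ts
    FROM   stg_t.orders s
    WHERE  s.last_upd_ts > (SELECT MAX(last_sync_ts)
                            FROM   edw_t.sync_control
                            WHERE  table_name = 'orders');

    /* Advance the sync point for the next incremental run. */
    UPDATE edw_t.sync_control
    SET    last_sync_ts = CURRENT_TIMESTAMP
    WHERE  table_name = 'orders';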

Process flow for a smooth data migration

• The migration design, with all validation, should be in place before anything is tested.
• Then the ETL builds start, with their specs, followed by the migration of data.
• Integration testing then comes into the picture, with all the synchronization effort against the parallel systems if the migration is implemented that way.
• For huge migration projects automation is recommended: put processes and macros in place for real-time status reporting for tracking purposes (a sketch follows this list).
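
One way to put such automation in place is a small audit table plus a macro that every load job executes; the names below are illustrative only:

    CREATE MULTISET TABLE edw_t.migration_status
    ( job_name   VARCHAR(60)  NOT NULL,
      table_name VARCHAR(30)  NOT NULL,
      row_count  BIGINT,
      job_status VARCHAR(10),                 /* e.g. STARTED / DONE / FAILED */
      status_ts  TIMESTAMP(0) NOT NULL
    )
    PRIMARY INDEX (job_name);

    CREATE MACRO edw_t.log_status
    ( p_job    VARCHAR(60),
      p_table  VARCHAR(30),
      p_rows   BIGINT,
      p_status VARCHAR(10) ) AS (
      INSERT INTO edw_t.migration_status
      VALUES (:p_job, :p_table, :p_rows, :p_status, CURRENT_TIMESTAMP(0));
    );

    /* Each load job reports its status; a simple SELECT on migration_status */
    /* then gives the real-time tracking view:                               */
    EXEC edw_t.log_status ('mart1_customer_load', 'customer_stg', 125000, 'DONE');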

Conclusion:

For any project, the planning phase is all-important: measure a thousand times before you cut. I took some time to write down a few things to share my knowledge and experience as a Lead Teradata DBA, DB Architect, and GoldenGate Specialist with strong ETL architectural development, and I put this document together at a high level. I'll be more than happy if it helps fill any of the gaps in your planning and analysis phase.
