Wednesday, August 22, 2012

2012-08-22 Wednesday - Data Governance Tools

Today I'm researching tools that may support a portion of a Data Governance process.

The development/production environment includes both Oracle and Microsoft SQL Server databases.

Some of the challenges I'm seeking to address with a Data Governance process / tooling includes answering the following types of questions:
  • What changes occurred from the previous to the next version of the  schema?
  • How can we migrate data from one version of the schema to the next?
  • How can we migrate data from one environment to another?
  • If we need to migrate data from a Prod to a Dev environment, how can we ensure that data is reliably 'cleansed' or 'masked' to avoid Personally Identifiable Information (PII) from'leaking' out of a Prod environment.
  • How to avoid having to recreate large volumes of existing test data from scratch - when major schema and/or data changes result from application changes

 Some of the desired features in a schema comparison tool would include the following:
  • Command Line interface for automatic generation of reports
  • DDL import of the previous/next schema definition files
  • Generation of a report in PDF, HTML, RTF formats to document both schemas
  • Reports to identify the deltas between the two schemas
  • Generation of the DDL to alter the previous schema to look like the future schema
However, beyond the simple task of schema comparison - Data Governance also encompasses the  challenges of the following tasks:
  • Migrating data - and supplying the transformation rules as needed
    • Splitting data from 1:N fields
    • Combining data from N:1 fields
    • Populating new data fields (via default value, lookup value, etc.)
    • Computing new values f(x):y

My initial inclination is to suggest the Altova Database Schema Differencing Tool (DiffDog(R) 2012)

Other Resources to Consider