OWB - Data Profiling

Card Puncher Data Processing

Data Profiling Implementation in OWB

If you don't know anything about data profiling, you can have a look to the Data Profiling Page for a description.

Nothing is better than a video : OWB - Data Profiling Demo

Most organizations build a data warehouse to provide an integrated, reliable, and consistent “single version of the truth.” Data is usually sourced from a number of systems and has to be extracted, cleansed, and integrated before being made available for users to query.

The quality of the data loaded into the data warehouse is often variable, however, and for that reason, historically the process of profiling your source data has been a time-consuming, manual process that has required either lots of experience with SQL*Plus or the purchase of an expensive third-party tool.

With the release of Oracle Warehouse Builder 10g Release 2, however, the ability to profile your data is built into the tool and no knowledge of SQL*Plus is required. Furthermore, the data profiles that you build using Oracle Warehouse Builder can be used to generate automatic corrections to your data. In this article, you’ll learn all the nuances of this important new feature.

Data profiling is a prominent feature of Warehouse Builder 11g, allowing you to investigate data and detect anomalies. Then create business rules and generate a corrective ETL process based on these business rules.

Data Profiling can be defined as a process that automates the identification of problematic data and metadata while enabling the correction of inconsistencies, redundancies and inaccuracies in the data.

You can profile your source or target data to discover structure, relationships and data rules. Data Profiling provides statistical information about compliant data and outliers, including Six Sigma. You can drill down into the anomalies and derive data rules manually or automatically. You can use these data rules to monitor the quality of data in your data warehouse.

You can perform attribute analysis, referential analysis, functional dependency analysis or profile data against custom designed rules.

The steps are displayed in the following diagram.

Data Profiling Step

In the initial step the source data is profiled and all the information the data holds is detected. In step 2 you derive some of the data rules and then use these derived rules to derive corrections (step 3). These corrections are regular Warehouse Builder mappings and you need to deploy those in step 4. The last step runs the correction mappings to cleanse the data in the target.

Table attribute sets

Use attribute sets in Warehouse Builder to determine which columns on a table are candidates for profiling, then in the profiling editor choose which ones you really are going to profile.

For example, do you really want to profile a system generated numerical key field out of the ERP system or use those resources to profile that varchar2(4000) column labeled “additional information”?

Owb Table Attribute Sets Tab

Even with the sample size at a small sample, you need to remember that a lot of the profiling resources are spent on working on column relationships. So again, don’t just profile every column and try to see if there are relationships between all of them, because you are not efficiently using the resources.

Data Profiling Tab

Documentation / Reference





Discover More
Card Puncher Data Processing
OWB - Data Profiling Demo

This data profiling demo of the data profiling product of OWB was made with the help of this Oracle by Example series : Examining...
Global Explorer Public Data Rule
OWB - Data Rules

Data Rule is the key concept of the data quality. They are part of the Data Quality option to Warehouse Builder. If you don't know what is a data rule, please refer you to data rule. For a complete explanation...
Owb Functional Dependency
OWB - Functional Dependency Rule

A functional dependency occur when : Then we say that The attribute A is the dependent The attribute B is the determinant A is then a derived attribute of B. In the data profiling tools, the...
Owb Table Operator Data Rule Properties
OWB - Table

The Table operator enables you to source data from and load data into tables stored You can set Target Filter for Update, Target Filter for Delete, and Match By Constraint. When a data...
Card Puncher Data Processing
Oracle Warehouse Builder

Owb Documentation - Understanding Data Quality Management White...



Share this page:
Follow us:
Task Runner