Dealing With External Data

In most Data Warehouse Projects, you will have to deal with External Data, provided by Business nits. Use cases for this kind of data are any kind of parameters or additional data, influencing processing. 
Business Users "love" MS Excel, so External Data often is provided as MS Excel or CSV-Files.This often causes a lot of hassle:
  • Cumbersome Formats

    • Depending on the Language Settings of the Client Computer of the Editor of the file, Data and Number Formats differ (the 31st of December 2022 might be written as "31.12.2022", but also "12/31/2022". Numbers might include decimal point or comma, and also thousands separators)

    • Users might add columns or rows for comments

  • Invalid Data
    • It is complicated to provide technically valid data via MS Excel - Users can enter invalid numbers / dates by mistake
    • In most cases, the data is not checked for data integritry - it might be possible to enter non-existing key values by mistake (for example non-eisting keys for Business Units, Personal Numbers and so on). 
  • No appropriate Audit-Tracking
    • If files are sent to the Data Warehouse, you should keep track who changed the data.
    • Four-Eyes Principles might be necassary for critical data
  •  Instable Processing
    • Cumbersome Formats might lead to process abends or - even worse - processing of bad data
    • If the External Data is processed in batch mode, the person providing the data won't get immediate feedback about invalid data

The Solution

Use "Fast Edit" from Apparo Solutions (c) instead of Excel.
This is a low-cost, web-based application providing interfaces to simply
  • Insert Data via a Web Fron End

  • Paste Huge amounts of Data from Excel

  • Process Excel Data sent by EMail

  • Process Excel Data stored in a given directory

 

Data is stored directly into a relational database (for example Oracle, DB2 or MS SQL) including
  • Checking the correct data types
  • Checking Referential Integrity if needed
  • Checking for valid data ranges
  • 4-Eyes Principle can be implemented
  • Audit tracking (writing specific audit tables and/or using Slowly Changing Dimensions Type 2)

 

The use of Fast Edit has a lot of advantages
  • Prohibit the processing of bad / invalid Data in the Data Warehouse

  • Business Users get feedback about bad / invalid Data immediately - no time lost

  • Data is stored in a relational database and can be directly processed by the warehouse, no need to handle with file

  • Audit tracking (who did enter or change the data how and when)

  • 4 Eyes Principle

  • Extensive Functionality included  - I implemented a System for complex Data Adjustments using Fast Edit

 
Follow the link for detailled information about Fast Edit.