Dealing With External Data
In most Data Warehouse Projects, you will have to deal with External Data, provided by Business nits. Use cases for this kind of data are any kind of parameters or additional data, influencing processing.
Business Users "love" MS Excel, so External Data often is provided as MS Excel or CSV-Files.This often causes a lot of hassle:
Depending on the Language Settings of the Client Computer of the Editor of the file, Data and Number Formats differ (the 31st of December 2022 might be written as "31.12.2022", but also "12/31/2022". Numbers might include decimal point or comma, and also thousands separators)
Users might add columns or rows for comments
- Invalid Data
- It is complicated to provide technically valid data via MS Excel - Users can enter invalid numbers / dates by mistake
- In most cases, the data is not checked for data integritry - it might be possible to enter non-existing key values by mistake (for example non-eisting keys for Business Units, Personal Numbers and so on).
- No appropriate Audit-Tracking
- If files are sent to the Data Warehouse, you should keep track who changed the data.
- Four-Eyes Principles might be necassary for critical data
- Instable Processing
- Cumbersome Formats might lead to process abends or - even worse - processing of bad data
- If the External Data is processed in batch mode, the person providing the data won't get immediate feedback about invalid data
Use "Fast Edit" from Apparo Solutions (c) instead of Excel.
This is a low-cost, web-based application providing interfaces to simply
Insert Data via a Web Fron End
Paste Huge amounts of Data from Excel
Process Excel Data sent by EMail
Process Excel Data stored in a given directory
Data is stored directly into a relational database (for example Oracle, DB2 or MS SQL) including
- Checking the correct data types
- Checking Referential Integrity if needed
- Checking for valid data ranges
- 4-Eyes Principle can be implemented
- Audit tracking (writing specific audit tables and/or using Slowly Changing Dimensions Type 2)
The use of Fast Edit has a lot of advantages
Prohibit the processing of bad / invalid Data in the Data Warehouse
Business Users get feedback about bad / invalid Data immediately - no time lost
Data is stored in a relational database and can be directly processed by the warehouse, no need to handle with file
Audit tracking (who did enter or change the data how and when)
4 Eyes Principle
Extensive Functionality included - I implemented a System for complex Data Adjustments using Fast Edit
Follow the link for detailled information about Fast Edit.