Introduction
Organisations need to get data from external sources and load it to a database. With outsourcing of processes this need is growing fast. Customer databases (CRM) base the core for business processes and data management becomes the priority. It is common that data is exchanged within text files in pseudo formats (.csv, .psv, fixed position) and loaded with special utilities.
Though the process may look simple, it often fails to work. The reason behind this is the difference between business rules and database schemas between organisations.
The ETL (Extract, Transform, Load) process is not that simple because tables, fields and constraints are not the same for two different systems. Moreover, data quality of external sources is often inadequate and causes problems.
Prepare lookup files
- Product codes
- Titles
- Marketing campaigns
- And other reference data.
Create the specification
Data exchange is based on a specification that can be used by database programmers to understand the format of the data in the file. Therefore, it should be a well structured document with description of all tables and fields. Descriptions can be formal (based on programming language) or informal. Flat File Checker provides an easy way to build such schema in which each data element (table, field) can be defined by both data rule and free text comment. The benefit of using Flat File Checker is that this schema is both readable and executable.Build your knowledge
With FFC you can easily extend schema without writing any code. It is important to add new rules to the schema each time it fails to catch errors.
Let them do it or automate it
If you need to validate data yourself you can use command line version of the application to automate the process.