Friday, April 9, 2010

How to validate CSV file

Introduction

It is often required to validate a CSV file or other pseudo formatted data against a set of business rules. Though MS Excel offers data validation functionality for simple checks, it is often not enough for advanced data quality assurance.

Flat File Checker is the best option for data validation if you are facing any of those tasks:
  • Validation of large data file
  • Regular validation of files with same or similar structure
  • Validation against a set of complex data rules
  • Automated data validation / batch processing
It takes few easy steps to validate a file.

1. Install Flat File Checker

1.1 Download Windows installer from SourceForge.net.
1.2 Install application (click OK on all screens through installation process).
1.3 Open the application (FlaFi ->FlatFileChecker in the Start menu)

2. Create schema with a file

2.1 New Schema
Click on Create New Schema button in the main toolbar which will activate files and fields areas of the form:
2.2 Add File
Now Drag & Drop your data file to the files area of the form:
This will open a File Details form where you must provide your file details in the underlined fields:
After you click OK in File Definition form, File will appear in the Files area and fields from the header will be created and appear in the Fields area:

Now we can save the schema.

3. Add Data Rules

Here I will show how to add most common data rules.
3.1. Required
To disallow Empty/Null values in the field double click on it in the Fields list which will open Field Details form.
Then right click on the Rules node in the rule builder and select Add->Required in the pop-up menu:
 

3.2 Dates
Dates validation can be a bit tricky as time "goes" ahead and we need to appreciate its dynamic nature.
Let's say for example, that all contacts in our file should be at least 18 years old and we want data rule to work today, tomorrow or any time.
Therefore, we cannot just put static value, i.e. 08/04/1982 in the data rule. Thanks to Flat File Checker's support of VB Expression we can easily overcome this issue.

  • Double click one of the date fields in (if you have any in your file), which is Date_of_Birth in this example;

  • Right click on Rule node in rule build and select Add->Value Constraint;

  • In the appeared fill-in rule details:

Notice:
In this example value is enclosed in square brackets to indicate that it contains VB Expression:
Format(Now.AddYears(-18),'dd/MM/yyyy')

Finally we need to populate Date Format field and check Allow Null Values is it is OK for Date_of_Birth field to be blank:

 Now click on OK button to keep all changes in the schema.

3.3 Regular Expressions
To add regular expression validation to the schema:
  • Double click on the field to open Field Details form;
  • Right click on Rules node and select Add->Value Constraint;
  • Select REGEXP as operator and provide regular expression (UK Postcode in the screenshot):
  • Click on Y button of the control and OK button of the form to save changes.
3.4 External Lookup
It is also common requirement that all values in the data file belong to a defined set of values.
If you have a lookup file that contains this values (vocabulary or primary data) then you can easily validate data against this file with Flat File Checker.

Drag & drop lookup file to files are of the Main form and populate file details (tick on Lookup checkbox).


Click on Links button in the main menu after the file is added to the schema.

In the Links form select MyData.Title as Foreign Key and Titles.Title as Primary Key and tick Enforce:



 Click Add button to add link to the schema. Then press OK to close File Links form and save changes.

4. Execute schema

To validate file click on Run Checks button in the toolbar:

You will see progress bar while file is being validated which will be replaced by "CHECK COMPLETE" in the end of the processing:

5. View and correct errors

Now you can click on Show Errors button in the toolbar to open Data Errors form in which you can see and correct errors.

6. Other information sources on how to use Flat File Checker

6.1 How to thread on the Forum.
6.2 See online documentation here.
please feel free to contact me with any questions about Flat File Checker - info@flat-file.net