Thursday, December 22, 2011

Flat File Checker Validation using C#.NET

Business Case

The Data Feed project required an interface that accepts and parses a data file from ABZ Company.
The data file was generated from several different types of systems, so a fixed-width format was chosen. Here is an example of a fixed-width file:

01732Juan Perez              004350011052002 
00554Pedro Gomez         123423006022004 
00112Ramiro Politti         000000001022000 
00924Pablo Ramirez        033213024112002


Each field in a line has a defined start position and length. The challenge with a fixed-width file is that nothing delimits the fields: an unwanted or missing character does not stand out, it simply shifts the fields that follow it and silently produces incorrect data.
Validating the data was therefore crucial to the success of the project.
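
To make the start-position-and-length idea concrete, here is a minimal sketch in plain C#. It does not use the Flat File Library; the FieldSpec class and the layout (5-character id, 15-character name, 8-character date) are assumptions made up for illustration and are not the layout of the ABZ file.

```csharp
using System;

// Hypothetical field definition: zero-based start position and length.
public sealed class FieldSpec
{
    public string Name { get; }
    public int Start { get; }
    public int Length { get; }

    public FieldSpec(string name, int start, int length)
    {
        Name = name;
        Start = start;
        Length = length;
    }

    // Returns the trimmed field value, or throws if the line is too short.
    public string Extract(string line)
    {
        if (line.Length < Start + Length)
        {
            throw new FormatException(
                "Line is too short for field '" + Name + "'.");
        }
        return line.Substring(Start, Length).Trim();
    }
}

public static class FixedWidthDemo
{
    public static void Main()
    {
        // Assumed layout, for illustration only.
        var name = new FieldSpec("Name", 5, 15);
        var added = new FieldSpec("Added", 20, 8);

        // A well-formed line: the name is padded to exactly 15 characters.
        string good = "01732" + "Juan Perez".PadRight(15) + "11052002";
        // One stray space in the name shifts every later field by one.
        string shifted = "01732" + "Juan Perez".PadRight(16) + "11052002";

        Console.WriteLine(name.Extract(good));
        Console.WriteLine(added.Extract(good));
        Console.WriteLine(added.Extract(shifted));
    }
}
```

In the shifted line the date field is read as 1105200 instead of 11052002, and nothing in the parsing step flags it. Only a rule such as "must be a date in ddMMyyyy format" would catch the problem.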

Goals

The goal was to design a tool that validates the incoming data on a continuous basis:
  • The validation engine must be configurable; validation rules must not be hard-coded.
  • Data errors in a file must be documented and available to be emailed to the sender.

Solution

The business rules against which the files are validated must be supplied to the FlatFileLibrary as an XML file, which is created through the graphical user interface of the Flat File Checker (FFC) application. Here are examples of such rules:
  • Value in the field is Required
  • Field is a date with a given format and within a given range
  • Field structure must match a Regular Expression
  • Two files are linked [relational links]
  • Field must be compared to another field
  • Unique constraint is imposed on one or several fields
  • Validation must be made against values stored in a database
The task of validation can now be addressed in a few easy steps:
  • Create a Flat File Schema (XML file that contains business/data rules for files that you want to validate)
  • Add the Flat File Checker library to your .NET application
  • Run validation with minimum code within your application
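
To show what three of the rule types listed above amount to (required value, date with a format and range, regular expression), here is a small hand-written sketch in plain C#. It is not how FFC implements its rules, and it hard-codes what FFC reads from the XML schema; the RuleSketch class and its method names are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class RuleSketch
{
    // "Required" rule: the field must contain something other than whitespace.
    public static bool IsPresent(string value)
    {
        return !string.IsNullOrWhiteSpace(value);
    }

    // Date rule: the value must parse with the exact format
    // and fall inside the inclusive range [min, max].
    public static bool IsDateInRange(string value, string format,
                                     DateTime min, DateTime max)
    {
        DateTime parsed;
        bool ok = DateTime.TryParseExact(value, format,
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
        return ok && parsed >= min && parsed <= max;
    }

    // Regular-expression rule: the whole value must match the pattern.
    public static bool MatchesPattern(string value, string pattern)
    {
        return Regex.IsMatch(value, "^(?:" + pattern + ")$");
    }

    public static void Main()
    {
        var min = new DateTime(2000, 1, 1);
        var max = new DateTime(2011, 12, 31);

        Console.WriteLine(IsPresent("   "));
        Console.WriteLine(IsDateInRange("11052002", "ddMMyyyy", min, max));
        Console.WriteLine(IsDateInRange("31022002", "ddMMyyyy", min, max));
        Console.WriteLine(MatchesPattern("01732", @"\d{5}"));
    }
}
```

Writing such checks by hand for every field is exactly the coding effort that a schema-driven validator is meant to remove.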

Background

The idea behind Flat File Checker is to create an application that can validate structured data in a text file against a schema stored in XML, much as XML files themselves are validated against a schema. This approach separates the data/business rules from the validation code, so considerably less coding is required.

Creating the schema

The schema file used for validation with FFC is created through the graphical user interface of the FFC application and saved as an XML file.

Add the Flat File Library to your solution

Add references to the Flat File Library assemblies, FlatFilelibrary.dll (version 0.7.3.2) and Eval3.dll (version 0.6.4.0), to your solution in Visual Studio. They can be downloaded from http://www.flat-file.net/download.html

Run the validation and process

To use the Flat File Library, you will need these main classes:
  • FlatFileSchema - The class that contains a collection of files
  • File - The virtual class that gives access to several objects associated with a flat file: columns, errors, etc.
  • DataError - The class that contains details of a data error

using System;
using System.Threading;
using FlatFileLibrary;
using FlatFileLibrary.DataSources;

// Define the variables
private FlatFileSchema _files;
private AutoResetEvent do_checks = new AutoResetEvent(false);

// Add the event handler in the class constructor or during form initialization.
// The schema file at schemaPath is created with the Flat File Checker user interface.
_files = new FlatFileSchema(schemaPath);
_files.Validated += new EventHandler<SchemaValidatedEventArgs>(FileSetValidated);

public void ValidateFile()
{
    try
    {
        RunValidation();
    }
    catch (Exception e)
    {
        log.Error("Data validation failed because: " + e.Message);
    }
}

private void RunValidation()
{
    do_checks.Reset();   // reset the signal state
    _files.RunChecks();  // start validation; completion is reported through the Validated event
    do_checks.WaitOne(); // wait for FileSetValidated to set this event
}

Issues and Resolutions

Below is a list of the issues and how they were resolved:
1)      Version differences: Class names and namespaces changed between versions of FFC. The code above works with the following versions of the two library DLLs: FlatFilelibrary.dll (version 0.7.3.2) and Eval3.dll (version 0.6.4.0).
2)      Writing errors to the database: By default, FFC generates a text file with the errors. To write errors to a database or to customize the error file, you can use the following code:
public void FileSetValidated(Object sender, SchemaValidatedEventArgs e)
{
    ValidationComplete = e.Result;
    try
    {
        // Assume the data is valid until a file reports errors, so that a
        // clean file cannot overwrite the result of an earlier invalid one.
        IsDataValid = true;
        foreach (FlatFile file in _files.DataSources)
        {
            if (file.Errors.Count == 0)
            {
                continue;
            }

            IsDataValid = false;
            foreach (DataError err in file.Errors)
            {
                foreach (IDataRule check in err.Checks)
                {
                    // Collect the details of each failed rule
                    errorLog = new SurveyDataFileErrorLog();
                    errorLog.Row = err.RowInSource;
                    errorLog.ColumnName = err.Column.Name;
                    errorLog.Value = err.Value;

                    // Customize error messages
                    if (check.DataRuleMessage.Contains("REGEX"))
                    {
                        errorLog.ExceptionMessage = "Invalid Format";
                    }
                    else
                    {
                        errorLog.ExceptionMessage = check.DataRuleMessage;
                    }
                    // Save errorLog to the database or error file here.
                }
            }
        }
    }
    finally
    {
        // Always release the waiting RunValidation() call, even if an exception occurs
        do_checks.Set();
    }
}
3)      Asynchronous data validation: Flat File Checker runs validation asynchronously. The project required it to run synchronously, so that the next function could be chosen depending on whether the file was valid or invalid. The calling method therefore blocks on an AutoResetEvent until the event handler signals completion:
private void RunValidation()
{
    do_checks.Reset();   // reset the signal state
    _files.RunChecks();  // start the asynchronous validation
    do_checks.WaitOne(); // wait for FileSetValidated to set this event
}
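
The same wait-for-the-event pattern can be tried without the Flat File Library. The sketch below is self-contained and uses only System.Threading; FakeAsyncValidator, ValidationDoneEventArgs and SyncValidationRunner are hypothetical stand-ins, not FFC types. It also shows one possible way to hand the outcome back as a return value, under the assumption that the event arguments carry it.

```csharp
using System;
using System.Threading;

// Hypothetical event arguments carrying the validation outcome.
public sealed class ValidationDoneEventArgs : EventArgs
{
    public bool IsValid { get; }
    public ValidationDoneEventArgs(bool isValid) { IsValid = isValid; }
}

// Hypothetical stand-in for a validator that reports completion through an event.
public sealed class FakeAsyncValidator
{
    public event EventHandler<ValidationDoneEventArgs> Validated;
    private readonly bool _outcome;

    public FakeAsyncValidator(bool outcome) { _outcome = outcome; }

    // Starts the work on a thread-pool thread and returns immediately.
    public void RunChecks()
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            Thread.Sleep(50); // simulate validation work
            var handler = Validated;
            if (handler != null)
            {
                handler(this, new ValidationDoneEventArgs(_outcome));
            }
        });
    }
}

public sealed class SyncValidationRunner
{
    private readonly AutoResetEvent _done = new AutoResetEvent(false);
    private readonly FakeAsyncValidator _validator;
    private bool _isValid;

    public SyncValidationRunner(FakeAsyncValidator validator)
    {
        _validator = validator;
        _validator.Validated += OnValidated;
    }

    private void OnValidated(object sender, ValidationDoneEventArgs e)
    {
        _isValid = e.IsValid; // record the outcome before signaling
        _done.Set();          // release the waiting caller
    }

    // Blocks until the handler signals, then returns the outcome.
    // Also returns false if the timeout elapses first.
    public bool Validate(TimeSpan timeout)
    {
        _done.Reset();
        _validator.RunChecks();
        return _done.WaitOne(timeout) && _isValid;
    }
}

public static class Program
{
    public static void Main()
    {
        var runner = new SyncValidationRunner(new FakeAsyncValidator(true));
        Console.WriteLine(runner.Validate(TimeSpan.FromSeconds(5)));
    }
}
```

Passing a timeout to WaitOne is a design choice in this sketch: it keeps the caller from blocking forever if the completion event is never raised.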

FFC – Wish list & Next Steps

One feature I would like to see in upcoming versions of FFC is the ability to check whether a file was valid or invalid directly in the RunValidation() method. Currently I have to set the result in the event handler and read it from a global variable.

Next Step: Assign the name of the file to be validated at runtime. The following code was sent by the FFC author:
FlatFile file = (FlatFile)Schema.DataSource();
file.Path= ;

Next Step: Validate the file data against the existing data in a SQL database.

Results

The Flat File Checker is a complete file validation tool with configurable validation rules.  I was able to validate the files on a continuous basis and generate emails/error files without much effort.
Author: Pooja Lnu