Friday, February 26, 2010

Development plans for 2010. Make a Wish!

I would like your feedback to help plan development in 2010.

Since the end of 2009 I have stopped developing new features without a user request, so if you think something is missing, make sure you tell me about it. There are still a lot of things that can be done to improve Flat File Checker.
I have a number of areas that I think should be developed/improved in 2010, but I am not sure whether they are relevant to you.

See the description of each point below.

1. Match & Merge (De-duplication)
a. Duplicates Report - implemented - see User Guide->Add Actions->Report Duplicates.
b. Fuzzy Logic / Heuristic
- Soundex - implemented - see example of a derived field with Soundex function in the User Guide->Calculated Field->Add Calculated Field;
- Levenshtein (distance between two strings)
- Dictionaries (matches between official names and nicknames, e.g. Robert - Rob - Robbie - Robin - Rupert - Bob - Bobby - Bert) - implemented - see how to match records using special values;
c. Some kind of conditional logic, if it is not too complicated, e.g. if Forename is missing, then check whether Initials are the same - implemented - see special values for matching.
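
To make the ideas in point 1 more concrete, here is a minimal sketch in Python of how Levenshtein distance, a nickname dictionary and the "fall back to Initials" rule could work together. This is not how Flat File Checker implements matching; the names `levenshtein`, `NICKNAMES`, `canonical` and `forenames_match`, and the `max_distance` threshold are all assumptions made up for this illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the working row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


# Hypothetical dictionary: nickname -> official name (illustration only).
NICKNAMES = {"rob": "robert", "robbie": "robert", "bob": "robert", "bobby": "robert"}


def canonical(name: str) -> str:
    """Lower-case a forename and replace a known nickname with its official name."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def forenames_match(fore_a: str, fore_b: str,
                    init_a: str, init_b: str,
                    max_distance: int = 1) -> bool:
    """Compare forenames fuzzily; if either is missing, compare Initials instead."""
    if not fore_a.strip() or not fore_b.strip():
        # Conditional rule from point 1c: Forename missing -> check Initials.
        initials_a = init_a.strip().upper()
        return bool(initials_a) and initials_a == init_b.strip().upper()
    return levenshtein(canonical(fore_a), canonical(fore_b)) <= max_distance


print(forenames_match("Bob", "Robert", "B", "R"))   # nickname resolved via the dictionary
print(forenames_match("", "Robert", "R", "R"))      # forename missing, initials compared
```

A threshold of one edit is only a starting point; longer names usually tolerate a larger distance, so a real rule would probably scale it with the length of the strings.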

2. Web Services (WS)
a. Basic functionality to upload Flat File Schema to the server.
b. Basic functionality to upload data files against Schema that is already on the server.
c. Run validation of the file set against the schema.

Thanks to Chris (RacerX64), who has already spent a weekend (man, I hope you got a bonus for that) writing most of the needed functionality, it should take a couple of days to finish this task. The biggest challenge with WS is setting up a public server that is available to everybody. If somebody can help with IIS hosting, it will be highly appreciated. Once WS is up and running, I will think about a basic web interface for file upload and error reporting.

3. Data Error Log & Execution
a. Make an HTML or XML+XSLT version of the report that is user-friendly enough to send to data providers - implemented. The output format of the log now depends on its file extension and can be Text, Xml or Html (see User Guide->Create New Schema->Log Formats).
b. Replace the current execution logging with the standard Tracing mechanism, which is a more flexible and robust way of logging.

4. Data File Preview
a. Add a form to the GUI that shows data from the file in a table with errors highlighted - implemented.
b. Add a form for managing the field settings of a fixed-position file.

5. Documentation
a. I want to make it easy to use and read. Please help me with this!
b. I have started shooting short videos to build up a tutorial. I think there is space for about 6 more movies. Please leave your feedback or produce your own video.
c. I need to make a sample schema and data set that everybody could see and play with.
d. If you use Flat File Checker extensively, please write an article or a business case and share it with other users. You can post them on the forum, in your blog, or here.

6. XSLT Schema
a. I don't know whether anybody uses IE to view the schema, but I find it quite handy. I want to slightly change the way the schema is presented. Your thoughts on ways to improve the transformation will be appreciated.

7. Data Rule Templates
a. Last but not least, I need to create a GUI for creating Query rule template files. This is a very powerful piece of functionality and I use it for data QA of marketing selections. Though it is possible to write a template file from scratch or use Custom Query Xml as a base, it is not the easiest task and not something I like to do. Feel free to ask about Template Queries while GUI functionality for them is still under consideration.
b. I’m also thinking about generic rule templates that will cover rules other than Query Rule templates.

Please post your wishes or vote here. Your feedback is much appreciated!


UPD - June 2010:
I have done almost everything I wanted on duplicate record matching (releases after v.0.7.0.9), so I will start working on improving the error logs and execution logs. Any requests are welcome.
UPD2 - June 2010:
Logs are now available in Html and Xml formats (releases after v.0.7.1.0).
UPD3 - Jan 2011
Data preview is now available, which allows you to edit original data files directly in Flat File Checker (see details here).