Monday, August 18, 2008

Parsing Flat files and new Data 2.0 - Part I

This is the first of a two-part article on parsing a file of fixed-length, line-delimited records into C# objects. This month’s MSDN Magazine featured two articles that caught my eye. The first was the Toolbox column written by Scott Mitchell, where he mentions an open source project, FileHelpers (filehelpers.com), by Marcos Meli. I was excited to see this, as a few of our projects need this type of support for manipulating text-based data, and it seemed to have all of the features that I wanted. In the second article, I will take a look at the new Data 2.0 features that were released with Visual Studio 2008 SP1 and .NET 3.5 SP1 (see Scott Hanselman’s posting).

Here’s the brief little demo that I want to build as a console application in .NET 3.5: read a flat file containing prescribers. The format happens to be defined by SureScripts, and it is one that we use with our eScript Messenger product.

sample-1

There are 42 fields in the 4.0 format of this file. I fired up the FileHelpers Wizard and proceeded to type in the details of the 42 fields. The wizard takes you through a series of four dialogs that ask for the record structure you will read (fixed length or delimited), the name of the class and its visibility, and then the fields that each record contains. There is support for ignoring the first N or last N lines, ignoring empty lines, and identifying a comment marker. For more advanced usage, you can even selectively filter records based on conditions that you provide.
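To give a feel for what those wizard choices turn into, here is a rough sketch of a generated record class. The attributes are FileHelpers attributes, but the field widths, the header line, and the comment marker are my assumptions for illustration only; they are not the real SureScripts layout, which has 42 fields.

```csharp
using FileHelpers;

// Hypothetical sketch: names and widths are illustrative, not the
// actual SureScripts prescriber format.
[FixedLengthRecord]               // fixed-length rather than delimited
[IgnoreFirst(1)]                  // assumed: skip one header line
[IgnoreEmptyLines]                // skip blank lines
[IgnoreCommentedLines("#")]       // assumed comment marker
public class Prescriber
{
    [FieldFixedLength(35)]        // assumed width
    [FieldTrim(TrimMode.Right)]   // drop the padding spaces
    public string LastName;

    [FieldFixedLength(35)]        // assumed width
    [FieldTrim(TrimMode.Right)]
    public string FirstName;
}
```

Each wizard option maps to one attribute, so the class itself doubles as the file definition.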

sample-2

sample-2a

Above is a sample that I entered, with a preview of the class that the wizard creates for you. Take a look at the button in the lower left. That is a really nifty button that allows you to test your definitions before you copy or save the code files out to your project! Marcos really deserves a lot of credit for a well-thought-out application.

Before I gush with too many more accolades for FileHelpers, I should point out a couple of issues that I ran into while putting this together. While you can see the optional field checkboxes checked in the dialog above, I discovered that you can only check the Optional Field check box on a field if every field after it is marked optional as well. I was presented with an error dialog, but it would have been nicer to see that in the field designer. Hey, it’s free/donationware.
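My guess at why the rule exists: in a positional format, the only way a field can be missing is for the line to end before that field starts, so only trailing fields can ever be absent. Here is a minimal sketch of that idea in plain C#. This is not how FileHelpers is implemented, and the class name, method name, and widths are all made up for illustration.

```csharp
using System;

public static class FixedWidthSketch
{
    // Slices a line into fields of the given widths. A field comes back
    // as null when the line ends before that field starts.
    public static string[] Split(string line, int[] widths)
    {
        var fields = new string[widths.Length];
        int offset = 0;
        for (int i = 0; i < widths.Length; i++)
        {
            if (offset < line.Length)
            {
                // Take the full width, or whatever is left of a short line.
                int length = Math.Min(widths[i], line.Length - offset);
                fields[i] = line.Substring(offset, length).TrimEnd();
            }
            offset += widths[i];
        }
        return fields;
    }
}
```

With widths of 10, 10 and 5, the line "Winston   Harry" yields a last name, a first name, and a null third field. A missing field in the middle would shift everything after it, which is presumably why FileHelpers insists that fields marked with its FieldOptional attribute sit at the end of the record.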

sample-2b

After I went back and removed all of the optional fields, all was well, and I was presented with a nice Grid View showing all of the data that I had in my sample data file. Now that is very nice, as it reminds me of the old days when I imported flat files into Microsoft Access databases and such.

sample-3

To finish this off, I saved the generated class as well as the handy template for loading the sample data, fired up Visual Studio 2008, and created a new console application. Below is the source to read in all of the records:

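    static void Main()
    {
        // The async engine reads one record at a time instead of loading the whole file.
        using (var engine = new FileHelperAsyncEngine<Prescriber>())
        {
            engine.BeginReadFile(@"..\..\sample.txt");

            // ReadNext() returns null once the end of the file is reached.
            while (engine.ReadNext() != null)
            {
                Prescriber record = engine.LastRecord;
                Console.WriteLine("{0}, {1}", record.LastName, record.FirstName);
            }
        }
    }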


sample-3a

That’s really all there was to it. I changed the original template a little to use the C# 3.0 var keyword that comes with .NET 3.5, which I am liking more after my initial impressions of it. Incidentally, I blame ReSharper for that, as it likes to provide too many of those helpful hints about changing an explicit type declaration to var.

So there you have it. I had never used FileHelpers before and had a working prototype up and running in under 20 minutes. I saved the definition file for the flat file so I can go back and make changes to it in the future if necessary.

Here’s the console running in all of its glory. What would we do without our old friends Harry Winston and Anthony Cardino!


sample-4

In Part II, I will show how to use the new .NET 3.5 SP1 (or 3.6, as Hanselman sez) and stitch up some simple data access using LINQ and the new SP1 features.
