Thursday, July 29, 2010

Querying the 4 table structure.

In my previous post, I described how four physical tables can be thought of as a means to create infinite virtual tables. One natural question you may have about this technique is "Ok, so how the heck do I query the virtual table?" It may seem a little challenging at first, but it is more than possible. Easy even.

Take our customer-process virtual table for example.


You need a query that will tell you every customer that has a max particle size over 1.5 inches. Remember that the table structure that drives this virtual table looks like this.


What we'll want is all the customer-process data results where the value is > 1.5 and the process-info-detail is Cement-Mixing's Max Particle Size.

That query simply looks like this :

SELECT CPD.CustomerProcessID
FROM Cust-Process-Data CPD
INNER JOIN Process-Info-Detail PID ON CPD.ProcessInfoDetailID = PID.ProcessInfoDetailID
INNER JOIN Process-Info PI ON PID.ProcessInfoID = PI.ProcessInfoID
WHERE PI.Process = 'Cement Mixing' AND PID.ProcessInfo = 'Max Particle Size' AND CPD.Value > 1.5

Or, to genericise it,

SELECT RowHeaderID
FROM CellData CD
INNER JOIN ColumnHeader CH ON CD.ColumnID = CH.ColumnID
INNER JOIN Table T ON CH.TableID = T.TableID
WHERE T.TableName = '{Desired Table}' AND CH.ColumnName = '{Desired Column}' AND CD.Value {ValueClause}


What I'll typically do with this clause is throw it in a sub select and join against the result. Pretend the above SQL format is abbreviated as Query in the following examples.

SELECT CustomerID
FROM CustomerProcess CP
INNER JOIN (Query) Q1 ON CP.CustomerProcessID = Q1.CustomerProcessID

And BAM we've got our results. This query structure is flexible enough to mimic 'and' and 'or' clauses against the underlying virtual tables. To get an and, you simply need to inner join the different subselect queries- the result set will only include rows that meet all the provided criteria - like so.


SELECT CustomerID 
FROM CustomerProcess CP 
INNER JOIN (Query) Q1 ON CP.CustomerProcessID = Q1.CustomerProcessID
INNER JOIN (Query2) Q2 ON Q1.CustomerProcessID = Q2.CustomerProcessID

And to mimic an 'Or' clause, you can simply union the sub selects. Cool, huh?

Wednesday, July 21, 2010

IProvider to sidestep dependency problems.

This is another not-new concept, but it's one that I haven't found explained simply and might benefit some people.

When you have a three tier structure like I've suggested, and you've got a process that requires looking though a large amount of data for a matching item, it may seem like you've got yourself in a pickle. Since the Data  assembly is dependent on the Core assembly, Core can not use Data directly. The first instinct may be to just relate all the Core objects and set up mappings to permit this data to be lazily loaded during the process by NHibernate. This does work, actually, but it has a few smells to it. First, Core is now being modified by our data concerns. Sometimes these relationships are appropriate from a purely business need, but many times they are not. Second, Data is now being modified by our Core concerns. The mapping tweaks such a solution would probably require to be palatable are not likely to be useful in other areas of the system.

So how else would you do this? Using a provider.

The first step is to define, using an Interface in Core, the query you want to run. Lets say you want to grab all orders above a specific quantity shipped to a specific state. Maybe you want to step though them to find out their frequency and see if you should advise a later ship date to save on freight by combining shipments. Something like that.

public interface IQueryOrders 
{
     IEnumerable FindOrders(string ToState, int AboveQty);
}

 Next, you want a place for this to live in your Core assembly.

public static class Providers
{
      public static IQueryOrders OrdersProvider { get; set; }
}

Now, you need to create this provider in the Data assembly.

public class OrderProvider : Core.IQueryOrders
{
     public IEnumerable FindOrders(string ToState, int AboveQty)
     {
          // Your Query logic goes here. 
     }
}

So now we've got the Core assembly setting up a contract, and the Data assembly setting up a facility to handle that contract. All that's left is to connect one to the other. I do this in the UI assembly.

Core.Providers.OrdersProvider = new Data.OrderProvider();

And finally, you can write your code in Core to consume the output of this provider and act upon it. Not ground breaking, but simple, and useful.

Monday, July 19, 2010

Three Tier Structure on a Desktop Application.

When you are developing in Visual Studio, the urge is to make a monolithic application. Visual Studio practically begs you to. It calls out to you with the siren's song of WYSIWYG DB structure diagrams. It tempts you with embedded data providers that serve their function with the minimal amount of code required. It seduces you with  auto-generated CRUD functionality that can keep you from having to even think about SQL. Lash yourself to the mast my friends. Don't give in. The up side to these monolithic applications is that they provide a shorter route from File -> New-> Project to something you can run and play with. The down side, however, is everything else.

Monolithic applications built in this manor are difficult, if not impossible to unit test. They are difficult, if not impossible, to migrate from DB structure to DB structure. They make it difficult for you to truly own your Business Logic and not have the class structure dictated to you by a one-to-one relationship between class and table. They limit your freedom, your ability to maintain your work, and, in the long run, are much more expensive to upkeep. 

What you want is an application architecture that permits unit testability. One that separates your concerns so that your Data Structure and your Business Logic class diagram only look similar when that similarity fits the needs of both functions. A design that permits you to minimize the unintended impact of code changes as much as humanly possible, and to keep the number of lines required to change any given aspect of it's functionality to the bare minimum. A design that allows you to have more than one UI, if needs dictate.

The most simple structure that provides for each of these needs is the three tier structure. It looks like this.

A Core project holds all your business logic. This is where you define the entities that you require to check off all the requirements on the spec sheet. These entities should be entirely code driven- unaware of the fact that they exist on SQL or in any other database structure. Unaware of the fact that they are being manipulated by a Windows program or a Webservice or any other front end. Do what I do, and don't permit the core project to reference System.Data or System.Windows. The reference direction will prohibit it from accessing functions in the Data and UI assemblies. 

The Data project is where you want all your SQL interaction to reside. It will consist of the mappings of the persist-able business objects to the database behind the scenes, as well as any-and-all direct queries that the UI might need to run. The Data assembly should not know nor care what UI technology it is providing data for, and should not be intelligent about the purpose of the Core objects. As far as Data is concerned, it's only function is to create Core objects and return them to whatever wants them. I prefer NHibernate for a DRM solution in this assembly.

Finally, the UI layer is where you want any and all code relating to the presentation of data and interaction with the end user. This technology could be WPF, or WinForms, or webforms, or Silverlight, or even a webservice. It is whatever makes the most sense. But whatever it is, it should not contain any business logic or sql interaction code. 

The hardest part of this structure is keeping yourself honest. It is tempting, hell, even easy, to write business logic in the UI layer by accident. A line of code that starts off as a simple call to a parameterless constructor one day may eventually morph into several lines that instantiate a business object and set up its default values, such as the creation date and name of creator. When requirements for prompts during certain procedures come along, it is very tempting to simply write a call to MessageBox in the Core assembly. 

These temptations must be resisted. 

Friday, July 16, 2010

Bliss and the Four Table Structure.

One of the most common design considerations I need to make in laying down a data structure is the flexibility that the end-user requires. Sometimes it's just the ability to modify what shows up in a drop down box- and then they get a lookup table. Sometimes it's the ability to create and apply properties to business objects- and then they get my standard tag structure (but that's another post). But sometimes, what the end-user really requires is the ability to create their own data structure.

Let's say your client wants to be able to attach information to their customer entity. But not just one piece of information- entire sets of it. If their customer deals with a specific process, they want to be able to record a great amount of detail about that process. And they want to be consistent about how they record it across many customers. A notes field is not going to lend itself to that structure. Let's further say that they might want to change what they record about this process tomorrow, and they don't want to have to wait for you to show up, modify a table and re-design a form to have a field for this new piece of data.

The data that goes into this structure just makes sense. You would want to capture this stuff anyway, and this format for capturing data works pretty well for allowing your end user to define what they want to capture without a need for your intervention.


I call it the four table structure. It defines what I like to think of as Virtual Tables. This is not ground breaking stuff. You've probably already made a few data structures that act this way without realizing it. What might be different about this is thinking of it as a virtual table. If you were to take the data from our example up there and envision it as an actual table using these mappings, the result would look something like this.



It's a table!  It's a freaking table! Made out of data!

Thursday, July 15, 2010

Desktop applications and NHibernate

I've been using NHibernate almost exclusively for the last 2 years for all my data access needs. That in an of itself is not remarkable. What does feel remarkable, to me at least, is that I don't really do any web development. Or maybe that isn't remarkable. I'm not sure.

What I am sure about is that most of the documentation and examples that you commonly encounter with NHibernate are about Blogs, or web stores, or something equally non-desktop based. Maybe as a direct result of this, it took me a good six months of banging my head against the NHibernate learning curve before everything about it really clicked. So, without further ado, here are some things I've picked up.

ITEM THE FIRST: An ISession and a database transaction are not the same thing. An ISession and a database transaction are not intended to be the same thing. Sessions should open and close transactions every time they interact with the database, but you should never, ever open a transaction and keep it open the entire time the ISession is alive. This will result in table locks galore and more headaches than you will know what to do with.

That being said, ISessions make sense to hold open in relation to a form- it enables first level caching as a performance booster and most users will intuitively understand that they will sometimes have to close and open or reload a form to see updates


Unfortunately, this is not one of those problems that single developer testing on a development database is going to enlighten you to. Keeping a Transaction open the entire time a form is open so you can quickly and easily roll back changes might seem like a good idea on the surface! IT IS NOT.

ITEM THE SECOND: Just because you have "Lazy" Loading doesn't mean you get to be. In my specific application domain, I have several high level domain wide objects or collections of objects that almost everything is related to, like "Company" and "Employee". In earlier versions of my domain, I had these objects related though object references with all the child objects that belonged to them. This resulted in some hideous N+1 problems.

For the uninitiated, an N+1 problem is when, for every result returned in a query, you run an additional query to fetch related data. If I wanted to show a list of some child object, and include on that list the name of employee who was the owner of that object, I face planted on an N+1 problem. N+1 is probably the biggest trap to avoid when working with NHibernate, because it will destroy your performance.

The internet will probably point you towards a more web oriented solution to these issues- which is to tweak your query plan, or your mapping files to perform batch selects. There is nothing wrong with these approaches, but they aren't ideal. You need to know ahead of time what kinds of queries you are going to run to tweak your query plan, and if you edit the mapping files you may introduce some unintended consequences. But we don't have to deal with unknowns in terms of the life span of the assembly, nor do we have to worry about extending the length of our first request and creating uneven performance, so there are other options.

The plan I settled on was to pre-load these kinds of things. On application_load, I run a series of queries which loads the non-changing parts of the domain into some application static objects. The child objects that refer to these get loaded with only a integer pointing to the unique ID of the static object in a private field from NHibernate, and then I have a simple property that wraps up access to this to make it transparent to the user that I am, in fact, not dealing with these things on a purely object level.

Not exactly ground breaking, but not exactly something that naturally occurs to people, or me at least.