malloryCode.com

Dynamic Predicate Building - Part 1

Nov 03 2013

There are many great blog posts on dynamic predicate building for c# and LINQ, many of which center on the venerable Alabahari PredicateBuilder. And in most of those blog posts there is a comment to the effect: But what if I don't know which fields/tables will be queried? Well that is a requirement many developers would like to avoid, but, of course, it does happen.

My own crack at answering that question came when I was tasked with building a query builder, for non technical users, with the requirement that the users could use any field in the database to query on, and any field in the database as an output. They could combine as many query expressions as they wanted, and (the fun part), they could group the query expressions as they saw fit. And sort on any output field. The data model (I used NHibernate) was fairly complicated with about 50 or so main entities and 50 lookup type entities.

So the Alabahari Predicate Builder was not going to work for me. I found several projects that got me close, but it ultimately did not work because of the complex nature of my data model. I would have to generate predicates such as: context.Where( x => x.Child.Toy.Manufacturer.Name == "Hasbro" ) or, one of the entities in the chain could be a collection: context.Where( x=> x.Child.Toys.Any( z => z.Manufacturer.Name == "Hasbro" ) ) given just text strings from form submissions. I did find Pete Montgomery's post about predicate building very useful, and I acutally started mine from his.

This was a web application, so the front end was all HTML and javascript. In brief, I iterated through the data model server side and created JSON, which ended up being rendered as a tree control. I did this so that the user could visualize the relationships between entites. Using the tree, they would select a field, then enter their criteria. Each criteria would go into a list - and behind the scenes I was tracking the fields selected and criteria entered in a format that was conducive to using in my predicate builder.

For each item in the criteria list, the field was submitted as a list of strings which completely described it within the data model. From the first example above, the field would have been indicated as { "Child", "Toy", "Manufacturer", "Name" }, or what I refer to as the type chain. And this is what would get passed into the predicate builder. How exactly I that array was constructed is the subject of another post.

public static Expression<Func<TEntity, bool>> Build<TEntity>( List<string> typeChain, 
                                                              OperatorType operation, 
                                                              object value1, 
                                                              object value2 )

This is the method signature of the main function. The ultimate property being queried on would be the last of the chain in the typeChain list. The list would start with the first property after the root type, and include the property name of each property between the root type and the query field, as in the example above. There is an Enum for the type of operation, and the values for comparison. I have two possible values to support the Between operator.

So let's dig into it. The first thing you need to do is get the root parameter expression. A lambda expression starts with ( x =>, x goes to ..., so x is the starting parameter expression. I have a generic Build method, so the root expression would be of type TEntity. This method is designed such that the whole parameter will build up from that root type.

ParameterExpression rootExp = Expression.Parameter( typeof( TEntity ), 
                                                    typeof( TEntity  ).Name.Substring(0,1));
Type dataType = typeof( TEntity );

The second parameter of Expression.Parameter() is just the name of the variable in the predicate. And we grab the TEntity type, since we will need to get at its properties. Next we start off with a null System.Reflection.PropertyInfo, which will get populated based off our links string array. I also use another Expression, set to the root expression. The root expression will not change within the method, but the other Expression will be changed. At various points, the original expression is needed.

PropertyInfo pi = null;
Expression expr = rootExp;

Then we loop through the typeChain list, and build up our expression until we reach the end. Here is the full loop, which I will explain from the bottom up, from the simple case to the more complex case. However, the first step is just to convert that string (property name) into a concrete PropertyInfo object.

for(int counter = 0; counter < typeChain.Count(); counter ++ )
{
    pi = dataType.GetProperty( typeChain.ElementAt(counter) );

    if( pi == null )
    {
        // possible if original list consists of collection and collection's 
        // parameters (i.e. Parents.Any(x=>x.Child.Name == 'Tony') )
        // if so, entire typeChain list consumed by recursive call. 
        // Return the inner predicate, which is in expr
        return Expression.Lambda<Func<TEntity, bool>>( expr, rootExp );
    }
    else if ( pi.PropertyType.Name == "IList`1" )
    {
        // call back into this function, but return is parameter for 'Any'
        Type collectionType = pi.PropertyType.GetGenericArguments().ElementAt( 0 );

        // any method
        MethodInfo anyMethod
            = typeof( Enumerable ).GetMethods()
                                  .Where( m => m.Name == "Any" && m.GetParameters().Length == 2 )
                                  .Single()
                                  .MakeGenericMethod( collectionType );

        // generic version of this method
        MethodInfo genericMethod = typeof( Instance ).GetMethod( "Build", BindingFlags.Public | BindingFlags.Static );
        MethodInfo genericBuild = genericMethod.MakeGenericMethod( collectionType );

        List<string> newtypeChain = typeChain;
        // may arrive here after several iterations
        for(int r = 0; r <= counter; r++)
        {
            newtypeChain.RemoveAt( 0 );
        }
        Expression innerExpression = (Expression)genericBuild.Invoke( null, 
                                                                      new object[] {
                                                                         newtypeChain,
                                                                         operation,
                                                                         value1,
                                                                         value2 }
                                                                    );

        Expression collectionExpression = Expression.Property( expr, pi );
        expr = Expression.Call( anyMethod, collectionExpression, innerExpression );
    }
    else
    {
        expr = Expression.Property( expr, pi );
    }
    dataType = pi.PropertyType;
}

The final else is the simple case. pi is a scalar property of the root type and we build up expr based on that property. For example, if TEntity was type Parent, and links[counter] was equal to 'Child', expr would become p.Child, where p is our root parameter expression.

At this point, we could return an Expression<Func<Parent, Child>> like so:

return Expression.Lambda<Func<Parent, Child>>( expr, rootExp ); 

which, in a LINQ method, would look like:

context.entities.Select( p => p.Child )

Continuing the loop expr would become p.Child.Toy.Manufacturer.Name, if each string in our list represented a scalar or navigation property. But what if the property was a collection? If our data model was Parent.Child.Toys, the PropertyType name for Toys would begin with IList`1.

Note that I set all this up using NHibernate, and my collections were configured as Bags. If you are using Entity Framework, or some other ORM, it's possible the collections would be detectable by some other means.

So, when a navigation property is encountered that is a collection, our lambda expression is going to have to use the Any method before delving deeper into the data model. Since, for example, Toys does not have any scalar properties, such as Manufacturer, the parameter will need to be structured as: Toys.Any( t => t.Manufacturer == 'Hasbro'). To achive that, we need to build an inner lambda. Since our Build method delivers lambdas for any type in the data model, we can recursively call the Build method to do just that.

So let's take a closer look at how to achieve that. First, we need to get the Type of the collection. And there is a handy method for doing just that.

Type collectionType = pi.PropertyType.GetGenericArguments().ElementAt( 0 );

Next we need to get a MethodInfo object, which is a generic version of the Any method for the current collection type:

MethodInfo anyMethod
    = typeof( Enumerable ).GetMethods()
                          .Where( m => m.Name == "Any" && m.GetParameters().Length == 2 )
                          .Single()
                          .MakeGenericMethod( collectionType );

Then a generic version of the Build method:

MethodInfo genericMethod 
    = typeof( PredicateBuilder ).GetMethod( "Build", 
                                            BindingFlags.Public | BindingFlags.Static );
MethodInfo genericBuild = genericMethod.MakeGenericMethod( collectionType );

There is some maintenance on the typeChain string array, and then actually creating the inner expression (that will go into Any()). When this call to Build returns, it's going to have the inner expression, i.e. t => t.Manufacturer.Name == 'Hasbro'. That inner expression is the predicate for the Any call.

Expression innerExpression = (Expression)genericBuild.Invoke( null, 
                                                              new object[] {
                                                                 newtypeChain,
                                                                 operation,
                                                                 value1,
                                                                 value2 }
                                                            );

which is done by invoking the generic version of the Build method. The last line:

expr = Expression.Call( anyMethod, collectionExpression, innerExpression );

is what creates the Toys.Any() part of the expression. Which leads to the special case that is what the first check in the for loop is for.

if( pi == null )
{
    // possible if original list consists of collection and collection's 
    // parameters (i.e. Parents.Any(x=>x.Child.Name == 'Tony') )
    // if so, entire typeChain list consumed by recursive call. 
    // Return the inner predicate, which is in expr
    return Expression.Lambda<Func<TEntity, bool>>( expr, rootExp );
}

In other words, this may happen if the entire type chain is consumed by recursive calls. When those calls return, and the original for loop continues, the dataType variable will be equal to the property type of the collection encountered, but next property name in the type chain will not be found.

OK, the predicate, in terms of selecting that end scalar property, has been put together. Now it's time to finish by adding in the criteria. The values coming in are converted to intrinsic types (they come in as objects), via the helper method ProperType() and a ConstantExpression is created.

// values
ConstantExpression constExp1 = Expression.Constant( ProperType( pi, value1 ) );
ConstantExpression constExp2 = Expression.Constant( ProperType( pi, value2 ) );

All that's left it to put it together, and return the predicate.

// put together
Expression compareExp = CompareExpression( expr, constExp1, constExp2, operation );
return Expression.Lambda<Func<TEntity, bool>>( compareExp, rootExp );

When Build<> returns, if you examine the returned predicate's DebugView at a breakpoint, it will look like this:

Debug View

You can see how LINQ will nest the predicates.

Let's take a quick look at CompareExpression, which is a helper method which matches the operation to perform to an actual method utilized by LINQ.

switch(operation)
{
    case OperatorType.Greater:
        return Expression.GreaterThan(paramExp, constExp1);
    case OperatorType.GreaterEqual:
        return Expression.GreaterThanOrEqual(paramExp, constExp1);
    case OperatorType.Less:
        return Expression.LessThan(paramExp, constExp1);
    case OperatorType.LessEqual:
        return Expression.LessThanOrEqual(paramExp, constExp1);
    case OperatorType.NotEqual:
        return Expression.NotEqual(paramExp, constExp1);
    case OperatorType.Between:
        Expression expGT = Expression.GreaterThanOrEqual(paramExp, constExp1);
        Expression expLT = Expression.LessThanOrEqual(paramExp, constExp2);
        return Expression.AndAlso( expGT, expLT );
    case OperatorType.CompareTo:
        // working, but not a bool - it returns int
        MethodInfo c = typeof(string).GetMethods().Where(x=>x.Name == "Compare")
                                                  .Where(x=>x.GetParameters().Length == 2)
                                                  .First();
        return Expression.Call(c, paramExp, constExp1);
    case OperatorType.Like:
        MethodInfo m = typeof(string).GetMethod("Contains", new Type[] { typeof(string) });
        return Expression.Call(paramExp, m, constExp1);
    case OperatorType.StartsWith:
        MethodInfo s = typeof(string).GetMethod("StartsWith", new Type[] { typeof(string) });
        return Expression.Call(paramExp, s, constExp1);
    case OperatorType.Blank:
        if(paramExp.Type.Name == "Int32")
        {
             return Expression.Equal( paramExp, Expression.Constant(0) );
        }
        return Expression.Equal( paramExp, constExp2 );
    default:
        return Expression.Equal(paramExp, constExp1);
}

As you can see, I never did get OperatorType.CompareTo to completely work. Some of the others were a bit tricky too. For Between, I used the predicate combination code from Peter Montgomery mentioned above.

For some, I had to use the Expression.Call() method as no built in method was available. The whole function is included in the example project.