Dynamic Code Generation in C# With Expression Trees

Bagoum
Feb 12, 2023

Expression trees are a powerful C# feature that lets you create functions dynamically and run them at compiled speed. In many cases, you can massage Func and Action to handle your dynamic code generation needs, but if you want maximum power and flexibility, you'll want to use expression trees.

In this article, we’ll go through how expression trees work, and then look at how they can be used in practice.

Pictured: Compiling a scripting language into a C# function (Func<float,float,float,ParametricInfo,float>) at runtime, based on the concepts presented in this article.

Introduction to Expression Trees

Expression trees (or expressions for short) are similar to abstract syntax trees in that they are a more machine-readable representation of C# source code. Unlike MSIL, which is painfully low-level, expression trees are high-level: building one mostly means writing ordinary C# constructs out as boilerplate method calls. In a way, compiling and executing an expression is similar to running eval in JavaScript or Python, except you get the type-safety, speed, and security of statically-typed and compiled C#.

Furthermore, even though you’re “compiling” code, you can still operate on the same in-memory objects as the rest of your C# code, so expression trees have the same context as any other code you write.

To use expressions, we first create an expression using the static methods in System.Linq.Expressions.Expression, and then compile it into a delegate or function type. Let's say we want to create an expression tree to mimic the following function:

private static int func(int x) {
    x += 10;
    return x * 3;
}

The expression for this is straightforward enough: there are two expressions, one add-assign expression and one multiply expression.

using System.Linq.Expressions;
using Ex = System.Linq.Expressions.Expression;
...

ParameterExpression x = Ex.Parameter(typeof(int));
Expression exFunc = Ex.Block(
    Ex.AddAssign(x, Ex.Constant(10)),
    Ex.Multiply(x, Ex.Constant(3))
);

The value “returned” by Ex.Block is the value of the last expression.

ParameterExpression can be used for local variables or function parameters. In this case, we want x to be a function parameter, so we pass it to the compilation function:

Func<int, int> fun = Ex.Lambda<Func<int, int>>(exFunc, x).Compile();

Then we can execute it like any other Func:

Console.WriteLine(fun(42)); //=156

We could also make this a ref function if we wanted — more on this later in this article!

You can think of expressions as the ability to dynamically add new code, specifically functions, within the context of a running program. You get the speed and safety of compiled C# code, but at the same time, you can refer to objects or even types that only exist at runtime, and you can operate over types in a way that isn’t possible with normal C#.

In other words: expressions are the best of all worlds, with the one sticking point that they are extremely verbose.

Possible Use Cases

Let’s say that you want the ability to represent arbitrary mathematical functions (or, the more practical case, query selectors). In most use cases, the most complex functions are created by composing simple functions. For example, the function for calculating the polar radius of an n-pointed star, which is a “complex” function, is r(theta) = R cos(2f) / cos(mod(theta, 4f) - 2f) where f=pi/n. This complex function only requires cosine, multiplication, division, modulo, and subtraction operations.

If you knew what complex function you were going to need ahead of time, then you could write something like:

Func<double, double> StarPolarRadius(double R, double n) {
    double f = Math.PI / n;
    return theta => R * Math.Cos(2*f) / Math.Cos((theta % (4*f)) - 2*f);
}

What if you don’t know the function ahead of time, and are instead given some kind of JSON describing the mathematical function that an API caller wants? This is where you might need to get into reflection or dynamic code generation.

You could use reflection to invoke each method in the description, such as Cos or * or /, whenever the function is called. However, if the output function is to be executed multiple times, then you run into some trouble, since you might have to reflect each method in the description every time the output function is called. There are ways to tiptoe around this with CreateDelegate, and you also have the option of rewriting all the basic functions as Func<double, double>, so that you can reflect them once and then invoke them repeatedly. In this case, your math helper library might end up looking something like this:

public delegate double MathFunc(double t);

static MathFunc Cos(MathFunc arg) {
    return t => Math.Cos(arg(t));
}
static MathFunc Divide(MathFunc arg1, MathFunc arg2) {
    return t => arg1(t) / arg2(t);
}
static MathFunc Identity() {
    return t => t;
}
...

Then, if someone asked us to create a function cos(x)/x, we could use reflection to create a nested lambda Divide(Cos(Identity()), Identity()) once, and then we could execute it for each value of the function we need to inspect. This solution is workable in most cases. However, there are several issues with it, such as the question of how we handle basic features like the local variable double f = PI / n in the example code.
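The CreateDelegate route mentioned above might look roughly like the following sketch. It assumes the method name ("Cos") has already been pulled out of the caller's description, and it uses only Math.Cos from the BCL as a stand-in for whatever methods the description names. The reflection lookup happens once, and every later call goes through a strongly typed delegate.

```csharp
using System;
using System.Reflection;

// Look the method up by name once (the name could come from a JSON description).
MethodInfo cosInfo = typeof(Math).GetMethod("Cos", new[] { typeof(double) })!;

// Bind it to a strongly typed delegate once; later calls skip reflection entirely.
var cos = (Func<double, double>)Delegate.CreateDelegate(typeof(Func<double, double>), cosInfo);

Console.WriteLine(cos(0)); // 1
```

This removes the per-call reflection cost, but it only covers invoking existing methods; it does not help with composing them or with features like local variables.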

More flexible than reflection is dynamic code generation with expression trees. Instead of creating a deeply nested lambda that executes our desired function, we will create a deeply nested lambda that generates code for our desired function. We will then run this lambda once, compile the expression into a function, and cache the compiled function. The general structure for that looks like:

using System;
using System.Linq.Expressions;
using Ex = System.Linq.Expressions.Expression;

public delegate double MathFunc(double t);
public delegate Ex ExMathFunc(Ex t);

static class Program {
    private static ExMathFunc Cos(ExMathFunc arg) {
        return t => Ex.Call(typeof(Math).GetMethod("Cos", new[] {typeof(double)}), arg(t));
    }
    private static ExMathFunc Divide(ExMathFunc arg1, ExMathFunc arg2) {
        return t => Ex.Divide(arg1(t), arg2(t));
    }
    private static ExMathFunc Identity() {
        return t => t;
    }
    private static void Main(string[] args) {
        //This might be created by arbitrary runtime reflection lookup instead
        ExMathFunc ex_emf = Divide(Cos(Identity()), Identity());

        ParameterExpression t = Ex.Parameter(typeof(double));
        MathFunc emf = Ex.Lambda<MathFunc>(ex_emf(t), t).Compile();
        Console.WriteLine(emf(2)); // cos(2)/2 = -.208
    }
}

A few things to note here:

  • The code for ExMathFuncs is almost the same as for MathFuncs; it just uses “expressionified” syntax.
  • We still create a nested lambda (ex_emf), but we execute it immediately (by calling ex_emf(t)) and then discard it. The reason we generate a nested lambda at all is that compiling functions from expressions requires the input ParameterExpression to be passed into all levels of the expression tree, and it's cleanest to do that via a lambda. Also, allowing the child expression trees to be generated at arbitrary times permits some interesting functionality (see the section on Local Variables).
  • The nested lambda yields an expression, which is compiled along with the ParameterExpressions representing function arguments into our target MathFunc.

With expressions, the compilation step (Ex.Lambda<...>(...).Compile) is expensive, but the execution step (emf(2)) runs at the speed of ordinary compiled code, since compiling an expression emits IL that is JIT-compiled like any other C# method. Because the expression API models most of the C# language, we can also handle other basic features like local variables without too much type trouble. As such, expressions are useful when you need to generate complex dynamic functions that will be run multiple times.
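Since compilation is the expensive part, one way to pay for it only once is to keep compiled delegates in a cache. The sketch below is an illustration, not part of the library built in this article: the cache, the string keys, and the GetOrCompile helper are all hypothetical names, and a real system would need to decide what makes a good key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using Ex = System.Linq.Expressions.Expression;

// Hypothetical cache keyed by a textual description of the function.
var cache = new Dictionary<string, Func<double, double>>();
int compileCount = 0;

Func<double, double> GetOrCompile(string key, Func<Ex, Ex> build) {
    if (cache.TryGetValue(key, out var cached))
        return cached; // Cheap path: reuse the already-compiled delegate.
    var t = Ex.Parameter(typeof(double), "t");
    var compiled = Ex.Lambda<Func<double, double>>(build(t), t).Compile();
    compileCount++; // Counts how many times the expensive step ran.
    cache[key] = compiled;
    return compiled;
}

var first = GetOrCompile("t*t", t => Ex.Multiply(t, t));
var second = GetOrCompile("t*t", t => Ex.Multiply(t, t));
Console.WriteLine(first(3));     // 9
Console.WriteLine(compileCount); // 1
```

The second request for "t*t" returns the cached delegate, so the expression is built and compiled only once.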

Handling Multiple Arguments

The code above assumes that we’re only interested in making functions of one argument. However, we can easily generalize expression handling to handle any number of arguments, without significantly changing the function library. This is much easier to do in expressions than it is in raw C#.

First, we’re going to store all the arguments in a class which will be the single object that we pass to all helper functions. This class will be ArgBag, and each argument will be stored in a DelegateArg. We'll write a function GetArgument to get arguments by their name, and a function CompileDelegate that uses the arguments in the ArgBag to compile a lambda of type D.

public interface IDelegateArg {
    string Name { get; }
    ParameterExpression Expr { get; }
}

public class DelegateArg<T> : IDelegateArg {
    public string Name { get; }
    public ParameterExpression Expr { get; }
    public DelegateArg(string name) {
        Name = name;
        Expr = Expression.Parameter(typeof(T), name);
    }
}

public record ArgBag(params IDelegateArg[] Arguments) {
    public Expression GetArgument(string name) {
        foreach (var a in Arguments)
            if (a.Name == name)
                return a.Expr;
        throw new Exception($"No argument named {name}");
    }

    //Select requires `using System.Linq;`
    public D CompileDelegate<D>(BagMathFunc body) {
        var expr = body(this);
        return Ex.Lambda<D>(expr, Arguments.Select(a => a.Expr)).Compile();
    }
}

For our math helper functions, dealing with ArgBag is as simple as changing the types. However, the Identity function is now meaningless, since there may be multiple arguments. Instead, we'll use a Var function, which will pull values out of the ArgBag using the GetArgument function.

public delegate Ex ExMathFunc(Ex t); // Old type
public delegate Ex BagMathFunc(ArgBag x); // New type

private static BagMathFunc Cos(BagMathFunc arg) {
    return t => Ex.Call(typeof(Math).GetMethod("Cos", new[] {typeof(double)}), arg(t));
}
private static BagMathFunc Divide(BagMathFunc arg1, BagMathFunc arg2) {
    return t => Ex.Divide(arg1(t), arg2(t));
}
private static BagMathFunc Var(string arg) => bag => bag.GetArgument(arg);

Now we can easily create functions of any number of arguments, since arguments are referred to by name. For example, here’s a function that divides two numbers:

var bag = new ArgBag(new DelegateArg<double>("x"), new DelegateArg<double>("y"));
var divide = bag.CompileDelegate<Func<double, double, double>>(Divide(Var("x"), Var("y")));
Console.WriteLine(divide(24, 5)); //=4.8

It’s very important to understand that the overhead of locating arguments via GetArgument is a one-time overhead applied in bag.CompileDelegate. The expression that is constructed in CompileDelegate simply takes the first argument and divides it by the second argument. If you put a breakpoint after var expr = body(this), you can see that the expression expr is just (x / y). The x expression is the result from running GetArgument("x") once, and the y expression is the result from running GetArgument("y") once. This is one of the major benefits of expressions for runtime code generation: all the overheads and abstractions become a one-time cost, and the resulting function is as cheap as precompiled C# code.

Consider for a moment what would happen if we tried to generally handle raw C# functions with this argument-bag approach, where instead of IDelegateArg[] we had (string name, double value)[] holding the values of arguments. In that case, since there would be no "compilation" step, the overhead of looking up argument names in the argument-bag would occur every time the function is executed!
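To make that cost concrete, here is a rough sketch of what the raw C# version could look like. RawVar, RawDivide, Lookup, and the lookups counter are hypothetical names introduced only for this illustration.

```csharp
using System;

// Hypothetical raw-C# counterpart of Var and Divide: each node is a plain
// delegate over a bag of named values, with no compilation step.
int lookups = 0;

double Lookup((string name, double value)[] bag, string name) {
    lookups++; // Counts how often the name search runs.
    foreach (var (n, v) in bag)
        if (n == name)
            return v;
    throw new ArgumentException($"No argument named {name}");
}

Func<(string name, double value)[], double> RawVar(string name) =>
    bag => Lookup(bag, name);
Func<(string name, double value)[], double> RawDivide(
    Func<(string name, double value)[], double> a,
    Func<(string name, double value)[], double> b) =>
    bag => a(bag) / b(bag);

var rawDivide = RawDivide(RawVar("x"), RawVar("y"));
for (int i = 0; i < 3; i++)
    rawDivide(new[] { ("x", 24.0), ("y", 5.0) });
Console.WriteLine(lookups); // 6: two name searches on each of the three calls
```

In the expression version, the equivalent name searches run only while the tree is being built, no matter how many times the compiled function is called.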

Furthermore, the raw C# approach requires runtime type casting if we want to support multiple types — this overhead doesn’t exist for the expression-based approach since typechecking is done only once when we run Ex.Lambda. For example, our expression code doesn't necessitate doubles, except for the Cos function, since it references the method double Math.Cos(double). As such, we could write a division function using ints:

var bagI = new ArgBag(new DelegateArg<int>("x"), new DelegateArg<int>("y"));
var divideI = bagI.CompileDelegate<Func<int, int, int>>(Divide(Var("x"), Var("y")));
Console.WriteLine(divideI(5, 2)); //=2 (integer division)

This kind of type generality is very difficult to deal with in raw C#. If you try to generally handle arguments of arbitrary types in raw C#, you’ll need to do a bunch of boxing and casting, and all of it will, again, need to occur when the function is executed.
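As a small illustration of where the type checking happens, the sketch below (not tied to the ArgBag helpers) tries to add an int to a double. As far as I can tell, the factory method rejects the mismatch while the tree is being built, so a bad combination never reaches the compiled function, and an explicit Convert node is one way to make the types line up.

```csharp
using System;
using System.Linq.Expressions;
using Ex = System.Linq.Expressions.Expression;

var i = Ex.Parameter(typeof(int), "i");
var d = Ex.Parameter(typeof(double), "d");

bool rejected = false;
try {
    Ex.Add(i, d); // Expression trees do not insert implicit int -> double conversions.
} catch (InvalidOperationException) {
    rejected = true; // Thrown while building the tree, before any compilation.
}
Console.WriteLine(rejected); // True

// An explicit Convert node makes the operand types match.
var sum = Ex.Lambda<Func<int, double, double>>(
    Ex.Add(Ex.Convert(i, typeof(double)), d), i, d).Compile();
Console.WriteLine(sum(2, 0.5)); // 2.5
```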

Local Variables

Let’s say you need to create a function at runtime that looks something like this:

double MyFunction(double x, double y) {
    double z = ExpensiveFunction(x, y);
    return z * z + z - Math.Pow(z, z);
}

How can we save a local value for z so that we don't end up recomputing it? It turns out that it's easy to extend ArgBag to handle local values.

First, let’s add a stack of local values in ArgBag, and let’s make it possible to query these local values in GetArgument:

public record ArgBag(params IDelegateArg[] Arguments) {
    public Stack<(string, Expression)> LocalValues { get; } = new(); //New
    public Expression GetArgument(string name) {
        foreach (var a in Arguments)
            if (a.Name == name)
                return a.Expr;
        foreach (var (n, ex) in LocalValues) //New
            if (n == name) //New
                return ex; //New
        throw new Exception($"No argument or local named {name}");
    }

    public D CompileDelegate<D>(BagMathFunc body) {
        var expr = body(this);
        return Ex.Lambda<D>(expr, Arguments.Select(a => a.Expr)).Compile();
    }
}

Then, we can simply add a function Let, which creates the local variable and makes it accessible via GetArgument:

private static BagMathFunc Let(string name, BagMathFunc value, BagMathFunc nested) => bag => {
    var localValue = value(bag);
    var localVar = Ex.Variable(localValue.Type, name);
    bag.LocalValues.Push((name, localVar));
    var expr = Ex.Block(new[] { localVar },
        Ex.Assign(localVar, localValue),
        nested(bag)
    );
    bag.LocalValues.Pop();
    return expr;
};

In expressions, when we want to create new local variables, we have to use Expression.Block. The first argument to Expression.Block is a list of local variables, and the rest of the arguments are a params Expression[] or IEnumerable<Expression> of all the code we want to run with the local variables in scope. This Let function works by creating a local variable called localVar and assigning it the expression produced by value(bag), which is held in localValue. The name-variable pair is stored in bag.LocalValues, so name can be accessed via GetArgument within nested. Then, while localVar is in scope, it runs nested on the argument bag. Since the result of nested is the last expression in the block, it is the "return value" of the block.

For example, let’s pretend that addition is an expensive operation. In order to represent the function

double MyFunction(double x, double y) {
    double d = x + y;
    return Math.Cos(d) / d;
}

we can use the code

private static BagMathFunc Add(BagMathFunc arg1, BagMathFunc arg2) {
    return t => Ex.Add(arg1(t), arg2(t));
}

var divideCos = bag.CompileDelegate<Func<double, double, double>>(
    Let("d", Add(Var("x"), Var("y")),
        Divide(Cos(Var("d")), Var("d"))));
Console.WriteLine(divideCos(6, 7));

If you set a breakpoint in CompileDelegate and look at the DebugView property of the constructed expression in the debugger, you'll see that it looks like this:

.Block(System.Double $d) {
    $d = $x + $y;
    .Call System.Math.Cos($d) / $d
}

Once again, the overhead of accessing GetArgument("d") occurs only once, when the expression is constructed in CompileDelegate. The resulting compiled function is just as efficient as writing the function in precompiled C#.
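Stripped of the ArgBag and Let helpers, the tree built above amounts to roughly the following. This is a standalone sketch for readers who want to see the raw Expression.Block call; the variable names are only for illustration.

```csharp
using System;
using System.Linq.Expressions;
using Ex = System.Linq.Expressions.Expression;

var x = Ex.Parameter(typeof(double), "x");
var y = Ex.Parameter(typeof(double), "y");
var d = Ex.Variable(typeof(double), "d");
var cosMethod = typeof(Math).GetMethod("Cos", new[] { typeof(double) })!;

var body = Ex.Block(
    new[] { d },                         // Locals declared by this block.
    Ex.Assign(d, Ex.Add(x, y)),          // d = x + y
    Ex.Divide(Ex.Call(cosMethod, d), d)  // Last expression is the block's value.
);
var divideCos = Ex.Lambda<Func<double, double, double>>(body, x, y).Compile();
Console.WriteLine(divideCos(6, 7)); // cos(13) / 13
```

Note that d is passed to Block as a local, while x and y are passed to Lambda as parameters; all three are ParameterExpression instances.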

Creating Ref Functions

Let’s say you want to create this function at runtime:

double MyFunction(ref double x) {
    return Math.Cos(x = Math.Cos(x));
}

This function assigns x = cos(x), and then calls cos again on this value. Using our approach of nested function calls, we might try to write:

var myFunction = Cos(Assign(Var("x"), Cos(Var("x"))));

If we’re using expressions, it’s very easy to write an implementation for Assign. We can just use the Expression.Assign function.

private static BagMathFunc Assign(BagMathFunc target, BagMathFunc val) => bag =>
    Ex.Assign(target(bag), val(bag));

This will work because Var("x"), as the target argument, returns the parameter expression by the name of x. However, try to imagine what would happen if we tried this not with expressions, but with raw C# function calls, where instead of BagMathFunc (type ArgBag -> Expression) we used type ArgBag -> double. Then, target would return a value instead of a parameter. We can't assign to a value! The following code fails to compile for any type T (ArgBag or otherwise) because the result of target is not assignable:

//This code fails to compile.
private static Func<T, double> Assign<T>(Func<T, double> target, Func<T, double> val) => t =>
    target(t) = val(t);

Let’s show the rest of the work for handling ref functions in expressions. If we simply compile a Func<double, double> without ref, then the parameter won't be modified, but the output of the function will be correct:

var bagRef = new ArgBag(new DelegateArg<double>("x"));
var cosAssign = bagRef.CompileDelegate<Func<double, double>>(Cos(Assign(Var("x"), Cos(Var("x")))));
var db = 1.57;
Console.WriteLine(cosAssign(db)); //cos(cos(1.57)) ~ 1
Console.WriteLine(db); // 1.57, unchanged

In order to make this a ref function, the first thing we need to do is declare a delegate type, since Func cannot express ref parameters:

//Func<ref double, double> cannot be written in C#
private delegate double RefMathFunc(ref double x);

Then, we need to use MakeByRefType when creating the ParameterExpression in DelegateArg. To handle this, we can add a flag to the DelegateArg constructor:

public DelegateArg(string name, bool byRef = false) {
    Name = name;
    Expr = Expression.Parameter(byRef ? typeof(T).MakeByRefType() : typeof(T), name);
}

With this, we can now generally write ref functions without much trouble.

var bagRef = new ArgBag(new DelegateArg<double>("x", byRef: true));
var cosAssign = bagRef.CompileDelegate<RefMathFunc>(Cos(Assign(Var("x"), Cos(Var("x")))));
var db = 1.57;
Console.WriteLine(cosAssign(ref db)); //cos(cos(1.57)) ~ 1
Console.WriteLine(db); // cos(1.57) ~ 0.001

If you set a breakpoint inside CompileDelegate and examine expr, you can see that expr.ToString() pretty-prints the expression as Cos((ref x = Cos(ref x))), which is exactly what we need.

Embedding Runtime Context

Let’s say we have some data structure containing some runtime information (for example, the weather) and we want our dynamically constructed functions to have access to this runtime information, similar to a lambda closure. You might expect that this would be difficult with expressions, but it’s actually quite easy.

Let’s demonstrate this by dynamically creating a function that reads and writes from an underlying array, i.e., dynamically generating code that does the following:

double[] myArray = new double[] { 0, 10, 20, 30 };

public double ReadWriteArray(int index, double newValue) {
    var currentVal = myArray[index];
    myArray[index] = newValue;
    return currentVal;
}

First, we need a helper function to index into arrays:

private static BagMathFunc Index(BagMathFunc array, BagMathFunc index) => bag =>
    Ex.ArrayAccess(array(bag), index(bag));

This helper function can be used both for read and write functionality.

Next, let’s extend our Let function so it can have multiple expressions nested within the scope of its local variables:

private static BagMathFunc Let(string name, BagMathFunc value, params BagMathFunc[] nested) => bag => {
    var localValue = value(bag);
    var localVar = Ex.Variable(localValue.Type, name);
    bag.LocalValues.Push((name, localVar));
    var expr = Ex.Block(new[] { localVar },
        nested.Select(n => n(bag))
            .Prepend(Ex.Assign(localVar, localValue))
    );
    bag.LocalValues.Pop();
    return expr;
};

Now, we can dynamically construct our function using Let, Index, and Assign.

var myArray = new double[] { 0, 10, 20, 30 };
var bagArr = new ArgBag(new DelegateArg<int>("index"), new DelegateArg<double>("value"));
BagMathFunc array_expr = _ => Ex.Constant(myArray);
var arrAssign = bagArr.CompileDelegate<Func<int, double, double>>(
    Let("currentVal", Index(array_expr, Var("index")),
        Assign(Index(array_expr, Var("index")), Var("value")),
        Var("currentVal")
    ));
myArray[1] = 100;
Console.WriteLine(arrAssign(1, 42)); //=100, the initial value of myArray[1]
Console.WriteLine(myArray[1]); //=42

There’s one oddity in this code: the line BagMathFunc array_expr = _ => Ex.Constant(myArray);. What this does is capture the value of myArray inside the expression, regardless of whatever is contained inside the ArgBag argument. Since an array is a reference type, this means that the expression will continue to refer to myArray as long as we do not completely reassign myArray. For example, if we ran the following code, the arrAssign function would not read the zeroes from the newly-assigned value of myArray.

myArray[1] = 100;
myArray = new double[] { 0, 0, 0, 0 };
Console.WriteLine(arrAssign(1, 42)); //=100, read from the original array
Console.WriteLine(myArray[1]); //=0, since the 42 was written to the original array, not the new one

While this functionality is satisfactory for most cases, there are times when it’s not sufficient. For example, what if myArray is the output of an API call to WeatherService? In that case, we don't necessarily want to assume that it's persistent. In such a case, we can derive the array by getting it from a property of something that is persistent. For example, let's say the weather service is persistent. In that case, we could write the following code:

private class WeatherService {
    private double[] weather = { 0, 10, 20, 30 };
    public double[] Weather => weather;
    public double[] GetWeather() => weather;

    public void ResetToZero() {
        weather = new double[] { 0, 0, 0, 0 };
    }
}

var weather = new WeatherService();
var bagArr = new ArgBag(new DelegateArg<int>("index"), new DelegateArg<double>("value"));
BagMathFunc array_expr = _ => Ex.PropertyOrField(Ex.Constant(weather), "Weather");
var arrAssign = bagArr.CompileDelegate<Func<int, double, double>>(
    Let("currentVal", Index(array_expr, Var("index")),
        Assign(Index(array_expr, Var("index")), Var("value")),
        Var("currentVal")
    ));
weather.Weather[1] = 100;
Console.WriteLine(arrAssign(1, 42)); //=100
Console.WriteLine(weather.Weather[1]); //=42
weather.ResetToZero();
Console.WriteLine(arrAssign(1, 43)); //=0
Console.WriteLine(weather.Weather[1]); //=43

The difference here is that array_expr now references the property WeatherService.Weather on the constant value weather. We could also use the method call Ex.Call(Ex.Constant(weather), typeof(WeatherService).GetMethod("GetWeather")).

But what if weather also isn't persistent? What if the active instance of WeatherService is a singleton that might change? In that case, we can use a static property to get a current value. Let's create a singleton implementation of WeatherService:

private class WeatherService {
    private static WeatherService singleton = new();
    public static WeatherService Singleton => singleton;
    public static WeatherService GetSingleton() => singleton;

    public static void ResetSingleton() {
        singleton = new();
    }

    private double[] weather = { 0, 10, 20, 30 };
    public double[] Weather => weather;
    public double[] GetWeather() => weather;

    public void ResetToZero() {
        weather = new double[] { 0, 0, 0, 0 };
    }
}

The expression pattern for static properties is Ex.Property(null, STATIC_TYPE, PROPERTY_NAME), and the expression pattern for static methods is Ex.Call(null, METHOD_DEFINITION, ARGUMENTS). Thus, we can now write the following code, requiring no persistent objects:

var bagArr = new ArgBag(new DelegateArg<int>("index"), new DelegateArg<double>("value"));
BagMathFunc array_expr = _ => Ex.PropertyOrField(Ex.Property(null, typeof(WeatherService), "Singleton"), "Weather");
//BagMathFunc array_expr = _ => Ex.PropertyOrField(Ex.Call(null, typeof(WeatherService).GetMethod("GetSingleton")), "Weather");
var arrAssign = bagArr.CompileDelegate<Func<int, double, double>>(
    Let("currentVal", Index(array_expr, Var("index")),
        Assign(Index(array_expr, Var("index")), Var("value")),
        Var("currentVal")
    ));
WeatherService.Singleton.Weather[1] = 100;
Console.WriteLine(arrAssign(1, 42)); //=100
Console.WriteLine(WeatherService.Singleton.Weather[1]); //=42
WeatherService.ResetSingleton();
Console.WriteLine(arrAssign(1, 43)); //=10
Console.WriteLine(WeatherService.Singleton.Weather[1]); //=43
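The two static-member patterns can also be tried in isolation against BCL types. The sketch below uses Environment.ProcessorCount and Math.Sqrt purely as convenient stand-ins for WeatherService.Singleton and GetSingleton; they are not part of the weather example.

```csharp
using System;
using System.Linq.Expressions;
using Ex = System.Linq.Expressions.Expression;

// Static property: pass null as the instance expression.
var procCount = Ex.Lambda<Func<int>>(
    Ex.Property(null, typeof(Environment), "ProcessorCount")).Compile();

// Static method: pass null as the instance, then the MethodInfo and arguments.
var v = Ex.Parameter(typeof(double), "v");
var sqrtMethod = typeof(Math).GetMethod("Sqrt", new[] { typeof(double) })!;
var sqrt = Ex.Lambda<Func<double, double>>(
    Ex.Call(null, sqrtMethod, v), v).Compile();

Console.WriteLine(procCount() == Environment.ProcessorCount); // True
Console.WriteLine(sqrt(16)); // 4
```

Because the property is read each time the compiled function runs, the function always sees the current value of the static member rather than a snapshot taken at compile time.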

Conclusion

You’ve probably already come into contact with expressions before — specifically, in your favorite ORM’s LINQ-to-SQL functionality. When you write code like from user in users where user.Age > 24 select user, or users.Where(user => user.Age > 24), the ORM library treats user => user.Age > 24 as an expression of type Expression<Func<User, bool>>, and then inspects the expression structure to convert it into an actual SQL query. (See this article) The type Expression<DelegateType> is also what gets returned from Ex.Lambda, so you could actually manually write something like:

var user = Expression.Parameter(typeof(User));
var filter = Expression.Lambda<Func<User, bool>>(
    Expression.GreaterThan(
        Expression.Property(user, "Age"),
        Expression.Constant(24)),
    user);

... users.Where(filter);
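To see that the compiler-generated and hand-built forms are interchangeable, here is a small runnable sketch. It swaps the User type for plain strings and uses an in-memory IQueryable instead of an ORM, so the names and data are assumptions made only for the demonstration.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// The compiler builds this tree from lambda syntax...
Expression<Func<string, bool>> fromLambda = s => s.Length > 3;

// ...and this is a hand-built tree with the same shape.
var p = Expression.Parameter(typeof(string), "s");
var manual = Expression.Lambda<Func<string, bool>>(
    Expression.GreaterThan(Expression.Property(p, "Length"), Expression.Constant(3)),
    p);

var words = new[] { "a", "tree", "of", "nodes" }.AsQueryable();
Console.WriteLine(string.Join(",", words.Where(manual)));            // tree,nodes
Console.WriteLine(fromLambda.Body.NodeType == manual.Body.NodeType); // True
```

A query provider receives either tree in the same form, which is what allows an ORM to inspect the structure and translate it rather than execute it.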

Hopefully this article has given you a view into how expressions can be used in C#, and why you might want to use them. It’s often said that code is data. However, while data is easy to manipulate and inspect, code can often be quite opaque to deal with. In C#, expression trees are a means by which you can manipulate, inspect, and even execute code at runtime.

I’ve also written an article on converting expressions into source code, which shows how you can manipulate and restructure expressions after they’re created using the ExpressionVisitor pattern. This restructuring is about as complex as expressions get, though there’s much more complexity to runtime code generation in C# — for example, runtime type construction with MSIL, which is something I may write about in the future.
