ScriptCS Code Rewriting - What for?

Tags: ScriptCS, roslyn

Recently I submitted an idea to the ScriptCS team to support code rewriting. Being the smart people that they are, they liked the idea. Here I am going to describe some of the thinking behind the idea of being able to rewrite your code in a transparent way.

The easiest thing would obviously be to write exactly the code that you want to execute. But code is just a means to an end, and you may want to modify your code for different situations. This is why people have come up with configuration files to define the variations that are required at runtime. One common variation point is a file location. If you are sharing your code across multiple machines, then it's likely that you will want to customize the location on each of them. You could fall back on the configuration file approach, or you could rewrite all paths to what you need right now. This rewriting can happen statically by changing the script, or it can happen dynamically by integrating a rewriter into the code parsing.

Code rewriting can also be a more advanced scenario where you are rewriting the code based on a syntactic analysis of the code. An AOP-like approach is to have the rewriter inject logging as the first line of all your methods. This is useful if you are trying to debug your code, but don't want to change the original code.

An even more advanced scenario would be to write custom syntax and have a rewriter produce proper C# code for you. An example would be to allow users to write scripts that use 'yieldmany' as shorthand for yielding each individual item of an IEnumerable<IEnumerable<T>>, something that otherwise takes two nested foreach loops. Especially in a scripting environment it may be easier to define aliases than to write out tedious syntax.

So how does it work? Rewriting will happen as part of the code parsing, performed by rewriters which implement the ICodeRewriter interface. This interface is essentially a function that takes a string and returns a string. The dependencies are kept simple so that the interface doesn't make any assumptions about how you analyze or rewrite your code. Roslyn's compiler as a service is an obvious dependency, but it may not be suitable in all scenarios, for example if you are running under Mono.

public interface ICodeRewriter
{
	string Rewrite(string code);
}
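
Because a rewriter is just a string-to-string function, several of them can be chained. The following is a minimal sketch of that idea, not how ScriptCS itself applies rewriters: DelegateRewriter, RewriterPipeline and PipelineDemo are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Linq;

public interface ICodeRewriter
{
	string Rewrite(string code);
}

// Hypothetical helper: wraps a delegate so a small rewriter needs no class of its own.
public class DelegateRewriter : ICodeRewriter
{
	private readonly Func<string, string> _rewrite;

	public DelegateRewriter(Func<string, string> rewrite)
	{
		_rewrite = rewrite;
	}

	public string Rewrite(string code)
	{
		return _rewrite(code);
	}
}

// Hypothetical composite: applies each rewriter in order,
// feeding the output of one into the next.
public class RewriterPipeline : ICodeRewriter
{
	private readonly ICodeRewriter[] _rewriters;

	public RewriterPipeline(params ICodeRewriter[] rewriters)
	{
		_rewriters = rewriters;
	}

	public string Rewrite(string code)
	{
		return _rewriters.Aggregate(code, (current, rewriter) => rewriter.Rewrite(current));
	}
}

public static class PipelineDemo
{
	public static string Run()
	{
		var pipeline = new RewriterPipeline(
			// First replace the placeholder with a concrete (made-up) directory...
			new DelegateRewriter(code => code.Replace("[[FileLocation]]", @"C:\data")),
			// ...then prepend a marker comment to the rewritten script.
			new DelegateRewriter(code => "// rewritten\n" + code));

		return pipeline.Rewrite("var dir = @\"[[FileLocation]]\";");
	}
}
```

Since the pipeline is itself an ICodeRewriter, it can be used anywhere a single rewriter is expected. The order matters: each rewriter sees the output of the previous one.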

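In the simple scenario the rewriter could perform a regex substitution on some placeholder, e.g. [[FileLocation]], and replace it with your desired path. Keep in mind that square brackets have a special meaning in a regex, so such a placeholder has to be escaped (for instance with Regex.Escape) before it is used as a pattern. The rewriter could look something like the code below, which is a generic substitution example.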

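using System.Text.RegularExpressions;

// Replaces every match of a regex pattern in the script with a fixed substitution.
public class RegexRewriter : ICodeRewriter
{
	private readonly Regex _pattern;
	private readonly string _substitution;

	public RegexRewriter(string pattern, string substitution)
	{
		// Compiled once up front, since the same pattern may be applied to many scripts.
		_pattern = new Regex(pattern, RegexOptions.Compiled | RegexOptions.Multiline);
		_substitution = substitution;
	}

	public string Rewrite(string code)
	{
		return _pattern.Replace(code, _substitution);
	}
}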
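
To make the file location scenario concrete, here is a small usage sketch. The path, the script line and the FileLocationDemo class are made up for illustration; the interface and RegexRewriter are repeated only so that the snippet stands on its own.

```csharp
using System.Text.RegularExpressions;

public interface ICodeRewriter
{
	string Rewrite(string code);
}

public class RegexRewriter : ICodeRewriter
{
	private readonly Regex _pattern;
	private readonly string _substitution;

	public RegexRewriter(string pattern, string substitution)
	{
		_pattern = new Regex(pattern, RegexOptions.Compiled | RegexOptions.Multiline);
		_substitution = substitution;
	}

	public string Rewrite(string code)
	{
		return _pattern.Replace(code, _substitution);
	}
}

public static class FileLocationDemo
{
	public static string Run()
	{
		// '[' starts a character class in a regex, so the placeholder must be escaped.
		var pattern = Regex.Escape("[[FileLocation]]");

		// '$' is the only special character in a .NET replacement string;
		// doubling it keeps a path that contains '$' intact.
		var path = @"C:\data\input.csv".Replace("$", "$$");

		var rewriter = new RegexRewriter(pattern, path);
		return rewriter.Rewrite("var lines = File.ReadAllLines(@\"[[FileLocation]]\");");
	}
}
```

Passing the raw placeholder as the pattern would not fail, but it would silently match the wrong text, which is why the escaping step is worth keeping.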

If we look at the other two scenarios, where you need to find matches based on a syntactic analysis, you can turn to Roslyn (or some other source code analysis tool such as NRefactory) to handle the parsing for you. Roslyn comes with a SyntaxRewriter base class (Roslyn.Compilers.CSharp.SyntaxRewriter) which you can use as a visitor for your syntax tree. The following simply injects an invocation of a Log method on a logger at the start of every method, which is your standard AOP scenario:

using System.Linq;
using Roslyn.Compilers;
using Roslyn.Compilers.CSharp;

public class LoggingInserter : SyntaxRewriter, ICodeRewriter
{
	public string Rewrite(string code)
	{
		var syntaxTree = SyntaxTree.ParseText(code, options: new ParseOptions(kind: SourceCodeKind.Script));
		return Visit(syntaxTree.GetRoot()).ToFullString();
	}
	
	public override SyntaxNode DefaultVisit(SyntaxNode node)
	{
		return node;
	}
	
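	public override SyntaxNode VisitMethodDeclaration(MethodDeclarationSyntax node)
	{
		// Abstract, extern and partial declarations have no body to instrument.
		if (node.Body == null)
		{
			return node;
		}

		// Build the statement: logger.Log("<method name>");
		var loggingInvocation =
			Syntax.ExpressionStatement(
				Syntax.InvocationExpression(
					Syntax.ParseExpression("logger.Log"),
					Syntax.ArgumentList(
						Syntax.SeparatedList(
							Syntax.Argument(
								Syntax.ParseExpression(
									"\"" + node.Identifier.ValueText + "\""))))));

		// Prepend the logging call to the original statements.
		return node.WithBody(Syntax.Block(loggingInvocation)
			.AddStatements(node.Body.Statements.ToArray()));
	}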
}
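
To show what the rewriter is aiming for, the snippet below is a hand-written version of the intended result, assuming the method name is passed to the logger as a string. ListLogger and Calculator are hypothetical; the rewriter only assumes that something named logger with a Log method is in scope.

```csharp
using System.Collections.Generic;

// Hypothetical logger that simply records every message it receives.
public class ListLogger
{
	public readonly List<string> Entries = new List<string>();

	public void Log(string message)
	{
		Entries.Add(message);
	}
}

public class Calculator
{
	private readonly ListLogger logger;

	public Calculator(ListLogger logger)
	{
		this.logger = logger;
	}

	// As written by the script author:
	//   public int Add(int a, int b) { return a + b; }
	// As intended after rewriting: the log call becomes the first statement.
	public int Add(int a, int b)
	{
		logger.Log("Add");
		return a + b;
	}
}
```

The original script stays untouched on disk; only the code handed to the compiler contains the extra statement.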

As mentioned above, rewriting can also be used for more interesting things like creating your own keywords and having a rewriter turn that into proper C# syntax. The following snippet shows how you can create a 'yieldmany' keyword which gets turned into nested foreach loops:

using System.Linq;
using Roslyn.Compilers;
using Roslyn.Compilers.CSharp;

public class YieldManyKeyword : SyntaxRewriter, ICodeRewriter
{
	public string Rewrite(string code)
	{
		var syntaxTree = SyntaxTree.ParseText(code, options: new ParseOptions(kind: SourceCodeKind.Script));
		return Visit(syntaxTree.GetRoot()).ToFullString();
	}
	
	public override SyntaxNode DefaultVisit(SyntaxNode node)
	{
		return node;
	}
	
	public override SyntaxNode VisitLocalDeclarationStatement(LocalDeclarationStatementSyntax node)
	{
		var keyword = node.Declaration.Type.ToFullString().Trim();
		if (keyword == "yieldmany")
		{
			var variable = node.Declaration.Variables.First();
			var syntax = Syntax.ForEachStatement(
				Syntax.ParseTypeName("var").WithTrailingTrivia(Syntax.Space),
				Syntax.ParseToken("firstLevel").WithTrailingTrivia(Syntax.Space),
				Syntax.IdentifierName(variable.Identifier).WithLeadingTrivia(Syntax.Space),
				Syntax.Block(
					Syntax.ForEachStatement(
						Syntax.ParseTypeName("var").WithTrailingTrivia(Syntax.Space),
						Syntax.ParseToken("secondLevel").WithTrailingTrivia(Syntax.Space),
						Syntax.IdentifierName("firstLevel").WithLeadingTrivia(Syntax.Space),
						Syntax.Block(
							Syntax.YieldStatement(
								SyntaxKind.YieldReturnStatement,
								Syntax.IdentifierName("secondLevel").WithLeadingTrivia(Syntax.Space))))));
	
			return syntax;
		}
		return base.VisitLocalDeclarationStatement(node);
	}
}
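
For reference, this is roughly the plain C# that a line such as 'yieldmany batches;' is meant to stand for. It is a hand-written equivalent rather than the literal output of the rewriter, and YieldManyDemo, Flatten and batches are names made up for the example.

```csharp
using System.Collections.Generic;

public static class YieldManyDemo
{
	// With the custom keyword, the body of this iterator would be the single line:
	//   yieldmany batches;
	public static IEnumerable<int> Flatten(IEnumerable<IEnumerable<int>> batches)
	{
		foreach (var firstLevel in batches)
		{
			foreach (var secondLevel in firstLevel)
			{
				yield return secondLevel;
			}
		}
	}
}
```

The rewriter can recognize the keyword because 'yieldmany batches;' happens to parse as a local variable declaration whose type is named yieldmany, which is why it hooks into VisitLocalDeclarationStatement.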

These are just some of the possibilities that rewriting opens up, letting you focus on solving your task instead of worrying about writing clean, correct syntax.
