How To Make A DSL, Hygienically
In this article, I’m going to show you how to implement a DSL like this:
xml version: '1.0', encoding: 'UTF-8'
weather at: @time.iso8601 do
  description @description
  temperature "#{@temp} C"
  wind do
    velocity "#{@wind_vel} kts"
    direction @wind_direction
  end
end
That produces XML like this:
<?xml version="1.0" encoding="UTF-8"?>
<weather at="2016-11-29T22:54:15+11:00">
  <description>Bright &amp; sunny.</description>
  <temperature>18.3 C</temperature>
  <wind>
    <velocity>14 kts</velocity>
    <direction>SSE</direction>
  </wind>
</weather>
Using code like this:
template = RXT::Template.from_file('weather.rxt')
puts template.render(
  time: Time.now,
  description: 'Bright & sunny.',
  temp: 18.3,
  wind_vel: 14,
  wind_direction: 'SSE',
)
I’ve given this Ruby XML templating language a spicy, exotic name: Ruby XML Template (RXT).
I know, XML doesn’t make for the most exciting DSL, and the usefulness of this particular DSL is questionable. But it will do nicely for the purpose of demonstration. It gives me the opportunity to showcase some of the more bendy, flexible parts of Ruby.
All the code is available on GitHub: tomdalling/ruby_xml_template
What Is A DSL?
Domain Specific Languages (DSLs) are custom-made computer languages, designed to be convenient for specific tasks. The DSL in this article is a shorthand for making XML documents, instead of something like this:
document = Ox::Document.new(version: '1.0', encoding: 'UTF-8')
weather = Ox::Element.new('weather')
document << weather
description = Ox::Element.new('description')
description << 'Bright & sunny.'
weather << description
# and so on...
In most other programming communities, DSLs are truly custom languages – with custom syntax, parsers, interpreters, compilers, and all that heavy-duty stuff.
In Ruby, DSLs are typically just Ruby code. These are much easier to implement, because you don’t need to write a parser or anything like that – you just use the parser built in to Ruby. This limits the DSL to the syntax of Ruby, but this is often an acceptable trade-off given that Ruby is such a flexible language.
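To make that concrete, here is a minimal sketch showing that a DSL line is nothing more than a method call with a hash argument and a block. The Recorder class is hypothetical, invented for this illustration; it is not part of RXT.

```ruby
# Hypothetical class, invented for illustration only (not part of RXT).
class Recorder
  attr_reader :calls

  def initialize
    @calls = []
  end

  # An ordinary method: an optional hash of attributes, plus an optional block.
  def weather(attrs = {}, &block)
    @calls << [:weather, attrs, !block.nil?]
  end
end

recorder = Recorder.new

# DSL style: no receiver, no parentheses, no braces around the hash.
recorder.instance_eval do
  weather at: 'noon' do
  end
end

# The same call written out longhand.
recorder.weather({ at: 'noon' }) { }

# Both calls were recorded with identical arguments.
p recorder.calls
```

Everything that makes the first form look like a custom language (the missing receiver, parentheses and braces) is ordinary Ruby syntax.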
When Should You Make A DSL?
In my opinion, DSLs are best used for making repetitive programming tasks more convenient. If you’re implementing new kinds of XML documents every day, and your XML gem is a bit cumbersome to use, a DSL could be helpful.
But beware – there are hidden costs. A bad DSL is worse than no DSL at all.
DSLs add an extra layer of complexity to your code. The combination of X + DSL will always be more complicated than X alone. You need to weigh this complexity against the convenience you are gaining.
DSLs are also notorious for being inflexible. They are designed to accomplish specific, not generic, tasks. You may adopt or create a DSL only to find that it doesn’t quite meet your requirements, and there is no simple workaround. This is why I recommend that DSLs should be an optional layer built on top of a flexible API. Solid design should come first, and convenience second.
So, while DSLs are certainly cool from a programming language perspective, do keep in mind that they are not all cupcakes and rainbows. Use them sparingly. If it feels more like a hassle than a convenience, consider ditching the DSL.
Step One: Write The DSL You Want
Start by writing the DSL code you wish you had, even though it can’t run yet. It’s supposed to be convenient, so make sure to include all the niceties that you’re looking forward to.
Here is the template code that I’m aiming for in this article:
xml version: '1.0', encoding: 'UTF-8'
weather at: @time.iso8601 do
  description @description
  temperature "#{@temp} C"
  wind do
    velocity "#{@wind_vel} kts"
    direction @wind_direction
  end
end
Remember that the code must be syntactically correct Ruby.
You can check the syntax with Ruby’s -c command line flag.
This will parse the code and display any syntax errors, but will not run the code.
$ ruby -c weather.rxt
Syntax OK
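If you would rather run the same check from inside a test suite, one possible approach is the Ripper library that ships with Ruby. This is a sketch under the assumption that a parse-only check is all you need; the valid_ruby? helper is a name I made up for this example.

```ruby
require 'ripper'

# Hypothetical helper: returns true when the source parses as Ruby.
# Ripper.sexp only parses; the code is never executed.
# It returns nil when the source contains a syntax error.
def valid_ruby?(source)
  !Ripper.sexp(source).nil?
end

good = "weather at: @time.iso8601 do\n  description @description\nend\n"
bad  = "wind do\n  velocity '14 kts'\n" # missing the closing `end`

puts valid_ruby?(good)
puts valid_ruby?(bad)
```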
This would be a good point to write a test. Take your dream DSL code, pass in some dummy input, and assert that the output is correct.
Note: The heredocs in this test should really be squiggly heredocs (<<~) to avoid indentation issues.
RSpec.describe RXT do
it 'is a template DSL for generating XML' do
template = RXT::Template.new(<<-'END_TEMPLATE')
xml version: '1.0', encoding: 'UTF-8'
weather at: @time.iso8601 do
description @description
temperature "#{@temp} C"
wind do
velocity "#{@wind_vel} kts"
direction @wind_direction
end
end
END_TEMPLATE
input = {
time: Time.new(2016, 11, 30, 1, 2, 3, '+11:00'),
description: 'Bright & sunny.',
temp: 18.3,
wind_vel: 14,
wind_direction: 'SSE',
}
expected_output = <<-END_OUTPUT
<?xml version="1.0" encoding="UTF-8"?>
<weather at="2016-11-30T01:02:03+11:00">
<description>Bright &amp; sunny.</description>
<temperature>18.3 C</temperature>
<wind>
<velocity>14 kts</velocity>
<direction>SSE</direction>
</wind>
</weather>
END_OUTPUT
expect(template.render(input)).to eq(expected_output)
end
end
Step Two: Make The Wrapper API
Running the test above fails, complaining that RXT::Template does not exist, so let’s start there.
This class is a wrapper for the DSL – it runs the DSL code, but does not implement the DSL itself.
Here is the whole class:
module RXT
  class Template
    def self.from_file(path)
      new(File.read(path), path)
    end

    def initialize(rxt_source, filename='(rxt)', lineno=1)
      # The wrapper adds one line before the template source, so evaluation
      # starts one line earlier to keep reported line numbers aligned.
      @block = CleanBinding.get.eval(<<-END_SOURCE, filename, lineno-1)
        Proc.new do
          #{rxt_source}
        end
      END_SOURCE
    end

    def render(instance_variables={})
      dsl = DSL.new
      # Expose each template parameter as an instance variable on the DSL object.
      instance_variables.each do |name, value|
        dsl.instance_variable_set("@#{name}", value)
      end
      dsl.instance_eval(&@block)
      root = dsl.__root
      Ox.dump(root, with_xml: root.attributes.any?)
    end

    module CleanBinding
      def self.get
        binding
      end
    end
  end
end
Precompilation
The initialize method is where things start to get interesting.
def initialize(rxt_source, filename='(rxt)', lineno=1)
  @block = CleanBinding.get.eval(<<-END_SOURCE, filename, lineno-1)
    Proc.new do
      #{rxt_source}
    end
  END_SOURCE
end
Here, all the template source code is being compiled into a Proc object using eval.
The eval method takes a string, then parses and runs it as Ruby code.
This is how we use the built-in Ruby parser, instead of writing our own.
Ruby DSLs that are loaded from strings or files run their code through eval, or a relative such as instance_eval, at some point.
In this particular case, we are precompiling the template.
The template is parsed and stored as a callable function object (a Proc).
We could instance_eval the template source code in render, but that would reparse the template source every time.
Using this precompilation approach, the template is parsed just once, and can then be reused every time render is called.
It’s a minor difference that gives slightly better performance.
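Stripped of the XML parts, the parse-once idea can be sketched like this. The greeting template and the render helper are stand-ins I made up for this example, not the real RXT code.

```ruby
# The template source, as it might be read from a file.
# Single quotes keep #{...} literal until the template actually runs.
source = '"Hello, #{@name}!"'

# Parse once: wrap the source in a Proc and eval the wrapper.
compiled = eval("Proc.new do\n#{source}\nend")

# Run many times: each render reuses the already-parsed Proc.
def render(compiled, name)
  context = Object.new
  context.instance_variable_set(:@name, name)
  context.instance_eval(&compiled)
end

puts render(compiled, 'Tom')
puts render(compiled, 'Ruby')
```

The string is handed to eval exactly once, no matter how many times render is called.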
File Names And Line Numbers
The eval method also takes filename and lineno arguments.
These are optional, but important.
When these are provided, and an exception is raised from within a template, you will get nice errors like this:
weather.rxt:6:in `block (2 levels) in get': oops (RuntimeError)
This error tells you which line (6) of which template file (weather.rxt) the error was raised from.
If you don’t provide a filename and line number, the same exception will give you an error like this:
rxt.rb:34:in `block (2 levels) in get': oops (RuntimeError)
Line 34 of rxt.rb is where eval was called, not where the exception was raised from.
Good luck hunting down that bug!
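Here is a small sketch of that behaviour, including the reason the template class passes lineno-1: the wrapping Proc.new do line takes up the first line of the evaluated string, so evaluation has to start one line early for the template's own line numbers to come out right. The two-line template is made up for this example.

```ruby
# A made-up two-line template; the error is raised on line 2.
source = "greeting = 'hi'\nraise 'oops'\n"

# The wrapper adds one line before the source, so evaluation starts at
# line 0. That makes the first line of the template line 1.
wrapped = "Proc.new do\n#{source}\nend"
block = eval(wrapped, binding, 'weather.rxt', 0)

location = begin
  block.call
rescue RuntimeError => e
  e.backtrace.first
end

# The location names the template file and the template's own line number.
puts location
```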
Clean Bindings
Code run through eval always executes within a binding.
If you don’t supply one, Kernel#eval uses the binding of the caller.
Bindings describe which local and instance variables are accessible, and what the value of self is.
It doesn’t really matter what self is in this case, for reasons we will soon see.
However, I am concerned about local variables leaking into the templates. I don’t want variables that are magically accessible to every template. That sounds like a recipe for nasty bugs.
Ideally, the binding for the DSL should have no local variables. That’s where this little module comes in:
module CleanBinding
  def self.get
    binding
  end
end
Calling CleanBinding.get will return a binding object containing no local variables, where self is equal to CleanBinding, which is essentially an empty module.
This stops variables from leaking into the templates, and limits the damage that templates could accidentally inflict via self.
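To see the difference, compare a binding taken inside a method that has local variables with one taken from CleanBinding. The leaky_eval method is a hypothetical counter-example written for this sketch.

```ruby
module CleanBinding
  def self.get
    binding
  end
end

# Hypothetical counter-example: evaluates code in a binding that
# happens to contain a local variable.
def leaky_eval(source)
  secret = 'visible to the evaluated code'
  binding.eval(source)
end

# The evaluated code can read the method's local variable.
puts leaky_eval('secret')

# The clean binding has no local variables to leak.
p CleanBinding.get.local_variables
p CleanBinding.get.eval('defined?(secret)')

# And self inside the clean binding is the module itself.
p CleanBinding.get.receiver
```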
Running The Template
The final step is to actually run the precompiled template code.
def render(instance_variables={})
  dsl = DSL.new
  instance_variables.each do |name, value|
    dsl.instance_variable_set("@#{name}", value)
  end
  dsl.instance_eval(&@block)
  root = dsl.__root
  Ox.dump(root, with_xml: root.attributes.any?)
end
We start by creating a clean, new DSL object. We will implement this class shortly.
The template parameters are passed into render as a hash argument.
Each one is assigned to an instance variable on the dsl object using instance_variable_set.
The precompiled template code is then run against the dsl object using instance_eval.
This runs the template code as if it were a method on the dsl object.
The block will have access to all the methods and instance variables available on the dsl object.
After the template code has been run, the results (dsl.__root) are pulled out of the dsl object and converted into an XML string.
You might be wondering why the __root method has two underscores, and why the dsl object isn’t responsible for creating the XML string itself.
Let’s look at both of those points as we implement the final class: RXT::DSL.
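Before moving on, here is the instance_eval step from render in isolation. It is a minimal sketch using a hypothetical Megaphone class in place of the real DSL object.

```ruby
# Hypothetical stand-in for the DSL object, invented for illustration.
class Megaphone
  def shout(text)
    text.upcase + '!'
  end
end

# A block written elsewhere, with no knowledge of Megaphone.
template = proc { shout @word }

megaphone = Megaphone.new
megaphone.instance_variable_set('@word', 'hello')

# instance_eval runs the block with self set to megaphone, so both the
# shout method and the @word instance variable resolve against it.
puts megaphone.instance_eval(&template)
```

The block never mentions the object it runs against, which is what lets template files stay free of receivers and setup code.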
Step Three: Make A DSL Class
The RXT::DSL class provides all of the DSL features accessible from the template files.
The template source code is run as if it were a method defined on this class.
Here is the entire class:
module RXT
  class DSL
    attr_reader :__root

    def initialize
      @__root = Ox::Document.new
      # The last element on the stack is the parent of any new element.
      @__element_stack = [@__root]
    end

    def xml(attrs={})
      attrs.each do |key, value|
        @__root[key] = value
      end
    end

    def respond_to_missing?(method_name, include_private=false)
      true # responds to all methods
    end

    def method_missing(method_name, *args)
      elem = __make_element(method_name, *args)
      @__element_stack.last << elem
      @__element_stack.push(elem)
      yield if block_given?
      @__element_stack.pop
    end

    # Accepts (name), (name, content), (name, attributes)
    # or (name, attributes, content).
    def __make_element(name, attributes_or_content={}, content=nil)
      if attributes_or_content.is_a?(Hash)
        attributes = attributes_or_content
      else
        attributes = {}
        content = attributes_or_content
      end
      Ox::Element.new(name.to_s).tap do |elem|
        attributes.each { |key, value| elem[key] = value }
        elem << content.to_s unless content.nil?
      end
    end
  end
end
Double Underscores
Instance variables are used as template parameters in this DSL, but that poses a problem.
The RXT::DSL implementation needs a couple of instance variables in order to work at all, but these instance variables should not be used by templates.
The double underscores indicate that these instance variables are private.
They are still accessible from every template, because there is no easy alternative in Ruby, but this naming convention is a widely-understood warning sign.
It says, “do not touch!”
The underscores also serve to avoid name collisions.
It’s entirely plausible for a template to use a parameter called @root.
If @__root had no underscores, it would be overwritten by the template parameter.
The same goes for the __make_element and __root methods.
What if the XML output is supposed to have a <make_element> or <root> element?
Without the underscores, trying to create these elements would result in a bug.
The general idea here is to avoid polluting the DSL namespace as much as possible.
This is why RXT::DSL is a separate class to RXT::Template.
It’s also why the XML string generation is in RXT::Template#render instead of RXT::DSL.
The DSL class should have as few private methods and instance variables as possible.
If methods can be pulled out and placed somewhere else, then do so.
The few that are left should have a naming convention that discourages their use, and avoids collisions.
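The collision is easy to reproduce. In this sketch, both context classes and the assign helper are hypothetical; assign copies parameters onto an object the same way render does.

```ruby
# Both classes are hypothetical, invented for illustration.
class UnprefixedContext
  attr_reader :root

  def initialize
    @root = [] # internal state with an ordinary name
  end
end

class PrefixedContext
  attr_reader :__root

  def initialize
    @__root = [] # internal state with a "do not touch" name
  end
end

# Assign template parameters the same way the render method does.
def assign(context, params)
  params.each { |name, value| context.instance_variable_set("@#{name}", value) }
  context
end

params = { root: '/var/www' }

# The internal @root has been replaced by the template parameter.
p assign(UnprefixedContext.new, params).root

# The internal @__root is untouched, and @root is free for the template.
p assign(PrefixedContext.new, params).__root
```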
Defining DSL Methods
The first method called from the example template is xml:
xml version: '1.0', encoding: 'UTF-8'
Here is the corresponding implementation on RXT::DSL:
def xml(attrs={})
  attrs.each do |key, value|
    @__root[key] = value
  end
end
The details aren’t important.
They are more about how the ox gem works than how to make a DSL.
The important thing to note is that all methods defined on this class will be callable from the template code.
method_missing
In this DSL, XML elements are made by calling a method of the same name.
For example, the method call name "Tom" results in the XML <name>Tom</name>.
But element names are arbitrary, with infinite possibilities.
It’s impossible to implement a method for each one.
Normally when you call a method that doesn’t exist, Ruby raises a NoMethodError.
But before it raises an exception, it gives the object an opportunity to handle the call within method_missing.
The strategy for this DSL is to let the templates call methods that don’t exist, and catch them all in method_missing.
Here is the implementation:
def method_missing(method_name, *args)
  elem = __make_element(method_name, *args)
  @__element_stack.last << elem
  @__element_stack.push(elem)
  yield if block_given?
  @__element_stack.pop
end
Every time a non-existent method is called, we create a new element using the attempted method’s name. The new element is appended to its parent element. Then we call the block, if one was given, to create the child elements.
Again, the XML-specific details aren’t super important.
The important part is the use of method_missing to make the DSL work.
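If you want to try the technique without installing ox, here is a dependency-free sketch. TinyBuilder is a hypothetical class that builds plain strings rather than Ox elements; it ignores attributes and does no escaping, so treat it as an illustration of method_missing only.

```ruby
# Hypothetical, dependency-free stand-in for RXT::DSL.
# It builds indented strings and performs no XML escaping.
class TinyBuilder
  def initialize
    @__lines = []
    @__depth = 0
  end

  def __result
    @__lines.join("\n")
  end

  def respond_to_missing?(name, include_private = false)
    true
  end

  # Every unknown method name becomes an element of the same name.
  def method_missing(name, content = nil)
    indent = '  ' * @__depth
    if block_given?
      @__lines << "#{indent}<#{name}>"
      @__depth += 1
      yield # the block creates the child elements
      @__depth -= 1
      @__lines << "#{indent}</#{name}>"
    else
      @__lines << "#{indent}<#{name}>#{content}</#{name}>"
    end
    nil
  end
end

builder = TinyBuilder.new
builder.instance_eval do
  wind do
    velocity '14 kts'
    direction 'SSE'
  end
end

puts builder.__result
```

One caveat with this sketch: a name that every object already responds to, such as p or format, calls the existing method and never reaches method_missing.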
respond_to_missing?
Any time you implement method_missing, you should implement respond_to_missing? too.
Using method_missing alone breaks a few methods inherited from Object.
This is how Ruby objects are supposed to behave:
x = "hello"
x.length # works
x.respond_to?(:length) #=> true
x.method(:length) #=> #<Method: String#length>
If you can successfully call a method on an object, then respond_to? should return true, and method should return a method object.
But this is how an RXT::DSL object behaves without respond_to_missing? implemented:
dsl = RXT::DSL.new
dsl.whatever # works
dsl.respond_to?(:whatever) #=> false
dsl.method(:whatever) #=> NameError: undefined method `whatever' for class `RXT::DSL'
You can fix this discrepancy by implementing respond_to_missing?, which indicates which method names will be handled by method_missing.
This specific DSL handles all method calls, regardless of their name, so respond_to_missing? just returns true.
def respond_to_missing?(method_name, include_private=false)
  true # responds to all methods
end
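The before-and-after can be checked without ox. This sketch uses two hypothetical classes that differ only in whether respond_to_missing? is defined.

```ruby
# Hypothetical classes, invented for illustration.
class WithoutHook
  def method_missing(name, *args)
    "handled #{name}"
  end
end

class WithHook < WithoutHook
  def respond_to_missing?(name, include_private = false)
    true
  end
end

without_hook = WithoutHook.new
with_hook = WithHook.new

# Both objects handle the call itself.
puts without_hook.whatever
puts with_hook.whatever

# Only the second one reports that it can.
p without_hook.respond_to?(:whatever)
p with_hook.respond_to?(:whatever)

# With the hook in place, method objects work too.
puts with_hook.method(:whatever).call
```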
Conclusion
That’s what a simple DSL implementation looks like, in under 100 lines of code.
DSLs are just Ruby code. They often don’t look like normal Ruby code, because they are implemented with the most dynamic, flexible parts of the language.
Use DSLs to make repetitive, cumbersome tasks more convenient. But beware, if used inappropriately, they can add unnecessary complexity to your codebase for little benefit.
When implementing a DSL, try to keep the DSL object clean.
- Keep unintended variables out of the binding.
- Pull functionality out into other objects, wherever possible.
- Use a naming convention to discourage the use of private methods and instance variables.
All the code is available on GitHub: tomdalling/ruby_xml_template
Got questions? Comments? Milk?
Shoot an email to [email protected] or hit me up on Twitter (@tom_dalling).