The Dimwit's Guide to Renderers and Formatters
...Or I will rend thee in the gobberwarts with my blurglecruncheon, See if I don't!
Vogon Jeltz / Douglas Adams, The Hitchhiker's Guide to the Galaxy
This document is written by a newcomer, hoping to offer some hints on "good" and "bad" ways to use Renderers and Formatters, and some architecture notes for those looking at how they work.
API Quick Reference
Here are some examples of ways you can request that some object 'obj' is rendered as a PDF document.
You may wish to skip over this and refer back to it later. All the calls are basically just wrappers around a single primary interface, MyRenderer.render (where MyRenderer is some subclass of Ruport::Renderer).
| Method | Calls | Notes |
| MyRenderer.render(:pdf, :data=>obj) | (internal) | Primary API |
| MyRenderer.render_pdf(obj) | MyRenderer.render(:pdf, :data=>obj) | |
| render_helper(MyRenderer, obj) | MyRenderer.render(format, :data=>obj, :io=>output, :layout=>false) | Only applicable within a Formatter method. Copies format from the running Formatter and sets :io to the current output object. |
| obj.as(:pdf) | MyRenderer.render(:pdf) { |r| r.data=xxx } | Only applicable if your object class has "include Ruport::Renderer::Hooks; renders_with MyRenderer". Renders obj.renderable_data if it exists, otherwise obj. |
| obj.save_as("foo.pdf") | obj.as(:pdf, :file=>"foo.pdf") | Filename extension mapped directly to format. The Report class in ruport-util extends this method so that .pdf opens the file in "wb" mode, and .txt is mapped to :text |
There are some extra methods available for the built-in data classes like Data::Table and Data::Row
| Method | Calls | Notes |
| table.to_pdf | table.as(:pdf) | |
| render_table(table) | render_helper(Renderer::Table, table) | Only applicable within a Formatter method. Copies format from the running Formatter and sets :io to the current output object. |
All these methods can take an options hash as an extra argument.
If you wish to send output directly to a stream (or any object which supports the '<<' method), pass it as an :io option, e.g.
MyRenderer.render(:pdf, :data => obj, :io => $stdout)
Otherwise the result is returned as a String.
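To make the :io convention concrete, here is a minimal plain-Ruby sketch. It is not Ruport code: render_rows is a hypothetical stand-in for a render call, and returning nil when :io is given is just a choice made for this sketch. The point is only that anything responding to '<<' (a StringIO, an Array, $stdout) can receive the output.

```ruby
require 'stringio'

# Hypothetical stand-in for a render call; not part of Ruport.
# Appends each row as a comma-separated line to :io if one is given,
# otherwise builds and returns a String.
def render_rows(rows, options = {})
  out = options[:io] || String.new
  rows.each { |row| out << row.join(",") << "\n" }
  options[:io] ? nil : out   # returning nil here is this sketch's own choice
end

rows = [[1, 2, 3], [4, 5, 6]]

as_string = render_rows(rows)        # no :io, so a String comes back

buffer = StringIO.new
render_rows(rows, :io => buffer)     # StringIO supports <<

lines = []
render_rows(rows, :io => lines)      # so does Array
```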
Renderers and Formatters
When you are using the API to render something, you only interact with one thing: the Renderer class. However, behind the scenes there are two parts: a Renderer and a Formatter.
The Formatter is the part which actually generates output. There are several base classes for different output types: Ruport::Formatter::HTML, Ruport::Formatter::PDF, Ruport::Formatter::Text, Ruport::Formatter::CSV. Each of these has methods relevant to generating output in that particular format.
The Renderer's job is to pick the correct Formatter to use when output is requested. For example, if you ask for :pdf output then it picks your PDF formatter class, which will most likely be a subclass of Ruport::Formatter::PDF. On the other hand, if you ask for :html output then it picks your HTML formatter class, and so on.
The Renderer also sets instance variables in the Formatter holding the data object (the thing you want to format) and an options object, which may contain auxiliary information to be used while creating the output or controlling its format (things like titles and page margins).
Finally, the Renderer knows which method, or methods, to call in the Formatter to start the ball rolling. These are known as 'stages' in the Renderer, but they are really just named entry points into the Formatter.
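The division of labour described above can be pictured with a toy model in plain Ruby. This is only a sketch of the idea, not how Ruport is implemented: ToyFormatter, ToyCSV, ToyText and ToyRowRenderer are hypothetical names, and the real library registers formats and stages through its own class-level declarations.

```ruby
# Toy model only; these classes are hypothetical and are not Ruport's code.
class ToyFormatter
  attr_accessor :data, :options

  def output
    @output ||= String.new
  end
end

class ToyCSV < ToyFormatter
  def build_row
    output << data.to_a.join(",") << "\n"
  end
end

class ToyText < ToyFormatter
  def build_row
    output << "| " << data.to_a.join(" | ") << " |\n"
  end
end

class ToyRowRenderer
  FORMATS = { :csv => ToyCSV, :text => ToyText }
  STAGES  = [:row]

  def self.render(format, options = {})
    formatter = FORMATS.fetch(format).new              # 1. pick the Formatter
    formatter.data    = options[:data]                 # 2. hand over data and options
    formatter.options = options
    STAGES.each { |s| formatter.send("build_#{s}") }   # 3. call the named entry points
    formatter.output
  end
end

ToyRowRenderer.render(:csv,  :data => 1..3)
ToyRowRenderer.render(:text, :data => 1..3)
```

The caller only ever touches ToyRowRenderer; which formatter class does the work depends entirely on the format symbol.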
Example 1: Ruport::Renderer::Row
This is one of the built-in renderers - it is intended to render a Ruport::Data::Record, which is one row of a table.
r = Ruport::Data::Record.new([1,2,3])
puts Ruport::Renderer::Row.render(:csv, :data=>r)
What's happening here? Ruport::Renderer::Row contains a formats hash mapping :csv => Ruport::Formatter::CSV (amongst others). So it creates an instance of that class, and sets its 'data' attribute to your provided record. Finally, it knows it must call the method called build_row in the Formatter object it just built, and that's what does the work.
You can find the glue which links this all together within the Ruport source. The CSV Formatter associates itself with several Renderers, and the Renderer declares a stage which is the method to call in the Formatter.
# in lib/ruport/formatter/csv.rb
class Formatter::CSV < Formatter
  renders :csv, :for => [ Renderer::Row, Renderer::Table,
                          Renderer::Group, Renderer::Grouping ]
end

# in lib/ruport/renderer/table.rb
class Renderer::Row < Renderer
  stage :row
end
As it happens, build_row just calls 'each' on the data object, so using duck typing you can pass in an Array or anything enumerable.
puts Ruport::Renderer::Row.render(:csv, :data=>1..5)
You can render the output differently just by passing in a different symbol:
puts Ruport::Renderer::Row.render(:text, :data=>1..5)
In this case the formatter is of a different class, but the Renderer still calls its build_row method. If you don't need the flexibility of choosing the format at runtime, you can use a shortcut API where the format is extracted from the method name.
puts Ruport::Renderer::Row.render_csv(1..5)
(Note: If you're playing with this, you may discover that Ruport::Formatter::PDF doesn't have a build_row method. It can only build whole tables, not individual rows)
Example 2: Ruport::Renderer::Table
t = Table(%w[a b c])
t << [1,2,3] << [4,5,6]
puts Ruport::Renderer::Table.render(:text, :data=>t)
This is slightly more complex internally because it has multiple stages. Here's how the renderer is defined:
# lib/ruport/renderer/table.rb
class Renderer::Table < Renderer
  options { |o| o.show_table_headers = true }
  prepare :table
  stage :table_header, :table_body, :table_footer
  finalize :table
end
This renderer will call several methods in turn on the Formatter:
- prepare_table
- build_table_header
- build_table_body
- build_table_footer
- finalize_table
(in that order). If the Formatter doesn't implement one of these methods, that call is silently skipped.
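The "silently skipped" behaviour can be sketched in plain Ruby with respond_to?. This is a guess at the general shape, not a copy of Ruport's internals; ToyTableFormatter and ToyTableRenderer are hypothetical.

```ruby
# Toy model of stage dispatch; hypothetical names, not Ruport internals.
class ToyTableFormatter
  attr_reader :calls

  def initialize
    @calls = []
  end

  def build_table_header
    @calls << :build_table_header
  end

  def build_table_body
    @calls << :build_table_body
  end
  # prepare_table, build_table_footer and finalize_table are not defined.
end

class ToyTableRenderer
  HOOKS = [:prepare_table, :build_table_header, :build_table_body,
           :build_table_footer, :finalize_table]

  def self.run(formatter)
    HOOKS.each do |name|
      # Call the hook only if the formatter implements it.
      formatter.send(name) if formatter.respond_to?(name)
    end
    formatter
  end
end

f = ToyTableRenderer.run(ToyTableFormatter.new)
f.calls   # only the two implemented hooks, in declaration order
```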
Custom formatters and renderers
Rendering a single primitive object like a Ruport::Data::Table is fine, but how do you output a composite object? The answer is to make a custom Formatter (or Formatters), and to link it to a custom Renderer.
Let's start with a simple data object which contains two tables.
class Accounts
  attr_accessor :balance_sheet, :profit_and_loss
end
A first cut at making a custom Renderer and HTML Formatter might look like this:
class AccountsRenderer < Ruport::Renderer
  stage :report

  class HTML < Ruport::Formatter::HTML
    renders :html, :for => AccountsRenderer

    def build_report
      output << "<h1>Accounts summary</h1>\n"
      Ruport::Renderer::Table.render(:html,
        :data=>data.balance_sheet, :io=>output)
      Ruport::Renderer::Table.render(:html,
        :data=>data.profit_and_loss, :io=>output)
    end
  end
end

a = Accounts.new
a.balance_sheet = Table(%w[item amount])
a.balance_sheet << ["Pencils", 123.40]
a.balance_sheet << ["Paperclips", 56.30]
a.balance_sheet << ["Capital", -179.70]
a.profit_and_loss = Table(%w[item amount])
a.profit_and_loss << ["Sales", 483.00]
a.profit_and_loss << ["Bad debt", -200.00]
AccountsRenderer.render(:html, :data=>a, :io=>$stdout)
The useful work is done in the build_report method, which creates the output we need. As part of that, it renders the two tables (a process which actually creates two new Renderer and Formatter objects behind the scenes, but you needn't let that concern you).
When rendering each table, we tell it to write to the same 'output' object that we are building for the whole report. In fact this is a common pattern and there's a helper method to make it less verbose:
def build_report
  output << "<h1>Accounts summary</h1>"
  render_table(data.balance_sheet)
  render_table(data.profit_and_loss)
end
The formatter knows that it is formatting :html at the moment, and so render_table passes this to the table renderer, saving you having to duplicate that information.
Adding PDF output
In principle you just add another formatter class. Your first attempt might look like this:
class PDF < Ruport::Formatter::PDF
  renders :pdf, :for => AccountsRenderer

  def build_report
    add_text "Accounts summary", :font_size => 18,
      :justification => :center
    pad(20) { render_table(data.balance_sheet) }
    pad(20) { render_table(data.profit_and_loss) }
  end
end
However this isn't quite right as it stands. The problem is that you need to write multiple items onto the same PDF document, but a new throwaway Renderer and Formatter is created when rendering each table. So you need to explicitly pass the existing PDF::Writer object as the shared 'canvas' onto which everything is to be written.
class PDF < Ruport::Formatter::PDF
  renders :pdf, :for => AccountsRenderer

  def build_report
    add_text "Accounts summary", :font_size => 18,
      :justification => :center
    pad(20) { render_table(data.balance_sheet, :formatter => pdf_writer) }
    pad(20) { render_table(data.profit_and_loss, :formatter => pdf_writer) }
  end
end
...
AccountsRenderer.render(:pdf, :data=>a, :file=>"foo.pdf")
FIXME: The documentation says you should also have to call method 'render_pdf' to finalize the output of the report. However the above example runs without it. Need to explain why this is, and under what circumstances render_pdf is actually required.
So under some circumstances you may need to write the following:
class AccountsRenderer < Ruport::Renderer
  stage :report
  finalize :report

  class PDF < Ruport::Formatter::PDF
    renders :pdf, :for => AccountsRenderer

    def build_report
      add_text "Accounts summary", :font_size => 18,
        :justification => :center
      render_table(data.balance_sheet, :formatter => pdf_writer)
      render_table(data.profit_and_loss, :formatter => pdf_writer)
    end

    def finalize_report
      render_pdf # note: different from the Renderer.render_pdf() call
    end
  end
end
Binding the data object to the Renderer
So far we have been using fairly cumbersome calls to initiate the rendering:
AccountsRenderer.render(:html, :data=>a, :io=>$stdout)
AccountsRenderer.render(:pdf, :data=>a, :file=>"foo.pdf")
If a data model only renders with one particular renderer, which is often the case, then you can add some simple glue to bind them together.
class Accounts
  include Ruport::Renderer::Hooks
  renders_with AccountsRenderer
end
After this, you have access to very simple methods for generating output from this object:
a.as(:html, :io=>$stdout)
a.save_as("foo.pdf")
Adding annotations
Maybe you want to display some extra information on the report, such as the date it was generated. You could add extra attributes to the data model itself, which may be appropriate in some cases. Otherwise, you can use the 'options' in the formatter to pass extra data. You can tag particular options as being mandatory, so the report will fail if they are not set.
class AccountsRenderer < Ruport::Renderer
  stage :report
  required_option :date

  class HTML < Ruport::Formatter::HTML
    renders :html, :for => AccountsRenderer

    def build_report
      output << "<h1>Accounts summary as at #{options.date}</h1>"
      render_table(data.balance_sheet)
      render_table(data.profit_and_loss)
    end
  end
end
...
puts a.as(:html) # exception, :date not set
puts a.as(:html, :date=>Time.now) # correct
You should consider carefully whether options are the right way to pass data, because you have to set them explicitly at rendering time, and they do not persist afterwards. In some cases it may be cleaner to keep everything as attributes of your data model object, rather than splitting information between 'data' and 'options'. However if you have an existing data model, and you want to annotate the report without modifying that model, then adding options may be the right way.
(In this balance sheet example, it probably makes more sense for the date at which the figures were extracted from the accounts to be an attribute of the balance sheet itself, rather than a separate piece of data which has to be carried around.)
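For a feel of what a mandatory option amounts to, here is a toy version in plain Ruby. It is an illustration only: ToyRenderer and ToyAccountsRenderer are hypothetical, and the exception class and message are my own choices, not what Ruport raises.

```ruby
# Toy sketch of a required-option check; hypothetical, not Ruport's code.
class ToyRenderer
  # Class-level declaration, recorded separately for each subclass.
  def self.required_option(*names)
    required_options.concat(names)
  end

  def self.required_options
    @required_options ||= []
  end

  def self.render(options = {})
    missing = required_options.reject { |name| options.key?(name) }
    unless missing.empty?
      raise ArgumentError, "missing required options: #{missing.join(', ')}"
    end
    "Accounts summary as at #{options[:date]}"
  end
end

class ToyAccountsRenderer < ToyRenderer
  required_option :date
end

ToyAccountsRenderer.render(:date => "2008-01-31")   # fine

begin
  ToyAccountsRenderer.render                        # :date not set
rescue ArgumentError => e
  e.message
end
```

Note that the check only happens at rendering time, which is exactly why the options do not persist with the data.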
Helpers
If you are generating output in multiple formats, the method which generates each format has to live in a separate subclass. However you can still share code between them, using a Helpers module.
The following example combines all the code we have seen so far, and includes a method format_time() which is shared between the HTML and PDF reports.
require 'rubygems'
require 'ruport'

class AccountsRenderer < Ruport::Renderer
  stage :report
  finalize :report
  required_option :date

  module Helpers
    def format_time(t)
      t.strftime "%Y-%m-%d %H:%M"
    end
  end

  class HTML < Ruport::Formatter::HTML
    renders :html, :for => AccountsRenderer

    def build_report
      output << "<h1>Accounts summary as at #{format_time(options.date)}</h1>"
      render_table(data.balance_sheet)
      render_table(data.profit_and_loss)
    end
  end

  class PDF < Ruport::Formatter::PDF
    renders :pdf, :for => AccountsRenderer

    def build_report
      add_text "Accounts summary as at #{format_time(options.date)}",
        :font_size => 18,
        :justification => :center
      pad(20) { render_table(data.balance_sheet, :formatter => pdf_writer) }
      pad(20) { render_table(data.profit_and_loss, :formatter => pdf_writer) }
    end

    def finalize_report
      render_pdf
    end
  end
end

class Accounts
  include Ruport::Renderer::Hooks
  renders_with AccountsRenderer
  attr_accessor :balance_sheet, :profit_and_loss
end

a = Accounts.new
a.balance_sheet = Table(%w[item amount])
a.balance_sheet << ["Pencils", 123.40]
a.balance_sheet << ["Paperclips", 56.30]
a.balance_sheet << ["Capital", -179.70]
a.profit_and_loss = Table(%w[item amount])
a.profit_and_loss << ["Sales", 483.00]
a.profit_and_loss << ["Bad debt", -200.00]
a.as(:html, :io=>$stdout, :date=>Time.now)
a.save_as("foo.pdf", :date=>Time.now)
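In plain Ruby terms, sharing format_time between the two formatters can be pictured as an ordinary mixin. The sketch below includes the module explicitly, whereas the example above relies on Ruport to make Helpers available to the formatters. ToyHTML and ToyText are hypothetical classes used only for illustration.

```ruby
# Plain-Ruby picture of sharing a helper between two output classes.
# ToyHTML and ToyText are hypothetical; the module is included by hand here.
module Helpers
  def format_time(t)
    t.strftime "%Y-%m-%d %H:%M"
  end
end

class ToyHTML
  include Helpers

  def heading(date)
    "<h1>Accounts summary as at #{format_time(date)}</h1>"
  end
end

class ToyText
  include Helpers

  def heading(date)
    "Accounts summary as at #{format_time(date)}"
  end
end

date = Time.utc(2008, 1, 31, 9, 5)
ToyHTML.new.heading(date)
ToyText.new.heading(date)
```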
Assorted hints and tips
Hint 1: Interact with Renderer class methods; forget Renderer instances
Renderer and Formatter instances are created quietly behind the scenes, are run once and discarded. Use the Renderer class methods to start this process off, and don't try to work with instances of these classes.
Hopefully the reasons for this will become clear.
Hint 2: Don't misuse 'setup'
The following code is in the Ruport book:
class CallInRenderer < Ruport::Renderer
  stage :call_in_sheet

  def setup
    self.data =
      CallInAggregator.new(:start => options[:start_date]).to_grouping
  end
end
This might tempt you to try to use a Renderer instance as a data store for the content of a report. However if you pursue this too far you may end up in a dead end.
In my case, I put some long-running data-gathering code in setup, and then thought I would like to render this same data several times (say once as HTML, and then again as CSV). My thought process went:
"Obviously, the setup method is storing some data in this Renderer instance. So I can just tell the same Renderer instance to render the data a second time in a different format, or I can Marshal.dump it and use it again later. Hmm, there doesn't seem to be an API for re-running an existing Renderer instance. OK I'll add that, should take about 5 minutes..."
Unfortunately I was wrong, and it turns out I fell at the first hurdle.
Despite appearances, "self.data = ..." does NOT store any data in the Renderer instance. What it actually does is "self.formatter.data = ...". The Renderer instance delegates storage of both data and options to a Formatter instance.
When you call a Renderer class method, it creates a Renderer and a Formatter as a symbiotic pair, then calls your setup method and then generates the report. A Renderer instance simply cannot do work prior to a Formatter being chosen and created, nor can it be usefully reattached to a new Formatter.
A Formatter instance could be created by itself, but it's not especially useful. For example, it can't render a table without guidance from a Renderer::Table.
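The delegation is easier to see in a toy model. The sketch below uses Ruby's standard Forwardable module to forward data and options to the formatter. Whether Ruport uses Forwardable internally is not something I have checked, and ToyFormatter and ToyRenderer are hypothetical names.

```ruby
require 'forwardable'

# Toy model of the delegation described above; hypothetical classes.
class ToyFormatter
  attr_accessor :data, :options
end

class ToyRenderer
  extend Forwardable
  # data and options live in the formatter; the renderer only forwards.
  def_delegators :@formatter, :data, :data=, :options, :options=

  def initialize(formatter)
    @formatter = formatter
  end
end

formatter = ToyFormatter.new
renderer  = ToyRenderer.new(formatter)
renderer.data = [1, 2, 3]

formatter.data                                # the Formatter holds the data
renderer.instance_variable_defined?(:@data)   # false: nothing stored in the Renderer
```

Once the formatter is thrown away, so is the data, which is why a Renderer instance makes a poor data store.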
So how should you handle data which needs to be re-used in this way? Well, you have to keep it in a separate model class. The example in the book could be changed along these lines:
ca = CallInAggregator.new(:start => Time.parse(params[:period]))
res1 = CallInRenderer.render_html(:data => ca.to_grouping)
res2 = CallInRenderer.render_pdf(:data => ca.to_grouping)
In other words, you create a CallInAggregator which gathers some data. Then you create a temporary CallInRenderer to draw it. The original code was the other way round: you told the CallInRenderer class to render whatever data it felt like at that instant, and it created a temporary CallInAggregator to fetch it.
To make this useful, you need to modify the CallInAggregator so that it does its work in the initialize method and saves the result in an instance variable, with to_grouping returning the saved object. Otherwise it will end up doing the work twice.
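A minimal sketch of that change follows. The book's CallInAggregator is not reproduced here, so ToyAggregator, gather_data and the runs counter are hypothetical stand-ins; the only point is that the slow work happens once and to_grouping hands back the saved result.

```ruby
# Hypothetical aggregator; gather_data stands in for slow data collection.
class ToyAggregator
  attr_reader :start, :runs

  def initialize(options = {})
    @start = options[:start]
    @runs = 0
    @grouping = gather_data      # do the expensive work once, up front
  end

  def to_grouping
    @grouping                    # return the saved result, no rework
  end

  private

  def gather_data
    @runs += 1                   # counts how often the work is done
    { @start => [[1, 2, 3]] }    # placeholder for real grouped data
  end
end

agg = ToyAggregator.new(:start => "2008-01")
first  = agg.to_grouping   # e.g. for the HTML report
second = agg.to_grouping   # e.g. for the PDF report, same object
agg.runs                   # the work was done only once
```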
Unfortunately this code is still incomplete, because the formatter also makes use of options.start_date for the report heading, so you need to pass this explicitly:
period = Time.parse(params[:period])
ca = CallInAggregator.new(:start => period)
res1 = CallInRenderer.render_html(:data => ca.to_grouping, :start_date => period)
res2 = CallInRenderer.render_pdf(:data => ca.to_grouping, :start_date => period)
Arguably, the book example has split the report data model across two places: the start_date is in 'options', and the timesheet grouping is in 'data'.
Since the CallInAggregator already has an @start instance variable, you can expose this via an accessor. Then the PDF and HTML formatters can be changed to use data.to_grouping and data.start, instead of data and options.start_date. In my opinion, that would make the design less brittle.
Now you just pass the whole CallInAggregator object to be rendered:
ca = CallInAggregator.new(:start => Time.parse(params[:period]))
res1 = CallInRenderer.render_html(:data => ca)
res2 = CallInRenderer.render_pdf(:data => ca)
Once you've decided to do that, you can further simplify the API by hooking the CallInAggregator directly to the CallInRenderer.
class CallInAggregator
  include Ruport::Renderer::Hooks
  renders_with CallInRenderer
end

ca = CallInAggregator.new(:start => Time.parse(params[:period]))
res1 = ca.as(:html)
res2 = ca.as(:pdf)
Perhaps if the book had used this approach in the first place, it would have been a better foundation for more complex reports, where a clear separation is required between the data gathering phase and the outputting phase.
Hint 3: don't misuse 'required_option'
The CallInRenderer example also uses another Renderer feature:
required_option :start_date
This further misled me to believe that Renderers were a good place for doing data gathering, because there was a nice convenient feature for validating that all the necessary parameters were present. Again, unless you are doing small disposable amounts of work, the data gathering is better done in a separate object, keeping the Renderer for just, well, rendering.
This means you'll have to do your own checking for required arguments, and you'll also lose the Renderer's method access for options, e.g. options.foo rather than options[:foo].
This gives the following:
class CallInAggregator
  def initialize(options={})
    @start = options[:start]
    raise "Missing :start option" unless @start
    # ... now do the work
  end
end
There is now a clear distinction between options used to build the report data model (passed to CallInAggregator.new), and options used only for controlling the output rendering.
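If you miss the options.foo style of access, one possible workaround is to wrap the hash in a small Struct. This is my own suggestion rather than anything Ruport provides; AggregatorOptions, from_hash and the :finish member are hypothetical.

```ruby
# Hypothetical options holder giving method-style access outside a Renderer.
AggregatorOptions = Struct.new(:start, :finish) do
  # Build from a plain hash and do the required-argument check by hand.
  def self.from_hash(hash)
    opts = new(hash[:start], hash[:finish])
    raise ArgumentError, "Missing :start option" if opts.start.nil?
    opts
  end
end

opts = AggregatorOptions.from_hash(:start => "2008-01-01")
opts.start      # method-style access
opts[:start]    # hash-style access also works on a Struct
```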
Hint 4: Formatters call Renderers
It's pretty obvious that Renderers call Formatters to do work; it's less obvious that Formatters also call Renderers to do work. If you're in the middle of a custom Formatter, and then decide to output a table, this will ultimately call Renderer::Table to do the work, which in turn will create a new Formatter::XXX. In ruport-1.4, for every row of the table it then called a Renderer::Row which in turn created another Formatter::XXX.
MyRenderer -> MyFormatter -> Renderer::Table -> Formatter::XXX -> Renderer::Row -> Formatter::XXX
This has been optimised out in trunk, and table rows are now output directly by the formatter:
MyRenderer -> MyFormatter -> Renderer::Table -> Formatter::XXX
But how does it know which class Formatter::XXX to create? Well, MyFormatter remembers what it is currently formatting, say :html, and passes this to Renderer::Table. So Renderer::Table picks the formatter class it knows about for :html, which is Formatter::HTML, and creates a new instance of it.
However, this does mean that if MyFormatter is a subclass of Formatter::HTML, and it overrides the build_row method, this won't be used when rendering a table because Renderer::Table will create a standard Formatter::HTML instead.
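The surprise is easier to see in a toy model. The sketch below is plain Ruby with hypothetical classes (ToyHTML, ToyTableRenderer, MyToyFormatter) and a much simplified table renderer. Because the table renderer looks the formatter class up by format symbol, it never learns about the subclass that called it.

```ruby
# Toy model; all class names are hypothetical and heavily simplified.
class ToyHTML
  attr_accessor :data

  def output
    @output ||= String.new
  end

  def build_row(row)
    output << "<tr>" << row.map { |c| "<td>#{c}</td>" }.join << "</tr>\n"
  end

  def build_table
    data.each { |row| build_row(row) }
  end
end

class ToyTableRenderer
  FORMATS = { :html => ToyHTML }

  def self.render(format, data)
    formatter = FORMATS.fetch(format).new   # always the registered class
    formatter.data = data
    formatter.build_table
    formatter.output
  end
end

class MyToyFormatter < ToyHTML
  def build_row(row)                        # override, never reached below
    output << "<tr class='fancy'>...</tr>\n"
  end

  def build_report
    # Only the format symbol travels; the subclass itself does not.
    output << ToyTableRenderer.render(:html, data)
  end
end

mine = MyToyFormatter.new
mine.data = [[1, 2]]
mine.build_report
mine.output   # built by ToyHTML#build_row, not by the override
```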
Hint 5: Be aware of block initializers
"Block initializers" are where you pass a block at object creation time to perform additional initialization on the new object. Example:
table = Table(%w[a b c]) do |t|
  t << [1,2,3]
  t << [4,5,6]
end
which in this case is effectively the same as the more imperative style
table = Table(%w[a b c])
table << [1,2,3]
table << [4,5,6]
You should take note of the block initialization style, because it plays an important role in the operation of Renderers. When you write
myobject.as(:text) do |r|
...
end
the renderer/formatter pair is created, then your initialisation block is called, then the renderer/formatter pair is run (including the 'setup' method).
This style pervades deeply, and blocks are created down through multiple levels, e.g.
Renderer::Hooks::Classmethods.as() or Formatter#render_helper() --> Renderer.render() --> Renderer.build() --> Renderer#setup()
Think about the blocks as simply initialization code which will be inserted at the "right" point in the object's construction.
Unlike iterators, generally they're only called once. (The now-obsolete "renderer_data_by_row" method from ruport-1.4 invoked Renderer::Row multiple times, and therefore this initialization block was invoked once for each row)
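The create, yield, run sequence can be sketched in a few lines of plain Ruby. ToyRenderer and its log are hypothetical and only record the order in which things happen; the real call chain has more levels, as shown above.

```ruby
# Toy model of the create / yield / run sequence; hypothetical names.
class ToyRenderer
  attr_accessor :data
  attr_reader :log

  def initialize
    @log = []
  end

  def self.render(options = {})
    renderer = new                     # 1. create the object
    renderer.log << :created
    renderer.data = options[:data]
    yield renderer if block_given?     # 2. run the caller's block, once
    renderer.setup                     # 3. then setup and the real work
    renderer.run
    renderer
  end

  def setup
    @log << :setup
  end

  def run
    @log << "rendered #{data.inspect}"
  end
end

r = ToyRenderer.render do |rend|
  rend.log << :block
  rend.data = [1, 2, 3]
end
r.log   # :created, :block, :setup, then the rendered entry
```

Because the block runs before setup, anything it assigns is already in place when setup and the stages run.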
Hint 6: Strings versus Streams
If you write your formatters like this:
output << data.table1.to_csv
output << data.table1.as(:csv)   # same thing, spelled differently
then table1 will be rendered into a string, then this string appended to the output. For a table with tens of thousands of rows this may lead to a pause before you see the rows being output, and higher memory usage.
You can stream your output like this:
data.table1.as(:csv, :io=>output)
or within a formatter only, using
render_table(data.table1)
which merges :io=>output into the table rendering options for you.
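The difference between the two styles can be shown without Ruport at all. In this sketch table_to_csv_string and stream_table_as_csv are hypothetical helpers, and a three-row array stands in for a large table.

```ruby
require 'stringio'

# Hypothetical table-to-CSV writers, for comparing the two styles.

# Builds the complete result in memory before anything is written.
def table_to_csv_string(rows)
  rows.map { |row| row.join(",") + "\n" }.join
end

# Writes each row to io as soon as it has been formatted.
def stream_table_as_csv(rows, io)
  rows.each { |row| io << row.join(",") << "\n" }
  io
end

rows = (1..3).map { |i| [i, i * i] }     # stand-in for a large table

buffered = StringIO.new
buffered << table_to_csv_string(rows)    # one big append at the end

streamed = StringIO.new
stream_table_as_csv(rows, streamed)      # one small append per row
```

Both end up with the same bytes; what differs is when the output becomes visible and how much has to be held in memory along the way.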
