How to Transform Data in Plotrb

Acknowledgment: This tutorial is based on Vega's documentation.

Data transforms are one of the most important components of Plotrb. They perform operations on a data set prior to visualization.

A transform is specified by its type. In this article I will describe all transform types allowed by Vega.

First of all, to create a new Transform instance, you can either

t = Plotrb::Transform.new(:filter) # filter is one of the allowed types

Or directly call

t = Plotrb::Transform.filter

Different types have different attributes that further specify the transform.

Data Manipulation Transforms


array

Maps a data object to an array of selected values referenced by fields.

t = Plotrb::Transform.array.fields('pop.weight', 'pop.height')

You can also use take to specify the field references.

t = Plotrb::Transform.array do
  take 'pop.weight', 'pop.height'
end


copy

Copies values into a top-level data object.

You can also use take in place of fields if it reads more naturally.

t = Plotrb::Transform.copy do
  from 'population'
  take 'weight', 'height'
  as 'w', 'h'
end


cross

Computes the cross-product of two data sets.

t = Plotrb::Transform.cross.with('another_data').include_diagonal

If you don't supply the secondary data set, the cross-product will be against the data set itself.

For example, if data is [1, 2, 3], cross.include_diagonal will produce

[
  {"a":1, "b":1},
  {"a":1, "b":2},
  {"a":1, "b":3},
  {"a":2, "b":1},
  {"a":2, "b":2},
  {"a":2, "b":3},
  {"a":3, "b":1},
  {"a":3, "b":2},
  {"a":3, "b":3}
]
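To make the cross-product concrete, here is a plain-Ruby sketch of the same operation. It does not use Plotrb: cross_product is a hypothetical helper, and dropping same-index pairs when the diagonal is excluded reflects my reading of Vega's behavior rather than its actual implementation.

```ruby
# Hypothetical helper (not part of Plotrb): pairs every item of `left`
# with every item of `right`. When include_diagonal is false, pairs whose
# items share the same index are dropped (assumed Vega behavior).
def cross_product(left, right = left, include_diagonal: true)
  pairs = []
  left.each_with_index do |a, i|
    right.each_with_index do |b, j|
      next if !include_diagonal && i == j
      pairs << { "a" => a, "b" => b }
    end
  end
  pairs
end

data = [1, 2, 3]
with_diagonal = cross_product(data)
without_diagonal = cross_product(data, include_diagonal: false)

with_diagonal.each { |pair| puts pair.inspect }
puts "without diagonal: #{without_diagonal.length} pairs"
```

Leaving out the second argument crosses the data set with itself, which mirrors the default described above.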


facet

Organizes data into groups.

This is similar to the GROUP BY operation in SQL, so you may use group_by in place of keys.

t = Plotrb::Transform.facet.group_by('category')

For more details on how the output is organized, please refer to Vega's wiki page.
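As a rough illustration of the grouping, the following plain-Ruby sketch groups rows by a key field. The rows and the "key"/"values" output shape are my own simplification for this example; the actual Vega output carries more structure.

```ruby
# Illustrative rows; the field names are made up for this sketch.
rows = [
  { "category" => "a", "value" => 1 },
  { "category" => "b", "value" => 2 },
  { "category" => "a", "value" => 3 }
]

# Group rows by the key field, then wrap each group in a small hash.
# This simplified shape is an assumption, not Vega's exact output.
facets = rows.group_by { |row| row["category"] }.map do |key, values|
  { "key" => key, "values" => values }
end

facets.each { |facet| puts facet.inspect }
```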


filter

Filters the data set, removing unwanted items according to a test expression.

t = Plotrb::Transform.filter.test(' > 10')


flatten

Converts a faceted or hierarchical data set back into a flat, tabular structure.

t = Plotrb::Transform.flatten


fold

Collapses one or more data properties referenced by fields into two: a key (containing the original data property name) and a value (containing the data value).

You can also use into in place of fields.

t = Plotrb::Transform.fold.into('data.gold', 'data.silver')

For the following input

[
  {"data": {"country": "USA", "gold":10, "silver":20}},
  {"data": {"country": "Canada", "gold":7, "silver":26}}
]

The output will be

[
  {"index": 0, "key":"data.gold", "value":10, "data": {"country": "USA"}},
  {"index": 1, "key":"data.silver", "value":20, "data": {"country": "USA"}},
  {"index": 2, "key":"data.gold", "value":7, "data": {"country": "Canada"}},
  {"index": 3, "key":"data.silver", "value":26, "data": {"country": "Canada"}}
]

This can be used to transform matrix data into a standardized format.
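If it helps to see the folding step spelled out, here is a plain-Ruby sketch that folds the gold and silver properties of the input above. The helpers dig_path, without_path and fold are hypothetical names for this example and are not Plotrb or Vega functions.

```ruby
# Hypothetical helper: reads a dotted path such as "data.silver".
def dig_path(record, path)
  record.dig(*path.split("."))
end

# Hypothetical helper: deep-copies the record and removes the dotted path.
def without_path(record, path)
  copy = Marshal.load(Marshal.dump(record))
  *parents, leaf = path.split(".")
  target = parents.empty? ? copy : copy.dig(*parents)
  target.delete(leaf)
  copy
end

# Emits one row per (record, field) pair, carrying a running index,
# the field name as the key, and the field's value.
def fold(records, fields)
  index = -1
  records.flat_map do |record|
    rest = fields.reduce(record) { |acc, field| without_path(acc, field) }
    fields.map do |field|
      index += 1
      { "index" => index, "key" => field, "value" => dig_path(record, field) }.merge(rest)
    end
  end
end

input = [
  { "data" => { "country" => "USA", "gold" => 10, "silver" => 20 } },
  { "data" => { "country" => "Canada", "gold" => 7, "silver" => 26 } }
]
folded = fold(input, ["data.gold", "data.silver"])
folded.each { |row| puts row.inspect }
```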


formula

Applies a formula to the data set and stores the result in a new field.

t = Plotrb::Transform.formula.apply('abs( *').into('xy')


slice

Generates a subset of the data array.

Assume the data is [5, 6, 7, 8, 9, 10, 11].

t = Plotrb::Transform.slice
t.by(3)      # => [8, 9, 10, 11]
t.by(-2)     # => [10, 11]
t.by([2, 5]) # => [7, 8, 9]
# ...'min_value')
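The three results shown above line up with ordinary array slicing: everything from index 3, the last two items, and the half-open index range from 2 up to 5. Here is the same thing in plain Ruby, as a sketch of the semantics rather than of how Plotrb or Vega computes it.

```ruby
data = [5, 6, 7, 8, 9, 10, 11]

from_index_three = data[3..-1]  # everything from index 3 onward
last_two         = data[-2..-1] # the final two items
index_two_to_five = data[2...5] # indices 2, 3 and 4 (end excluded)

puts from_index_three.inspect
puts last_two.inspect
puts index_two_to_five.inspect
```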


sort

Sorts the values using the given fields as criteria. To specify descending order, either call #reverse or prefix a field with a "-" character.

t = Plotrb::Transform.sort.by('foo')
t = Plotrb::Transform.sort.by('bar').reverse
t = Plotrb::Transform.sort.by('-baz')
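The "-" prefix convention is easy to picture in plain Ruby. The sketch below uses a hypothetical helper, sort_by_field, with made-up rows; it only illustrates the ordering rule and is not how Plotrb or Vega sorts internally.

```ruby
# Hypothetical helper: sorts rows by a field name, treating a leading "-"
# as a request for descending order.
def sort_by_field(rows, field)
  descending = field.start_with?("-")
  name = descending ? field[1..-1] : field
  sorted = rows.sort_by { |row| row[name] }
  descending ? sorted.reverse : sorted
end

rows = [{ "foo" => 3 }, { "foo" => 1 }, { "foo" => 2 }]
ascending = sort_by_field(rows, "foo")
descending = sort_by_field(rows, "-foo")

puts ascending.inspect
puts descending.inspect
```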


stats

Computes statistics for the data set: count, minimum, maximum, sum, mean, sample variance, and sample standard deviation.

t = Plotrb::Transform.stats do
  from ''
end
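For reference, the listed statistics can be computed in plain Ruby as follows. This is a sketch with a hypothetical helper, summary_stats, and hypothetical output key names; the sample variance divides by n - 1, which is what "sample" implies here.

```ruby
# Hypothetical helper: computes the statistics listed above for an array
# of numbers. Variance and standard deviation use the sample (n - 1) form.
def summary_stats(values)
  count = values.length
  sum = values.sum
  mean = sum.to_f / count
  squared_deviations = values.sum { |v| (v - mean)**2 }
  variance = count > 1 ? squared_deviations / (count - 1) : 0.0
  {
    count: count,
    min: values.min,
    max: values.max,
    sum: sum,
    mean: mean,
    variance: variance,
    stdev: Math.sqrt(variance)
  }
end

result = summary_stats([2, 4, 4, 4, 5, 5, 7, 9])
result.each { |name, value| puts "#{name}: #{value}" }
```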


truncate

Truncates a string to a specified length.

t = Plotrb::Transform.truncate do
  from 'data.text'
  to 'truncated'
  max_length 20
  position :middle
  ellipsis '***'
end
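To show what the position and ellipsis options do, here is a plain-Ruby sketch. The helper truncate_text is hypothetical, and I am assuming that the maximum length includes the ellipsis, so treat the exact character counts as illustrative.

```ruby
# Hypothetical helper: shortens `text` to at most `max_length` characters.
# Assumption: the ellipsis counts toward max_length.
def truncate_text(text, max_length, position: :right, ellipsis: "...")
  return text if text.length <= max_length

  keep = [0, max_length - ellipsis.length].max
  case position
  when :left
    # Keep the end of the string, ellipsis in front.
    ellipsis + (keep.zero? ? "" : text[-keep..-1])
  when :middle
    # Keep both ends, ellipsis in between.
    head = (keep / 2.0).ceil
    tail = keep / 2
    text[0, head] + ellipsis + (tail.zero? ? "" : text[-tail..-1])
  else
    # Keep the start of the string, ellipsis at the end.
    text[0, keep] + ellipsis
  end
end

alphabet = ("a".."z").to_a.join
middle = truncate_text(alphabet, 20, position: :middle, ellipsis: "***")
puts middle
```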


unique

Constructs a new data set that contains unique values for the specified field.

t = Plotrb::Transform.unique.from('').to('new_data')


window

Performs a "sliding window" over a data array and outputs each window frame.

t = Plotrb::Transform.window.size(3).step(2)

The above example returns triples in the data set, such that the last value of the previous triple is the first value in the next triple, because the step size is 2.
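A quick plain-Ruby sketch shows the overlap described above. The sample data is made up, and the code only mimics the size and step semantics rather than calling Plotrb.

```ruby
data = [1, 2, 3, 4, 5, 6, 7]
size = 3
step = 2

# each_cons yields every consecutive window of `size` items;
# keeping the first window of every `step` windows applies the step.
windows = data.each_cons(size).each_slice(step).map(&:first)

windows.each { |frame| puts frame.inspect }
```

With a size of 3 and a step of 2, each frame starts two positions after the previous one, so adjacent frames share exactly one value.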


zip

Merges two data sets together according to a join key. If no key is provided, the data sets are merged by indices.

t = Plotrb::Transform.zip do
  with 'unemployment'
  match ''
  against 'data.key'
  as 'value'
end

This example matches records in the input data with records in the secondary data set named "unemployment", where the values of (primary data) and data.key (secondary data) match. Matching values in the secondary data are added to the primary data in the field named "value".
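The join itself can be pictured with a small plain-Ruby sketch. The field names "id", "key" and "rate" and the sample rows are hypothetical; only the output field name "value" comes from the example above. Leaving unmatched rows with a nil value is my assumption for this sketch.

```ruby
# Hypothetical primary and secondary data sets.
primary = [{ "id" => 1 }, { "id" => 2 }, { "id" => 3 }]
secondary = [
  { "key" => 2, "rate" => 0.07 },
  { "key" => 1, "rate" => 0.05 }
]

# Index the secondary data by its join key for fast lookup.
lookup = secondary.each_with_object({}) { |row, index| index[row["key"]] = row }

# Attach the matching secondary record to each primary record under "value".
# Records without a match get nil (an assumption for this sketch).
joined = primary.map { |row| row.merge("value" => lookup[row["id"]]) }

joined.each { |row| puts row.inspect }
```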

Visual Encoding Transforms


force

Performs force-directed layout for network data.

The transform acts on two data sets: one containing nodes and one containing links. Apply the transform to the node data, and include the name of the link data as a transform parameter.


geo

Performs a cartographic projection. Given longitude and latitude values, sets corresponding x and y properties for a mark.


geopath

Creates paths for geographic regions such as countries, states and counties.


link

Computes a path definition for connecting nodes within a network or tree diagram.


pie

Computes a pie chart layout.

If the value property is not given, all pie slices will have equal spans.
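As a sketch of what such a layout computes, the plain-Ruby helper below turns values into start and end angles. pie_layout and its output keys are hypothetical names for this example; passing equal weights reproduces the equal-span case mentioned above.

```ruby
# Hypothetical helper: assigns each value an angular span proportional
# to its share of the total, walking from start_angle to end_angle.
def pie_layout(values, start_angle: 0.0, end_angle: 2 * Math::PI)
  total = values.sum.to_f
  span = end_angle - start_angle
  angle = start_angle
  values.map do |value|
    slice = { start_angle: angle, end_angle: angle + span * value / total }
    angle = slice[:end_angle]
    slice
  end
end

weighted = pie_layout([1, 1, 2])
equal = pie_layout(Array.new(4, 1)) # no values given: equal weights

weighted.each { |slice| puts slice.inspect }
```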


stack

Computes layout values for stacked graphs.

The :silhouette offset will center the stacks, while :wiggle will attempt to minimize changes in slope to make the graph easier to read. If :expand is chosen, the output values will be in the range [0,1].

You can also call #reverse or #inside_out directly to set the order.
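To give a feel for the offsets, here is a plain-Ruby sketch covering the zero, :silhouette and :expand cases. stack_layout is a hypothetical helper, the silhouette rule (centering each stack relative to the tallest one) is my assumption based on the usual D3 convention, and :wiggle is left out because it needs slope information across stacks.

```ruby
# Hypothetical helper: `columns` is an array of stacks, each an array of
# values. Returns the lower (y0) and upper (y1) bound for every value.
def stack_layout(columns, offset: :zero)
  tallest = columns.map(&:sum).max.to_f
  columns.map do |values|
    total = values.sum.to_f
    # :silhouette centers each stack relative to the tallest stack (assumed).
    base = offset == :silhouette ? (tallest - total) / 2 : 0.0
    # :expand rescales each stack so that it spans [0, 1].
    scale = offset == :expand ? 1.0 / total : 1.0
    values.map do |value|
      y0 = base
      base += value * scale
      { y0: y0, y1: base }
    end
  end
end

columns = [[1, 3], [2, 6]]
zero = stack_layout(columns)
expanded = stack_layout(columns, offset: :expand)
centered = stack_layout(columns, offset: :silhouette)

expanded.each { |stack| puts stack.inspect }
```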


treemap

Computes a squarified treemap layout.


wordcloud

Computes a word cloud layout.
