Visualizing your data in Ruby
Charty is open-source Ruby library for visualizing your data in a simple way.
In Charty, you need to write very few lines of code for representing what you want to do.
It lets you focus on your analysis of data, instead of plotting.
$ brew install Python
$ pip3 install matplotlib
$ bundle
You should install tk libraries before install python and should add enabling shared library option to installing python.
So you may have to do pyenv uninstall 3.x.x
first.
$ apt install -y tk-dev python3-tk
$ CONFIGURE_OPTS="--enable-shared" pyenv install 3.x.x
gem install charty --pre
gem install matplotlib
sudo apt install python3-pip
sudo python3 -m pip install -U pip matplotlib
e.g.
$ docker build -f ./Dockerfile.dev -t charty-dev:latest .
$ docker run --rm -v $(pwd):/charty charty-dev:latest bundle install
$ docker run --rm -it -v $(pwd):/charty charty-dev:latest ./bin/console
irb(main):001:0> Charty::VERSION
=> "0.2.2"
# When using jupyter notebook
$ docker run --rm -it -v $(pwd):/charty -p 8888:8888 charty-dev:latest
Charty supports statistical plotting as Python’s seaborn.
In the following examplles, we use the penguins
dataset provided in red-datasets.
require "datasets"
penguins = Datasets::Penguins.new
The following code shows a basic workflow of the visualization with Charty.
First you need to load the Charty library.
require "charty"
Next you msut have a dataset you want to visualize. Here, we use the penguins dataset provided in red-datasets library.
require "datasets"
penguins = Datasets::Penguins.new
Next you need to create a plotter object by a plotting method. Here, we use scatter_plot
method to show the relationship
among body_mass_g
, flipper_length_mm
, and species
columns in the penguins dataset.
plot = Charty.scatter_plot(data: penguins, x: :body_mass_g, y: :flipper_length_mm, color: :species)
If you want to render and save this plotter object into an HTML file by plotly backend, you can do it like below.
Charty::Backends.use(:plotly) # select plotly backend
plot.save("scatter.html") # save the plot as an HTML file
When you already have prepared playwright-ruby-client,
you can render a plot into a PNG file by plotly backend by specifying a filename with .png
extension.
plot.save("scatter.png")
If you use Charty on Jupyter Notebook with IRuby kerenl (a.k.a. IRuby notebook),
you can render the plot just evaluate a plotter object. For example, the code below shows a scatter plot figure in
the output area.
Charty::Backends.use(:plotly)
Charty.scatter_plot(data: penguins, x: :body_mass_g, y: :flipper_length_mm, color: :species)
Note that if you want to use the pyplot backend, you need to activate the integration between the pyplot backend and IRuby.
You can activate the integration by the following two lines.
Charty::Backends.use(:pyplot)
Charty::Backends::Pyplot.activate_iruby_integration
Charty’s statistical bar plot shows the relationship between a categorical variable and estimated means of a numeric variable.
This plot automatically calculates mean estimation and its 95% confidence interval of the numeric variable.
When we specify the categorical varaible as x-axis, the plot draws a vertical bar chart.
Instead, when we specify the categorical variable as y-axis, the plot draws a horizontal bar chart.
The following code shows the relationship between species and the mean body masses of penguins in a vertical bar chart.
Charty.bar_plot(data: penguins, x: :species, y: :body_mass_g)
Exchanging x and y axes alternates the orientation of the resulting chart.
Charty.bar_plot(data: penguins, x: :body_mass_g, y: :species)
Adding color axis introduces color grouping in the bar plot.
Charty.bar_plot(data: penguins, x: :species, y: :body_mass_g, color: :sex)
Charty’s statistical box plot shows distributions of a numeric variable per categories.
The distributions are showed by boxes with whiskers that characterized by five-number summary.
This plot automatically calculates five-number summary the numeric variable per categories.
When we specify the categorical varaible as x-axis, the plot draws a vertical box plot chart.
Instead, when we specify the categorical variable as y-axis, the plot draws a horizontal box plot chart.
The following code draws a vertical box plot to show distributions of penguins’ body mass per species.
Charty.box_plot(data: penguins, x: :species, y: :body_mass_g)
As bar_plot
above, exchanging x and y axes alternates the orientation of the resulting chart.
Charty.box_plot(data: penguins, x: :body_mass_g, y: :species)
Adding color axis introduces color grouping in the box plot.
Charty.box_plot(data: penguins, x: :species, y: :body_mass_g, color: :sex)
Charty’s scatter plot shows the relationship between two numeric variables.
Charty.scatter_plot(data: penguins, x: :body_mass_g, y: flipper_length_mm)
Adding color axis introduces color grouping in the scatter plot.
The following example specifies :species
variable in the color axis.
It shows the different species by the different colors.
Charty.scatter_plot(data: penguins, x: :body_mass_g, y: flipper_length_mm, color: :species)
Moreover, size and style axes can be specified.
The following example specifies :sex
variable in the style axis.
Charty.scatter_plot(data: penguins, x: :body_mass_g, y: flipper_length_mm, color: :species, style: :sex)
require 'charty'
charty = Charty::Plotter.new(:pyplot)
bar = charty.bar do
series [0,1,2,3,4], [10,40,20,90,70], label: "sample1"
series [0,1,2,3,4], [90,80,70,60,50], label: "sample2"
series [0,1,2,3,4,5,6,7,8], [50,60,20,30,10, 90, 0, 100, 50], label: "sample3"
range x: 0..10, y: 1..100
xlabel 'foo'
ylabel 'bar'
title 'bar plot'
end
bar.render("sample_images/bar_pyplot.png")
Charty also supports Daru::DataFrame, Numo::NArray, NMatrix and ActiveRecord as Data Abstraction Layer.
For example.
require 'charty'
charty = Charty::Plotter.new(:pyplot)
### when Daru::DataFrame
require 'daru'
df = Daru::DataFrame.new({'a':[1,2,3,4], 'b':[4,5,6,7], 'c':[8, 9, 10, 11]})
charty.table = df
### when Numo::NArray
require "numo/narray"
narray = Numo::DFloat.new(3,5).seq
charty.table = narray
### when NMatrix
require "nmatrix"
nmatrix = NMatrix.new([3, 4], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype: :int64)
charty.table = nmatrix
### when ActiveRecord
require "active_record"
ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")
ActiveRecord::Schema.define do
create_table :foos do |t|
t.integer :price
t.integer :sales
end
end
class Foo < ActiveRecord::Base
end
100.times{|i| Foo.create!(price: 10 * i, sales: (1..100).to_a.sample) }
sales = Foo.where("sales >= 40")
charty.table = sales
bar = charty.to_bar(:price, :sales)
bar.render('sample')
box_plot = charty.to_box_plot(:price, :sales)
box_plot.render('sample')
bubble = charty.to_bubble(:price, :sales, :id)
bubble.render('sample')
curve = charty.to_curve(:price, :sales)
curve.render('sample')
scatter = charty.to_scatter(:price, :sales)
scatter.render('sample')
error_bar = charty.to_error_bar(:price, :sales)
error_bar.render('sample')
hst= charty.to_hst(:price, :sales)
hst.render('sample')
create an instance of the library you want to use.
require 'charty'
# when you want to use matplotlib.pyplot
charty = Charty::Plotter.new(:pyplot)
# when you want to use gruff
charty = Charty::Plotter.new(:gruff)
# when you wanto to use rubyplot
charty = Charty::Plotter.new(:rubyplot)
bar = charty.bar do
series [0,1,2,3,4], [10,40,20,90,70], label: "sample1"
series [0,1,2,3,4], [90,80,70,60,50], label: "sample2"
series [0,1,2,3,4,5,6,7,8], [50,60,20,30,10, 90, 0, 100, 50], label: "sample3"
range x: 0..10, y: 1..100
xlabel 'foo'
ylabel 'bar'
title 'bar plot'
end
bar.render("sample_images/bar_pyplot.png")
curve2 = charty.curve do
series [0,1,2,3,4], [10,40,20,90,70], label: "sample1"
series [0,1,2,3,4], [90,80,70,60,50], label: "sample2"
series [0,1,2,3,4,5,6,7,8], [50,60,20,30,10, 90, 0, 100, 50], label: "sample3"
range x: 0..10, y: 1..100
xlabel 'foo'
ylabel 'bar'
end
curve2.render("sample_images/curve_pyplot.png")
curve = charty.curve do
function {|x| Math.sin(x) }
range x: 0..10, y: -1..1
xlabel 'foo'
ylabel 'bar'
end
curve.render("sample_images/curve_with_function_pyplot.png")
Not supported
box_plot = charty.box_plot do
data [[60,70,80,70,50], [100,40,20,80,70], [30, 10]]
range x: 0..10, y: 1..100
xlabel 'foo'
ylabel 'bar'
title 'box plot'
end
box_plot.render("sample_images/box_plot_pyplot.png")
Not supported
Not supported
scatter = charty.scatter do
series 0..10, (0..1).step(0.1), label: 'sample1'
series 0..5, (0..1).step(0.2), label: 'sample2'
series [0, 1, 2, 3, 4], [0, -0.1, -0.5, -0.5, 0.1], label: 'sample3'
range x: 0..10, y: -1..1
# xlabel 'x label'
# xlabel ''
ylabel 'y label'
title 'scatter sample'
end
scatter.render("sample_images/scatter_pyplot.png")
error_bar = charty.error_bar do
series [1,2,3,4], [1,4,9,16], xerr: [0.5,1.0,1.5,0.3], yerr: [0.6,0.2,0.8,0.1], label: 'label1'
series [1,2,3,4], [16,9,4,1], label: 'label2'
series [1,2,3,4,5,6,7,8], [14,14,14,14,14,14,14,14], label: 'label2', xerr: [0.5,1.0,1.5,0.3, 1.1, 1.2, 1.3, 1.4]
range x: 0..10, y: -1..20
xlabel 'x label'
title 'error_bar'
end
error_bar.render("sample_images/error_bar_pyplot.png")
Not supported
Not supported
bubble = charty.bubble do
series 0..10, (0..1).step(0.1), [10, 100, 1000, 20, 200, 2000, 5, 50, 500, 4, 40], label: 'sample1'
series 0..5, (0..1).step(0.2), [1, 10, 100, 1000, 500, 100], label: 'sample2'
series [0, 1, 2, 3, 4], [0, -0.1, -0.5, -0.5, 0.1], [40, 30, 200, 10, 5]
range x: 0..10, y: -1..1
xlabel 'x label'
ylabel 'y label'
title 'bubble sample'
end
bubble.render("sample_images/bubble_pyplot.png")
Not supported
hist = charty.hist do
data [[10, 10, 20, 30, 40, 40,40,40,40,40, 50, 10, 10, 10], [100, 100, 100, 100, 90, 90, 80, 30, 30, 30, 30, 30]]
range x: 0..100, y: 0..7
xlabel 'x label'
ylabel 'y label'
title 'histogram sample'
end
hist.render("sample_images/hist_pyplot.png")
Not supported
layout = charty.layout
layout << curve
layout << scatter
layout.render("sample_images/subplot_pyplot.png")
Not supported
Not supported
curve_list = [0.5, 0.75].map do |f|
charty.curve(f:f) do
function {|x| Math.sin(f*x) }
range x: 0..10, y: -1..1
end
end
scatter_list = [-0.5, 0.5].map do |f|
charty.scatter(f: f) do
series Charty::Linspace.new(0..10, 20), Charty::Linspace.new(0..f, 20)
range x: 0..10, y: -1..1
end
end
grid_layout = charty.layout(:grid2x2)
grid_layout << curve_list
grid_layout << scatter_list
grid_layout.render("sample_images/subplot2_pyplot.png")
Not supported
Not supported
MIT License