Postgres Arrays vs JSON Datatypes in Rails 5
Having multiple values under one attribute definitely has its use cases. There are plenty of articles that compare the NoSQL database performance to the PostgreSQL JSON datatypes. Their results show the maturity of tools provided by this database management system. So which data types are best suited for what?
When you already decided that normalization is not the way you want to go and that in your specific case data duplication is better than having yet another join table you will most likely face the options below.
Array
Its main disadvantage is that you are limited to one data type and the strict order of the array. It's pretty easy to set up and requires less storage in comparison to jsonb. It has multi dimensional support (requires same number of elements though) and indexing with GIN, which greatly speeds up query performance.
class CreateMovies < ActiveRecord::Migration[5.2]
def change
create_table :movies do |t|
t.integer 'ratings', array: true, default: []
end
add_index :movies, :ratings, using: :gin
end
end
Movie.create(ratings: [7, 7, '1', 'string', :random_symbol]).ratings
#=> [7, 7, 1, 0, nil]
Movie.create(ratings: [[7, 5], [7]]).ratings
# invalid array
Movie.create(ratings: [[7, 5], [7, nil]]).ratings
# valid array
Movie.create(ratings: [[7, 5], [7, nil], 4]).ratings
# invalid array
Movie.where("7 = ANY ratings")
# movies having rating 7
Movie.where('ratings @> ARRAY[?]::integer[]', [7, 1])
# movies having rating 7 and 1
Use case: simple lists of information. Especially useful when duplications are needed, for example for saving multiple ratings of the same value.
Hstore
While not being a strictly JSON data type, it can be used as such - hstore operates on a string based key/value pair system. It doesn't allow for nesting and because of that has been mostly labeled as inferior to the jsonb data type, even though it also supports GIN and GIST indexing. Considering that inside Rails you also need to remember to enable the hstore extension its usability is questionable.
class CreateMovies < ActiveRecord::Migration[5.2]
def change
enable_extension 'hstore' unless extension_enabled?('hstore')
create_table :movies do |t|
t.hstore 'additional_data', default: {}
end
add_index :movies, :additional_data, using: :gin
end
end
Movie.create(additional_data: { 'advisor' => 'Random',
'aspect_ratio' => '2.35 : 1',
'advisor' => 'John Doe' }).additional_data
#=> {'advisor'=>'John Doe', 'aspect_ratio'=>'2.35 : 1'}
Movie.where('additional_data ? :key', key: 'advisor')
# movies with advisor
Movie.where('additional_data -> :key LIKE :value',
key: 'advisor', value: '%John%')
# movies with advisor named John
Use case: if you want to store simple key/value type information and expect to never need to nest data.
JSON
Acts more or less as expected from a JSON datatype - it allows nesting and using some basic data types. Compared to jsonb, it has no support for indexes but is also less strict (jsonb doesn't allow for the NULL character). It requires less storage compared to jsonb.
class CreateMovies < ActiveRecord::Migration[5.2]
def change
create_table :movies do |t|
t.json 'payload'
end
end
end
payload = { "request" => { data: { "yearsPlayedInTV" => [2018, 2019] } } }
Movie.create(payload: payload).payload
#=> {"request"=>{"data"=>{"yearsPlayedInTV"=>[2018, 2019]}}}
Movie.where("payload #>> '{request, data, yearsPlayedInTV, 0}' = :year",
year: "2018")
# movies where first year played in tv was 2018
Use case: log and payload storing, especially when no frequent access is needed.
JSON
While very similar to the json data type, it provides additional operators for querying as well us support for indexing. Those two features make this data type preferable over the other ones. It is worth noting that since jsonb has to parse the JSON data into a binary format, it tends to be slower than the json data type when writing but faster when reading the data.
class CreateMovies < ActiveRecord::Migration[5.2]
def change
create_table :movies do |t|
t.jsonb 'additional_data'
end
add_index :movies, :additional_data, using: :gin
add_index :movies,
"(additional_data->'screenwriters')",
name: "index_movies_on_additional_data_screenwriters"
end
end
additional_data = { "screenwriters" => ['John Doe', 1], other: nil }
Movie.create(additional_data: additional_data).additional_data
#=> {"screenwriters"=>["John Doe"], "other"=>nil}
Movie.where("additional_data -> 'screenwriters' ? :screenwriter",
screenwriter: "John Doe")
# movies with screenwriters named John Doe
Use case: any case of storing JSON type of data
Worth reading
PostgreSQL documentation - arrays
PostgreSQL documentation - hstore
PostgreSQL documentation - json and jsonb
Rails Guides - Active Record and PostgreSQL
Photo by Sergi Kabrera on Unsplash