Posted by Seamus on Tuesday, September 04, 2012.

Upsert for MySQL, PostgreSQL, and SQLite3 (and Ruby)

Our upsert library for Ruby gives you NoSQL-like upsert functionality in traditional RDBMS databases. How?

50%–80% faster than ActiveRecord

New in 0.4.0: When used in PostgreSQL mode, database functions are re-used, so you don’t have to be in batch mode to get the speed advantage.

You don’t need ActiveRecord to use it, but it’s benchmarked against ActiveRecord and found to be up to 50% to 80% faster than traditional techniques for emulating upsert:

# postgresql (pg library)
Upsert was 78% faster than find + new/set/save
Upsert was 78% faster than find_or_create + update_attributes
Upsert was 88% faster than create + rescue/find/update

# mysql (mysql2 library)
Upsert was 46% faster than find + new/set/save
Upsert was 63% faster than find_or_create + update_attributes
Upsert was 74% faster than create + rescue/find/update
Upsert was 28% faster than faking upserts with activerecord-import (which uses ON DUPLICATE KEY UPDATE)

# sqlite3
Upsert was 72% faster than find + new/set/save
Upsert was 74% faster than find_or_create + update_attributes
Upsert was 83% faster than create + rescue/find/update

(run the tests on your own machine to get these benchmarks)

What is a selector? What is a document?

upsert was inspired by the MongoDB upsert method – AKA mongo-ruby-driver’s update method – and involves a “selector” (how to find the row to be inserted or updated) and a “document” (attributes that should be set once the record has been found.)

Example 1

  • Selector: :name => 'Jerry'
  • Document: :age => 5
  • Expression: upsert.row({:name => 'Jerry'}, :age => 5)

Example 2

  • Selector: :id => 45
  • Document: :updated_at => Time.now
  • Expression: upsert.row({:id => 45}, :updated_at => Time.now)

Unfortunately, you currently can’t do things like :counter => 'counter + 1'.

Quickstart

One record at a time:

connection = Mysql2::Client.new([...])
upsert = Upsert.new connection, 'pets'
upsert.row({:name => 'Jerry'}, :breed => 'beagle')

With ActiveRecord helper: (first require 'upsert/active_record_upsert')

Pet.upsert({:name => 'Jerry'}, :breed => 'beagle')

In batch mode, which is the fastest:

connection = Mysql2::Client.new([...])
Upsert.batch(connection, 'pets') do |upsert|
  upsert.row({:name => 'Jerry'}, :breed => 'beagle')
  upsert.row({:name => 'Pierre'}, :breed => 'tabby')
end

What blog is this?

Safety in Numbers is Brighter Planet's blog about climate science, Ruby, Rails, data, transparency, and, well, us.

Who's behind this?

We're Brighter Planet, the world's leading computational sustainability platform.

Who's blogging here?

  1. Patti Prairie CEO