Safety in Numbers
Brighter Planet's blog
What you should know about mysql2 memory usage
If you’re using mysql2, you should be aware of a memory usage issue:
# mysql2 gem - no way to avoid using a lot of memory if you're streaming a lot of rows
client = Mysql2::Client.new(:host => "localhost", :username => "root")
results = client.query("SELECT * FROM users WHERE `group`='githubbers'")
# mysql gem - keep memory usage flat if you're streaming a lot of rows
dbh = Mysql.init
dbh.connect "localhost", "root"
dbh.query_with_result = false
dbh.query("SELECT * FROM users WHERE `group`='githubbers'")
results = dbh.use_result
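With the mysql gem you then drain the result one row at a time. A minimal sketch of that loop (the helper name is mine; `result` is assumed to respond to fetch_row, like the Mysql::Result returned by dbh.use_result above):

```ruby
# Drain a streamed result one row at a time. Rows are handed to the
# block and never accumulated in an array, so memory use stays flat
# regardless of how many rows the query returns.
def each_streamed_row(result)
  while row = result.fetch_row
    yield row
  end
end

# e.g. each_streamed_row(dbh.use_result) { |row| puts row.inspect }
```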
While working on our reference data web service, I ran these benchmarks:
- Exporting 5,000 rows using mysql (the “old” gem) stays flat on memory
- Exporting 5,000 rows using mysql2 takes an ever-increasing amount of memory (not good!)
The problem
The author of the gem in question, mysql2, knows about the issue. In a nutshell, the gem’s use of mysql_store_result (as opposed to mysql_use_result) leads the underlying libmysql library to always load entire resultsets into memory… even if :cache_rows => false is passed as a runtime option. A :streaming => true option would make perfect sense!
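The difference between the two C calls can be sketched in plain Ruby (an analogy only, not the libmysql API): mysql_store_result behaves like materializing an array up front, while mysql_use_result behaves like a lazy enumerator.

```ruby
# Analogy only -- these methods stand in for libmysql behavior.

# mysql_store_result-style: the entire resultset is copied into client
# memory before you see the first row.
def store_style(rows)
  rows.to_a
end

# mysql_use_result-style: rows are handed over one at a time and can be
# garbage-collected as you go; nothing is buffered up front.
def use_style(rows)
  return enum_for(:use_style, rows) unless block_given?
  rows.each { |r| yield r }
end
```

The use_style enumerator can even walk a resultset that would never fit in memory, which is exactly the property the benchmarks above are measuring.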
The solution
- I had to modify my gem, mysql2xxxx (which provides mysql2csv, mysql2json, and mysql2xml) to use the “old” mysql gem.
- If you’re using mysql2 (it’s the default on Rails 3!) then be aware that processing huge resultsets has a different memory impact than it did with mysql.
- Vote up the issue on github… it would be nice to optionally use mysql_use_result!
I believe mysql2 has a promising future and this memory problem will probably be gone soon. Thanks brianmario!