While you can always fix a bug, you cannot always recover of inconsistent data.
In this blog post, I’ll describe some practices and techniques I use to prevent inconsistent data from silently ruining your companies’ most valuable asset: its database.
Use SQL database features
SQL databases provides simple features to enforce data integrity.
“Well… I don’t know.”
Most columns in your database should be set to a value for the record to be consistent. When writing your migration, set them to
null: false. The database will blow up if you forgot to set them instead of silently setting them to NULL.
create_table :users do |t| t.string :email, null: false end
“So that guy has two accounts with the same email address?”
Add index with unique constraints on unique attributes. It’s cheap and it makes look-up even more efficient! In case you also have a `deleted_at` attribute, add it to the unique index and you’re good to go.
add_index :users, [:email, :deleted_at], unique: true
“Errr, the money transfer failed half way through. We did not credit the other guy.”
Updating multiple records? Wrap that into a SQL transaction. If something goes wrong, changes are reverted and the database is rolled back to a consistent state.
Account.transaction do from_account.debit!(amount) to_account.credit!(amount) end
“Two workers were running the invoicing task in parallele… Some people got charged twice.”
Use locking when you want to prevent a record from being updated by others when you are using it.
task = InvoicingTask.lock(true).first task.perform!
“We lost the destination for this pending order…”
Foreign keys will help you prevent orphan records and missing relationships. They are effective on race conditions that Active Record validations can’t spot. I use Foreigner to add foreign key support to migrations.
Use ActiveRecord validations as live checks
Validations are mainly used to check user’s inputs. I also use them as live tests to ensure data integrity.
Fail early, fail loudly
cancel!: shout at objects, they will just blow up when things go wrong.
So instead of doing:
def update_price(product_name, new_price) if product = Product.find_by_name(product_name) product.update_attributes(price: new_price) end end
def update_price!(product_name, new_price) product = Product.find_by_name!(product_name) product.update_attributes!(price: new_price) end
Your users are likely to see an error message when something go wrong, but that’s much better than leaving inconsistent data.
Ensure that existing records are valid
As the code evolves, existing records become invalid. You’ll know about it when a user attempts to update his profile or when you run a migration that updates data. Run ActiveSanity daily to ensure that your data is valid.
What about you? Do you think that those practices are worthwhile? Are there other technics you use to ensure data integrity?