User model
Now that we have a basic Rails application in place, the first step to being able to authenticate users is to create a User model. We can do this with the following command:
bin/rails generate model User email:string password_digest:string
Now, you might be wondering why we name the password field password_digest instead of simply password. This is because we must never store passwords in plain text in the database, for two important security reasons:
- If the database is compromised, the attacker will have access to all the emails and passwords of all users. As users probably also use the same email/password combination on other websites (even if they shouldn't), this could lead to a lot of trouble.
- Even if we are not under attack, the developers working on the project who have access to the database will see those passwords, resulting in the same problem.
To avoid this, we must hash passwords before storing them in the database.
If you are not familiar with what hashing is, let me explain, as it's a very important concept that is often confused with encrypting and signing. It is very important to understand the difference when working on authentication features.
This chapter is the hardest and the most theoretical of the course, so bear with me. After this one, it will be easier and we'll write more code!
Cryptographic techniques
Hashing
Hashing is the process of converting a given string into a fixed-size string of gibberish. Hashing is deterministic, which means that hashing the same string twice will result in the same hash. It's also a one-way operation: there is no practical way to reverse the process and compute the original input from the output.
This is why we hash passwords. We don't want anyone (even the developers working on the application) to be able to retrieve the original value of the password from its hash!
Let's do a quick example with the SHA256 hashing algorithm in the Rails console:
# Example of the SHA256 hashing algorithm
require "digest"
# Hashing the same string twice will give the same result
Digest::SHA256.hexdigest("hello")
# => "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
Digest::SHA256.hexdigest("hello")
# => "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
# Hashing another string will give a different result
Digest::SHA256.hexdigest("hey")
# => "fa690b82061edfd2852629aeba8a8977b57e40fcb77d1a7a28b26cba62591204"
Now, you might be wondering: if we store hashed passwords in the password_digest field in the database, when a user wants to sign in, how do we ensure that they submit the correct password if we can't retrieve the original value from the hash?
Because hashing algorithms are deterministic, we can simply hash the password submitted by the user and compare it to the hash stored in the database. If the hashes are equal, we know that it is the correct password!
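To make the comparison concrete, here is a minimal sketch using SHA256 for illustration only. The password_matches? helper and the stored value are hypothetical; they only show the "hash the submission, then compare" idea.

```ruby
require "digest"

# Hypothetical stored value: the SHA256 hash of the user's password ("hello").
stored_hash = Digest::SHA256.hexdigest("hello")

# Hypothetical helper: hash the submitted password and compare it
# to the hash we stored, without ever knowing the original password.
def password_matches?(submitted_password, stored_hash)
  Digest::SHA256.hexdigest(submitted_password) == stored_hash
end

puts password_matches?("hello", stored_hash) # prints true
puts password_matches?("hey", stored_hash)   # prints false
```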
There are many different hashing algorithms available, but they are not all suitable for hashing user-generated passwords. In fact, you should never use SHA256 for storing user passwords.
Let's imagine an attacker tries to guess a user's password from the SHA256 hash:
- Hashing with SHA256 is very fast. An attacker can try to hash a lot of different random strings in a small amount of time to check if one of them will match the password's hash. This is called a brute force attack.
- Hashing a given string with SHA256 will always return the same result. While hashing is a one-way operation, there is nothing stopping attackers from precomputing a huge database of strings and their corresponding hashes. If attackers have access to password hashes, they can simply look up the hash in their database to find the original value. Such databases are called rainbow tables and this attack is called a rainbow table attack.
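The precomputation attack described above can be sketched with a plain lookup table. This is a simplification: real rainbow tables use a more compact structure, and the list of common passwords here is a made-up assumption.

```ruby
require "digest"

# Hypothetical list of common strings an attacker might precompute.
common_passwords = ["123456", "password", "hello", "qwerty"]

# Precompute the table once: SHA256 hash => original string.
lookup_table = common_passwords.to_h do |word|
  [Digest::SHA256.hexdigest(word), word]
end

# An unsalted SHA256 hash found in a leaked database
# (this is the hash of "hello" from the console example).
leaked_hash = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

# No brute force needed: a single lookup reveals the original value.
recovered = lookup_table[leaked_hash]
puts recovered # prints hello
```

Because the same input always produces the same SHA256 hash, the table can be built once and reused against every leaked database.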
To store user-generated passwords, you must always use a hashing algorithm specifically designed for user-generated passwords, such as bcrypt, which is the default for Ruby on Rails applications.
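Bcrypt is specifically designed to be slow. You can control how slow it is by changing a parameter called the cost factor: each time the cost factor increases by one, the amount of work needed to compute a hash doubles. A high cost factor will make it much harder for attackers to perform brute force attacks. Note that bcrypt only uses a small, fixed amount of memory, and the cost factor does not change it.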
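As a rough illustration of the cost factor, bcrypt runs 2 to the power of cost rounds of its expensive key setup. The bcrypt_rounds helper below is hypothetical and only does the arithmetic; it does not call bcrypt itself.

```ruby
# Hypothetical helper: number of key-setup rounds bcrypt performs
# for a given cost factor (2 to the power of cost).
def bcrypt_rounds(cost)
  2**cost
end

(10..14).each do |cost|
  puts "cost #{cost}: #{bcrypt_rounds(cost)} rounds"
end
# For example, cost 12 means 4096 rounds and cost 13 means 8192 rounds.
```

Raising the cost by one doubles the work for the attacker, but also for our own server on every sign-in, so the value is a trade-off.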
Bcrypt also uses a technique called salting to make it almost impossible to precompute rainbow tables. The salt is a random string that is combined with the password before hashing it. Let's walk through a simplified example, in which the salt is simply prepended to the password, to understand how it works:
# Example of a hashing algorithm using salt in pseudo code
# Let's imagine we want to hash the password "secret123":
password = "secret123"
# The hashing algorithm will automatically generate a random salt:
salt = "somesalt123"
# The algorithm then hashes the concatenation of the salt and the password:
hash(salt + password) # => "xyz123abc789"
# Finally, the algorithm prepends the salt that was used to the final hash.
# This is the value we would store in the `password_digest` field in the database.
hash_with_salt("secret123") # => "somesalt123:xyz123abc789"
# As the salt is randomly generated every time a string is hashed,
# hashing the same string multiple times will result in different hashes
hash_with_salt("secret123") # => "somesalt123:xyz123abc789"
hash_with_salt("secret123") # => "somesalt456:qldjv6qo4idd"
hash_with_salt("secret123") # => "somesalt789:mvnq23qvmqp0"
# Now, if we want to check if a given password matches a given hash,
# we simply need to reuse the same salt by extracting it from the hash:
salt, hash = "somesalt123:xyz123abc789".split(":")
password = "secret123"
# Re-hashing the password with the salt and comparing it to the stored hash
hash(salt + password) == hash # => true
In our example, as the salt is a string of 11 characters, and those characters are lowercase letters and numbers, there are 36^11 possible salts. This means that the same string can be hashed in 36^11 different ways.
In the bcrypt algorithm, the salt is 128 bits long and is encoded as 22 characters that can be numbers, lowercase and uppercase letters, "." and "/". This gives 2^128 possible salts for the same password, which makes it almost impossible to precompute rainbow tables for bcrypt!
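The pseudo code above can be turned into a small runnable sketch. It uses SHA256 as a stand-in so that it only needs the standard library; this is not how bcrypt works internally, and the hash_with_salt and salted_match? helpers are hypothetical names.

```ruby
require "digest"
require "securerandom"

# Hypothetical helper: generate a random 11-character salt, hash
# salt + password, and return "salt:hash" as in the pseudo code.
def hash_with_salt(password, salt = SecureRandom.alphanumeric(11))
  "#{salt}:#{Digest::SHA256.hexdigest(salt + password)}"
end

# Hypothetical helper: extract the salt from the stored value,
# re-hash the submitted password with it, and compare the results.
def salted_match?(password, stored)
  salt, _hash = stored.split(":")
  hash_with_salt(password, salt) == stored
end

first  = hash_with_salt("secret123")
second = hash_with_salt("secret123")

puts first == second                    # false unless both random salts collide
puts salted_match?("secret123", first)  # prints true
puts salted_match?("wrong", first)      # prints false
```

Because every stored value carries its own salt, an attacker would need a separate precomputed table for each possible salt.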
Encrypting
Encrypting is the process of converting a given string into a string of gibberish as well. The difference with hashing is that encryption is reversible as long as you have the encryption key. This is why we must never use encryption for passwords, as the people who have access to the encryption key (at least someone on the engineering team) could decrypt the users' passwords and access the original values.
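To see why encryption is reversible, here is a minimal sketch using AES from Ruby's standard OpenSSL library. The choice of cipher and the variable names are assumptions made for illustration; the point is only that anyone holding the key can recover the original value.

```ruby
require "openssl"

# Encrypt: the key (and the initialization vector) are generated here
# and must be kept in order to decrypt later.
cipher = OpenSSL::Cipher.new("aes-256-cbc")
cipher.encrypt
key = cipher.random_key
iv = cipher.random_iv
ciphertext = cipher.update("secret123") + cipher.final

# Decrypt: with the same key and iv, the original value comes back.
decipher = OpenSSL::Cipher.new("aes-256-cbc")
decipher.decrypt
decipher.key = key
decipher.iv = iv
plaintext = decipher.update(ciphertext) + decipher.final

puts plaintext # prints secret123
```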
We won't use encryption at all in this course.
Signing
Signing is the process of adding a signature to a given string to ensure that it hasn't been tampered with. The signature is generated using a secret key and can be verified using the same key.
To sign strings in Ruby on Rails, we use the ActiveSupport::MessageVerifier class:
verifier = ActiveSupport::MessageVerifier.new("secret key")
verifier.generate("signed message")
# => "InNpZ25lZCBtZXNzYWdlIg==--6cfaf50e583b1ca92c5f591ad8bfa835195b7260"
verifier.verify("InNpZ25lZCBtZXNzYWdlIg==--6cfaf50e583b1ca92c5f591ad8bfa835195b7260")
# => "signed message"
Now let's not be fooled by the apparent randomness of the signed string. Let's analyze it closely:
signed_string = "InNpZ25lZCBtZXNzYWdlIg==--6cfaf50e583b1ca92c5f591ad8bfa835195b7260"
This string is actually composed of two parts separated by two dashes (--):
payload, signature = signed_string.split("--")
The payload is public; it's simply base64-encoded and anyone can read the data that is contained in it:
Base64.decode64(payload)
# => "\"signed message\""
However, the signature is computed from the payload and the secret key, so if an attacker changes the payload without knowing the key, the signature no longer matches and the verifier will raise an error:
tampered_payload = Base64.encode64("tampered message").strip
# => "dGFtcGVyZWQgbWVzc2FnZQ=="
tampered_signed_string = [tampered_payload, signature].join("--")
# => "dGFtcGVyZWQgbWVzc2FnZQ==--6cfaf50e583b1ca92c5f591ad8bfa835195b7260"
verifier.verify(tampered_signed_string)
# raises ActiveSupport::MessageVerifier::InvalidSignature
What is important to remember here is that signed data is public and can't be tampered with by an attacker thanks to the signature. We will use signing later in this course in order to make sure the values we store in cookies can't be tampered with.
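To demystify the signature, here is a simplified sketch of the same idea built with an HMAC from the standard library. It is not the actual ActiveSupport::MessageVerifier implementation: the sign_message, verify_message and secure_equal? helpers, the InvalidSignature class, and the choice of SHA256 are all assumptions made for illustration.

```ruby
require "openssl"
require "base64"

SECRET_KEY = "secret key" # hypothetical key, never hard-code a real one
InvalidSignature = Class.new(StandardError)

# Compare two strings without stopping at the first differing byte.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  a.bytes.zip(b.bytes).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
end

# The payload is only base64-encoded (public); the signature is an
# HMAC of the payload that can't be computed without the secret key.
def sign_message(message, key = SECRET_KEY)
  payload = Base64.strict_encode64(message)
  signature = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), key, payload)
  "#{payload}--#{signature}"
end

# Recompute the signature from the payload and reject any mismatch.
def verify_message(signed_string, key = SECRET_KEY)
  payload, signature = signed_string.split("--")
  expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), key, payload.to_s)
  raise InvalidSignature unless secure_equal?(signature.to_s, expected)
  Base64.strict_decode64(payload)
end

signed = sign_message("signed message")
puts verify_message(signed) # prints signed message
```

Swapping the payload while keeping the old signature makes verify_message raise InvalidSignature, because the attacker can't recompute the HMAC without the key.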
Back to our User model
Now that we have a clear understanding of the difference between hashing, encrypting, and signing, let's go back to our User model. The generator that we used created a few files for us.
Let's first have a look at the migration:
# db/migrate/20240708114129_create_users.rb
class CreateUsers < ActiveRecord::Migration[7.1]
  def change
    create_table :users do |t|
      t.string :email
      t.string :password_digest
      t.timestamps
    end
  end
end
This migration creates a users table with two columns: email and password_digest. The password_digest column will store the hashed passwords of our users. Before we run the migration, we should add some database constraints and indexes to ensure the data integrity and performance of our application.
# db/migrate/20240708114129_create_users.rb
class CreateUsers < ActiveRecord::Migration[7.1]
  def change
    create_table :users do |t|
      t.string :email, null: false, index: { unique: true }
      t.string :password_digest, null: false
      t.timestamps
    end
  end
end
The null: false database constraints make sure that the email and password_digest fields must always be present.
The index: { unique: true } index ensures that there can't be two users with the same email address in the database. It also makes the search for a user by email address faster, and we will search users by email address when signing in later in this course.
We can now safely run the migration:
bin/rails db:migrate
If we run bin/rails test now, we will see that the tests are failing with the following error:
ActiveRecord::RecordNotUnique: RuntimeError: UNIQUE constraint failed: users.email
This is because we didn't update the automatically generated fixture:
# test/fixtures/users.yml
# Read about fixtures at https://api.rubyonrails.org/classes/ActiveRecord/FixtureSet.html
one:
  email: MyString
  password_digest: MyString
two:
  email: MyString
  password_digest: MyString
As we added a unique index on the email field, we can't have two users with the same email address in the database. We should update the fixture to reflect this:
# test/fixtures/users.yml
alex:
  email: [email protected]
  password_digest: TODO
Now, as we saw earlier, we need to hash our users' passwords before storing them in the database. To do this, we are going to stick with Rails defaults and use bcrypt. In the Gemfile, we can uncomment the following line:
# Gemfile
gem "bcrypt", "~> 3.1.7"
We then need to install the gem:
bundle install
Don't forget to restart your Rails server after installing a new gem!
We are now ready to use bcrypt in order to hash users' passwords. In the users.yml fixture file, we can use the BCrypt::Password.create method to hash the password:
# test/fixtures/users.yml
alex:
  email: [email protected]
  password_digest: <%= BCrypt::Password.create("password") %>
Alex's password is "password" and we store a hash of this string in the database thanks to the BCrypt::Password.create method.
We can check that it worked as expected by running the tests to load the fixtures and then opening the Rails console in the test environment:
bin/rails test
RAILS_ENV=test bin/rails console
As we are in the test environment, we should have access to the fixtures:
alex = User.first
# => #<User id: 980190962, email: "[email protected]", password_digest: "[Filtered]">
# The value stored in the password_digest column is a hash
alex.password_digest
# => "$2a$12$bRZAa4OE/NW00g4TmIyUF.m6WfoHxFYN6WhgvuJUz6gyZrQatIARW"
# Your value will be different from mine here thanks to the salt!
# As hashing is deterministic, we can rehash the "password" string
# with the same salt and compare it to the value stored in the database
BCrypt::Password.new(alex.password_digest) == "password"
# => true
Note: Let's analyze the bcrypt hashed password to see the different parts that we talked about in the hashing section of this lesson:
alex.password_digest
# => "$2a$12$bRZAa4OE/NW00g4TmIyUF.m6WfoHxFYN6WhgvuJUz6gyZrQatIARW"
hash = BCrypt::Password.new(alex.password_digest)
# => "$2a$12$bRZAa4OE/NW00g4TmIyUF.m6WfoHxFYN6WhgvuJUz6gyZrQatIARW"
# The version of the bcrypt algorithm used
hash.version
# => "2a"
# The cost factor used to hash the password
hash.cost
# => 12
# The salt used to hash the password
hash.salt
# => "$2a$12$bRZAa4OE/NW00g4TmIyUF."
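The same parts can also be read directly from the stored string, without the bcrypt gem. The sketch below assumes the usual layout of a bcrypt hash: version, cost, then a 22-character salt immediately followed by the 31-character hash itself. The variable names are only illustrative.

```ruby
digest = "$2a$12$bRZAa4OE/NW00g4TmIyUF.m6WfoHxFYN6WhgvuJUz6gyZrQatIARW"

# Splitting on "$" yields ["", "2a", "12", "<salt followed by the hash>"].
_, version, cost, salt_and_checksum = digest.split("$")

salt = salt_and_checksum[0, 22]    # the first 22 characters are the salt
checksum = salt_and_checksum[22..] # the rest is the hash of the password

puts version   # prints 2a
puts cost.to_i # prints 12
puts salt      # prints bRZAa4OE/NW00g4TmIyUF.
puts checksum  # prints m6WfoHxFYN6WhgvuJUz6gyZrQatIARW
```

This is why bcrypt can verify a password from the stored value alone: the cost factor and the salt needed to rehash the submitted password are stored next to the hash.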
If you want to configure the cost factor used by bcrypt, you can do it in an initializer:
# config/initializers/bcrypt.rb
BCrypt::Engine.cost = 12
Let's run our tests again and make sure they are passing:
bin/rails test
We should be back to green!
Now that we have a clear understanding of how bcrypt works and that our tests are green, we can finalize the User model. The first step is to add the has_secure_password method to the model:
# app/models/user.rb
class User < ApplicationRecord
  has_secure_password
end
According to the Rails documentation, the has_secure_password method adds a presence validation for the password when a user is created. It also ensures that the passwords are hashed before being stored in the password_digest column. Finally, it adds the authenticate method, which returns the user when the given password is correct and false otherwise:
# Thanks to has_secure_password, the two following lines perform the same check.
BCrypt::Password.new(alex.password_digest) == "password"
# => true
alex.authenticate("password")
# => returns alex (the user itself) when the password is correct, false otherwise
# It's a bit nicer to write!
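To picture what these generated methods do, here is a rough plain-Ruby sketch. It is not the Rails implementation: SketchUser is a hypothetical class, and PBKDF2 from the standard library stands in for bcrypt so that the example runs without any gem. It only mirrors the behaviour of hashing on assignment and of authenticate returning the user or false.

```ruby
require "openssl"
require "securerandom"

# Hypothetical plain-Ruby class; NOT how Rails implements has_secure_password.
class SketchUser
  ITERATIONS = 10_000 # hypothetical work factor for the PBKDF2 stand-in

  attr_reader :password_digest

  # Hash the password with a fresh random salt; the plain text is never kept.
  def password=(plain_password)
    salt = SecureRandom.hex(16)
    @password_digest = "#{salt}:#{derive(plain_password, salt)}"
  end

  # Return the user itself when the password is correct, false otherwise.
  def authenticate(plain_password)
    salt, expected = password_digest.split(":")
    derive(plain_password, salt) == expected ? self : false
  end

  private

  # Slow, salted derivation (PBKDF2 here, bcrypt in a real Rails app).
  def derive(plain_password, salt)
    OpenSSL::KDF.pbkdf2_hmac(plain_password, salt: salt, iterations: ITERATIONS,
                             length: 32, hash: "sha256").unpack1("H*")
  end
end

user = SketchUser.new
user.password = "password"
puts user.authenticate("password").equal?(user) # prints true
puts user.authenticate("wrong")                 # prints false
```

Returning the user rather than true is what lets sign-in code look a user up and check the password in a single expression.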
Let's also add a minimum length validation to the password:
# app/models/user.rb
class User < ApplicationRecord
  MINIMUM_PASSWORD_LENGTH = 8
  has_secure_password
  validates :password, length: { minimum: MINIMUM_PASSWORD_LENGTH }
end
Let's also add some tests to the model to make sure the code we just wrote is working as expected. The first test will check that the password is at least 8 characters long:
# test/models/user_test.rb
require "test_helper"
class UserTest < ActiveSupport::TestCase
  test "password must be at least 8 characters" do
    invalid_password = "a" * (User::MINIMUM_PASSWORD_LENGTH - 1)
    user = User.new email: "[email protected]", password: invalid_password
    assert_not user.valid?
    assert_includes user.errors.full_messages, "Password is too short (minimum is 8 characters)"
  end
end
Let's also add a test to make sure that the password must be present:
# test/models/user_test.rb
require "test_helper"
class UserTest < ActiveSupport::TestCase
  # The previous test
  test "password must be present" do
    user = User.new email: "[email protected]", password: ""
    assert_not user.valid?
    assert_includes user.errors.full_messages, "Password can't be blank"
  end
end
Let's run our tests and make sure they are passing:
bin/rails test
Now that our tests are green, we can add validations for the email. We must make sure the email is present and unique:
# app/models/user.rb
class User < ApplicationRecord
  MINIMUM_PASSWORD_LENGTH = 8
  has_secure_password
  validates :password, length: { minimum: MINIMUM_PASSWORD_LENGTH }
  validates :email, presence: true, uniqueness: true
  normalizes :email, with: ->(email) { email.strip.downcase }
end
Note that we only want to store consistently formatted emails in the database, so we need to normalize them before running the validations. In our case, normalizing means stripping any leading or trailing whitespace from emails and downcasing them.
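The normalization itself is plain Ruby and can be tried outside of Rails. In this small sketch, the normalize_email lambda mirrors the one passed to normalizes, and the email addresses are made up for illustration.

```ruby
# Same lambda as the one passed to `normalizes` in the model.
normalize_email = ->(email) { email.strip.downcase }

# Hypothetical addresses: the same email typed in two different ways.
typed_at_sign_up = "alex@example.com"
typed_at_sign_in = "  ALEX@EXAMPLE.COM "

puts normalize_email.call(typed_at_sign_in) # prints alex@example.com

# After normalization, both values are identical.
puts normalize_email.call(typed_at_sign_up) == normalize_email.call(typed_at_sign_in) # prints true
```

Without this step, the two spellings would count as different emails for the uniqueness validation and for lookups at sign-in.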
We can now add tests for the email. We will test both the normalization and the uniqueness at the same time by duplicating the email of an existing user, upcasing it, and adding both leading and trailing whitespaces:
# test/models/user_test.rb
require "test_helper"
class UserTest < ActiveSupport::TestCase
  # All the previous tests
  test "email uniqueness" do
    user = users(:alex).dup
    user.email = " #{user.email.upcase} "
    user.password = "password"
    assert_not user.valid?
    assert_equal ["has already been taken"], user.errors[:email]
  end
end
Let's run our tests one last time to make sure they are green:
bin/rails test
If everything is green on your side as well, it means we are ready to move on to the next chapter where we will implement the user registration feature.
Summary
In this chapter, we learned the difference between hashing, encrypting, and signing. We also learned that there are different hashing algorithms and that some of them, such as bcrypt, are specifically designed to safely store user-generated passwords.
We learned how to use the bcrypt gem to hash passwords before storing them in the database thanks to the has_secure_password method.
Finally, we learned how to write tests for our models and how to use fixtures to test our validations.