Last Updated on September 21, 2023 by Christopher G Mendla
I have a Ruby on Rails site that I use as an online portfolio. GitHub released a warning that there were vulnerabilities in Rails and associated gems. I started incrementally upgrading Rails. However, when I tried to push the Rails 5.2 update to production, the site crashed. Here is how I recovered.
The Problem – a bad update
The site
The repo for the site is on GitHub and is public. The site itself is hosted on Digital Ocean. I’m hoping to continue to add features to the site and use that in lieu of ‘coding challenges’.
- Hosted on Digital Ocean using their one-click Ruby installation
- Ubuntu 18.04.4 LTS
- Rails (Started with 5.1.4. The target was 5.2.4.2. Current after recovery is 5.1.7)
- Nginx/Postgres
Development environment
The development environment is straightforward with local development on a laptop, GitHub as the repository and a Digital Ocean Droplet.
- GitHub public Repo
- Local development on Ubuntu running under VirtualBox on Windows
- Small, incremental branches/pull requests.
- Workflow
- Go to the master branch on local. Fetch and pull for the latest from the repo
- Create a feature branch
- Make the changes and test
- Push the branch and merge to master
- Git fetch/pull on the server, bundle install if required.
- Switch back to master on local. Git fetch/pull.
- The only environments are Local Development and Production. There is no Staging or Preview.
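The workflow above maps onto a handful of git commands. Here is a minimal sketch, written as shell functions so the steps are easy to reuse. The function names and the example branch name are my own placeholders, and the GitHub remote is assumed to be called origin.

```shell
# Sketch of the workflow. Function names are illustrative;
# "origin" is assumed to be the GitHub remote.

# Bring local master up to date with the repo.
sync_master() {
  git checkout master &&
  git fetch origin &&
  git pull origin master
}

# Create a small feature branch from a fresh master.
# $1 is a placeholder branch name, e.g. "rails-5-1-7".
start_feature() {
  sync_master &&
  git checkout -b "$1"
}

# Push the branch so a pull request can be opened and merged on GitHub.
push_feature() {
  git push -u origin "$1"
}

# On the server after the merge: pull master and install any new gems.
deploy_master() {
  git fetch origin &&
  git pull origin master &&
  bundle install
}
```

Typical use would be `start_feature rails-5-1-7`, make and test the changes, `push_feature rails-5-1-7`, merge the PR on GitHub, then `deploy_master` on the droplet and `sync_master` back on the laptop.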
The need for Rails updates.
I was in the process of adding some rspec tests to the site as a ‘coding exercise’ for a potential employer. The site had been unmaintained for some time and the Rails versions were lagging (along with Ruby and some of the gems).
As I was working with the repo, I saw a notice appear that there were vulnerabilities. One was critical and the other was important. Both were tied to the Rails version. Further investigation revealed that I would either have to update Rails to 5.2.4.2 or implement workarounds and monkey patches.
The warning below was on the repository’s home page.
The image below shows the details of the issues
Updating Rails was a much cleaner solution so I went down that path.
When updating Rails, I use Railsdiff. I prefer to update Rails incrementally rather than making one large leap.
All was good until Rails 5.2
I updated Rails through 5.1.7. That was about 3 PRs which went well.
Rails 5.2: the update that went off the rails.
Going from 5.1.7 to 5.2 was a much more involved update. There were a number of files that required changes.
In addition, Rails 5.2 changed the way credentials are stored. Previously I had the passwords as environment variables on the server. Rails 5.2 introduces an encrypted YAML file (config/credentials.yml.enc), unlocked by a master key, to store passwords.
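For reference, this is roughly what setting up the Rails 5.2 credentials involves. It is a sketch, not what I had in place at the time. The function names are made up, and nano is just an assumed fallback editor.

```shell
# Rails 5.2 keeps secrets in config/credentials.yml.enc, which is
# decrypted with the key in config/master.key (kept out of git).

# Open the encrypted credentials in an editor; Rails re-encrypts on save.
# Falls back to nano (an assumption) when EDITOR is not set.
edit_credentials() {
  EDITOR="${EDITOR:-nano}" bundle exec rails credentials:edit
}

# On the server the key can be supplied through the RAILS_MASTER_KEY
# environment variable instead of copying config/master.key.
# $1 is the path to a file holding the master key.
export_master_key() {
  RAILS_MASTER_KEY="$(cat "$1")" &&
  export RAILS_MASTER_KEY
}
```

Either way, the server needs the master key before the app can read anything from the encrypted file.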
fe_sendauth error
Everything seemed OK in local dev. The tests ran OK and puma ran OK.
I pushed the final changes and merged into master. When I pulled master onto the server, I got an error that the site could not connect to the database.
I spent a good part of the day troubleshooting the problems. I had not set up the new credentials system, and the Postgres connection would not work for the site even though I was able to connect via the console.
I started tweaking some things such as the pg_hba.conf file. Finally I got the credentials right and was able to connect, BUT only with puma running locally in single-threaded mode.
508 Nginx Gateway not found.
My new error was a 508 Nginx Gateway not found. I spent a couple more hours troubleshooting to no avail.
The best 8 bucks a month I ever spent.
When I created the new Digital Ocean droplet I was a little conflicted about whether I should spend the 8 dollars a month for backups. I’m in between jobs and money doesn’t grow on trees. I decided to go for it and added backups to the droplet.
Trying to fix the 5.2 upgrade failure was turning into a real mess and was sucking hours up like a black hole of time.
I could continue to bang my head against the wall or I could simply rewind back to a stable system.
I realized I had the following situation:
- A full site/server backup as of Sunday night.
- About a dozen PRs merged to master including the fatal 5.2 update.
So I should be able to restore the backup, then apply the PRs after the backup but before the attempted 5.2 upgrade.
Getting back to life.
I decided to go for a restore. If this were a real production site, that could be problematic as there would be data changes. That wasn’t the case here since there was no data to lose.
The recovery process.
- I took a snapshot on Digital Ocean. I really didn’t think I would need it but backups are backups.
- I made sure I didn’t have any local changes. I did not. I checked out master on the local machine.
- I started reverting the PRs beginning with the oldest. When you open a closed PR, you will see a “Revert” button. That undoes that PR and will update the master accordingly.
- The next step was to restore the latest backup. That went surprisingly well.
- Now, I had my site back. This avoided the awkward situation of a potential employer checking it out and getting a “Gateway not found” error page.
- On the server, I did a git fetch / git pull. That applied all the PRs that were left after reverting the post-5.2 PRs. At that point, my site was synced properly with master in the GitHub repository.
- On the local machine, I made sure I was on the master branch. A git fetch / git pull reset my local code to the GitHub repo. I was a bit concerned as to how that would work exactly.
NOTE – I’m not sure how much of a mess this might have made as far as the git history. I don’t think it messed things up much, if at all. The reverts of the previous PRs created new PRs so the tracking should be good.
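The git side of the recovery can be sketched as follows. I used the Revert button on GitHub; `git revert -m 1` on a PR's merge commit is the command-line counterpart, shown here as an alternative rather than what I ran. The function names and arguments are placeholders.

```shell
# Command-line counterpart of GitHub's "Revert" button for a merged PR.
# $1 is the SHA of the PR's merge commit; -m 1 keeps the master side.
revert_merged_pr() {
  git revert -m 1 --no-edit "$1"
}

# On the restored server: bring the app directory up to the current master.
# $1 is the application directory on the droplet (a placeholder).
# Runs in a subshell so the caller's working directory is unchanged.
update_server() {
  (
    cd "$1" &&
    git checkout master &&
    git fetch origin &&
    git pull origin master &&
    bundle install
  )
}

# On the laptop: refuse to sync if there are uncommitted changes,
# otherwise bring local master in line with the GitHub repo.
sync_local() {
  if [ -n "$(git status --porcelain)" ]; then
    echo "uncommitted changes, not syncing" >&2
    return 1
  fi
  git checkout master &&
  git fetch origin &&
  git pull origin master
}
```

Because a revert adds a new commit instead of rewriting history, a plain fetch and pull is enough to bring the server and the laptop back in line afterwards. The `-m 1` option only applies to merge commits; a PR that was squash-merged would be reverted without it.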
I still need to continue the update process.
I was successful in getting the site stable again. However, I still need to continue with the Rails and Ruby updates.
Digital Ocean allows you to take snapshots for a very nominal fee based on the size of the snapshot and how long you keep it. My plan with the Rails updates is to take a snapshot before each update, and then delete it after I confirm that the update is stable. That isn’t an ideal workflow but it should work.
That would allow me to roll back a failed update.
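If you prefer the command line to the control panel, the snapshot-then-clean-up plan could look something like this with Digital Ocean's doctl tool. This is a sketch: it assumes doctl is installed and authenticated, the droplet ID, snapshot ID and snapshot name are placeholders, and the flags should be checked against `doctl compute droplet-action snapshot --help` before relying on them.

```shell
# Take a named snapshot of the droplet before an update and wait for it.
# $1 = droplet ID, $2 = snapshot name (both placeholders).
snapshot_before_update() {
  doctl compute droplet-action snapshot "$1" --snapshot-name "$2" --wait
}

# List droplet snapshots to find the ID of the one to remove.
list_snapshots() {
  doctl compute snapshot list --resource droplet
}

# Delete a snapshot once the update has proven stable. $1 = snapshot ID.
delete_snapshot() {
  doctl compute snapshot delete "$1" --force
}
```

The idea is to call `snapshot_before_update` right before pulling an update onto the server, and `delete_snapshot` only after the site has run cleanly for a while.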