Porting SQLite3 Data to PostgreSQL Database in Rails 4

This past week I’ve been building a database with almost 1,000,000 records across five tables. So far, it has been just fine to use the Rails default, SQLite3, to manage it. That’s no longer the case, as I need to start thinking about the deployment environment for my application, and I have also been running into some issues with my SQLite3 database locking as I try to add records. Thus, I have ported my data over to PostgreSQL. The process is straightforward, but I did run into a few hangups and did not find a great guide to cover my situation. After some struggling, I got my new PostgreSQL database up and running with all of my data. Here’s how I did it.

First, I had to create a dumpfile from my sqlite3 database. I had been doing this already as quick way to back up my data. I read that it might be more convenient to dump it into a different format, like CSV, that is better understood by other databases like PostgreSQL, but I used the normal SQLite3 command and found that the manual adjustments from there were not cumbersome.

$ sqlite3 db/development.sqlite3 .dump > db/sqlite-dumpfile201311231200.sql

This is the path to my database: db/development.sqlite3. The .dump command breaks down your database into SQL statements, including everything you need for your schema. Then you indicate where you want to direct the dump using >. I created my dumpfile in the same directory and added a timestamp. Since the dump gives you SQL statements, the file extension should be .sql.

This file is sufficient to reseed another SQLite database, but PostgreSQL databases have some differences in syntax and datatypes. At first, I looked around to find a script or an application to manage these changes for me. My advice is: don’t bother. Finding one is hard enough, then you have to learn how to use it. A simple find/replace in your dumpfile will solve it just fine. This is what I had to change:

find replace with
datetime timestamp
varchar(255) text

I only had to update this last one because I had entries in “description” columns that were too long. According to this very helpful post on StackOverflow, you may also need to change your boolean values into a type that is compatible with PostgreSQL. My booleans were already formatted correctly. For reference, here’s the list from the PostgreSQL documentation:

Valid literal values for “true” state: For the “false” state:
’t’ ‘f’
‘true’ ‘false’
‘y’ ‘n’
‘yes’ ‘no’
‘on’ ‘off’
‘1’ ‘0’

“Leading or trailing whitespace is ignored, and case does not matter. The key words TRUE and FALSE are the preferred (SQL-compliant) usage.” docs

The next step is to take out the SQLite-specific statements. At the top of my dumpfile, I had PRAGMA foreign_keys=OFF;. This tells the database to not enforce foreign key constraints, but PostgreSQL handles these differently. I took out the statement with no issue.

At the bottom of my dumpfile, I had a statement like this for each of my tables:

INSERT INTO "sqlite_sequence" VALUES('ks_users', 227393);

These tell SQLite where to set the next value for the autoincrementing primary keys of my tables, which in the example is ks_users. PostgreSQL does the same thing, but in a different way, using sequences, which are special database objects designed for this purpose. We’ve already told our PostgreSQL database to use sequences when we changed our primary keys to SERIAL instead of INTEGER with AUTOINCREMENT, so the next step is just to take out all references to sqlite_sequence tables. (If you need to add new records to your database after the conversion, you’ll need to make one other change that we will talk about a bit lower down.)

If you haven’t created the PostgreSQL database yet, follow these steps:

  • Update config/database.yml to look something like this:
  adapter: postgresql
  database: my_database_development
  pool: 5
  timeout: 5000
  adapter: postgresql
  database: my_database_test
  pool: 5
  timeout: 5000

  adapter: postgresql
  database: my_database_production
  pool: 5
  timeout: 500
  • Start the database server with this command (if you’ve got the default settings)

    $ postgres -D usr/local/var/postgres

  • Create the database

    $ rake db:create

  • Don’t run the migrations. All the necessary statements are in the dumpfile.

At this point you should be able to fill your PostgreSQL database with your SQLite data with this command:

$ psql -d my_database_development -W < db/sqlite-dumpfile201311231200.sql

This invokes the PostgreSQL binary. The -d flag tells it to open a database with the name that follows (in this case “my_database_development”; the name of your database can be found in config/database.yml). Then -W specifies to prompt for your password. This password will be the password for your OSX user account. The last part of the command specifies which file to use to feed in to your new database. This will be the SQLite dumpfile that you created and then modified.

There’s one more step you need to do. If left as is, PostgreSQL will try to add records with the primary key starting at 1, because we removed the sequence information from the dumpfile, as it pertained specifically to SQLite. You won’t be able to add new records to the database because the primary key for new records won’t be unique. It is easy to update the sequence information manually, and you may also be able to include the correct statements into your dumpfile prior to loading it into your PostgreSQL database. Here’s the manual version:

  • Go into the PostgreSQL interactive terminal with your database

    $ psql -d my_database_development

  • I recommend looking at your schema next, so that you know the names of all the tables you have (database=# represents the psql prompt — the commands to run come after the =#)

    database=# \d

  • You can also check to see where your database will try to start autoincrementing from (but this is not necessary)

    database=# SELECT nextval('ks_users_id_seq');

  • Set the starting point to be one after than the last record in the table. Be sure to use single quotes for the arguments here

    database=# SELECT setval('ks_users_id_seq', (SELECT MAX(id) FROM (ks_users)+1));

  • Repeat this process for every table that has an autoincrementing primary key See this post for a more detailed explanation of this process.

And that’s all. You should be able to access your database with all of your old data and add records with no trouble.