Converting a SQLite Database to PostgreSQL
Recently, I had a substantial database stored in an SQLite format, and needed to import it into a PostgreSQL database. This required a significant number of transformations to clean up the content.
🔗PGLoader
My recommendation for converting the format from SQLite to Postgres is PGLoader. Specifically, PGLoader has documentation for importing SQLite databases.
This requires a script which specifies what actions PGLoader should take. I adapted the documentation example for my own purposes:
LOAD DATABASE
FROM sqlite:///path/to/sqlite/db.sqlite
INTO postgresql://username:password@host:port/db
WITH include drop, create tables, create indexes, reset sequences, on error resume next
SET work_mem to '128MB', maintenance_work_mem to '1024 MB';
This will delete any existing data, but see PGLoader 'With' options if you would prefer alternative functionality.
Then, save this to a file, like sqlite.load
, and run:
pgloader --verbose sqlite.load
The --verbose
flag will allow you to see more details about any errors which
occur.
🔗Dealing with Invalid Encoding
Unfortunately, during my import, I encountered the error:
ERROR Illegal :UTF-8 character needs more reporting
This killed the import of the remaining data in the table, so I needed to clean this data before import.
Please note, the following steps replicate the data several times, so ensure that you have enough storage to store these intermediate copies (you will probably need about 5 times the original SQLite DB's storage space). Alternatively, it is possible to stream the commands listed with pipes, however, this will not preserve the intermediary results if a command fails.
🔗Dumping SQLite Data
The first step is getting the data in a textual format that commandline tools can process (i.e. not the binary format used by SQLite).
sqlite3 /path/to/sqlite/db.sqlite .dump > dump.sql
🔗Filtering Invalid UTF-8
This can then be processed by uconv
, a tool from the ICU tools,
however, this is likely already packaged for your distribution.
uconv --callback skip -t utf8 dump.sql -o dump_utf8.sql
Here, I uses the skip
callback, which simply discards any invalid UTF-8 byte
sequences, however, there are many other escaping or replacement options
available, see uconv
's manual page.
🔗Restoring the SQL Dump
pgloader
needs an SQLite database to load data from, therefore, I proceeded to
load this modified dump into a new SQLite database:
sqlite /path/to/sqlite/db_utf8.sqlite < dump_utf8.sql
🔗Loading the Cleaned SQLite DB
This DB can now be loaded with pgloader
into the PostgreSQL database as
before. Just make sure to replace the database imported with
FROM
, with your new db_utf8.sqlite
file:
FROM sqlite:///path/to/sqlite/db_utf8.sqlite
🔗Cleanup
After verifying that all the imported data was processed:
table name errors read imported bytes total time read write
----------------------- --------- --------- --------- --------- -------------- --------- ---------
fetch 0 0 0 0.000s
fetch meta data 0 18 18 0.143s
Create Schemas 0 0 0 0.000s
Create SQL Types 0 0 0 0.017s
Create tables 0 14 14 0.100s
Set Table OIDs 0 7 7 0.023s
----------------------- --------- --------- --------- --------- -------------- --------- ---------
table1 0 3080230 3080230 421.7 MB 3m9.205s 3m2.768s 2m50.694s
table2 0 104150000 104150000 8.3 GB 1h3m14.511s 1h3m9.381s 56m7.111s
----------------------- --------- --------- --------- --------- -------------- --------- ---------
COPY Threads Completion 0 4 4 1h6m27.182s
Create Indexes 0 10 10 19m33.343s
Index Build Completion 0 10 10 6m8.346s
Reset Sequences 0 0 0 0.200s
Primary Keys 0 6 6 56.408s
Create Foreign Keys 1 1 0 0.010s
Create Triggers 0 0 0 0.000s
Install Comments 0 0 0 0.000s
----------------------- --------- --------- --------- --------- -------------- --------- ---------
Total import time 1 113416301 113416301 9.1 GB 1h33m5.489s
I.e., ensure that the read
and imported
columns are the same for all data
you need imported.
Now, you can delete all the temporary files:
rm /path/to/sqlite/db_utf8.sqlite
rm dump.sql dump_utf8.sql
# If you wish to delete the original DB
rm /path/to/sqlite/db.sqlite