Jan 12, 2014

INSERT VS COPY: The fastest way to do a bulk insert into PostgreSQL

It takes time to populate lots of data in database by many "INSERT" statements. The official PostgreSQL documentation mentions the topic.

PostgreSQL 9.3.2 Documentation Chapter 14. Performance Tips 14.4. Populating a Database

The chapter shows many options, but I am paying attention to first two.

Disable autocommit


PostgreSQL commits each statement automatically. That means if you run an INSERT statement, it runs like this for each statement:

  1. Open a transaction
  2. Insert data
  3. Close the transaction

It is redundant. In this case, it becomes faster if you use autocommit like this:

BEGIN; -- the beginning of the transaction
INSERT xxxx
INSERT xxxx
END;    -- the end of the transaction

PostgreSQL does not commit the statements between BEGIN and END.

COPY


COPY is also to use populate data, but the different point is that COPY does not offer autocommit. It means that PostgreSQL commit after all data is populated on a database.
Please refer to the document about how to use COPY statement.


No comments:

Post a Comment