Increase Your App Query Performance By Bulk Selecting Database

Image credits to Héctor Martínez at https://unsplash.com/photos/Hs-Tt4fBX3M

For those of you who haven’t heard, bulk processing in a nutshell can be described more or less of a small multiple processes that is being ran at one single process rather than individually. In this writing, I’m about to show you one simple case in which bulk query to database can enhance performance compared to doing several queries one after another.

There are a few process in this example that we will run through:

  1. Create postgres DB instance

Create Postgres DB Instance

We’re going to use docker for the sake of simplicity in creating disposable DB instance. Following the basic instruction in the following link: https://hub.docker.com/_/postgre

you can run the written command (with some minor adjustment)

docker run --name local-postgres -e POSTGRES_PASSWORD=password -p 5432:5432 -d postgres

We’re going to change the container name to be local-postgres, the password to be password, and declare port mapping 5432 to 5432.

Creating Table and Data Seeding

postgre DB instance is ready

After our DB container is running, we need to create the table and put some data for our experiment. You can download / see the seeder SQL query here:

Create Table and Seeds SQL

It will create users table with two columns (id auto increment primary key, name varchar), then we’ll insert 100 random data. More or less the table example will be like this kind of format:

table format example

Go Program to Do the Benchmarking

Since the DB is all ready, now we’re going to create simple go program to conduct the experiment.

Disclaimer: The code written for this experiment is written to the point for the sake of simplicity.

The code flow will works like these:

  1. Init the DB function (we’re using sqlx library for all the DB init and operation, https://github.com/jmoiron/sqlx)

The source code can be in this link:

Result

The result based on above scenario can be seen on table below (data is in miliseconds ms):

Experiment result, lower is better
lower is better

As we can see that the bulk process on average got 8.4 times faster than the single query for 10 data, and even 13.7 times faster than single query for 20 data.

Summary

Since the bulk query performance does better performance than looping multiple data and do query for each of them, it’s better to use bulk query if you want to get multiple data at once. Single query iterated through a loop is slower due to its flow to call and use DB connection over and over again while the bulk performance wins due to one call is all that it needs to get the data. However, this might not be working on all DB calling cases, It is more suitable for the case when one single simple and fast query is being done for multiple different parameters, then you can group it into batch process to get this kind of improvement.

Fellow Software Engineer

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store