Batch update performance enhancements using SQL withBatch()
Hi guys,
Recently as part of a project, I had to populate a SQLite database with large amounts of data pertaining to a number of classes requiring more than 5000 inserts and updates per class. I created a new SQLite database using Groovy’s Sql class. The initial strategy involved creating prepared statements and executing individual insert/update statements for each record that needed to be inserted/updated in the new SQLite database.
However the process was taking longer than expected, and the cumulative time taken for the whole application to sync less than 30 classes was coming to be more than 2 minutes. This time taken was extremely high regarding the context of the application and was a real performance bottleneck. What I did notice was that the insert statements were identical for a particular class, disregarding the values that needed to be inserted. The same was the case with the update statements.
After looking through the Sql GDK, I found a method named Sql.withBatch() that performs batch manipulation of records in a database. See the following code for illustration:
[code]
Sql sql = Sql.newInstance("jdbc:sqlite:/home/ron/Desktop/test.db", "org.sqlite.JDBC")
sql.execute("create table dummyTable(number)")
Long startTime = System.currentTimeMillis()
100.times {
sql.execute("insert into dummyTable(number) values(${it})")
}
Long endTime = System.currentTimeMillis()
println "Time taken: " + ((endTime – startTime)/1000)
[/code]
The output of the above code comes out to be 14.313 seconds. That is to execute 100 insert statements with only a single attribute, it would take around 15 seconds. The time taken to insert records is dependent on the number of records being inserted and increases exponentially. Clearly a performance bottleneck in any application involving batch inserts and updates.
Let us consider the same code and the performance with the withBatch() closure.
[code]
Sql sql = Sql.newInstance("jdbc:sqlite:/home/ron/Desktop/test.db", "org.sqlite.JDBC")
sql.execute("create table dummyTable(number)")
Long startTime = System.currentTimeMillis()
sql.withBatch {stmt->
100.times {
stmt.addBatch("insert into dummyTable(number) values(${it})")
}
stmt.executeBatch()
}
Long endTime = System.currentTimeMillis()
println "Time taken: " + ((endTime – startTime)/1000)
[/code]
The time taken with the above code comes out to be 0.103 seconds! A remarkable performance improvement over the conventional method of inserting records using the execute() method.
The only drawback with using the withBatch() closure is that it does not allow prepared statements to be added to the batch. This limits the use of batch statements as we have to manually create insert or update statements.
Thanks a lot!
Save tens of seconds for my application performance!