How JOOQ helped us reduce our query response times

How JOOQ helped us reduce our query response times

In this post, we aim to explain why we chose to try JOOQ as the data access library for one of our services, what were our main benefits and obstacles and how this led us to use it as our main database library in the rest of the services.

Info about the stack

Spring/Spring Boot is the dominant framework for web services in the Java world and this is what we are using. A small note here is that we are using Kotlin instead of Java.

Spring Boot is a "batteries included" framework and one of these batteries is a data access library which allows us to interact with any database.  The most common data access library, which we were also using in our services is Spring JPA.

Object Relational Mappers

Every mainstream language has an ORM and sometimes they are already bundled with the framework you are using.  Laravel has Eloquent, Ruby on Rails has Active Record, C# has Entity Framework and so on, and Java which in our case has Hibernate.

The truth is that ORMs reduce a lot of boilerplate and in Java's case they reduce it even more since Java has a lot of boilerplate code. This is one of the reasons developers like ORMs, because they allow them to use a simple:

val entity = entityRepository->findById(1)

instead of manually using the entity manager and messing with opening connections, managing sessions, closing sessions, concatenating strings to build a SQL query and so on to fetch a row from the database.

Some problems with ORMs are the following:

  • We need to learn how the ORM works, on top of a language and a framework to use it effectively
  • We cannot control the queries that an ORM produces easily
  • ORMs are good for trivial queries but very cumbersome when you need to build a more complex query

Sometimes, ORMs are poor even in trivial queries which was the issue in our case that we explain below.

The problem

In our case we have 2 entities, A and B and there is a relationship 1-1 between them.  What we wanted to do was to fetch a list of ids for entity A where a specific column was equal to a specific value. We realised that we had a problem, when the count of matching rows of the query was 563.

We started by using the common "findByXIn" method in the repository interface, which returns a list of entities.  That returned 563 objects which we had to keep in memory and then iterate over them to get their ids, something not really efficient. We also didn't need the entities themselves but only their ids. hat's why we decided to use a @Query annotation and write some HQL which essentially tried to select one field of the class, since HQL is working in an OOP manner.

But even that was not enough, the query needed 510ms to retrieve 563 items. Databases have decades of work put into them, they are extremely powerful which meant that spending 500ms to fetch 600 ids was very bizarre.

After spending some time with hibernate’s logs, we realised that the issue was the 1-1 relationship between A and B entities, which was eagerly loaded.  Making the relationship lazy was not an option, so we always had to load the B entity while looking for A entities.

We decided to create a custom repository where we used the entity manager.  The way Spring Boot repositories work didn't allow us to create a custom method in ARepository.  This meant that we needed to create 2 more classes, 1 interface and 1 implementation in order to create a simple method where we fetched a list of ids using raw SQL.

We had to bypass the ORM to do something very simple in theory. Fetching data now had more reasonable timings and we decided to move on, but we were not satisfied.  It felt weird having to build all these workarounds to accomplish something that simple.

We wanted to improve this and we decided to do some research for alternatives, and ended up choosing among Exposed, Ktorm, JOOQ and SQLDelight.  The first two are ORMs so we would probably face the same issues - if not more - with Hibernate with less content available on the internet for potential issues, so we ignored them. An engineer from our team had already experience working with JOOQ before that, so when comparing it with SQLDelight we decided that it would be better to go with JOOQ since it was around longer, had a lot of features and very good documentation.  So we chose to try JOOQ in a small service we have to evaluate it.

What is JOOQ?

JOOQ is a library created by Lukas Eder that allows you to write type safe SQL as if you were writing Java.  We are using Kotlin, but this is not a problem. JOOQ interacts with Kotlin without any issues, and there are cases that you can express things in a more elegant way because of Kotlin's expressiveness.  The way it works essentially is that it uses your connection credentials to the database, and based on your schema it can generate the classes that represent your tables and their columns. After doing that, you just need to use its very powerful DSL to write queries and execute them.  That's it.  You tell it what to do through code, and it converts it to SQL. It does not matter which database you are using, since JOOQ abstracts that part.  Technically it matters because for specific databases (e.g SQL Server, Oracle Express) you need to use the paid edition depending on your team size but this was not our case since we mainly use Postgres and Mysql.

Solving our issue

The results of testing JOOQ were very promising, we saw that writing queries was straightforward, what we wrote was directly translated to what we wanted in SQL and therefore we decided that we can try to solve our data access issues with JOOQ. Spoiler alert: It did!The difference was huge in 2 different aspects.

  • Comparing to the Hibernate version the query response times reduced from 510ms to 83.7ms
  • Comparing to raw SQL, we got rid of hand written SQL in prepared statements and instead we had a nice type-safe and expressive way to write our queries

Below you can find 2 screenshots of the queries using Hibernate and JOOQ that we captured in our Datadog dashboard.

An important detail is the difference of the number of the blue lines. Both images contain 2 main sets of blue lines separated by a white space. Each blue line represents a query to the database.

We will ignore the first blue set which is an unrelated query to our issue and focus on the second set. This allows us to visualise the fact that in the first case, for every database row we made an extra query to fetch the related B entity, which essentially was the cause of the slowdown.

The second image is after using JOOQ which shows that we had to make only 1 query.

The journey from Hibernate to JOOQ looked like this:

interface ARepository {
  fun findByValueEqSomeOtherValue(value): List<A>
}

to HQL

@Query("select a.id from A where a.someColumn := value")
interface ARepository {
  fun findMatchingA(value): List<Long>
}

to raw SQL

fun getMatchingA(value: Int): List<Long> {
  val query = entityManager.createNativeQuery(
  """
  SELECT a.id
  FROM a
  WHERE a.some_column = :value
  """
  )
  query.setParameter(value, value)
  return query.getResultList()
}

to JOOQ


fun getMatchingA(value: Int): List<Int> = 
  dsl.select()
     .from(A)
     .where(A.SOME_COLUMN.eq(value))
     .fetch(A.ID, Long::class.java)

In this example, some may say that the switch to JOOQ didn't change a lot but what we saw was that JOOQ pays off even more when you go further than simple selects. That is why we decided to also replace Hibernate gradually in the rest of services.

Issues we faced with JOOQ

There is no silver bullet and of course JOOQ is not a panacea to our problems. Below we can see some of the problems we faced.

Build configuration: One very powerful feature of JOOQ is its ability to generate classes based on the database schema, but this comes with a cost. We had to adjust our build flow to run the generator. Fortunately there is very good documentation about it. This of course, depending on the schema's size, adds time to the build time, but it can be reduced by not generating classes if the schema has not changed. In our case we decided to  use Testcontainers and Flyway  to run the migrations in the container and then make JOOQ look at that schema and run the generator.

Kotlin data classes: Since we are using Kotlin, we would like to use its null safety.  If we use the Kotlin generator the auto generated classes from JOOQ, currently produce data classes where all the fields are nullable. This is not very useful for us since we will have to use "?." in every field access. It is worth mentioning  that this is not entirely JOOQ's issue, since NULL has a different meaning in programming languages and SQL so there is no straightforward way to fix this.  For now we are using Java records instead of Kotlin data classes for POJOs and trying to reduce null pointers possibility.

Learning curve: It's a fact that everything new has its own learning curve and this of course applies to JOOQ. You have to learn new things but this is not the core issue. The core issue in this specific case is that you have to make a shift in the mentality about how you approach problems that interact with the database.

ORMs made us think less about our SQL and we let them do the job for us, even if there were many cases where they did a very poor job. With a tool like JOOQ you are supposed to model your queries effectively since it will be the programmer's job to write effective queries, and JOOQ is just going to translate it from code to SQL.

Conclusion

This post has no intention to discriminate against ORMs in any way. We believe that they are very powerful tools and Hibernate helped us to move forward with our services quickly.

We came to a point where we didn't need to just execute simple queries for certain microservices and Hibernate was a constant problem for us. That's why we looked for alternatives and it turned out to be a very good decision. Nothing is a silver bullet but we saw that JOOQ fits our needs better than Hibernate in some cases

Lastly, we would like to mention that this article is focused on the Java world and that similar tools may not be available in other ecosystems, which means that an ORM may be a more sensible choice.

JOOQ™ and its logo is a trademark of Data Geekery GmbH