Saturday, December 20, 2014

Data access benchmark: direct query, dOPF object query, dOPF entity query

It's been a long time since my last post (over a year!). In this post I'd like to share some benchmark result of data access using Free Pascal. The setup is as below:

  • Machine: HP Pavilion 14-n038tx
    • OS: Manjaro Linux x86-64 0.8.11 + first update + flash plugin critical security update, kernel 3.17.4-1-MANJARO
    • CPU: Intel Core i5-4200u 1.6 GHz (Turbo Boost up to 2.6 GHz)
    • RAM: 4 GB (stock) + 4 GiB (additional) = 7.6 GiB
    • HDD: Hitachi HTS54757 Rev A50A 750 GB ~= 698 GiB
  • Development Environment:
  • Benchmark Environment
    • MariaDB 10.0.15, default settings
    • Table 1: 2600 rows
    • Table 2: 2654 rows
    • Select query that joins table 1 and table 2 on an integer key
Actual table information is hidden, because they're real table used in my company's product, which is of course a proprietary product. A little info: table 2 has a foreign key to table 1, because it's meant as multilingual information for table 1 rows.

Benchmark flow is simple:
  1. Connect to db
  2. Query
  3. Loop over the result while creating JSON array of object
  4. Write it out (for output correctness comparison)
The result is, well, interesting as well as confusing. Why? Here's the result when I limit my select query result (in the SQL statement) to 2651 rows:

Looks normal, isn't it? A framework will have overhead so direct query will always be faster. But wait, what if I change the limit to 2652 or even without one:

Dafuq? How come dOPF ones get faster? At this stage, I thought the resulting txt files of dOPF might be broken, so I diff them with direct query one. Nope, they're all the same. I have no explanation at the moment because I haven't got the time to dig in what dOPF (or probably SQLdb as its connector) do to make this weird result.

dOPF surely has a big advantage over direct query because it's easy to change backend dbms simply by changing the value of TdSQLdbConnector.Driver. If the result is also faster than direct query, then there's no any other reason to use direct query for portable, cross dbms programming solution.

dOPF as a framework is very modular, it supports:
  • manual SQL
  • entities (result row is automatically converted as Pascal object, with automatic memory management)
  • SQL builder (automatic SQL generation from given conditions)
  • OPF itself (no SQL required in your code, you can save, load and access objects just like any other Pascal objects)
You're not forced to use one, feel free to mix any of them as you like.

If you want to try yourself, the whole package (source files + benchmark script) is available here. You should be able to run this on any supported platforms, provided you have bash (>3.0) & bc installed (because the benchmark script depends on it). Don't forget to edit to fill it with your database name and query.

Feel free to add more persistence frameworks (tiOPF for instance) or libraries (ZEOS for instance) to make more data available.