Vishesh Handa's personal website

More Nepomuk Performance Upgrades


As you might have read, the 4.11 release of Nepomuk is a lot faster at writes, so indexing is going to be much faster. Read performance, however, was nearly the same.

Architecture

All applications communicate with Nepomuk via the Storage Service, which acts as a server. It is responsible for loading the ontologies, managing Virtuoso and sending change notifications. Reads happen over a local socket, whereas writes are sent over D-Bus.

[Figure: Original Architecture]

The Storage Service communicates with Virtuoso via ODBC, which internally also uses a local socket. This architecture is quite similar to that of Akonadi.

Cutting out the middleman

I initially started with many optimizations in Soprano, where we use ODBC to communicate with Virtuoso. These gave a good 30% increase in performance. However, the largest gain came from removing the local socket communication between applications and the Storage Service.

[Figure: New Architecture]

All applications now communicate directly with Virtuoso for reading data. Writes still go through the Storage Service so that we can do type checking and send change notifications.
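To make the split between the read path and the write path concrete, here is a toy model in Python. It is only a sketch: the class names, the `ex:label` predicate and the in-memory set standing in for Virtuoso are all made up for illustration and are not Nepomuk or Soprano APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """Hypothetical stand-in for the Virtuoso store: a set of triples."""
    triples: set = field(default_factory=set)

    def query(self, predicate):
        # Return every (subject, predicate, object) triple with this predicate.
        return sorted(t for t in self.triples if t[1] == predicate)


@dataclass
class StorageService:
    """Hypothetical stand-in for the storage service on the write path."""
    db: Database
    allowed_predicates: set
    listeners: list = field(default_factory=list)

    def write(self, subject, predicate, obj):
        # Toy "type check": only predicates known to the ontology are accepted.
        if predicate not in self.allowed_predicates:
            raise ValueError(f"unknown predicate: {predicate}")
        triple = (subject, predicate, obj)
        self.db.triples.add(triple)
        # Change notifications go out only after a successful write.
        for notify in self.listeners:
            notify(triple)


class Application:
    """An application that reads directly but writes through the service."""

    def __init__(self, db, service):
        self.db = db
        self.service = service

    def read(self, predicate):
        # Read path: straight to the database, no middleman.
        return self.db.query(predicate)

    def write(self, subject, predicate, obj):
        # Write path: still routed through the storage service.
        self.service.write(subject, predicate, obj)
```

In this model a read never touches `StorageService`, which is the whole point of the change, while a write with an unknown predicate is rejected before it reaches the database and before any listener is notified.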

Performance Increase

Removing the Storage Service from the middle results in a performance increase of 6-7x. For example, listing 50000 results goes down from ~20 seconds to about 2.5 seconds.
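As a quick back-of-the-envelope check on the example above, the Python snippet below works out the ratio and the per-result cost from the two rounded timings. The function names are mine, and since both timings are approximate, the result should be read as a rough figure rather than a measurement.

```python
def speedup(before_s, after_s):
    """Ratio of the old runtime to the new runtime."""
    return before_s / after_s


def per_result_ms(total_s, n_results):
    """Average time spent per result, in milliseconds."""
    return total_s * 1000 / n_results


RESULTS = 50_000
BEFORE_S = 20.0   # "~20 seconds" from the example, rounded
AFTER_S = 2.5     # "about 2.5 seconds" from the example, rounded

ratio = speedup(BEFORE_S, AFTER_S)
before_ms = per_result_ms(BEFORE_S, RESULTS)
after_ms = per_result_ms(AFTER_S, RESULTS)

print(f"speedup: {ratio:.1f}x")
print(f"per result: {before_ms:.2f} ms before, {after_ms:.2f} ms after")
```

With these rounded inputs the ratio comes out at about 8x, or roughly 0.4 ms per result before and 0.05 ms after. That is in the same ballpark as the 6-7x overall figure, given how approximate the two timings are.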

[Figure: Charts]

Additionally, since the applications now communicate directly with the database, the CPU load of the Storage Service also goes down a lot: it no longer has to serialize and deserialize data.