Nepomuk and KDE 4.9.1
Last week 4.9.1 was tagged, and it should be released any day now. A large part of my time went into fixing bugs and stabilizing some of the most user-visible features, and this blog post highlights those changes.
The Nepomuk File monitoring service had a serious memory leak in 4.9.0, which was unfortunately not caught in the beta or RC testing period. Depending on the number of directories present, the file watcher service would consume a good 2 GB of your RAM. Fortunately, many distributions patched their own tarballs before releasing to the public.
Another bug was that lots of files were being unnecessarily reindexed when they were opened in write mode even though no changes had been made. With 4.9.1 we actually check if the modification time of the file has changed, even when the file notification system tells us otherwise.
This should fix most of the problems of files unnecessarily being reindexed.
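As a rough illustration of the idea (not the actual Nepomuk code, which is C++), the check can be sketched in Python. The class and method names here are hypothetical: the sketch remembers the modification time seen when a file was last indexed, and skips reindexing when a "closed after write" notification arrives but the modification time is unchanged.

```python
import os


class ReindexFilter:
    """Remembers the modification time seen when each file was last indexed.

    Hypothetical helper for illustration only; not part of Nepomuk.
    """

    def __init__(self):
        self._indexed_mtime = {}  # path -> st_mtime_ns at last indexing

    def should_reindex(self, path):
        """Call when the watcher reports that `path` was closed after being
        opened for writing. Returns True only if the modification time
        differs from the one recorded at the last indexing."""
        current = os.stat(path).st_mtime_ns
        if self._indexed_mtime.get(path) == current:
            return False  # opened in write mode, but nothing was written
        self._indexed_mtime[path] = current
        return True
```

The first notification for a file always triggers indexing; later ones do so only when the file was really modified.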
The File Indexer has also gone through a number of fixes:
Do not check for new files on startup
One of the most annoying things I found about file indexing was the initial check for new files on startup. Even though this check runs silently in the background, it was very annoying, especially for people who run Nepomuk all the time.
Now we only check whether the strigi version has changed, and only then do we look for new files and for files that could not be indexed the last time. If someone wants to force a check for new files, they can do so with a D-Bus call:
$ qdbus org.kde.nepomuk.services.nepomukfileindexer /nepomukfileindexer updateAllFolders false
Or wait till 4.10, when I add an option to do so in the KCM.
This patch technically made it into 4.9.0, but I never got the chance to blog about it. I’ve introduced a SimpleIndexer which serves as a backup when the strigi indexers provide incorrect data. This way, instead of having no data about the file, we at least have the basic information such as filename, url and mimetype. I have plans for a more concrete solution for 4.10, but that’s for another blog post.
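The fallback idea can be sketched roughly as follows. This is a Python illustration rather than the real SimpleIndexer: the function names are made up, and "incorrect data" is simplified here to mean that the richer indexer raised an error or returned nothing.

```python
import mimetypes
from pathlib import Path


def simple_index(path):
    """Collect only the basics: filename, url and mimetype."""
    p = Path(path).resolve()
    mimetype, _encoding = mimetypes.guess_type(p.name)
    return {
        "filename": p.name,
        "url": p.as_uri(),
        # Assumption: fall back to a generic type when the extension is unknown.
        "mimetype": mimetype or "application/octet-stream",
    }


def index_file(path, rich_indexer):
    """Try the full indexer first; fall back to the basics if it fails.

    `rich_indexer` is a hypothetical callable standing in for the strigi
    indexers. It takes a path and returns a dict of metadata.
    """
    try:
        data = rich_indexer(path)
    except Exception:
        data = None
    return data if data else simple_index(path)
```

Either way the file ends up with some entry in the index, even if the richer metadata is missing.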
Another annoying bug that was not caught with 4.9.0 was a deadlock that rendered the file indexer service useless.
This only happened during the first run of Nepomuk, and led to some unfortunate publicity. In the end it was a very simple fix once a proper backtrace had been provided.
Another big change which probably has a large impact on indexing times is the blacklisting of certain file indexing (strigi) plugins. Currently by default, we only blacklist the SHA1 hash generator. With it blacklisted, we do not (hopefully) need to read the full contents of the file. This results in a noticeable performance improvement for large files.
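To see why this particular plugin is expensive, here is a small Python sketch. It is not strigi's plugin API: the plugin names and the registry are hypothetical. A hash has to consume every byte of the file, while a header-style analyzer can stop after a few bytes, so skipping the hash avoids a full read.

```python
import hashlib


def sha1_plugin(path):
    """Has to read every byte of the file to compute the digest."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return {"sha1": digest.hexdigest()}


def header_plugin(path):
    """Reads only the first few bytes, e.g. to recognize a file signature."""
    with open(path, "rb") as f:
        return {"magic": f.read(8).hex()}


# Hypothetical plugin registry and default blacklist, for illustration only.
PLUGINS = {"sha1": sha1_plugin, "header": header_plugin}
BLACKLIST = {"sha1"}


def analyze(path, blacklist=BLACKLIST):
    """Run every plugin that is not blacklisted and merge the results."""
    result = {}
    for name, plugin in PLUGINS.items():
        if name in blacklist:
            continue  # skipped, so the full file is never read for it
        result.update(plugin(path))
    return result
```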
One of my favorite features of Dolphin is its Places Panel with all of those defaults: Documents, Images, Audio and Video files. Unfortunately, those defaults were normal Nepomuk queries and not file queries. With 4.9.1, they have been changed to file queries, which lets them benefit from a number of optimizations on our end, including my minor optimizations in file queries.
Most applications which currently use Nepomuk utilize the Query Service in order to run queries. The Query Service can then provide updates on those queries if there is some change in the results. It used to do so by re-running ALL open queries whenever any data in Nepomuk changed. This was not a good approach, as it obviously does not scale.
With 4.9.1, we are now using some simple heuristics and the Resource Watcher to only re-run queries when affected data changes. So indexing a file will not result in the updating of all custom Nepomuk magic folders.
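The general shape of this can be sketched in Python. The post does not spell out the actual heuristics, so this sketch assumes the simplest possible one: each open query declares which resource types it cares about, and a change only re-runs the queries whose types overlap with what changed. All names are hypothetical.

```python
class QueryService:
    """Toy model of selectively re-running open queries.

    Assumption: queries are matched to changes by resource type only.
    """

    def __init__(self):
        self._queries = {}   # name -> (watched resource types, callable)
        self.rerun_log = []  # names of the queries that were re-run

    def register(self, name, watched_types, run):
        self._queries[name] = (frozenset(watched_types), run)

    def on_data_changed(self, changed_types):
        """Re-run only the queries that watch one of the changed types."""
        changed = set(changed_types)
        for name, (watched, run) in self._queries.items():
            if watched & changed:
                run()
                self.rerun_log.append(name)
```

With this shape, a change to a file resource would re-run a query watching files but leave, say, a contacts query untouched.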
Nepomuk Core Library
There have been a large number of improvements in the Nepomuk Testing Suite, and more tests have been added. Most of the tests pass; even with a few still failing, the suite serves as a good way of catching regressions.
Along with that there have been some fixes to removable media handling.
Recently I blogged about certain major query optimizations, which will only be public in 4.10. This major optimization was done by skipping a large part of the Soprano code. After the blog post, we identified some of the bottlenecks and optimized them. The most impressive result was a reduction of one query from 6 seconds to 1 second.
Again, this isn’t really a part of KDE 4.9.1, but it will be a part of Soprano 2.8.1. So when it releases, make sure you upgrade.
That’s all for this release :)