Update: This blog post is quite old and does not reflect the current state of Desktop Search in KDE. If you’re just looking to disable it, you can either run the $ balooctl disable command or go to System Settings -> Search -> File Search -> Disable.
KDE SC 4.13 is finally out. As you may have heard, this marks the release of Baloo. The bear is now out in the wild!
As many of you may have heard, the searching infrastructure has been through a big overhaul in KDE SC 4.13. Along with these changes, we will also be releasing a dedicated search plasmoid called “Milou”. Milou can also be used as an alternative to KRunner, and it provides application launching. The main difference is that it concentrates more on searching.
Previews
Milou also offers experimental support for previews of different result types.
As you might have read, the 4.11 release of Nepomuk is a lot faster at writes, and therefore indexing is going to be much faster. However, read performance was nearly the same.
Architecture
All applications communicate with Nepomuk via the Storage Service, which acts like a server. It is responsible for loading the ontologies, managing Virtuoso, and sending change notifications. Reads happen over a local socket, whereas writes are sent over D-Bus.
With the 4.10 release of Nepomuk, we decided to move away from Strigi and write our own indexers. We support most of the commonly used formats. The new code is also faster and, more importantly, more maintainable and easier to contribute to. So far this decision has worked pretty well for us. That being said, we still do not have enough indexers. Over the last week I managed to write simple ODF and Office2007 indexers, but we still need some more.
A couple of days ago I talked about how we have been clearing up some unwanted data in Nepomuk for the 4.11 release, mainly graphs. This change brings a performance increase of over 100% in many cases, and makes the codebase simpler and easier to maintain. Unfortunately, it comes at a cost. The graphs in the old database need to be merged into a small number of graphs. This is a very time-consuming process, because merging graphs is equivalent to slowly removing your entire database and reinserting it.
Since I became the maintainer of Nepomuk, we have put a strong emphasis on performance and stability. One of the core parts of Nepomuk is the set of high-level operations that are exposed to applications. These operations are typically used to insert and modify data in Nepomuk. Each of them is quite complex and involves a number of complicated queries. For this 4.11 release we wanted to simplify that code and make it more efficient.
When something goes wrong in Nepomuk, it’s easy for us Nepomuk developers to track it down, but for other developers and users it can be quite hard. Even something as simple as reporting which component is malfunctioning isn’t completely obvious. Over the last month, we have simplified some of the external details and added tools which will help us debug your problems so that we can fix things more easily. These will all be shipped with nepomuk-core in 4.
I’ve blogged about some of the more prominent changes in this new Nepomuk release. I thought it would be a good idea to document all the changes that Nepomuk has gone through, thanks to Blue Systems!
File Indexing
As the release announcement has been saying, the file indexer has undergone the most changes.
New Double Queue Architecture
We’ve split the work of the indexer into two parts: first, basic indexing, and second, full file indexing.
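To make the two-part split concrete, here is a minimal Python sketch of a double-queue indexer. It is not Nepomuk's code: the class, its method names, the toy "metadata" and the rule that basic indexing always runs before any full indexing are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TwoPhaseIndexer:
    """Toy model of a two-queue file indexer (every name here is hypothetical)."""

    basic_queue: deque = field(default_factory=deque)
    full_queue: deque = field(default_factory=deque)
    index: dict = field(default_factory=dict)

    def add(self, path: str, content: str) -> None:
        # New files always enter the cheap first-phase queue.
        self.basic_queue.append((path, content))

    def step(self) -> bool:
        """Process one item; pending basic indexing is assumed to take priority."""
        if self.basic_queue:
            path, content = self.basic_queue.popleft()
            # Phase 1: record only cheap metadata (here, file name and size).
            self.index[path] = {
                "name": path.rsplit("/", 1)[-1],
                "size": len(content),
            }
            # Hand the file over to the expensive second phase.
            self.full_queue.append((path, content))
            return True
        if self.full_queue:
            path, content = self.full_queue.popleft()
            # Phase 2: stand-in for full content extraction.
            self.index[path]["words"] = sorted(set(content.lower().split()))
            return True
        return False

    def run(self) -> None:
        # Drain both queues.
        while self.step():
            pass
```

With this arrangement, every file becomes findable by name as soon as the first queue has drained, while the slower content extraction catches up afterwards.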
Nepomuk has long required a convenient way of managing tags. I’ve previously tried this with a simple Tag Managing Application, but that wasn’t something that we wanted to ship. For the KDE Workspace 4.10 release, we are releasing a Nepomuk Tags kioslave.
Listing Tags
The kioslave provides a very convenient way of listing all the tags. You can even rename and delete tags, just like you would for any other folder.
Nepomuk has the unique problem of maintaining an RDF store. Unlike traditional SQL-based stores, RDF offers a very loose schema, which is a HUGE advantage. Unfortunately, none of the current RDF stores support any form of schema enforcement. It’s up to the client code to make sure that the data being pushed is valid. This has resulted in a number of problems, such as strings being stored where an integer should go.
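As a rough illustration of what client-side validation could look like, the Python sketch below checks each value against an expected type before it would be pushed to the store. The `SCHEMA` table, the `validate` helper and the two property names are assumptions chosen for the example; this is not Nepomuk's actual validation code.

```python
# Hypothetical mini-schema: property name -> expected Python type.
# The property names are illustrative only.
SCHEMA = {
    "nfo:fileSize": int,
    "nie:title": str,
}


def validate(triples):
    """Return a list of error strings for triples that violate SCHEMA."""
    errors = []
    for subject, prop, value in triples:
        expected = SCHEMA.get(prop)
        if expected is None:
            errors.append(f"{subject}: unknown property {prop}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"{subject}: {prop} expects {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors
```

Calling `validate([("file:/a", "nfo:fileSize", "1024")])` reports one error, because the size arrives as a string where an integer is expected, which is exactly the kind of mistake described above.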