Let's check the performance of some other plugins. The strigi png extractor was clearly faster than the KFileMetaInfo-based one, but how about the other file formats?
So I went looking for a file format that was abundantly available on my PC and supported by both strigi and kfile. There was not much choice, but in the end xpm turned out to be interesting. Again, xmlindexer was a lot faster: it took xmlindexer 13 seconds to extract the meta-data from all xpm files on my disk, while kfile needed almost 2 minutes (again averaged over 3 runs).
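For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a timing harness that averages wall-clock time over a few runs. It is not the script used for the numbers above. The command it times is a harmless placeholder, since the exact xmlindexer and kfile invocations are not given here; substitute your own.

```python
import subprocess
import sys
import time


def average_runtime(command, runs=3):
    """Return the mean wall-clock time in seconds over `runs` executions."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so that terminal I/O does not skew the timing.
        subprocess.run(command, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


# Placeholder command (an assumption): replace it with the real extractor call.
placeholder = [sys.executable, "-c", "pass"]
print(f"{average_runtime(placeholder):.3f} s on average")
```

Keep in mind that the disk cache is warm after the first run, so later runs can be noticeably faster than the first one.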
Now I'm really wondering why the difference is so big. So I turned to sysprof to figure out what was happening. It can show you what an app is spending its time on.
Looking at the results, it seems kfile spends only a very small percentage of its time actually reading the metainfo. Most of its time, around 50%, goes into figuring out what mimetype the file is, using kmimetype! So that's the slow part... In the xpm case, a bit more than 3% was spent reading meta-data, while with the png plugin it was less than 0.2%!
So the speed difference between kfile and strigi's xmlindexer wasn't really in reading the meta-data itself; it was mostly in figuring out the mimetype in KDE, which strigi does much more efficiently.
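To give an idea of how cheap mimetype detection can be, here is a small sketch that recognises png and xpm files from their first few bytes. This is only an illustration of the general technique of content sniffing. It is not a description of how strigi or kmimetype work internally, and the function names are made up for the example.

```python
# Well-known leading bytes of the two formats discussed in this post.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
XPM_MAGIC = b"/* XPM */"


def sniff_mimetype(header):
    """Guess a mimetype from the first bytes of a file, or return None."""
    if header.startswith(PNG_MAGIC):
        return "image/png"
    if header.startswith(XPM_MAGIC):
        return "image/x-xpixmap"
    return None


def sniff_file(path):
    """Read only a short header from `path` and sniff its mimetype."""
    with open(path, "rb") as handle:
        return sniff_mimetype(handle.read(16))
```

The point is that a single short read per file is enough for these two formats, so the detection cost stays small compared to reading the meta-data itself.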
What does this say about strigi performance? Well, we're back to square one. It's not possible to reliably correct for the slowness of kmimetype, so I can't figure out how much faster (or slower, but I think that's not very likely) the strigi meta-data extractors are compared to kfilemetainfo. At least we know that if someone were to write a strigi analyzer for it, figuring out the mimetype would be a lot faster in KDE 4...
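To show why the numbers don't allow a clean comparison, here is some back-of-the-envelope arithmetic on the xpm figures from this post. It is only a rough sketch: "almost 2 minutes" is rounded up to 120 seconds, and the percentages are the approximate shares read off the profile.

```python
# Figures quoted in this post; 120 s is a rounded-up assumption.
KFILE_TOTAL_S = 120.0
XMLINDEXER_TOTAL_S = 13.0
KFILE_MIMETYPE_SHARE = 0.50   # "around 50%"
KFILE_METADATA_SHARE = 0.03   # "a bit more than 3%"


def share_to_seconds(total_s, share):
    """Convert a share of the total runtime into seconds."""
    return total_s * share


kfile_mimetype_s = share_to_seconds(KFILE_TOTAL_S, KFILE_MIMETYPE_SHARE)
kfile_metadata_s = share_to_seconds(KFILE_TOTAL_S, KFILE_METADATA_SHARE)

print(f"kfile, mimetype detection: ~{kfile_mimetype_s:.0f} s")
print(f"kfile, reading meta-data:  ~{kfile_metadata_s:.1f} s")
print(f"xmlindexer, whole run:      {XMLINDEXER_TOTAL_S:.0f} s")
```

The few seconds kfile spends reading meta-data cover only the extraction step, while xmlindexer's 13 seconds cover the whole run, including file I/O and its own mimetype detection. The two numbers measure different things, so neither can be read as the speed of the extractors alone.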
And of course, strigi could use its database for meta-data - which might give a big speedup after all. And it would allow things like special directory listings based on some property of the data.
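As a rough idea of what such a property-based directory listing could look like, here is a sketch using an in-memory SQLite table. The schema, the property names and the sample rows are all invented for the example. This is not how strigi stores its index.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory index of (directory, name, property, value) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata "
        "(directory TEXT, name TEXT, property TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?, ?)", rows)
    return conn


def list_by_property(conn, directory, prop, value):
    """List the files in `directory` whose `prop` equals `value`."""
    cursor = conn.execute(
        "SELECT name FROM metadata "
        "WHERE directory = ? AND property = ? AND value = ? "
        "ORDER BY name",
        (directory, prop, value),
    )
    return [row[0] for row in cursor]


# Made-up sample data, purely for illustration.
sample_rows = [
    ("/icons", "open.xpm", "width", "16"),
    ("/icons", "save.xpm", "width", "32"),
    ("/icons", "close.xpm", "width", "16"),
    ("/photos", "cat.png", "width", "16"),
]
index = build_index(sample_rows)
print(list_by_property(index, "/icons", "width", "16"))
```

Because the listing is answered from the index, no file has to be opened or parsed again at listing time.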