Yesterday I posted a feature wish list: design elements I’d like to see in my media library.
How does calibre measure up?
- Large number of documents, in a performant way. Calibre can handle a large number of documents, but performance drops fast as the number of documents increases. I find that at around 20k-30k documents it’s pretty okay on an SSD, but a few hundred thousand on an HDD is a bad time.
- Arbitrary file types. It appears calibre can store pretty much any file with an extension (but not those without extensions). Ebooks get some extra love (reading metadata, capturing covers, etc.) but other file types can be stored. I suspect plugins can expand support for other file types, but I haven’t looked into it.
- Rich metadata. Calibre has a pretty good range of metadata built in for documents, and documents can have custom metadata fields added. I have the sense that adding many custom metadata fields adversely affects performance, and you can only add custom metadata to documents, but within its limits I think calibre does a pretty decent job with this.
- Batch processing. Calibre has a command line interface (the calibredb program) and a powerful plugin system. I can see things that could make it better, but calibre does support batch processing.
- Modular implementation with plugin support. Calibre has a powerful plugin system. The application overall is fairly modular, but there are some critical elements I feel would be very difficult to swap out… and they’re the ones that I’d like to work differently.
- Hierarchical entities. Calibre does not have hierarchical entities. Some metadata can be presented hierarchically in the tag browser, but I see no real support for hierarchical entities.
- Powerful deduplication. Calibre has a plugin that’s good at finding duplicate entries (by ID or by author and title) and finding duplicate files (size and checksum match, though it appears to recalculate each time it’s needed). I feel it needs human intervention to ensure duplicates are correctly remediated, but the facility is there.
- Separate database and storage locations. Calibre doesn’t really do this. There is a way to separate the storage if you have only one library, and I think in a UNIX-based system you might be able to use symbolic links to simulate it (but I can see how it could fail if the database file is replaced on disk rather than updated in place). It does not appear to do this in any systematic way… perhaps a plugin could, I haven’t looked into it.
- Metadata-agnostic repository. Calibre totally doesn’t do this: all document files are stored by (author, title). Every time either of these fields changes, the files are moved. This causes certain metadata changes to be slow and leads to very inefficient file storage.
- Multiple concurrent accesses. Calibre can sort of do this by using the web-based interface, but I find it runs slower than even the command line tool. You cannot run two instances of calibre at once, even on different libraries. You can’t even use calibredb to read a library while the GUI is open, even if the GUI is working on another library entirely.
- (nice to have but not needed) Metadata capture/scraping. Calibre can fetch metadata from many sources.
- (nice to have but not needed) Device interaction. Calibre recognizes and can update files on many ebook readers and tablets.
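To make the "metadata-agnostic repository" and checksum-based deduplication items a bit more concrete, here is a minimal sketch of a content-addressed store in Python. This is not how calibre works, and it is not a finished design; the function names (`file_digest`, `store`) and the two-character directory fan-out are assumptions made up for illustration. The idea is only that a file's location depends on its content, so changing the author or title never moves anything, and an identical file is detected on the way in.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def store(repo: Path, source: Path) -> tuple[str, bool]:
    """Copy `source` into `repo` under a path derived only from its content.

    Hypothetical layout: <repo>/<first two hex chars>/<full digest>.
    Returns (digest, added); `added` is False when the same content
    is already in the repository, which is the duplicate case.
    """
    digest = file_digest(source)
    target = repo / digest[:2] / digest
    if target.exists():
        return digest, False  # same bytes already stored, nothing to copy
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return digest, True
```

In a setup like this the digest would be recorded in the database next to the document's metadata (along with the original file name and extension), so the checksum is computed once at import time instead of every time a duplicate check runs.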
All in all, calibre doesn’t appear to fit my needs very well.
To be fair, much of my wish list is specifically to address the elements of calibre that I work around.
I don’t want to suggest that calibre isn’t a good program. I find it is good at doing what it is designed to do, and so far it has been able to do what I’ve asked of it. My challenge is that I want to ask it to do things it’s not really meant to do.