Title
Total Recoll (desktop search, writing a custom document input handler) | AMA
Plan
- get Recoll to index Zotero database in a nice way
- following on from 2025-05-05 11:29 Technosoap live-stream
- maybe Datasette?
Live notes (tidied)
install recoll
sudo apt install recoll
Recoll 1.36.1 + Xapian 1.4.22
potentially interesting files in home dir after first run of recoll
<~/.config/QtProject.conf>
<~/.config/Recoll.org/>
<~/.recoll/>
<~/.recoll/recoll.conf>
sqlitebrowser — https://sqlitebrowser.org/
list the files in an apt package: apt-file list foo-package
ref https://serverfault.com/questions/96964/list-of-files-installed-from-apt-package
a Recoll execm handler is the way to go, because one Zotero db corresponds to many documents
ref https://www.recoll.org/usermanual/webhelp/docs/RCL.PROGRAM.FILTERS.html
aka “persistent handler” ref https://www.recoll.org/usermanual/webhelp/docs/RCL.PROGRAM.FILTERS.ASSOCIATION.html
Most trivial execm document input handler: rcltxtlines.py
- only in source control:
https://framagit.org/medoc92/recoll/-/blob/master/src/filters/rcltxtlines.py?ref_type=heads
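To make the "one db, many documents" point concrete, here is a minimal sketch of the part of a handler that would turn database rows into subdocuments. It is not Recoll's API and not the real Zotero schema: the `notes` table, its columns, and the function names are all made up for illustration. In a real execm handler I'd expect each pair to be handed back one at a time, with the first element used as the ipath.

```python
import sqlite3


def iter_subdocs(conn):
    """Yield (ipath, text) pairs, one per row of the hypothetical `notes` table.

    The ipath is the row's primary key as a string, so a single
    subdocument can be looked up again later.
    """
    rows = conn.execute("SELECT id, title, body FROM notes ORDER BY id")
    for row_id, title, body in rows:
        yield str(row_id), f"{title}\n\n{body}"


def get_subdoc(conn, ipath):
    """Fetch one subdocument by ipath, or None if the row no longer exists."""
    row = conn.execute(
        "SELECT title, body FROM notes WHERE id = ?", (int(ipath),)
    ).fetchone()
    return None if row is None else f"{row[0]}\n\n{row[1]}"


def make_demo_db():
    """Build a throwaway in-memory database standing in for zotero.sqlite."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO notes (id, title, body) VALUES (?, ?, ?)",
        [(1, "First item", "alpha"), (2, "Second item", "beta")],
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    for ipath, text in iter_subdocs(demo):
        print(ipath, repr(text))
```

Keeping the ipath equal to a stable key means a single entry can be re-extracted without walking the whole database again.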
to revisit
more specific mime handler
rclzotero.py should be for zotero.sqlite only, not all sqlite databases.
- mimemap supports restricting mappings to directories and supports “~” too, e.g.:
[/usr/share/man]
.0p = text/x-man
[~/.kde4/share/apps/okular/docdata]
.xml = application/x-okular-notes
so could do that for ~/Zotero
HTML output, fields
https://www.recoll.org/usermanual/webhelp/docs/RCL.PROGRAM.FIELDS.html
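As I read the linked page, a handler can emit HTML and carry field values as meta elements in the head. A minimal sketch of building such a document follows; `render_html` is a hypothetical helper name, and whether a given field name (here `author`) is picked up depends on the Recoll fields configuration, which I have not checked.

```python
from html import escape


def render_html(title, body, fields):
    """Return a small HTML document with one <meta> element per field.

    `fields` maps field names to string values; everything is escaped
    so that quotes or angle brackets in the data cannot break the markup.
    """
    meta = "\n".join(
        f'<meta name="{escape(name, quote=True)}" '
        f'content="{escape(value, quote=True)}">'
        for name, value in fields.items()
    )
    return (
        "<html><head>\n"
        f"<title>{escape(title)}</title>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        f"{meta}\n"
        "</head><body>\n"
        f"<pre>{escape(body)}</pre>\n"
        "</body></html>\n"
    )


if __name__ == "__main__":
    print(render_html("Some paper", "abstract text", {"author": "A. Author"}))
```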
a recoll web browser extension for indexing pages you’ve visited
https://www.recoll.org/faqsandhowtos/IndexWebHistory.html
I think Recoll “external indexers” offer more power but leave it for now
e.g. you have control over subdocument reindexing/purging
https://www.recoll.org/usermanual/webhelp/docs/RCL.PROGRAM.PYTHONAPI.UPDATE.EXTINDEXER.html
uncertain about where my execm python script should live
<~/tso/bu/devel/me/recoll-zotero-db-handler/>
I’ll just symlink in the dependencies for now.
When I get it ready for upstreaming, it’ll live alongside all the others
- these get installed to </usr/share/recoll/filters/>
debugging tip: use recollindex to index just one file
https://www.recoll.org/faqsandhowtos/WhyIsMyFileNotIndexed.html
e.g.
recollindex -e -i /home/tsoap/tso/bu/devel/third-party/jackyzha0--quartz/content-org-roam/scratch/zotero.sqlite 2>&1
- -e purges the file from index
- -i indexes it
increase verbosity of recollindex by setting loglevel = 5 in recoll.conf
https://www.recoll.org/faqsandhowtos/logfilesetup.html
.sqlite is in noContentSuffixes in the default recoll.conf
https://www.recoll.org/usermanual/webhelp/docs/RCL.INSTALL.CONFIG.RECOLLCONF.WHATDOCS.html
That’s why I was getting this in the logs:
:4:index/mimetype.cpp:181::mimetype: fn [[...]/zotero.sqlite] in stopsuffixes
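A toy illustration of why that log line shows up: a file whose name ends with one of the listed suffixes is skipped for content indexing. This is only my mental model, not Recoll's code. `matching_stop_suffix` is a hypothetical name, and every suffix in the sample list apart from `.sqlite` is invented for the example.

```python
def matching_stop_suffix(filename, suffixes):
    """Return the first suffix in `suffixes` that `filename` ends with, else None."""
    for suffix in suffixes:
        if filename.endswith(suffix):
            return suffix
    return None


# Hypothetical excerpt of a suffix list; only ".sqlite" comes from the notes.
SAMPLE_SUFFIXES = [".sqlite", ".example-a", ".example-b"]


if __name__ == "__main__":
    for name in ("zotero.sqlite", "notes.txt"):
        print(name, "->", matching_stop_suffix(name, SAMPLE_SUFFIXES))
```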