Title

Total Recoll (desktop search, writing a custom document input handler) | AMA

Plan

Live notes (tidied)

install recoll

sudo apt install recoll

Recoll 1.36.1 + Xapian 1.4.22

potentially interesting files in home dir after first run of recoll

<~/.config/QtProject.conf> <~/.config/Recoll.org/> <~/.recoll/> <~/.recoll/recoll.conf>

sqlitebrowser — https://sqlitebrowser.org/

files in an apt package: apt-file list foo-package

ref https://serverfault.com/questions/96964/list-of-files-installed-from-apt-package

a recoll execm handler is the way to go, because one zotero db corresponds to many documents

ref https://www.recoll.org/usermanual/webhelp/docs/RCL.PROGRAM.FILTERS.html

aka “persistent handler” ref https://www.recoll.org/usermanual/webhelp/docs/RCL.PROGRAM.FILTERS.ASSOCIATION.html
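
The one-db-to-many-documents idea can be sketched without Recoll at all: open the database once, then hand out one subdocument per row, each identified by its own ipath. Everything below is an assumption for illustration. The `items` table and its `title` and `note` columns are made up (not Zotero's real schema), the class name is hypothetical, and nothing here speaks the actual execm protocol or imports rclexecm. The open / get-next / get-by-ipath shape only loosely echoes what rcltxtlines.py does.

```python
import sqlite3


class OneDbManyDocs:
    """Toy stand-in for a persistent handler: one SQLite file, one document per row."""

    def __init__(self, connection):
        self.conn = connection
        self.ids = []
        self.pos = 0

    def open(self):
        # Collect the row ids once; each id doubles as the subdocument's ipath.
        rows = self.conn.execute("SELECT id FROM items ORDER BY id").fetchall()
        self.ids = [str(row[0]) for row in rows]
        self.pos = 0
        return True

    def get_ipath(self, ipath):
        # Fetch a single subdocument by its ipath.
        row = self.conn.execute(
            "SELECT title, note FROM items WHERE id = ?", (int(ipath),)
        ).fetchone()
        if row is None:
            return None
        title, note = row
        return {"ipath": ipath, "title": title, "text": note}

    def get_next(self):
        # Return the next subdocument, or None once every row has been handed out.
        if self.pos >= len(self.ids):
            return None
        doc = self.get_ipath(self.ids[self.pos])
        self.pos += 1
        return doc


def demo():
    # Build a throwaway in-memory database with the hypothetical schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, note TEXT)")
    conn.executemany(
        "INSERT INTO items (title, note) VALUES (?, ?)",
        [("First paper", "notes on the first paper"),
         ("Second paper", "notes on the second paper")],
    )
    handler = OneDbManyDocs(conn)
    handler.open()
    docs = []
    while True:
        doc = handler.get_next()
        if doc is None:
            break
        docs.append(doc)
    return docs
```

Calling `demo()` walks the two rows and returns two dicts, one per subdocument. A real handler would do the same walk but return each document through rclexecm instead of collecting a list.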

Most trivial execm document input handler: rcltxtlines.py — only in source control: https://framagit.org/medoc92/recoll/-/blob/master/src/filters/rcltxtlines.py?ref_type=heads

to revisit

more specific mime handler

rclzotero.py should be for zotero.sqlite only, not all sqlite databases.

  • mimemap supports restricting mappings to directories and supports “~” too

    e.g.:

    [/usr/share/man]
    .0p = text/x-man
     
    [~/.kde4/share/apps/okular/docdata]
    .xml = application/x-okular-notes

    so could do that for ~/Zotero
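
A sketch of what that might look like, untested and resting on assumptions: the MIME type name `application/x-zotero-sqlite` is invented for this example, and the per-user files are assumed to live in ~/.recoll/ next to recoll.conf. The `[index]` line follows the pattern the stock mimeconf uses for other execm handlers.

```ini
# ~/.recoll/mimemap (per-user additions; MIME type name is made up)
[~/Zotero]
.sqlite = application/x-zotero-sqlite

# ~/.recoll/mimeconf (associate that MIME type with the handler script)
[index]
application/x-zotero-sqlite = execm rclzotero.py
```

This would still catch any other .sqlite file that happens to sit in ~/Zotero, so the handler itself may need to check the file name as well.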

HTML output, fields

https://www.recoll.org/usermanual/webhelp/docs/RCL.PROGRAM.FIELDS.html
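
As the linked page describes it, a handler returns HTML and carries metadata as `<meta name=... content=...>` tags in the header. A minimal sketch of building such a document follows. The function name and the choice of a `<pre>` body are assumptions for illustration, and any field name Recoll does not already know would presumably also need declaring in the fields configuration, which is not shown here.

```python
import html


def render_html(title, body, fields):
    """Build a minimal HTML document with extra fields carried as <meta> tags."""
    # One <meta> tag per field, with names and values escaped for attribute context.
    meta = "\n".join(
        '<meta name="{}" content="{}">'.format(
            html.escape(name, quote=True), html.escape(value, quote=True)
        )
        for name, value in sorted(fields.items())
    )
    return (
        "<html><head>\n"
        "<title>{}</title>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n'
        "{}\n"
        "</head><body>\n"
        "<pre>{}</pre>\n"
        "</body></html>\n"
    ).format(html.escape(title), meta, html.escape(body))
```

For example, `render_html("Some title", "abstract text", {"author": "A. Author"})` gives a document whose header contains `<meta name="author" content="A. Author">`.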

a recoll web browser extension for indexing pages you’ve visited

https://www.recoll.org/faqsandhowtos/IndexWebHistory.html

I think Recoll “external indexers” offer more power, but I'll leave them for now

e.g. you have control over subdocument reindexing/purging

https://www.recoll.org/usermanual/webhelp/docs/RCL.PROGRAM.PYTHONAPI.UPDATE.EXTINDEXER.html

uncertain about where my execm python script should live

<~/tso/bu/devel/me/recoll-zotero-db-handler/>

I’ll just symlink in the dependencies for now.

When I get it ready for upstreaming, it’ll live alongside all the others

  • these get installed to </usr/share/recoll/filters/>

debugging tip: use recollindex to index just one file

https://www.recoll.org/faqsandhowtos/WhyIsMyFileNotIndexed.html

e.g.

recollindex -e -i /home/tsoap/tso/bu/devel/third-party/jackyzha0--quartz/content-org-roam/scratch/zotero.sqlite 2>&1
  • -e purges the file from index
  • -i indexes it

increase verbosity of recollindex by setting loglevel = 5 in recoll.conf

https://www.recoll.org/faqsandhowtos/logfilesetup.html

.sqlite is in noContentSuffixes in the default recoll.conf

https://www.recoll.org/usermanual/webhelp/docs/RCL.INSTALL.CONFIG.RECOLLCONF.WHATDOCS.html

That’s why I was getting this in the logs:

:4:index/mimetype.cpp:181::mimetype: fn [[...]/zotero.sqlite] in stopsuffixes
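
One possible way around this, untested here: the same manual page also documents a `noContentSuffixes-` variable for removing entries from the default list. Assuming it behaves as described there, a fragment like the following in the per-user recoll.conf should let .sqlite files reach MIME type identification again.

```ini
# ~/.recoll/recoll.conf
# Remove .sqlite from the default no-content suffix list (assumption: "-" variant as documented)
noContentSuffixes- = .sqlite
```

Since this re-enables content handling for every .sqlite file in the indexed tree, it probably makes most sense combined with a directory-restricted MIME mapping.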

asides

Scotrail announcements in Datasette: https://scotrail.datasette.io/