TL;WR: Exchange IMAP Searches are kind of broken. Exchange IMAP doesn’t support searching for categories, but you can work around that by searching for your category tag in the body. Make sure you only use lowercase letters in your search.
I was setting up paperless-ngx as an experiment for personal document management and wanted to be able to ingest emails and attachments from an O365 account. I wanted to tell Paperless to ingest an email by setting a category flag (e.g. Paperless-Inbox) so that I could archive emails without moving them out of their folders. Unfortunately, neither Exchange On-Prem nor Exchange Online supports the KEYWORD search (see Exchange IMAP Standards V0039).
Note: The email library used by paperless-ngx does not support Modern Authentication, so I built a container to host the headless version of email-oauth2-proxy.
While looking for workarounds I found that Exchange does provide the email categories in a Keywords: header; however, despite Exchange including Keywords: My-Category in the FETCH n (BODY[HEADER]) response, a search using SEARCH HEADER "Keywords" "" returns no results. Sad trombone.
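To see the categories that Exchange puts in the header, something like the following sketch with Python's standard imaplib and email modules might help. The function names are hypothetical, the connection is assumed to be already logged in with a mailbox selected, and splitting on commas assumes the header follows the usual comma-separated Keywords: format.

```python
import email
import imaplib


def extract_keywords(raw_headers: bytes) -> list[str]:
    """Return the comma-separated values of any Keywords: header lines."""
    msg = email.message_from_bytes(raw_headers)
    values = msg.get_all("Keywords", [])
    # A message may carry several categories in one header line.
    return [k.strip() for v in values for k in str(v).split(",") if k.strip()]


def fetch_keywords(conn: imaplib.IMAP4, msg_num: str) -> list[str]:
    """Fetch only the header block of one message and read its categories."""
    # BODY.PEEK avoids marking the message as read as a side effect.
    typ, data = conn.fetch(msg_num, "(BODY.PEEK[HEADER])")
    if typ != "OK" or not data or not isinstance(data[0], tuple):
        return []
    return extract_keywords(data[0][1])
```

This only reads the header on the client side; it does not change the fact that the server-side HEADER search comes back empty.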
So if you want to find your categories in an IMAP search, your categories must not contain spaces and you must use a TEXT search.
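As a rough sketch of that workaround using Python's imaplib: the helper names below are made up for illustration, and the connection is assumed to be authenticated with a mailbox already selected. The category name is lowercased and rejected if it contains whitespace, since those are the constraints I ran into.

```python
import imaplib


def category_search_criteria(category: str) -> tuple[str, str]:
    """Build (key, value) criteria for a TEXT search on a category name."""
    if not category or any(ch.isspace() for ch in category):
        raise ValueError("category must be non-empty and contain no spaces")
    if '"' in category or "\\" in category:
        raise ValueError("category must not contain quotes or backslashes")
    # Lowercase the phrase: uppercase letters made the search return nothing.
    return "TEXT", f'"{category.lower()}"'


def find_by_category(conn: imaplib.IMAP4, category: str) -> list[bytes]:
    """Return message sequence numbers in the selected mailbox that match."""
    key, value = category_search_criteria(category)
    typ, data = conn.search(None, key, value)
    if typ != "OK" or not data or data[0] is None:
        return []
    return data[0].split()
```

Because TEXT also matches the message body, an email that merely mentions the category name would match too, so a distinctive category name is a good idea.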
Undocumented Exchange IMAP rules
- Search phrases must be lowercase; any uppercase letter in the phrase makes the search return no results.
- No spaces are permitted in a search phrase (i.e. SEARCH BODY "first" works, SEARCH BODY "second" works, but SEARCH BODY "first second" gives no results).
- Message category flags do appear in a TEXT search of a message! This still has the same no-spaces and all-lowercase limitations as a regular BODY search.
- String Literal searches behave the same as quoted string searches.
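For anyone unfamiliar with the two string forms mentioned in the last rule, here is a small sketch of how the same phrase looks on the wire in each. The function names are hypothetical and this only builds the bytes; it does not talk to a server.

```python
def as_quoted(phrase: str) -> bytes:
    """Encode a phrase as an IMAP quoted string, escaping \\ and "."""
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return b'"' + escaped.encode("ascii") + b'"'


def as_literal(phrase: str) -> bytes:
    """Encode a phrase as an IMAP literal: {octet-count}, CRLF, then the data."""
    data = phrase.encode("utf-8")
    # With a synchronizing literal the client waits for the server's "+"
    # continuation after the CRLF before sending the data octets.
    return b"{" + str(len(data)).encode("ascii") + b"}\r\n" + data
```

So SEARCH BODY "first" and SEARCH BODY {5} followed by first on the next line are two spellings of the same query, and in my testing Exchange treated them identically.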
| IMAP Command | Finds Category Flags? | Finds Body Text? |
|---|---|---|
| BODY (Quoted String) | No | Yes (lowercase only, no spaces) |
| BODY (String Literal) | No | Yes (lowercase only, no spaces) |
| TEXT (Quoted String) | Yes | Yes (lowercase only, no spaces) |
| TEXT (String Literal) | Yes | Yes (lowercase only, no spaces) |