Filtering reply e-mails#

The goal is the automatic cleanup of incoming e-mails that have been recognized as a reply to an item-related e-mail previously sent by Allegra. In the process, quoted text, e-mail signatures, earlier conversation history and so on are removed.

This can be configured via the file EmailReplyPatterns.properties.

Configuration file#

File:

$ALLEGRA_HOME/EmailReplyPatterns.properties

This configuration file is loaded from the $ALLEGRA_HOME directory and can be customized.

Changes to the configuration file are automatically reloaded at runtime. A restart of the application is not required.

Main configuration options#

Enabling/disabling the reply filter#

# Controls whether to extract the reply part or keep the
# entire email as it is
extractReplayPart=true

true (default)

Removes quoted content and keeps only the reply content.

false

Keeps the entire e-mail content including all quoted parts, but still removes the content matched by the general Allegra selectors described below.

Use case:

Set to false if the e-mail history in item comments should be preserved.

General Allegra selectors#

Various item-specific content can optionally be inserted into e-mails sent from Allegra, e.g.:

  • item details

  • conversation history

  • item link

The flag extractReplayPart mentioned above does not affect the removal of this content. It is removed regardless, provided the following selectors are not explicitly commented out.

Format:

<client>.<itemSpecificContent>.selector.<index>=<CSS selector>
<client>.<itemSpecificContent>.selector.siblings.<index>=<CSS selector>

<itemSpecificContent> can be:

  • itemDetail

  • itemConversationHistory

  • itemLink

# Common Allegra element selectors.
generic.itemDetail.selector.0=div[class*=itemDetail]
generic.itemDetail.selector.1=div[id*=itemDetail]

# Workaround for Outlook
outlook.itemDetail.selector.siblings.0=hr[style*=background-color: rgb(208, 208, 208)]

# Workarounds for Gmail
gmail.itemDetail.selector.0=div:has(> div:has(> table[width=100%][cellpadding=0][cellspacing=0]:has(th)))
gmail.itemDetail.selector.1=div:has(> table[width=100%][cellpadding=0][cellspacing=0]:has(th))

generic.itemConversationHistory.selector.0=div[class*=itemConversationHistory]
generic.itemConversationHistory.selector.1=div[id*=itemConversationHistory]
outlook.itemConversationHistory.selector.siblings.0=hr[style*=background-color: rgb(208, 208, 208)]
gmail.itemConversationHistory.selector.0=div:has(> div:has(> p:contains(conversationHistoryText)))
gmail.itemConversationHistory.selector.1=div:has(> p:contains(conversationHistoryText))

generic.itemLink.selector.0=div[class*=itemLink]
generic.itemLink.selector.1=div[id*=itemLink]
outlook.itemLink.selector.siblings.0=hr[style*=background-color: rgb(208, 208, 208)]
gmail.itemLink.selector.0=div:has(> a[href*=printItem.action?key=])

Client-specific detection#

# Controls whether to detect the email client
# before applying patterns

extractReplayPartClientSpecific=false

true

Detects the e-mail client (Outlook, Gmail, etc.) and applies only the patterns relevant for it.

false (default)

Applies all patterns one after another to achieve better compatibility.

Set to true for better performance when mainly a single e-mail client is used within the organization.

Patterns with the generic prefix are always applied regardless of this setting.
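The effect of the flag on the set of applied pattern groups can be sketched as follows. This is a minimal illustration in Python, not Allegra's actual implementation: the function name prefixes_to_apply and the list KNOWN_CLIENTS are hypothetical, and the fallback to all client patterns when no client is detected is an assumption, since that case is not described here.

```python
# Client prefixes named in this document; "generic" is handled separately.
KNOWN_CLIENTS = ["outlook", "gmail", "thunderbird", "applemail", "yahoomail", "protonmail"]


def prefixes_to_apply(client_specific, detected_client=None):
    """Return the pattern prefixes to evaluate for one incoming e-mail."""
    if client_specific and detected_client in KNOWN_CLIENTS:
        # extractReplayPartClientSpecific=true: only the detected client's patterns.
        clients = [detected_client]
    else:
        # Flag is false (default), or no client was detected (assumed fallback).
        clients = list(KNOWN_CLIENTS)
    # Patterns with the generic prefix are applied regardless of the flag.
    return clients + ["generic"]
```

With the flag set to true and Gmail detected, only the gmail. and generic. patterns remain, which is where the performance gain comes from.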

Supported e-mail clients#

The system includes preconfigured patterns for:

E-mail client  Prefix        Description
-------------  ------------  --------------------------
Outlook        outlook.      Microsoft Outlook
Gmail          gmail.        Google Gmail web interface
Thunderbird    thunderbird.  Mozilla Thunderbird
Apple Mail     applemail.    macOS/iOS Mail application
Yahoo Mail     yahoomail.    Yahoo webmail
ProtonMail     protonmail.   ProtonMail
Generic        generic.      Patterns for all e-mails

Pattern types and configuration#

Pattern type 1: E-mail client detection#

Identifies the e-mail client that sent the message.

Format:

<client>.detection.<index>=<CSS selector>

# Detection patterns
outlook.detection.0=div[id^='divRplyFwdMsg']
outlook.detection.1=div.ms-outlook-mobile-reference-message
outlook.detection.2=div[id='appendonsend']
outlook.detection.3=div.WordSection1

gmail.detection.0=div.gmail_quote
gmail.detection.1=div.gmail_attr

thunderbird.detection.0=div.moz-cite-prefix
thunderbird.detection.1=div.moz-forward-container
thunderbird.detection.2=blockquote[type='cite']

applemail.detection.0=blockquote[type='cite']

yahoomail.detection.0=div[class*=yahoo_quoted]

protonmail.detection.0=div.protonmail_quote

The first matching selector determines the client type.

Useful with:

extractReplayPartClientSpecific=true

Pattern type 2: Simple removal selectors#

Removes every element that matches the selector. This is useful when a single quote element encloses the original part of the e-mail.

gmail.selector.0=div.gmail_quote
gmail.selector.1=div.gmail_attr

thunderbird.selector.0=blockquote[type='cite']

Format:

<client>.selector.<index>=<CSS selector>

Pattern type 3: Removal with siblings#

Removes the matched element AND all following elements (the entire quoted region).

This is useful when the start of the original part is marked by an element and all following elements should be removed as well.

outlook.selector.siblings.0=hr[style*='border:none'][style*='border-top']

yahoomail.selector.siblings.0=div[style*='border-top'][style*='dotted']

Format:

<client>.selector.siblings.<index>=<CSS selector>
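The removal rule itself (drop the matched element and everything after it on the same level) can be sketched with Python's standard library. This is only an illustration of the idea, not Allegra's Jsoup-based code: the helper names are hypothetical, the predicate is a hand-written stand-in for the Outlook selector above, and only direct children of one parent are considered.

```python
import xml.etree.ElementTree as ET


def remove_with_following_siblings(parent, is_marker):
    """Remove the first child accepted by is_marker and every sibling after it."""
    children = list(parent)
    for position, child in enumerate(children):
        if is_marker(child):
            for element in children[position:]:
                parent.remove(element)
            return True
    return False


def is_outlook_separator(element):
    # Stand-in for hr[style*='border:none'][style*='border-top']
    style = element.get("style", "")
    return element.tag == "hr" and "border:none" in style and "border-top" in style


body = ET.fromstring(
    "<body>"
    "<p>Thanks, this is fixed now.</p>"
    "<hr style='border:none;border-top:solid #E1E1E1 1.0pt'/>"
    "<p>From: Allegra</p>"
    "<p>Original message text</p>"
    "</body>"
)
remove_with_following_siblings(body, is_outlook_separator)
remaining = [child.text for child in body]
```

After the call only the first paragraph is left. A signature placed below the separator would be removed as well.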

Pattern type 4: Conditional removal with siblings#

Works like pattern type 3, but the matched element and all following elements are removed only if the element also contains a particular text or matches a regular expression.

gmail.selector.siblings.conditional.0=div.gmail_extra
gmail.selector.siblings.conditional.0.regex=.*On.*\\d{4}.*wrote:.*

applemail.selector.siblings.conditional.0=div, p, br
applemail.selector.siblings.conditional.0.text.contains=Begin forwarded message:

Format:

<client>.selector.siblings.conditional.<index>=<CSS selector>
<client>.selector.siblings.conditional.<index>.regex=<regex pattern>

or:

<client>.selector.siblings.conditional.<index>.text.contains=<exact text>
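The additional text condition can be tried out in isolation before it goes into the configuration file. The sketch below is in Python and is not Allegra's code: the function name condition_holds is hypothetical, and matching the regular expression against the whole element text is an assumption suggested by the leading and trailing ".*" in the Gmail example. Python and Java regular expressions behave the same for patterns this simple.

```python
import re


def condition_holds(element_text, regex=None, text_contains=None):
    """Check the optional text condition of a conditional pattern."""
    if regex is not None:
        # Whole-text match (assumption, see the lead-in).
        return re.fullmatch(regex, element_text) is not None
    if text_contains is not None:
        return text_contains in element_text
    # No condition configured: nothing to check in this sketch.
    return False


# In the properties file the regex is written with doubled backslashes (\\d);
# the raw string below is the pattern as it looks after the file is loaded.
GMAIL_REPLY_REGEX = r".*On.*\d{4}.*wrote:.*"
```

For example, condition_holds("On Tue, 3 Jun 2025 at 10:12, Jane Doe wrote:", regex=GMAIL_REPLY_REGEX) is true, while a sentence that merely starts with "On" but has no four-digit year and no "wrote:" is not matched.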

Pattern type 5: Reply header patterns#

Text-based patterns for recognizing reply headers in various languages. This processing is performed independently of the e-mail client.

# English
replyheader.selector.siblings.regex.0=From:.*Sent:.*To:.*Subject:
replyheader.selector.siblings.regex.1=On .* wrote:

# German
replyheader.selector.siblings.regex.3=Von:.*Gesendet:.*An:.*Betreff:
replyheader.selector.siblings.regex.4=Am .* schrieb .*:

Format:

replyheader.selector.siblings.regex.<index>=<regex pattern>

Elements to search:

replyheader.element.selector=div, p, pre, span
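Before adding a pattern for another language, it helps to test it against sample header lines. The following Python sketch uses the four example patterns from this section. It is not Allegra's implementation: the function name is_reply_header is hypothetical, and searching anywhere in the element text (not requiring a full match) is an assumption. The patterns are matched case-insensitively and compiled once, in line with the technical notes of this page.

```python
import re

# Example patterns from this section, compiled once and case-insensitive.
REPLY_HEADER_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"From:.*Sent:.*To:.*Subject:",
        r"On .* wrote:",
        r"Von:.*Gesendet:.*An:.*Betreff:",
        r"Am .* schrieb .*:",
    )
]


def is_reply_header(text):
    """Return True if any reply header pattern is found in the text."""
    return any(pattern.search(text) for pattern in REPLY_HEADER_PATTERNS)
```

A line such as "Am 03.06.2025 um 10:12 schrieb Max Mustermann:" is recognized as a reply header, while ordinary reply text is not.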

Pattern type 6: Allegra-specific reply marker#

Use this approach when other client-specific or generic selectors do not clean up the e-mails as expected.

Encapsulating content from Allegra e-mails#

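Make sure that the e-mail content sent from Allegra is wrapped in a special marker element. In the snippet below the flag item.emailSend.wrapEmailContent is set to false; set it to true to enable wrapping.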

# Wrap outgoing emails with a marker element
item.emailSend.wrapEmailContent=false

# The start of the wrapper element
item.emailSend.wrapStartTag=<div class="answerDelimiter" id="answerDelimiter"><p style="color:#b5b5b5">${delimiterText}</p>

# The end of the wrapper element
item.emailSend.wrapEndTag=</div>

Removing the wrapper element from received e-mails#

If the marker element is present in the reply e-mail, it can be removed with a client-specific selector.

outlook.delimiterSelector=div[id*=answerDelimiter]:has(p:contains(delimiterText))
gmail.delimiterSelector=div:containsOwn(delimiterText)

Format:

<client>.delimiterSelector=<CSS selector>

Note:

The delimiterText placeholder is replaced at runtime by the localized delimiter text.

CSS selector syntax#

The configuration uses Jsoup CSS selectors.

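Selector               Example            Meaning
---------------------  -----------------  ----------------------------------------------
Tag                    div                All div elements
Class                  .gmail_quote       Elements with the class gmail_quote
ID                     #appendonsend      Element with the ID appendonsend
Attribute exact        [type='cite']      Attribute value equals cite
Attribute contains     [style*='border']  Attribute value contains the substring border
Attribute starts with  [id^='divRply']    Attribute value starts with the prefix divRply
Multiple selectors     div, p, br         Elements matching any of the listed selectors
Has child              :has(p)            Elements that contain a matching p element
Contains text          :contains(text)    Elements whose text contains the given text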

Testing and troubleshooting#

Reloading configuration#

After changes to EmailReplyPatterns.properties, the changes are detected automatically. A restart is not required.

Common problems#

Quoted text is not removed#

  • analyze the HTML source of the e-mail

  • identify unique CSS selectors or text patterns

  • add matching patterns to the configuration file

Too much content is removed#

  • phrase selectors more specifically

  • remove patterns that are too general

Patterns no longer work after changes#

  • check the sequential numbering (0,1,2,…)

  • check for syntax errors

  • use double backslashes in regex: \\d instead of \d
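A gap in the numbering can be spotted with a small script. The following is a minimal sketch in Python and not part of Allegra: the helper name find_numbering_gaps is hypothetical, and it assumes one key=value pair per line and that indexes have to run from 0 upwards without gaps.

```python
import re
from collections import defaultdict

# Keys that end in a numeric index, e.g. "outlook.detection.2".
INDEXED_KEY = re.compile(r"^(?P<base>.+?)\.(?P<index>\d+)$")


def find_numbering_gaps(properties_text):
    """Map each indexed key prefix to the list of missing index numbers."""
    indexes = defaultdict(set)
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments and lines without a key=value pair.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        match = INDEXED_KEY.match(key)
        if match:
            indexes[match.group("base")].add(int(match.group("index")))
    gaps = {}
    for base, found in indexes.items():
        missing = sorted(set(range(max(found) + 1)) - found)
        if missing:
            gaps[base] = missing
    return gaps
```

Called with the contents of EmailReplyPatterns.properties, it returns an empty dictionary when every numbered group is contiguous. Keys that carry a suffix after the index, such as .regex or .text.contains, are ignored by this sketch.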

Selectors with siblings#

If a selector with siblings matches, everything below the matched element is removed, including signatures, disclaimers, privacy notices and virus notices.

Best practices#

  • test with real e-mails

  • define client-specific patterns first

  • always use sequential numbering

  • document changes

  • create a backup of the configuration before making changes

Technical notes#

Processing order:

Client-specific patterns -> Generic patterns -> Reply headers

Caching

Regex patterns are compiled and cached for better performance.

Case sensitivity

Regex patterns for reply headers are case-insensitive by default.