Filtering reply e-mails#
The goal is the automatic cleanup of incoming e-mails that have been recognized as a reply to an item-related e-mail previously sent by Allegra. In the process, quoted text, e-mail signatures, earlier conversation history and so on are removed.
This can be configured via the file EmailReplyPatterns.properties.
Configuration file#
File:
$ALLEGRA_HOME/EmailReplyPatterns.properties
This configuration file is loaded from the $ALLEGRA_HOME directory and can be customized.
Changes to the configuration file are automatically reloaded at runtime. A restart of the application is not required.
Main configuration options#
Enabling/disabling the reply filter#
# Controls whether to extract the reply part or keep the
# entire email as it is
extractReplayPart=true
true (default): Removes quoted content and keeps only the reply content.
false: Keeps the entire e-mail content, including all quoted parts, but still removes the content matched by the general Allegra selectors.
Use case: Set to false if the e-mail history in item comments should be preserved.
General Allegra selectors#
Various item-specific content can optionally be inserted into e-mails sent from Allegra, e.g.:
item details
conversation history
item link
The extractReplayPart flag described above does not affect the removal of this content. It is removed regardless, unless the following selectors are explicitly commented out.
Format:
<client>.<itemSpecificContent>.selector.<index>=<CSS selector>
<client>.<itemSpecificContent>.selector.siblings.<index>=<CSS selector>
<itemSpecificContent> can be itemDetail, itemConversationHistory or itemLink.
# Common Allegra element selectors
generic.itemDetail.selector.0=div[class*=itemDetail]
generic.itemDetail.selector.1=div[id*=itemDetail]
# Workaround for Outlook
outlook.itemDetail.selector.siblings.0=hr[style*=background-color: rgb(208, 208, 208)]
# Workarounds for Gmail
gmail.itemDetail.selector.0=div:has(> div:has(> table[width=100%][cellpadding=0][cellspacing=0]:has(th)))
gmail.itemDetail.selector.1=div:has(> table[width=100%][cellpadding=0][cellspacing=0]:has(th))
generic.itemConversationHistory.selector.0=div[class*=itemConversationHistory]
generic.itemConversationHistory.selector.1=div[id*=itemConversationHistory]
outlook.itemConversationHistory.selector.siblings.0=hr[style*=background-color: rgb(208, 208, 208)]
gmail.itemConversationHistory.selector.0=div:has(> div:has(> p:contains(conversationHistoryText)))
gmail.itemConversationHistory.selector.1=div:has(> p:contains(conversationHistoryText))
generic.itemLink.selector.0=div[class*=itemLink]
generic.itemLink.selector.1=div[id*=itemLink]
outlook.itemLink.selector.siblings.0=hr[style*=background-color: rgb(208, 208, 208)]
gmail.itemLink.selector.0=div:has(> a[href*=printItem.action?key=])
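The key naming scheme used in the listing above can be made concrete with a small parser. This is only an illustrative sketch in Python: Allegra's own loader is not shown in this document, and the helper name parse_selector_key and its regular expression are assumptions derived from the example keys.

```python
import re

# Hypothetical helper (not part of Allegra): splits a general selector key
# such as "outlook.itemLink.selector.siblings.0" into its components.
KEY_RE = re.compile(
    r"^(?P<client>[a-z]+)\."
    r"(?P<content>itemDetail|itemConversationHistory|itemLink)\."
    r"selector\.(?P<siblings>siblings\.)?(?P<index>\d+)$"
)


def parse_selector_key(key):
    """Return the parts of a selector key, or None if the key does not fit."""
    match = KEY_RE.match(key)
    if match is None:
        return None
    return {
        "client": match.group("client"),
        "content": match.group("content"),
        # True when the match and all following siblings are to be removed
        "siblings": match.group("siblings") is not None,
        "index": int(match.group("index")),
    }
```

A key that omits the selector segment, for example generic.itemDetail.0, is rejected by this sketch and returns None.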
Client-specific detection#
# Controls whether to detect the email client
# before applying patterns
extractReplayPartClientSpecific=false
true: Detects the e-mail client (Outlook, Gmail, etc.) and applies only the patterns relevant to it.
false (default): Applies all patterns one after another for better compatibility.
Set to true for better performance when the organization mainly uses a single e-mail client.
Patterns with the generic prefix are always applied regardless of this setting.
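The effect of this flag on pattern selection can be sketched as follows. The function prefixes_to_apply is hypothetical, and the fallback to all clients when no client is detected is an assumption; this document does not state what Allegra does in that case.

```python
def prefixes_to_apply(client_specific, detected_client, known_clients):
    """Return the pattern prefixes to apply, in order.

    client_specific: value of extractReplayPartClientSpecific
    detected_client: prefix of the detected client, or None
    known_clients:   all configured client prefixes
    """
    if client_specific and detected_client is not None:
        selected = [detected_client]
    else:
        # Assumption: with detection disabled (or no client recognized),
        # the patterns of every client are tried one after another.
        selected = list(known_clients)
    # Patterns with the generic prefix are applied in every case.
    return selected + ["generic"]
```

With the flag set to true and Gmail detected, only the gmail and generic prefixes remain, which is where the performance benefit comes from.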
Supported e-mail clients#
The system includes preconfigured patterns for:
| E-mail client | Prefix | Description |
|---|---|---|
| Outlook | outlook | Microsoft Outlook |
| Gmail | gmail | Google Gmail web interface |
| Thunderbird | thunderbird | Mozilla Thunderbird |
| Apple Mail | applemail | macOS/iOS Mail application |
| Yahoo Mail | yahoomail | Yahoo webmail |
| ProtonMail | protonmail | ProtonMail |
| Generic | generic | Patterns for all e-mails |
Pattern types and configuration#
Pattern type 1: E-mail client detection#
Identifies the e-mail client that sent the message.
Format:
<client>.detection.<index>=<CSS selector>
# Detection patterns
outlook.detection.0=div[id^='divRplyFwdMsg']
outlook.detection.1=div.ms-outlook-mobile-reference-message
outlook.detection.2=div[id='appendonsend']
outlook.detection.3=div.WordSection1
gmail.detection.0=div.gmail_quote
gmail.detection.1=div.gmail_attr
thunderbird.detection.0=div.moz-cite-prefix
thunderbird.detection.1=div.moz-forward-container
thunderbird.detection.2=blockquote[type='cite']
applemail.detection.0=blockquote[type='cite']
yahoomail.detection.0=div[class*=yahoo_quoted]
protonmail.detection.0=div.protonmail_quote
The first matching selector determines the client type.
Useful with:
extractReplayPartClientSpecific=true
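The first-match rule can be illustrated with a simplified detector. Allegra evaluates Jsoup CSS selectors; the sketch below instead uses Python's standard html.parser and hand-written predicates that stand in for a few of the detection selectors above. The names RULES and detect_client, and the order in which clients are checked, are assumptions.

```python
from html.parser import HTMLParser


def has_class(attrs, name):
    """True if the class attribute contains the given class name."""
    return name in attrs.get("class", "").split()


# Simplified stand-ins for some detection selectors, checked in this order.
RULES = [
    ("outlook", lambda tag, a: tag == "div" and a.get("id", "").startswith("divRplyFwdMsg")),
    ("gmail", lambda tag, a: tag == "div" and has_class(a, "gmail_quote")),
    ("thunderbird", lambda tag, a: tag == "div" and has_class(a, "moz-cite-prefix")),
    ("protonmail", lambda tag, a: tag == "div" and has_class(a, "protonmail_quote")),
]


class _ElementCollector(HTMLParser):
    """Collects (tag, attributes) for every start tag in the document."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        self.elements.append((tag, {key: value or "" for key, value in attrs}))


def detect_client(html):
    """Return the prefix of the first rule that matches, or None."""
    collector = _ElementCollector()
    collector.feed(html)
    for client, predicate in RULES:
        if any(predicate(tag, attrs) for tag, attrs in collector.elements):
            return client
    return None
```

Because the first matching rule wins, an e-mail that contains both an Outlook and a Gmail marker is classified as Outlook in this sketch.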
Pattern type 2: Simple removal selectors#
Removes all matching elements. This is useful when a single quote element encloses the entire original message.
gmail.selector.0=div.gmail_quote
gmail.selector.1=div.gmail_attr
thunderbird.selector.0=blockquote[type='cite']
Format:
<client>.selector.<index>=<CSS selector>
Pattern type 3: Removal with siblings#
Removes the matched element AND all following elements (the entire quoted region).
This is useful when the start of the original part is marked by an element and all following elements should be removed as well.
outlook.selector.siblings.0=hr[style*='border:none'][style*='border-top']
yahoomail.selector.siblings.0=div[style*='border-top'][style*='dotted']
Format:
<client>.selector.siblings.<index>=<CSS selector>
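The removal of a matched element together with everything that follows it can be sketched with Python's standard xml.etree.ElementTree. This is an illustration under simplifying assumptions, not Allegra's implementation: the input must be well-formed XML, and a plain predicate function stands in for the CSS selector. The function name remove_with_following_siblings is hypothetical.

```python
import xml.etree.ElementTree as ET


def remove_with_following_siblings(root, matches):
    """Remove the first matching element and all siblings after it.

    Returns True if something was removed, False otherwise.
    """
    for parent in root.iter():
        children = list(parent)
        for position, child in enumerate(children):
            if matches(child):
                # Drop the separator itself and the whole quoted region.
                for doomed in children[position:]:
                    parent.remove(doomed)
                return True
    return False


def is_outlook_separator(element):
    """Rough stand-in for hr[style*='border:none'][style*='border-top']."""
    style = element.get("style", "")
    return element.tag == "hr" and "border:none" in style and "border-top" in style
```

Applied to a body consisting of a reply paragraph, such an hr separator and two quoted paragraphs, only the reply paragraph remains.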
Pattern type 4: Conditional removal with siblings#
Removes elements that match a selector AND contain particular text or match a regular expression.
gmail.selector.siblings.conditional.0=div.gmail_extra
gmail.selector.siblings.conditional.0.regex=.*On.*\\d{4}.*wrote:.*
applemail.selector.siblings.conditional.0=div, p, br
applemail.selector.siblings.conditional.0.text.contains=Begin forwarded message:
Format:
<client>.selector.siblings.conditional.<index>=<CSS selector>
<client>.selector.siblings.conditional.<index>.regex=<regex pattern>
or:
<client>.selector.siblings.conditional.<index>.text.contains=<exact text>
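The additional condition can be sketched as a small check on the text of an element that already matched the CSS selector. The function condition_holds is hypothetical, and the use of a substring search rather than a full match for the regex is an assumption; the document does not say which of the two Allegra uses.

```python
import re


def condition_holds(text, regex=None, text_contains=None):
    """Check the extra condition of a conditional pattern.

    regex:         value of the .regex key (after properties unescaping)
    text_contains: value of the .text.contains key
    """
    if regex is not None:
        # Assumption: the pattern may match anywhere in the element text.
        return re.search(regex, text) is not None
    if text_contains is not None:
        return text_contains in text
    return False
```

An element is removed only if both the selector and this condition match, which keeps broad selectors such as div, p, br from removing unrelated content.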
Pattern type 5: Reply header patterns#
Text-based patterns for recognizing reply headers in various languages. This processing is performed independently of the e-mail client.
# English
replyheader.selector.siblings.regex.0=From:.*Sent:.*To:.*Subject:
replyheader.selector.siblings.regex.1=On .* wrote:
# German
replyheader.selector.siblings.regex.3=Von:.*Gesendet:.*An:.*Betreff:
replyheader.selector.siblings.regex.4=Am .* schrieb .*:
Format:
replyheader.selector.siblings.regex.<index>=<regex pattern>
Elements to search:
replyheader.element.selector=div, p, pre, span
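How these patterns recognize a reply header can be tried out with a short sketch. It uses the four patterns listed above and matches case-insensitively, as stated in the technical notes. Collapsing whitespace before matching and searching anywhere in the text are assumptions, and is_reply_header is a hypothetical name.

```python
import re

# The reply header patterns from the listing above.
REPLY_HEADER_PATTERNS = [
    r"From:.*Sent:.*To:.*Subject:",
    r"On .* wrote:",
    r"Von:.*Gesendet:.*An:.*Betreff:",
    r"Am .* schrieb .*:",
]

COMPILED_PATTERNS = [re.compile(p, re.IGNORECASE) for p in REPLY_HEADER_PATTERNS]


def is_reply_header(element_text):
    """True if the text of a div, p, pre or span looks like a reply header."""
    # Assumption: line breaks inside the element are collapsed to spaces,
    # so that multi-line headers can match a single-line pattern.
    normalized = " ".join(element_text.split())
    return any(pattern.search(normalized) for pattern in COMPILED_PATTERNS)
```

When adding a pattern for another language, testing it against a few real header lines in this way helps to catch patterns that are too broad.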
Pattern type 6: Allegra-specific reply marker#
Use this approach when other client-specific or generic selectors do not clean up the e-mails as expected.
Encapsulating content from Allegra e-mails#
Make sure that the e-mail content sent from Allegra is embedded in a special marker element.
# Wrap outgoing emails with a marker element
item.emailSend.wrapEmailContent=false
# The start of the wrapper element
item.emailSend.wrapStartTag=<div class="answerDelimiter" id="answerDelimiter"><p style="color:#b5b5b5">${delimiterText}</p>
# The end of the wrapper element
item.emailSend.wrapEndTag=</div>
Removing the wrapper element from received e-mails#
If the marker element is present in the reply e-mail, it can be removed with a client-specific selector.
outlook.delimiterSelector=div[id*=answerDelimiter]:has(p:contains(delimiterText))
gmail.delimiterSelector=div:containsOwn(delimiterText)
Format:
<client>.delimiterSelector=<CSS selector>
Note: %delimiterText% is replaced at runtime by the localized delimiter text.
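The wrapping of outgoing content can be sketched as plain string assembly. This is an illustration only: the function wrap_email_content and the sample delimiter text are hypothetical, and Python's string.Template is used merely because it understands the ${delimiterText} placeholder syntax shown in the start tag above.

```python
from string import Template

# Start and end tag as configured in the listing above.
WRAP_START_TAG = Template(
    '<div class="answerDelimiter" id="answerDelimiter">'
    '<p style="color:#b5b5b5">${delimiterText}</p>'
)
WRAP_END_TAG = "</div>"


def wrap_email_content(html, delimiter_text):
    """Enclose outgoing e-mail content in the marker element."""
    start = WRAP_START_TAG.substitute(delimiterText=delimiter_text)
    return start + html + WRAP_END_TAG
```

Because the whole outgoing content ends up inside one div with a known id and delimiter text, a single selector is enough to remove it from the reply.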
CSS selector syntax#
The configuration uses Jsoup CSS selectors.
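| Selector | Example | Meaning |
|---|---|---|
| Tag | div | All div tags |
| Class | div.gmail_quote | Elements with this class |
| ID | div#answerDelimiter | Element with this ID |
| Attribute exact | [type='cite'] | Attribute equals the value |
| Attribute contains | [class*=itemDetail] | Attribute contains the substring |
| Attribute starts with | [id^='divRplyFwdMsg'] | Attribute starts with the prefix |
| Multiple selectors | div, p, br | Any of the listed selectors |
| Has child | div:has(> p) | Elements that contain a matching child |
| Contains text | p:contains(text) | Elements whose text contains the string |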
Testing and troubleshooting#
Reloading configuration#
Changes to EmailReplyPatterns.properties are detected automatically. A restart is not required.
Common problems#
Quoted text is not removed#
analyze the HTML source of the e-mail
identify unique CSS selectors or text patterns
add matching patterns to the configuration file
Too much content is removed#
phrase selectors more specifically
remove patterns that are too general
Patterns no longer work after changes#
check the sequential numbering (0,1,2,…)
check for syntax errors
use double backslashes in regex: \\d instead of \d
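Why the double backslash matters can be shown with a simplified model of backslash handling in Java-style .properties files, where a backslash before an ordinary character is dropped. The sketch assumes the file is read with those rules; it ignores \uXXXX escapes and line continuations, and both function names are hypothetical.

```python
import re

_SPECIAL = {"t": "\t", "n": "\n", "r": "\r", "f": "\f"}


def unescape_properties_value(raw):
    """Simplified backslash processing for a .properties value."""
    result = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            following = raw[i + 1]
            # "\\" becomes "\", "\d" becomes plain "d".
            result.append(_SPECIAL.get(following, following))
            i += 2
        else:
            result.append(raw[i])
            i += 1
    return "".join(result)


def regex_from_properties(raw):
    """Compile the regex the application would see after loading the file."""
    return re.compile(unescape_properties_value(raw))
```

Written with a single backslash, \d{4} reaches the regex engine as d{4} and no longer matches a four-digit year.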
Selectors with siblings#
If a selector with siblings matches, everything below the matched element is removed, including signatures, disclaimers, privacy notices and virus notices.
Best practices#
test with real e-mails
define client-specific patterns first
always use sequential numbering
document changes
create a backup of the configuration before making changes
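The numbering check can be automated with a small helper that reports missing indexes per key group. This is a hypothetical lint sketch, not an Allegra tool; it only inspects keys that end in a numeric index and makes no claim about how Allegra itself treats gaps.

```python
import re
from collections import defaultdict

# Matches keys that end in ".<index>", e.g. "outlook.detection.2".
INDEXED_KEY = re.compile(r"^(?P<base>.+)\.(?P<index>\d+)$")


def find_numbering_gaps(keys):
    """Return {key group: [missing indexes]} for groups with gaps."""
    seen = defaultdict(set)
    for key in keys:
        match = INDEXED_KEY.match(key)
        if match:
            seen[match.group("base")].add(int(match.group("index")))
    gaps = {}
    for base, indexes in seen.items():
        missing = sorted(set(range(max(indexes) + 1)) - indexes)
        if missing:
            gaps[base] = missing
    return gaps
```

Running it over the keys of a modified file before deployment is a cheap way to spot an index that was skipped or deleted by accident.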
Technical notes#
Processing order: Client-specific patterns -> Generic patterns -> Reply headers
Caching: Regex patterns are compiled and cached for better performance.
Case sensitivity: Regex patterns for reply headers are case-insensitive by default.
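The three notes can be combined into a short sketch: stages run in the stated order, and each regex is compiled once, case-insensitively, and then reused. The names PROCESSING_ORDER, compiled and run_in_order are hypothetical, and functools.lru_cache merely stands in for whatever cache Allegra uses.

```python
import re
from functools import lru_cache

# Stage order as stated in the notes above.
PROCESSING_ORDER = ("client-specific", "generic", "replyheader")


@lru_cache(maxsize=None)
def compiled(pattern):
    """Compile a reply header regex once, case-insensitively, and reuse it."""
    return re.compile(pattern, re.IGNORECASE)


def run_in_order(handlers, content):
    """Apply one handler per stage, in the documented order.

    handlers: dict mapping each stage name to a function content -> content
    """
    for stage in PROCESSING_ORDER:
        content = handlers[stage](content)
    return content
```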