Anti-Phishing Measures in Tellody

(Cross-posting of https://www.tellody.com/blog/166-anti-phishing-measures-in-tellody)

Sending bulk emails is a risky business because, on the one hand, for Tellody that is running it, it’s prone to be abused by scammers and, on the other, for our users, it is easy to create emails in good faith but inadvertently contain indicators that will taint them.

Protecting ourselves and our users is an open-ended issue without easy solutions, but we decided that something is better than nothing and some action is better than no action. We tried to implement some kind of spoofing detection, as a first step. In this article, I’ll briefly explain what we have implemented, what you have to look out for and how you can fix your email if it gets flagged. Since we target phishers, the component is called “CoastGuard”.

I will use the following sample screen that contains all current notifications, as my reference.

phishing-indicators-2

 

First of all, you should not mix scripts in your words. Mixing scripts is a prime indicator of spoofing because words can appear as meaning one thing without actually writing them the way they are supposed to be. For example, the “Spoofed” notification appears to refer to the word “Paypal” but it mixes two scripts, “Basic Latin” (what “Paypal” is supposed to be written in) and “Greek and Coptic”. In this case it’s the uppercase Greek Rho, indistinguishable in all manners from the Latin uppercase P, that triggers this notification. There’s no legitimate reason to use more than one script within a word, and there are plenty of illegitimate ones so, if you find yourself needing to use a particularly clever brand name, that you just thought of, which uses more than one script, Tellody will not allow you to send it in an email. Get in touch if this is something essential to you.

Second, Punycodes have been shown to present the same spoofing opportunities, with little added value. The Punycode that is rejected above looks the same as https://www.epic.com but is not the same URL. Tellody rejects punycodes altogether, at least until major browsers start presenting adequate warnings about them. Again, if using a punycode is essential to you, get in touch with us.

The third category of spoofing indicator is when we think that someone tries to masquerade link text to make it appear as if it’s a URL. Maybe not many people would be fooled by “http:\\”, but would you be fooled by “http:⁄⁄”, containing a character that looks a lot like a forward slash? I know I would.

The next two indicators pertain to using the plethora of Unicode characters to make a string look as if it’s contiguous but is actually broken by characters with no width or scarcely any width. Adding such a pseudo-space character can create a string that looks to the eye like “Viagra” but fails to match a filter that looks for an exact match.

Finally, the last indicator is shown when the text of a link is something that looks as if it is a URL but it is different than the actual link target. A URL that appears to go to http://www.apple.com, while it goes to some shady site, is rejected by Tellody. Note that “appears” means how we think the text appears to a human and, the more tricks we detect to make something look like a URL while failing the straightforward syntactic check for one, the worse off we deem it to be.

This is a first version of those checks and they might evolve over time. If you find yourself having a legitimate business need that is hampered, please get in touch with info@tellody.com to discuss it.

Gender Detection, Vocatives and Namedays for Greek First Names in Tellody

(Cross posting of https://www.tellody.com/blog/164-gender-detection-and-vocatives-for-greek-first-names-in-tellody)

When Tellody was selected by Wind to run the MarketApp service, a big market opened in Greece. Even though Greek is one of the supported UI languages (and currently only EN and EL are usable in the language selector), we felt that the Greek audience deserved some more functionality, enabled by the inherent structure of the Greek language.

You might have noticed that, when the UI language is Greek, you get two new buttons for “message tags”. These enable you to use vocatives for the first and last names.vocative_buttons

We made a decision that we’d limit ourselves to traditional birth names and usual diminutives. That ruled out names of foreign origin like “Μισέλ” or “Νάταλι”. First of all, these names cannot be inflected, so there can be no separate vocative form for them. Secondly, we cannot use them unambiguously for detecting gender. Both previous examples can be male or female names.

This leaves baptismal names which are of ancient origin (ancient Greek names or names from the Old Testament) and modern ones. Both detecting the gender and forming the vocative are operations on the structure of the name, which we’ve implemented in Tellody. Because of that, we can handle a lot more names than what is in the name list we are using for matching names against, which we are currently using to calculate the nameday for the corresponding names.

We test for phonetic equality, so we’ll match various phonemes, no matter how they are spelled. However, we handle diacritic marks in the same phonetic way. We’ll handle omission of the accent mark (Unicode Character ‘GREEK TONOS’ U+0384) and unneeded uses of diaeresis (Unicode Character ‘COMBINING DIAERESIS’ U+0308) that don’t affect syllabication and enunciation, but we won’t match, for example, “Χάιδω” (the first syllable of which is a diphthong of “α” and “ι”) and “Χαιδω” or “ΧΑΙΔΩ”, in uppercase, (the first syllable of which is the composite vowel “αι”, same phoneme as “ε”). However, “Χαϊδω”, having two first syllables heard as “α” and “ι”, will be treated as phonetically equal to the diphthong “άι”, since both consist of the same phonemes.

Namedays come in five flavors:

  1. Fixed date
  2. Easter ± (n)
  3. If Easter is before April 23rd then (Fixed date) else Easter ± (n)
  4. First Sunday on or after December 11th
  5. First Sunday on or after February 13th

Tellody, at the time of this writing, includes data for approximately 1800 name forms for 1250 male and female names, with 800 of them having nameday data.