Octonity

Why multilingual moderation can't be an afterthought

Most tools bolt on a translation layer and call it done. Here's why moderation has to think in the language the comment was written in.

Layla Haddad
Head of Trust & Safety
12 June 2026

Open almost any social tool's moderation settings and you'll find the same quiet assumption: that a comment can be understood by first turning it into English. Run it through a translation API, score the English, act on the score. It's tidy, it's cheap, and it falls apart the moment someone writes the way people actually write.

Translation throws away the signal you need

Moderation isn't sentiment analysis. The thing you're trying to catch — a threat, a slur, coordinated spam — lives in the exact words, the register, the dialect. A machine translation smooths all of that into bland, plausible English and the sharp edges that mattered are gone.

Consider Arabic. A phrase that's harmless in Modern Standard Arabic can be a targeted insult in Egyptian or Gulf dialect. Translate first and you get a clean English sentence that scores as "neutral" — while the original was the whole reason you wanted moderation in the first place.

Translate-then-moderate systematically under-flags exactly the content that's hardest to catch: dialect, slang, and code-switching. The cleaner the translation reads, the more confident — and wrong — the score.

Moderate in the language it was written in

The alternative is to never leave the source language. Octonity scores each comment natively across 30+ languages, so dialect and intent survive all the way to the decision. That means:

  • Dialect-aware models: not one Arabic, but the many Arabics people actually post in.
  • Code-switching support — the Hinglish, Franco-Arabic, and Spanglish that real comment sections are full of.
  • Context from the thread, because the same word means different things under a product launch and under a condolence post.
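One way to picture the thread-context point is a decision threshold that shifts with the kind of post a comment sits under. The sketch below is only an illustration under stated assumptions: the `thresholds` values, the thread-type labels, and the `decide` helper are all hypothetical, and `score` stands in for whatever a native-language model would return. It is not a description of how Octonity makes the decision.

```javascript
// Hypothetical per-thread thresholds on a 0..1 harm score: lower is stricter.
const thresholds = new Map([
  ["condolence", 0.3],
  ["product_launch", 0.7],
  ["default", 0.5],
]);

// `score` is a stand-in for a native-language model's output (assumption).
const decide = (score, threadType) => {
  const limit = thresholds.get(threadType) ?? thresholds.get("default");
  return score >= limit ? "hide" : "show";
};

// The same comment, with the same score, lands differently by context.
const score = 0.5;
console.log("condolence:    ", decide(score, "condolence"));
console.log("product_launch:", decide(score, "product_launch"));
```

A real system would more likely feed the thread into the model itself than adjust a cut-off afterwards; the threshold form is used here only because it fits in a few lines.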

The gap shows up even in miniature. A naive filter matches a banned word literally; a language-aware one normalises first. Run the snippet below and the literal match misses the obfuscated versions that the normalised match catches:

JavaScript
const banned = "spam";

// Normalise the way a language-aware pass would: undo the tricks people use
// to slip past a literal match (digit look-alikes, spacing, punctuation).
const lookalikes = { "0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t" };
const normalise = (s) =>
  s.toLowerCase()
   .replace(/[0-9]/g, (d) => lookalikes[d] ?? d)
   .replace(/[^a-z]/g, "");

const comments = ["buy SPAM now", "s p a m", "sp4m!!", "great post"];

for (const c of comments) {
  const literal = c.toLowerCase().includes(banned);
  const aware = normalise(c).includes(banned);
  console.log(c.padEnd(14), "literal:", literal, " aware:", aware);
}
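The same idea carries over to Arabic script, where the common tricks are different: stretching a word with tatweel (U+0640) or sprinkling in short-vowel marks (U+064B to U+0652). The following is a minimal sketch, assuming a hypothetical `normaliseArabic` helper and a placeholder banned token (a transliteration of "spam", not a real blocklist entry). The strings are written as Unicode escapes so the exact code points are unambiguous.

```javascript
// Placeholder banned token: seen, beh, alef, meem (a transliteration of "spam").
const bannedAr = "\u0633\u0628\u0627\u0645";

// Hypothetical helper: fold Arabic-script text before matching.
const normaliseArabic = (s) =>
  s.normalize("NFKC")                 // fold compatibility and presentation forms
   .replace(/\u0640/g, "")            // strip tatweel (kashida) used to stretch words
   .replace(/[\u064B-\u0652]/g, "");  // strip short-vowel and related marks

const samples = [
  "\u0633\u0628\u0627\u0645",                   // plain
  "\u0633\u0640\u0640\u0628\u0627\u0645",       // stretched with tatweel
  "\u0633\u064E\u0628\u064E\u0627\u0645",       // with fatha marks inserted
];

const results = samples.map((c) => ({
  literal: c.includes(bannedAr),
  aware: normaliseArabic(c).includes(bannedAr),
}));

results.forEach((r, i) => console.log(i, "literal:", r.literal, " aware:", r.aware));
```

This only handles script-level obfuscation. It does nothing for dialect or intent, which is the part that needs a model rather than a regular expression.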

What this looks like in practice

A creator with an audience across Cairo, Riyadh, and Dubai doesn't get one blunt filter. They get moderation that understands each community's register — hiding the abuse, surfacing the genuine questions, and never silently dropping a real comment because a translation layer flattened it.

That's the bar. Anything that starts by translating away the evidence is solving an easier problem than the one you have.
