A text processing practice that standardizes source strings so translation tools can reliably match and reuse existing translations.
String normalization is the practice of converting source text into a consistent format before or during the translation workflow. It removes incidental formatting differences, such as variations in whitespace, quotation mark style, punctuation, or line breaks, that prevent translation memory systems from recognizing otherwise identical strings. This improves translation memory reuse and avoids duplicated translation work.
When content comes from multiple authors, content systems, or legacy code, small formatting differences can cause the same text to be treated as separate strings.
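For example, two copies of “Click here” that differ only in a detail that is hard to see, such as an extra space or a different quotation mark style, would be translated twice without normalization.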
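The effect can be illustrated with a minimal Python sketch. The function name `normalize_source` and the specific rules (Unicode NFC, straight quotes, collapsed whitespace) are assumptions chosen for illustration; they are not the rules of any particular translation tool.

```python
import re
import unicodedata

# Hypothetical mapping of typographic quotes to their ASCII equivalents.
QUOTE_MAP = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2018": "'", "\u2019": "'",   # curly single quotes
})

def normalize_source(text: str) -> str:
    """Return a canonical form of a source string for matching purposes."""
    # Compose characters so visually identical text has identical code points.
    text = unicodedata.normalize("NFC", text)
    # Replace typographic quotes with straight quotes.
    text = text.translate(QUOTE_MAP)
    # Collapse any run of whitespace (spaces, tabs, line breaks,
    # non-breaking spaces) into a single space.
    text = re.sub(r"\s+", " ", text)
    # Drop leading and trailing whitespace.
    return text.strip()

variants = ["Click here", "Click  here", " Click\u00a0here\n"]
# All three variants reduce to one canonical string.
canonical = {normalize_source(v) for v in variants}
```

After normalization the three variants collapse into a single entry, so a translation memory that matches on the normalized form would see one string instead of three.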
There is no single standard for what should be normalized. Rules vary by content type, tooling, and quality needs. Formatting that is safe to normalize in marketing copy may change meaning in technical or UI content where whitespace or symbols matter. Teams define normalization rules carefully to improve translation memory matching without altering meaning.
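One way a team might express content-specific rules is as named profiles. The sketch below is hypothetical: the profile names `marketing` and `ui`, the `NormalizationRules` fields, and the choice of which rules each profile enables are all assumptions made for illustration.

```python
import re
import unicodedata
from dataclasses import dataclass

@dataclass(frozen=True)
class NormalizationRules:
    """Hypothetical switches for individual normalization steps."""
    collapse_whitespace: bool
    straighten_quotes: bool

# Assumed profiles: marketing copy is normalized aggressively, while
# UI strings keep their whitespace and symbols because they may carry meaning.
PROFILES = {
    "marketing": NormalizationRules(collapse_whitespace=True, straighten_quotes=True),
    "ui": NormalizationRules(collapse_whitespace=False, straighten_quotes=False),
}

CURLY_TO_STRAIGHT = (
    ("\u201c", '"'), ("\u201d", '"'),
    ("\u2018", "'"), ("\u2019", "'"),
)

def normalize_for(content_type: str, text: str) -> str:
    """Apply only the rules enabled for the given content type."""
    rules = PROFILES[content_type]
    # Unicode composition is applied to every profile in this sketch.
    text = unicodedata.normalize("NFC", text)
    if rules.straighten_quotes:
        for curly, straight in CURLY_TO_STRAIGHT:
            text = text.replace(curly, straight)
    if rules.collapse_whitespace:
        text = re.sub(r"\s+", " ", text).strip()
    return text

marketing_result = normalize_for("marketing", "Save  up to 50%\n")
ui_result = normalize_for("ui", "Name:\t%s\n")
```

In this sketch the marketing string loses its double space and trailing line break, while the UI string, including its tab and placeholder, passes through unchanged.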