What is Encoding in SMS?

Overview

Encoding in SMS refers to the way text characters are represented digitally when a message is sent through a mobile network.
Every character you include in an SMS—letters, numbers, punctuation, or symbols—must be translated into a standardized digital format that both the sending and receiving devices understand.

The type of encoding used directly affects:

  • How many characters can fit into a single SMS.

  • Whether the message will be split into multiple parts (segments).

  • The overall delivery cost of the message.


Why encoding matters

Mobile networks allot a fixed payload to each SMS message: 140 bytes, or 1,120 bits.
Encoding determines how many bits represent each character, and therefore how many characters fit in a single message: 1,120 ÷ 7 = 160 characters at 7 bits each, but only 1,120 ÷ 16 = 70 characters at 16 bits each.

Different encodings handle this differently:

  • Some encodings use fewer bytes per character, allowing more characters per message.

  • Others use more bytes per character, reducing the available space and causing longer messages to be split into multiple segments.
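The arithmetic behind these limits can be sketched in a few lines. This is an illustrative calculation only: the function name is made up for the example, and the 6-byte header is the concatenation header commonly used when a message is split, assumed here rather than taken from any particular platform.

```python
SMS_PAYLOAD_BYTES = 140  # fixed payload size of one SMS

def chars_per_message(bits_per_char: int, header_bytes: int = 0) -> int:
    """Whole characters that fit in the payload left after an optional header."""
    usable_bits = (SMS_PAYLOAD_BYTES - header_bytes) * 8
    return usable_bits // bits_per_char

# Compare a standalone SMS with one segment of a split (concatenated) message,
# assuming a 6-byte concatenation header in the second case.
for name, bits in (("GSM-7", 7), ("8-bit", 8), ("UCS-2", 16)):
    single = chars_per_message(bits)
    segment = chars_per_message(bits, header_bytes=6)
    print(f"{name}: {single} per SMS, {segment} per segment")
```

The same 140 bytes hold 160 characters at 7 bits each but only 70 at 16 bits each, which is why the choice of encoding drives the segment count.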


Types of SMS encoding

There are three main encoding types used in SMS communication, but two of them account for nearly all text messages: GSM-7 and Unicode (UCS-2).

1. GSM-7

GSM-7 is the default encoding used by most networks around the world.
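It supports the standard Latin alphabet (A–Z, a–z), digits (0–9), common symbols such as !, ?, @, and #, and a small set of accented letters (for example é, ñ, and ü).
A few symbols from the GSM-7 extension table, including €, [, ], {, }, \, ^, ~, and |, are also supported but count as two characters each.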

  • Maximum characters per SMS: 160

  • When concatenated: 153 characters per segment (a 6-byte concatenation header takes up the space of 7 characters)

  • Most cost-efficient encoding

Example:

Hello! Your order has been confirmed. Thank you for shopping with us.

This message uses only GSM-7 characters and fits comfortably in one segment.
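A rough way to confirm this kind of claim is to measure a message against the GSM-7 character set. The sketch below assumes the standard GSM 03.38 basic alphabet and extension table; the function and constant names are hypothetical, and a given carrier or platform may support a slightly different set.

```python
from typing import Optional

# GSM 03.38 basic alphabet: each character costs one 7-bit unit (septet).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: sent as an escape plus a second septet, so each costs two.
GSM7_EXTENDED = set("^{}\\[~]|€")

def gsm7_septets(text: str) -> Optional[int]:
    """Return the GSM-7 length of text in septets, or None if it needs Unicode."""
    total = 0
    for ch in text:
        if ch in GSM7_BASIC:
            total += 1
        elif ch in GSM7_EXTENDED:
            total += 2
        else:
            return None  # character is outside GSM-7
    return total

msg = "Hello! Your order has been confirmed. Thank you for shopping with us."
print(gsm7_septets(msg), "of 160 characters used")
```

A result of None means at least one character falls outside GSM-7, so the message would be sent as Unicode instead.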


2. Unicode (UCS-2)

Unicode encoding (UCS-2, which stores text in 16-bit units) is used when a message contains characters not supported by GSM-7, such as:

  • Accented characters outside the GSM-7 set (á, í, ó, ú); note that é and ñ are already part of GSM-7

  • Emojis 🙂 (each one typically counts as two or more characters)

  • Non-Latin alphabets (Arabic, Chinese, Cyrillic, etc.)

Key facts:

  • Maximum characters per SMS: 70

  • When concatenated: 67 characters per segment

  • Used automatically when an unsupported character is detected

Example:

¡Gracias por tu compra! 😊 Tu pedido será entregado pronto.

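This message includes á (in "será") and 😊, neither of which exists in GSM-7, so the entire message is sent as Unicode. The inverted ¡ is part of GSM-7 and would not trigger the switch on its own. At under 70 characters the message still fits in one segment; a longer one would be split into multiple segments.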
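To see how such a message is measured, one can count 16-bit units rather than visible characters. The sketch below assumes the common practice of encoding Unicode SMS as UTF-16, where an emoji occupies two units; the function name is hypothetical.

```python
def ucs2_units(text: str) -> int:
    """Count 16-bit code units, the unit in which Unicode SMS length is measured."""
    return len(text.encode("utf-16-be")) // 2

# \U0001F60A is the smiling-face emoji from the example above.
msg = "¡Gracias por tu compra! \U0001F60A Tu pedido será entregado pronto."
print("visible characters:", len(msg))
print("16-bit units:", ucs2_units(msg))
print("fits in one segment:", ucs2_units(msg) <= 70)
```

The emoji is a single visible character but takes two units, so emoji-heavy messages reach the 70-unit limit sooner than their visible length suggests.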


3. 8-bit Encoding (rare)

8-bit encoding is occasionally used for data messages or binary SMS, such as system notifications, WAP pushes, or device configurations.
This type of encoding is not commonly used for regular marketing or transactional messages and is typically reserved for technical applications.


How encoding affects your messages

Encoding Type   | Characters per SMS | Characters per Segment (concatenated) | Common Use
GSM-7           | 160                | 153                                   | Standard text in Latin alphabet
Unicode (UCS-2) | 70                 | 67                                    | Messages with special or non-Latin characters
8-bit           | N/A                | N/A                                   | Binary/system messages

Key impact areas:

  1. Character count: Fewer characters fit in a single SMS when using Unicode.

  2. Segmentation: Longer messages get divided into multiple parts, increasing cost.

  3. Billing: Each segment counts as one SMS, even if they are concatenated into a single message on the user’s device.
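The relationship between length, segments, and billing can be sketched as follows. The limits come from the table above; the function names and the per-segment price are hypothetical, and real gateways may count slightly differently at segment boundaries (for example, they avoid splitting an emoji across two segments).

```python
import math

# (limit for a single SMS, limit per segment once the message is split)
LIMITS = {"GSM-7": (160, 153), "UCS-2": (70, 67)}

def segment_count(units: int, encoding: str) -> int:
    """Number of segments for a message of the given length in encoding units."""
    single, per_segment = LIMITS[encoding]
    if units <= single:
        return 1
    return math.ceil(units / per_segment)

def billed_cost(units: int, encoding: str, price_per_segment: int) -> int:
    """Each segment is billed as one SMS; the price uses whatever unit you pass in."""
    return segment_count(units, encoding) * price_per_segment

# The same 161-character text costs more once it no longer fits GSM-7.
print(segment_count(161, "GSM-7"), segment_count(161, "UCS-2"))
```

Going one character over the single-message limit doubles the cost, because the per-segment capacity drops as soon as the message is split.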


How to identify encoding before sending

Before sending a campaign or notification, it’s important to check which encoding your message uses.
On platforms like Messangi, the message composer or preview section typically indicates:

  • The encoding type (GSM-7 or Unicode)

  • The total number of characters

  • The number of segments the message will use

If even one character outside GSM-7 (such as á or 😊) is detected, the system automatically converts the entire message to Unicode.
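If your platform does not highlight the offending characters, a small pre-send check can list them. This is a minimal sketch assuming the standard GSM 03.38 character set; the names are hypothetical and it is not how any specific platform performs detection.

```python
# Standard GSM-7 basic alphabet plus the extension-table symbols.
GSM7_CHARS = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
    "^{}\\[~]|€"
)

def non_gsm7_characters(text: str) -> list:
    """Distinct characters in text that would force Unicode encoding."""
    return sorted({ch for ch in text if ch not in GSM7_CHARS})

def detect_encoding(text: str) -> str:
    """A single unsupported character switches the whole message to Unicode."""
    return "Unicode (UCS-2)" if non_gsm7_characters(text) else "GSM-7"

draft = "Tu pedido será entregado pronto."
print(detect_encoding(draft), non_gsm7_characters(draft))
```

Listing the specific characters makes it easier to decide whether to reword the message or accept the shorter Unicode limit.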


Best practices

  • Use GSM-7 characters whenever possible to maximize message length and minimize cost.

  • Avoid pasting text directly from Word or email, as these sources may include hidden Unicode symbols (e.g., smart quotes, long dashes, or non-breaking spaces).

  • Test your messages before sending bulk campaigns to confirm the encoding and segment count.

  • Shorten URLs using link shorteners to save character space.

  • If Unicode is required, such as for brand names or non-Latin scripts, keep your text concise to control message segmentation.
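Pasted text can be cleaned up before sending by swapping typographic symbols for plain equivalents. The mapping below is an illustrative, non-exhaustive assumption about which symbols word processors tend to insert; the names are hypothetical.

```python
# Typographic characters often introduced by word processors and email clients,
# mapped to GSM-7 friendly replacements.
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...",                # single-character ellipsis
    "\u00a0": " ",                  # non-breaking space
}
_TABLE = str.maketrans(REPLACEMENTS)

def to_plain_punctuation(text: str) -> str:
    """Replace common typographic symbols so they do not force Unicode encoding."""
    return text.translate(_TABLE)

pasted = "\u201cSale ends today\u201d\u00a0\u2013 don\u2019t miss out\u2026"
print(to_plain_punctuation(pasted))
```

Note that replacing the ellipsis adds two characters, so it is worth rechecking the character count after cleaning.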


Summary

Encoding determines how your SMS message is structured, transmitted, and billed.

  • GSM-7 is efficient and cost-effective, ideal for messages in English or any text that stays within its character set.

  • Unicode (UCS-2) supports more languages and symbols but reduces message length and increases segmentation.

By understanding how encoding works and monitoring which type your message uses, you can optimize your SMS communications for clarity, reach, and cost efficiency.
