GSM-7 Character Set Reference and Encoding Guide

Overview

The GSM-7 character set is the default encoding used for most SMS messages worldwide.
It defines a standard set of characters that can be transmitted efficiently using 7-bit encoding, allowing up to 160 characters per SMS.

When a message includes characters outside of this set, the encoding automatically switches to Unicode (UCS-2), reducing the message length limit to 70 characters.

This article provides a reference of all characters supported in GSM-7, details about extended (double-length) characters, and how these affect message segmentation and billing.


Standard GSM-7 Character Set

The following table lists the core GSM-7 characters, their HEX values, and corresponding ISO-8859-1 equivalents.
These characters each consume one unit of the SMS character limit.

HEX         Character Name                             Character   ISO-8859-1 Hex
0x00        COMMERCIAL AT                              @           40
0x01        POUND SIGN                                 £           A3
0x02        DOLLAR SIGN                                $           24
0x03        YEN SIGN                                   ¥           A5
0x04        LATIN SMALL LETTER E WITH GRAVE            è           E8
0x05        LATIN SMALL LETTER E WITH ACUTE            é           E9
0x06        LATIN SMALL LETTER U WITH GRAVE            ù           F9
0x07        LATIN SMALL LETTER I WITH GRAVE            ì           EC
0x08        LATIN SMALL LETTER O WITH GRAVE            ò           F2
0x09        LATIN CAPITAL LETTER C WITH CEDILLA        Ç           C7
0x0B        LATIN CAPITAL LETTER O WITH STROKE         Ø           D8
0x0C        LATIN SMALL LETTER O WITH STROKE           ø           F8
0x0E        LATIN CAPITAL LETTER A WITH RING ABOVE     Å           C5
0x0F        LATIN SMALL LETTER A WITH RING ABOVE       å           E5
0x10        GREEK CAPITAL LETTER DELTA                 Δ           (none)
0x11        LOW LINE                                   _           5F
0x12        GREEK CAPITAL LETTER PHI                   Φ           (none)
0x13        GREEK CAPITAL LETTER GAMMA                 Γ           (none)
0x14        GREEK CAPITAL LETTER LAMBDA                Λ           (none)
0x15        GREEK CAPITAL LETTER OMEGA                 Ω           (none)
0x16        GREEK CAPITAL LETTER PI                    Π           (none)
0x17        GREEK CAPITAL LETTER PSI                   Ψ           (none)
0x18        GREEK CAPITAL LETTER SIGMA                 Σ           (none)
0x19        GREEK CAPITAL LETTER THETA                 Θ           (none)
0x1A        GREEK CAPITAL LETTER XI                    Ξ           (none)
0x20–0x7F   Space, digits, Latin letters, punctuation  A–Z, a–z, 0–9, and symbols such as !, ?, #, &

Greek capital letters have no ISO-8859-1 equivalent.
The 0x20–0x7F range matches ASCII except at a few positions, where GSM-7 substitutes ¤ (0x24), ¡ (0x40), Ä Ö Ñ Ü § (0x5B–0x5F), ¿ (0x60), and ä ö ñ ü à (0x7B–0x7F).
Codes omitted from the table: 0x0A (line feed), 0x0D (carriage return), 0x1B (escape), and 0x1C–0x1F (Æ, æ, ß, É).

(For the complete technical table, see the GSM 03.38 specification, now maintained as 3GPP TS 23.038.)
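
The lookup described by the table can be sketched in Python. This is a minimal illustration, not platform code: the names GSM7_BASIC and gsm7_code are hypothetical, and the string is laid out so that each character's position equals its 7-bit code in the GSM 03.38 default alphabet. It covers only the basic table, so escape-prefixed characters such as € are reported as absent.

```python
from typing import Optional

# GSM 03.38 default alphabet: the index of each character is its 7-bit code.
# "\x1b" is a placeholder for the escape code at 0x1B, not a printable character.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅå"      # 0x00–0x0F
    "Δ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"     # 0x10–0x1F
    " !\"#¤%&'()*+,-./"       # 0x20–0x2F
    "0123456789:;<=>?"        # 0x30–0x3F
    "¡ABCDEFGHIJKLMNO"        # 0x40–0x4F
    "PQRSTUVWXYZÄÖÑÜ§"        # 0x50–0x5F
    "¿abcdefghijklmno"        # 0x60–0x6F
    "pqrstuvwxyzäöñüà"        # 0x70–0x7F
)

# Map each character to its code, skipping the escape placeholder.
GSM7_CODES = {ch: code for code, ch in enumerate(GSM7_BASIC) if ch != "\x1b"}


def gsm7_code(ch: str) -> Optional[int]:
    """Return the basic-table code for a single character, or None if absent."""
    return GSM7_CODES.get(ch)


def fits_gsm7_basic(text: str) -> bool:
    """True when every character of text is in the basic table."""
    return all(ch in GSM7_CODES for ch in text)
```

For example, gsm7_code("@") returns 0 and gsm7_code("😊") returns None, which is the situation in which a message would fall back to Unicode.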


Extended GSM-7 Characters (Double-Length)

Certain characters in the GSM-7 set require an escape sequence (0x1B) and therefore count as two characters toward the total message length.
These are sometimes referred to as 14-bit characters because of their representation using the escape code plus the character code.

Extended Character   Encoding Sequence   Counts As   Description
^                    0x1B14              2           Circumflex accent
{                    0x1B28              2           Left curly bracket
}                    0x1B29              2           Right curly bracket
\                    0x1B2F              2           Backslash
[                    0x1B3C              2           Left square bracket
~                    0x1B3D              2           Tilde
]                    0x1B3E              2           Right square bracket
|                    0x1B40              2           Vertical bar
€                    0x1B65              2           Euro symbol

Important:
When any of these characters appear in an SMS, they count as two characters each in the total length calculation.
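
The double counting can be sketched in Python as follows. The names GSM7_EXTENDED and gsm7_length are hypothetical, and the function assumes the text has already been confirmed to contain only GSM-7 characters; it does not detect characters that would force Unicode.

```python
# Characters that need the 0x1B escape prefix in GSM-7.
GSM7_EXTENDED = set("^{}\\[~]|€")


def gsm7_length(text: str) -> int:
    """Count GSM-7 units, charging 2 for each escape-prefixed character."""
    return sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)


# "Price: €5" has 9 visible characters; the euro sign adds one extra unit.
print(gsm7_length("Price: €5"))  # 10
```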


Character Counting and Billing

SMS billing depends on the total character count, including extended characters and spaces.

Encoding Type     Max Characters (Single SMS)   Max Characters per Segment (Concatenated)
GSM-7             160                           153
Unicode (UCS-2)   70                            67

When using extended GSM-7 characters, remember to count each of them as two when calculating total message length.
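
The Unicode limits of 70 and 67 are counted in 16-bit units, not in visible characters. The Python sketch below assumes the common behavior in which characters outside the Basic Multilingual Plane (most emojis) are sent as two 16-bit units; the function name ucs2_units is hypothetical, and the platform's own counter remains the authority.

```python
def ucs2_units(text: str) -> int:
    """Return the number of 16-bit units needed to send text as Unicode."""
    # UTF-16 uses 2 bytes per unit; characters such as emojis take 2 units.
    return len(text.encode("utf-16-be")) // 2


print(ucs2_units("héllo"))  # 5
print(ucs2_units("Hi 😊"))  # 5: three BMP characters plus two units for the emoji
```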

Example:

Message:

Your balance is €100. Please confirm.
  • Total visible characters: 37

  • Extended character: € (counts as 2)

  • Adjusted character count: 38

  • Since it’s under 160, this message fits in one SMS.


SMS Length Calculation Formula

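For GSM-7 messages longer than 160 characters, the text is split into segments of 153 characters each, because the concatenation header occupies the remaining space: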

Formula:
Number of SMS = Math.Ceiling(total_characters / 153)

Example:
If a message contains 344 characters (after counting double-length characters):
344 / 153 ≈ 2.25 → rounded up to 3 SMS

Therefore, the message will be billed as three SMS.
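
The same calculation as a small Python sketch. The function name sms_segments and its parameters are hypothetical; the defaults are the GSM-7 limits from the table in this article, and the Unicode limits (70 and 67) can be passed instead. It assumes the character count already includes the doubled extended characters, and it ignores the edge case where an escape sequence would straddle a segment boundary.

```python
def sms_segments(units: int, single_limit: int = 160, segment_limit: int = 153) -> int:
    """Return how many SMS a message of the given length is billed as."""
    if units <= single_limit:
        return 1  # fits in a single, non-concatenated SMS
    # Integer ceiling division: same result as Math.Ceiling(units / segment_limit).
    return -(-units // segment_limit)


print(sms_segments(344))         # 3
print(sms_segments(71, 70, 67))  # 2 (Unicode limits)
```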


FTP and File Upload Compatibility

When uploading message files via FTP, encoding compatibility depends on the file’s text encoding format.
All uploaded files must support the basic GSM-7 character set.

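Supported                                     Not Supported
A–Z, a–z, 0–9, space, and GSM-7 punctuation   Non-Latin characters, emojis, symbols outside GSM-7
Examples: @, #, !, &, +, :                    Examples: é, ñ, 😊, ß

Note: é, ñ, and ß are part of the GSM-7 alphabet itself (é is 0x05 in the table above), so their place in the Not Supported column presumably reflects a limit of the upload path. Test accented characters in a small upload before relying on them.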

If unsupported characters are detected, the message may be converted to Unicode or truncated.
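
A file can be checked before upload with a short script. The Python sketch below is one possible approach, not a platform tool: the names GSM7_CHARS and find_non_gsm7 are hypothetical, and the check is made against the full GSM-7 alphabet (basic plus extended characters). The upload path may be stricter than that, so treat a clean result as necessary rather than sufficient.

```python
from typing import List, Tuple

# Basic GSM-7 alphabet plus the escape-prefixed extended characters.
GSM7_CHARS = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
    "^{}\\[~]|€"
)


def find_non_gsm7(text: str) -> List[Tuple[int, int, str]]:
    """Return (line, column, character) for each character outside GSM-7."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch not in GSM7_CHARS:
                problems.append((line_no, col, ch))
    return problems


# Only the emoji is reported; é is part of the GSM-7 alphabet.
print(find_non_gsm7("Hello\nCafé 😊"))  # [(2, 6, '😊')]
```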


Line Breaks

Line breaks can be inserted using the pipe character (|) when supported by the platform.
This symbol acts as a line separator within long messages or templates uploaded via FTP.

Example:

Thank you for your purchase!|Your order will arrive soon.

This will display as:

Thank you for your purchase!
Your order will arrive soon.

Best Practices

  • Always review your message content before sending to confirm encoding type.

  • Avoid using characters outside of the GSM-7 set to prevent conversion to Unicode.

  • Be mindful of extended characters (^, {, }, €, etc.) that count as two characters.

  • Use short links to save space in your message.

  • Test your SMS length and encoding in the platform before launching bulk campaigns.


Summary

The GSM-7 character set provides an efficient way to encode SMS messages using 7 bits per character.
However, extended symbols take two 7-bit units (the escape code plus the character code), reducing the available space within a single message.

By understanding which characters are supported and how they are encoded, you can optimize your messages for both delivery efficiency and cost control.
