Encoding strings

JavaScript strings are represented internally in a format known as UTF-16.

This format is great for string manipulation, but not ideal when interfacing with binary protocols, file systems, or Web APIs that expect UTF-8. The built-in TextEncoder and TextDecoder classes provide a standard way to convert between JavaScript strings and raw binary data in standardized encodings like UTF-8.

TextEncoder

TextEncoder takes a JavaScript string and encodes it into a Uint8Array of bytes using UTF-8.

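A minimal sketch (the greeting string and logged output are just an illustration):

```ts
// Encode a short ASCII string into its UTF-8 bytes
const encoder = new TextEncoder();

const bytes = encoder.encode("Hello!");

console.log(bytes);
// Uint8Array(6) [ 72, 101, 108, 108, 111, 33 ]
```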

Each item in the array is a byte (0–255) representing part of the UTF-8 encoding of the string. This is especially useful when working with emojis and other special characters that may require more than 2 bytes to represent; an emoji, for example, takes four bytes in UTF-8.

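For instance, a single waving-hand emoji is one visible character but four bytes of UTF-8 (a sketch):

```ts
const encoder = new TextEncoder();

// "👋" is U+1F44B, which needs a 4-byte UTF-8 sequence
const bytes = encoder.encode("👋");

console.log(bytes);
// Uint8Array(4) [ 240, 159, 145, 139 ]

// In UTF-16, the same emoji is a surrogate pair of two code units
console.log("👋".length); // 2
```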

TextDecoder

TextDecoder reverses the process of TextEncoder, taking some binary data like a Uint8Array (or Buffer in Node.js) and decoding it into a UTF-16 JavaScript string.

It assumes UTF-8 by default, but supports other encodings as well.
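A sketch of both cases: round-tripping UTF-8 bytes from TextEncoder, then decoding some hand-written UTF-16LE bytes (the byte values are illustrative):

```ts
const encoder = new TextEncoder();
const decoder = new TextDecoder(); // label defaults to "utf-8"

// Round-trip: string -> UTF-8 bytes -> string
const bytes = encoder.encode("Hello, 👋!");
console.log(decoder.decode(bytes)); // "Hello, 👋!"

// Other labels from the Encoding Standard work too, e.g. UTF-16LE
const utf16Decoder = new TextDecoder("utf-16le");
const utf16Bytes = new Uint8Array([72, 0, 105, 0]); // "Hi" as UTF-16LE
console.log(utf16Decoder.decode(utf16Bytes)); // "Hi"
```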

Error handling

By default, TextDecoder replaces invalid byte sequences with the U+FFFD replacement character (�). You can instead configure it to throw an error when it encounters an invalid sequence, which can be important for secure applications that accept arbitrary bytes from user input.
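A sketch of both behaviors, using the { fatal: true } option and a byte (0xFF) that can never appear in valid UTF-8:

```ts
// { fatal: true } makes decode() throw on malformed input
const strictDecoder = new TextDecoder("utf-8", { fatal: true });

const invalidBytes = new Uint8Array([0xff, 0x68, 0x69]);

try {
  strictDecoder.decode(invalidBytes);
} catch (error) {
  console.error("Rejecting malformed UTF-8:", error); // TypeError
}

// The default decoder substitutes U+FFFD instead of throwing
console.log(new TextDecoder().decode(invalidBytes)); // "�hi"
```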

Note that TextEncoder, unlike TextDecoder, only supports UTF-8 by design: modern web standards are UTF-8 first.

