Strings

A string is a sequence of characters.

This lesson expands on the basics of strings and incorporates what you learned about arrays and functions.

Consider the simple program below, which outputs the length of a few strings. Can you predict what the output will be?

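A minimal sketch of the kind of program described above. The variable names are illustrative, and the emojis are written as Unicode escapes so the exact code points are unambiguous.

```typescript
const letter = "A";
const pileOfPoo = "\u{1F4A9}";             // 💩
const family = "\u{1F468}\u200D\u{1F466}"; // 👨‍👦 (man, zero-width joiner, boy)

// length counts UTF-16 code units, not visible characters
console.log(letter.length);    // 1
console.log(pileOfPoo.length); // 2
console.log(family.length);    // 5
```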

The letter A is considered 1 character, but the emoji 💩 is 2 characters, and the emoji 👨‍👦 is actually much longer! If you are surprised by this, you may be asking the following questions:

Retrieving characters

You can retrieve individual characters from a string by index (character position) in the same way you would retrieve elements from an array.

When you retrieve the character at index 0, you are actually retrieving a single UTF-16 code unit at the specified offset.

Accessing an out-of-bounds index like emoji[2] or even emoji[-1] produces undefined and does not throw an error.

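As a rough illustration of index access, assuming emoji holds the single emoji 💩 (the variable name is taken from the text, its value is an assumption):

```typescript
const emoji = "\u{1F4A9}"; // 💩, stored as two UTF-16 code units

console.log(emoji[0] === "\uD83D"); // true: the high surrogate
console.log(emoji[1] === "\uDCA9"); // true: the low surrogate
console.log(emoji[2]);              // undefined (out of bounds)
console.log(emoji[-1]);             // undefined (negative indexes are not supported here)
```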

The at function returns a new string containing the single UTF-16 code unit at the given index, or undefined if the index is out of range.


The argument can be a positive or negative integer - if negative, the index counts backward from the last character.

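A short sketch of at with positive and negative indexes. The sample string and variable names are illustrative.

```typescript
const language = "TypeScript";

const firstChar = language.at(0);   // "T"
const lastChar = language.at(-1);   // "t" (counts backward from the end)
const secondLast = language.at(-2); // "p"
const missing = language.at(100);   // undefined (out of range)

console.log(firstChar, lastChar, secondLast, missing);
```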

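The at function requires your TypeScript compiler target (or the lib option) to be set to ES2022 or higher. If you are targeting older JavaScript versions or simply don't need negative indexing, the charAt function is a close alternative for retrieving individual characters. The one difference is that charAt returns an empty string, rather than undefined, when the index is out of range.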

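For comparison, a small sketch of charAt. The sample string is an assumption. Note that an out-of-range index gives an empty string.

```typescript
const title = "TypeScript";

const firstLetter = title.charAt(0);               // "T"
const lastLetter = title.charAt(title.length - 1); // "t"
const outOfRange = title.charAt(100);              // "" (empty string)
const negative = title.charAt(-1);                 // "" (no negative indexing)

console.log(firstLetter, lastLetter, outOfRange === "", negative === "");
```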

The at and charAt functions always return a new string with the single UTF-16 code unit at the given index, so they may return lone surrogates.

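One way to see a lone surrogate, sketched with an illustrative emoji string:

```typescript
const poo = "\u{1F4A9}"; // 💩

const highSurrogate = poo.charAt(0);
const hexCodeUnit = highSurrogate.charCodeAt(0).toString(16);

console.log(highSurrogate.length);  // 1: half of the emoji, not the whole thing
console.log(hexCodeUnit);           // "d83d"
console.log(highSurrogate === poo); // false
```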

To get the full Unicode code point at a given index, use codePointAt, and convert a code point back into a string with String.fromCodePoint.

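A minimal sketch of a round trip through a code point. The names are illustrative, and the undefined check covers the case where the index is out of range.

```typescript
const symbol = "\u{1F4A9}"; // 💩

// codePointAt reads the whole surrogate pair starting at index 0
const codePoint = symbol.codePointAt(0);

// String.fromCodePoint turns the number back into a string
const rebuilt = codePoint === undefined ? "" : String.fromCodePoint(codePoint);

console.log(codePoint);          // 128169 (0x1F4A9)
console.log(rebuilt === symbol); // true
```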

Retrieving character subsequences

You can retrieve a subsequence of the string's characters with slice, which also supports negative indexes.

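A few illustrative slice calls on a sample string (the string is an assumption):

```typescript
const framework = "TypeScript";

const start = framework.slice(0, 4);    // "Type" (end index is exclusive)
const rest = framework.slice(4);        // "Script" (to the end of the string)
const fromEnd = framework.slice(-6);    // "Script" (last six characters)
const middle = framework.slice(-6, -3); // "Scr"

console.log(start, rest, fromEnd, middle);
```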

While at is the most concise way to retrieve a single character of a string, the same string can also be retrieved with slice and charAt.

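As a sketch, here are three ways to read the last character of an illustrative string:

```typescript
const keyword = "TypeScript";

const viaAt = keyword.at(-1);                         // "t"
const viaSlice = keyword.slice(-1);                   // "t"
const viaCharAt = keyword.charAt(keyword.length - 1); // "t"

console.log(viaAt === viaSlice && viaSlice === viaCharAt); // true
```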

Padding a string

The padStart and padEnd functions repeatedly add characters to the start or end of a string until it reaches a certain length. These functions are especially useful when displaying information in tables or aligning values in console output.

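A small sketch of both functions, using made-up values:

```typescript
const paddedId = "7".padStart(3, "0");      // "007" (padding added at the start)
const dottedLabel = "Total".padEnd(8, "."); // "Total..." (padding added at the end)

console.log(paddedId);
console.log(dottedLabel);
```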

If the character to append is not specified, it is assumed to be a space character. The resulting string will be the exact specified length, which means a padding string that does not evenly fit the remaining space will be truncated.

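To illustrate the default padding and the truncation behavior, a sketch with sample values:

```typescript
const spacePadded = "5".padStart(3);           // "  5" (defaults to spaces)
const truncatedStart = "abc".padStart(7, "xyz"); // "xyzxabc" (second "xyz" is cut short)
const truncatedEnd = "abc".padEnd(5, "xyz");     // "abcxy"

console.log(JSON.stringify(spacePadded), truncatedStart, truncatedEnd);
```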

Notice that if the original string is already longer than the target length, it is returned unchanged and is not truncated.

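For instance, with an illustrative string that is longer than the requested length:

```typescript
const longWord = "TypeScript";           // 10 characters
const unchanged = longWord.padStart(4, "*"); // still "TypeScript"

console.log(unchanged === longWord); // true
```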

Padding is often useful when converting numbers to a fixed width. Consider the function below that converts an RGB value to a hex code, which would otherwise fail to output a correct hex code if any of red, green, or blue is less than 16.

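One possible shape for such a function. The name rgbToHex and its helper are hypothetical, and the sketch assumes each channel is an integer from 0 to 255.

```typescript
// Hypothetical helper: converts three 0-255 channel values to a "#rrggbb" code
function rgbToHex(red: number, green: number, blue: number): string {
  // Without padStart, a value like 10 would become "a" instead of "0a"
  const toHex = (channel: number): string => channel.toString(16).padStart(2, "0");
  return `#${toHex(red)}${toHex(green)}${toHex(blue)}`;
}

console.log(rgbToHex(255, 0, 10));    // "#ff000a"
console.log(rgbToHex(18, 52, 86));    // "#123456"
```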

Repeating a string

A string's repeat function produces a string which contains the specified number of copies of the original string.

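A brief sketch of repeat with sample values:

```typescript
const tripled = "ab".repeat(3); // "ababab"
const divider = "=".repeat(5);  // "====="
const nothing = "x".repeat(0);  // "" (zero copies gives an empty string)

console.log(tripled, divider, JSON.stringify(nothing));
```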

Splitting a string

You can split a string into an array with the split function.

The first argument to split is the separator to split by, which can be a string or a regular expression.

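A minimal sketch of split, using made-up input strings:

```typescript
const colors = "red,green,blue".split(","); // ["red", "green", "blue"]
const letters = "abc".split("");            // ["a", "b", "c"] (one entry per code unit)

console.log(colors);
console.log(letters);
```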

The second argument to split allows you to limit the number of split elements that are returned. This is an easy way to efficiently return the first few words or lines from a long string.

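For example, taking only the first two words of an illustrative sentence:

```typescript
const proverb = "the quick brown fox jumps";
const firstTwoWords = proverb.split(" ", 2); // ["the", "quick"]

console.log(firstTwoWords);
```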

Regular expressions

A regular expression is a pattern for matching a string.

Regular expressions are an extremely useful mechanism for extracting information from strings and manipulating their contents. There is an entire lesson on regular expressions - this is just a condensed summary for the purpose of understanding strings.

Creating a regular expression

You can specify a regular expression by writing a pattern between two slash (/) characters, or by creating a RegExp object.

These lessons tend to use the / notation to define regular expressions, but it is worth noticing that all regular expressions are simply RegExp objects regardless of how they are created. You can use the test function of a RegExp object to check if a string matches the pattern.

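A short sketch showing both ways to create a pattern. The pattern and test strings are illustrative.

```typescript
const literalPattern = /cat/;            // slash notation
const objectPattern = new RegExp("cat"); // RegExp constructor

console.log(literalPattern instanceof RegExp);   // true
console.log(literalPattern.test("concatenate")); // true ("cat" appears inside the word)
console.log(objectPattern.test("dog"));          // false
```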

Matching a string

The string match function searches a string with a regular expression, and returns an array of matches. If there were no matches, it returns null.

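As a sketch, matching a pattern against an illustrative sentence:

```typescript
const sentence = "The rain in Spain";

const found = sentence.match(/ain/);     // first match only (no g flag)
const notFound = sentence.match(/snow/); // null

console.log(found?.[0]);   // "ain"
console.log(found?.index); // 5 (position of the first match)
console.log(notFound);     // null
```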

After the ending slash character (/) of a regular expression, one or more flags can be added to modify the pattern's behavior. The g flag returns all matches in the string, not just the first one.

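The same illustrative sentence with the g flag returns every match:

```typescript
const weather = "The rain in Spain";
const allMatches = weather.match(/ain/g) ?? []; // fall back to [] when there is no match

console.log(allMatches); // ["ain", "ain"]
```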

The i flag performs a case-insensitive match, and can be combined with the g flag.

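A sketch combining the i and g flags on a made-up greeting:

```typescript
const greeting = "Hello hello HELLO";
const anyCase = greeting.match(/hello/gi) ?? [];

console.log(anyCase); // ["Hello", "hello", "HELLO"]
```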

Special regex characters

There are special regex characters like \s, \d, and \w which represent character groups.

  • \w - matches any word character (a - z, A - Z, 0 - 9, and _)
  • \d - matches any digit (0 - 9)
  • \s - matches any whitespace character (space, \n, \t)
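
The character groups listed above can be tried out with a sketch like this one. The input strings are illustrative.

```typescript
const order = "Order 66 shipped";

const digits = order.match(/\d/g) ?? [];       // ["6", "6"]
const spaces = order.match(/\s/g) ?? [];       // [" ", " "]
const wordChars = "a_1!".match(/\w/g) ?? [];   // ["a", "_", "1"] (the "!" is not a word character)

console.log(digits, spaces.length, wordChars);
```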

The \b sequence matches a word boundary, which is the position between a word character and a non-word character.

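A small sketch of \b, using an illustrative pattern that only matches "cat" as a whole word:

```typescript
const wholeWord = /\bcat\b/;

console.log(wholeWord.test("the cat sat")); // true
console.log(wholeWord.test("concatenate")); // false ("cat" is inside a longer word)
```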

Searching a string

The search() method searches a string for a match to a regular expression.

It returns the index of the first match, or -1 if no match is found.
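
A minimal sketch of search on an illustrative sentence:

```typescript
const forecast = "The rain in Spain";

const firstIndex = forecast.search(/ain/); // 5
const noIndex = forecast.search(/snow/);   // -1

console.log(firstIndex, noIndex);
```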

Replacing in a string

The string replace function can replace parts of a string that match a regular expression with a new string.

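As a sketch, replacing dashes in a made-up date string, with and without the g flag:

```typescript
const isoDate = "2024-01-15";

const firstOnly = isoDate.replace(/-/, "/");  // "2024/01-15" (only the first match)
const everyDash = isoDate.replace(/-/g, "/"); // "2024/01/15"

console.log(firstOnly, everyDash);
console.log(isoDate); // "2024-01-15" (the original string is not modified)
```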

The special regex character + matches the preceding pattern one or more times.

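For example, + can collapse runs of whitespace into a single space. The input is illustrative.

```typescript
const messy = "a   b    c";
const tidy = messy.replace(/\s+/g, " "); // each run of whitespace becomes one space

console.log(tidy); // "a b c"
```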

The replacement can also be a function, which receives the content of each match as its first argument and returns the replacement text.

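A sketch of a replacement function that capitalizes the first letter of each word. The input string is an assumption.

```typescript
const lowercase = "hello world";

// \b\w matches the first word character of each word
const capitalized = lowercase.replace(/\b\w/g, (match) => match.toUpperCase());

console.log(capitalized); // "Hello World"
```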

Comparing strings

The localeCompare function returns a number that indicates how a string should be ordered in relation to another string.

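A brief sketch of the three possible outcomes. Only the sign of the result is meaningful, so the example checks the sign rather than an exact value.

```typescript
const before = "apple".localeCompare("banana"); // negative: "apple" sorts first
const after = "banana".localeCompare("apple");  // positive: "banana" sorts later
const same = "apple".localeCompare("apple");    // 0: equal

console.log(before < 0, after > 0, same === 0); // true true true
```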

This is commonly used in an array sort function to arrange a list of strings into alphabetical order.

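A minimal sketch of sorting with localeCompare. The list is made up, and slice is used to copy the array first because sort reorders an array in place.

```typescript
const fruits = ["pear", "apple", "banana"];

const sortedFruits = fruits.slice().sort((a, b) => a.localeCompare(b));

console.log(sortedFruits); // ["apple", "banana", "pear"]
```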

Searching a string

The includes function performs a case-sensitive search to determine whether a certain string appears within the string's character sequence.

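For instance, with an illustrative phrase:

```typescript
const phrase = "The quick brown fox";

const hasQuick = phrase.includes("quick");        // true
const hasCapitalQuick = phrase.includes("Quick"); // false (the search is case-sensitive)

console.log(hasQuick, hasCapitalQuick);
```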

If you need to perform a case insensitive search, convert the sentence and search query to the same case.

You can also provide an index to start searching from.

The indexOf function searches a string and returns the index of the first occurrence of the given string.

The lastIndexOf function returns the index of the last occurrence of the given string.
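
The search options above can be sketched together as follows. The sentence and query are illustrative.

```typescript
const story = "the cat saw the other cat";
const query = "CAT";

// Case-insensitive search: convert both sides to the same case
const foundAnyCase = story.toLowerCase().includes(query.toLowerCase()); // true

// Start searching from a given index
const catAfterFive = story.includes("cat", 5); // true (second "cat" at index 22)
const sawAfterNine = story.includes("saw", 9); // false ("saw" starts at index 8)

const firstCat = story.indexOf("cat");    // 4
const lastCat = story.lastIndexOf("cat"); // 22
const noDog = story.indexOf("dog");       // -1 when the string is not found

console.log(foundAnyCase, catAfterFive, sawAfterNine, firstCat, lastCat, noDog);
```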

Concatenating strings

The concat method creates a new string by concatenating the given strings to the target string. This is equivalent to using the + operator to join strings in order.

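A short sketch comparing concat with the + operator, using sample values:

```typescript
const hello = "Hello";

const viaConcat = hello.concat(", ", "world"); // "Hello, world"
const viaPlus = hello + ", " + "world";        // "Hello, world"

console.log(viaConcat === viaPlus); // true
console.log(hello);                 // "Hello" (unchanged)
```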

Notice that the original string is not modified. In JavaScript, if the arguments to concat are not strings, they are converted to strings before joining. TypeScript enforces a type of string on the arguments to concat.

Literal types

The generic string type refers to any sequence of characters. TypeScript also allows us to define types that refer to specific strings.
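
As a rough illustration, a union of string literal types restricts a value to a fixed set of strings. The Direction type and move function are hypothetical names.

```typescript
// Hypothetical type: only these four strings are allowed
type Direction = "north" | "south" | "east" | "west";

function move(direction: Direction): string {
  return `Moving ${direction}`;
}

const moveMessage = move("north"); // "Moving north"
// move("up"); // compile error: "up" is not assignable to type Direction

console.log(moveMessage);
```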

Encoding strings

JavaScript strings are encoded into a format known as UTF-16.

This format is great for string manipulation, but not ideal when interfacing with binary protocols, file systems, or Web APIs expecting the UTF-8 format. The built-in TextEncoder and TextDecoder provide a standard way to convert between JavaScript strings and raw binary data that uses standardized encodings like UTF-8.

TextEncoder

TextEncoder takes a JavaScript string and encodes it into a Uint8Array of bytes, always using UTF-8.

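A minimal sketch of encoding a short ASCII string. It assumes a runtime where TextEncoder is available as a global, such as a browser or a recent version of Node.js.

```typescript
const encoder = new TextEncoder();
const asciiBytes = encoder.encode("Hi");

console.log(asciiBytes instanceof Uint8Array); // true
console.log(Array.from(asciiBytes));           // [72, 105] (one byte per ASCII character)
```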

Each item in the array is a byte (0 - 255) representing part of the UTF-8 encoding of the string. This is especially useful when working with emojis and other special characters that may require more than 2 bytes to represent.

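To illustrate the difference between string length and byte count, a sketch with two sample characters (again assuming a global TextEncoder):

```typescript
const utf8 = new TextEncoder();

const accented = "\u00E9";    // é
const pooEmoji = "\u{1F4A9}"; // 💩

const accentedBytes = Array.from(utf8.encode(accented)); // [195, 169]
const emojiBytes = Array.from(utf8.encode(pooEmoji));    // [240, 159, 146, 169]

console.log(accented.length, accentedBytes.length); // 1 code unit, 2 bytes
console.log(pooEmoji.length, emojiBytes.length);    // 2 code units, 4 bytes
```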

TextDecoder

TextDecoder reverses the process of TextEncoder, taking some binary data like a Uint8Array (or Buffer in Node.js) and decoding it into a UTF-16 JavaScript string.

It assumes UTF-8 by default, but supports other encodings as well.
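
A small sketch of decoding, assuming a runtime with a global TextDecoder that supports the utf-16le label (browsers and Node.js do). The byte values are illustrative.

```typescript
const utf8Decoder = new TextDecoder(); // UTF-8 by default

const greetingText = utf8Decoder.decode(new Uint8Array([72, 105]));          // "Hi"
const emojiText = utf8Decoder.decode(new Uint8Array([240, 159, 146, 169])); // "💩"

// A different encoding: UTF-16 little endian uses two bytes per code unit
const utf16Decoder = new TextDecoder("utf-16le");
const utf16Text = utf16Decoder.decode(new Uint8Array([72, 0, 105, 0]));      // "Hi"

console.log(greetingText, emojiText, utf16Text);
```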

Error handling

You can configure TextDecoder to throw an error when an invalid byte sequence is encountered. This can be important for secure applications where arbitrary strings are being accepted from user input.
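
A sketch of the two behaviors, using a single byte (0xFF) that is never valid in UTF-8. By default the decoder substitutes the replacement character U+FFFD, while the fatal option makes decode throw.

```typescript
const invalidBytes = new Uint8Array([0xff]);

// Default: invalid bytes are replaced with U+FFFD
const lenientText = new TextDecoder().decode(invalidBytes);

// With fatal: true, decoding invalid input throws (a TypeError per the Encoding Standard)
const strictDecoder = new TextDecoder("utf-8", { fatal: true });
let threw = false;
try {
  strictDecoder.decode(invalidBytes);
} catch {
  threw = true;
}

console.log(lenientText === "\uFFFD"); // true
console.log(threw);                    // true
```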

TextEncoder only supports UTF-8 by design, since modern web standards are UTF-8 first.

Working with ArrayBuffer

String templates

String template literals make it easy to create strings that include expressions or variables.

A template literal is enclosed in backticks (`) instead of single or double quotes. This allows you to embed expressions inside the string using ${...}.

You can embed arbitrary expressions in a template literal.
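
A minimal sketch of a template literal with an embedded variable and an embedded calculation. All names and values are made up.

```typescript
const buyer = "Ada";
const itemCount = 3;
const unitPrice = 9.5;

// Any expression can go inside ${...}, including arithmetic and method calls
const summary = `${buyer} bought ${itemCount} items for $${(itemCount * unitPrice).toFixed(2)}`;

console.log(summary); // "Ada bought 3 items for $28.50"
```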

Tagged template literals

A tagged template allows you to process a template literal with a function.

The function receives the string, along with any embedded expressions, as arguments.


The first argument is an array of strings with each fragment of literal text from the template. The type of this argument is TemplateStringsArray, and not string[].

The subsequent arguments are the values to insert between the literal text fragments.

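One way a tag function might look. The highlight function is a hypothetical example that wraps each embedded value in square brackets.

```typescript
// Hypothetical tag function: strings holds the literal fragments,
// values holds the results of the embedded expressions
function highlight(strings: TemplateStringsArray, ...values: unknown[]): string {
  return strings.reduce((result, fragment, i) => {
    // There is always one more fragment than there are values
    const value = i < values.length ? `[${String(values[i])}]` : "";
    return result + fragment + value;
  }, "");
}

const recipient = "Ada";
const unread = 3;
const notice = highlight`Hello ${recipient}, you have ${unread} messages`;

console.log(notice); // "Hello [Ada], you have [3] messages"
```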

You can also write a function that returns a tagging function, which lets you configure the tag before applying it to a template.

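A sketch of this idea, assuming it refers to a function that builds and returns a tag function. The wrapWith name and its parameters are hypothetical.

```typescript
// Hypothetical factory: returns a tag function that wraps each value in the given markers
function wrapWith(open: string, close: string) {
  return (strings: TemplateStringsArray, ...values: unknown[]): string =>
    strings.reduce(
      (result, fragment, i) =>
        result + fragment + (i < values.length ? `${open}${String(values[i])}${close}` : ""),
      ""
    );
}

const bold = wrapWith("<b>", "</b>");
const html = bold`Total: ${42}`;

console.log(html); // "Total: <b>42</b>"
```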
