URL Encoding Explained: Percent-Encoding for Web Developers

What Is URL Encoding?

URL encoding, also known as percent-encoding, converts characters into a form that can be transmitted safely in a URL. URLs may contain only a limited set of ASCII characters, so any character outside that set is written as a percent sign followed by two hexadecimal digits that give the value of one byte. For an ASCII character that byte is the character’s ASCII code; for any other character, each byte of its UTF-8 encoding is written this way in turn.

For example, a space becomes %20, an ampersand (&) becomes %26, and an equals sign (=) becomes %3D. Without encoding, the ampersand and equals sign would be read as URL structure rather than data, and a raw space is not valid in a URL at all.
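
The mapping above can be reproduced with Python's standard library. This is a minimal sketch rather than a definitive recipe; passing safe="" to urllib.parse.quote() is a choice made here so that no reserved character is left unencoded.

```python
from urllib.parse import quote

# safe="" asks quote() to leave no reserved character unencoded.
encoded = {ch: quote(ch, safe="") for ch in (" ", "&", "=")}

for ch, code in encoded.items():
    print(repr(ch), "->", code)
# ' ' -> %20
# '&' -> %26
# '=' -> %3D
```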

Why URL Encoding Matters

URLs use certain characters as delimiters: slashes separate path segments, question marks begin query strings, ampersands separate parameters, and equals signs pair keys with values. If your data contains any of these characters, the browser or server cannot distinguish between data and structure without encoding.

Imagine a search query for “cats & dogs.” Without encoding, the URL would be /search?q=cats & dogs, where the ampersand would be interpreted as a parameter separator, breaking the query. Properly encoded, it becomes /search?q=cats%20%26%20dogs, preserving the intended meaning.
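
The effect of the missing encoding can be seen by parsing both versions of the query string. The sketch below uses Python's urllib.parse as a stand-in for whatever parser a real server uses, so the exact behaviour of other stacks may differ.

```python
from urllib.parse import parse_qs, quote, urlsplit

raw_url = "/search?q=cats & dogs"
encoded_url = "/search?q=" + quote("cats & dogs", safe="")

# keep_blank_values=True keeps the stray " dogs" key visible instead of dropping it.
raw_params = parse_qs(urlsplit(raw_url).query, keep_blank_values=True)
encoded_params = parse_qs(urlsplit(encoded_url).query)

print(encoded_url)     # /search?q=cats%20%26%20dogs
print(raw_params)      # {'q': ['cats '], ' dogs': ['']}
print(encoded_params)  # {'q': ['cats & dogs']}
```

In the unencoded case the parser sees two parameters, and the search term is truncated to "cats ".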

Characters That Need Encoding

Reserved characters have special meaning in URLs and must be encoded when used as data: : / ? # [ ] @ ! $ & ' ( ) * + , ; =

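Unsafe characters can cause problems because systems handle them inconsistently: spaces, double quotes, angle brackets, curly braces, pipes, backslashes, carets, and backticks. The percent sign itself must also be encoded as %25 when it appears as data, since it introduces every escape sequence. RFC 1738 also listed the tilde as unsafe, but the current standard, RFC 3986, treats it as unreserved.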

Unreserved characters never need encoding: A-Z, a-z, 0-9, hyphens, underscores, periods, and tildes. These are safe in any position within a URL.
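
A quick way to check these character classes is to run them through an encoder. The sketch below assumes Python 3.7 or later, where urllib.parse.quote() follows RFC 3986 and leaves the tilde alone; the constant names are illustrative only.

```python
from urllib.parse import quote

RESERVED = ":/?#[]@!$&'()*+,;="
UNRESERVED = "AZaz09-_.~"  # a sample from each unreserved group

# With safe="", every reserved character is percent-encoded...
encoded_reserved = quote(RESERVED, safe="")
# ...while unreserved characters pass through unchanged.
encoded_unreserved = quote(UNRESERVED, safe="")

print(encoded_reserved)
print(encoded_unreserved)  # AZaz09-_.~
```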

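Spaces are a special case. In query strings that use form encoding (application/x-www-form-urlencoded), a space may be written as either %20 or +. In path segments only %20 is correct, because a + there is a literal plus sign. Using %20 everywhere is the consistent choice, and a literal plus sign in query data should be sent as %2B so that it is not decoded as a space.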
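
The two space conventions map onto two pairs of functions in Python's urllib.parse. The example string is arbitrary; the point of this sketch is that the decoder has to match the convention the encoder used.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "1 + 1"

as_percent = quote(text, safe="")  # spaces become %20
as_plus = quote_plus(text)         # spaces become +, form-encoding style

print(as_percent)  # 1%20%2B%201
print(as_plus)     # 1+%2B+1

# Decoding with the matching function restores the original text.
print(unquote(as_percent))    # 1 + 1
print(unquote_plus(as_plus))  # 1 + 1

# Decoding form-style output with plain unquote() leaves the + signs in place.
print(unquote(as_plus))       # 1+++1
```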

How Encoding Works in Practice

Most programming languages provide built-in URL encoding functions. JavaScript offers encodeURIComponent() for encoding individual values and encodeURI() for encoding full URLs. Python has urllib.parse.quote(), along with quote_plus() and urlencode() for form-style query strings. PHP provides urlencode(), which writes spaces as +, and rawurlencode(), which writes them as %20.

The distinction between encoding a full URL and encoding a component matters. If you encode a complete URL, you do not want to encode the slashes, colons, and question marks that define its structure. If you encode a parameter value, you want to encode everything that could conflict with URL syntax.
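
Python has no direct counterpart to JavaScript's encodeURI(), so the sketch below approximates the two modes with the safe argument of quote(). The helper names encode_full_url and encode_component are hypothetical, as is the example.com address.

```python
from urllib.parse import quote

# Characters that define URL structure; left alone when encoding a whole URL.
STRUCTURAL = ":/?#[]@!$&'()*+,;="

def encode_full_url(url: str) -> str:
    """Encode a complete URL while keeping its delimiters intact."""
    return quote(url, safe=STRUCTURAL)

def encode_component(value: str) -> str:
    """Encode a single value so nothing in it can act as a delimiter."""
    return quote(value, safe="")

print(encode_full_url("https://example.com/my docs/report?name=a b"))
# https://example.com/my%20docs/report?name=a%20b

print(encode_component("rock&roll/live?"))
# rock%26roll%2Flive%3F
```

Because the percent sign is not in the safe set, running encode_full_url() on a URL that is already encoded would encode it a second time, so it should be applied only to raw input.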

API developers must ensure that query parameters, path variables, and form data are properly encoded before transmission. Missing encoding is a common source of bugs, security vulnerabilities (injection attacks), and broken links.
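
One way to avoid hand-assembled query strings is to let the library encode each piece. The following is a sketch under stated assumptions: build_url is a hypothetical helper, api.example.com is a placeholder host, and quote_via=quote is chosen so that spaces come out as %20 rather than +.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

def build_url(base: str, segment: str, params: dict) -> str:
    """Append one encoded path segment and an encoded query string to base."""
    path = quote(segment, safe="")             # a "/" in the data must not split the path
    query = urlencode(params, quote_via=quote)  # encodes every key and value
    return f"{base}/{path}?{query}"

url = build_url(
    "https://api.example.com/users",
    "ann/lee",
    {"q": "cats & dogs", "next": "/home?tab=1"},
)
print(url)
# https://api.example.com/users/ann%2Flee?q=cats%20%26%20dogs&next=%2Fhome%3Ftab%3D1

# Parsing the query recovers the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded)  # {'q': ['cats & dogs'], 'next': ['/home?tab=1']}
```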

Unicode and International Characters

Non-ASCII characters such as accented letters, Chinese characters, or emoji are encoded in two steps. First, the character is converted to its UTF-8 byte representation; then each byte is percent-encoded. The character é (e with an acute accent) becomes %C3%A9, because it occupies two bytes in UTF-8. Most Chinese characters produce three percent-encoded bytes, and most emoji produce four.
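
The two steps can be written out by hand and compared with the library result. In this sketch percent_encode_char is a hypothetical helper, and the characters are given as escape sequences so the example does not depend on how the file itself is saved.

```python
from urllib.parse import quote

def percent_encode_char(ch: str) -> str:
    """Step 1: encode to UTF-8 bytes. Step 2: write each byte as %XX."""
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))

e_acute = "\u00e9"      # e with an acute accent
cjk = "\u4e2d"          # a common Chinese character
emoji = "\U0001F600"    # grinning face emoji

print(percent_encode_char(e_acute))  # %C3%A9
print(percent_encode_char(cjk))      # %E4%B8%AD
print(percent_encode_char(emoji))    # %F0%9F%98%80

# urllib.parse.quote() performs the same two steps internally.
print(quote(e_acute) == percent_encode_char(e_acute))  # True
```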

Internationalized Domain Names (IDNs) use a separate system called Punycode for the domain portion of URLs, while the path and query use standard percent-encoding.
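
The split between the two systems can be sketched with Python's built-in "idna" codec for the host and quote() for the path. The host name and path here are made up for illustration. The built-in codec implements the older IDNA 2003 rules, so production code may prefer a dedicated IDNA library.

```python
from urllib.parse import quote

host = "b\u00fccher.example"  # "bücher.example", written with an escape
path = "/caf\u00e9 menu"      # "/café menu"

# Domain portion: Punycode, applied label by label.
ascii_host = host.encode("idna").decode("ascii")

# Path portion: UTF-8 bytes, percent-encoded ("/" is kept by default).
ascii_path = quote(path)

ascii_url = "https://" + ascii_host + ascii_path
print(ascii_url)  # https://xn--bcher-kva.example/caf%C3%A9%20menu
```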

Use the URL encoder/decoder on CalcHub to encode and decode URLs instantly, or explore our text tools for other encoding utilities.
