URI Component Encoder

Encode Strings as URI Components Online

Encode any string for safe use in URLs with this free online URI encoding tool. The encodeURIComponent function converts reserved characters, spaces, and non-ASCII text into percent-encoded sequences that are safe for inclusion in URL query parameters, path segments, and fragment identifiers. Whether you are building API requests, constructing query strings, or preparing data for HTTP transmission, you get properly encoded output in real time.

What is URI Component Encoding

URI component encoding, commonly known as URL encoding or percent encoding, is the process of converting characters that have special meaning in URLs or that fall outside the allowed ASCII character set into a safe representation. Each unsafe character is replaced by a percent sign followed by two hexadecimal digits representing the character's byte value in UTF-8 encoding.

The encoding is defined by RFC 3986, which designates a small set of unreserved characters that never need encoding: uppercase and lowercase ASCII letters (A-Z, a-z), digits (0-9), hyphen, period, underscore, and tilde. Reserved characters such as the colon, slash, question mark, ampersand, and equals sign act as delimiters in a URI, so they must be percent-encoded whenever they appear as data inside a component. Spaces and non-ASCII characters are never allowed literally and must always be percent-encoded in query parameter values, path segments, and fragment identifiers.

For example, the space character becomes %20, the ampersand becomes %26, the equals sign becomes %3D, and the forward slash becomes %2F. Non-ASCII characters like accented letters or CJK ideographs are first converted to their UTF-8 byte sequences, then each byte is individually percent-encoded. This ensures that any Unicode text can be safely transmitted through URL-based channels that only support ASCII.

How URI Component Encoding Works

The encoding algorithm examines each character in the input string. Characters that belong to the unreserved set (A-Z, a-z, 0-9, hyphen, period, underscore, tilde) pass through unchanged. All other characters are converted to their UTF-8 byte representation, and each byte is encoded as a percent sign followed by two uppercase hexadecimal digits. JavaScript's encodeURIComponent follows this scheme but also leaves five more characters unencoded: the exclamation mark, single quote, opening and closing parentheses, and asterisk.

For ASCII characters that are encoded, the process is straightforward: the character's single-byte ASCII value is converted to hex. The space character (byte value 0x20) becomes %20, the number sign (0x23) becomes %23, and the at sign (0x40) becomes %40. For non-ASCII characters, UTF-8 produces multi-byte sequences that result in multiple percent-encoded triplets per character.
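The steps above can be sketched as a small hand-written encoder. This is only an illustration of the strict RFC 3986 rule, not how any particular engine implements encodeURIComponent; the name percentEncodeStrict is hypothetical, and TextEncoder is assumed to be available (modern browsers and Node.js).

```javascript
// Hypothetical helper: percent-encode every byte outside the RFC 3986 unreserved set.
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

function percentEncodeStrict(input) {
  const bytes = new TextEncoder().encode(input); // UTF-8 bytes of the input
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && UNRESERVED.test(ch)) {
      out += ch; // unreserved ASCII passes through unchanged
    } else {
      // everything else becomes % plus two uppercase hex digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncodeStrict("hello world & more")); // hello%20world%20%26%20more
console.log(percentEncodeStrict("caf\u00e9"));          // caf%C3%A9
```

Unlike encodeURIComponent, this sketch also encodes ! ' ( ) and *, and it does not throw on malformed surrogate pairs.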

It is important to distinguish between encodeURIComponent and encodeURI in JavaScript. The encodeURIComponent function encodes every character except the unreserved ones and the five characters ! ' ( ) *, making it suitable for encoding individual parameter values. The encodeURI function additionally preserves characters that have structural meaning in complete URIs (like colons, slashes, and question marks), making it suitable for encoding full URLs. For decoding percent-encoded strings back to readable text, our UTF-8 percent decoder reverses the process. When working with full UTF-8 encoding of text, our UTF-8 percent encoder handles the same transformation. For encoding binary data into ASCII-safe formats, Base64 encoding for data offers an alternative approach.
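A quick side-by-side comparison may make the difference concrete. The URL is a made-up example.

```javascript
const url = "https://example.com/search?q=a b&lang=en";

// encodeURI keeps structural characters, so the result is still a usable URL.
const wholeUrl = encodeURI(url);
// encodeURIComponent escapes them, so the result is only safe as a single value.
const asValue = encodeURIComponent(url);

console.log(wholeUrl); // https://example.com/search?q=a%20b&lang=en
console.log(asValue);  // https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Da%20b%26lang%3Den
```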

Syntax Comparison

Here is how to perform URI component encoding in popular programming languages:

JavaScript: Use encodeURIComponent("hello world & more") which returns "hello%20world%20%26%20more". This is the most commonly used function for encoding URL parameter values in web development.

Python: Use urllib.parse.quote("hello world & more", safe="") which returns "hello%20world%20%26%20more". The safe parameter specifies characters that should not be encoded; an empty string means encode everything except unreserved characters.

PHP: Use rawurlencode("hello world & more") which returns "hello%20world%20%26%20more". Note that urlencode() encodes spaces as plus signs instead of %20, following the older application/x-www-form-urlencoded convention.

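Java: Use URLEncoder.encode("hello world & more", StandardCharsets.UTF_8) (available since Java 10), which returns "hello+world+%26+more" because it encodes spaces as plus signs. For output closer to RFC 3986, replace the plus signs with %20 after encoding; a literal plus in the input is already encoded as %2B, so the replacement is safe. Note that URLEncoder also leaves the asterisk unencoded and encodes the tilde as %7E.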

Common Use Cases

Query Parameter Construction: When building URLs with dynamic query parameters, each parameter value must be URI-encoded to prevent special characters from breaking the URL structure. For example, a search query containing ampersands or equals signs would be misinterpreted as parameter delimiters without proper encoding.

Form Data Submission: HTML forms using the GET method encode field values in the URL query string. When you build the same query string yourself in a script, encodeURIComponent ensures that user input containing special characters, spaces, or non-ASCII text is safely transmitted without corrupting the URL structure or losing data.

API Request Building: REST API calls frequently require encoded parameter values in URLs. When passing JSON strings, file paths, or user-generated content as URL parameters, proper encoding prevents parsing errors and ensures the API receives the intended data.

Redirect URL Construction: OAuth flows and single sign-on systems pass callback URLs as parameters within other URLs. The inner URL must be fully encoded so that its colons, slashes, and query parameters are not confused with the outer URL's structure.

Cookie Value Encoding: Cookie values that contain special characters like semicolons, commas, or spaces must be percent-encoded before being set in HTTP headers. URI component encoding ensures these values are stored and retrieved correctly across different browsers and servers.
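As an illustration of the redirect case in the list above, the sketch below nests one URL inside another and then recovers it. The hosts and the redirect_uri parameter name are assumptions chosen for the example; real OAuth providers define their own parameter names.

```javascript
// Hypothetical endpoints; substitute your own.
const callback = "https://app.example.com/callback?state=xyz&next=/home";
const loginUrl =
  "https://auth.example.com/login?redirect_uri=" + encodeURIComponent(callback);

console.log(loginUrl);

// The receiving side recovers the inner URL intact.
const recovered = new URL(loginUrl).searchParams.get("redirect_uri");
console.log(recovered === callback); // true
```

Without the encoding step, the inner "&next=/home" would be parsed as a second parameter of the outer URL.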

URI Encode Examples

Here are practical examples demonstrating encodeURIComponent behavior with different input types:

Example 1 - Spaces and Punctuation: The input "hello world!" encodes to "hello%20world!". The space becomes %20, while the exclamation mark stays as it is because encodeURIComponent does not encode it. A strict RFC 3986 encoder would produce "hello%20world%21" instead.

Example 2 - Query String Value: The input "name=John&age=30" encodes to "name%3DJohn%26age%3D30". The equals signs become %3D and the ampersand becomes %26, preventing these characters from being interpreted as query parameter delimiters when the entire string is used as a single parameter value.

Example 3 - URL as Parameter: The input "https://example.com/path?q=test" encodes to "https%3A%2F%2Fexample.com%2Fpath%3Fq%3Dtest". Every structural URL character (colons, slashes, question mark, equals) is encoded, making the entire URL safe to embed as a parameter value within another URL.

Example 4 - International Text: The input "café" encodes to "caf%C3%A9". The accented character requires two UTF-8 bytes (C3 and A9), each producing a percent-encoded triplet. Unreserved ASCII characters pass through unchanged while non-ASCII characters are fully encoded.

Example 5 - JSON Data: The input '{"key":"value"}' encodes to "%7B%22key%22%3A%22value%22%7D". Curly braces, double quotes, and colons are all encoded, making the JSON string safe for inclusion in a URL query parameter.

Frequently Asked Questions

What is the difference between encodeURIComponent and encodeURI?

encodeURIComponent encodes all characters except letters, digits, and the marks - _ . ~ ! ' ( ) *, making it suitable for encoding individual URI component values like query parameters. encodeURI leaves those same characters alone and also preserves characters that have structural meaning in complete URIs: colons, slashes, question marks, hash signs, ampersands, equals signs, plus signs, semicolons, commas, at signs, and dollar signs. Use encodeURIComponent for parameter values and encodeURI for complete URLs where you want to preserve the URL structure.

Should I encode spaces as %20 or plus signs?

RFC 3986 specifies %20 as the correct encoding for spaces in URIs. The plus sign convention comes from the older application/x-www-form-urlencoded format used by HTML forms. In modern web development, %20 is preferred for URL paths and query parameters. JavaScript's encodeURIComponent uses %20, while Java's URLEncoder uses plus signs. If you need plus signs for form compatibility, apply that conversion separately after standard percent encoding.
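The two conventions can be compared directly in JavaScript, since URLSearchParams serializes with the form convention. The toFormEncoding helper is a hypothetical name, and it handles only the space difference; the form serializer also differs from encodeURIComponent on a few punctuation characters.

```javascript
const value = "hello world+more";

const percentStyle = encodeURIComponent(value);
const formStyle = new URLSearchParams({ q: value }).toString();

console.log(percentStyle); // hello%20world%2Bmore
console.log(formStyle);    // q=hello+world%2Bmore

// Hypothetical helper: convert percent-style spaces to form-style plus signs.
// Safe because a literal "+" in the input has already become %2B.
function toFormEncoding(encoded) {
  return encoded.replace(/%20/g, "+");
}

console.log(toFormEncoding(percentStyle)); // hello+world%2Bmore
```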

Which characters does encodeURIComponent not encode?

encodeURIComponent leaves the following characters unencoded: uppercase letters A through Z, lowercase letters a through z, digits 0 through 9, hyphen (-), period (.), underscore (_), and tilde (~), which are the "unreserved characters" defined in RFC 3986, plus the exclamation mark (!), single quote ('), parentheses, and asterisk (*). Every other character is percent-encoded. Because RFC 3986 classifies those five extra characters as reserved, strict compliance requires encoding them in a separate step after calling encodeURIComponent.
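For strict RFC 3986 output, one common approach is a thin wrapper that encodes the characters ! ' ( ) * after the built-in function has run. The function name below is an assumption for illustration.

```javascript
// Hypothetical wrapper: also encode the five characters encodeURIComponent leaves alone.
function encodeRFC3986Component(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("it's (a) test!*"));     // it's%20(a)%20test!*
console.log(encodeRFC3986Component("it's (a) test!*")); // it%27s%20%28a%29%20test%21%2A
```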

Can I double-encode a URL by accident?

Yes, double encoding is a common mistake that occurs when you apply URI encoding to a string that has already been encoded. For example, encoding "hello%20world" would produce "hello%2520world" because the percent sign itself gets encoded to %25. This results in the literal text "%20" appearing in the decoded output instead of a space. Always ensure you encode raw, unencoded values and avoid encoding strings that have already been percent-encoded.
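The effect is easy to reproduce:

```javascript
const raw = "hello world";
const once = encodeURIComponent(raw);   // hello%20world
const twice = encodeURIComponent(once); // hello%2520world

console.log(decodeURIComponent(once));  // hello world
console.log(decodeURIComponent(twice)); // hello%20world (a literal "%20", not a space)
```

One decode pass undoes exactly one encode pass, so a double-encoded value needs two decodes to get back to the original text.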

Is URI component encoding the same as HTML encoding?

No, they are different encoding schemes for different purposes. URI component encoding (percent encoding) converts characters to %XX format for safe inclusion in URLs. HTML encoding converts characters to entity references like &amp;, &lt;, and &gt; for safe inclusion in HTML documents. A space becomes %20 in URI encoding but remains a space in HTML (or &nbsp; for non-breaking spaces). Using the wrong encoding type can cause security vulnerabilities or display errors.
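When a value ends up in both contexts, each encoding is applied in its own layer. The sketch below assumes a hypothetical minimal escapeHtml helper; production code would normally rely on a templating library's escaping instead.

```javascript
// Hypothetical minimal HTML escaper for text and quoted attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const query = 'a<b & "c"';
const href = "/search?q=" + encodeURIComponent(query); // URI context
const link =
  '<a href="' + escapeHtml(href) + '">' + escapeHtml(query) + "</a>"; // HTML context

console.log(link);
// <a href="/search?q=a%3Cb%20%26%20%22c%22">a&lt;b &amp; &quot;c&quot;</a>
```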

Do I need to encode the entire URL or just the parameters?

You should only encode individual component values, not the entire URL. The structural characters in a URL (the scheme colon, double slashes, path slashes, question mark, ampersands between parameters, and equals signs between keys and values) must remain unencoded for the URL to function correctly. Apply encodeURIComponent only to parameter values and path segment values, not to the complete URL string.
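A minimal sketch of that rule, assuming a hypothetical buildUrl helper and a made-up API host: each path segment and each query key or value is encoded on its own, and the delimiters are added afterwards.

```javascript
// Hypothetical helper: encode each component, never the assembled URL.
function buildUrl(base, segments, params) {
  const path = segments.map((s) => encodeURIComponent(s)).join("/");
  const query = Object.entries(params)
    .map(([k, v]) => encodeURIComponent(k) + "=" + encodeURIComponent(v))
    .join("&");
  return base + "/" + path + (query ? "?" + query : "");
}

console.log(
  buildUrl("https://api.example.com", ["files", "a/b report.pdf"], { tag: "q&a", page: 2 })
);
// https://api.example.com/files/a%2Fb%20report.pdf?tag=q%26a&page=2
```

The slash inside the file name is encoded as %2F, while the slashes that separate segments stay literal.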

How does URI encoding handle non-ASCII characters?

Non-ASCII characters are first converted to their UTF-8 byte representation, then each byte is individually percent-encoded. For example, an accented e character has the UTF-8 bytes C3 A9, which become %C3%A9. A Chinese character might use three UTF-8 bytes, producing three percent-encoded triplets. This UTF-8-based approach is specified by RFC 3986 and the WHATWG URL Standard, ensuring consistent encoding across all modern browsers and servers.
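The byte counts can be checked by counting the triplets encodeURIComponent emits for a single character. The tripletCount helper is a hypothetical name used only for this illustration.

```javascript
// Hypothetical helper: count the percent-encoded triplets for one character.
function tripletCount(ch) {
  return (encodeURIComponent(ch).match(/%[0-9A-F]{2}/g) || []).length;
}

console.log(encodeURIComponent("\u00e9"), tripletCount("\u00e9"));       // %C3%A9 2
console.log(encodeURIComponent("\u4e2d"), tripletCount("\u4e2d"));       // %E4%B8%AD 3
console.log(encodeURIComponent("\u{1F600}"), tripletCount("\u{1F600}")); // %F0%9F%98%80 4
```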

