Punycode Decoder

Punycode Decode Domain Names Online

Convert Punycode-encoded xn-- domain labels back to readable Unicode text instantly with this free online Punycode decoder. Whether you are inspecting DNS records, checking how internationalized domain names (IDNs) decode, or investigating suspicious xn-- strings for potential homograph attacks, this tool restores the original Unicode characters from any valid Punycode input in real time.

What is Punycode Decoding

Punycode decoding is the reverse of Punycode encoding. It converts ASCII-compatible domain labels that start with the "xn--" prefix back into their original Unicode representation containing non-ASCII characters. This process is essential for displaying internationalized domain names in a human-readable form that users can recognize in their native script.

When your browser resolves a domain name, DNS works with ASCII labels. If the domain contains non-ASCII characters, it is stored and transmitted in its Punycode form. The browser then decodes the Punycode to display the Unicode version in the address bar, making the domain readable for users who read and type Chinese, Arabic, Cyrillic, accented Latin, or any other script that needs non-ASCII characters.

The decoding process is defined in RFC 3492 as part of the Internationalized Domain Names in Applications (IDNA) framework. It is fully deterministic and reversible, meaning the same Punycode input always produces the same Unicode output. For the encoding direction, our Punycode encoding converter tool converts Unicode domain labels to their xn-- prefixed ASCII form.

How Punycode Decoding Works

The Punycode decoding algorithm reverses the bootstring encoding process. It first strips the "xn--" prefix, then separates the remaining string into two parts: the basic ASCII characters (everything before the last hyphen) and the encoded non-ASCII character data (everything after the last hyphen). If the string contains no hyphen at all, there are no basic characters and the whole string is encoded data. The ASCII characters form the initial skeleton of the output string.

The decoder then processes the encoded suffix to determine the Unicode code points and insertion positions of the non-ASCII characters. Using the same adaptive bias mechanism as the encoder, it reads variable-length integers from the encoded data and inserts the corresponding Unicode characters at the correct positions within the ASCII skeleton. This reconstructs the original Unicode string exactly.
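The steps above can be sketched in Python as follows. This is an illustrative reading of the RFC 3492 decoding procedure, not the code this tool runs; the function names (punycode_decode, adapt, digit_value) are made up for the example, and the input is assumed to be a single label with the "xn--" prefix already removed.

```python
# Bootstring parameters that RFC 3492 fixes for Punycode.
BASE, TMIN, TMAX = 36, 1, 26
SKEW, DAMP = 38, 700
INITIAL_BIAS, INITIAL_N = 72, 128


def adapt(delta: int, num_points: int, first_time: bool) -> int:
    """Bias adaptation, following RFC 3492 section 6.1."""
    delta = delta // DAMP if first_time else delta // 2
    delta += delta // num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + ((BASE - TMIN + 1) * delta) // (delta + SKEW)


def digit_value(ch: str) -> int:
    """Map a-z (either case) to 0-25 and 0-9 to 26-35."""
    c = ord(ch)
    if 0x30 <= c <= 0x39:
        return c - 0x30 + 26
    if 0x41 <= c <= 0x5A:
        return c - 0x41
    if 0x61 <= c <= 0x7A:
        return c - 0x61
    raise ValueError(f"invalid Punycode digit: {ch!r}")


def punycode_decode(label: str) -> str:
    """Decode one label body (the text after "xn--") to Unicode."""
    # Everything before the last hyphen is the ASCII skeleton;
    # with no hyphen, the whole label is encoded data.
    basic, _, encoded = label.rpartition("-")
    if any(ord(ch) >= 0x80 for ch in basic):
        raise ValueError("non-ASCII character in basic part")
    output = list(basic)
    n, i, bias = INITIAL_N, 0, INITIAL_BIAS
    pos = 0
    while pos < len(encoded):
        old_i, w, k = i, 1, BASE
        # Read one variable-length integer (a "delta").
        while True:
            if pos >= len(encoded):
                raise ValueError("truncated variable-length integer")
            digit = digit_value(encoded[pos])
            pos += 1
            i += digit * w
            if k <= bias:
                t = TMIN
            elif k >= bias + TMAX:
                t = TMAX
            else:
                t = k - bias
            if digit < t:
                break
            w *= BASE - t
            k += BASE
        size = len(output) + 1
        bias = adapt(i - old_i, size, old_i == 0)
        # The delta encodes both the code point and the insert position.
        n += i // size
        i %= size
        if n > 0x10FFFF:
            raise ValueError("code point outside the Unicode range")
        output.insert(i, chr(n))
        i += 1
    return "".join(output)


print(punycode_decode("mnchen-3ya"))  # münchen
print(punycode_decode("caf-dma"))     # café
```

For "mnchen-3ya", the single delta read from "3ya" is 869. With seven possible insertion slots, 869 // 7 = 124 advances the code point from 128 to 252 (ü), and 869 % 7 = 1 places it right after the "m".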

Each domain label is decoded independently. A full domain name like "xn--mnchen-3ya.xn--e1afmapc.com" would have its first two labels decoded separately while the "com" label passes through unchanged. For a different kind of text decoding that works with percent-encoded byte sequences, our UTF-8 percent decoder handles URL-encoded content. When working with encoded data in other formats, our online Base64 decoding tool handles that common encoding scheme.
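As a rough illustration of per-label decoding, the sketch below splits a domain on dots and decodes only the labels that carry the "xn--" prefix. It relies on Python's built-in "punycode" codec; the helper name decode_domain is made up for this example, and the code performs raw Punycode decoding only, without any IDNA validation or normalization.

```python
def decode_domain(domain: str) -> str:
    """Decode every xn-- label of a domain; other labels pass through."""
    decoded = []
    for label in domain.split("."):
        if label.lower().startswith("xn--"):
            # Strip the 4-character prefix, then decode the label body.
            decoded.append(label[4:].encode("ascii").decode("punycode"))
        else:
            decoded.append(label)
    return ".".join(decoded)


print(decode_domain("xn--mnchen-3ya.xn--e1afmapc.com"))
```

Keeping the prefix check per label is what lets "com" pass through untouched while the first two labels are decoded separately.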

Syntax Comparison

Here is how to decode Punycode in popular programming languages:

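JavaScript (Node.js): The built-in punycode module provides punycode.toUnicode("xn--mnchen-3ya.de") for full domain conversion and punycode.decode("mnchen-3ya") for individual labels without the prefix. Note that the built-in module has been deprecated since Node.js 7; the userland punycode package on npm offers the same functions.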

Python: Use the built-in "idna" codec, which follows IDNA 2003 (RFC 3490): b"xn--mnchen-3ya".decode("idna") returns the Unicode string. The third-party idna library provides idna.decode() with IDNA 2008 compliance.

Java: Use java.net.IDN.toUnicode("xn--mnchen-3ya") to convert a Punycode label back to its Unicode form. The class applies IDNA 2003 (RFC 3490) processing rather than raw Punycode decoding alone.
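To make one of these concrete, here is a short Python snippet using only the standard library. The variable names are arbitrary; the two codecs ("idna" and "punycode") are built into Python.

```python
# Full domain through the built-in "idna" codec (IDNA 2003 rules).
full_domain = b"xn--mnchen-3ya.de".decode("idna")
print(full_domain)  # münchen.de

# A single label body (no "xn--" prefix) through the raw "punycode" codec.
single_label = b"mnchen-3ya".decode("punycode")
print(single_label)  # münchen
```

The "idna" codec expects the prefix and handles dots itself, while the "punycode" codec works on one prefix-free label at a time.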

Common Use Cases

DNS Record Inspection: When examining DNS zone files or query results, domain labels appear in their Punycode form. Decoding them reveals the actual internationalized domain names that users see in their browsers, making it easier to verify correct DNS configuration.

Security Investigation: Security analysts decode Punycode domains to identify potential homograph attacks where visually similar characters from different scripts are used to impersonate legitimate domains. Seeing the decoded Unicode characters helps determine whether a domain is genuinely internationalized or deliberately deceptive.

Certificate Analysis: SSL/TLS certificates list domain names in Punycode format. Decoding these labels helps administrators verify that certificates cover the correct internationalized domains and diagnose certificate mismatch errors.

Log Analysis: Web server and DNS logs record domain names in their ASCII Punycode form. Decoding these entries makes logs more readable and helps identify traffic patterns for internationalized domains.
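For the log analysis case, a minimal sketch might scan each line for xn-- labels and decode them in place. The log line format and the helper name decode_labels_in_line are assumptions made for this example; labels that fail to decode are left unchanged.

```python
import re

# Matches one xn-- label; it stops at the dot that ends the label.
XN_LABEL = re.compile(r"\bxn--[a-z0-9-]+", re.IGNORECASE)


def decode_labels_in_line(line: str) -> str:
    """Replace every decodable xn-- label in a log line with Unicode."""
    def replace(match: re.Match) -> str:
        label = match.group(0)
        try:
            return label[4:].encode("ascii").decode("punycode")
        except UnicodeError:
            return label  # keep malformed labels as they were logged
    return XN_LABEL.sub(replace, line)


# Hypothetical access-log line, for illustration only.
print(decode_labels_in_line("GET / host=xn--mnchen-3ya.de status=200"))
```

Leaving undecodable labels untouched keeps the original evidence intact, which matters when the same logs feed a security investigation.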

Punycode Decode Examples

Here are practical examples of decoding xn-- labels:

Example 1 - German Domain: "xn--mnchen-3ya" decodes to "münchen". The ASCII skeleton "mnchen" is combined with the encoded "ü", which is inserted at the correct position after the "m".

Example 2 - Chinese Domain: "xn--fsq228c" decodes to Chinese characters. Since the original label contained no ASCII characters, the decoded output consists entirely of CJK ideographs.

Example 3 - Accented Label: "xn--caf-dma" decodes to "café". The ASCII portion "caf" forms the skeleton, and the accented "é" is inserted at position 3 (counting from zero).

Example 4 - Full Domain: "xn--mnchen-3ya.de" decodes by processing only the first label. The ".de" TLD passes through unchanged, producing the full internationalized domain name "münchen.de".

Frequently Asked Questions

How do I know if a domain is Punycode encoded?

A Punycode-encoded domain label always starts with the "xn--" prefix. If you see a domain like "xn--mnchen-3ya.de" in a DNS record, certificate, or log file, the "xn--" prefix tells you that the label contains encoded non-ASCII characters. Labels without this prefix are plain ASCII and do not need Punycode decoding.

Can Punycode decoding fail?

Yes, decoding can fail if the input is not a valid Punycode string. Invalid inputs include strings with characters outside the allowed ASCII range, malformed variable-length integer sequences, or code points that fall outside the valid Unicode range. Most implementations throw an error or return null for invalid input rather than producing garbage output.
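As one possible way to handle such failures, the sketch below wraps Python's built-in "punycode" codec and returns None for input that cannot be decoded. The helper name try_decode_label is made up for this example; the codec signals invalid input with UnicodeError or one of its subclasses.

```python
from typing import Optional


def try_decode_label(label: str) -> Optional[str]:
    """Return the decoded label, or None if it is not valid Punycode."""
    if not label.lower().startswith("xn--"):
        return label  # plain ASCII label, nothing to decode
    try:
        return label[4:].encode("ascii").decode("punycode")
    except UnicodeError:
        # Covers invalid digits, truncated integers, and non-ASCII input.
        return None


print(try_decode_label("xn--mnchen-3ya"))  # münchen
print(try_decode_label("xn--!!!"))         # None
```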

Why does my browser show xn-- instead of the Unicode domain?

Modern browsers may display the Punycode form instead of the Unicode form as a security measure when a label looks like a potential homograph attack. For example, if a label mixes characters from scripts that do not normally appear together (like Latin and Cyrillic), the browser shows the xn-- form to alert users that the domain may be deceptive. The exact rules vary by browser, and some browsers expose a setting that controls this behavior.
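A very rough version of such a mixed-script check can be sketched in Python. This is a heuristic for illustration only and not the algorithm any browser uses: it guesses a script from the first word of each character's Unicode name, and the helper names rough_scripts and looks_mixed are made up for the example.

```python
import unicodedata
from typing import Set


def rough_scripts(label: str) -> Set[str]:
    """Guess the scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if not ch.isalpha():
            continue  # skip digits and hyphens
        name = unicodedata.name(ch, "")
        if name:
            # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
            scripts.add(name.split()[0])
    return scripts


def looks_mixed(label: str) -> bool:
    """True if the label appears to combine more than one script."""
    return len(rough_scripts(label)) > 1


# "\u0430" is CYRILLIC SMALL LETTER A, which resembles Latin "a".
print(rough_scripts("\u0430pple"))
print(looks_mixed("münchen"))  # False
```

A check this simple also flags legitimate combinations, such as Japanese labels that mix kana and kanji, so it is only a starting point for manual review.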

Is Punycode decoding the same as URL decoding?

No, they are completely different encoding schemes. Punycode is specifically designed for DNS domain labels and uses the bootstring algorithm with the xn-- prefix. URL decoding (percent decoding) converts percent-hex sequences like %C3%A9 back to bytes. Domain names use Punycode while URL paths and query strings use percent encoding. Both may appear in the same URL but require different decoders.
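The difference shows up clearly when both encodings occur in one URL. The sketch below, using only Python's standard library, decodes the host with the "punycode" codec and the path with percent decoding. The helper name decode_url_parts and the example URL are made up for this illustration.

```python
from typing import Tuple
from urllib.parse import unquote, urlsplit


def decode_url_parts(url: str) -> Tuple[str, str]:
    """Return (host, path) with each part decoded by its own scheme."""
    parts = urlsplit(url)
    labels = []
    for label in (parts.hostname or "").split("."):
        if label.startswith("xn--"):
            # Host labels: Punycode.
            label = label[4:].encode("ascii").decode("punycode")
        labels.append(label)
    # Path: percent decoding, interpreted as UTF-8 by default.
    return ".".join(labels), unquote(parts.path)


print(decode_url_parts("https://xn--caf-dma.example/caf%C3%A9"))
```

Here both parts spell the same word, but "caf-dma" and "caf%C3%A9" need different decoders to get there.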

Can I decode multiple xn-- labels in a full domain at once?

Yes, most Punycode decoding libraries support full domain names. The decoder splits the domain on dots, checks each label for the xn-- prefix, decodes those that have it, and leaves plain ASCII labels unchanged. The result is the complete internationalized domain name with all labels in their Unicode form.

What scripts are supported by Punycode?

Punycode can encode any valid Unicode code point, so it supports all scripts defined in the Unicode standard. This includes Latin with diacritics, Cyrillic, Greek, Arabic, Hebrew, Devanagari, Chinese, Japanese, Korean, Thai, and many other scripts. However, domain registries may restrict which characters are allowed in domain names for their specific TLDs based on language and security policies.

FAQ

How does Punycode Decoder work?

Decode Punycode (xn-- prefix) back to Unicode text.
