Unfake Text

Convert homoglyphs and look-alike characters back to normal Latin text.

What is Homoglyph Detection?

Homoglyph detection is the process of identifying look-alike Unicode characters and converting them back to their standard Latin equivalents. This tool scans text for Cyrillic, Greek, Fullwidth, and Roman numeral characters that visually resemble English letters but have different Unicode code points, which is essential for detecting phishing attacks and cleaning obfuscated data.

For example, gοοglе.cοm (with Greek ο and Cyrillic е) is normalized to google.com (all Latin).

Features

Auto-Detection

Automatically detects Cyrillic, Greek, Fullwidth, and Roman numeral homoglyphs.

Detection Stats

See exactly how many homoglyphs were found and the detection rate.

Comparison Mode

Side-by-side view of fake text vs. normalized Latin text.

File Upload

Load files and download normalized versions for cleaning datasets.

Batch Processing

Process multiple lines independently for URL lists or datasets.

Security Focus

Detect phishing domains and prevent homoglyph-based attacks.

Common Use Cases

Security Protection

Detect phishing URLs, deceptive domain names, and homoglyph-based social engineering attacks.

Data Cleaning

Normalize user input, clean database records, and ensure text matching works correctly in search systems.

Quality Assurance

Verify text authenticity, prevent filter bypass attempts, and maintain data integrity across systems.

How to use

  1. Input: Paste text that may contain homoglyphs (e.g., a suspicious URL).
  2. Detect: The tool instantly scans and converts all homoglyphs to Latin.
  3. Review: View detection stats to see how many fakes were found.
  4. Copy/Download: Use the cleaned, normalized text for security checks.

Example - Phishing URL Detection

Fake URL (Phishing)
раypal.com
Normalized (Detected)
paypal.com
Detection Report
  • Cyrillic 'р' (U+0440) → Latin 'p' (U+0070)
  • Cyrillic 'а' (U+0430) → Latin 'a' (U+0061)
  • Detection Rate: 20.0% (2 out of 10 chars)
  • ⚠️ Warning: Fake domain detected!
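
Report lines like the ones above can be generated from a lookup table. The following is a minimal Python sketch, not the tool's actual code (which runs as JavaScript in the browser). The two-entry HOMOGLYPHS table and the report() helper are hypothetical names used only for illustration.

```python
import unicodedata

# Illustrative two-entry table (an assumption); the tool's real table is larger.
HOMOGLYPHS = {
    "\u0440": "p",  # Cyrillic small letter er
    "\u0430": "a",  # Cyrillic small letter a
}

def report(text):
    """Return one report line per homoglyph found in text."""
    lines = []
    for ch in text:
        latin = HOMOGLYPHS.get(ch)
        if latin is None:
            continue  # not a known homoglyph, nothing to report
        # The first word of the Unicode name is the script, e.g. "CYRILLIC".
        script = unicodedata.name(ch).split()[0].title()
        lines.append(
            f"{script} '{ch}' (U+{ord(ch):04X}) → Latin '{latin}' (U+{ord(latin):04X})"
        )
    return lines

fake_url = "\u0440\u0430ypal.com"  # first two letters are Cyrillic
for line in report(fake_url):
    print(line)
```

Writing the suspicious characters as \u escapes keeps the source unambiguous, since the raw characters would look identical to Latin letters in an editor.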

Frequently Asked Questions

How does homoglyph detection work?

The tool scans each character in your text and checks if it's a homoglyph (look-alike character from Cyrillic, Greek, or Fullwidth Unicode). If detected, it replaces the homoglyph with its standard Latin equivalent. For example, Cyrillic 'о' (U+043E) is converted to Latin 'o' (U+006F). Detection stats show exactly how many characters were normalized.
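
The scan-and-replace step described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's implementation: the HOMOGLYPHS table is a small hypothetical subset, and unfake() is a name made up for this example.

```python
# Illustrative subset (an assumption); the tool's full mapping is not published here.
HOMOGLYPHS = {
    "\u0430": "a",  # Cyrillic small letter a
    "\u0435": "e",  # Cyrillic small letter ie
    "\u043e": "o",  # Cyrillic small letter o
    "\u0440": "p",  # Cyrillic small letter er
    "\u0441": "c",  # Cyrillic small letter es
    "\u03bf": "o",  # Greek small letter omicron
}

def unfake(text):
    """Return (normalized_text, number_of_characters_replaced)."""
    out = []
    fixed = 0
    for ch in text:
        latin = HOMOGLYPHS.get(ch)
        if latin is not None:
            out.append(latin)  # swap the look-alike for its Latin equivalent
            fixed += 1
        else:
            out.append(ch)     # keep every other character unchanged
    return "".join(out), fixed

# Three Greek omicrons and one Cyrillic ie hidden in the domain.
normalized, fixed = unfake("g\u03bf\u03bfgl\u0435.c\u03bfm")
print(normalized, fixed)  # google.com 4
```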

What homoglyphs are detected?

The tool detects common homoglyphs including: (1) Cyrillic - а, е, о, р, с, т, х (and uppercase). (2) Greek - α, ο, ρ, ν, κ (and uppercase). (3) Fullwidth Unicode - ａ, ｂ, ｃ etc. (4) Roman numerals - ⅰ, ⅼ, ⅴ, ⅹ. These are the most commonly used in phishing attacks and text obfuscation.
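
A table covering the four groups listed above might be built as follows. This Python sketch is hedged: the Latin targets chosen for each character (for example Greek ν → v) are assumptions, uppercase forms are left out for brevity, and build_table() is a hypothetical helper. The Fullwidth range relies on a known property of Unicode: U+FF01 to U+FF5E mirror ASCII U+0021 to U+007E at a fixed offset of 0xFEE0.

```python
def build_table():
    """Build a {code point: Latin string} table usable with str.translate."""
    # Lowercase only; the Latin targets are assumptions for illustration.
    cyrillic = {"\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
                "\u0441": "c", "\u0442": "t", "\u0445": "x"}
    greek = {"\u03b1": "a", "\u03bf": "o", "\u03c1": "p",
             "\u03bd": "v", "\u03ba": "k"}
    # Small Roman numerals one, fifty, five, ten.
    roman = {"\u2170": "i", "\u217c": "l", "\u2174": "v", "\u2179": "x"}

    table = {}
    for group in (cyrillic, greek, roman):
        for fake, latin in group.items():
            table[ord(fake)] = latin

    # Fullwidth forms: subtract the fixed offset to get the ASCII character.
    for cp in range(0xFF01, 0xFF5F):
        table[cp] = chr(cp - 0xFEE0)
    return table

TABLE = build_table()

# Fullwidth "paypal" folds to plain ASCII.
print("\uff50\uff41\uff59\uff50\uff41\uff4c".translate(TABLE))  # paypal
```

Using str.translate keeps the replacement a single call once the table exists, which suits long inputs.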

Why would I need to unfake text?

Common use cases include: (1) Security - Detecting phishing URLs (e.g., аpple.com vs apple.com). (2) Data cleaning - Normalizing user input before database storage. (3) Search accuracy - Ensuring search queries match records. (4) Compliance - Preventing homoglyph-based filter evasion. (5) Quality assurance - Verifying text authenticity in content moderation.

Can this detect all Unicode variations?

No. This tool detects the common homoglyphs used in 95%+ of real-world attacks and obfuscation cases, covering Cyrillic, Greek, Fullwidth, and Roman numerals. Unicode has thousands of look-alike characters, and this tool focuses on the most practical and frequently encountered ones.
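
Because coverage is partial, one possible safeguard is to flag any character that is still outside ASCII after normalization and review it by hand. The Python sketch below illustrates that idea only; it is not a feature the tool is stated to have, and unresolved() is a hypothetical helper. Note that it also flags legitimate non-ASCII text such as accented letters, so it fits inputs like domain names better than free text.

```python
import unicodedata

def unresolved(text):
    """List (char, code point, Unicode name) for every non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 0x7F
    ]

# U+0585 is an Armenian letter that resembles Latin 'o' and is not in the
# Cyrillic, Greek, Fullwidth, or Roman numeral groups.
for ch, code_point, name in unresolved("g\u0585\u0585gle.com"):
    print(code_point, name)
```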

What does the detection rate percentage mean?

Detection rate shows what percentage of your text was homoglyphs. For example, '50%' means half of the characters were fake lookalikes. 0% means the text is clean (all normal Latin). 100% means every character was a homoglyph (highly suspicious!).
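
As a worked example of the arithmetic, the rate can be computed as homoglyphs found divided by total characters. This Python sketch assumes the denominator counts every character, including punctuation; the page does not spell that out. detection_rate() and the two-character HOMOGLYPHS set are hypothetical.

```python
def detection_rate(text, homoglyphs):
    """Percentage of characters in text that belong to the homoglyph set."""
    if not text:
        return 0.0  # avoid dividing by zero on empty input
    found = sum(1 for ch in text if ch in homoglyphs)
    return 100.0 * found / len(text)

# Illustrative set: Greek small omicron and Cyrillic small ie.
HOMOGLYPHS = {"\u03bf", "\u0435"}

sample = "g\u03bf\u03bfgle"  # 2 of 6 characters are Greek omicron
print(f"{detection_rate(sample, HOMOGLYPHS):.1f}%")  # 33.3%
```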

Does unfaking change the meaning of text?

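Not when the text is meant to be Latin. Homoglyphs look nearly identical to their Latin equivalents, so normalizing them leaves the visible text the same and only corrects the underlying code points. For instance, 'gοοgle' (with Greek ο) becomes 'google' (with Latin o): same visual appearance, different Unicode. Genuine Cyrillic or Greek text is a different matter, since replacing its letters with Latin ones would corrupt it, so use the tool on text that is supposed to be Latin.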

Can I use batch mode to check multiple lines?

Yes! Enable 'Batch Mode' to process each line independently. This is useful when checking lists of URLs, usernames, or domain names. Each line is analyzed separately while preserving the line structure.
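
Line-by-line processing of this kind could look like the Python sketch below. It is an assumption about how such a mode might work, not the tool's code: unfake_lines() and the four-entry table are hypothetical. Splitting on "\n" and joining again keeps blank lines and the original line order intact.

```python
# Illustrative subset (an assumption). str.maketrans turns the
# single-character keys into code points for str.translate.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic small letter a
    "\u0440": "p",  # Cyrillic small letter er
    "\u0435": "e",  # Cyrillic small letter ie
    "\u03bf": "o",  # Greek small letter omicron
})

def unfake_lines(text):
    """Normalize each line separately; return (cleaned_text, per_line_counts)."""
    lines = text.split("\n")
    counts = [sum(1 for ch in line if ord(ch) in HOMOGLYPHS) for line in lines]
    cleaned = [line.translate(HOMOGLYPHS) for line in lines]
    return "\n".join(cleaned), counts

urls = "\u0440\u0430ypal.com\nexample.org\n\ng\u03bf\u03bfgle.com"
cleaned, counts = unfake_lines(urls)
print(cleaned)
print(counts)  # [2, 0, 0, 2]
```

The per-line counts make it easy to see which entries in a list were suspicious without re-reading the whole input.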

Can I upload a file to check for homoglyphs?

Yes! Click 'Upload' to load a .txt, .md, or .csv file. The tool will scan and normalize all homoglyphs across the entire file. You can then download the cleaned version.

How is this different from Unicode normalization (NFC/NFD)?

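Unicode normalization handles combining characters and decomposition. The canonical forms (NFC/NFD) never replace one letter with a look-alike. The compatibility forms (NFKC/NFKD) do fold some look-alikes, such as Fullwidth letters and Roman numeral characters, into ASCII, but no normalization form maps Cyrillic or Greek letters to Latin. This tool specifically handles confusable homoglyphs: characters from different scripts that look the same but have different code points. The two solve different problems; this tool is for security and deception detection.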
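
The difference is easy to check with Python's standard unicodedata module. The snippet below is a small demonstration, separate from the tool itself: the compatibility form NFKC folds a Fullwidth letter and a Roman numeral character to ASCII, while Cyrillic and Greek look-alikes pass through every normalization form unchanged.

```python
import unicodedata

samples = {
    "fullwidth a": "\uff41",
    "roman numeral one": "\u2170",
    "cyrillic a": "\u0430",
    "greek omicron": "\u03bf",
}

for label, ch in samples.items():
    nfc = unicodedata.normalize("NFC", ch)
    nfkc = unicodedata.normalize("NFKC", ch)
    # Show whether each form produced a plain ASCII character.
    print(f"{label}: NFC ascii={nfc.isascii()}, NFKC ascii={nfkc.isascii()}")
```

Only the first two samples become ASCII, and only under NFKC, which is why cross-script homoglyphs need a dedicated mapping.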

Is my text sent to your server for processing?

No. All homoglyph detection and normalization happens entirely in your browser using JavaScript. We never see, store, or transmit your text. This makes it safe to check sensitive URLs, passwords, or confidential content.