Soft hyphens give hard problems

Okay, *deep breath*. Yesterday I was playing around with my newly discovered soft hyphen. It’s a small dash that most browsers don’t render. I’ll cut the discovery anecdote crap: after a small set of contests with Iluvatar on our local institute phpBB forum, I discovered I could do this.

phpBB and most other PHP authentication systems take a string as the username when you sign up. The assumption is that every character in it is going to be visible (like spaces, underscores, etc.), so I made a login out of 4 soft hyphens. And sure enough, it looked like there was no name at all.

img6

Now this is where it got interesting. I took the ids of people who already existed and registered the same names with a few soft hyphens added at the end. After changing my avatar and sig, I was virtually indistinguishable from the actual poster.

img7

Well, this isn’t actually a “security” threat; it’s a threat to trust. Though the names look different to PHP, they look absolutely the same to the end user. And with a little trouble, a lot of pranksters can make use of this. I thought digg would break, and I planned to make a profile identical to DigitalGopher to prove my point while submitting this (no one diggs my submissions :( ).
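To make the "different to the machine, same to the eye" point concrete, here is a minimal Python sketch. It is not the forum's PHP code, and the name `alice` and the helper `describe` are made up for illustration. It only shows how to inspect a name that has already been submitted and list what is really in it.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN, usually invisible when rendered


def describe(name):
    """Return one 'U+XXXX NAME' entry per character, invisible ones included."""
    return [
        "U+%04X %s" % (ord(ch), unicodedata.name(ch, "UNKNOWN"))
        for ch in name
    ]


real = "alice"                  # hypothetical existing user
fake = real + SOFT_HYPHEN * 2   # hypothetical look-alike registration

print(real == fake)             # False: the strings are not equal
print(len(real), len(fake))     # 5 7: the look-alike is two characters longer
print(describe(fake)[-1])       # U+00AD SOFT HYPHEN
```

Dumping the code points like this is a quick way for a moderator to tell two "identical" accounts apart.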

After a lot of experimenting, I found almost all PHP registration systems had a problem with this. (Digg was safe because it accepts only alphanumeric characters.) I haven’t had the time to check whether this works with Gmail or to explore the problem further, courtesy of my exams.
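The alphanumeric-only rule is easy to sketch. The snippet below is an assumption about how such a check could look in Python, not Digg's actual code. The function name `is_valid_username` is hypothetical, and the pattern is deliberately limited to ASCII letters and digits.

```python
import re

# Whitelist: one or more ASCII letters or digits, nothing else.
USERNAME_RE = re.compile(r"[A-Za-z0-9]+")


def is_valid_username(name):
    """Accept a name only if every character is on the whitelist."""
    # fullmatch() must consume the whole string, so a trailing invisible
    # character (or a trailing newline) makes the check fail.
    return USERNAME_RE.fullmatch(name) is not None


print(is_valid_username("alice"))            # True
print(is_valid_username("alice\u00ad"))      # False: soft hyphen is rejected
print(is_valid_username("\u00ad" * 4))       # False: "empty-looking" name
```

The design choice here is to list what is allowed rather than what is forbidden, so soft hyphens and any other invisible character are rejected without having to be named.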

With a little more research I found that this technique applies to practically *ANY* field where you can’t enter a null value but you want it displayed as a null value. Any site that accepts an underscore will have a problem with this. I was able to fool Ubuntu forums and several login ids, and I masqueraded as an administrator of my local forums.

I concluded this: this hack can be used to make two different pieces of text absolutely indistinguishable to humans.

I’m not specifying how to do this; I don’t want to be responsible for any problems. But anyone with basic knowledge of ASCII can work it out. I don’t know if it’ll display that way on a Mac or in rendering engines other than Gecko and IE7.

Also, after a little coding, I found that MySQL treats the two as different, so I can have two names which look identical to humans but different to machines, and spammers can endlessly exploit this system.
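One way around the database seeing two different strings is to compare names on a cleaned-up key before they ever reach it. This is a rough Python sketch under my own assumptions: `canonical_key` and `is_taken` are hypothetical helpers, and the uniqueness check is assumed to happen in application code rather than in MySQL. It drops every character in Unicode category `Cf` (format characters, which includes the soft hyphen) before comparing.

```python
import unicodedata


def canonical_key(name):
    """Reduce a name to the form used for uniqueness checks."""
    # Drop format characters (category Cf), e.g. U+00AD SOFT HYPHEN.
    visible = "".join(ch for ch in name if unicodedata.category(ch) != "Cf")
    # Fold compatibility forms and case so near-duplicates collapse too.
    return unicodedata.normalize("NFKC", visible).casefold()


def is_taken(new_name, existing_names):
    """True if new_name is blank or collides with an existing name."""
    key = canonical_key(new_name)
    if not key:
        return True  # nothing visible is left, so refuse the name
    return key in {canonical_key(n) for n in existing_names}


users = ["alice", "Iluvatar"]              # hypothetical existing accounts
print(is_taken("alice\u00ad\u00ad", users))  # True: collides with "alice"
print(is_taken("\u00ad" * 4, users))         # True: blank after cleanup
print(is_taken("bob", users))                # False
```

This only covers invisible format characters and case; visually similar letters from other scripts would need a separate check.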

More on this later. Now playing: Children of Bodom: Trashed Lost and Strungout

5 Responses to “Soft hyphens give hard problems”


  1. Anirudh
  2. Iluvatar

    yahoo and gmail seem to be using an explicit regex. [a-zA-Z0-9_\.]+

    they’re safe.

  3. pravin

    This is interesting. Could be used in a number of ways.

    Okay, seriously, you *need* to move the anti-spam field out so that it is visible even if I’m a repeat visitor. I had written a lot more. sigh

  4. Anirudh

    will do it after exams. and it should not come when you’re a repeat visitor. will look into that.

