Published on
April 8, 2009.
Over the past couple of months, a friend and I worked on an idea and kept the details mostly to ourselves. It was called “Connr”, a shortened version of “connector”. (Update – we’re using the name for something else now.)
The whole idea was to connect a visitor to another random visitor and let them speak. It was a good idea, and as far as we knew nobody else had anything like it. We were planning to launch early next month on an Amazon EC2 server.
Today, I found out about “Omegle”, a service that does exactly what ours does. It received coverage from many major weblogs, including XKCD’s, which makes our product completely redundant even if we do decide to launch it. While our product was more impressive on the backend, and we had already figured out how to solve some of the problems that they are facing, they were first – and in this industry that’s all that counts.
While we are terribly disheartened that our idea didn’t make it into the big league, we learnt a lot, and we’re glad that an independent student like us, rather than a faceless corporation, got in on this idea.
I would like to wish Leif Brooks, creator of Omegle, the best of luck for a successful application.
Published on
April 7, 2009.
I haven’t been a big fan of Twitter since I closed my account a year ago. I was sick last night and couldn’t sleep, so I was watching the Colbert Report, where Stephen Colbert was interviewing Biz Stone and cribbing about the 140-character limit. I had an idea which seemed great when I thought of it – to apply some on-the-fly compression by exploiting text encoding. Now the idea doesn’t seem so shiny after all.
Abstract:
To send up to 420 characters per Twitter message. This will be done by packing the long text into 140 characters. To the server the result is still an ordinary 140-character message, but it will be a jumbled mess to any human who reads it. By running an inflation algorithm on the same text, another user can get back the 420-character message.
Background:
Today, we’ll be exploiting the encoding that Twitter uses for its text: UTF-8. UTF-8 is variable length, using one to four octets for each character. Since most English messages stick to the ASCII character set, each character needs only a single octet, and any ASCII string is valid UTF-8 by default. It’s a fantastic idea that saves space and still covers the full Unicode range, because the encoding space becomes enormous once all four octets are used. Most of the regular English content that you read hardly ever uses more than two or three octets per character.
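A quick way to see the variable-length behaviour is to ask Python how many octets each character takes. This is only an illustrative sketch: the sample characters and the helper name `utf8_width` are picked for the example and are not part of any Twitter API.

```python
def utf8_width(ch: str) -> int:
    """Return the number of octets UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


# One sample from each width class: ASCII letter, accented letter,
# euro sign, and an emoji from outside the Basic Multilingual Plane.
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X} takes {utf8_width(ch)} octet(s)")

# Plain ASCII text is byte-for-byte identical in ASCII and UTF-8.
print("hello".encode("ascii") == "hello".encode("utf-8"))  # True
```

Twitter counts characters rather than octets, so the last sample costs the same as the first against the 140 limit while carrying four times as many bytes. That gap is what the rest of this post tries to exploit.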
Core Idea:
Construct a UTF-8 character with 4 bytes. The first byte signals the start of a 4-byte sequence (thus having a value between 240 and 244).
Use each of the remaining three octets to store a different ASCII character. This way three ASCII characters appear as one UTF-8 character, so only one of the 140 characters is used up. To inflate, read one UTF-8 character and interpret each of the final three octets as an ASCII character.
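Here is a minimal Python sketch of that packing step, under a few loudly stated assumptions. Each trailing octet in UTF-8 only has 6 free bits, so the sketch stores an index into a small symbol table rather than the raw ASCII value. The table `ALPHABET`, the choice of lead byte 0xF1 (code points U+40000 and up) and the function names are all hypothetical. Whether Twitter would accept these unassigned code points is untested.

```python
import string

# Hypothetical symbol table: 62 alphanumerics plus space (index 63 unused).
ALPHABET = string.ascii_letters + string.digits + " "

# Assumption: stay in U+40000..U+7FFFF, where the lead byte is always 0xF1
# and all three trailing octets may take any value from 0x80 to 0xBF.
BASE = 0x40000


def pack_triple(triple: str) -> str:
    """Pack exactly three ALPHABET symbols into one 4-byte UTF-8 character."""
    a, b, c = (ALPHABET.index(ch) for ch in triple)
    return chr(BASE + (a << 12) + (b << 6) + c)


def unpack_triple(packed: str) -> str:
    """Recover the three symbols from the trailing octets of one character."""
    raw = packed.encode("utf-8")
    # raw[0] is the lead byte; each later octet is 0x80 plus a 6-bit index.
    return "".join(ALPHABET[octet & 0x3F] for octet in raw[1:])


packed = pack_triple("cat")
print(len(packed), len(packed.encode("utf-8")))  # 1 4
print(unpack_triple(packed))                     # cat
```

The sketch only handles exact groups of three symbols from the table. Anything else needs the extra handling discussed under the complications.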
Technical complications:
- The first byte can’t be used. It has to be a value between 240 and 244.
- The second, third and fourth bytes can only hold 0x80-0xBF (that is, 128-191). This means the character set that we can compress has only 64 symbols. Sufficient for alphanumeric characters and spaces. UTF-8 is not to blame here, however. A full four-byte encoded character can have 64*64*64 variations (multiply that by 4 for the descriptor byte in the first slot: 240 and 244 each allow only part of the second-byte range, so the five values 240-244 together cover exactly four full blocks), and you get yourself a fine encoding format.
- Any non-alphanumeric character would be left alone because it won’t fit in the space that we have for each octet.
- Any string polluted with non-alphanumeric characters would compress really poorly. Consider ‘http://is.gd/1f4’ or something like that. ‘htt’ will be one character, ‘p’ will be one, and ‘is’, ‘gd’ and ‘1f4’ will each be a single character. The code will have to make sure that we don’t use 4 octets by default, or fill in the remaining octets with blanks and discard them during inflation.
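The complications above can be sketched in Python as follows. This is one possible reading, not a finished implementation. The 63-symbol table, the filler value `PAD`, the lead byte 0xF1 (code points from U+40000) and the names `deflate` and `inflate` are all assumptions made for illustration. Characters outside the table pass through untouched, and short groups are padded with the filler, which inflation throws away.

```python
import string

SYMBOLS = string.ascii_letters + string.digits + " "  # 63 real symbols
PAD = 63         # hypothetical filler index for unused slots in a group
BASE = 0x40000   # assumption: U+40000..U+7FFFF, lead byte always 0xF1


def deflate(text: str) -> str:
    """Pack runs of SYMBOLS three to a character; leave the rest alone."""
    out, run = [], []

    def flush():
        # Emit the pending run in groups of three, padding the last group.
        for i in range(0, len(run), 3):
            group = run[i:i + 3]
            group += [PAD] * (3 - len(group))
            a, b, c = group
            out.append(chr(BASE + (a << 12) + (b << 6) + c))
        run.clear()

    for ch in text:
        index = SYMBOLS.find(ch)
        if index >= 0:
            run.append(index)
        else:
            flush()          # a foreign character ends the current run
            out.append(ch)   # and is copied through unchanged
    flush()
    return "".join(out)


def inflate(text: str) -> str:
    """Reverse deflate(): expand packed characters and drop the filler."""
    out = []
    for ch in text:
        offset = ord(ch) - BASE
        if 0 <= offset < 64 ** 3:
            values = (offset >> 12, (offset >> 6) & 0x3F, offset & 0x3F)
            out.extend(SYMBOLS[v] for v in values if v != PAD)
        else:
            out.append(ch)
    return "".join(out)


url = "http://is.gd/1f4"
packed = deflate(url)
print(len(url), len(packed))   # 16 10
print(inflate(packed) == url)  # True
```

The URL shrinks only from 16 characters to 10, which matches the poor-compression worry, while a purely alphanumeric 420-character message fits in exactly 140. One known gap in this sketch: a message that already contains characters in the U+40000 range would be misread on inflation.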
Implementation:
I’m guessing the implementation is easy enough, though text encoding is one of the hardest things for me to get right. I’m going to try to use the System.Text.UTF8Encoding and System.Text.Encoder classes to read the bytes and reconstruct a new string. I’ll dump the code here when I feel like it.
Ideally, there should be implementations in each major language used to write Twitter apps – JavaScript, Python, C++, C#, ActionScript. A Greasemonkey script could then look for compressed UTF-8 text and decode it in the Twitter page itself.
Disadvantages:
- Every client needs to have an inflation script running.
- Supports only alphanumeric characters and spaces.
- Won’t be searchable (unless you re-encode the search query itself).
Closing words:
Die twitter! die!
Published on
April 5, 2009.
After about five months of waiting, I got invited to Suse Studio Alpha. If you don’t know what Suse Studio is and you are interested in finding out (congratulations, you are part of a very small sample set), it’s this:
Suse Studio, as far as I have used it, is a tool that helps you build a ready-to-deploy variant of the openSUSE 11.1 distribution. The only catch is that you can have as many or as few packages as required. This allows you to build very specialized “appliances”.
If I wanted to make a distribution catering specifically to multimedia packages, it would be a throwaway task. You go to the website, add packages from their intuitive web2.0-ish application, and click build. Ten minutes later, a spanking new application is ready for your usage.
I see several use cases of this:
- A power user who also uses opensuse can make a distribution perfectly catering to his/her taste.
- Perfect for deploying specialized tools in offices, etc. I can make an appliance catered specifically for Java development (and later, stab my head with a butter knife), and deploy it on thousands of computers.
- Many more, which I’m not going to elucidate here.
If you are truly interested in finding out how it works, you can see the screencast over here.
But here’s my two free-as-in-beer cents on the tool.
- Oh god! How did they even manage to build that thing? It’s mind-boggling. And they made it a web application. Holy shit! It builds an ISO while you sip coffee! The only thing that could make it better is if it automatically did all your work, wrote a few research papers for you and stored them on the image so they’d be on your desktop when you start using it.
- Will this convince me to move out of my long-term relationship with debian/ubuntu? Probably not.
- Will this convince me to build the most badass distro, and waste an evening trying it out when I should be doing something productive? Heck yeah!
- While there’s no point in this for me, there’s no doubt that a lot of people will find this really useful to distribute customized applications.
You can look at a screenshot of how it works for me. I was building an appliance with lots of development packages. Finding packages is really easy with the “patterns” feature. Patterns are sort of like groups of packages, such as “C++ Development”, which do not deserve their own category yet.

I would love to see a tool like this for debian. I’m pretty sure it won’t be much harder than suse’s tool to build. The clock counts down before the open-source hippies ruin a great tool’s reputation claiming that the server-side code is proprietary, despite knowing that it runs on server farms and took a really long time to get running.
Rest assured, this is an awesome product which will make a big impact and not just a novellty(the puns!!! someone make them stop).
Published on
April 4, 2009.
Anglicizing Indian names in the US is quite common these days. Let’s face it, some names are hard to pronounce. When I was in the US eight years ago I remember mine getting butchered in as many ways as one can count. A few people who knew they had to stay abroad for a longer time did not want to have to spell out their name or correct pronunciation every time they met someone new.
For example, my friend and mentor Vikram Dendi, who has been in the US since his undergraduate days, told me when I met him in Bangalore that he shortens his name to “Vik” every once in a while.
Over the past eight years, somehow, people started calling me “Andy”, and it’s stuck surprisingly well. It’s rare to find anyone call me by my given name in college, not for any cultural significance but rather for the sake of convenience. You can’t beat a one syllable name.
I’ve been talking on mailing lists and on IRC more these days and am starting to wonder if I need to regularly use my anglicized name in my online collaborations. The short answer is yes, and the long answer is too boring to put down here.
So I decided I’ll just use my utterly cool domain name, which has been lying around stale, for email. I will be using andy@ninjagod . com for my purposes. So if you can’t remember my name or email address (why would one even need to?), you’ve got somewhere new to fire it off to.

Published on
April 1, 2009.
Just because Jerry Seinfeld, George Carlin, and Randall Munroe can be funny whenever they want to doesn’t mean you can.
April Fools’ is the time of the year when everyone thinks they’re being funny but they’re just being dicks. And when someone thinks pictures of cats with horribly disfigured grammar are funny, you know it’s going to get really pathetic. I’m talking about you – the Internet.
A lot of people know that I have a short-term memory span of nearly three seconds in the morning. I spend about five minutes reading my feeds in the morning, and I was pleasantly surprised to see interesting, breaking news, only to find out I had been fooled. This happened six times this morning.
I’m pretty sure a conversation like this has happened in several places in the world at this time:
Billy: Hey grampa, <enter tragic news here>
Grampa: (goes into shock)
Billy: April fools’!
Grampa: Oh god, I find it hard to breathe, there’s pain in my left arm.
Billy: Good one grampa, but you’re going to have to try harder than that to get me.
Grampa: I’m not kidding, get me to a hospital.
Billy: Get more creative than that.
The worst of all seems to be that of slideshare. I’d signed up for their service, which I actually think is pretty neat.
I got an email saying: “You’re a slideshare rockstar” (if you didn’t know already, “rockstar” is the word I loathe the most). Now that I was a slideshare rockstar, impatiently looking for ways to find slideshare groupies, I smelled something fishy.
“We’ve noticed that your slideshow on SlideShare has been getting a LOT of views in the last 24 hours. Great job … you must be doing something right.
Why don’t you tweet or blog this? Use the hashtag #bestofslideshare so we can track the conversation.
Congratulations,
-SlideShare Team”
Every self-obsessed, half-witted moron (a sample set which, by some freakish coincidence, consists entirely of Twitter users) decided to blog, twaat, microblog, nanoblog, fermiblog and gloat about their newfound glory. After seeing the pageviews increase a hundredfold, they realized that their creative inputs were clearly worth billions. They quit their jobs, bought hundreds of “I’m a slideshare rockstar” T-shirts, and maxed out their credit limit, knowing that a few minutes on PowerPoint would reimburse their investment.
Then they found out that it was a joke with 100x fake page views and everyone got the same email.
Not surprisingly, a lot of people are pissed at slideshare, which is pretty sad because I’m a fan of Rashmi Sinha’s work, which I’ve been following for quite a while now.
There’s a lesson to be learned here, in my opinion:
- Humour is contextual. A normal person can be clever and insert a few puns into their text, but that’s it. The funniest jokes are internal, the ones that you crack with your buds, and have a lot of context to them.
- What slideshare did wasn’t “fooling”, it was outright lying. I just hope that it doesn’t get them sued or something.
- It’s funny to make one or two people laugh. Try it for an entire userbase and you’ll just piss them off.