Flaptor autotagger

As I posted on the Flaptor blog recently, we released a tool to automatically suggest tags for blog posts based on a learning algorithm. You can check it out here:

Flaptor autotagger

Why most startups fail

I read Paul Graham’s essay titled The 18 Mistakes That Kill Startups. It’s a good read and it’s hard not to agree with the majority of what he says. In a nutshell: success is the opposite of failure. If you avoid doing absolutely everything that would cause you to fail, you succeed. Here is a partial list of things to avoid. Among them, he has somewhat uninformative points like “don’t raise too little money” and “don’t raise too much money” which is like telling someone to not undercook or overcook a steak. No recipe here, the proof is on the plate.

In a way, it reminds me of someone I knew who was reading “The 7 Habits of Highly Effective People” in the hope of becoming a more effective person. After browsing the book for a while, I thought that she might as well be reading “The 12 Habits of Famous Race Horses”. This article is in a gray area between descriptive and prescriptive. Even though it’s nice to read, I don’t know if it’s actually very helpful. The people who have what it takes to create a successful startup (a tiny minority) will do it anyway. Those who don’t can have this article read to them every morning by their personal trainer as they work off the undercooked steak from the night before and it will do them no good.

At this point, I feel like inserting a link to something more Carl-Saganesque like the Drake Equation. The point is that a number of factors are completely outside of anyone’s control.

Haute cuisine

One of my favorite dishes.

IBM – National Geographic Genographic Project

First post after several weeks of traveling around.

I ordered a kit for this project. Their objective is to track the migrations of human genes from our common African ancestor until the current state of the world population. I’ll try to post the results when I get them, probably a couple of months from now.

The Arcade Fire

Thanks to John, I learned of The Arcade Fire. Their album Funeral is the best thing I’ve heard in a long time and I have been listening to it over and over. It’s one of those rare occasions that makes me feel like sending money to an artist.

The age of thought crimes

Getting outraged is one of my favorite pastimes. Fortunately it’s an inexpensive one. All I need is to read the news. For example, this week I saw this story on Slashdot about a man being sent to prison for having viewed child porn pictures on the web. The prosecution found the files in the web browser cache (the Temporary Internet Files folder on Windows) and was able to convince a jury that this constitutes “possession of child pornography”. He was sentenced to twenty years.

Now, where do I begin my outrage? First, I believe that going to prison is one of the worst things that can happen to a person. I am of the opinion that keeping the innocent out of jail is more important than punishing all the guilty. Apparently this is not the view of the legal system in many countries including the US. How else can “possession” of something be considered a crime? That concept always bothered me because it’s open to the worst possible abuses. You hate someone? Just hide drugs in their house and call the cops anonymously. The person won’t be able to cooperate (i.e. tell anything useful about where the drugs came from) so the prosecutor will be especially harsh. If you happen to live in Singapore, you are in luck. All you need is to have the police find your “friend” with 15 grams of heroin, which carries a mandatory death penalty.

Let’s escalate the outrage a bit more. The guy in the Slashdot story did not even have possession of the files. He simply browsed a site containing them, which is equivalent to watching TV. Unfortunately for him, web browsers implement a feature to speed up navigation. The first time a person visits a page, the text and images are stored on the hard drive for a while. The idea is that, if the person visits the site again and there have been no updates, there is no reason to reload the content from the internet. Because of this technical detail, the prosecution could argue that the files were stored on the guy’s computer even though, in all likelihood, he was unaware of this.
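For the technically curious, here is a minimal sketch of that caching behavior in Python. It is a toy model, not how any real browser is implemented: the class `TinyBrowserCache`, the `fake_fetch` function and the example URL are all made up for illustration, and the "have there been updates?" check is left out to keep it short. The point it shows is that a file lands on disk as a side effect of merely viewing a page, under a name the user never chose.

```python
import hashlib
import tempfile
from pathlib import Path


class TinyBrowserCache:
    """Toy model of a browser cache: the first visit stores the bytes on disk."""

    def __init__(self, cache_dir, fetch):
        self.cache_dir = Path(cache_dir)
        self.fetch = fetch  # hypothetical callable: url -> bytes

    def _path_for(self, url):
        # The file name is derived from the URL; the user never picks it.
        return self.cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get(self, url):
        path = self._path_for(url)
        if path.exists():
            # Repeat visit: serve the stored copy, no network involved.
            return path.read_bytes()
        # First visit: download, then silently keep a copy on disk.
        body = self.fetch(url)
        path.write_bytes(body)
        return body


network_calls = []


def fake_fetch(url):
    # Stand-in for a real download (an assumption, so the sketch runs offline).
    network_calls.append(url)
    return b"image bytes for " + url.encode("utf-8")


cache_dir = tempfile.mkdtemp()
cache = TinyBrowserCache(cache_dir, fake_fetch)
first = cache.get("http://example.com/page")
second = cache.get("http://example.com/page")
files_on_disk = list(Path(cache_dir).iterdir())
```

After the two visits, `network_calls` holds a single entry and `files_on_disk` holds a single file: the page was downloaded once, and a copy now sits on the hard drive without the visitor having saved anything.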

My impression of this case is that the defense lawyer must have been nowhere near the top of Mount Cleverest. Anyone competent would have been able to compare this to what happens with people who own a TiVo, or set up a quick entrapment experiment. I would have brought in a laptop and sent the judge an email with a link to “an interesting story that questions Your Honor’s reputation”. Upon clicking, a browser window would have shown a bunch of random images with text like “Bang! you got child porn on your computer now!”. I guess I have been watching too much Boston Legal, since it sounds like something that James Spader’s character would come up with.

Continuing the outrage-fest, twenty years? What is left for those who produce child porn? Probably not much since they are likely to be outside of the US. What are the legislators thinking? Probably something like this: “I need more soccer mom votes. I’ll double the sentences. That looks good.” Well, here’s another interesting scenario: boy dumps girl. Girl is vengeful. She finds one of those trojan programs that install spyware and sends it to him. Only this program does not install spyware. Instead, it downloads large quantities of illegal content and then erases itself. Perhaps she doesn’t even know the law and thinks he will get a slap on the wrist. Boy, don’t make any plans for the next two decades.

I hope someone smart takes up this case and shows the stupidity of the situation. Otherwise, it’s another step into a globalized, Orwellian society.

The boring weblogs vs. journalism debate

Whenever a new trend or technology surfaces, there are always people who compare it to the closest thing available before, and predict how that closest thing will be killed/replaced/rendered obsolete by the new one.

In the past, it happened with live musicians and recorded music, stage plays and movies, movies and VCRs. In the case of weblogs, those people see them as an alternative to regular journalism. How is it not obvious that, just like in the aforementioned examples, they complement each other?

Let’s first define what ‘weblog’ means in the context of this discussion. Today, anything running on blogspot.com, or powered by Movable Type or WordPress can be considered a weblog. We can leave out of the discussion those that are just personal pages in disguise, community discussion boards or diaries (“…today I’m wearing new socks, got a B+ and I have a crush on my neighbor’s cousin’s friend…”). There are two types of weblogs that resemble traditional media:

- aggregators, such as Slashdot or BoingBoing. These sites add value by carefully selecting stories that interest their readers, a new one every hour or so. They very rarely post original content.

- editorials, such as Andrew Sullivan. These tend to feature opinion pieces written by one or more authors, about whatever they find interesting that day.

Some weblogs lie somewhere in between, such as Kottke or Daily Kos, posting mostly links and some personal opinion once in a while.

The important issue is: what do all these have in common? Answer: they are maintained by people who work at home or in an office, far from where news happens. Unlike traditional media, they do not have armies of paid correspondents who report from all around the world. When unexpected things happen, such as natural catastrophes or revolutions, webloggers can only link to news sites and speculate just like everybody else who’s not there.

Eventually, weblogs and regular media will coexist in harmony because they are good for different things. The line between weblogs and traditional media will blur. There will be more paid, specialized webloggers, some of whom will work for traditional media and other corporations. Just like we can choose from different ways of seeing movies (theatre, buying, renting, downloading), the same will (continue to) be true for news. This debate has been going on for three years now. It’s starting to smell old (like the bricks vs. clicks discussions of 1999) and it’s time to retire it.

Kitties

Here are Tashi and Simone, now about four months old. We found them in the street in late February (pictures taken by Sarah).

Mobile devices and the “real” web

John Battelle makes an excellent point about how users of mobile phones are subject to the whims of the carriers in terms of what they are allowed to do, as opposed to the unlimited options of “raw DSL” for a wired computer.

Having worked on synchronization of mobile devices using the SyncML protocol, I know what he means. Carriers such as Verizon sell crippled phones with the sync function disabled (it can be re-enabled through a very tedious procedure, out of reach of the typical user). It’s interesting how the carriers don’t provide a solution to allow a customer to synchronize contacts or calendar information to a server, yet they disable the functionality so that the customer cannot do it through a third party either!

This Wall Street Journal article from last week complains about the same issues and gives a more business-like perspective, comparing the carriers with Soviet-style ministries.

Java performance

Anybody who has been following the Java language since its inception ten years ago is aware of the discussions about its performance and the comparison with other languages such as C++. Some people who complain about the lack of performance of a language or platform are guilty of not looking at a system as a whole: a combination of the programs, the operating system they run on, the processor and the input/output systems. This article discusses performance issues in Java and dispels old myths. It is a good reference for people who complain that their Java program is too slow, so it must be the language’s fault.

The Java Performance Debate, by Andy Roberts