Monthly Archives: April 2011

Early thoughts on bo.lt

bo.lt is a service that lets you copy, edit and share webpages. TechCrunch labels bo.lt as “bit.ly on steroids”.

The core copy and edit functionality looks really solid and works well. It finds all the images, CSS, JS and other dependencies of a page, rewrites the references to them in the page, and serves the result from their own S3 account. They have done a good job with the design as well.
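To make the "find all the dependencies" step concrete, here is a minimal sketch of how a copier might collect the images, stylesheets and scripts a page refers to. This is not bo.lt's code; the class and function names are made up for illustration, and a real copier would also have to handle things like CSS `url(...)` references and inline styles.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DependencyCollector(HTMLParser):
    """Collects absolute URLs of images, scripts and stylesheets (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.dependencies = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in ("img", "script"):
            url = attrs.get("src")
        elif tag == "link":
            # Only stylesheet links count as page dependencies here.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                url = attrs.get("href")
        if url:
            # Resolve relative references against the page's own URL.
            self.dependencies.append(urljoin(self.base_url, url))


def find_dependencies(html, base_url):
    """Return the dependency URLs found in html, in document order."""
    collector = DependencyCollector(base_url)
    collector.feed(html)
    return collector.dependencies
```

A copier would then download each URL, upload it to its own storage, and rewrite the reference in the page to point at the copy.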

As a user, I can think of times when I would want to use this service, but not as much as I use bit.ly. Most of the time when I share links, I don’t need to edit the pages.

As a content publisher, I am not too sure if I want users to use this. I understand that anyone can already copy my content, edit it and host it on their own servers, even without bo.lt. But I believe bo.lt makes it easier for content publishers to lose control over their content. Any future changes to the original content will not be passed on to users who see the bo.lt copy.

According to TechCrunch: “…is content provider friendly in that Bo.lt still serves up a given page’s ads and analytics systems.” I haven’t tested the ads, but bo.lt does seem to append their own Google Analytics code to the existing GA code on the page. I am not sure how they work with other third-party or home-brewed analytics systems.

Another issue as a content provider is that bo.lt now competes with the original page for search engine rankings. For instance, this page competes with the original page without any reference to it. Currently, this whole model is breaking the web as we know it into multiple versions of almost the same content with different URLs. Depending on how important getting indexed by Google is to their strategy, it can be easily fixed by either adding ‘noindex’ to bo.lt pages or, even better, adding a canonical tag that points to the original page. Perhaps this is more of a reason for content publishers to have a rel canonical tag on their pages so that they get credited for their content.

It takes a lot of effort to change elements of a webpage on the fly and expect most of them to function as they do in the original. I think bo.lt is an impressive technology, but I am not too sure that it will be well received by publishers with the current model. I hope that they evolve and make it compelling for publishers to be comfortable with their product. They have raised $5 Million from Benchmark Capital.

Update 1: Brian Rutledge mentions that bo.lt seems to be moving the rel=canonical tag into the body section, making it ineffective. However, that’s not what my experience was. Looks like a bug in bo.lt.
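If you want to check a copied page yourself, a rough sketch like the one below reports every rel=canonical link and whether it sits inside the head section. The names are hypothetical, and it assumes the page has explicit head tags; it is only meant as a quick sanity check.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records each rel=canonical link and whether it appears inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, was_inside_head)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attrs = dict(attrs)
            rel = (attrs.get("rel") or "").lower().split()
            if "canonical" in rel:
                self.found.append((attrs.get("href"), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def canonical_links(html):
    """Return [(href, in_head), ...] for every canonical link in html."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.found
```

A result with `in_head` set to `False` would match what Brian describes; a result with `True` would match what I saw.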

Update 2: I wanted to see if there is a way to block bo.lt from copying a website, the same way crawlers can be excluded using robots.txt. I found that bo.lt does not look for robots.txt before grabbing a page. I looked through my Apache logs and it never requested robots.txt today. Another problem is that they are faking the User Agent to be Firefox 3.6.4 (see image below). Whois on the IP address confirms it is owned by Boltnet, Inc. The only way to stop bo.lt from copying your pages is to ban the IP address 199.204.84.2, until they change it.
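The log check is easy to repeat on your own server. Below is a small sketch that assumes Apache's common/combined log format; the function names are made up, and the sample lines in the usage note are invented for illustration rather than taken from my logs.

```python
import re

# Matches the start of an Apache common/combined log line:
# client IP, two identity fields, [timestamp], "METHOD path protocol", status.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def paths_requested_by(log_lines, ip):
    """Return every path requested by the given client IP, in log order."""
    paths = []
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group("ip") == ip:
            paths.append(match.group("path"))
    return paths


def fetched_robots_txt(log_lines, ip):
    """True if the given client IP ever requested /robots.txt."""
    return any(
        path.split("?")[0] == "/robots.txt"
        for path in paths_requested_by(log_lines, ip)
    )
```

For example, with two invented log lines:

```python
sample = [
    '199.204.84.2 - - [01/Jan/2011:10:00:00 +0000] "GET /some-post/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 Firefox/3.6.4"',
    '203.0.113.7 - - [01/Jan/2011:10:00:05 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "ExampleBot/1.0"',
]
print(fetched_robots_txt(sample, "199.204.84.2"))
```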

Stop using “hits”

Back in the early days of the web (pre-bubble 1.0), “hits” was a popular metric used to measure the popularity of a webpage or site. Technically a hit is “a request for a file from the web server.” Any file – jpg, png, css, js, pdf, html…you get the point.

I am amazed how many people still use “hits” when talking about their website, and even tech bloggers use them in their articles. It is not that the hits metric is wrong; it is useless. It does not tell you anything about the actual usage of a website. I can always inflate the number of hits my blog gets by adding ten 1px by 1px images to the footer: every page load then generates ten extra hits. The number will sound much larger, but the usage of the website will be the same.
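Here is a rough sketch of the difference, counting both numbers from a list of requested paths. The extension list is my own assumption about what counts as an asset rather than a page, and the function names are made up for illustration.

```python
import os
from urllib.parse import urlparse

# Assumed set of non-page files; anything else is treated as a page.
ASSET_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".css", ".js", ".ico", ".pdf"}


def is_pageview(path):
    """Heuristic: a request is a pageview unless it ends in an asset extension."""
    extension = os.path.splitext(urlparse(path).path)[1].lower()
    return extension not in ASSET_EXTENSIONS


def count_hits_and_pageviews(paths):
    """Return (hits, pageviews) for a list of requested paths."""
    hits = len(paths)  # every requested file is a hit
    pageviews = sum(1 for path in paths if is_pageview(path))
    return hits, pageviews
```

One page load with a stylesheet, a script and a logo is four hits and one pageview. Add ten tracking pixels to the footer and the same load becomes fourteen hits, still one pageview.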

Now that we know that the 2,200 – no – 22,000 hits Zuckerberg got in the movie The Social Network aren’t that impressive, let’s see how we should refer to our traffic.

I believe that when most people talk about hits these days, they actually mean pageviews. As a tech startup founder or a tech blogger, you need to know the difference between a pageview and a hit.

I consider “pageviews” the most basic metric you can use to talk about a website. It is better than hits, but it still doesn’t tell you much about user retention or engagement.

Here are some of the metrics that do tell you something about your website:

  • Visitors
  • Unique Visitors
  • Repeat Visitors
  • Pages Per Visit
  • Time Spent Per Visit
  • Monthly Active Users
  • Daily Active Users
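As a small illustration of how a few of these can be derived, the sketch below assumes you already have one record per pageview, tagged with a visitor id and a visit id (how you assign those ids is up to your analytics setup). The function and field names are hypothetical, and I am treating “Visitors” as the number of visits.

```python
from collections import defaultdict


def summarize(pageviews):
    """pageviews: iterable of (visitor_id, visit_id) tuples, one per page viewed."""
    visits_by_visitor = defaultdict(set)
    pages_by_visit = defaultdict(int)
    for visitor_id, visit_id in pageviews:
        visits_by_visitor[visitor_id].add(visit_id)
        pages_by_visit[(visitor_id, visit_id)] += 1

    total_visits = len(pages_by_visit)
    total_pageviews = sum(pages_by_visit.values())
    return {
        "visits": total_visits,
        "unique_visitors": len(visits_by_visitor),
        # Visitors who came back for more than one visit.
        "repeat_visitors": sum(
            1 for visits in visits_by_visitor.values() if len(visits) > 1
        ),
        "pages_per_visit": total_pageviews / total_visits if total_visits else 0.0,
    }
```

Daily and monthly active users follow the same pattern: count unique visitor ids within a day or a month.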

All tech bloggers and startup founders – please stop using the word (and the metric) “hits”.