Google Sandbox Theory

I was writing a series of articles on Redirects for SEOs and partway through I was hit with an inspiration related to the Google Aging Delay and the Sandbox.

Personally, I think it makes a lot of sense, but I ran into a lot of "prove it" type resistance over at HR, so I guess I'm going to.

I don't believe in conspiracy theories and refuse to believe that the "sandbox" is some evil plot to screw over new webmasters. Therefore I'm testing my theory. If I'm right, this domain here:

Will tell me if I'm smoking drugs or not. Domain registered Oct 24, 2005, Anti-Sandbox test countdown begins now... ;)

NB: Please keep in mind that the sandbox article is based on an unproven theory, using a combination of redirect theory, quotes from Google engineers, the Historical Data Patent, a bunch of anecdotal evidence, and me sitting around thinking about it carefully. It's not proven - that's what the domain, and specifically the way I'm using it, is for.

Caveat Emptor


Google Sitemap is also an IBL link checker

I was installing a Google Sitemap on a client site today and came across a nifty feature that I haven't seen anyone mention yet.

Once you have uploaded the sitemap, it will come back with whether or not there has been an error. As long as there isn't, most people stop there. But there is a link right beside the sitemap in the control panel called "verify".

Being a cautious and curious sort, I decided that although there were no errors, and the sitemap had already been spidered, I'd verify it anyway. In order to do this you need to upload a fake file with a specific name, so that Google can verify that you actually have control over the site.

Once we did so, I was shown a screen that informed me that Google had trouble spidering the following 10 links, then listed them. The interesting thing is that not one of the links listed was in the sitemap!

This leads me to believe that this is a nifty method of checking for broken or difficult IBLs (inbound links) to your site. Naturally, we are creating 301 redirects for these links to deal with the lost traffic and PR...
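For anyone who wants to see what that kind of 301 mapping looks like, here is a minimal sketch in Python. It is not what we actually used on the client site (that would normally be done in the web server's own configuration); the paths in `REDIRECTS` and the name `redirect_app` are made up for illustration.

```python
# Hypothetical mapping of broken inbound-link paths to their live replacements.
# These paths are invented for illustration; use the ones the report lists.
REDIRECTS = {
    "/old-page.html": "/new-page.html",
    "/products/widget": "/shop/widget",
}


def redirect_app(environ, start_response):
    """Minimal WSGI app: known broken paths get a 301, everything else a 404."""
    path = environ.get("PATH_INFO", "")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells browsers and spiders that the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

Any WSGI server can host a function like this. The point is simply that each broken inbound URL gets a permanent redirect to the page it should have reached.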

Just thought I'd pass that along.


Blog Comment Legal Issues in the News

There is a new article out by Corilyn Shropshire (Pittsburgh Post-Gazette) inspired by the TP suit against Aaron Wall about the potential legal ramifications of comments in blogs against bloggers.

Since this is very much an interest of mine currently ;) , I'd like to look at the article. For the record, Corilyn did interview me.

The case has raised the ire of bloggers across the Internet, outraged and fearful that companies that don't like what is written about them can sue.

"This kind of thing raises my dander," said Ian McAnerin, a consultant and blogger who founded a search engine industry group, Search Marketing Association of North America. "The speed at which blogs are updated and comments can be made on them makes it very difficult to have editorial control," he added.

Mr. McAnerin said he expects more lawsuits like the one against Mr. Wall as the Internet and blogs become more commercial.

That worries what Mr. McAnerin calls "the little guy," individual bloggers without financial or corporate backing, such as Greg Jarboe. The Acton, Mass.-based blogger runs a search engine-focused marketing firm. "I have a blog, and I call them like I see them," said Mr. Jarboe. "I like to think it's my First Amendment right."

Just to be clear, I never mentioned Greg Jarboe (I didn't even know he had a blog) so although it looks like I used him as an example, I didn't. Nothing personal against Greg, but I prefer to not have words or references put into my mouth. It's possible it's just worded in such a way that I'm not interpreting it properly, but that's how I read it.

I'm still kind of annoyed at him for being the conduit for this drivel:

It is the policy of SEMPO not to comment on any legal cases pending, particularly those that do not directly involve our organization. This matter in particular will be decided under existing case law relating to freedom of speech, libel/slander, and contract law. There is no compelling reason for a nonprofit group with a mission of education and market expansion to become embroiled in a legal discussion unless there is a specific reason for it such as providing expert opinion on definitions or methodologies; and if we had been solicited, then we certainly wouldn’t be able to comment.

I didn't see SEMPO standing up for anyone earlier. So it's not an issue until they come knocking on your own door? Come on. That's just not right.

Having said those nitpicks, it's a pretty good article - too bad TP's lawyer never seems to respond to anything. I suppose it might be a case of "when you find yourself in a hole, stop digging and put down the shovel", but of course I don't know. Maybe it's a master plan or something...

The important part of the article, of course, is this issue:

Will bloggers be treated like newspaper reporters, protected by the First Amendment but subject to libel and defamatory laws, or will they be treated like "common carriers," such as telephone companies, and not held liable for what other people write and say? Section 230 of the 1996 Communications Decency Act protects Internet service providers and Web sites from liability for information posted by third parties. But the courts have yet to decide if bloggers enjoy those same privileges. It's his job to convince the court, Mr. Stern said, that bloggers fall in the same category as Internet service providers and Web sites.

Naturally, I'm on the side of blogs being more of a "common carrier" than a "publisher". But it's kind of complicated. See, a blog is a bit of both.

When a blogger writes what they write as an article, they can't turn around and claim they can't be held responsible for their own words just because it's on a blog. Having a blog does not relieve you of taking responsibility for your own actions and words.

Now, of course there are all sorts of defenses such as Freedom of Speech (though that usually only applies to governments), fair comment, personal opinion, fact, discussion of a public figure, and so forth. These obviously apply to all bloggers and, indeed, all writers period - blog or not.

But there are 2 parts to a blog - the original blog entry made by (usually) the owner of the blog, and the comments by others about the entry. These comments are the sticking point, and the area of contention.

It boils down to this:

1. The blogging software company (i.e. Blogger, in my case) is clearly a common carrier under the law and isn't responsible for what a blogger writes, as they have very little control over it.

2. The blogger is clearly a publisher with regard to their own posts on their own blog. They have total control over what they say and how they say it.

3. The people making comments are a totally different issue. On one hand, you could argue that the blog owner can exert control over their posts. On the other hand, this isn't how it's normally done - blog spam being a perfect example of the lack of control in this case.

It all boils down to control. Control equals responsibility, most of the time. The more control you have over the results, the more responsible you are for them.

Just because a common carrier *can* exercise control doesn't mean that they do, or should be expected to. If they did, they would probably lose their common carrier status.

So what about bloggers? Should they be expected to exert large levels of control over the comments in their blogs? See, it's not a question of exerting *some* control - that's unavoidable in some cases. Just because a common carrier will often act against an obvious spammer on their network doesn't mean that they are exercising enough control to stop being a common carrier. But it certainly makes it harder to draw the line.

Some have argued that there is a middle position, often called a "distributor". This is the equivalent of a newspaper stand that distributes the newspapers, but has no control over their content. The thinking was that the distributor is not liable unless they know that the publications they are carrying are libelous, at which point they would be required to remove them.

This sounds like it might be the appropriate approach to the comments (it's what TP may consider arguing), but it's not that simple.

First, the courts have held that there is no such thing as a special "distributor liability" - a distributor is just another kind of publisher. However, they have also acknowledged that if you had to check with your lawyer every single time someone complained on the internet about something, you'd go broke - it's just not feasible.

Further, the natural tendency for people in that position would be to simply ban everything, which would result in an unwanted "chilling effect" on speech. Since the reason for this effect would be the response to the law, the First Amendment became involved and things got messy.

"Any attempt to distinguish between 'publisher' liability and notice-based 'distributor' liability and to argue that Section 230 was only intended to immunize the former would be unavailing. Congress made no distinction between publishers and distributors in providing immunity from liability. As the Fourth Circuit has noted: '[I]f computer service providers were subject to distributor liability, they would face potential liability each time they receive notice of a potentially defamatory statement--from any party, concerning any message,' and such notice-based liability 'would deter service providers from regulating the dissemination of offensive material over their own services' by confronting them with 'ceaseless choices of suppressing controversial speech or sustaining prohibitive liability'--exactly what Congress intended to insulate them from in Section 230. Zeran v. America Online, Inc., 129 F.3d at 333. Cf. Cubby, Inc. v. Compuserve, Inc., 776 F.Supp. 135, 139-40 (S.D.N.Y. 1991) (decided before enactment of Communications Decency Act)."

So, now we are at Section 230 of the 1996 Communications Decency Act. Feel free to read it for yourself. First, the relevant passage is:

Treatment of publisher or speaker. No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.

Sounds great! What exactly does that mean? Well, clearly the lynchpin in the passage is the term "provider". If I'm not responsible because I'm just providing what some other provider created, then I'm home free. So we need to know what a provider is in the context of this act, and then see how that would apply to a commenter in a blog.

Let's look at the definition of "provider" then:

Information content provider. The term "information content provider" means any person or entity that is responsible, in whole or in part, for the creation or development of information provided through the Internet or any other interactive computer service.

Hmmm... Anyone responsible for the creation or development of information provided through the Internet sounds like "anyone posting their own stuff on the internet". In view of the fact that internet forums were firmly in mind (AOL and Prodigy specifically) when this was passed, I think that's a reasonable interpretation.

The law is also clear that only publishers and speakers are liable - not common carriers, etc. I think everyone can agree that the poster is also the publisher of their own posts.

"User of an interactive computer service" pretty clearly includes website users, I think.

So, to do a cut and replace, that would imply that this passage means:

"No poster or website user shall be liable for information provided by another poster."

I think that pretty clearly spells out the rules for blog comments.

My opinion,


Marketing vs SEO

A search engine attempts to identify whether your content is relevant by comparing it to all the other potentially relevant documents for that term.

It has no choice, since it doesn't understand English, French, Chinese or whatever the documents happen to be written in - it just counts words and compares results.

The net result is that a search engine will define "relevant" as something that talks about a subject in the same manner that other sites on the subject talk about it. It then relies on link analysis to sort it out from there.
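To make the "counts words and compares results" idea concrete, here is a toy sketch in Python. It is only an illustration of the principle, not how any real search engine scores pages: it assumes plain word counts and cosine similarity, and the names `word_counts`, `cosine` and `relevance` are my own.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase words; no understanding of the language is involved."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance(document, peers):
    """Score a document by how closely its wording matches its peers' combined wording.

    The peer counts are summed rather than averaged; cosine similarity ignores
    scale, so the score is the same either way.
    """
    combined = Counter()
    for peer in peers:
        combined.update(word_counts(peer))
    return cosine(word_counts(document), combined)


# Made-up example documents on the same subject.
peers = [
    "dog training tips for puppies",
    "puppy training and dog obedience tips",
]
typical = "tips for training your dog"
unusual = "teaching canine companions good manners"
```

With these made-up documents, `relevance(typical, peers)` comes out higher than `relevance(unusual, peers)`, even though both are about the same thing. The second one just doesn't use the words everyone else uses.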

The good news is that this identifies most spam fairly quickly, and also identifies on-topic documents pretty well.

The bad news is that spam written using pseudo-natural syntax will often pass the relevancy filter, and very well written information that approaches a subject from a different angle than normal, or uses more technical or less well known words to describe a subject may be judged as less relevant, when in fact it might be far superior.

In the case of the far superior content, although it would initially get dinged as less relevant, the search engine will attempt to take into account human opinion by looking at links. This is why links will often trump content. The search engine is hoping to reward exceptional material it can't understand simply by comparing it to its peers.

This is a built in limitation of using a computer to do search - it rewards mediocrity in the content because mediocrity is easier to measure.

Basically, the larger the data-set, the more confident you are in your conclusion. The largest data set comes from the largest pool - i.e. the "average". Therefore your content is judged based on comparing it to the average, rather than the spectacular. This works well for content that is inherently informative, but not so well for content that is inherently creative in nature.

This type of analysis worked very well back when the searching on the web was primarily for information. However, when commercial sites came along, so did marketing.

Marketing is inherently creative. Great marketing is distinguished by not following the norm. Great marketing doesn't pay much attention to search engine algorithms; it attempts to speak to the consumer's needs and dreams.

This is a problem for SEOs and one reason why the best SEOs are generally creative people, not technicians or search scientists. They need to work with both sides of the equation - the technical side gets you rankings and visitors, but the side that speaks to people's souls also speaks to their wallets.

Therefore a good SEO will attempt to compensate for this problem by using one or more methods:

1. First, the SEO will attempt to make the document match the relevancy criteria the search engine is looking for simply by adding in keywords and related terms and phrases. In short, use "natural" writing combined with knowledge of keywords and search engine behavior.
This works most of the time because a great many relevant pieces of content are solid information pieces, not artistic masterworks. You can make solid information SEO friendly while maintaining (and usually improving) the writing. This is where a good SEO copywriter really shines.

The very best can take information and make it speak to a consumer's needs and dreams while still being search engine friendly, but it's an art, not a science.

2. If the document can't be changed, or if it would be a crime against common sense to do so (for example, taking someone’s poetry and making it "seo-friendly" would ruin it - it would no longer be the same poem) then you have to be more creative and work with titles, anchor text, linking, and so forth in order to compensate for the search engine's inability to appreciate the work's artistic merits. This requires a more technical SEO approach.

The best ranking sites match a search engine's expectation for what a good site should be. The best converting sites match the consumer’s expectations for what a good site would be. The best SEOs understand this and work to accomplish both goals at the same time.


Searching For the Perfect Bride at

My good friend, Barry Schwartz (aka Rustybrick) is well known on the SEO forums. Today he also made internet history by being the first person to propose (to his girlfriend Yisha Tversky) using a search engine, thanks to the great people over at Ask.

So she does the search and bam!, up comes that result that asks her to marry me. At that time, while I kneel behind her, I pull out flowers and the diamond ring from the surrounding draws. She turns the swivel chair around and I ask her to marry me.

Awwww... Ultra-sweet :) And very cool. There have been lots of cases of people bidding on someone's name to send them a message (we did that over at the High Rankings forum to wish Mike Grehan a happy birthday) and of course, people have been bidding or optimizing on other people's names in order to connect to a market (or complain) almost since the beginning of search engines, but this takes it to a whole new level.

The full story is on the Yisha and Barry wedding site.

There are 2 different searches that bring up different information on Ask. If you type in rustybrick engagement you'll get a very nice information spot:

And if you type in Yisha Tversky you'll get the original proposal:

I'd also like to take this opportunity to thank the Ask/Teoma team for helping out with this - you guys rock!

Many good wishes to the happy couple - looks like you've found who you've been searching for after all! :)


New Search Engine Share Chart

I've updated my search engine share chart to reflect the latest information. I started with logs from my own sites and then augmented them with some information from Search Engine Watch. Enjoy.