Like many people, I use lynx to test websites as part of my SEO work. Lynx is a text-only web browser. I used it in the old days, when I would telnet into a SunOS server to access that newfangled fad called the web.
Once graphical browsers became common, lynx lost a lot of popularity. However, text browsing has seen a revival recently for two main reasons. First, people who are vision-impaired can't use graphical browsers, and there is a strong movement among forward-thinking designers and companies to make sure that websites can be visited and used by everyone, not just people with good eyesight, fast connections, and graphical browsers. I think this is a very good thing.
The second reason is less selfless and more pragmatic: search engines, which often deliver a huge portion of a website's visitors, are essentially text-only browsers. If you don't make them happy, you can find yourself losing a substantial number of potential visitors to competitors whose sites might not have all the nifty Flash that yours does, but are visible to search engines instead. Since that's a direct hit to the pocketbook, it's caught the attention of the business world as well.
Arguably, the single most important visitor to a website is a search engine, and a search engine is, for all intents and purposes, blind. It's kind of sad that it took this type of pressure to get a lot of sites to care about something they should care about anyway, but I take my victories where I can.
Accordingly, testing your website with a text-only browser has moved from "it would be nice" to "it's absolutely critical". A text browser is an essential part of the toolkit of both the SEO and the web designer.
Normally, I've been using the actual lynx executable, but it's kind of hard to use after several years of Windows (no, the mouse doesn't work in it! Blind people can't see pointers, silly).
Additionally, when I'm talking to a client or giving a presentation, I can't ask them to install an executable on their system just to demonstrate something quickly. Some clients are disallowed from doing so by corporate policy even if they wanted to. Plus, it's hard to use for the mouse generation.
Up until now, the answer has been the Delorie tool, which is basically a web-based lynx viewer. Recently, due to bandwidth issues and hacks, they discontinued its easy functionality and now require a site to install a custom page before its pages can be viewed. Well, clients are not going to do that any more than they would install lynx. Now the tool can basically only be used on sites under your own control, which presumably is the intent.
But that makes it useless for most SEOs and designers, except on personal projects or for clients they already have.
Additionally, this tool didn't do what I wanted it to do (I still had to use 10 or 15 sites just to check one site), so I started to develop my own, in conjunction with a marketing company and developer I know. Lynx is close, but it doesn't show you what a search engine sees, only what is intended for text browsers; there is a big difference between the two in the handling of graphics, headings, lists, and other page components.
In short, lynx is good, but not good enough. I needed a true SEO Browser. So I made one, in conjunction with a marketing company I work with a lot (Anduro) and a developer here in Calgary (Commerx). They are key to the fact that this project exists anywhere outside of my head. :D
The concept is simple: create a tool that mimics what a search engine sees, and then add the other tools an SEO needs alongside it.
Due to bandwidth usage, we are planning to make it half commercial, half free. The basic functionality will be, of course, free. It's essentially an online lynx-type viewer with some enhancements specific to SEOs (like how it handles some things that a search engine might care about but a pure text browser would not).
Then the idea is to have a much more robust version behind a login that will do all sorts of fun things, many of which will require a Google API, etc (thus the login).
The basic functional version is here: http://www.seo-browser.com/
For now, the advanced version is being added to daily and is open to the public for testing purposes. After we get it working perfectly, it will be a paid area (probably along the lines of Wordtracker's model: pay by day, week, year, etc.).
One thing I'd like to add once it's done is the ability to get an XML feed from the advanced section that people can use to format their own reports with, etc. Please feel free to test it and provide feedback. We are committed to making both versions (including the free one) available and the best we can make them.
So far, the response has been amazing! Especially in view of the fact that this is still very much in development and is nowhere near being finished.
Some additional information it will hopefully provide includes:
- A link to the HTTP header information for the page.
- A link to the CSS file
- Word and character counts for the meta data, title, and body text
- A link count (how many links are on the page – how many are internal and how many are external)
- An image count, along with how many do not have ALT attributes (alt="" would count as an ALT attribute for SEO purposes)
- A link to check the W3C Validity of the page
- The number of backlinks this page has under Google, Yahoo, MSN, and Teoma (and whether it exists in the index at all)
- Whether the page contains Flash, Java, imagemaps or DHTML (in red – these are usually bad)
- The meta keywords would be links that add extra functionality to the page (explained below), and would also show the number of uses and the density % (example: "SEO [14, 12.5%], promotion [3, 1.2%], mcanerin [5, 4.6%]"). I would suggest a limit of 15 keywords; if there are more than that, it should be noted as an error, and only the first 15 shown.
- A list of cookies requested/sent would be listed. This tool should never accept cookies, however (search engines don’t)
- A link to the robots.txt, and an error if it does not exist
- The IP address displayed
- Page load time displayed
- A list of comments (how many, and the contents of them – can be an unformatted dump)
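As a rough illustration, a few of the checks above (internal vs. external link counts, and images missing ALT attributes) could be computed with a simple HTML parser. This is a hypothetical sketch, not the tool's actual code; the class name and the host-matching rule for "internal" links are my own assumptions.

```python
# Hypothetical sketch of a few of the checks listed above; not the
# seo-browser.com implementation, whose code is not public.
from html.parser import HTMLParser
from urllib.parse import urlparse

class PageAuditParser(HTMLParser):
    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host
        self.internal_links = 0
        self.external_links = 0
        self.images = 0
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            host = urlparse(attrs["href"]).netloc
            # Relative links and same-host links count as internal.
            if not host or host == self.site_host:
                self.internal_links += 1
            else:
                self.external_links += 1
        elif tag == "img":
            self.images += 1
            # alt="" still counts as present for SEO purposes;
            # only a truly absent attribute is flagged.
            if "alt" not in attrs:
                self.images_missing_alt += 1

html = ('<a href="/about">About</a><a href="http://other.com/">Out</a>'
        '<img src="x.gif"><img src="y.gif" alt="">')
p = PageAuditParser("www.example.com")
p.feed(html)
print(p.internal_links, p.external_links, p.images, p.images_missing_alt)
```

A real crawler would also have to resolve redirects and fetch the page itself; this only shows the counting step.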
Under this is the text only page itself. This page would look the same as the free version except:
Text that is hidden using CSS or in other manners (i.e. black text on black, or 1-point-high text) would be italicized. Text that is within a header tag is actually displayed in a header tag (H1, H2, etc). When you click on a keyword in the keyword list, it highlights in bold. Up to 3 keywords can be highlighted at once.
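Detecting hidden text reliably is hard, but a crude inline-style heuristic along these lines would catch the obvious cases mentioned above (same-color text and 1-point fonts). This is my own illustrative sketch, not the tool's actual detection logic:

```python
# Hypothetical heuristic for spotting CSS-hidden text in an inline
# style attribute; a sketch only, not the tool's real rules.
import re

def looks_hidden(style: str) -> bool:
    """True if text color matches background color, or font is ~1pt high."""
    s = style.lower().replace(" ", "")
    # (?<!-) keeps "color:" from matching inside "background-color:".
    color = re.search(r"(?<!-)color:(#[0-9a-f]{3,6}|[a-z]+)", s)
    bg = re.search(r"background(?:-color)?:(#[0-9a-f]{3,6}|[a-z]+)", s)
    if color and bg and color.group(1) == bg.group(1):
        return True
    size = re.search(r"font-size:(\d+)(?:px|pt)", s)
    return bool(size and int(size.group(1)) <= 1)

print(looks_hidden("color:#000; background-color:#000"))  # black on black
print(looks_hidden("color:#fff; background:#000"))        # normal contrast
print(looks_hidden("font-size:1pt"))                      # 1-point text
```

A production check would also need to follow external stylesheets and resolve equivalent color notations (`#000` vs. `#000000`), which this deliberately ignores.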
Under this should be another section with 3 links: Compress, Tokenize, and Index.
Compress would pop up another window and show the text as a pure, unformatted text dump with no punctuation. Tokenize would pop up another window and show the compressed text but with stop words removed (a, the, and, but, etc.). Index would take the tokenized list and display it as a word list with a count for each word, as well as its density (like the meta tag keywords).
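The Compress, Tokenize, and Index steps described above amount to a small text-processing pipeline. Here is a minimal sketch, assuming a tiny illustrative stop-word list (the real list would be much longer) and my own rounding choice for the density figure:

```python
# Sketch of the Compress -> Tokenize -> Index pipeline; the stop-word
# list and density formula are illustrative assumptions, not the tool's.
import re
from collections import Counter

STOP_WORDS = {"a", "an", "the", "and", "but", "or", "of", "to", "in", "is"}

def compress(text):
    """Pure unformatted text dump: punctuation stripped, lowercased."""
    return re.sub(r"[^\w\s]", "", text).lower().split()

def tokenize(words):
    """The compressed text with stop words removed."""
    return [w for w in words if w not in STOP_WORDS]

def index(tokens):
    """Word list with a count and density % per word."""
    counts = Counter(tokens)
    total = len(tokens)
    return {w: (n, round(100 * n / total, 1)) for w, n in counts.items()}

words = compress("The SEO browser shows what a search engine sees, and the SEO benefits.")
tokens = tokenize(words)
print(index(tokens))
```

Here "seo" appears twice in nine remaining tokens, so it would be displayed much like the meta keyword example earlier: SEO [2, 22.2%].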