Content Separation

Google uses an interesting system in its Search Appliance for businesses: the googleoff/googleon tags disable indexing for part of a web page. If you put this code in an HTML file:

Google Search
<!--googleoff: index-->
Appliance
<!--googleon: index-->
is great.


the search engine won't index the text "Appliance". This might be useful for navigation links and for irrelevant or sensitive content. The tags can be used in other ways as well, and the idea is similar to AdSense's section targeting.
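To make the effect of the tags concrete, here is a minimal Python sketch of how an indexer might drop the marked sections before looking at the text. It only illustrates the idea and is not how the Search Appliance is implemented; the function name `indexable_text` and the choice to treat an unclosed googleoff section as running to the end of the page are assumptions.

```python
import re

# Matches everything from a googleoff marker up to the next googleon marker.
# Assumption: an unclosed googleoff section extends to the end of the page.
_OFF_SECTION = re.compile(
    r"<!--\s*googleoff:\s*index\s*-->.*?(?:<!--\s*googleon:\s*index\s*-->|\Z)",
    re.DOTALL | re.IGNORECASE,
)


def indexable_text(html):
    """Return the page text with googleoff/googleon sections removed."""
    kept = _OFF_SECTION.sub(" ", html)
    # Collapse runs of whitespace left behind by the removed sections.
    return " ".join(kept.split())


page = """Google Search
<!--googleoff: index-->
Appliance
<!--googleon: index-->
is great."""

print(indexable_text(page))  # prints: Google Search is great.
```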

After seeing a Wikipedia article (since edited) stating that Google uses these comment tags in its web search engine, Search Engine Roundtable asks whether the information is accurate. Most likely Google doesn't use the flags there, but they would help improve the quality of the search results. Google would need to check whether the flags are being used for dubious purposes, such as preventing the indexing of most of a page's content or hiding spam.

I've seen many search results that are in the top positions because of the keywords from the navigation links.

Another idea would be to stop taking the keywords from navigation links into account. Google also shouldn't index most of a blog's homepage, its archive pages, or essentially any homepage that changes very frequently: most of the time the content is available on separate pages (permalinks), so indexing it again is redundant. Such a result also forces you to open the cached copy to find what you searched for, because the live page shows different content by now.
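Ignoring navigation keywords could look roughly like the sketch below, which collects a page's text while skipping anything inside a navigation block. It assumes, purely for illustration, that navigation is wrapped in a `<nav>` element; real pages mark up navigation in many different ways, and the class name `ContentTextExtractor` and helper `content_text` are hypothetical.

```python
from html.parser import HTMLParser


class ContentTextExtractor(HTMLParser):
    """Collects page text, skipping anything nested inside <nav>."""

    def __init__(self):
        super().__init__()
        self.nav_depth = 0   # how many <nav> elements we are currently inside
        self.chunks = []     # text fragments found outside navigation

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.nav_depth += 1

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth > 0:
            self.nav_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not part of the navigation.
        if self.nav_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def content_text(html):
    parser = ContentTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


sample = (
    '<nav><a href="/">Windows Live</a> <a href="/ideas">Windows Live Ideas</a></nav>'
    "<p>Try the new mail client.</p>"
)
print(content_text(sample))  # prints: Try the new mail client.
```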

Google could also separate the content of a page into different clusters, so that if a page covers more than one subject, each part is treated like a separate page. This way, the keywords won't be mixed and the different topics stay independent.
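One very rough way to picture this kind of content separation is to group consecutive paragraphs that share vocabulary and start a new cluster when the vocabulary changes. The sketch below does that with a simple word-overlap (Jaccard) score; the function names, the threshold of 0.1, and the rule of ignoring words of three letters or fewer are all assumptions made for illustration, not a description of anything Google does.

```python
import re


def tokens(text):
    """Lowercased words longer than three letters, as a set."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def jaccard(a, b):
    """Share of words two sets have in common (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def split_into_topics(paragraphs, threshold=0.1):
    """Group consecutive paragraphs whose vocabulary overlaps enough."""
    clusters = []
    for paragraph in paragraphs:
        words = tokens(paragraph)
        if clusters and jaccard(clusters[-1]["words"], words) >= threshold:
            # Similar enough to the current cluster: extend it.
            clusters[-1]["paragraphs"].append(paragraph)
            clusters[-1]["words"] |= words
        else:
            # Vocabulary changed: start a new cluster (a new "virtual page").
            clusters.append({"paragraphs": [paragraph], "words": set(words)})
    return [c["paragraphs"] for c in clusters]


paragraphs = [
    "Windows Live Mail is a desktop email client.",
    "The email client in Windows Live Mail supports several accounts.",
    "Chocolate cake needs flour, sugar and cocoa.",
]
topics = split_into_topics(paragraphs)
print(len(topics))  # prints: 2
```

Each resulting cluster could then be indexed on its own, so keywords from one topic would not boost the page for queries about another.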

The first result for [windows live mail] is the ideas.live.com page rather than the product's page, and that's because "windows live" appears a lot on that page.

A combination of flags and automatic content separation would help improve Google Search.
