How To Create An Image Search Engine?
While crawling the web and following links, you should track the img tags and the links that point to images. Most images on the web are JPEGs, GIFs and PNGs, so you can detect them by file extension.
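As a rough illustration of that step, here is a minimal Python sketch that collects image URLs from a page. It assumes the crawler has already downloaded the HTML; the names `looks_like_image` and `ImageLinkCollector` are made up for this example, and a real crawler would also want to check the Content-Type header, since not every image URL ends in a known extension.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Extensions mentioned in the text (plus the common .jpeg spelling).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}


def looks_like_image(url):
    """Guess whether a URL points to an image, based only on its extension."""
    path = urlparse(url).path  # ignores the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    return extension in IMAGE_EXTENSIONS


class ImageLinkCollector(HTMLParser):
    """Collects the src of every img tag and every link that looks like an image."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            # Resolve relative URLs against the page address.
            self.image_urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            url = urljoin(self.base_url, attrs["href"])
            if looks_like_image(url):
                self.image_urls.append(url)
```

To use it, create `ImageLinkCollector(page_url)`, pass the page source to `feed()`, and read the `image_urls` list.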
Now what metadata can you find about a picture? You can find information in the file name, in the alt attribute, in the link description, in the text near the image and in the context of the page. Although the text that surrounds an image is sometimes relevant, there is no guarantee: the image may illustrate a very small detail, or something only loosely related to the topic of the page.
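Some of these sources are easy to pull out of the HTML. The sketch below is only an assumption about how one might do it, not a description of how any real search engine works: it records the words in the file name, the alt text, and the page title as a crude stand-in for the page context. The link description and the text near the image are left out to keep it short, and the names `words_from_filename` and `ImageMetadataParser` are hypothetical.

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


def words_from_filename(url):
    """Split an image file name such as 'tori-amos_2005.jpg' into lowercase words."""
    name = posixpath.basename(urlparse(url).path)
    stem = posixpath.splitext(name)[0]
    return [word.lower() for word in re.split(r"[^A-Za-z0-9]+", stem) if word]


class ImageMetadataParser(HTMLParser):
    """Records the file name words and alt text of each image, plus the page title."""

    def __init__(self):
        super().__init__()
        self.page_title = ""
        self.images = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "img" and attrs.get("src"):
            self.images.append({
                "src": attrs["src"],
                "name_words": words_from_filename(attrs["src"]),
                "alt": attrs.get("alt") or "",  # alt may be missing or empty
            })

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text inside <title> gives a rough idea of the page topic.
        if self._in_title:
            self.page_title += data
```

After calling `feed()` with the page source, each entry in `images` holds the candidate keywords for one picture. How much weight to give each source is a separate question, for the reason given above.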
To find out which image search engine is the best, I tested five big search engines (OK, Flickr is not a search engine, but it has great photos) against 20 searches.
Test searches:
personalities: John Battelle, Larry Page, Vladimir Putin, Andy Roddick, Tori Amos
feelings: fear, shallow, unrequited love, laughing child, tough argument
screenshots: gnuplot, paint shop pro, openoffice, freebsd, solaris [the book, the film, the operating system]
events: New York attacks, India tsunami, gnomedex, Robbie Williams concert, Oscars 2005
I looked mainly at the first result, but I also took the other results into consideration if they were great.
The results:
picsearch.com: 17/20
Probably the best image search. It doesn't have many results, but the sources are carefully selected.
(bad results for: new york attacks, shallow, tough argument)
images.search.yahoo.com: 15/20
Hilarious results, not-safe-for-work results, and logos usually come first.
(bad results for: fear, tough argument, freebsd, new york attacks, gnomedex)
images.google.com: 14/20
Not as good as the web search.
(bad results for: fear, shallow, unrequited love, paint shop pro, freebsd, new york attacks)
pictures.ask.com: 14/20
Ask has really good related searches.
(bad results for: fear, shallow, unrequited love, tough argument, paint shop pro, freebsd)
flickr.com: 8/20
Where is the search button? The tags are great, but they restrict the search queries.
(bad results or no results for: shallow, unrequited love, laughing child, tough argument, gnuplot, paint shop pro, openoffice, new york attacks, india tsunami, gnomedex, robbie williams concert, oscars 2005)