How It Works

How does it work? In geek-speak, it's a multi-tier, fully distributed system. In plain language, it runs like this:

We have a central server, which coordinates all the activity of the system and serves as the central data store. It keeps track of which URLs have been crawled and receives the data that the crawler sends back.
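To make the server's bookkeeping a little more concrete, here is a minimal sketch of how it might track crawled URLs and store the data coming back. The class name CrawlRegistry and its methods are invented for illustration and are not taken from the actual system; it also assumes page data can be held as a plain string.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: these names are assumptions, not the real server code.
class CrawlRegistry {
    // Maps each crawled URL to the data a feed sent back for it.
    private final Map<String, String> pages = new ConcurrentHashMap<>();

    /** Returns true if no feed has delivered data for this URL yet. */
    boolean needsCrawl(String url) {
        return !pages.containsKey(url);
    }

    /** Stores data from a feed; returns false if the URL was already recorded. */
    boolean receive(String url, String data) {
        return pages.putIfAbsent(url, data) == null;
    }

    /** Looks up the stored data for a URL, if any. */
    Optional<String> lookup(String url) {
        return Optional.ofNullable(pages.get(url));
    }
}
```

A concurrent map is used here on the assumption that several feeds may report to the server at the same time.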

There is a "feed", which can crawl around the internet getting HTML pages, pull data out of XML sources, or take it directly from news wires. At the moment, it's just crawling around, grabbing pages that are new enough to be of interest. Periodically, it starts a new crawl against the sources it's directed to. There can be as many "feeds" as desired, and soon we'll start distributing the feed so that other people can contribute content and search power to the system.
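The "new enough to be of interest" check could look something like the sketch below. This is only an illustration: FreshnessFilter, Page, and the idea of a fixed maximum age are assumptions, and the real feed may decide freshness quite differently.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: class and field names are illustrative assumptions.
class FreshnessFilter {
    /** A crawled page with the time it was last modified. */
    static final class Page {
        final String url;
        final Instant lastModified;

        Page(String url, Instant lastModified) {
            this.url = url;
            this.lastModified = lastModified;
        }
    }

    private final Duration maxAge;

    FreshnessFilter(Duration maxAge) {
        this.maxAge = maxAge;
    }

    /** A page is fresh if it was modified no earlier than now minus maxAge. */
    boolean isFresh(Page page, Instant now) {
        Instant cutoff = now.minus(maxAge);
        return !page.lastModified.isBefore(cutoff);
    }

    /** Returns only the fresh pages, in their original order. */
    List<Page> keepFresh(List<Page> candidates, Instant now) {
        List<Page> fresh = new ArrayList<>();
        for (Page page : candidates) {
            if (isFresh(page, now)) {
                fresh.add(page);
            }
        }
        return fresh;
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the filter, keeps the check easy to test.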

There is a Java applet, which is the part of the system you see. It comes up in your browser, asks you to log in, fetches the page data the server holds, and then searches through that data, using the power of your machine, to find the things you are looking for. Every so often the client goes back to the server and asks for any new data, which is then sent down and displayed to you. Pretty simple, really.
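Here is a rough sketch of the client-side idea: keep a local copy of the page data, merge in whatever new data the server sends, and run the search on your own machine. ClientCache and its methods are made-up names, and plain case-insensitive substring matching is an assumption standing in for whatever search the applet really performs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: names and matching strategy are assumptions.
class ClientCache {
    // URL -> page text, kept in the order the pages arrived.
    private final Map<String, String> pages = new LinkedHashMap<>();

    /** Merges a batch of new page data received from the server. */
    void merge(Map<String, String> update) {
        pages.putAll(update);
    }

    /** Returns the URLs of all cached pages whose text contains the term. */
    List<String> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : pages.entrySet()) {
            if (entry.getValue().toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}
```

In this sketch, each periodic trip back to the server would end with a call to merge, after which later searches see the new pages.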

The whole system is written in Java, which allows us to run it on virtually anything, and allows you to run the client on any Java-capable browser.


NewsToYou.com - making the web self-aware®
NewsToYou.com is Copyright 2003-2010,
and is a trademark of Spinning Cogs, Inc.

Java is a trademark of Sun Microsystems, Inc.

Site Produced and Developed by
SnowDog Web Development, Inc.