Thin content is content with little added value. Search engines tend to penalize these less valuable pages in the search results. One approach to avoiding thin content is to pay attention to page word count. The standard rule of thumb is a minimum of 200-300 words per page. The right threshold for you may differ. Quality always trumps length, short or long.
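As a rough illustration of the word-count check described above, here is a minimal sketch that counts the visible words in an HTML page and flags pages below a threshold. The names `word_count`, `is_thin`, and `MIN_WORDS` are hypothetical, and the default of 200 is simply the lower end of the rule of thumb; this is not how the datayze tool itself is implemented.

```python
from html.parser import HTMLParser

# Assumption: lower end of the 200-300 word rule of thumb; adjust for your site.
MIN_WORDS = 200


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a script or style element.
        if self._skip_depth == 0:
            self.chunks.append(data)


def word_count(html):
    """Return the number of whitespace-separated words in the visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


def is_thin(html, min_words=MIN_WORDS):
    """Flag a page whose visible word count falls below min_words."""
    return word_count(html) < min_words
```

A count like this only measures length. A page that passes the threshold can still be thin if the words add little value, so treat the flag as a prompt for review, not a verdict.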
About the Spider
To borrow a term from Dungeons and Dragons, the datayze spider is a Neutral Good spider. This tool is designed for website engineers who want to improve their sites' navigability, and possibly their search engine rankings. Because it assumes the webmaster of each domain initiated the crawl request, it crawls each and every page it can find. The spider does not store the content of the pages it crawls. Effectively, the spider treats each page as "FOLLOW, NOINDEX" regardless of the robots.txt file or robots meta tag. Hence the lack of a Lawful Good alignment.
We do recognize the possibility for abuse, and have put several safeguards in place to ensure our spider remains on the Good side of the alignment chart. Our spider crawls at a leisurely rate of 1 page every 1.5 seconds. While the spider doesn't keep track of the contents of the pages it crawls, it does keep track of the number of requests issued by each visitor. The spider will crawl no more than 400 pages for a given visitor on a given day.
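The two safeguards above, a fixed delay between requests and a daily per-visitor cap, can be sketched as follows. This is a minimal sketch under stated assumptions, not the spider's actual code: the class name `CrawlBudget`, the `visitor_id` parameter, and the choice of one shared delay across all visitors are hypothetical. Only the 1.5 second delay and the 400 page limit come from the text.

```python
import time
from collections import defaultdict
from datetime import date

CRAWL_DELAY_SECONDS = 1.5  # from the text: 1 page every 1.5 seconds
DAILY_PAGE_LIMIT = 400     # from the text: 400 pages per visitor per day


class CrawlBudget:
    """Hypothetical helper that enforces a crawl delay and a daily page cap."""

    def __init__(self, delay=CRAWL_DELAY_SECONDS, limit=DAILY_PAGE_LIMIT,
                 clock=time.monotonic, sleep=time.sleep, today=date.today):
        self.delay = delay
        self.limit = limit
        # clock, sleep, and today are injectable so the class can be tested
        # without real waiting.
        self._clock = clock
        self._sleep = sleep
        self._today = today
        self._counts = defaultdict(int)  # (visitor_id, day) -> requests issued
        self._last_request = None        # assumption: one delay shared by all visitors

    def acquire(self, visitor_id):
        """Wait out the crawl delay and return True, or return False if the
        visitor has used up today's page budget."""
        key = (visitor_id, self._today())
        if self._counts[key] >= self.limit:
            return False
        if self._last_request is not None:
            wait = self.delay - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        self._last_request = self._clock()
        self._counts[key] += 1
        return True
```

A crawl loop would call `acquire(visitor_id)` before each page request and stop as soon as it returns `False`. Only request counts are kept, never page contents, which matches the behavior described above.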
Interested in Web Development?
Try our other tools, like the Site Navigability Analyzer, which lets you see what a spider sees. It can analyze your anchor text diversity and find the length of the shortest path to any page. The Site Validator can summarize the types of HTML errors on your site, as well as provide a page-by-page breakdown.