The TargetSeek crawler is a focused robot that gathers information about product announcements, application notes and white papers, for reference on the sites of leading electronics industry publications.
The TargetSeek crawler obeys the Robots Exclusion Protocol and robots meta tags. It waits a random interval between page fetches, and that interval is never shorter than several seconds.
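As a rough illustration of this behaviour (not TargetSeek's actual code), a polite fetch loop might be sketched as follows in Python. The user-agent token, the 5 and 15 second bounds, and the function names are assumptions made for the example; this page states only that the delay is random and lasts at least several seconds.

```python
import random
import time
from urllib import robotparser

USER_AGENT = "TargetSeek"   # assumed user-agent token, for illustration only
MIN_DELAY = 5.0             # assumed; the page says only "several seconds"
MAX_DELAY = 15.0            # assumed upper bound


def build_rules(robots_txt: str) -> robotparser.RobotFileParser:
    """Parse the text of a robots.txt file into a rule set."""
    rules = robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def next_delay(rng: random.Random) -> float:
    """Pick a random pause that never drops below MIN_DELAY."""
    return rng.uniform(MIN_DELAY, MAX_DELAY)


def crawl(urls, rules, fetch, sleep=time.sleep, rng=None):
    """Fetch each permitted URL, pausing a random interval between fetches."""
    rng = rng or random.Random()
    fetched = []
    for url in urls:
        if not rules.can_fetch(USER_AGENT, url):
            continue  # excluded by robots.txt, so skip it
        if fetched:
            sleep(next_delay(rng))  # wait before every fetch after the first
        fetch(url)
        fetched.append(url)
    return fetched
```

Passing `fetch` and `sleep` in as arguments keeps the sketch easy to try without touching the network: a caller can supply a stub for each and inspect which URLs were requested and how long each pause would have been.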
The TargetSeek crawler was designed specifically to ensure a light footprint on target sites. After an initial lightweight exploration, it will revisit only those pages that provide relevant items, including press releases, application notes and white papers. It will return to exploration mode only if the locations of prime material appear to change, for example after a redesign of a target site.
Although the TargetSeek crawler will do its best to comply with the directives in a site's robots.txt file, it will ignore directives that appear to be invalid under the Robots Exclusion Protocol. To ensure the correctness of your robots.txt file, we recommend using a robots.txt validator, such as the one here or here. These validators are third-party utilities, and responsibility for their use rests with the user and the third-party provider.
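This page does not spell out what counts as a directive "in error". One plausible reading, sketched below in Python under that assumption, is that a line is skipped when it lacks the `field: value` form or uses an unrecognised field name, while the rest of the file is still honoured. The set of known fields and the function name are hypothetical and do not describe TargetSeek's actual rules.

```python
# Field names treated as valid in this sketch (an assumed set).
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def split_directives(robots_txt: str):
    """Return (accepted, ignored): parsed directives and the raw lines skipped."""
    accepted, ignored = [], []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue                          # blank or comment-only line
        field, sep, value = line.partition(":")
        name = field.strip().lower()
        if sep and name in KNOWN_FIELDS:
            accepted.append((name, value.strip()))
        else:
            ignored.append(raw)               # no colon, or unknown field name
    return accepted, ignored
```

Running a robots.txt file through a check of this kind shows a site owner which lines a strict parser might discard, for instance a `Disallow` line with the colon missing or a misspelled field name.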
For more information about the TargetSeek crawler, please contact us by email at targetseek@targetgroups.net.