For a while I was writing about the tools I had been working with and making. Then my blog blew up. Or, more literally, locked up, and I lost the data because it was all on a dev machine that I didn’t care that much about.

I didn’t really stop working on things, but I didn’t write much about them.

Then yesterday I had an idea. It wasn’t an original idea. It was really a question of how I could make something like it that I could use without installing more software.

I came across this tool in a tweet: https://github.com/hakluke/hakcheckurl. Written in Go, it checks URLs; it looks like it spiders and gets status codes for them. Cool, I thought. Go, I thought.

Could I do it in Python, I thought? I played around. I looked around. I really didn’t want to rewrite a crawler. Lazy, I know, but it’s my project and my time.

New Direction

Sites have places they don’t want crawled. They list these places in a robots.txt file in the hope that crawlers will respect it and not look there.

Most of these files and folders are benign: style folders, images taken out of context. But some can help out people looking for vulnerabilities.
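
To make that concrete, here’s a quick Python sketch of pulling a site’s robots.txt and grabbing the Disallow lines. This is just the concept, not talkToRobots itself, and the site name is only an example:

    import urllib.request

    def get_disallows(site):
        # Fetch robots.txt from its usual well-known location and
        # return the paths listed in Disallow entries.
        url = f"https://{site}/robots.txt"
        with urllib.request.urlopen(url, timeout=10) as resp:
            body = resp.read().decode("utf-8", errors="replace")
        return [line.split(":", 1)[1].strip()
                for line in body.splitlines()
                if line.lower().startswith("disallow:")]

    print(get_disallows("www.google.com"))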

So, why not work out a way to take a look at them, either for a single site or in batches of sites?

talkToRobots

Or, as Gabe calls it, Skynet. It’s available at my GitHub repo:

[image: talking to a robot]
https://github.com/m0nkeyplay/talkToRobots

So, what can it do?

Right now it’s pretty simple. Point it at one site, or provide a list of sites, and it will check whether each has a robots.txt file and log that data for review.
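
If you’re curious, the flow looks roughly like this. It’s a sketch of the idea, not the actual script, and the file names (sites.txt, robots_log.txt) are made up for the example:

    import datetime
    import urllib.error
    import urllib.request

    def check_site(site, log):
        # Try to fetch the site's robots.txt and record what we find.
        url = f"https://{site}/robots.txt"
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                log.write(f"{site}: found ({resp.status})\n")
                log.write(resp.read().decode("utf-8", errors="replace") + "\n")
        except (urllib.error.URLError, ValueError) as err:
            log.write(f"{site}: no robots.txt ({err})\n")

    # sites.txt holds one hostname per line (hypothetical input file).
    with open("sites.txt") as f, open("robots_log.txt", "w") as log:
        log.write(f"Run: {datetime.datetime.now()}\n")
        for site in (line.strip() for line in f):
            if site:
                check_site(site, log)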

I’m hoping to soon add the ability to switch between HTTP and HTTPS when one doesn’t show results for a site. The thought of following the Disallow entries to see what’s there has also crept into my mind.
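
That fallback could be as simple as something like this. Again, a sketch of one way to do it, not code that’s in the repo:

    import urllib.error
    import urllib.request

    def fetch_robots(site):
        # Try HTTPS first; if the request fails, fall back to plain HTTP.
        for scheme in ("https", "http"):
            try:
                url = f"{scheme}://{site}/robots.txt"
                with urllib.request.urlopen(url, timeout=10) as resp:
                    return resp.read().decode("utf-8", errors="replace")
            except urllib.error.URLError:
                continue
        return None  # neither scheme answered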

Download it. Give it a spin. Give it a whirl. Please help me improve it.
