author Marvin Borner 2018-09-16 00:59:02 +0200
committer Marvin Borner 2018-09-16 00:59:02 +0200
commit 0d7867ebc1e7733c8fccde934c9bf2c8334b846d
tree 87e00641a1aa0c33da1027cab55693dff40ae200
parent 6a4d5ef83abc6b609309401f5a9b132cfc83db77
Added readme :memo:
-rw-r--r-- README.md 12
1 files changed, 12 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..bd669c6
--- /dev/null
+++ b/README.md
@@ -0,0 +1,12 @@
+# Introduction
+This crawler extracts the important information and all links from a website and writes the links to a queue.
+Once it has finished gathering that information, it takes the first URL from the queue and repeats the process.
+
+# Using the crawler
+1. Create a MySQL database: run `mysql -u username -p`, then execute `CREATE DATABASE database_name;`
+2. Import the `database.sql` file into your database with `mysql -u username -p database_name < database.sql`
+3. Edit `mysql_conf.inc` to match your database's credentials
+4. Run `php crawler.php http://dmoztools.net/` to start crawling from that URL
+5. For future runs, just execute `php crawler.php`; it will automatically start with the first URL in the queue
+6. Finished!
+ \ No newline at end of file
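
The queue-driven loop described in the README's introduction can be illustrated with a short sketch. This is not the project's PHP code: it is a minimal Python approximation that assumes an in-memory queue instead of the MySQL table, a caller-supplied `fetch` function instead of real HTTP requests, and duplicate filtering via a `seen` set, which the README does not mention. The names `LinkCollector`, `crawl`, and `max_pages` are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def crawl(start_url, fetch, max_pages=10):
    """Process URLs in first-in, first-out order.

    `fetch` is a hypothetical callable that returns the HTML of a URL.
    Returns the visited URLs and the URLs still waiting in the queue.
    """
    queue = deque([start_url])
    seen = {start_url}  # assumption: each URL is queued at most once
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()  # take the first URL of the queue
        html = fetch(url)
        visited.append(url)
        parser = LinkCollector(url)
        parser.feed(html)
        for link in parser.links:  # write newly found links to the queue
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited, list(queue)


if __name__ == "__main__":
    # Fake pages stand in for the network so the sketch runs offline.
    pages = {
        "http://a.test/": '<a href="/b">b</a><a href="http://a.test/c">c</a>',
        "http://a.test/b": '<a href="/">home</a><a href="/d">d</a>',
        "http://a.test/c": "",
    }
    print(crawl("http://a.test/", lambda u: pages.get(u, ""), max_pages=3))
```

Persisting the queue in a database, as the README's setup steps suggest the real crawler does, is what would allow a later run without a start URL to resume from the first queued entry; the in-memory queue here is lost when the process exits.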