author | Marvin Borner | 2018-09-16 00:59:02 +0200 |
---|---|---|
committer | Marvin Borner | 2018-09-16 00:59:02 +0200 |
commit | 0d7867ebc1e7733c8fccde934c9bf2c8334b846d (patch) | |
tree | 87e00641a1aa0c33da1027cab55693dff40ae200 | |
parent | 6a4d5ef83abc6b609309401f5a9b132cfc83db77 (diff) |
Added readme :memo:
-rw-r--r-- | README.md | 12 |
1 files changed, 12 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..bd669c6
--- /dev/null
+++ b/README.md
@@ -0,0 +1,12 @@
+# Introduction
+This crawler gets all important information and all links of a website and writes the links to a queue.
+After it has finished the information gathering, it will go on by using the first url of the queue and it will start again.
+
+# Using the crawler
+1. Create a mysql database: `mysql -u username -p` and `CREATE DATABASE database_name;`
+2. Import the `database.sql` file into your database with `mysql -u username -p database_name < database.sql`
+3. Edit `mysql_conf.inc` according to your databases credentials
+4. Run `php crawler.php http://dmoztools.net/`
+5. For future runs, just execute `php crawler.php` and it will automatically start with the first url of the queue
+6. Finished!
+
\ No newline at end of file
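
Reviewer note: the setup steps added in this README could be collected into a small helper script. The following is only a sketch under assumptions, not part of the commit. The variable names `DB_USER`, `DB_NAME`, `SEED_URL` and `DRY_RUN` and all function names are hypothetical. Only the `mysql` and `php crawler.php` invocations are taken from the README. Step 3 (editing `mysql_conf.inc`) remains manual.

```shell
#!/bin/sh
# Sketch of README steps 1, 2, 4 and 5.
# DB_USER, DB_NAME, SEED_URL and DRY_RUN are hypothetical names for this example.
set -eu

DB_USER="${DB_USER:-username}"
DB_NAME="${DB_NAME:-database_name}"
SEED_URL="${SEED_URL:-http://dmoztools.net/}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Step 1: create the database (mysql prompts for the password).
create_database() {
    run mysql -u "$DB_USER" -p -e "CREATE DATABASE $DB_NAME;"
}

# Step 2: import database.sql, using the same redirect as the README.
import_schema() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "mysql -u $DB_USER -p $DB_NAME < database.sql"
    else
        mysql -u "$DB_USER" -p "$DB_NAME" < database.sql
    fi
}

# Step 4: first run, seeding the queue with a start URL.
first_run() {
    run php crawler.php "$SEED_URL"
}

# Step 5: later runs continue with the first URL of the queue.
next_run() {
    run php crawler.php
}
```

With the default `DRY_RUN=1`, calling `create_database; import_schema; first_run` only prints the commands so they can be checked first. Setting `DRY_RUN=0` would execute them against a real MySQL server.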