===== Scraper Documentation =====
 

A web scraper for retrieving data that is not available through an API.

==== Table of Contents ====
	- [[wiki:software:beuthbot:webscraper|Scraper Documentation]]
	- [[wiki:software:beuthbot:webscraper#table_of_contents|Table of Contents]]
	- [[wiki:software:beuthbot:webscraper#getting_started|Getting Started]]
		- [[wiki:software:beuthbot:webscraper#prerequisites|Prerequisites]]
		- [[wiki:software:beuthbot:webscraper#installing|Installing]]
	- [[wiki:software:beuthbot:webscraper#overview|Overview]]
	- [[wiki:software:beuthbot:webscraper#structure|Structure]]
	- [[wiki:software:beuthbot:webscraper#functionalities|Functionalities]]
		- [[wiki:software:beuthbot:webscraper#study_rooms|Study Rooms]]
	- [[wiki:software:beuthbot:webscraper#further_development|Further Development]]
	- [[wiki:software:beuthbot:webscraper#built_with|Built With]]
	- [[wiki:software:beuthbot:webscraper#versioning|Versioning]]
	- [[wiki:software:beuthbot:webscraper#authors|Authors]]

==== Getting Started ====

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

=== Prerequisites ===

You will need a current version of [[https://nodejs.org/en/|Node.js and npm]].


=== Installing ===

After cloning the repository, install the dependencies. You can then run the project.


<code>
# install dependencies
npm install

# serve at localhost:8000
npm start

</code>

==== Overview ====

The scraper is a //Node.js// backend built on //Express//. Incoming requests are checked and passed to the handler for the requested resource.

==== Structure ====

The scraper is split into two files. ''index.js'' contains the core logic. At the moment there is only one resource, but we expect to add more. Each resource is represented by a //route//. When a user requests the list of study rooms at our university, the script picks up the request at the corresponding route and prepares a //JSON// response.
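
The route handling described above could look roughly like the following sketch. It is not the actual ''index.js'': the route path ''/studyrooms'', the function names and the error response are assumptions made for illustration. The Express app is passed in as a parameter so that the handler can be tried out without starting a server.

<code javascript>
// Sketch only: path and names are assumptions, not taken from index.js.
const STUDY_ROOMS_ROUTE = '/studyrooms'; // hypothetical route path

// Registers the study room route on an Express app.
// `scrapeStudyRooms` is assumed to return a Promise of an array of rooms.
function registerStudyRoomsRoute(app, scrapeStudyRooms) {
  app.get(STUDY_ROOMS_ROUTE, async (req, res) => {
    try {
      const rooms = await scrapeStudyRooms();
      res.json({ rooms }); // send the scraped list as JSON
    } catch (err) {
      // The source website may be unreachable or its markup may have changed.
      res.status(502).json({ error: 'Could not scrape study rooms' });
    }
  });
}

// Wiring it up (requires Express to be installed):
//   const express = require('express');
//   const app = express();
//   registerStudyRoomsRoute(app, scrapeStudyRooms);
//   app.listen(8000); // same port as `npm start`
</code>

Answering with an error status instead of an empty list is one possible design choice. It lets the caller distinguish "no rooms available" from "scraping failed".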

The second script, ''scrape.js'', takes care of the actual web scraping. It requests the given URL with //axios// and parses the returned HTML with //cheerio//.
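
A minimal sketch of this fetch-and-parse step is shown below. It is not the actual ''scrape.js'': the function names and the use of a CSS selector parameter are assumptions. The two libraries are passed in as arguments so that the sketch stays self-contained; it only relies on ''axios.get'', ''cheerio.load'' and the cheerio collection methods ''map'', ''get'' and ''text''.

<code javascript>
// Sketch only: names and the selector parameter are assumptions,
// not taken from scrape.js.

// Builds a scrape function from the two libraries, e.g.
//   const scrape = createScraper(require('axios'), require('cheerio'));
function createScraper(axios, cheerio) {
  // Fetches `url` and returns the trimmed text of every element
  // that matches the CSS `selector`.
  return async function scrape(url, selector) {
    const response = await axios.get(url);  // response body is in `data`
    const $ = cheerio.load(response.data);  // parse the HTML document
    return $(selector)
      .map((index, element) => $(element).text().trim())
      .get()                                // cheerio collection -> plain array
      .filter((text) => text.length > 0);   // drop empty entries
  };
}
</code>

The selector that matches the room entries depends on the markup of the scraped page and has to be looked up there, for example ''scrape(url, 'li')'' for a simple list.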

==== Functionalities ====

=== Study Rooms ===

When the resource is requested, we scrape the [[https://asta.studis-bht.de/service/lernraeume/|ASTA website]] and try to return a list of the rooms at our university that are available to students.

==== Further Development ====

To add a resource, add a new route in ''index.js'' and a matching function in ''scrape.js'' that scrapes the requested data from the target website.
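
One possible way to keep this manageable as the number of resources grows is a small table that maps each route to its scrape function. The following is only a sketch of that idea; ''registerResources'', the paths and the scrape function names are hypothetical and do not exist in the repository.

<code javascript>
// Sketch only: all names and paths here are hypothetical.
// `resources` maps a route path to a function returning a Promise of data.
function registerResources(app, resources) {
  for (const [path, scrapeFn] of Object.entries(resources)) {
    app.get(path, async (req, res) => {
      try {
        res.json(await scrapeFn()); // respond with the scraped data
      } catch (err) {
        res.status(502).json({ error: `Could not scrape ${path}` });
      }
    });
  }
}

// Adding a resource then means adding one entry:
//   registerResources(app, {
//     '/studyrooms': scrapeStudyRooms, // existing resource (path assumed)
//     '/canteen': scrapeCanteenMenu,   // hypothetical new resource
//   });
</code>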

==== Built With ====

- [[https://nodejs.org/en/|Node.js]]\\
- [[https://expressjs.com/|Express.js]]\\
- [[https://github.com/axios/axios|Axios]]\\
- [[https://github.com/cheeriojs/cheerio|Cheerio]]\\


==== Versioning ====

We use [[http://semver.org/|SemVer]] for versioning. For the versions available, see the [[https://github.com/beuthbot/scraper/tags|tags on this repository]].

==== Authors ====

- **Tobias Klatt** - //Initial work// - [[https://github.com/T0biWan/|GitHub]]

See also the list of [[https://github.com/beuthbot/scraper/graphs/contributors|contributors]] who participated in this project.
<WRAP pagebreak></WRAP>