We only scrape publicly available pages (pages that do not require authentication). Avoid hammering the data sources with requests, and consider using the data that has already been scraped.
Tool to scrape data and distill it.
Run `git clone git@github.com:Acrylic125/fntu.git`, then `cd scraper` and `pnpm i`. Once cloned, run the following commands to start scraping:
pnpm run start courses
pnpm run start locations
The code can be found here.

You may modify any of these steps to fit your needs.
Scraping can lead to inconsistent results because we do not own the pages we scrape from. The steps are broken into individual scripts so that it is easy to tell which step failed.
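As a rough illustration of why separate steps help with diagnosing failures, here is a minimal sketch of a sequential step runner. It is not the code in this repository: the `Step` type, `StepResult` type, and `runSteps` function are hypothetical names chosen for this example.

```typescript
// Hypothetical step shape: a name plus an async function that does the work.
type Step = { name: string; run: () => Promise<void> };

// Either every step succeeded, or we know exactly which one failed.
type StepResult =
  | { ok: true }
  | { ok: false; failedStep: string; error: unknown };

// Runs the steps in order and stops at the first failure,
// so later steps never run on top of incomplete data.
async function runSteps(steps: Step[]): Promise<StepResult> {
  for (const step of steps) {
    try {
      await step.run();
    } catch (error) {
      return { ok: false, failedStep: step.name, error };
    }
  }
  return { ok: true };
}
```

With this shape, a caller can log `failedStep` and rerun only from that step instead of repeating the whole scrape.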
Data sources are the pages we want to scrape from.
| Use | Data Sources |
|---|---|
| Courses | |
| Locations | Sourcing main locations. We have to link the names used in undergraduate programs to the names used in MapIndoors. Thus, we add alternate names (`altNames`) to each location. We source |
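
The name linking described for locations can be sketched as a lookup keyed by every known name. This is only an illustration under assumptions: apart from `altNames`, the field and function names (`LocationRecord`, `name`, `normalize`, `buildIndex`, `findLocation`) are hypothetical, and the sample values are invented rather than taken from the scraped data.

```typescript
// Hypothetical record shape; only `altNames` is named in the text above.
interface LocationRecord {
  name: string;       // assumed: the primary name of the location
  altNames: string[]; // other names the same location goes by elsewhere
}

// Normalise names so case and stray whitespace do not block a match.
function normalize(s: string): string {
  return s.trim().toLowerCase().replace(/\s+/g, " ");
}

// Build a lookup from every known name (primary or alternate) to its record.
function buildIndex(locations: LocationRecord[]): Map<string, LocationRecord> {
  const index = new Map<string, LocationRecord>();
  for (const loc of locations) {
    for (const n of [loc.name, ...loc.altNames]) {
      index.set(normalize(n), loc);
    }
  }
  return index;
}

// Resolve a name from another source to a location, if one is known.
function findLocation(
  index: Map<string, LocationRecord>,
  rawName: string
): LocationRecord | undefined {
  return index.get(normalize(rawName));
}

// Invented sample data, for illustration only.
const sampleIndex = buildIndex([{ name: "Lecture Theatre 1", altNames: ["LT1"] }]);
const sampleMatch = findLocation(sampleIndex, "lt1");
```

Names with no match come back as `undefined`, which makes it straightforward to list the locations that still need an alternate name added.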