LoikaAR/info-retrieval

info-retrieval

Information Retrieval project

Websites in use:

  • ticinotopten.ch
    • good: many categories, local specialties
    • bad: sparse metadata for each activity
  • myswitzerland.ch
    • good: many categories, rich metadata
    • bad: detailed info per activity is only available on the individual activity pages
  • zermatt.ch
    • good: many categories, rich metadata, local specialties
    • bad: poorly structured pages

Elements scraped:

  • Activity name
  • Activity type:
    • Hike
    • Cycling
    • Adventure
    • Other (water, snow, parks, etc.)
  • Region
  • Distance (km)
  • Duration (h)

If possible, also scrape:

  • Ascent
  • Description
  • Accessibility info
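
The fields listed above could be collected into one record per activity. The following is a minimal sketch, not the project's actual code: the names `Activity`, `ActivityType`, `parse_distance_km` and `parse_duration_h` are hypothetical, the ascent unit (metres) is an assumption, and the text formats handled by the parsers are guesses at what the sites might show.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActivityType(Enum):
    HIKE = "Hike"
    CYCLING = "Cycling"
    ADVENTURE = "Adventure"
    OTHER = "Other"  # water, snow, parks, etc.


@dataclass
class Activity:
    # Core fields from the "Elements scraped" list.
    name: str
    activity_type: ActivityType
    region: str
    distance_km: Optional[float] = None  # None when a site does not provide it
    duration_h: Optional[float] = None
    # Fields scraped only if possible.
    ascent_m: Optional[float] = None  # assumption: ascent is given in metres
    description: Optional[str] = None
    accessibility: Optional[str] = None


def parse_distance_km(text: str) -> Optional[float]:
    """Parse strings such as '12.5 km' or '12,5 km' into kilometres."""
    match = re.search(r"(\d+(?:[.,]\d+)?)\s*km", text, re.IGNORECASE)
    if match is None:
        return None
    return float(match.group(1).replace(",", "."))


def parse_duration_h(text: str) -> Optional[float]:
    """Parse strings such as '3 h 30 min', '3:30 h' or '45 min' into hours."""
    text = text.strip().lower()
    # Clock-style durations, e.g. '3:30'.
    clock = re.search(r"(\d+):(\d{2})", text)
    if clock:
        return int(clock.group(1)) + int(clock.group(2)) / 60
    hours = re.search(r"(\d+(?:[.,]\d+)?)\s*h", text)
    minutes = re.search(r"(\d+)\s*min", text)
    if hours is None and minutes is None:
        return None
    total = 0.0
    if hours:
        total += float(hours.group(1).replace(",", "."))
    if minutes:
        total += int(minutes.group(1)) / 60
    return total
```

Normalising distance and duration to plain numbers at scrape time keeps records from the three sites comparable, so they can later be filtered or ranked on the same fields.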

Files:

  • hiking_ti
    • crawls the hiking pages of ticinotopten.ch
  • activities_ti
    • crawls the other activity categories of ticinotopten.ch
  • hiking_ch1
    • crawls all activities of zermatt.ch
  • hiking_ch2
    • crawls the hiking section of myswitzerland.ch

TODO:

  • User evaluation