
Build and use a model for matching text headers to HXL hashtags and attributes.

HXLStandard/hxltags


HXL hashtag lookup

Compile and use a model for matching text headers to HXL hashtags.

For more information about HXL, see https://hxlstandard.org

Requirements

  • Python 3
  • libhxl

Use the command

$ pip3 install -r requirements.txt

to install the requirements.

Usage

This package uses the output of https://github.com/HXLStandard/hdx-hashtag-crawler as its input data.

The distribution includes a snapshot in inputs/, but you can run the crawler to create a fresher one of your own.

Command-line usage

$ python3 -m hxltags.compiler inputs/20200720-hxl-tags-atts.csv > my-model.json
$ python3 -m hxltags.lookup my-model.json

Python usage

import hxltags.compiler, hxltags.lookup

model = hxltags.compiler.build_model("inputs/20200720-hxl-tags-atts.csv")

results = hxltags.lookup.lookup_header("Number of people affected", model)
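
To tag every column of a spreadsheet in one pass, the lookup call above can be wrapped in a small loop. The following is only a sketch: suggest_tags and its lookup parameter are hypothetical names, not part of hxltags, and the lookup function is passed in as an argument so that the snippet runs even without the package installed. It assumes that each result list is ordered best-first, as described under Result format.

```python
def suggest_tags(headers, lookup):
    """Map each header to its top-ranked hashtag, or None when nothing matched.

    `lookup` is any callable that takes a header string and returns a list
    of (hashtag, score) tuples with the best match first.
    """
    suggestions = {}
    for header in headers:
        results = lookup(header)
        suggestions[header] = results[0][0] if results else None
    return suggestions


# With hxltags installed, the callable could wrap the documented call:
#   lookup = lambda header: hxltags.lookup.lookup_header(header, model)

# Stand-in lookup (hypothetical) that returns canned values borrowed from
# the README example, used here only so the sketch is runnable.
def fake_lookup(header):
    canned = {
        "Number of people affected": [("#affected", 57), ("#inneed+children", 22)],
    }
    return canned.get(header, [])


print(suggest_tags(["Number of people affected", "Notes"], fake_lookup))
```

Headers with no match map to None, so a caller can see which columns still need a hashtag assigned by hand.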

Result format

The results are a list of tuples, each consisting of a hashtag and a number. The higher the number, the more certain the match. The most certain matches appear first. Example:

[
    ("#affected", 57),
    ("#affected+children+f", 43),
    ("#inneed+children", 22),
]
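
The README does not say what scale the scores use, so any cutoff for accepting a match is up to the caller. A minimal sketch of picking one hashtag from a result list, assuming the format shown above; best_match and its min_score parameter are hypothetical names for illustration, not part of hxltags.

```python
def best_match(results, min_score=0):
    """Return the highest-scoring (hashtag, score) tuple at or above
    min_score, or None if no result qualifies."""
    candidates = [r for r in results if r[1] >= min_score]
    # max() with default=None handles an empty candidate list.
    return max(candidates, key=lambda r: r[1], default=None)


results = [
    ("#affected", 57),
    ("#affected+children+f", 43),
    ("#inneed+children", 22),
]

print(best_match(results))                # ('#affected', 57)
print(best_match(results, min_score=60))  # None
```

Using max() rather than taking the first element means the helper does not depend on the list being sorted.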

License

This code is in the Public Domain. See UNLICENSE.md for details.

Author

David Megginson, Centre for Humanitarian Data, UNOCHA
