HERD

HERD (Hajen Entity Recognizer and Disambiguator) is a tool for automatically recognizing names in text (entity recognition) and specifying who is meant (disambiguation).

It is written in Java and depends on Solr Text Tagger by David Smiley. It also draws a lot of inspiration from the Tulip project by Marek Lipczak et al.

The code will not run as is. It contains hard-coded paths to directories on my machine, and Wikipedia needs to be processed a couple of times to generate the files in those directories. It may still be interesting to read, though.