This repository contains large corpus files for testing gosaca.
The files in the large_corpus directory are from Manzini's Large Corpus.
The files in the gauntlet_corpus directory are from Maniscalco's Gauntlet Corpus. The corpus was originally hosted at http://www.michael-maniscalco.com/testset/gauntlet/, but the last time I checked that site I could not find the files. They are available from a mirror at compressionratings.com.