Harvester

OAI

A list of registered OAI data providers is available online. Entries from that list can be used to add OAI harvesters, for example:

Name       URL
citeseerx  http://citeseerx.ist.psu.edu/oai2
arxiv      http://export.arxiv.org/oai2
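To illustrate what a harvester does with one of these base URLs, here is a minimal sketch of an OAI-PMH ListRecords request and of reading record identifiers from a response. It is not the code used by scholarly_citation_finder; the function names build_list_records_url and extract_identifiers are hypothetical, and the oai_dc metadata prefix is an assumption (it is the format every OAI-PMH repository is required to support).

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    When a resumption token is given, it is sent as the only argument
    besides the verb, as the protocol requires.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def extract_identifiers(xml_text):
    """Return the record identifiers found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]
```

For the arxiv entry above, build_list_records_url('http://export.arxiv.org/oai2') yields the URL of the first page of records. A real harvester would fetch that URL, pass the response body to a parser such as extract_identifiers, and repeat the request with the returned resumption token until none is left.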

MAG

Download and extract the compressed file:

$ cd downloads/harvester/mag
$ curl -O https://academicgraph.blob.core.windows.net/graph-2015-11-06/MicrosoftAcademicGraph.zip
$ 7z x MicrosoftAcademicGraph.zip

Run the normalizer, then the harvester, from a Python shell in the terminal:

>>> from scholarly_citation_finder.tools.harvester.mag.MagNormalize import MagNormalize
>>> normalizer = MagNormalize('downloads/mag/')
>>> normalizer.run()
>>>
>>> from scholarly_citation_finder.tools.harvester.mag.MagHarvester import MagHarvester
>>> harvester = MagHarvester()
>>> harvester.run()