Micah P. Dombrowski / May 13 2019
NLTK Stanford Environment
1. Download NLTK Add-ons
Download various Java-based Stanford tools that NLTK can use. Cell is locked to prevent spurious re-downloading.
cd /results
wget --progress=dot:giga \
  https://nlp.stanford.edu/software/stanford-ner-2018-10-16.zip \
  https://nlp.stanford.edu/software/stanford-postagger-full-2018-10-16.zip \
  https://nlp.stanford.edu/software/stanford-parser-full-2018-10-17.zip \
  https://nlp.stanford.edu/software/stanford-{arabic,chinese,english,french,german,spanish}-corenlp-2018-10-05-models.jar
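The last URL in the download command relies on bash brace expansion, which turns one pattern into six model jar URLs. As a minimal sketch, the loop below spells out the same six file names in portable shell and reports any that are absent from a directory. The helper names `model_jars` and `missing_jars` are hypothetical and not part of the original notebook.

```shell
# Print the six model jar names that the brace pattern expands to,
# one per line, without relying on bash-only brace expansion.
model_jars() {
  for lang in arabic chinese english french german spanish; do
    echo "stanford-${lang}-corenlp-2018-10-05-models.jar"
  done
}

# Print every expected jar that is NOT present in the given directory.
# (Hypothetical helper; prints nothing when all six jars are there.)
missing_jars() {
  dir="$1"
  for jar in $(model_jars); do
    [ -f "$dir/$jar" ] || echo "$jar"
  done
}

# Usage: missing_jars /results
```

Running `missing_jars /results` after the download should print nothing if all six jars arrived.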
2. Install NLTK + Tools
apt-get -qq update
DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends \
  default-jdk
apt-get clean
rm -r /var/lib/apt/lists/* # Clear package list so it isn't stale
conda install nltk
conda clean -qtipy
pip install PyPDF2
mkdir /usr/local/nltk
cd /usr/local/nltk
# Archive paths below assume the files are still where step 1 downloaded them (/results)
unzip -q /results/stanford-ner-2018-10-16.zip
ln -s stanford-ner* ner
unzip -q /results/stanford-postagger-full-2018-10-16.zip
ln -s stanford-postagger* postagger
unzip -q /results/stanford-parser-full-2018-10-17.zip
ln -s stanford-parser* parser
cp /results/stanford-arabic-corenlp-2018-10-05-models.jar \
   /results/stanford-chinese-corenlp-2018-10-05-models.jar \
   /results/stanford-english-corenlp-2018-10-05-models.jar \
   /results/stanford-french-corenlp-2018-10-05-models.jar \
   /results/stanford-german-corenlp-2018-10-05-models.jar \
   /results/stanford-spanish-corenlp-2018-10-05-models.jar \
   parser/
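The install step gives each versioned Stanford directory a short alias (ner, postagger, parser) via a symlink. A quick sanity check, sketched below, confirms that each alias resolves to a directory. The function name `check_aliases` is hypothetical, and the install prefix is passed as an argument rather than assumed.

```shell
# Report whether each short alias under the given prefix resolves to a
# directory. `[ -d ]` follows symlinks, so a dangling link counts as missing.
# Returns 0 when all three aliases resolve, 1 otherwise.
check_aliases() {
  prefix="$1"
  status=0
  for name in ner postagger parser; do
    if [ -d "$prefix/$name" ]; then
      echo "ok: $name"
    else
      echo "missing: $name"
      status=1
    fi
  done
  return "$status"
}

# Usage: check_aliases /usr/local/nltk
```

A "missing" line usually means the corresponding archive did not unpack, so the glob in the `ln -s` call had nothing to match.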