Carin Meier / Feb 15 2020
Parens for Python - Sci SpaCy
NLP for scientific text
We are going to explore some more Python libraries through libpython-clj.
This time, we are going to look at scispaCy (Sci SpaCy).
deps.edn:
{:deps {org.clojure/clojure {:mvn/version "1.10.1"}
        clj-python/libpython-clj {:mvn/version "1.36"}}}
Install the Python dependencies and model
pip3 install spacy scispacy
pip3 install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.4/en_core_sci_sm-0.2.4.tar.gz
We are going to be following the tutorial from https://allenai.github.io/scispacy/
Load up the model and analyze
The first thing we need to do is set up the namespace and load the model:
(ns gigasquid.sci-spacy
  (:require [libpython-clj.require :refer [require-python]]
            [libpython-clj.python :as py :refer [py. py.. py.-]]))

;; Import the Python modules and load the scispaCy model installed above
(require-python '[spacy :as spacy])
(require-python '[scispacy :as scispacy])
(def nlp (spacy/load "en_core_sci_sm"))
;; => #'gigasquid.sci-spacy/nlp
Now, we are ready to analyze some text:
(def text "Myeloid derived suppressor cells (MDSC) are immature
myeloid cells with immunosuppressive activity.
They accumulate in tumor-bearing mice and humans
with different types of cancer, including hepatocellular
carcinoma (HCC).")

(def doc (nlp text))
;; => #'gigasquid.sci-spacy/doc
Let's find all the entities.
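;; py.- reads a Python attribute. `text` and `ents` are attribute names here,
;; so `text` does not refer to the `text` var defined above.
(map (fn [ent] (py.- ent text))
     (py.- doc ents))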
List(12) ("Myeloid", "suppressor cells", "MDSC", "immature", "myeloid cells", "immunosuppressive activity", "accumulate", "tumor-bearing mice", "humans", "cancer", "hepatocellular
carcinoma", "HCC")
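Once the entity texts are on the Clojure side they are ordinary strings, so the usual sequence functions apply. The following is a minimal sketch, not part of the original tutorial: the entity vector is copied by hand from the output above (including the line break inside one entity, which comes from the input text), and `normalize-ws` and `abbreviation?` are hypothetical helper names introduced only for illustration.

```clojure
(require '[clojure.string :as str])

;; Entity strings copied from the output above. Note the line break
;; inside "hepatocellular\ncarcinoma", inherited from the input text.
(def entities
  ["Myeloid" "suppressor cells" "MDSC" "immature" "myeloid cells"
   "immunosuppressive activity" "accumulate" "tumor-bearing mice"
   "humans" "cancer" "hepatocellular\ncarcinoma" "HCC"])

;; Hypothetical helper: collapse any run of whitespace into one space.
(defn normalize-ws [s]
  (str/trim (str/replace s #"\s+" " ")))

;; Hypothetical helper: treat an all-uppercase string of two or more
;; letters as an abbreviation. This is a rough heuristic, not an NLP rule.
(defn abbreviation? [s]
  (boolean (re-matches #"[A-Z]{2,}" s)))

(def clean-entities (mapv normalize-ws entities))

(def abbreviations (filterv abbreviation? clean-entities))
;; abbreviations => ["MDSC" "HCC"]
```

In a live session the hand-copied vector could presumably be replaced by the result of the `map` call above, wrapped in `vec`.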
The same works for the sentences.
(map (fn [sent] (py.- sent text))
     (py.- doc sents))
List(2) ("Myeloid derived suppressor cells (MDSC) are immature
myeloid cells with immunosuppressive activity.
", "They accumulate in tumor-bearing mice and humans
with different types of cancer, including hepatocellular
carcinoma (HCC).")
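We can even graph things! Here we render the dependency parse of the first sentence as an SVG:
(require-python '[spacy.displacy :as displacy])

;; Render the first sentence with the "dep" style and write the markup to disk
(spit "results/my-pic.svg"
      (displacy/render (first (py.- doc sents)) :style "dep"))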
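One practical detail: `spit` does not create missing directories, so writing to "results/my-pic.svg" throws a FileNotFoundException if the `results` directory does not exist yet. A small sketch of one way around that, using `clojure.java.io/make-parents`; the name `save-svg!` is hypothetical, and the SVG string below is only a placeholder standing in for the markup that `displacy/render` returns.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical helper: create any missing parent directories,
;; write the markup, and return the path that was written.
(defn save-svg!
  [path svg]
  (io/make-parents path)
  (spit path svg)
  path)

;; Placeholder markup, used here so the example runs without Python.
(def placeholder-svg
  "<svg xmlns=\"http://www.w3.org/2000/svg\"></svg>")

(save-svg! "results/my-pic.svg" placeholder-svg)
```

With the model loaded, the second argument would be the rendered markup, for example `(save-svg! "results/my-pic.svg" (displacy/render (first (py.- doc sents)) :style "dep"))`.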
Want more examples? Check them out here: https://github.com/gigasquid/libpython-clj-examples