Carin Meier / Feb 15 2020
Parens for Python - Sci SpaCy
NLP for scientific text
We are going to explore some more Python libraries through libpython-clj.
This time, we are going to look at Sci SpaCy (scispaCy), a set of spaCy models for scientific and biomedical text.
;; deps.edn
{:deps
 {org.clojure/clojure {:mvn/version "1.10.1"}
  clj-python/libpython-clj {:mvn/version "1.36"}}}
Install the Python dependencies and the model:
pip3 install spacy scispacy
pip3 install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.4/en_core_sci_sm-0.2.4.tar.gz
We will follow the scispaCy tutorial at https://allenai.github.io/scispacy/
Load up the model and analyze
The first thing we need to do is set up the namespace and load the model:
(ns gigasquid.sci-spacy
  (:require [libpython-clj.require :refer [require-python]]
            [libpython-clj.python :as py :refer [py. py.. py.-]]))
(require-python [spacy :as spacy])
(require-python [scispacy :as scispacy])
(def nlp (spacy/load "en_core_sci_sm"))
gigasquid.sci-spacy/nlp
Now, we are ready to analyze some text:
(def text "Myeloid derived suppressor cells (MDSC) are immature
myeloid cells with immunosuppressive activity.
They accumulate in tumor-bearing mice and humans
with different types of cancer, including hepatocellular
carcinoma (HCC).")
(def doc (nlp text))
gigasquid.sci-spacy/doc
Let's find all the entities.
(map (fn [ent] (py.- ent text)) (py.- doc ents))
List(12) ("Myeloid", "suppressor cells", "MDSC", "immature", "myeloid cells", "immunosuppressive activity", "accumulate", "tumor-bearing mice", "humans", "cancer", "hepatocellular
carcinoma", "HCC")
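Each entity is a spaCy span, so it carries more than its text. Here is a minimal sketch that turns every entity into a plain Clojure map. The helper names `entity->map` and `entities` are made up for this example, and it assumes `doc` is the document created above. The label you get back depends on the model you loaded.

```clojure
(require '[libpython-clj.python :refer [py.-]])

(defn entity->map
  "Hypothetical helper: turn a spaCy entity span into a Clojure map."
  [ent]
  {:text  (py.- ent text)         ; the entity's surface text
   :label (py.- ent label_)       ; the label assigned by the model
   :start (py.- ent start_char)   ; character offset where the entity starts
   :end   (py.- ent end_char)})   ; character offset just past the entity

(defn entities
  "Hypothetical helper: all entities of a spaCy doc as a vector of maps."
  [doc]
  (vec (map entity->map (py.- doc ents))))

;; Usage, assuming `doc` from above:
;; (entities doc)
```

Having the character offsets around makes it easy to line an entity back up with the original string.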
We can do the same with the sentences.
(map (fn [sent] (py.- sent text)) (py.- doc sents))
List(2) ("Myeloid derived suppressor cells (MDSC) are immature
myeloid cells with immunosuppressive activity.
", "They accumulate in tumor-bearing mice and humans
with different types of cancer, including hepatocellular
carcinoma (HCC).")
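A sentence is itself a span of tokens, so we can look inside it. This is only a sketch: `sentence-tokens` is a made-up helper name, and it assumes that the bridged span can be walked with `map` like any other sequence and that the loaded model includes a tagger and a parser, so `tag_` and `dep_` are filled in.

```clojure
(require '[libpython-clj.python :refer [py.-]])

(defn sentence-tokens
  "Hypothetical helper: return [text tag dep] for each token in a
  spaCy sentence span."
  [sent]
  (vec (map (fn [token]
              [(py.- token text)    ; the token's text
               (py.- token tag_)    ; fine-grained part-of-speech tag
               (py.- token dep_)])  ; dependency relation to the token's head
            sent)))

;; Usage, assuming `doc` from above:
;; (sentence-tokens (first (py.- doc sents)))
```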
We can even graph things!
(require-python [spacy.displacy :as displacy])
(spit "results/my-pic.svg" (displacy/render (first (py.- doc sents)) :style "dep"))
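One thing to watch out for: `spit` will not create the `results` directory for you, so the call above throws if it does not exist yet. A small sketch of a wrapper that creates the parent directories first. The name `spit-svg!` is made up for this example.

```clojure
(require '[clojure.java.io :as io])

(defn spit-svg!
  "Hypothetical helper: write `svg` (a string of SVG markup) to `path`,
  creating any missing parent directories first. Returns the path."
  [path svg]
  (io/make-parents path)   ; creates e.g. results/ when it is missing
  (spit path svg)
  path)

;; Usage, assuming `doc` and `displacy` from above:
;; (spit-svg! "results/my-pic.svg"
;;            (displacy/render (first (py.- doc sents)) :style "dep"))
```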
Want more examples? Check them out here: https://github.com/gigasquid/libpython-clj-examples