Nick Doiron / Mar 04 2020
Remix of Python by Nextjournal
Testing for Manifold
import sys; sys.version.split()[0]
0.2s
Python
'3.7.5'
pip install xgboost
pip install --upgrade pandas
11.9s
Bash in Python
import pandas as pd
listings = pd.read_csv('listings.csv')
listings.head()
0.4s
Python
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2818 | Quiet Garden View Room & Super Fast WiFi | 3159 | Daniel | | Oostelijk Havengebied - Indische Buurt | 52.36575 | 4.94142 | Private room | 59 | 3 | 277 | 2019-11-21 | 2.09 | 1 | 37 |
| 1 | 20168 | Studio with private bathroom in the centre 1 | 59484 | Alexander | | Centrum-Oost | 52.36509 | 4.89354 | Private room | 100 | 1 | 321 | 2020-02-07 | 2.65 | 2 | 134 |
| 2 | 25428 | Lovely apt in City Centre (w.lift) near Jordaan | 56142 | Joan | | Centrum-West | 52.37297 | 4.88339 | Entire home/apt | 125 | 14 | 5 | 2020-02-09 | 0.2 | 2 | 129 |
| 3 | 27886 | Romantic, stylish B&B houseboat in canal district | 97647 | Flip | | Centrum-West | 52.387609999999995 | 4.8918800000000005 | Private room | 155 | 2 | 213 | 2020-02-10 | 2.16 | 1 | 163 |
| 4 | 28871 | Comfortable double room | 124245 | Edwin | | Centrum-West | 52.36719 | 4.8909199999999995 | Private room | 75 | 2 | 323 | 2020-02-10 | 2.8 | 3 | 114 |
y = listings['price']
print(y[0:10])
0.4s
Python
X = listings.drop(columns=['id', 'host_id', 'name', 'host_name', 'last_review', 'price', 'neighbourhood_group', 'neighbourhood', 'availability_365'])
#X['neighbourhood'] = X['neighbourhood'].astype('category')
#X['room_type'] = X['room_type'].astype('category')
# Encode room_type as an ordinal code: 0 = Shared room, 1 = Hotel room, 2 = Private room, 3 = Entire home/apt
X['room_type'] = X['room_type'].replace({'Shared room': 0, 'Hotel room': 1, 'Private room': 2, 'Entire home/apt': 3})
X['room_type'] = X['room_type'].astype('int64')
print(X.head())
0.4s
Python
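The mapping above turns room_type into a single 0 to 3 code, so a linear model treats the room types as evenly spaced steps on one scale. One alternative, shown here only as a minimal sketch and not as what this notebook does, is one-hot encoding with pd.get_dummies. The small frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical mini-frame standing in for the listings data; values are made up.
rooms = pd.DataFrame({
    "room_type": ["Private room", "Entire home/apt", "Shared room", "Hotel room"],
    "minimum_nights": [3, 1, 2, 1],
})

# One 0/1 indicator column per room type instead of a single ordinal code.
encoded = pd.get_dummies(rooms, columns=["room_type"], dtype=int)
print(encoded.columns.tolist())
```

With this encoding each room type gets its own coefficient in a linear model, at the cost of three extra columns.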
print(X.isnull().sum())
0.4s
Python
X = X.fillna(0)
0.1s
Python
from sklearn.linear_model import BayesianRidge, LinearRegression
from sklearn.ensemble import RandomForestRegressor
0.2s
Python
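The cell above imports BayesianRidge and RandomForestRegressor alongside LinearRegression, although only LinearRegression is fitted in this notebook. A minimal sketch of how the three could be compared on one shared split follows. The data is synthetic (an assumption, since listings.csv is not bundled here), and the names X_demo, y_demo, candidates and scores are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import BayesianRidge, LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 200 rows, 3 features, linear target plus noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([3.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Same 50/50 split as the notebook, with a fixed seed so the run is repeatable.
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.5, random_state=0)

candidates = {
    "LinearRegression": LinearRegression(),
    "BayesianRidge": BayesianRidge(),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fit each model on the training half and score it on the held-out half.
scores = {}
for name, estimator in candidates.items():
    estimator.fit(X_tr, y_tr)
    scores[name] = r2_score(y_te, estimator.predict(X_te))
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

Because the synthetic target is linear by construction, the ranking here says nothing about which model suits the Airbnb prices; the sketch only shows the comparison loop.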
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)
0.2s
Python
model = LinearRegression()
model.fit(X_train, y_train)
0.1s
Python
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
y_predict = model.predict(X_test)
print(y_predict)
0.4s
Python
from sklearn.metrics import explained_variance_score, r2_score
print(explained_variance_score(y_test, y_predict))
print(r2_score(y_test, y_predict))
0.5s
Python
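The two scores printed above often look alike but are not the same quantity: r2_score is 1 - SS_res / SS_tot, while explained_variance_score is 1 - Var(y - y_pred) / Var(y), which ignores a constant offset in the predictions. A small hand-rolled sketch using only the standard library makes the difference visible. The prices are made up and the helper names are hypothetical.

```python
from statistics import mean, pvariance

def r2_by_hand(y_true, y_pred):
    """R^2 = 1 - (sum of squared residuals) / (total sum of squares)."""
    mean_true = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def explained_variance_by_hand(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true), using population variance."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - pvariance(residuals) / pvariance(y_true)

# Made-up prices; every prediction is exactly 10 too high.
prices = [100, 150, 200, 250]
predicted = [110, 160, 210, 260]

r2 = r2_by_hand(prices, predicted)
ev = explained_variance_by_hand(prices, predicted)
print(r2, ev)
```

Here the explained variance is 1.0 because the residuals never vary, while R^2 comes out at about 0.968 because it penalises the constant bias. A gap between the two scores on real data is therefore a hint that the predictions are systematically shifted.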
Downloads
X_test
1.1s
Python
y_test
0.8s
Python
pd.DataFrame(y_predict)
1.0s
Python
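Note that X_test and y_test keep the shuffled row labels from train_test_split, while pd.DataFrame(y_predict) gets a fresh 0..n-1 index. Assuming the three tables are to be handed to another tool as CSV files, one possible sketch is to write them in the same row order without the index. The file names, the output directory and the sample values are all hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up stand-ins for X_test, y_test and y_predict.
features = pd.DataFrame({"room_type": [2, 3], "minimum_nights": [3, 1]}, index=[7, 42])
actual = pd.Series([59, 125], index=[7, 42], name="price")
predicted = [71.5, 140.2]

# Reuse the test index so the predictions line up with the feature rows.
predictions = pd.DataFrame({"predicted_price": predicted}, index=features.index)

out_dir = Path(tempfile.mkdtemp())  # hypothetical output location
features.to_csv(out_dir / "features.csv", index=False)
actual.to_frame().to_csv(out_dir / "actual.csv", index=False)
predictions.to_csv(out_dir / "predictions.csv", index=False)

print(sorted(p.name for p in out_dir.iterdir()))
```

Dropping the index on write means row k in each file refers to the same listing, which is the property a downstream comparison of features, true prices and predictions relies on.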