scikit-learn
It is a popular framework for
- its rich content,
- wide selection of model algorithms,
- unified API coding style
Preprocessing :
train_test_split :
from sklearn.model_selection import train_test_split
# without random_state = 0, the split will be random every time
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state = 0)
Models :
Record the algorithm I’ve used or seen.
Example of unified API :
# import the algorithm you want
# Only this line varies by the model you choose
from sklearn.tree import DecisionTreeRegressor
# From here are all similar
model = DecisionTreeRegressor(random_state=0)
# Train
model.fit(X,y)
# Predict
predictions = model.predict(test_data)
Metrics :
- i.e.
from sklearn.metrics import xxxx
mean_absolute_error :
from sklearn.metrics import mean_absolute_error
pred = model.predict(X)
mean_absolute_error(y, pred)