
scikit-learn


scikit-learn is a popular machine learning library for Python, valued for

  • its rich feature set (data splitting, models, metrics),
  • its wide selection of model algorithms,
  • its unified API across models

Preprocessing :

train_test_split :

from sklearn.model_selection import train_test_split

# Without a fixed random_state (here 0), the split is different on every run
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)
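A minimal sketch of what the split produces, assuming a small synthetic dataset (the arrays `X` and `y` below are made up for illustration and are not from these notes). It shows the default 75/25 split, that a fixed `random_state` reproduces the same split, and how `test_size` changes the proportion.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples, 3 features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5])

# With no test_size given, 25% of the rows go to the validation set
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)
print(train_X.shape, val_X.shape)

# Re-running with the same random_state reproduces the same split
train_X2, val_X2, train_y2, val_y2 = train_test_split(X, y, random_state=0)
same_split = np.array_equal(val_X, val_X2)
print(same_split)

# test_size sets the validation fraction explicitly (here 20%)
train_X3, val_X3, train_y3, val_y3 = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(len(train_X3), len(val_X3))
```

Fixing `random_state` is mainly useful for making experiments comparable between runs; any integer works, 0 is just a convention.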

Models :

A record of the algorithms I’ve used or seen.

Example of unified API :

# Import the algorithm you want.
# Only this import (and the class name below) varies with the model you choose
from sklearn.tree import DecisionTreeRegressor

# From here on, the steps are the same for every model
model = DecisionTreeRegressor(random_state=0)

# Train
model.fit(X, y)

# Predict
predictions = model.predict(test_data)
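To illustrate the unified API, here is a rough sketch that runs the same fit/predict steps over three different regressors. The data comes from `make_regression`, and the choice of models, the `models` dictionary and the `test_data` slice are assumptions made for this example only.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Hypothetical synthetic regression data
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)
test_data = X[:5]  # stand-in for new, unseen rows

# Only the constructor differs between models
models = {
    "tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=20, random_state=0),
    "linear": LinearRegression(),
}

predictions = {}
for name, model in models.items():
    model.fit(X, y)                               # same training call
    predictions[name] = model.predict(test_data)  # same prediction call
    print(name, predictions[name].shape)
```

Because every estimator exposes `fit` and `predict`, swapping algorithms usually means changing one import and one constructor call.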

Metrics :

  • All metrics are imported from sklearn.metrics, e.g. from sklearn.metrics import xxxx

mean_absolute_error :

from sklearn.metrics import mean_absolute_error

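# Note: scoring on the same X, y used for fitting gives an optimistic (in-sample) error;
# use held-out data such as val_X, val_y for a realistic estimate
pred = model.predict(X)
mean_absolute_error(y, pred)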
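A small sketch that ties the pieces together, assuming synthetic data from `make_regression` (not from these notes). It first checks `mean_absolute_error` on four hand-picked numbers, then compares the error on the training rows with the error on held-out validation rows.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Sanity check by hand: (0.5 + 0.5 + 0 + 1) / 4 = 0.5
toy_mae = mean_absolute_error([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])
print(toy_mae)

# Hypothetical noisy regression data
X, y = make_regression(n_samples=300, n_features=5, noise=15.0, random_state=0)
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)

# Fit on the training rows only
model = DecisionTreeRegressor(random_state=0)
model.fit(train_X, train_y)

# In-sample error: an unpruned tree can memorize the training rows
train_mae = mean_absolute_error(train_y, model.predict(train_X))

# Held-out error: a more honest estimate of performance on new data
val_mae = mean_absolute_error(val_y, model.predict(val_X))
print(train_mae, val_mae)
```

The gap between the two numbers is the reason for splitting the data before fitting: the training error alone says little about how the model behaves on unseen rows.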
