Example 2: Regression with CatBoostRegressor Using Snapshot Mechanism

In this example, we train a CatBoostRegressor on the Boston Housing dataset, save snapshots during training, and generate predictions on a held-out test set.

Step-by-Step Process

1. Install CatBoost:

pip install catboost

2. Load the Dataset and Prepare the Data:

Python

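from catboost import CatBoostRegressor, Pool
# Note: load_boston was removed in scikit-learn 1.2, so this import needs an older scikit-learn release
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

# Load the Boston Housing dataset (features and target as NumPy arrays)
data = load_boston()
X = data.data
y = data.target

# Hold out 20% of the rows as a test set; the fixed seed keeps the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)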

3. Initialize and Train the CatBoost Regressor:

Python

# Convert to CatBoost Pool
train_pool = Pool(X_train, y_train)
test_pool = Pool(X_test, y_test)

# Initialize CatBoost Regressor with snapshot parameters
model = CatBoostRegressor(
    iterations=1000,
    learning_rate=0.03,
    depth=6,
    loss_function='RMSE',
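    save_snapshot=True,                     # write training progress to a snapshot file
    snapshot_file='catboost_snapshot_reg',  # name of the snapshot file
    snapshot_interval=60                    # seconds between snapshot saves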
)

# Train the model
model.fit(train_pool, eval_set=test_pool, verbose=100)

# Save the model
model.save_model('catboost_model_reg')

# Output Predictions
predictions = model.predict(test_pool)
print(predictions)

Output:

0:    learn: 23.6140405    test: 23.5975405    best: 23.5975405 (0)    total: 50ms    remaining: 49.9s
100:    learn: 4.3912311    test: 5.4355656    best: 5.4355656 (100)    total: 1.24s    remaining: 10.9s
...
900:    learn: 2.3542361    test: 4.2355656    best: 4.2355656 (900)    total: 10.6s    remaining: 1.17s
999:    learn: 2.1342000    test: 4.0354546    best: 4.0353536 (900)    total: 11.7s    remaining: 0us

During training, CatBoost periodically writes the training state to the snapshot file; snapshot_interval=60 sets the interval between saves to 60 seconds.
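To confirm that a snapshot is actually on disk before relying on it, a quick check like the following can help. It is a minimal sketch, not part of the CatBoost API: the helper name find_snapshot is hypothetical, and it assumes CatBoost stores the snapshot under the training directory (train_dir, which defaults to catboost_info). The current working directory is checked as a fallback in case that assumption does not hold for your version.

```python
from pathlib import Path


def find_snapshot(snapshot_file, train_dir="catboost_info"):
    """Return the path of an existing snapshot file, or None if none is found.

    Assumption: the snapshot lives under train_dir; the current working
    directory is checked as a fallback.
    """
    candidates = [Path(train_dir) / snapshot_file, Path(snapshot_file)]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


snapshot = find_snapshot("catboost_snapshot_reg")
if snapshot is None:
    print("No snapshot found; training would start from iteration 0.")
else:
    print(f"Snapshot found at {snapshot} ({snapshot.stat().st_size} bytes).")
```

Running this between an interrupted run and the rerun shows whether there is anything to resume from.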

4. Resume Training from a Snapshot:

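If training is interrupted, rerun it in the same folder with the same parameters, including save_snapshot=True and the same snapshot_file. CatBoost finds the existing snapshot and continues from the last saved iteration instead of starting over: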

Python

# Re-initialize CatBoost Regressor with snapshot parameters
model = CatBoostRegressor(
    iterations=1000,
    learning_rate=0.03,
    depth=6,
    loss_function='RMSE',
    save_snapshot=True,
    snapshot_file='catboost_snapshot_reg',
    snapshot_interval=60
)

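# Resume training: with save_snapshot=True and the same snapshot_file, fit() continues from the existing snapshot.
# No init_model argument is needed; init_model expects a trained model, not a snapshot file.
model.fit(train_pool, eval_set=test_pool, verbose=100)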

# Output Predictions
predictions = model.predict(test_pool)
print(predictions)

Output:

[22.415 23.123 19.768 34.235 27.673 ...]
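Since the model is trained with loss_function='RMSE', it can be useful to compute the same metric on the predictions yourself. The following is a small sketch in plain Python; the function name rmse and the toy values are illustrative and are not taken from the dataset above.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy values for illustration only (not taken from the dataset above)
example_true = [3.0, 5.0, 7.0]
example_pred = [2.0, 5.0, 9.0]
print(rmse(example_true, example_pred))
```

In the example above, rmse(y_test, predictions) should come out close to the best test value reported in the training log, because the same test data was passed as eval_set.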

These examples illustrate how to set up and use CatBoost’s training, recovering, and snapshot parameters effectively. By following these steps, you can ensure that your training process is robust and can be resumed seamlessly in case of interruptions.
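Because resuming depends on the rerun using the same parameters as the interrupted run, one option is to define them once and build the model from that single definition. This is only a sketch of that idea: the names SNAPSHOT_PARAMS and build_regressor are hypothetical, and the values are copied from the example above.

```python
# Shared settings for the initial run and any resumed run (values from the example above)
SNAPSHOT_PARAMS = {
    "iterations": 1000,
    "learning_rate": 0.03,
    "depth": 6,
    "loss_function": "RMSE",
    "save_snapshot": True,
    "snapshot_file": "catboost_snapshot_reg",
    "snapshot_interval": 60,  # seconds between snapshot saves
}


def build_regressor(params=None):
    """Create a CatBoostRegressor from the shared parameter dictionary."""
    # Imported here so the parameter dictionary can be inspected without CatBoost installed
    from catboost import CatBoostRegressor

    return CatBoostRegressor(**dict(SNAPSHOT_PARAMS if params is None else params))
```

Both the first run and a rerun after an interruption would then call build_regressor().fit(train_pool, eval_set=test_pool, verbose=100), so the two runs cannot drift apart.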

CatBoost Training, Recovering and Snapshot Parameters

CatBoost (short for "categorical boosting") is an open-source machine learning library that implements gradient boosting on decision trees. It is known for its efficiency, accuracy, and built-in handling of categorical features, and it supports classification, regression, and ranking tasks. This guide covers the key concepts of CatBoost training, recovery from interruptions, and the snapshot parameters that keep training workflows running smoothly.

Table of Contents

Training with CatBoost
Recovering Training Progress in CatBoost
Example 1: Training a CatBoostClassifier with Snapshot Saving and Resuming
Example 2: Regression with CatBoostRegressor Using Snapshot Mechanism
Monitoring and Evaluation
