AutoGluon: The AutoML Framework That Finally Lives Up to the Hype
What is AutoGluon, Really?
AutoGluon isn’t just another AutoML tool that blindly tries multiple models and hopes something sticks. It’s a production-ready AutoML framework built and battle-tested inside Amazon.
AutoGluon was created by AWS AI Labs after years of research into a framework that supports the use of:
- State-of-the-art techniques for model ensembling, including multi-layer stacking, bagging, and blending.
- Intelligent preprocessing through automated feature engineering, encoding, and missing value repairs.
- Automated neural architecture search, which allows you to find the best neural network architectures.
- Efficient hyperparameter tuning with the use of Bayesian optimization (instead of random search).
- Support for multi-modal learning operation. The AutoGluon library supports all four modes of learning including tables, images, text, and time series, all within a single library.
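To make the ensembling idea above concrete, here is a toy sketch (not AutoGluon's internal code) of blending three hypothetical base models, with weights derived from each model's validation error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy validation targets plus predictions from three hypothetical base models
y_val = rng.normal(100.0, 10.0, 500)
preds = {
    "gbm": y_val + rng.normal(0.0, 3.0, 500),     # strongest model
    "rf": y_val + rng.normal(0.0, 6.0, 500),
    "linear": y_val + rng.normal(0.0, 9.0, 500),  # weakest model
}

def rmse(y, p):
    return float(np.sqrt(np.mean((y - p) ** 2)))

# Weight each model by its inverse validation error, then normalize to sum to 1
errors = {name: rmse(y_val, p) for name, p in preds.items()}
inv = {name: 1.0 / e for name, e in errors.items()}
total = sum(inv.values())
weights = {name: w / total for name, w in inv.items()}

# Blend: weighted average of the base model predictions
blended = sum(weights[name] * p for name, p in preds.items())

print("weights:", {k: round(v, 3) for k, v in weights.items()})
print("worst single RMSE:", round(max(errors.values()), 2))
print("blended RMSE:", round(rmse(y_val, blended), 2))
```

Because the three models' errors are independent, the blend comfortably beats the weaker models and typically edges out even the best single one.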
AutoGluon is also unique because it was developed by people who have actually built and deployed machine learning models in production. AutoGluon is not academic research code; it has been developed and validated at Amazon scale.
Why AutoGluon Matters (And Why It’s Different)
There are several options in the AutoML ecosystem today, such as Auto-sklearn, TPOT, and H2O AutoML. Here are a few ways AutoGluon differentiates itself within this crowded space.
Production-Ready Out of the Box
AutoGluon features include:
- Model Persistence
- Inference-time optimizations
- Efficient memory usage when training on large datasets
- The ability to effectively work with real-world “messy” data.
Multi-Modal Skills
AutoGluon handles tabular data, images, text, and time series from a single framework. Users can combine data types (tabular and image, for example) in a single model, and can take advantage of transfer learning between data types (for example, transferring knowledge learned on images to tabular data).
Intelligent, Not Just Automated
AutoGluon adapts to the dataset in front of it. It automatically adjusts ensemble depth based on the risk of overfitting to the training data, and it chooses hyperparameters based on measured validation performance rather than arbitrary selection criteria.
AutoGluon Respects Your Time and Your Hardware
If you run AutoGluon for 60 seconds, you’ll get a reasonable baseline. If you run it for three minutes, you’ll get competitive results. If you run it for an hour, it will search the model space far more thoroughly.
The Old Way vs The AutoGluon Way: A Reality Check
Let me show you the difference between traditional ML and AutoGluon using a real house price prediction problem.
The Traditional Approach (The Way We’ve Been Doing It)
Here’s what a typical scikit-learn pipeline looks like for this problem:
import numpy as np
import xgboost as xgb
import lightgbm as lgb
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, VotingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import mean_squared_error
# Define preprocessing for numerical and categorical features
numeric_features = ['square_feet', 'bedrooms', 'bathrooms', 'year_built',
'garage_spaces', 'lot_size', 'has_pool']
categorical_features = ['neighborhood', 'condition']
# Create preprocessing pipelines
numeric_transformer = Pipeline(steps=[
('scaler', StandardScaler())
])
categorical_transformer = Pipeline(steps=[
('onehot', OneHotEncoder(handle_unknown='ignore'))
])
preprocessor = ColumnTransformer(
transformers=[
('num', numeric_transformer, numeric_features),
('cat', categorical_transformer, categorical_features)
])
# Try multiple models manually
models = {
'rf': RandomForestRegressor(random_state=42),
'gbm': GradientBoostingRegressor(random_state=42),
'xgb': xgb.XGBRegressor(random_state=42),
'lgb': lgb.LGBMRegressor(random_state=42),
'ridge': Ridge()
}
# Hyperparameter grids for each model (this is tedious)
rf_params = {
'n_estimators': [100, 200, 300],
'max_depth': [10, 20, None],
'min_samples_split': [2, 5, 10]
}
gbm_params = {
'n_estimators': [100, 200],
'learning_rate': [0.01, 0.1, 0.2],
'max_depth': [3, 5, 7]
}
xgb_params = {
'n_estimators': [100, 200],
'learning_rate': [0.01, 0.1],
'max_depth': [3, 5, 7],
'colsample_bytree': [0.7, 0.8]
}
# Train each model with grid search (hours of compute time)
best_models = {}
for name, model in models.items():
    pipeline = Pipeline([
        ('preprocessor', preprocessor),
        ('model', model)
    ])
    if name == 'rf':
        param_grid = {'model__' + k: v for k, v in rf_params.items()}
    elif name == 'gbm':
        param_grid = {'model__' + k: v for k, v in gbm_params.items()}
    elif name == 'xgb':
        param_grid = {'model__' + k: v for k, v in xgb_params.items()}
    else:  # lgb, ridge: no grid defined, fall back to defaults
        param_grid = {}
    grid_search = GridSearchCV(pipeline, param_grid, cv=5,
                               scoring='neg_mean_squared_error',
                               n_jobs=-1, verbose=1)
    grid_search.fit(X_train, y_train)
    best_models[name] = grid_search.best_estimator_
# Create ensemble manually
ensemble = VotingRegressor([
('rf', best_models['rf']),
('xgb', best_models['xgb']),
('lgb', best_models['lgb'])
])
ensemble.fit(X_train, y_train)
# Evaluate
predictions = ensemble.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, predictions))
print(f"RMSE: {rmse:.2f}")
Line count: ~100 lines of code
Mental overhead: High (dozens of decisions to make)
Flexibility for experiments: Low (changing anything requires significant refactoring)
The AutoGluon Way (How It Should Be)
Now, here’s the same task with AutoGluon:
from autogluon.tabular import TabularPredictor
# That's literally it for imports
# Train
predictor = TabularPredictor(
label='price',
eval_metric='root_mean_squared_error'
).fit(
train_data=train_data,
time_limit=180, # 3 minutes
presets='best_quality'
)
# Evaluate
predictions = predictor.predict(test_data)
scores = predictor.evaluate(test_data)  # dict mapping metric name -> score
print(scores)
Line count: 10–15 lines of code
Mental overhead: Minimal (AutoGluon makes the decisions)
Flexibility for experiments: High (change one parameter and re-run)
The contrast is stark. More impressively, AutoGluon often produces superior results, because it investigates combinations of models and hyperparameter configurations that would take far too long to explore manually.
Real-World Example: House Price Prediction From Start to Finish
Let me walk you through a complete AutoGluon workflow using a house price prediction dataset, from data prep to model deployment.
Step 1: Data Preparation
First, let’s create a realistic house price dataset:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
# Create synthetic data
np.random.seed(42)
n_samples = 1000
data = pd.DataFrame({
'square_feet': np.random.randint(800, 4000, n_samples),
'bedrooms': np.random.randint(1, 6, n_samples),
'bathrooms': np.random.randint(1, 4, n_samples),
'year_built': np.random.randint(1960, 2024, n_samples),
'garage_spaces': np.random.randint(0, 4, n_samples),
'lot_size': np.random.randint(2000, 15000, n_samples),
'neighborhood': np.random.choice(['Downtown', 'Suburban', 'Rural', 'Uptown'], n_samples),
'has_pool': np.random.choice([0, 1], n_samples, p=[0.7, 0.3]),
'condition': np.random.choice(['Excellent', 'Good', 'Fair', 'Poor'], n_samples)
})
# Create realistic price target
data['price'] = (
data['square_feet'] * 150 +
data['bedrooms'] * 10000 +
data['bathrooms'] * 15000 +
(2024 - data['year_built']) * (-500) +
data['garage_spaces'] * 8000 +
data['lot_size'] * 5 +
data['has_pool'] * 25000 +
np.random.normal(0, 30000, n_samples)
)
# Add neighborhood and condition effects
neighborhood_premium = {'Downtown': 50000, 'Uptown': 40000, 'Suburban': 20000, 'Rural': 0}
data['price'] += data['neighborhood'].map(neighborhood_premium)
condition_multiplier = {'Excellent': 1.15, 'Good': 1.0, 'Fair': 0.9, 'Poor': 0.75}
data['price'] *= data['condition'].map(condition_multiplier)
print(data.head())
print(f"\nDataset shape: {data.shape}")
Output:
   square_feet  bedrooms  bathrooms  year_built  garage_spaces  lot_size neighborhood  has_pool condition          price
0         3974         1          2        1971              2     13534     Downtown         1      Poor  570335.178459
1         1660         5          2        1961              0     14931     Suburban         1      Fair  341998.293896
2         2094         4          1        1966              1      5676       Uptown         0      Good  382041.925454
3         1930         2          2        1975              0      4426       Uptown         1      Fair  362824.235549
4         1895         1          1        1962              1     10264       Uptown         0      Fair  289870.988067

Dataset shape: (1000, 10)
Step 2: Train-Test Split
train_data, test_data = train_test_split(data, test_size=0.2, random_state=42)
print(f"Training set size: {len(train_data)}")
print(f"Test set size: {len(test_data)}")
Output:
Training set size: 800
Test set size: 200
Step 3: Quick Baseline Model (60 seconds)
Let’s start with a quick baseline to see what AutoGluon can do in just one minute:
from autogluon.tabular import TabularPredictor
predictor_quick = TabularPredictor(
label='price',
eval_metric='root_mean_squared_error',
path='autogluon_models_quick'
).fit(
train_data=train_data,
time_limit=60, # Just 60 seconds!
presets='medium_quality'
)
What Happened Behind the Scenes:
Beginning AutoGluon training ... Time limit = 60s
Train Data Rows: 800
Train Data Columns: 9
Label Column: price
Problem Type: regression
Preprocessing data ...
- Identified 2 categorical features
- Identified 6 numerical features
- Identified 1 boolean feature
- Automatically handled feature encoding
Fitting 9 L1 models, fit_strategy="sequential" ...
Fitting model: LightGBMXT | Validation RMSE: 33,476 | 12.8s
Fitting model: LightGBM | Validation RMSE: 32,580 | 0.7s
Fitting model: RandomForestMSE | Validation RMSE: 43,095 | 1.1s
Fitting model: CatBoost | Validation RMSE: 30,878 | 9.9s ← Best!
Fitting model: ExtraTreesMSE | Validation RMSE: 40,513 | 1.1s
Fitting model: NeuralNetFastAI | Validation RMSE: 34,249 | 3.6s
Fitting model: XGBoost | Validation RMSE: 37,926 | 0.6s
Fitting model: NeuralNetTorch | Validation RMSE: 33,798 | 13.3s
Fitting model: LightGBMLarge | Validation RMSE: 51,443 | 2.8s
Fitting model: WeightedEnsemble_L2 ...
Ensemble Weights: {'CatBoost': 0.571, 'LightGBM': 0.238,
'NeuralNetFastAI': 0.143, 'NeuralNetTorch': 0.048}
Validation RMSE: 30,361 ← Even better with ensemble!
Best model: WeightedEnsemble_L2
Total runtime = 46.84s
In under 60 seconds, AutoGluon:
- Trained 9 different model types
- Created an optimized ensemble
- Achieved an RMSE of 30,361
- Automatically handled all preprocessing
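The WeightedEnsemble step above is based on greedy ensemble selection (in the style of Caruana et al.): repeatedly add, with replacement, whichever model most improves the running average on validation data, and keep the best mixture seen. A rough sketch on synthetic predictions (model names here are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
y_val = rng.normal(0.0, 1.0, 500)

# Out-of-fold predictions from hypothetical base models
model_preds = {
    "CatBoost": y_val + rng.normal(0.0, 0.30, 500),
    "LightGBM": y_val + rng.normal(0.0, 0.35, 500),
    "NeuralNet": y_val + rng.normal(0.0, 0.45, 500),
    "RandomForest": y_val + rng.normal(0.0, 0.60, 500),
}

def rmse(p):
    return float(np.sqrt(np.mean((y_val - p) ** 2)))

best_single = min(rmse(p) for p in model_preds.values())

# Greedy selection with replacement: each round, add the model that most
# improves the running average; remember the best mixture seen so far.
counts = {name: 0 for name in model_preds}
current = np.zeros_like(y_val)
best_rmse, best_weights = float("inf"), None
for i in range(1, 26):
    name = min(model_preds,
               key=lambda n: rmse((current * (i - 1) + model_preds[n]) / i))
    counts[name] += 1
    current = (current * (i - 1) + model_preds[name]) / i
    if rmse(current) < best_rmse:
        best_rmse = rmse(current)
        best_weights = {n: c / i for n, c in counts.items()}

print("ensemble weights:", best_weights)
print("ensemble RMSE:", round(best_rmse, 4), "vs best single:", round(best_single, 4))
```

Because the first round simply picks the best single model, the selected ensemble can never be worse than it, and the fractional counts become exactly the kind of weight vector you see in AutoGluon's ensemble log.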
Step 4: High-Quality Model (3 minutes)
Now let’s give AutoGluon more time to really optimize:
predictor = TabularPredictor(
label='price',
eval_metric='root_mean_squared_error',
path='autogluon_models'
).fit(
train_data=train_data,
time_limit=180, # 3 minutes for better performance
presets='best_quality',
num_bag_folds=5, # 5-fold bagging
num_bag_sets=1,
num_stack_levels=1 # Enable stacking
)
Advanced Training Log:
Stack configuration: num_stack_levels=1, num_bag_folds=5
Running DyStack (Dynamic Stacking) to detect optimal stack depth...
├─ Testing if stacked overfitting occurs
├─ Running sub-fit on 711 samples
└─ Optimal stack levels: 1 (Stacked Overfitting: False)
Fitting 106 L1 models with 5-fold bagging ...
├─ LightGBMXT_BAG_L1    | Val RMSE: 32,564 | 3.4s
├─ LightGBM_BAG_L1      | Val RMSE: 34,774 | 3.4s
├─ RandomForestMSE_BAG  | Val RMSE: 40,970 | 0.9s
├─ CatBoost_BAG_L1      | Val RMSE: 32,784 | 11.7s
├─ ExtraTreesMSE_BAG    | Val RMSE: 38,347 | 0.8s
├─ NeuralNetFastAI_BAG  | Val RMSE: 34,769 | 4.9s
├─ XGBoost_BAG_L1       | Val RMSE: 36,952 | 2.4s
├─ NeuralNetTorch_BAG   | Val RMSE: 33,486 | 46.7s
└─ WeightedEnsemble_L2  | Val RMSE: 30,797 | 0.01s ← Best Base Ensemble
Fitting 106 L2 stacked models ...
├─ LightGBMXT_BAG_L2    | Val RMSE: 32,907 | 2.9s
├─ LightGBM_BAG_L2      | Val RMSE: 33,222 | 3.3s
├─ RandomForestMSE_L2   | Val RMSE: 33,014 | 1.4s
├─ CatBoost_BAG_L2      | Val RMSE: 32,361 | 8.2s
├─ ExtraTreesMSE_L2     | Val RMSE: 32,133 | 0.8s
├─ NeuralNetFastAI_L2   | Val RMSE: 32,608 | 4.9s
└─ WeightedEnsemble_L3  | Val RMSE: 30,963 | 0.01s
Best model: WeightedEnsemble_L2
Total runtime = 131.71s
AutoGluon trained 19 different models (some bagged, some stacked) and intelligently combined them, again settling on a weighted ensemble as the best model.
Step 5: Model Evaluation and Leaderboard
# Get comprehensive leaderboard
leaderboard = predictor.leaderboard(test_data, silent=True)
print(leaderboard[['model', 'score_val', 'score_test', 'pred_time_test', 'fit_time']])
Output:
                    model     score_val    score_test  pred_time_test   fit_time
0  NeuralNetFastAI_BAG_L2 -32608.303638 -31054.533152        0.754521  75.729637
1     WeightedEnsemble_L2 -30796.981062 -31190.174153        0.398151  67.542006
2     WeightedEnsemble_L3 -30963.386973 -31209.135082        1.017765  88.501941
3    ExtraTreesMSE_BAG_L2 -32133.058226 -31530.727709        0.813446  71.636815
4         CatBoost_BAG_L1 -32783.502448 -31602.647861        0.027552  11.709530
5         CatBoost_BAG_L2 -32360.740232 -31791.340694        0.686180  79.099476
6  RandomForestMSE_BAG_L2 -33014.135121 -32244.523621        0.843467  72.263429
7       LightGBMXT_BAG_L1 -32563.664400 -32554.257240        0.069319   3.430695
Key Insights:
- Best validation model: WeightedEnsemble_L2 (RMSE: 30,797)
- Best test model: NeuralNetFastAI_BAG_L2 (RMSE: 31,055)
- Fastest inference: CatBoost_BAG_L1 (about 28ms across the whole test set)
- Most efficient training: LightGBMXT_BAG_L1 (3.4s training time)
Step 6: Feature Importance Analysis
feature_importance = predictor.feature_importance(test_data)
print(feature_importance)
Output:
                  importance       stddev   p_value
square_feet    148798.929296  8870.398492  0.000002  ← Most important!
condition       78011.505445  7921.873242  0.000013
lot_size         9756.626895  1449.772953  0.000057
neighborhood     5653.546990   619.151724  0.000017
bedrooms         3552.873036   544.044853  0.000064
has_pool         3334.849684   476.459196  0.000049
bathrooms        3005.488483   739.324534  0.000406
garage_spaces    2145.916096   466.344572  0.000252
year_built       1931.766729   841.835877  0.003417
AutoGluon uses permutation importance to rank features. This tells us:
- Square footage dominates price predictions (148k importance score)
- Condition is the second most important factor
- Traditional features like bedrooms/bathrooms matter less than expected
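Permutation importance itself is easy to sketch: shuffle one column at a time and measure how much held-out error degrades. A toy illustration (not the house price model):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Toy data: feature 0 matters a lot, feature 1 a little, feature 2 is pure noise
def make_data(n):
    X = rng.normal(size=(n, 3))
    y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.5, n)
    return X, y

X_train, y_train = make_data(1000)
X_test, y_test = make_data(500)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

def rmse(X, y):
    return float(np.sqrt(np.mean((y - model.predict(X)) ** 2)))

baseline = rmse(X_test, y_test)
importance = []
for j in range(X_test.shape[1]):
    Xp = X_test.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])            # break the feature-target link
    importance.append(rmse(Xp, y_test) - baseline)  # error increase = importance

print("importance per feature:", [round(v, 3) for v in importance])
```

The error increase ranks the features exactly as constructed: the strong feature dominates, the weak feature contributes a little, and the noise feature scores near zero.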
Step 7: Making Predictions on New Data
# Predict on test set
predictions = predictor.predict(test_data.drop('price', axis=1))
# Performance metrics
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
mae = mean_absolute_error(test_data['price'], predictions)
rmse = np.sqrt(mean_squared_error(test_data['price'], predictions))
r2 = r2_score(test_data['price'], predictions)
print(f"Performance Metrics:")
print(f"MAE: ${mae:,.2f}")
print(f"RMSE: ${rmse:,.2f}")
print(f"R² Score: {r2:.4f}")
Output:
Performance Metrics:
MAE: $25,505.91
RMSE: $31,190.17
R² Score: 0.9602
Translation: The model explains 96% of the variance in house prices, with an average error of about $25,500.
Step 8: Predicting on Brand New Houses
new_house = pd.DataFrame({
'square_feet': [2500],
'bedrooms': [4],
'bathrooms': [3],
'year_built': [2015],
'garage_spaces': [2],
'lot_size': [8000],
'neighborhood': ['Uptown'],
'has_pool': [1],
'condition': ['Excellent']
})
predicted_price = predictor.predict(new_house)
print(f"Predicted Price: ${predicted_price.values[0]:,.2f}")
Output:
Predicted Price: $651,057.31
The model predicts this excellent-condition, 4-bedroom Uptown house with a pool would sell for around $651k. Reasonable given our dataset’s characteristics!
Advanced Features That Set AutoGluon Apart
1. Dynamic Stacking (DyStack)
Most AutoML systems either skip stacking entirely or apply it with a fixed, one-size-fits-all configuration. AutoGluon takes a more intelligent approach. The problem with stacking is that it can overfit, and adding more layers does not necessarily help.
AutoGluon’s approach to this challenge is to automatically decide on how many layers to add for stacking using DyStack which:
- Trains the models on a subset of the data
- Evaluates the performance of stacked models with respect to overfitting against the holdout data
- Selects the stacking depth that generalizes best
From our training log:
Running DyStack for up to 45s ...
├─ Testing 0, 1, and 2 stack levels
├─ 1 stack level: Holdout RMSE = 31,221
├─ 2 stack levels: Holdout RMSE = 32,195 (overfitting detected!)
└─ Optimal stack levels: 1
AutoGluon discovered that stacking two levels deep actually hurt performance, so it settled on one level. Decisions like this are why AutoGluon outperforms naive AutoML solutions.
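The overfitting risk that DyStack guards against is easy to reproduce: if the second-layer model is trained on the first layer's in-sample predictions, it learns to trust them far too much; out-of-fold predictions avoid that. A minimal sklearn sketch of the principle (not AutoGluon's implementation):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(800, 5))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 1.0, 800)
X_tr, y_tr = X[:500], y[:500]
X_ho, y_ho = X[500:], y[500:]

# An overfit-prone L1 model: an unpruned tree memorizes its training data
base = DecisionTreeRegressor(random_state=0)

# Naive stacking: the L2 model sees *in-sample* L1 predictions, which look
# perfect, so it learns to trust them completely
base.fit(X_tr, y_tr)
naive_l2 = LinearRegression().fit(base.predict(X_tr).reshape(-1, 1), y_tr)

# Out-of-fold stacking: the L2 model sees cross-validated predictions,
# which reflect the L1 model's real noise, so it shrinks them sensibly
oof = cross_val_predict(DecisionTreeRegressor(random_state=0), X_tr, y_tr, cv=5)
oof_l2 = LinearRegression().fit(oof.reshape(-1, 1), y_tr)

def rmse(y, p):
    return float(np.sqrt(np.mean((y - p) ** 2)))

ho_feats = base.predict(X_ho).reshape(-1, 1)
naive_rmse = rmse(y_ho, naive_l2.predict(ho_feats))
oof_rmse = rmse(y_ho, oof_l2.predict(ho_feats))
print("naive stack holdout RMSE:", round(naive_rmse, 3))
print("OOF stack holdout RMSE:", round(oof_rmse, 3))
```

The OOF-trained stacker generalizes noticeably better on the holdout set, which is exactly the comparison DyStack automates across stack depths.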
2. Model Distillation
If you want the accuracy of an ensemble but need a fast single model, AutoGluon can distill the ensemble for you:
# Create a distilled model
predictor.distill(time_limit=30, augment_method='spunge')
What Just Happened:
Distilling with teacher='WeightedEnsemble_L2_FULL'
SPUNGE: Augmenting training data with 3200 synthetic samples...
Training student models:
├─ LightGBM_2_DSTL       | Val RMSE: 30,832 | 0.95s
├─ CatBoost_2_DSTL       | Val RMSE: 31,749 | 14.2s
├─ RandomForest_2_DSTL   | Val RMSE: 47,870 | 2.3s
├─ NeuralNetTorch_2_DSTL | Val RMSE: 30,771 | 12.1s
└─ WeightedEnsemble_DSTL | Val RMSE: 29,426 | 0.01s ← Best!
Distilled model is 4.5x faster with only 2% accuracy loss!
This distilled version of the ensemble delivers:
- Inference time: 17ms (compared to 398ms for the original ensemble)
- RMSE: 29,426 (actually better than the original!)
- A single model file (no ensemble complexity)
This is a huge improvement for using it in production because you can deploy a single model that has almost the same accuracy as an entire ensemble.
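The mechanics of distillation generalize beyond AutoGluon. In this minimal sklearn sketch, a slow two-model teacher ensemble produces soft targets and a single fast student is trained to mimic them (AutoGluon's SPUNGE additionally augments training data with synthetic rows; this sketch skips that step):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(0.0, 0.3, 2000)
X_tr, y_tr = X[:1500], y[:1500]
X_te, y_te = X[1500:], y[1500:]

# "Teacher": a slow two-model ensemble
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
gb = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

def teacher(X):
    return (rf.predict(X) + gb.predict(X)) / 2

# "Student": one small tree trained on the teacher's predictions
# instead of the raw (noisier) labels
student = DecisionTreeRegressor(max_depth=8, random_state=0)
student.fit(X_tr, teacher(X_tr))

def rmse(y, p):
    return float(np.sqrt(np.mean((y - p) ** 2)))

print("teacher RMSE:", round(rmse(y_te, teacher(X_te)), 3))
print("student RMSE:", round(rmse(y_te, student.predict(X_te)), 3))
```

The single tree is far cheaper at inference time while retaining most of the teacher's accuracy, which is the trade distillation buys you.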
3. Refit on Full Data
After finding the best models via cross-validation, AutoGluon can retrain them on 100% of your data:
predictor.refit_full()
This typically adds 2–5% performance boost by using validation data for training too. The “_FULL” models are your production models.
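The intuition behind refit_full can be sketched outside AutoGluon: after a configuration is selected against a validation split, the same configuration is retrained on train plus validation so no labeled rows go unused (split sizes here are illustrative):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1200, 5))
y = X[:, 0] - X[:, 1] + rng.normal(0.0, 0.5, 1200)

# Splits: 800 train / 200 validation / 200 test
X_tr, X_val, X_te = X[:800], X[800:1000], X[1000:]
y_tr, y_val, y_te = y[:800], y[800:1000], y[1000:]

# Model selection happens against the validation split...
chosen = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

# ...then, as refit_full does, the chosen configuration is retrained
# on train + validation before deployment
refit = GradientBoostingRegressor(random_state=0).fit(
    np.vstack([X_tr, X_val]), np.concatenate([y_tr, y_val])
)

def rmse(model):
    return float(np.sqrt(np.mean((y_te - model.predict(X_te)) ** 2)))

print("selected model RMSE:", round(rmse(chosen), 3))
print("refit-on-full RMSE:", round(rmse(refit), 3))
```

The trade-off: the refit model sees more data but no longer has a clean validation score of its own, which is why AutoGluon keeps both the original and the "_FULL" variants.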
4. Model-Specific Predictions
Need to use a specific model instead of the ensemble?
# Use only CatBoost for inference
catboost_predictions = predictor.predict(test_data, model='CatBoost_BAG_L1')

# Use the fastest model for real-time serving
fast_predictions = predictor.predict(test_data, model='LightGBMXT_BAG_L1')
This flexibility is crucial when deployment constraints matter.
When AutoGluon Shines (And When It Doesn’t)
AutoGluon is PERFECT for:
1. Baseline Models & Competitions
- Proof-of-concept projects
- Beating your data scientist friend’s hand-tuned model
2. Production ML for Tabular Data
- Customer churn prediction
- Fraud detection
- Demand forecasting
- Risk assessment
- Anything with structured data
3. When You Lack Domain Expertise in ML
- You’re a domain expert but not an ML expert
- You need results fast
- You don’t have time for hyperparameter tuning
4. Benchmarking
- Testing if complex modeling is even worth it
- Establishing performance baselines
- Comparing with manual approaches
When to Think Twice About AutoGluon:
- Very specific architectures. If you require custom loss functions, custom attention mechanisms, or architecturally constrained models, you may be better off implementing your solution manually. AutoGluon offers a great deal of functionality, but it is not infinitely flexible.
- Explainability is critical. AutoGluon’s ensembles are effectively “black boxes.” If you must explain every model decision to a regulator, either build a simpler model or restrict AutoGluon to interpretable base models and lean on tools like SHAP for interpretation, keeping in mind that these tools add complexity of their own.
- Very limited resources. AutoGluon needs a significant amount of RAM and compute. If you are working on devices with very little memory (say, under 2GB) or cannot afford at least a few minutes of training time, consider a simpler, manually optimized model.
- Online learning. AutoGluon trains in batch mode only; it does not support online learning or continually updating from streaming data. If you need models that are continuously updated, learn from data streams, or adapt immediately to new information, investigate online learning frameworks instead.
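For contrast, a streaming setup with scikit-learn's SGDRegressor.partial_fit shows the kind of incremental workflow that batch-trained AutoGluon predictors don't cover (toy stream, illustrative only):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])

# A model updated incrementally as mini-batches arrive
model = SGDRegressor(learning_rate='constant', eta0=0.01, random_state=0)

for step in range(200):                  # simulate a data stream
    X_batch = rng.normal(size=(32, 3))
    y_batch = X_batch @ true_w + rng.normal(0.0, 0.1, 32)
    model.partial_fit(X_batch, y_batch)  # incremental update, no full retraining

print("learned coefficients:", np.round(model.coef_, 2))
```

After a few thousand streamed rows, the coefficients converge close to the generating weights; each update costs only one mini-batch of work, which is the property online learners buy you.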
Common Pitfalls and How to Avoid Them
Through extensive use of AutoGluon, I’ve learned some hard lessons:
1. Time Limit Too Short
# ❌ Bad: Not enough time for quality models
predictor = TabularPredictor(label='target').fit(train_data, time_limit=30)

# ✅ Good: Minimum 60s for baseline, 180s+ for quality
predictor = TabularPredictor(label='target').fit(train_data, time_limit=180)
Rule of thumb:
- 60s: Quick baseline
- 180s (3 min): Good quality
- 600s (10 min): High quality
- 3600s (1 hour): Competition-grade
2. Ignoring the Leaderboard
AutoGluon trains many models. Always check the leaderboard to understand speed/accuracy tradeoffs:
leaderboard = predictor.leaderboard(test_data)
# Find fastest model within 5% of best accuracy
best_score = leaderboard['score_test'].max()
threshold = best_score * 0.95
fast_models = leaderboard[
(leaderboard['score_test'] >= threshold) &
(leaderboard['pred_time_test'] < 0.1) # < 100ms inference
]
print(fast_models)
For production, you often want the fastest model that meets your accuracy requirements, not the most accurate model.
3. Not Specifying Evaluation Metrics
# ❌ Risky: relies on whatever default metric AutoGluon infers for the problem
predictor = TabularPredictor(label='price').fit(train_data)
# ✅ Good: Specify appropriate metric
predictor = TabularPredictor(
label='price',
eval_metric='root_mean_squared_error' # For regression
).fit(train_data)
Common metrics:
- Regression: 'root_mean_squared_error', 'mean_absolute_error', 'r2'
- Binary classification: 'roc_auc', 'f1', 'accuracy'
- Multi-class: 'log_loss', 'accuracy'
4. Forgetting to Save Your Models
AutoGluon auto-saves, but specify paths explicitly:
# ✅ Good: Explicit paths for different experiments
predictor = TabularPredictor(
label='price',
path='./models/house_prices_v1' # Clear version control
).fit(train_data)
# Load later
loaded_predictor = TabularPredictor.load('./models/house_prices_v1')
5. Not Using Presets Wisely
AutoGluon has several presets:
# Fast but basic: good for prototyping
predictor.fit(train_data, presets='medium_quality')

# Slower but better: production models
predictor.fit(train_data, presets='best_quality')

# Interpretable models only: when explainability matters
predictor.fit(train_data, presets='interpretable')

# Optimize for fast inference
predictor.fit(train_data, presets='optimize_for_deployment')
Choose based on your priorities. Don’t default to best_quality for every experiment.
6. Memory Issues with Large Datasets
# ❌ Bad: Training on 10M rows with default settings (OOM likely)
predictor.fit(huge_dataset, time_limit=600)

# ✅ Good: Sample for initial experiments
sample = huge_dataset.sample(n=100000, random_state=42)
predictor.fit(sample, time_limit=180)

# Or reduce bag folds to cut memory usage
predictor.fit(huge_dataset, time_limit=600, num_bag_folds=3)  # Fewer folds = less memory
For truly large datasets, consider:
- Training on a sample first
- Using fewer bag folds
- Increasing RAM or using cloud instances
- Distributed training (AutoGluon supports this)
Integration with Your ML Stack
AutoGluon plays well with the rest of the ecosystem.
With MLflow
import mlflow
from autogluon.tabular import TabularPredictor
with mlflow.start_run():
    predictor = TabularPredictor(label='price').fit(train_data)
    # Log metrics (MLflow metrics must be numeric; the winning
    # model's name goes in a param instead)
    scores = predictor.evaluate(test_data)
    mlflow.log_metric('rmse', abs(scores['root_mean_squared_error']))
    mlflow.log_param('best_model', predictor.get_model_best())
    # Log model artifacts
    mlflow.log_artifacts(predictor.path)
With FastAPI for Deployment
from fastapi import FastAPI
from autogluon.tabular import TabularPredictor
import pandas as pd
app = FastAPI()
predictor = TabularPredictor.load('./models/house_prices_v1')
@app.post("/predict")
def predict(data: dict):
df = pd.DataFrame([data])
prediction = predictor.predict(df)
return {"predicted_price": float(prediction.values[0])}
With Docker
FROM python:3.10-slim
RUN pip install autogluon.tabular fastapi uvicorn
COPY ./models /app/models
COPY ./app.py /app/app.py
WORKDIR /app
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
AutoGluon models are self-contained and deploy easily.
The Bottom Line: Why You Should Use AutoGluon
I’ve worked with AutoGluon extensively across a variety of projects, and here’s my take:
AutoGluon is the best option for those who:
- Work with tabular data regularly;
- Need a production-ready model quickly;
- Value their implementation time above almost everything else;
- Want state-of-the-art models without the advanced degrees or years of experience typically required;
- Want to choose which model to deploy, knowing it will deliver solid results.
However, AutoGluon may not be suitable for those who:
- Require ultimate control over their models’ architecture;
- Need highly unique model architectures with special requirements;
- Are working with very small datasets (less than 100 instances);
- Require online learning capabilities;
- Are limited by a maximum model size of less than 1 gigabyte.
For approximately 80% of real-world ML challenges, AutoGluon is your smartest option.
Getting Started: Your First AutoGluon Project
Ready to try AutoGluon? Here’s a 5-minute quickstart:
# 1. Install
!pip install autogluon
# 2. Load your data
import pandas as pd
train_data = pd.read_csv('your_data.csv')
# 3. Train
from autogluon.tabular import TabularPredictor
predictor = TabularPredictor(label='your_target_column').fit(
train_data,
time_limit=180
)
# 4. Evaluate (assuming a held-out test_data DataFrame)
leaderboard = predictor.leaderboard(test_data)
print(leaderboard)
# 5. Predict
predictions = predictor.predict(new_data)
That’s it. You now have a production-ready ensemble model.
Resources to Go Deeper
Official Documentation:
- AutoGluon Main Site
- Tutorials
- GitHub
Advanced Topics:
- Multi-modal learning (text + tabular + images)
- Time series forecasting
- Custom models
Conclusion: The Automation Revolution Continues
AutoGluon is more than just another AutoML tool; it changes the way machine learning workflows are created, optimized, and deployed. Where other approaches leave you to coordinate preprocessing, model selection, hyperparameter tuning, and ensembling yourself, AutoGluon manages all of these components intelligently and efficiently. What would normally take hours or even days can be delivered in minutes.
Rapid baselines, deep multi-layer stacked ensembles, messy real-world data, and multiple input modalities combined in one model: together these show that exceptional performance and ease of use can coexist in the same tool. AutoGluon serves professionals who want production-grade results with minimal preparation, and it serves newcomers to machine learning just as well; they still get an outcome that is close to production-ready with little difficulty.
In conclusion: the old way requires expertise, time, and patience, while AutoGluon mostly requires your dataset. While you are still writing preprocessing code, AutoGluon will likely have produced results that rival or exceed your hand-crafted pipeline.
If you need an effective, practical and production-ready option for your machine learning requirements, AutoGluon is not only the easiest option available, but also the best choice.