"The goal isn't to predict crime—it's to predict violence. And in that narrow but critical task, the algorithm shows promise."
Machine learning has transformed predictive policing from science fiction to operational reality. But the ethical implications demand we build models that are not just accurate, but transparent and accountable.
The Competition: Three Algorithms
We evaluated three variants of LightGBM, a gradient boosting framework known for speed and accuracy on tabular data.
DART (Dropouts meet Multiple Additive Regression Trees) emerged as the winner. Its dropout mechanism prevents overfitting by randomly ignoring trees during training, similar to dropout in neural networks.
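Concretely, the three candidates differ only in LightGBM's `boosting_type` parameter: `gbdt` is standard gradient boosting, `goss` subsamples rows by gradient magnitude, and `dart` applies dropout to the tree ensemble. A minimal sketch of how the comparison might be set up (`train_set`, `valid_set`, and the shared hyperparameters are illustrative, not our exact pipeline):

```python
# The three boosting strategies evaluated on identical data splits.
CANDIDATES = ['gbdt', 'goss', 'dart']

base_params = {
    'objective': 'binary',   # violent vs. non-violent
    'metric': 'auc',
    'num_leaves': 31,
    'learning_rate': 0.05,
}

for boosting_type in CANDIDATES:
    params = {**base_params, 'boosting_type': boosting_type}
    # model = lgb.train(params, train_set, num_boost_round=1000,
    #                   valid_sets=[valid_set])
    # ...record each candidate's validation AUC and keep the best
```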
What the Model Sees
Feature importance analysis reveals what information the model finds most predictive of violence. Geography dominates—confirming our spatial analysis.
| Feature | Importance | Category |
|---|---|---|
| Longitude | 25.5% | Geographic |
| Latitude | 23.9% | Geographic |
| Hour (sin) | 8.9% | Temporal |
| Hour (cos) | 7.8% | Temporal |
| Day of Week | 5.2% | Temporal |
| District Code | 4.8% | Administrative |
| Reporting Area | 4.1% | Administrative |
| Month (sin) | 2.8% | Temporal |
Geographic coordinates alone account for nearly 50% of predictive power. This validates the "location, location, location" principle of criminology—and suggests that place-based interventions may be more effective than person-based approaches.
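The Hour (sin) and Hour (cos) rows in the table reflect cyclical encoding, which maps hour-of-day onto the unit circle so the model sees 23:00 and 00:00 as neighbors rather than 23 units apart. A minimal sketch (`encode_hour` is an illustrative helper, not a function from our pipeline):

```python
import math

def encode_hour(hour):
    """Map hour-of-day (0-23) onto the unit circle.

    Returns (sin, cos) so that adjacent hours are close in feature
    space even across the midnight boundary.
    """
    angle = 2 * math.pi * hour / 24
    return math.sin(angle), math.cos(angle)
```

The same trick applies to month-of-year, which is why the table also lists Month (sin).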
Model Architecture
The winning DART model uses carefully tuned hyperparameters to balance accuracy against overfitting:
```python
import lightgbm as lgb

params = {
    'objective': 'binary',
    'metric': 'auc',
    'boosting_type': 'dart',   # Winner
    'num_leaves': 31,
    'learning_rate': 0.05,
    'feature_fraction': 0.9,   # sample 90% of features per tree
    'bagging_fraction': 0.8,   # sample 80% of rows
    'bagging_freq': 5,         # re-sample rows every 5 iterations
    'drop_rate': 0.1,          # DART-specific: fraction of trees dropped
    'skip_drop': 0.5,          # DART-specific: chance of skipping dropout
    'verbose': -1
}

model = lgb.train(
    params,
    train_set,
    num_boost_round=1000,
    valid_sets=[valid_set],
    # Replaces the early_stopping_rounds argument removed in LightGBM 4.0.
    # Note: LightGBM's docs warn that early stopping is unavailable in dart mode.
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)
```
Interpreting AUC-ROC
AUC-ROC (Area Under the Receiver Operating Characteristic curve) measures how well the model distinguishes between violent and non-violent incidents across all possible classification thresholds.
- 50% = Random guessing (coin flip)
- 64.97% = Our model (a ~30% relative improvement over chance)
- 100% = Perfect prediction
Is 64.97% good enough? It depends on the use case. For resource allocation—deciding where to focus patrols—even modest improvements over random can save lives. For individual predictions, more caution is warranted.
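AUC-ROC also has a useful probabilistic reading: it is the chance that a randomly chosen violent incident receives a higher risk score than a randomly chosen non-violent one, which is why it needs no fixed threshold. A small pair-counting sketch (O(n²), fine for illustration; production code would use a library routine):

```python
def auc(labels, scores):
    """AUC = P(score of a random positive > score of a random negative).

    Counts concordant positive/negative pairs; ties count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```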
Prediction Tasks
We trained models for three distinct prediction tasks:
| Task | Target | AUC-ROC |
|---|---|---|
| Violence Risk | Is incident violent? | 64.97% |
| Hotspot Detection | High crime area? | 71.2% |
| Crime Type | Offense category | 58.3% |
Hotspot detection performs best—predicting where crime concentrates is easier than predicting what type will occur.
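In LightGBM terms, the first two tasks are binary classification while crime type is multiclass, so only the target column and objective change between runs. A sketch of how the configurations might differ (task keys and the class count are illustrative assumptions, not from our pipeline):

```python
# Each task reuses the same geographic/temporal features;
# only the label and the objective change.
TASKS = {
    'violence_risk': {'objective': 'binary', 'metric': 'auc'},
    'hotspot':       {'objective': 'binary', 'metric': 'auc'},
    'crime_type':    {'objective': 'multiclass',
                      'metric': 'multi_logloss',
                      'num_class': 8},  # assumed number of offense categories
}
```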
Predictive policing raises serious concerns about feedback loops, bias amplification, and over-policing of historically targeted communities. Our model uses only geographic and temporal features—no demographic data—but location itself can serve as a proxy for race and class.
From Prediction to Action
Predictions alone don't reduce crime. The next chapter explores how we translate model outputs into optimized patrol routes using operations research techniques.