"The goal isn't to predict crime—it's to predict violence. And in that narrow but critical task, the algorithm shows promise."
Machine learning has transformed predictive policing from science fiction to operational reality. But the ethical implications demand we build models that are not just accurate, but transparent and accountable.
The Competition: Three Algorithms
We evaluated three variants of LightGBM, a gradient boosting framework known for speed and accuracy on tabular data.
DART (Dropouts meet Multiple Additive Regression Trees) emerged as the winner. Its dropout mechanism prevents overfitting by randomly ignoring trees during training, similar to dropout in neural networks.
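Concretely, the three candidates differ only in LightGBM's `boosting_type` parameter: `gbdt` is standard gradient boosting, `goss` subsamples rows by gradient magnitude, and `dart` applies dropout to the tree ensemble. A minimal sketch of how the comparison might be set up (`train_set`, `valid_set`, and the shared hyperparameters are illustrative, not our exact pipeline):

```python
# The three boosting strategies evaluated on identical data splits.
CANDIDATES = ['gbdt', 'goss', 'dart']

base_params = {
    'objective': 'binary',   # violent vs. non-violent
    'metric': 'auc',
    'num_leaves': 31,
    'learning_rate': 0.05,
}

for boosting_type in CANDIDATES:
    params = {**base_params, 'boosting_type': boosting_type}
    # model = lgb.train(params, train_set, num_boost_round=1000,
    #                   valid_sets=[valid_set])
    # ...record each candidate's validation AUC and keep the best
```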
What the Model Sees
Feature importance analysis reveals what information the model finds most predictive of violence. Geography dominates—confirming our spatial analysis.
| Feature | Importance | Category |
|---|---|---|
| Longitude | 25.5% | Geographic |
| Latitude | 23.9% | Geographic |
| Hour (sin) | 8.9% | Temporal |
| Hour (cos) | 7.8% | Temporal |
| Day of Week | 5.2% | Temporal |
| District Code | 4.8% | Administrative |
| Reporting Area | 4.1% | Administrative |
| Month (sin) | 2.8% | Temporal |
Geographic coordinates alone account for nearly 50% of predictive power. This validates the "location, location, location" principle of criminology—and suggests that place-based interventions may be more effective than person-based approaches.
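The Hour (sin) and Hour (cos) rows in the table reflect cyclical encoding, which maps hour-of-day onto the unit circle so the model sees 23:00 and 00:00 as neighbors rather than 23 units apart. A minimal sketch (`encode_hour` is an illustrative helper, not a function from our pipeline):

```python
import math

def encode_hour(hour):
    """Map hour-of-day (0-23) onto the unit circle.

    Returns (sin, cos) so that adjacent hours are close in feature
    space even across the midnight boundary.
    """
    angle = 2 * math.pi * hour / 24
    return math.sin(angle), math.cos(angle)
```

The same trick applies to month-of-year, which is why the table also lists Month (sin).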
Model Architecture
The winning DART model uses carefully tuned hyperparameters to balance accuracy against overfitting:
```python
import lightgbm as lgb

params = {
    'objective': 'binary',
    'metric': 'auc',
    'boosting_type': 'dart',   # Winner
    'num_leaves': 31,
    'learning_rate': 0.05,
    'feature_fraction': 0.9,   # sample 90% of features per tree
    'bagging_fraction': 0.8,   # sample 80% of rows
    'bagging_freq': 5,         # re-sample rows every 5 iterations
    'drop_rate': 0.1,          # DART-specific: fraction of trees dropped
    'skip_drop': 0.5,          # DART-specific: chance of skipping dropout
    'verbose': -1
}

model = lgb.train(
    params,
    train_set,
    num_boost_round=1000,
    valid_sets=[valid_set],
    # Replaces the early_stopping_rounds argument removed in LightGBM 4.0.
    # Note: LightGBM's docs warn that early stopping is unavailable in dart mode.
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)
```
Interpreting AUC-ROC
AUC-ROC (Area Under the Receiver Operating Characteristic curve) measures how well the model distinguishes between violent and non-violent incidents across all possible classification thresholds.
- 50% = Random guessing (coin flip)
- 64.97% = Our model (a ~30% relative improvement over chance)
- 100% = Perfect prediction
Is 64.97% good enough? It depends on the use case. For resource allocation—deciding where to focus patrols—even modest improvements over random can save lives. For individual predictions, more caution is warranted.
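AUC-ROC also has a useful probabilistic reading: it is the chance that a randomly chosen violent incident receives a higher risk score than a randomly chosen non-violent one, which is why it needs no fixed threshold. A small pair-counting sketch (O(n²), fine for illustration; production code would use a library routine):

```python
def auc(labels, scores):
    """AUC = P(score of a random positive > score of a random negative).

    Counts concordant positive/negative pairs; ties count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```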
Prediction Tasks
We trained models for three distinct prediction tasks:
| Task | Target | AUC-ROC |
|---|---|---|
| Violence Risk | Is incident violent? | 64.97% |
| Hotspot Detection | High crime area? | 71.2% |
| Crime Type | Offense category | 58.3% |
Hotspot detection performs best—predicting where crime concentrates is easier than predicting what type will occur.
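In LightGBM terms, the first two tasks are binary classification while crime type is multiclass, so only the target column and objective change between runs. A sketch of how the configurations might differ (task keys and the class count are illustrative assumptions, not from our pipeline):

```python
# Each task reuses the same geographic/temporal features;
# only the label and the objective change.
TASKS = {
    'violence_risk': {'objective': 'binary', 'metric': 'auc'},
    'hotspot':       {'objective': 'binary', 'metric': 'auc'},
    'crime_type':    {'objective': 'multiclass',
                      'metric': 'multi_logloss',
                      'num_class': 8},  # assumed number of offense categories
}
```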
Predictive policing raises serious concerns about feedback loops, bias amplification, and over-policing of historically targeted communities. Our model uses only geographic and temporal features—no demographic data—but location itself can serve as a proxy for race and class.
From Prediction to Action
Predictions alone don't reduce crime. The next chapter explores how we translate model outputs into optimized patrol routes using operations research techniques.