Hierarchical Risk Parity (HRP) for Portfolio Optimization

Cluster-based portfolio allocation using hierarchical clustering and graph theory. HRP avoids covariance inversion for more stable diversification.

Abstract

This paper presents Hierarchical Risk Parity (HRP), a portfolio allocation method that uses hierarchical clustering to structure asset weights without inverting the covariance matrix. By transforming correlations into distance metrics and applying recursive bisection, HRP produces more stable allocations than mean-variance optimization, particularly in high-dimensional or regime-shifting environments. We provide a complete Python implementation and discuss practical advantages over classical approaches.

Introduction

Classical portfolio theory is elegant, but in practical quant workflows it often breaks where the algebra looks strongest. Mean-variance optimization requires estimating expected returns and inverting the covariance matrix. In small samples, high dimensions, or unstable regimes, that process becomes fragile. Tiny changes in input can produce violent changes in weights.

Hierarchical Risk Parity (HRP) avoids direct covariance inversion and uses hierarchical clustering to structure allocation. Assets are not independent points in space — they form dependency clusters: banks, semiconductors, sovereign bonds, energy names, or factor-like groups.

HRP first measures similarity using correlation, then transforms that into a distance metric:

Correlation Distance
$$d_{ij} = \sqrt{\frac{1 - \rho_{ij}}{2}}$$
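The transform can be sanity-checked at its endpoints: perfectly correlated assets sit at distance 0, uncorrelated assets at \(\sqrt{1/2} \approx 0.707\), and perfectly anti-correlated assets at 1. A minimal sketch:

```python
import numpy as np

def correl_dist(rho):
    """Map a correlation in [-1, 1] to a distance in [0, 1]."""
    return np.sqrt((1 - rho) / 2)

print(correl_dist(1.0))   # 0.0 -- identical assets
print(correl_dist(0.0))   # ~0.7071 -- uncorrelated
print(correl_dist(-1.0))  # 1.0 -- perfectly anti-correlated
```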

Once the hierarchy is built via clustering, HRP applies two steps: quasi-diagonalization (reorder the covariance matrix so similar assets are adjacent) and recursive bisection. If two clusters have variances \(\sigma_L^2\) and \(\sigma_R^2\), the left cluster receives weight:

Recursive Bisection Allocation
$$w_L = 1 - \frac{\sigma_L^2}{\sigma_L^2 + \sigma_R^2}, \quad w_R = 1 - w_L$$
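A quick numeric check of the rule (the variances here are illustrative, not from real data): with \(\sigma_L^2 = 0.01\) and \(\sigma_R^2 = 0.03\), the left, lower-variance cluster receives three quarters of the weight.

```python
# Illustrative cluster variances -- not estimated from any real data.
var_left, var_right = 0.01, 0.03

w_left  = 1 - var_left / (var_left + var_right)  # = var_right / total
w_right = 1 - w_left

print(w_left, w_right)  # 0.75 0.25 -- the calmer cluster gets more weight
```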

Python Implementation

hrp.py Python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

def correl_dist(corr):
    """Map pairwise correlations to the distance d_ij = sqrt((1 - rho_ij) / 2)."""
    return np.sqrt((1 - corr) / 2)

def get_cluster_var(cov, cluster_items):
    """Cluster variance under inverse-variance weighting within the cluster."""
    sub_cov = cov.loc[cluster_items, cluster_items]
    ivp = 1 / np.diag(sub_cov)  # inverse-variance portfolio weights
    ivp = ivp / ivp.sum()
    return np.dot(ivp, np.dot(sub_cov, ivp))

def hrp_allocation(returns: pd.DataFrame) -> pd.Series:
    """Compute HRP weights from a DataFrame of asset returns."""
    cov  = returns.cov()
    corr = returns.corr()
    dist = correl_dist(corr)

    # Hierarchical clustering on the condensed distance matrix, then
    # quasi-diagonalization: reorder assets so similar ones sit adjacently.
    link    = linkage(squareform(dist.values, checks=False), method="single")
    sort_ix = corr.index[leaves_list(link)]

    weights  = pd.Series(1.0, index=sort_ix)
    clusters = [list(sort_ix)]

    # Recursive bisection: split each cluster in half and allocate
    # between the halves inversely to their variance.
    while clusters:
        cluster = clusters.pop(0)
        if len(cluster) <= 1:
            continue

        split = len(cluster) // 2
        left, right = cluster[:split], cluster[split:]

        var_left  = get_cluster_var(cov, left)
        var_right = get_cluster_var(cov, right)

        # The lower-variance side receives the larger share.
        alpha = 1 - var_left / (var_left + var_right)
        weights[left]  *= alpha
        weights[right] *= (1 - alpha)

        clusters.extend([left, right])

    return weights / weights.sum()
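The clustering and quasi-diagonalization step can be checked in isolation on a toy correlation matrix (the numbers below are illustrative): assets 0/1 and 2/3 are strongly correlated, so the dendrogram's leaf order should keep each pair adjacent.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

# Toy 4-asset correlation matrix with two obvious blocks: {0, 1} and {2, 3}.
corr = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
])
dist = np.sqrt((1 - corr) / 2)

link  = linkage(squareform(dist, checks=False), method="single")
order = leaves_list(link)  # leaf order places similar assets next to each other

quasi_diag = corr[np.ix_(order, order)]  # reordered (quasi-diagonalized) matrix
print(order)
```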

HRP's advantage is that it treats dependence structure as an object worth modeling directly. That becomes valuable when correlations are unstable, samples are short, and optimization error matters more than elegant closed forms. It tends to behave well when traditional optimizers overreact to noisy means and covariances.