Session-based recommendation often hides a hierarchical structure: users start with a coarse intent (e.g., “running shoes”), then narrow down to brand, style, size, and price. Euclidean embeddings are good at “flat similarity”, but they are not a natural geometry for tree-like growth. HCGR's core idea is to model session graphs in hyperbolic space (specifically the Lorentz model) and use contrastive learning to make the representations more robust and discriminative.
Why hyperbolic geometry shows up in session recommendation
In many recommender datasets you see:
- power-law popularity: a few head items dominate interactions
- taxonomy-like structure: categories → subcategories → specific items
- expanding neighborhoods: as you move from a coarse concept outward, the number of fine-grained choices grows quickly
This “branching growth” matches hyperbolic space better than Euclidean space. In Euclidean space, the volume grows polynomially with radius; in hyperbolic space (negative curvature), the volume grows roughly exponentially with radius, which fits tree-like structures.
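One concrete way to see this (a standard fact from hyperbolic geometry, not specific to HCGR): with curvature $-1$, a circle of radius $r$ has circumference $2\pi\sinh(r)$ instead of the Euclidean $2\pi r$, so the “room” available grows exponentially with distance from a center:

```python
import math

# Circumference of a circle of radius r:
#   Euclidean plane:                 2*pi*r        (polynomial in r)
#   hyperbolic plane (curvature -1): 2*pi*sinh(r)  (exponential in r)
for r in [1.0, 3.0, 6.0]:
    euc = 2 * math.pi * r
    hyp = 2 * math.pi * math.sinh(r)
    print(f"r={r:.0f}  euclidean={euc:8.1f}  hyperbolic={hyp:10.1f}  ratio={hyp / euc:6.1f}")
```

The rapidly growing ratio is what lets trees with exponentially many leaves embed with low distortion in few dimensions.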
Practical intuition:
- coarse concepts can sit closer to the center
- fine-grained items can spread out without forcing everything into very high dimension
Session graph modeling recap (what is the graph here?)
Given a session $s = [v_1, v_2, \dots, v_n]$ (the ordered sequence of clicked items), build a directed graph with:
- nodes: unique items in the session
- edges: transitions
(optionally weighted by frequency)
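As a minimal sketch of this construction (function name is illustrative, not from the paper):

```python
from collections import Counter

def session_graph(session):
    # nodes: the unique items in the session
    # edges: consecutive transitions (v_t -> v_{t+1}), weighted by frequency
    nodes = sorted(set(session))
    edges = Counter(zip(session, session[1:]))
    return nodes, dict(edges)

nodes, edges = session_graph([3, 7, 2, 7, 2, 9])
print(nodes)   # [2, 3, 7, 9]
print(edges)   # {(3, 7): 1, (7, 2): 2, (2, 7): 1, (2, 9): 1}
```

Repeated transitions collapse into a single weighted edge, which is what lets the graph view de-duplicate noisy back-and-forth clicking.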
Graph-based session models (e.g., SR-GNN family) are strong at capturing local transition structure. HCGR keeps that spirit but changes the representation geometry and adds contrastive training signals.
The Lorentz model: a workable hyperbolic space for learning
There are multiple equivalent models of hyperbolic geometry (Poincaré ball, Lorentz/hyperboloid, Klein). HCGR uses the Lorentz model because it is numerically stable for optimization and has convenient closed-form formulas.
Hyperboloid manifold

Define the Lorentzian inner product for vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^{n+1}$:

$$\langle \mathbf{x}, \mathbf{y} \rangle_{\mathcal{L}} = -x_0 y_0 + \sum_{i=1}^{n} x_i y_i$$

The Lorentz (hyperboloid) manifold is then the set $\mathbb{H}^n = \{\, \mathbf{x} \in \mathbb{R}^{n+1} : \langle \mathbf{x}, \mathbf{x} \rangle_{\mathcal{L}} = -1,\ x_0 > 0 \,\}$.

Distance

The Lorentz distance can be written as:

$$d_{\mathcal{L}}(\mathbf{x}, \mathbf{y}) = \operatorname{arccosh}\big(\!-\!\langle \mathbf{x}, \mathbf{y} \rangle_{\mathcal{L}}\big)$$
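A minimal NumPy sketch of the Lorentzian inner product and distance (the `lift_to_hyperboloid` helper is an illustrative convenience for putting Euclidean vectors on the manifold, not part of the paper):

```python
import numpy as np

def lorentz_inner(x, y):
    # <x, y>_L = -x0*y0 + x1*y1 + ... + xn*yn
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def lorentz_distance(x, y):
    # d(x, y) = arccosh(-<x, y>_L); the clip guards against rounding below 1
    return np.arccosh(np.clip(-lorentz_inner(x, y), 1.0, None))

def lift_to_hyperboloid(v):
    # solve -x0^2 + ||v||^2 = -1 for x0 to place v on the manifold
    return np.concatenate([[np.sqrt(1.0 + np.dot(v, v))], v])

x = lift_to_hyperboloid(np.array([0.3, -0.2]))
y = lift_to_hyperboloid(np.array([1.5, 0.8]))
print(lorentz_inner(x, x))     # ≈ -1: x satisfies the manifold constraint
print(lorentz_distance(x, y))  # positive distance between distinct points
```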
Tangent space + exp/log maps (how you do gradient updates)
Optimization is often done by:
- moving computations into a tangent space (locally Euclidean)
- applying standard operations
- mapping back to the manifold
You'll frequently see the exponential map $\exp_{\mathbf{x}}(\mathbf{v})$, which moves from a point $\mathbf{x}$ along a tangent vector $\mathbf{v}$ back onto the manifold, and its inverse, the logarithmic map $\log_{\mathbf{x}}(\mathbf{y})$:

$$\exp_{\mathbf{x}}(\mathbf{v}) = \cosh\big(\|\mathbf{v}\|_{\mathcal{L}}\big)\,\mathbf{x} + \sinh\big(\|\mathbf{v}\|_{\mathcal{L}}\big)\,\frac{\mathbf{v}}{\|\mathbf{v}\|_{\mathcal{L}}}, \qquad \|\mathbf{v}\|_{\mathcal{L}} = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle_{\mathcal{L}}}$$

You don't need to memorize the closed forms to use the idea: the key is that HCGR is doing “graph representation learning”, but the representation lives on the Lorentz manifold rather than in flat Euclidean space.
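The exponential and logarithmic maps for the Lorentz model (curvature $-1$) can be sketched in a few lines of NumPy; this is an illustrative implementation, not the paper's code:

```python
import numpy as np

def linner(x, y):
    # Lorentzian inner product <x, y>_L
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def lift(v):
    # place a Euclidean vector v on the hyperboloid: x0 = sqrt(1 + ||v||^2)
    return np.concatenate([[np.sqrt(1.0 + np.dot(v, v))], v])

def expmap(x, v, eps=1e-12):
    # exp_x(v): walk from x along tangent vector v, landing on the manifold
    n = np.sqrt(max(linner(v, v), 0.0)) + eps
    return np.cosh(n) * x + np.sinh(n) * v / n

def logmap(x, y, eps=1e-12):
    # log_x(y): tangent vector at x pointing toward y, with length d(x, y)
    a = np.clip(-linner(x, y), 1.0, None)   # a = cosh(d(x, y))
    d = np.arccosh(a)
    return d * (y - a * x) / (np.sinh(d) + eps)

x = lift(np.array([0.3, -0.2]))
y = lift(np.array([1.5, 0.8]))
v = logmap(x, y)
print(np.allclose(expmap(x, v), y))  # True: exp_x(log_x(y)) recovers y
```

The round trip `expmap(x, logmap(x, y)) == y` is the sanity check worth keeping in any implementation, since it catches most sign and normalization bugs.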
Hyperbolic graph aggregation: “attention” on a curved space
In Euclidean GNNs, we typically do neighbor aggregation via:
- weighted sums
- attention mechanisms
- message passing with MLPs
In hyperbolic space, you can't naively sum points on the manifold. A common pattern is:
- map node embeddings to a tangent space
- do attention-weighted aggregation (Euclidean operation)
- map the result back to the manifold
Conceptually, one aggregation step looks like:

$$\mathbf{h}_v' = \exp_{\mathbf{h}_v}\!\Big(\sum_{u \in \mathcal{N}(v)} \alpha_{vu}\, \log_{\mathbf{h}_v}(\mathbf{h}_u)\Big)$$

where $\alpha_{vu}$ are attention weights computed from the tangent-space representations.
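Putting the three steps together, here is a toy sketch of tangent-space attention aggregation (the helper names and the fixed attention scores are assumptions for illustration; a real model would learn the scores):

```python
import numpy as np

def linner(x, y):
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def lift(v):
    return np.concatenate([[np.sqrt(1.0 + np.dot(v, v))], v])

def expmap(x, v, eps=1e-12):
    n = np.sqrt(max(linner(v, v), 0.0)) + eps
    return np.cosh(n) * x + np.sinh(n) * v / n

def logmap(x, y, eps=1e-12):
    a = np.clip(-linner(x, y), 1.0, None)
    d = np.arccosh(a)
    return d * (y - a * x) / (np.sinh(d) + eps)

def hyperbolic_attention_aggregate(x, neighbors, scores):
    # 1) map neighbors into the tangent space at x
    tangents = [logmap(x, h) for h in neighbors]
    # 2) softmax-weighted sum -- an ordinary Euclidean operation
    w = np.exp(scores - np.max(scores))
    w /= w.sum()
    v = sum(wi * ti for wi, ti in zip(w, tangents))
    # 3) map the aggregate back onto the hyperboloid
    return expmap(x, v)

x = lift(np.array([0.1, 0.2]))
nbrs = [lift(np.array([0.5, -0.3])), lift(np.array([-0.4, 0.6]))]
out = hyperbolic_attention_aggregate(x, nbrs, np.array([1.0, 0.5]))
print(out)  # a point back on the hyperboloid: <out, out>_L ≈ -1
```

Because tangent vectors at the same base point form an ordinary vector space, the weighted sum in step 2 is well-defined, and step 3 guarantees the output satisfies the manifold constraint again.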
Why this is useful:
- local transitions still matter (graph neighborhood)
- hyperbolic geometry helps preserve hierarchical separation while aggregating
Contrastive learning: make representations stable and discriminative
Session graphs are noisy. A single session can contain exploration clicks, repeated items, and imperfect signals. Contrastive learning improves robustness by enforcing:
- “two views of the same session should be close”
- “different sessions should be separated”
Two-view augmentation for sessions
Typical augmentations for session graphs include:
- edge dropout (remove some transitions)
- node dropout (drop some items)
- subgraph sampling
- perturbation in order (small swaps) — depending on method design
You generate two augmented views of the same session graph and treat them as a positive pair; views of different sessions in the batch serve as negatives.
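A sketch of such augmentations (function names and dropout rates are illustrative choices, not the paper's exact settings):

```python
import random

def edge_dropout(edges, p=0.2, rng=None):
    # randomly remove a fraction ~p of transition edges
    rng = rng or random.Random(0)
    return [e for e in edges if rng.random() > p]

def node_dropout(session, p=0.2, rng=None):
    # randomly drop items; keep at least the first item so the view is non-empty
    rng = rng or random.Random(0)
    kept = [v for v in session if rng.random() > p]
    return kept or session[:1]

session = [3, 7, 7, 2, 9, 4]
edges = list(zip(session, session[1:]))

view1 = node_dropout(session, rng=random.Random(1))
view2 = node_dropout(session, rng=random.Random(2))
aug_edges = edge_dropout(edges, rng=random.Random(3))
print(view1, view2)   # two stochastic views of the same session
print(aug_edges)      # a subset of the original transitions
```

Two independent draws give two views; the augmentation strength `p` is a hyperparameter that typically needs tuning per dataset.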
A common contrastive objective (InfoNCE-style)
Let $\mathbf{z}_i'$ and $\mathbf{z}_i''$ denote the representations of the two views of session $i$, $\mathrm{sim}(\cdot, \cdot)$ a similarity function (e.g., cosine), and $\tau$ a temperature. The loss pulls matching views together and pushes other sessions apart:

$$\mathcal{L}_{cl} = -\sum_{i} \log \frac{\exp\big(\mathrm{sim}(\mathbf{z}_i', \mathbf{z}_i'')/\tau\big)}{\sum_{j} \exp\big(\mathrm{sim}(\mathbf{z}_i', \mathbf{z}_j'')/\tau\big)}$$
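An illustrative NumPy version of an InfoNCE-style objective with cosine similarity (a sketch, not HCGR's exact implementation):

```python
import numpy as np

def info_nce(z1, z2, tau=0.2):
    # rows of z1/z2 are the two views' embeddings; row i of each is a positive pair
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau  # pairwise cosine similarities / temperature
    # log-softmax over each row; the diagonal entry is the positive pair
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_aligned = info_nce(z, z + 0.01 * rng.normal(size=z.shape))  # near-identical views
loss_random = info_nce(z, rng.normal(size=z.shape))              # unrelated "views"
print(loss_aligned, loss_random)
```

The loss is small when each session's two views are mutual nearest neighbors in the batch, and large when they are not; the temperature `tau` sharpens or softens that contrast.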
Final objective: recommendation + contrastive regularization
HCGR typically combines:
- a recommendation loss (cross-entropy over next item, or pairwise ranking)
- a contrastive loss as auxiliary regularization
Conceptually:

$$\mathcal{L} = \mathcal{L}_{rec} + \lambda \, \mathcal{L}_{cl}$$

where $\lambda$ balances the auxiliary contrastive term against the main recommendation loss.
What to look for in results (and how to sanity-check the claim)
When reading HCGR-style papers, I focus on:
- Is hyperbolic geometry really helping, or is it just more parameters?
  Look for controlled comparisons: Euclidean vs hyperbolic under comparable capacity.
- Does contrastive learning provide consistent gains?
  Ablations should show improvement across datasets, not only one.
- Does it help head vs tail items differently?
  Hyperbolic geometry is often motivated by hierarchy / long-tail; look for breakdowns.
- Training stability
  Hyperbolic optimization can be tricky; check whether they use stable parameterizations and whether results are reproducible.
Practical takeaways for your own system
If you are building a session recommender:
- Start with a strong baseline (SR-GNN-like graph model, or an attention-based sequential model).
- If your data shows strong hierarchical structure (categories, long tail, multi-level intent), hyperbolic embeddings are worth trying.
- Contrastive learning is often a “cheap win” if you can define meaningful augmentations.
But also be honest about the cost:
- implementation complexity increases (manifold operations, stability)
- tuning becomes more delicate (curvature, temperature, augmentation strength)
A minimal reproducibility checklist
To reproduce HCGR-style results without getting lost:
- Fix random seeds and report variance over multiple runs.
- Use the same evaluation protocol as baselines (session split, metrics, candidate set).
- Report ablations:
- Euclidean vs hyperbolic
- with vs without contrastive loss
- augmentation types/strength
- embedding dimension and curvature sensitivity
If those pieces hold, the paper's contribution is much more convincing.
- Post title: HCGR: Hyperbolic Contrastive Graph Representation Learning for Session-based Recommendation
- Post author: Chen Kai
- Create time: 2023-09-04 00:00:00
- Post link: https://www.chenk.top/en/hcgr/
- Copyright Notice: All articles in this blog are licensed under BY-NC-SA unless stated otherwise.