Session-based Recommendation with Graph Neural Networks (SR-GNN)
Chen Kai

Session-based recommendation is challenging when you only observe a short click sequence and have little or no long-term user profile. SR-GNN tackles this by turning each session into a directed graph, where repeated items and multi-step transitions form richer structure than a plain sequence. A gated GNN propagates information over this session graph to learn item representations, and the model then aggregates them into a session representation to score next-item candidates. This note explains the session-graph construction, the gated message passing update, and how SR-GNN produces the final ranking — highlighting why this graph view often outperforms purely sequential baselines on standard SBR benchmarks.

Background

In session-based recommendation, we only observe a short sequence of clicks within the current session and aim to predict the next item. Formally, given an item set V = {v_1, v_2, …, v_m} and a session s = [v_{s,1}, v_{s,2}, …, v_{s,n}] ordered by time, the goal is to predict the next click v_{s,n+1}. SR-GNN outputs a score vector ŷ over all m items; the top-K items are recommended.

Paper: Session-based Recommendation with Graph Neural Networks (Wu et al., AAAI 2019)

Method details

Session graph construction

To capture complex transitions within a session, SR-GNN converts each session into a directed graph:

  • Nodes: items clicked in the session.
  • Edges: directed transitions following the click order.

For example, a click sequence such as v_1 → v_2 → v_3 → v_2 → v_4 yields a session graph in which the repeated item (here v_2) appears as a single node with multiple incoming and outgoing edges. Edge weights are normalized by occurrence counts and by each node's out-degree to account for repetition.
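A minimal sketch of this construction in numpy (function and variable names are mine, not the official implementation): nodes are the unique items of one session, and the outgoing/incoming adjacency matrices are normalized by out-degree and in-degree respectively.

```python
import numpy as np

def build_session_graph(session):
    """Build normalized outgoing/incoming adjacency matrices for one session.

    Illustrative sketch of SR-GNN's session-graph construction: nodes are
    the unique items, edges follow consecutive clicks, and rows are
    normalized by degree so repeated transitions share weight.
    """
    nodes = sorted(set(session))                  # unique items -> graph nodes
    index = {item: i for i, item in enumerate(nodes)}
    n = len(nodes)
    counts = np.zeros((n, n))
    for src, dst in zip(session, session[1:]):    # directed click transitions
        counts[index[src], index[dst]] += 1.0
    out_deg = counts.sum(axis=1, keepdims=True)   # (n, 1)
    A_out = np.divide(counts, out_deg,
                      out=np.zeros_like(counts), where=out_deg > 0)
    in_deg = counts.sum(axis=0, keepdims=True).T  # (n, 1)
    A_in = np.divide(counts.T, in_deg,
                     out=np.zeros_like(counts), where=in_deg > 0)
    return nodes, A_out, A_in
```

For the session 1 → 2 → 3 → 2 → 4, item 2 has two outgoing transitions (to 3 and to 4), so each outgoing edge gets weight 0.5 after normalization.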

Learning item embeddings with a gated GNN

After constructing the session graph, SR-GNN applies a gated GNN to propagate and aggregate information over the graph. For node v_i at propagation step t, the update is:

    a_{s,i}^t = A_{s,i:} [v_1^{t−1}, …, v_n^{t−1}]^T H + b
    z_{s,i}^t = σ(W_z a_{s,i}^t + U_z v_i^{t−1})
    r_{s,i}^t = σ(W_r a_{s,i}^t + U_r v_i^{t−1})
    ṽ_i^t = tanh(W_o a_{s,i}^t + U_o (r_{s,i}^t ⊙ v_i^{t−1}))
    v_i^t = (1 − z_{s,i}^t) ⊙ v_i^{t−1} + z_{s,i}^t ⊙ ṽ_i^t

Intuitively, each node aggregates messages from its neighbors via the adjacency structure, then uses GRU-like gates to update its state.

  • A_{s,i:}: the row of the session adjacency matrix holding node v_i's connections.
  • v_i^{t−1}: the previous-step node embeddings.
  • H, b and the gate matrices (W_z, W_r, W_o, U_z, U_r, U_o): learnable parameters.

After multiple propagation steps, the final node embeddings v_i capture item-to-item dependencies within the session graph.
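The gated update can be sketched in a few lines of numpy. This is an illustrative stand-in, not the official code: the dictionary keys (`H_out`, `W_z`, …) are my names for the learnable parameters, and the demo uses random weights and an identity adjacency just to show the shapes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_step(h, A_out, A_in, p):
    """One GRU-gated propagation step over the session graph (numpy sketch).

    h: (n, d) node states; A_out / A_in: (n, n) normalized adjacency.
    Parameter names in `p` are illustrative stand-ins for H, b and the
    gate matrices of the paper.
    """
    # aggregate neighbor messages along outgoing and incoming edges
    a = np.concatenate([A_out @ h @ p["H_out"], A_in @ h @ p["H_in"]], axis=1)
    z = sigmoid(a @ p["W_z"] + h @ p["U_z"])            # update gate
    r = sigmoid(a @ p["W_r"] + h @ p["U_r"])            # reset gate
    cand = np.tanh(a @ p["W_o"] + (r * h) @ p["U_o"])   # candidate state
    return (1 - z) * h + z * cand                       # gated interpolation

# tiny usage example: n=3 nodes, d=4 dims, random illustrative weights
rng = np.random.default_rng(0)
n, d = 3, 4
p = {k: rng.normal(scale=0.1, size=s) for k, s in {
    "H_out": (d, d), "H_in": (d, d),
    "W_z": (2 * d, d), "U_z": (d, d),
    "W_r": (2 * d, d), "U_r": (d, d),
    "W_o": (2 * d, d), "U_o": (d, d)}.items()}
h = rng.normal(size=(n, d))
A = np.eye(n)                       # placeholder adjacency for the demo
h1 = ggnn_step(h, A, A, p)
```

Running several such steps (the paper uses one or two) propagates information multiple hops across the session graph.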

Building a session representation

After node embeddings are learned, SR-GNN forms a session representation by combining a local signal and a global aggregation:

  • Local: use the embedding of the last-clicked item v_n as short-term intent.

  • Global: apply an attention-like aggregation over all item embeddings in the session:

        α_i = q^T σ(W_1 v_n + W_2 v_i + c),   s_g = Σ_i α_i v_i

    - q is a learnable query vector controlling importance weights.
    - v_n (the last click) anchors the attention.
    - W_1 and W_2 project embeddings for computing attention weights.

  • Final: combine the local and global vectors to get the session embedding s_h = W_3 [v_n ; s_g].
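The steps above can be sketched directly in numpy. Parameter names mirror the equations (q, c, W_1, W_2, W_3), but their values here are random illustrative placeholders, not trained weights.

```python
import numpy as np

def session_embedding(V, q, c, W1, W2, W3):
    """Build the hybrid session embedding s_h = W3 [v_n ; s_g] (numpy sketch).

    V is (n, d) with rows in click order; the last row is the last click.
    """
    v_n = V[-1]                                               # local signal
    gate = 1.0 / (1.0 + np.exp(-(W1 @ v_n + V @ W2.T + c)))   # (n, d)
    alpha = gate @ q                                          # (n,) weights
    s_g = alpha @ V                                           # global: weighted sum
    return W3 @ np.concatenate([v_n, s_g])                    # (d,) embedding

# usage with random illustrative parameters: n=5 clicks, d=8 dims
rng = np.random.default_rng(1)
n, d = 5, 8
V = rng.normal(size=(n, d))
q, c = rng.normal(size=d), rng.normal(size=d)
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
W3 = rng.normal(size=(d, 2 * d))
s_h = session_embedding(V, q, c, W1, W2, W3)
```

Note that the attention weights α_i are used as-is (no softmax over positions), matching the equation above.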

Prediction and training

Given the session embedding s_h, SR-GNN scores each candidate item v_i by a dot product with its embedding:

    ẑ_i = s_h^T v_i

Stacking the item embeddings into a matrix V ∈ R^{m×d}, the score vector over all m items is ẑ = V s_h, and softmax converts the scores into probabilities:

    ŷ = softmax(ẑ)

The model is trained with a cross-entropy objective over these softmax-normalized scores,

    L = − Σ_i y_i log ŷ_i,

where y is the one-hot ground-truth label and ŷ_i is the predicted probability of item v_i.
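A minimal numpy sketch of this scoring and loss computation (the identity item embeddings in the demo are purely for illustration):

```python
import numpy as np

def predict(s_h, item_emb, target=None):
    """Dot-product scoring over the catalog, softmax, optional CE loss (sketch)."""
    z = item_emb @ s_h                  # (m,): one score per catalog item
    z = z - z.max()                     # numerical stabilization for softmax
    p = np.exp(z)
    p = p / p.sum()                     # softmax probabilities
    loss = None if target is None else -np.log(p[target] + 1e-12)
    return p, loss

# demo: 4 one-hot "item embeddings"; s_h points strongly toward item 2
item_emb = np.eye(4)
s_h = np.array([0.0, 0.0, 5.0, 0.0])
p, loss = predict(s_h, item_emb, target=2)
```

Ranking the entries of `p` (or equivalently the raw scores `z`) gives the top-K recommendation list.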

Loss and optimization

A standard training objective is cross-entropy between the predicted distribution and the one-hot target next item. In practice, SR-GNN is trained with backpropagation through time (BPTT) over session sequences; since sessions are usually short, this is typically manageable.

Implementation reference

The original implementation is available at:

https://github.com/CRIPAC-DIG/SR-GNN/tree/master

In this note, I focus on the model structure and equations; for production use, refer to the official code for data preprocessing (session graphs, normalization, batching) and training details.


Why Session Graphs Outperform Sequential Baselines

Problem with pure sequence models (RNN/GRU)

Traditional RNN-based session models treat a session as a linear sequence and use a hidden state to encode history, e.g. h_t = GRU(h_{t−1}, emb(v_t)).

Limitations:

  • Lost transitions: If a user clicks A → B → C → B, a linear encoding conflates the two visits to B, so the B → C transition is no longer represented explicitly once B is revisited.
  • No explicit relational structure: The model must learn dependencies implicitly through hidden states.
  • Fixed directionality: RNN processes left-to-right; cannot model bidirectional dependencies naturally.

Session graph advantages

By converting the session into a graph:

  • Preserves all transitions: Edge (B, C) remains even when user revisits B.
  • Explicit structure: GNN message passing directly models item-to-item dependencies.
  • Bidirectional propagation: Information flows in both directions along edges.

Hyperparameters and Training Details

Key hyperparameters

From the original paper:

| Hyperparameter | Value | Description |
| --- | --- | --- |
| Embedding dim | 100 | Item embedding size |
| GNN layers | 1–2 | Number of gated propagation steps |
| Batch size | 100 | Number of sessions per batch |
| Learning rate | 0.001 | Adam optimizer |
| Dropout | 0.5 | Regularization |

Training strategy

  • Objective: Cross-entropy loss with softmax over all items
  • Optimizer: Adam with default settings
  • Early stopping: Monitor validation recall@20, stop if no improvement for 5 epochs
  • Negative sampling: For large item catalogs, use sampled softmax to reduce compute
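The early-stopping rule above can be sketched as a small helper. The 5-epoch patience follows the strategy listed here; the function name and interface are my own.

```python
def should_stop(recall_history, patience=5):
    """Early stopping on validation Recall@20 (sketch).

    Stop when the best value seen so far is `patience` or more epochs old.
    `recall_history` holds one validation Recall@20 value per epoch.
    """
    best_epoch = max(range(len(recall_history)), key=recall_history.__getitem__)
    return len(recall_history) - 1 - best_epoch >= patience
```

Called once per epoch after validation, this keeps training going as long as Recall@20 keeps improving.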

Common Failure Modes and Troubleshooting

Failure 1: Popularity bias

Symptom: Recall@20 is decent but diversity is low; the top-K recommendations are always the same popular items.

Cause: Imbalanced training data (popular items dominate sessions).

Fix:

  • Add a popularity penalty term to the loss.
  • Use inverse propensity weighting to reweight samples.
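One way to sketch the inverse-propensity-weighting fix: scale each example's cross-entropy by an inverse power of its target item's training-set popularity. The exponent `alpha` and the mean-normalization are assumptions of this sketch, not part of the paper.

```python
import numpy as np

def ipw_cross_entropy(nll, targets, item_counts, alpha=0.5):
    """Reweight per-example losses by inverse item popularity (sketch).

    nll: per-example cross-entropy values; targets: target item ids;
    item_counts: item id -> training click count; alpha in [0, 1]
    tempers the correction (alpha=0 recovers the unweighted loss).
    """
    counts = np.array([item_counts[t] for t in targets], dtype=float)
    w = counts ** (-alpha)        # rare items get larger weights
    w = w / w.mean()              # keep the overall loss scale unchanged
    return float(np.mean(w * np.asarray(nll)))
```

With counts {popular: 100, rare: 1} and alpha=1, a mistake on the popular item costs much less than the same mistake on the rare item, nudging the model away from always recommending head items.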

Failure 2: Poor performance on short sessions

Symptom: Long sessions (n > 10) work well, but short sessions (n ≤ 3) have low recall.

Cause: Graph structure is too sparse for short sessions.

Fix:

  • Augment short sessions with co-click patterns from the training set.
  • Use a hybrid model: GNN for long sessions, item-KNN or popularity baseline for short sessions.
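The hybrid fix amounts to a routing rule. A minimal sketch, where the threshold `min_len` is an assumed hyperparameter to tune on validation data and the two scorers are opaque callables:

```python
def recommend(session, gnn_score, knn_score, min_len=4):
    """Hybrid routing (sketch): GNN for long sessions, item-KNN fallback
    for short ones. `min_len` is an assumed, tunable threshold."""
    scorer = gnn_score if len(session) >= min_len else knn_score
    return scorer(session)

# usage with stub scorers standing in for real models
gnn = lambda s: "gnn"
knn = lambda s: "knn"
```

In production the fallback could equally be a popularity model; the point is that the dispatch is cheap and purely length-based.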

Failure 3: Overfitting on small datasets

Symptom: Training recall is high but validation recall plateaus early.

Cause: GNN has too many parameters relative to dataset size.

Fix:

  • Reduce embedding dimension (e.g., 100 → 50).
  • Increase dropout (e.g., 0.5 → 0.7).
  • Use weight decay (L2 regularization).

Variants and Extensions

1. Attention-based SR-GNN

Replace the fixed, adjacency-weighted aggregation with learned attention weights over neighboring nodes (GAT-style).

Benefit: Learns which transitions are more important.

2. Temporal SR-GNN

Add time gaps as edge features: encode the gap Δt between consecutive clicks into the edge weight, e.g. with an exponential decay w = exp(−λ Δt).

Benefit: Recent clicks weigh more than old ones.
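A sketch of such a decay, parameterized by half-life rather than λ for readability (the one-hour default is an assumption, not a value from any paper):

```python
import math

def time_decay_weight(dt_seconds, half_life=3600.0):
    """Exponential time-decay edge weight (sketch).

    An edge whose clicks are `half_life` seconds apart gets weight 0.5;
    back-to-back clicks get weight 1.0.
    """
    return math.exp(-math.log(2.0) * dt_seconds / half_life)
```

These weights would multiply the normalized adjacency entries before message passing, so stale transitions contribute less to the aggregated messages.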

3. Multi-task SR-GNN

Jointly predict:

  • Next item (main task)
  • Session length (auxiliary task)
  • User return probability (auxiliary task)

Benefit: Auxiliary tasks regularize the model and improve generalization.


When to Use SR-GNN vs Alternatives

| Scenario | Recommendation |
| --- | --- |
| Long sessions (n > 5) | ✅ Use SR-GNN |
| Short sessions (n ≤ 3) | ⚠️ Consider item-KNN or a popularity baseline |
| Cold-start items | ⚠️ SR-GNN struggles; use content-based features |
| Real-time latency critical | ⚠️ GNN inference can be slow; consider caching or simpler models |
| Large item catalog (>1M) | ⚠️ Use sampled softmax or two-tower retrieval |

Summary: SR-GNN in 5 Key Points

  1. Session graph construction: Convert click sequence into directed graph, preserving all transitions.
  2. Gated GNN propagation: Update node embeddings via GRU-like gates over multiple steps.
  3. Local + global aggregation: Combine last-click (local) and attention-weighted (global) representations.
  4. Softmax prediction: Score all items via dot product, train with cross-entropy.
  5. When it works best: Long sessions with complex transition patterns; struggles on cold-start and very short sessions.

SR-GNN demonstrates that explicit graph structure can outperform purely sequential models by preserving relational information. The key insight is that session-based recommendation is fundamentally a graph problem, not just a sequence problem.

  • Post title: Session-based Recommendation with Graph Neural Networks (SR-GNN)
  • Post author: Chen Kai
  • Create time: 2024-10-01 00:00:00
  • Post link: https://www.chenk.top/en/Session-based%20Recommendation%20with%20Graph%20Neural%20Networks/
  • Copyright Notice: All articles in this blog are licensed under BY-NC-SA unless stated otherwise.