Session-based recommendation is challenging when you only observe a short click sequence and have little or no long-term user profile. SR-GNN tackles this by turning each session into a directed graph, where repeated items and multi-step transitions form richer structure than a plain sequence. A gated GNN propagates information over this session graph to learn item representations, and the model then aggregates them into a session representation to score next-item candidates. This note explains the session-graph construction, the gated message passing update, and how SR-GNN produces the final ranking — highlighting why this graph view often outperforms purely sequential baselines on standard SBR benchmarks.
Background
In session-based recommendation, we only observe a short sequence of
clicks within the current session and aim to predict the next item.
Formally, given an item set $V = \{v_1, v_2, \dots, v_m\}$, a session is a click sequence $s = [v_{s,1}, v_{s,2}, \dots, v_{s,n}]$ ordered by timestamp, and the goal is to predict the most likely next click $v_{s,n+1}$.
Method details
Session graph construction
To capture complex transitions within a session, SR-GNN converts each
session into a directed graph $\mathcal{G}_s = (\mathcal{V}_s, \mathcal{E}_s)$:
- Nodes: the distinct items clicked in the session.
- Edges: directed transitions $(v_{s,t}, v_{s,t+1})$ following the click order, with weights normalized by the out-degree (resp. in-degree) of the connected nodes.
For example, the click sequence $v_1 \to v_2 \to v_3 \to v_2 \to v_4$ yields nodes $\{v_1, v_2, v_3, v_4\}$ and edges $(v_1, v_2), (v_2, v_3), (v_3, v_2), (v_2, v_4)$; the repeated visit to $v_2$ gives it two outgoing edges, structure that a flat sequence cannot express.
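As a concrete sketch of this construction (my own helper, not the official preprocessing code), the out- and in-adjacency matrices with degree normalization can be built as:

```python
import numpy as np

def build_session_graph(session):
    """Build degree-normalized out-/in-adjacency matrices for one session.

    Edge weights count repeated transitions and are divided by the source
    node's out-degree (resp. the target node's in-degree), mirroring the
    connection matrix described in the text.
    """
    items = list(dict.fromkeys(session))            # unique items, order kept
    idx = {v: i for i, v in enumerate(items)}
    n = len(items)
    raw = np.zeros((n, n))
    for u, v in zip(session, session[1:]):          # consecutive clicks -> edge
        raw[idx[u], idx[v]] += 1.0
    out_deg = raw.sum(axis=1, keepdims=True)        # per-row out-degree
    a_out = np.divide(raw, out_deg, out=np.zeros_like(raw), where=out_deg > 0)
    in_deg = raw.sum(axis=0, keepdims=True)         # per-column in-degree
    a_in = np.divide(raw, in_deg, out=np.zeros_like(raw), where=in_deg > 0).T
    return items, a_out, a_in

# the repeated-item session v1 -> v2 -> v3 -> v2 -> v4
items, a_out, a_in = build_session_graph(["v1", "v2", "v3", "v2", "v4"])
```

Here $v_2$ has out-degree 2, so its two outgoing edges each get weight 0.5; that is exactly the structure a flat sequence loses.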
Learning item embeddings with a gated GNN
After constructing the session graph, SR-GNN applies a gated GNN to
propagate and aggregate information over the graph. At step $t$, each node
embedding $\mathbf{v}_i$ is updated with GRU-like gates:

$$\mathbf{a}_{s,i}^t = \mathbf{A}_{s,i:} \left[ \mathbf{v}_1^{t-1}, \dots, \mathbf{v}_n^{t-1} \right]^\top \mathbf{H} + \mathbf{b}$$
$$\mathbf{z}_{s,i}^t = \sigma\left( \mathbf{W}_z \mathbf{a}_{s,i}^t + \mathbf{U}_z \mathbf{v}_i^{t-1} \right)$$
$$\mathbf{r}_{s,i}^t = \sigma\left( \mathbf{W}_r \mathbf{a}_{s,i}^t + \mathbf{U}_r \mathbf{v}_i^{t-1} \right)$$
$$\tilde{\mathbf{v}}_i^t = \tanh\left( \mathbf{W}_o \mathbf{a}_{s,i}^t + \mathbf{U}_o \left( \mathbf{r}_{s,i}^t \odot \mathbf{v}_i^{t-1} \right) \right)$$
$$\mathbf{v}_i^t = \left( 1 - \mathbf{z}_{s,i}^t \right) \odot \mathbf{v}_i^{t-1} + \mathbf{z}_{s,i}^t \odot \tilde{\mathbf{v}}_i^t$$

Here $\mathbf{A}_{s,i:}$ is the row of the connection matrix (concatenated out- and in-adjacency) for node $i$, $\mathbf{z}$ and $\mathbf{r}$ are update and reset gates, and $\odot$ denotes element-wise multiplication.
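A minimal numpy sketch of one propagation step; the parameter-dict keys (`H`, `Wz`, `Uz`, ...) are my own names mirroring the gate equations, and row-vector shapes replace the paper's column vectors:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_step(v, a_out, a_in, p):
    """One gated propagation step over the session graph.
    v: (n, d) node embeddings; a_out/a_in: (n, n) normalized adjacency."""
    # message from neighbors through both edge directions
    a = np.concatenate([a_out @ v, a_in @ v], axis=1) @ p["H"] + p["b"]
    z = sigmoid(a @ p["Wz"] + v @ p["Uz"])                 # update gate
    r = sigmoid(a @ p["Wr"] + v @ p["Ur"])                 # reset gate
    v_tilde = np.tanh(a @ p["Wo"] + (r * v) @ p["Uo"])     # candidate state
    return (1 - z) * v + z * v_tilde                       # gated blend

rng = np.random.default_rng(0)
n, d = 3, 4
p = {"H": rng.normal(size=(2 * d, d)), "b": np.zeros(d)}
for k in ("Wz", "Uz", "Wr", "Ur", "Wo", "Uo"):
    p[k] = rng.normal(size=(d, d)) * 0.1
v = rng.normal(size=(n, d))
a_out = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)  # chain graph
v_new = ggnn_step(v, a_out, a_out.T, p)  # transpose is already normalized here
```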
Building a session representation
After node embeddings are learned, SR-GNN forms a session representation by combining a local signal and a global aggregation:
- Local: use the embedding of the last-clicked item, $\mathbf{s}_l = \mathbf{v}_n$, as short-term intent.
- Global: apply an attention-like aggregation over all item embeddings in the session:

$$\alpha_i = \mathbf{q}^\top \sigma\left( \mathbf{W}_1 \mathbf{v}_n + \mathbf{W}_2 \mathbf{v}_i + \mathbf{c} \right), \qquad \mathbf{s}_g = \sum_{i=1}^{n} \alpha_i \mathbf{v}_i$$

where $\mathbf{q}$ is a learnable query vector controlling importance weights, $\mathbf{v}_n$ anchors attention around the last click, and $\mathbf{W}_1, \mathbf{W}_2$ project embeddings for computing the attention weights $\alpha_i$.
- Final: combine local and global vectors to get the session embedding $\mathbf{s}_h = \mathbf{W}_3 \left[ \mathbf{s}_l ; \mathbf{s}_g \right]$.
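The local + global readout can be sketched as follows (random toy weights; `session_embedding` and the row-vector shapes are my own conventions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def session_embedding(v, W1, W2, q, c, W3):
    """Hybrid local + global session readout (sketch).
    v: (n, d) final node embeddings in click order; v[-1] is the last click."""
    s_local = v[-1]                                # short-term intent
    alpha = sigmoid(v[-1] @ W1 + v @ W2 + c) @ q   # (n,) attention scores
    s_global = (alpha[:, None] * v).sum(axis=0)    # weighted sum over items
    return np.concatenate([s_local, s_global]) @ W3

rng = np.random.default_rng(1)
n, d = 5, 4
v = rng.normal(size=(n, d))
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
q, c = rng.normal(size=d), np.zeros(d)
W3 = rng.normal(size=(2 * d, d))
s_h = session_embedding(v, W1, W2, q, c, W3)
```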
Prediction and training
Given the session embedding $\mathbf{s}_h$, SR-GNN scores each candidate item $v_i$ by a dot product with its embedding and normalizes the scores with a softmax:

$$\hat{z}_i = \mathbf{s}_h^\top \mathbf{v}_i, \qquad \hat{\mathbf{y}} = \operatorname{softmax}(\hat{\mathbf{z}})$$
Loss and optimization
A standard training objective is the cross-entropy between the predicted distribution $\hat{\mathbf{y}}$ and the one-hot target for the true next item. SR-GNN is trained with backpropagation through time (BPTT) unrolled over the gated propagation steps; since sessions are usually short, few steps are needed and training stays manageable.
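The scoring and cross-entropy objective can be sketched as follows (`sr_gnn_loss` is a hypothetical helper name, shown on a toy catalog):

```python
import numpy as np

def sr_gnn_loss(s_h, item_emb, target):
    """Full-softmax cross-entropy over the catalog.
    s_h: (d,) session embedding; item_emb: (m, d); target: next-item index."""
    logits = item_emb @ s_h                      # dot-product scores
    logits = logits - logits.max()               # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[target])

# with identical item embeddings the distribution is uniform over m = 5 items,
# so the loss equals log(5) regardless of the target
loss = sr_gnn_loss(np.ones(3), np.zeros((5, 3)), target=2)
```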
Implementation reference
The original implementation is available at:
https://github.com/CRIPAC-DIG/SR-GNN/tree/master
In this note, I focus on the model structure and equations; for production use, refer to the official code for data preprocessing (session graphs, normalization, batching) and training details.
Why Session Graphs Outperform Sequential Baselines
Problem with pure sequence models (RNN/GRU)
Traditional RNN-based session models treat a session as a linear
sequence and use hidden states to encode history, e.g. a GRU recurrence $\mathbf{h}_t = \mathrm{GRU}(\mathbf{h}_{t-1}, \mathbf{x}_t)$, where $\mathbf{x}_t$ is the embedding of the $t$-th clicked item.
Limitations:
- Lost transitions: if a user clicks A → B → C → B, the RNN's hidden state at the second visit to B entangles the whole history; there is no explicit record that B previously led to C.
- No explicit relational structure: The model must learn dependencies implicitly through hidden states.
- Fixed directionality: RNN processes left-to-right; cannot model bidirectional dependencies naturally.
Session graph advantages
By converting the session into a graph:
- Preserves all transitions: Edge (B, C) remains even when user revisits B.
- Explicit structure: GNN message passing directly models item-to-item dependencies.
- Bidirectional propagation: Information flows in both directions along edges.
Hyperparameters and Training Details
Key hyperparameters
From the original paper:
| Hyperparameter | Value | Description |
|---|---|---|
| Embedding dim | 100 | Item embedding size |
| GNN layers | 1-2 | Number of gated propagation steps |
| Batch size | 100 | Number of sessions per batch |
| Learning rate | 0.001 | Adam optimizer |
| Dropout | 0.5 | Regularization |
Training strategy
- Objective: Cross-entropy loss with softmax over all items
- Optimizer: Adam with default hyperparameters (learning rate 0.001, as in the table above)
- Early stopping: Monitor validation recall@20, stop if no improvement for 5 epochs
- Negative sampling: For large item catalogs, use sampled softmax to reduce compute
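A toy sketch of the sampled-softmax idea: cross-entropy over the target plus uniformly sampled negatives. A production sampled softmax would add a sampling-bias correction, and the original paper uses the full softmax; this only illustrates the compute saving:

```python
import numpy as np

def sampled_softmax_loss(s_h, item_emb, target, num_neg, rng):
    """Cross-entropy over the target plus `num_neg` sampled negatives."""
    m = item_emb.shape[0]
    candidates = np.delete(np.arange(m), target)   # every non-target item
    neg = rng.choice(candidates, size=num_neg, replace=False)
    cand = np.concatenate([[target], neg])         # target sits at index 0
    logits = item_emb[cand] @ s_h                  # score only the candidates
    logits = logits - logits.max()
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

rng = np.random.default_rng(0)
# identical embeddings -> uniform over the 10 candidates -> loss = log(10)
loss = sampled_softmax_loss(np.ones(3), np.zeros((1000, 3)), target=7,
                            num_neg=9, rng=rng)
```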
Common Failure Modes and Troubleshooting
Failure 1: Model predicts only popular items
Symptom: Recall@20 is decent but diversity is low; top-K recommendations are always the same popular items.
Cause: Imbalanced training data (popular items dominate sessions).
Fix:
- Add a popularity penalty to the loss, e.g. subtract $\lambda \log p_i$ (log-popularity of item $i$) from each logit before the softmax.
- Use inverse propensity weighting to reweight samples.
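One way to implement the penalty is a generic logit adjustment; `lam` here is an assumed tuning knob, not a value from the paper:

```python
import numpy as np

def popularity_adjusted_logits(logits, item_counts, lam=0.5):
    """Subtract lam * log-popularity from each logit, so frequent items need
    a genuinely higher model score to rank first."""
    pop = item_counts / item_counts.sum()          # empirical popularity
    return logits - lam * np.log(pop + 1e-12)      # epsilon guards log(0)

# two items with equal raw scores; the rarer item (10 clicks vs 90) wins
adjusted = popularity_adjusted_logits(np.array([1.0, 1.0]),
                                      np.array([90.0, 10.0]))
```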
Failure 2: Poor performance on short sessions
Symptom: Long sessions (n > 10) work well, but short sessions (n ≤ 3) have low recall.
Cause: Graph structure is too sparse for short sessions.
Fix:
- Augment short sessions with co-click patterns from the training set.
- Use a hybrid model: GNN for long sessions, item-KNN or popularity baseline for short sessions.
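The hybrid can be as simple as a length check; this dispatcher and its `min_len=4` threshold are hypothetical illustrations, not from the paper:

```python
def route_session(session, gnn_model, fallback_model, min_len=4):
    """Send sessions long enough to form a useful graph to the GNN,
    and shorter ones to a cheaper baseline such as item-KNN."""
    model = gnn_model if len(session) >= min_len else fallback_model
    return model(session)

# toy models that just report which path was taken
picked_short = route_session([101, 102], lambda s: "gnn", lambda s: "knn")
picked_long = route_session([1, 2, 3, 4, 5], lambda s: "gnn", lambda s: "knn")
```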
Failure 3: Overfitting on small datasets
Symptom: Training recall is high but validation recall plateaus early.
Cause: GNN has too many parameters relative to dataset size.
Fix:
- Reduce embedding dimension (e.g., 100 → 50).
- Increase dropout (e.g., 0.5 → 0.7).
- Use weight decay (L2 regularization).
Variants and Extensions
1. Attention-based SR-GNN
Replace the fixed degree-normalized aggregation with learned attention weights over neighbors, e.g. (one generic form) $\alpha_{ij} = \operatorname{softmax}_{j \in \mathcal{N}(i)}\big( \mathbf{q}^\top \tanh( \mathbf{W} [\mathbf{v}_i ; \mathbf{v}_j] ) \big)$.
Benefit: Learns which transitions are more important.
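A generic neighbor-attention sketch in the spirit of GAT (not the exact formulation of any published SR-GNN variant; `Wq`/`Wk` are assumed projection matrices):

```python
import numpy as np

def attn_aggregate(v, adj, Wq, Wk):
    """Aggregate neighbor embeddings with softmax attention weights.
    Nodes without neighbors keep their own embedding."""
    scores = (v @ Wq) @ (v @ Wk).T                 # pairwise attention scores
    out = v.copy()
    for i in range(v.shape[0]):
        nbrs = adj[i] > 0
        if nbrs.any():
            s = scores[i, nbrs]
            w = np.exp(s - s.max())
            w = w / w.sum()                        # softmax over neighbors only
            out[i] = w @ v[nbrs]
    return out

rng = np.random.default_rng(2)
v = rng.normal(size=(2, 3))
adj = np.array([[0.0, 1.0], [0.0, 0.0]])           # single edge: node 0 -> 1
Wq, Wk = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
out = attn_aggregate(v, adj, Wq, Wk)
# node 0 has exactly one neighbor, so it receives that neighbor's embedding
```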
2. Temporal SR-GNN
Add time gaps as edge features, e.g. weighting the transition edge $(v_i, v_j)$ by $\exp(-\Delta t_{ij} / \tau)$ for click time gap $\Delta t_{ij}$ and decay scale $\tau$.
Benefit: Recent clicks weigh more than old ones.
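For instance, transition edges can be down-weighted by their click time gap; the exponential form and `tau` (an assumed decay scale in seconds) are one common choice, not the paper's:

```python
import math

def time_decayed_edge_weight(t_u, t_v, tau=60.0):
    """Exponential decay of a transition's edge weight with the time gap
    between the two clicks (in the same units as tau)."""
    return math.exp(-abs(t_v - t_u) / tau)

w_instant = time_decayed_edge_weight(0.0, 0.0)     # no gap -> full weight 1.0
w_fast = time_decayed_edge_weight(0.0, 30.0)
w_slow = time_decayed_edge_weight(0.0, 60.0)       # larger gap -> smaller weight
```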
3. Multi-task SR-GNN
Jointly predict:
- Next item (main task)
- Session length (auxiliary task)
- User return probability (auxiliary task)
Benefit: Auxiliary tasks regularize the model and improve generalization.
When to Use SR-GNN vs Alternatives
| Scenario | Recommendation |
|---|---|
| Long sessions (n > 5) | ✅ Use SR-GNN |
| Short sessions (n ≤ 3) | ⚠️ Consider item-KNN or popularity baseline |
| Cold-start items | ⚠️ SR-GNN struggles; use content-based features |
| Real-time latency critical | ⚠️ GNN inference can be slow; consider caching or simpler models |
| Large item catalog (>1M) | ⚠️ Use sampled softmax or two-tower retrieval |
Summary: SR-GNN in 5 Key Points
- Session graph construction: Convert click sequence into directed graph, preserving all transitions.
- Gated GNN propagation: Update node embeddings via GRU-like gates over multiple steps.
- Local + global aggregation: Combine last-click (local) and attention-weighted (global) representations.
- Softmax prediction: Score all items via dot product, train with cross-entropy.
- When it works best: Long sessions with complex transition patterns; struggles on cold-start and very short sessions.
SR-GNN demonstrates that explicit graph structure can outperform purely sequential models by preserving relational information. The key insight is that session-based recommendation is fundamentally a graph problem, not just a sequence problem.
- Post title: Session-based Recommendation with Graph Neural Networks (SR-GNN)
- Post author: Chen Kai
- Create time: 2024-10-01 00:00:00
- Post link: https://www.chenk.top/en/Session-based%20Recommendation%20with%20Graph%20Neural%20Networks/
- Copyright notice: All articles in this blog are licensed under BY-NC-SA unless stated otherwise.