This repository contains the code used in our SIGIR 2026 paper *Pay Attention to Sequence Split: Uncovering the Impacts of Sub-Sequence Splitting on Sequential Recommendation Models*.
The codebase is organized into two parallel frameworks corresponding to two dominant sub-sequence splitting (SSS) paradigms in sequential recommendation.
```
SSS/
├── Multi_Target_Implementation/
│   ├── data/   # Raw and preprocessed datasets
│   └── src/    # Multi-target training framework
│
├── Single_Target_Implementation/
│   ├── data/   # Raw datasets
│   └── src/    # Single-target training framework
│
└── README.md
```
- **Single-Target Implementation**: sub-sequences are generated on the fly during training, and each training instance predicts one next item.
- **Multi-Target Implementation**: sub-sequences are generated offline before training and saved as processed dataset files.
This separation is intentional: it reflects two different implementation paradigms adopted by prior sequential recommendation codebases, and allows us to study the effect of data splitting without conflating it with other training-pipeline differences.
**Important:** The multi-target framework requires offline preprocessing before training. Raw data should not be used directly for training in this framework.
Raw dataset files should be placed in `Multi_Target_Implementation/data/`. The preprocessing script is `preprocess_data.py`.
It reads a raw user-item interaction file and generates an augmented dataset according to the specified splitting strategy.
| `augment_type` | Description |
|---|---|
| `none` | Use the raw sequence without splitting |
| `pre` | Prefix-based splitting |
| `suffix` | Suffix-based splitting |
| `slide` | Sliding-window splitting |
| `slide_plus_full` | Sliding-window splitting plus the original full sequence |
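To make the strategies above concrete, here is a toy sketch of prefix- and suffix-based splitting on a short item sequence. The function names and the `min_len` cutoff are ours for illustration; the actual logic lives in `preprocess_data.py` and may differ in details.

```python
def prefix_split(seq, min_len=2):
    # Prefix-based splitting: every prefix (of at least min_len items)
    # becomes a training sub-sequence.
    return [seq[:i] for i in range(min_len, len(seq) + 1)]

def suffix_split(seq, min_len=2):
    # Suffix-based splitting: every suffix (of at least min_len items)
    # becomes a training sub-sequence.
    return [seq[i:] for i in range(0, len(seq) - min_len + 1)]

seq = [1, 2, 3, 4]
print(prefix_split(seq))  # [[1, 2], [1, 2, 3], [1, 2, 3, 4]]
print(suffix_split(seq))  # [[1, 2, 3, 4], [2, 3, 4], [3, 4]]
```

Note that both strategies multiply the number of training sub-sequences roughly by the sequence length, which is exactly the data-volume effect the paper studies.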
| Argument | Description |
|---|---|
| `slide_window` | Window size of each sub-sequence |
| `slide_step` | Step size between adjacent windows |
| `slide_keep_tail` | Whether to keep tail sequences shorter than one window |
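The interplay of the three arguments can be sketched as follows. This is an illustrative re-implementation under our reading of the argument names, not the code in `preprocess_data.py` itself.

```python
def slide_split(seq, slide_window=3, slide_step=1, slide_keep_tail=False):
    # Slide a fixed-size window over the sequence with the given step.
    subs = []
    i = 0
    while i + slide_window <= len(seq):
        subs.append(seq[i:i + slide_window])
        i += slide_step
    # Optionally keep the leftover tail that is shorter than one window.
    if slide_keep_tail and i < len(seq):
        subs.append(seq[i:])
    return subs

seq = [1, 2, 3, 4, 5]
print(slide_split(seq, slide_window=3, slide_step=1))
# [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(slide_split(seq, slide_window=3, slide_step=2, slide_keep_tail=True))
# [[1, 2, 3], [3, 4, 5], [5]]
```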
```bash
python preprocess_data.py \
    --data_dir ../data/ \
    --data_name Beauty \
    --augment_type slide_plus_full \
    --slide_window 3 \
    --slide_step 1

python preprocess_data.py \
    --data_dir ../data/ \
    --data_name CDs \
    --augment_type slide \
    --slide_window 5 \
    --slide_step 1

python preprocess_data.py \
    --data_dir ../data/ \
    --data_name ML-1M \
    --augment_type pre

python preprocess_data.py \
    --data_dir ../data/ \
    --data_name Douyin \
    --augment_type suffix
```

The processed files are saved in the same data directory. Typical filenames include:
```
Beauty_pre.txt
Beauty_suffix.txt
Beauty_slide_win3_step1.txt
Beauty_slide_plus_full_win3_step1.txt
CDs_slide_win5_step1.txt
ML-1M_pre.txt
```
Because sequence-length distributions differ substantially across datasets, we do not use a unified sliding-window size. The following settings correspond to the configurations used in our experiments.
| Dataset | Recommended Window Size |
|---|---|
| Beauty | 3 |
| Sports_and_Outdoors | 3 |
| CDs | 5 |
| ML-1M | 10 |
| Douyin | 5 |
| LastFM | 5 |
In general, datasets with shorter, denser sequences favor smaller windows, while datasets with longer user histories can benefit from larger windows.
The single-target framework follows the on-the-fly splitting paradigm:
- each training instance predicts exactly one next item
- sub-sequence splitting, if enabled, is performed during training-time data loading
- no preprocessing is required
Single-target training uses the raw dataset directly. There is no need to run any preprocessing script in advance.
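A minimal sketch of how a training-time loader could draw one single-target instance from a raw sequence under each `augment_type`. The helper name and sampling details are ours for illustration; the actual loader in `src/` may differ.

```python
import random

def make_single_target_instance(seq, augment_type="ori", slide_window=3):
    # On-the-fly splitting sketch: returns one (history, next-item) pair.
    assert len(seq) >= 2
    if augment_type == "ori":
        # The full history predicts the last item.
        history, target = seq[:-1], seq[-1]
    elif augment_type == "pre":
        # A random prefix predicts its own next item.
        cut = random.randint(1, len(seq) - 1)
        history, target = seq[:cut], seq[cut]
    elif augment_type == "suffix":
        # A random suffix of the history predicts the original last item.
        start = random.randint(0, len(seq) - 2)
        history, target = seq[start:-1], seq[-1]
    elif augment_type == "slide":
        # A fixed-size window predicts the item that follows it.
        start = random.randint(0, max(0, len(seq) - slide_window - 1))
        end = min(start + slide_window, len(seq) - 1)
        history, target = seq[start:end], seq[end]
    return history, target

seq = [5, 9, 2, 7, 4]
print(make_single_target_instance(seq, "ori"))  # ([5, 9, 2, 7], 4)
```

Whatever the strategy, each sampled instance still carries exactly one target, which is what distinguishes this framework from the multi-target one.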
The following models are supported in the single-target framework:
- SASRec
- BSARec
- FMLPRec
- GRU4Rec
| Argument | Options / Meaning |
|---|---|
| `model_type` | `SASRec`, `BSARec`, `FMLPRec`, `GRU4Rec` |
| `loss_type` | `BCE`, `CE` |
| `augment_type` | `ori`, `pre`, `suffix`, `slide` |
| `data_name` | Dataset name |
| `model_idx` | Run index / random seed |
| `train_name` | Log and checkpoint name |
| `augment_type` | Description |
|---|---|
| `ori` | Use the original training sequence without sub-sequence splitting |
| `pre` | Generate prefix-based training samples on the fly |
| `suffix` | Generate suffix-based training samples on the fly |
| `slide` | Generate sliding-window training samples on the fly |
For slide augmentation, the recommended window sizes are:
| Dataset | Recommended Window Size |
|---|---|
| Beauty | 3 |
| Sports_and_Outdoors | 3 |
| CDs | 5 |
| ML-1M | 10 |
| Douyin | 5 |
| LastFM | 5 |
```bash
python main.py \
    --model_type SASRec \
    --loss_type BCE \
    --augment_type ori \
    --model_idx 1 \
    --data_name Beauty \
    --train_name SASRec_Beauty_ori

python main.py \
    --model_type SASRec \
    --loss_type BCE \
    --augment_type pre \
    --model_idx 1 \
    --data_name Beauty \
    --train_name SASRec_Beauty_pre

python main.py \
    --model_type SASRec \
    --loss_type BCE \
    --augment_type slide \
    --model_idx 1 \
    --data_name Beauty \
    --train_name SASRec_Beauty_slide

python main.py \
    --model_type GRU4Rec \
    --loss_type BCE \
    --augment_type ori \
    --model_idx 1 \
    --data_name Beauty \
    --train_name GRU4Rec_Beauty_ori

python main.py \
    --model_type BSARec \
    --loss_type CE \
    --augment_type ori \
    --model_idx 1 \
    --data_name ML-1M \
    --train_name BSARec_ML1M_ori

python main.py \
    --model_type FMLPRec \
    --loss_type CE \
    --augment_type suffix \
    --model_idx 1 \
    --data_name CDs \
    --train_name FMLPRec_CDs_suffix
```

The multi-target framework follows the offline preprocessing paradigm:
- sub-sequences are generated before training
- training is performed on the generated files
- different augmentation strategies correspond to different processed datasets
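A common convention in this paradigm (SASRec-style training) is that each position in a processed sub-sequence predicts the item at the next position, so one sequence contributes one target per position. We believe the multi-target framework follows this convention; a minimal sketch (helper name is ours):

```python
def to_multi_target(seq):
    # Each position predicts its successor: inputs are the sequence without
    # its last item, targets are the sequence shifted left by one position.
    return seq[:-1], seq[1:]

inputs, targets = to_multi_target([10, 11, 12, 13])
print(inputs, targets)  # [10, 11, 12] [11, 12, 13]
```

Because every position is a supervised target, offline-generated sub-sequences directly control how many prediction targets each user contributes.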
The following models are supported in the multi-target framework:
- SASRec
- BSARec
- FMLPRec
- GRU4Rec
```bash
python main.py \
    --model_name BSARec \
    --loss_type CE \
    --c 5 \
    --alpha 0.7 \
    --lr 0.001 \
    --num_attention_heads 1 \
    --augment_type None \
    --model_idx 1 \
    --data_name CDs \
    --gpu_id 3

python main.py \
    --model_name BSARec \
    --loss_type CE \
    --c 9 \
    --alpha 0.3 \
    --lr 0.0005 \
    --num_attention_heads 4 \
    --augment_type None \
    --model_idx 1 \
    --data_name ML-1M \
    --gpu_id 0

python main.py \
    --model_name BSARec \
    --c 3 \
    --alpha 0.9 \
    --lr 0.001 \
    --num_attention_heads 1 \
    --augment_type None \
    --model_idx 2 \
    --data_name Douyin \
    --gpu_id 0

python main.py \
    --model_name BSARec \
    --c 3 \
    --alpha 0.9 \
    --lr 0.001 \
    --num_attention_heads 1 \
    --augment_type None \
    --model_idx 2 \
    --data_name LastFM \
    --gpu_id 3

python main.py \
    --model_name BSARec \
    --c 5 \
    --alpha 0.3 \
    --lr 0.001 \
    --num_attention_heads 4 \
    --augment_type None \
    --model_idx 2 \
    --data_name Sports_and_Outdoors \
    --gpu_id 0

python main.py \
    --model_name BSARec \
    --c 5 \
    --alpha 0.7 \
    --lr 0.0005 \
    --num_attention_heads 1 \
    --augment_type None \
    --model_idx 2 \
    --data_name Beauty \
    --gpu_id 3

python main.py \
    --model_name FMLPRec \
    --num_hidden_layers 4 \
    --loss_type CE \
    --augment_type slide_plus_full_win3_step1 \
    --model_idx 1 \
    --data_name Beauty \
    --gpu_id 0

python main.py \
    --model_name SASRec \
    --loss_type CE \
    --augment_type slide_plus_full_win5_step1 \
    --model_idx 1 \
    --data_name Douyin \
    --gpu_id 0

python main.py \
    --model_name SASRec \
    --loss_type CE \
    --augment_type slide_plus_full_win3_step1 \
    --model_idx 1 \
    --data_name Beauty \
    --gpu_id 0
```

- For BSARec, we follow the original paper setup and tune `c`, `alpha`, the learning rate, and the number of attention heads for each dataset.
- For SASRec, GRU4Rec, and FMLPRec, the key hyperparameters also follow the original paper settings as closely as possible.
- Dataset-specific tuning is applied when necessary, and the representative commands above reflect the configurations used in the paper.
To ensure fair and leakage-free evaluation, we use the following protocol in both frameworks.
- Sub-sequence splitting / augmentation is applied only to the training set
- Validation and test sets are not augmented
- Future interactions are never included in training inputs
Although training may use split sub-sequences, validation and test are always performed on the complete original user sequence under the standard evaluation protocol.
Concretely:
- the validation target is taken from the held-out interaction near the end of the original sequence
- the test target is taken from the final held-out interaction
- the model is evaluated using the user's full observable history before the target
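The three points above match the standard leave-one-out protocol; a minimal sketch of the split (the helper name is ours, not the repo's):

```python
def leave_one_out(seq):
    # Last item -> test target; second-to-last -> validation target;
    # everything earlier is available as training history. Both evaluation
    # inputs use the full observable history before the target.
    train_seq = seq[:-2]
    valid = (seq[:-2], seq[-2])
    test = (seq[:-1], seq[-1])
    return train_seq, valid, test

train_seq, valid, test = leave_one_out([3, 8, 1, 6, 2])
print(valid)  # ([3, 8, 1], 6)
print(test)   # ([3, 8, 1, 6], 2)
```

Only `train_seq` is ever split or augmented; `valid` and `test` are derived from the original sequence, which is what rules out leakage.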
Although the single-target and multi-target frameworks differ in how training data is generated, they share:
- the same original dataset split
- the same validation / test protocol
- the same ranking-based evaluation setting
Therefore, differences in results are attributable to the training-time effect of sub-sequence splitting, rather than inconsistencies in evaluation.
This codebase is built upon and inspired by the following excellent works:
We thank the authors for releasing high-quality, reproducible implementations.
```bibtex
@inproceedings{dang2026pay,
  title={Pay Attention to Sequence Split: Uncovering the Impacts of Sub-Sequence Splitting on Sequential Recommendation Models},
  author={Dang, Yizhou and Wu, Yifan and Huang, Minhan and Zhao, Chuang and Ma, Lianbo and Guo, Guibing and Wang, Xingwei and Sun, Zhu},
  booktitle={Proceedings of the 49th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year={2026}
}
```