Abstract
High-dimensional ransomware detection datasets are challenging for machine learning because of their sparsity, nonlinearity, and heterogeneous feature distributions. Conventional dimensionality reduction often overlooks class-conditional correlations and fails to preserve the semantic distinction between static and dynamic behaviors. To address this problem, we propose a correlation-driven, class-aware hierarchical feature clustering framework that groups features into ransomware-specific, benign-specific, and shared clusters, using model-optimized thresholds. Static opcode features and dynamic features are kept in separate partitions, which preserves behavioral semantics. Experimental evaluation on balanced datasets shows that the framework reduces dimensionality while improving interpretability and efficiency: Random Forest training time decreased by 70.7% and misclassification of ransomware samples dropped by 2.07%. The derived feature clusters provided clear semantic separation: encryption and anti-analysis routines were isolated as ransomware-specific, while process and registry management features were grouped as benign-specific. Models trained on the reduced feature set achieved higher Cohen's Kappa, lower log loss, and more balanced accuracy across classes, demonstrating the effectiveness of the proposed framework for robust and interpretable ransomware detection.
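The grouping idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the distance `1 - |r|`, average linkage, the fixed `threshold`, the label encoding (1 = ransomware, 0 = benign), the labeling rule, and the function names `correlation_clusters`, `class_aware_groups`, and `cluster_by_partition` are all assumptions made for illustration. The sketch clusters features separately on the samples of each class, then labels a feature by the class or classes in which it joins a multi-feature cluster.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def correlation_clusters(X, threshold):
    """Cluster the columns of X hierarchically on the distance 1 - |Pearson r|."""
    corr = np.nan_to_num(np.corrcoef(X, rowvar=False), nan=0.0)  # constant columns yield NaN
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)
    dist = (dist + dist.T) / 2.0          # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    # Cut the dendrogram at the given distance; labels start at 1.
    return fcluster(tree, t=threshold, criterion="distance")


def class_aware_groups(X, y, threshold=0.3):
    """Label each feature by the class(es) in which it joins a multi-feature cluster.

    Assumed encoding: y == 1 is ransomware, y == 0 is benign.
    """
    def in_cluster(rows):
        labels = correlation_clusters(X[rows], threshold)
        sizes = np.bincount(labels)
        return sizes[labels] > 1          # True where the feature is not a singleton

    in_ransom = in_cluster(y == 1)
    in_benign = in_cluster(y == 0)
    return np.where(
        in_ransom & in_benign, "shared",
        np.where(in_ransom, "ransomware-specific",
                 np.where(in_benign, "benign-specific", "unclustered")),
    )


def cluster_by_partition(X, y, partitions, threshold=0.3):
    """Run the grouping separately per partition (e.g. static vs. dynamic columns)."""
    return {name: class_aware_groups(X[:, cols], y, threshold)
            for name, cols in partitions.items()}
```

Calling `cluster_by_partition(X, y, {"static": static_cols, "dynamic": dynamic_cols})` with hypothetical column index lists keeps the two feature families apart, so a static opcode feature is never merged with a dynamic one. In the paper the thresholds are described as model-optimized; here `threshold` is simply a fixed parameter that would need tuning.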
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 25th IEEE International Conference on Data Mining Workshops, ICDMW 2025 |
| Publisher | IEEE Computer Society |
| Pages | 1366-1372 |
| Number of pages | 7 |
| ISBN (Electronic) | 9798331581329 |
| Publication status | Published - 2025 |
| Event | 25th IEEE International Conference on Data Mining Workshops, ICDMW 2025 - Washington, United States; Duration: 12 Nov 2025 → 15 Nov 2025 |
Publication series
| Name | IEEE International Conference on Data Mining Workshops, ICDMW |
|---|---|
| ISSN (Print) | 2375-9232 |
| ISSN (Electronic) | 2375-9259 |
Conference
| Conference | 25th IEEE International Conference on Data Mining Workshops, ICDMW 2025 |
|---|---|
| Country/Territory | United States |
| City | Washington |
| Period | 12 Nov 2025 → 15 Nov 2025 |
Bibliographical note
Publisher Copyright: © 2025 IEEE.
Keywords
- correlation analysis
- cybersecurity
- dimensionality reduction
- feature clustering
- hierarchical clustering
- ransomware detection
Fingerprint
Dive into the research topics of 'Class-Aware Hierarchical Feature Clustering for High-Dimensional Complex Ransomware Detection'. Together they form a unique fingerprint.