Abstract

The growing dependence on externally hosted foundation models and non-European cloud infrastructures creates both environmental and strategic challenges for AI deployment. For many European companies, researchers, and students, one practical barrier to more local and independent use of advanced language technologies is that hosting and optimizing such models requires substantial computational and energy resources. Lowering that barrier requires methods that make post-training optimization more accessible without relying on computationally expensive trial-and-error benchmarking. This paper addresses part of that challenge by presenting a surrogate-based modeling approach for predicting neural network performance under magnitude-based pruning. We develop an LSTM model that approximates pruning-induced performance degradation across candidate pruning configurations, thereby reducing the need for exhaustive benchmarking during pruning-space exploration. The model is trained on pruning behavior patterns from 17 transformer-based text classification models, mapping pruning thresholds from 0 to 10 to corresponding accuracy metrics. The surrogate model achieves a Mean Absolute Error (MAE) of 0.0395 on validation data, a 54% improvement over a linear baseline. Cross-architecture testing further demonstrates generalization capability: the model achieves similar prediction accuracy (MAE 0.0422) on a VGG16 convolutional neural network despite being trained exclusively on transformer architectures. These results suggest that pruning behavior patterns can be approximated with sufficient accuracy to support lower-cost exploration of compression strategies across model types. As such, this work contributes a practical first step toward reducing the computational barrier of local AI optimization and lays methodological groundwork for surrogate-assisted optimization workflows aimed at making advanced models more feasible to evaluate, adapt, and eventually deploy on smaller-scale or more locally governed infrastructure.
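As a concrete illustration of the magnitude-based pruning the abstract refers to, the sketch below zeroes out weights whose absolute value falls below a threshold and reports the resulting sparsity. The toy weight values and the threshold are hypothetical and are not taken from the paper; the paper's surrogate model predicts the accuracy impact of such pruning configurations without actually benchmarking each one.

```python
def magnitude_prune(weights, threshold):
    """Zero out every weight whose magnitude falls below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)

# Hypothetical weight vector; a higher threshold prunes more aggressively.
weights = [0.8, -0.05, 0.3, -0.6, 0.01, 0.2]
pruned = magnitude_prune(weights, threshold=0.25)
print(pruned)            # [0.8, 0.0, 0.3, -0.6, 0.0, 0.0]
print(sparsity(pruned))  # 0.5
```

Sweeping the threshold over a range (the paper uses thresholds from 0 to 10) produces a sparsity/accuracy trade-off curve; the surrogate model's job is to approximate that curve cheaply instead of re-evaluating the pruned network at every point.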
| Original language | English |
|---|---|
| Number of pages | 11 |
| Publication status | In preparation - 2026 |
Fingerprint

Research topics of 'Predicting Model Performance Under Pruning: A Surrogate Modeling Approach for Sustainable AI Optimization'.
- Conferentie Digitale Duurzaamheid (Conference on Digital Sustainability), 25 Sept 2025. van Kersbergen, R. (Participant) & Verheijke, L. (Participant). Activity: Organising a conference, workshop (Academic).
- Responsible Applied AI Trade-off Dashboard Kick-off, 16 Oct 2024. Wiggers, P. (Organiser), Horsman, S. (Organiser), Fuckner, M. (Organiser) & van Kersbergen, R. (Organiser). Activity: Organising a conference, workshop (Professional).
- AI4DM: Responsible AI Trade-Offs, 23 Feb 2024. Horsman, S. (Speaker) & van Kersbergen, R. (Speaker). Activity: Talk or presentation, invited talk (Academic).