
Predicting Model Performance Under Pruning: A Surrogate Modeling Approach for Sustainable AI Optimization

Research output: Working paper/preprint › Working paper › Academic


Abstract

The growing dependence on externally hosted foundation models and non-European cloud infrastructures creates both environmental and strategic challenges for AI deployment. For many European companies, researchers, and students, one practical barrier to more local and independent use of advanced language technologies is that hosting and optimizing such models requires substantial computational and energy resources. Lowering that barrier requires methods that make post-training optimization more accessible without relying on computationally expensive trial-and-error benchmarking. This paper addresses part of that challenge by presenting a surrogate-based modeling approach for predicting neural network performance under magnitude-based pruning. We develop an LSTM model that approximates pruning-induced performance degradation across candidate pruning configurations, thereby reducing the need for exhaustive benchmarking during pruning-space exploration. The model is trained on pruning behavior patterns from 17 transformer-based text classification models, mapping pruning thresholds from 0 to 10 to the corresponding accuracy metrics. The surrogate model achieves a Mean Absolute Error (MAE) of 0.0395 on validation data, a 54% improvement over a linear baseline. Cross-architecture testing further demonstrates generalization capability: the model achieves similar prediction accuracy (MAE 0.0422) on a VGG16 convolutional neural network despite being trained exclusively on transformer architectures. These results suggest that pruning behavior patterns can be approximated with sufficient accuracy to support lower-cost exploration of compression strategies across model types. As such, this work contributes a practical first step toward reducing the computational barrier of local AI optimization and lays methodological groundwork for later surrogate-assisted optimization workflows aimed at making advanced models more feasible to evaluate, adapt, and eventually deploy on smaller-scale or more locally governed infrastructure.
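
The two quantities the abstract relies on, magnitude-based pruning and the Mean Absolute Error used to compare a surrogate against a baseline, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the threshold is treated as an absolute cutoff on weight magnitude (the abstract does not state the unit of its 0 to 10 scale), "improvement" is read as the relative reduction in MAE, and the accuracy values are made-up toy numbers rather than reported results.

```python
from typing import List, Sequence


def magnitude_prune(weights: Sequence[float], threshold: float) -> List[float]:
    """Zero every weight whose absolute value is below the threshold.

    Assumption: the threshold is an absolute magnitude cutoff.
    """
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def sparsity(weights: Sequence[float]) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute gap between predicted and measured accuracy."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def relative_improvement(model_mae: float, baseline_mae: float) -> float:
    """Fractional MAE reduction of the model relative to the baseline."""
    return (baseline_mae - model_mae) / baseline_mae


if __name__ == "__main__":
    # Toy weight vector, not real model parameters.
    toy_weights = [0.8, -0.05, 0.3, -0.6, 0.01, 0.2]
    pruned = magnitude_prune(toy_weights, threshold=0.25)
    print("pruned weights:", pruned)
    print("sparsity:", sparsity(pruned))

    # Made-up accuracy curves over four pruning levels (illustrative only).
    measured = [0.90, 0.88, 0.80, 0.60]
    surrogate_pred = [0.91, 0.86, 0.81, 0.63]
    linear_pred = [0.95, 0.85, 0.75, 0.65]

    surrogate_mae = mean_absolute_error(surrogate_pred, measured)
    linear_mae = mean_absolute_error(linear_pred, measured)
    print("surrogate MAE:", round(surrogate_mae, 4))
    print("linear baseline MAE:", round(linear_mae, 4))
    print("relative improvement:", round(relative_improvement(surrogate_mae, linear_mae), 4))
```

In a surrogate-assisted workflow of the kind the abstract describes, the measured curve would come from a small number of real benchmark runs, and the surrogate's predictions would stand in for the remaining pruning configurations.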
Original language: English
Number of pages: 11
Publication status: In preparation - 2026
