Identification of a protein expression signature distinguishing early from organising diffuse alveolar damage in COVID-19 patients.

Diffuse alveolar damage (DAD) is a histopathological finding associated with severe viral infections, including SARS-CoV-2. However, the mechanisms mediating progression of DAD are poorly understood. Applying protein digital spatial profiling to lung tissue obtained from a cohort of 27 COVID-19 autopsy cases from the UK, we identified a protein signature (ARG1, CD127, GZMB, IDO1, Ki67, phospho-PRAS40 (T246), and VISTA) that distinguishes early/exudative DAD from late/organising DAD with good predictive accuracy. These proteins warrant further investigation as potential immunotherapeutic targets to modulate DAD progression and improve patient outcome.


INTRODUCTION
The COVID-19 pandemic has claimed over 6.6 million lives and, despite vaccines that prevent serious illness and the use of dexamethasone in severely ill patients, worldwide deaths continue to accrue 1. There is, therefore, a continued need to identify new treatment options to minimise disease severity. Diffuse alveolar damage (DAD) reflects a continuum of immunopathology associated with multiple causes of lung injury and is a primary histological feature of fatal COVID-19 2, 3. However, the cellular and molecular pathways associated with the progression of DAD from its early exudative phase (EDAD), characterised by oedema, hyaline membranes, and inflammation, to a late organising and loosely fibrotic phase (ODAD) remain unclear.
To begin to address this question, we examined lung tissue from a cohort of COVID-19 autopsy cases in the UK. We used digital spatial profiling (DSP) to determine differences in protein expression between regions of interest (ROIs) identified histologically as EDAD or ODAD. We focused on protein targets with therapeutic potential demonstrated in other diseases and/or pre-clinical models to identify potential candidates for re-purposing in COVID-19.
medRxiv preprint doi: https://doi.org/10.1101/2022.12.09.22283280; this version posted December 13, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

MATERIALS AND METHODS
Lung tissue from patients who had died with SARS-CoV-2 was selected from a larger cohort assembled by the UK Coronavirus Immunology Consortium (UK-CIC). A full description of the UK-CIC cohorts will be provided elsewhere (Milross et

RESULTS
We examined 194 ROIs (7 ± 2 ROIs per patient; 122 EDAD, 50 ODAD, 22 MDAD; figure 1A and online supplemental table 1). Principal components analysis (PCA) showed separation of each form of DAD with 41.4% of variance accounted for by PC1 and PC2 (figure 1B and C). We next applied partial least squares regression (PLS-R) (figure 2A) and identified variables responsible for group separation using variable importance in projection (VIP) scores. Proteins with VIP scores > 1.3 (ARG1, CD127, CD163, GZMB, IDO1, Ki67, phospho-PRAS40 (T246) and VISTA; figure 2B) largely mirrored what was observed with PCA. These 8 variables were used to classify ROIs in PLS linear discriminant analysis (PLS-LDA) with leave-one-patient-out (LOPO) cross-validation to prevent overfitting. This achieved a predictive accuracy of 93% and 80% for EDAD and ODAD respectively (figure
Collectively, our data suggest a core protein signature comprising ARG1, CD127, GZMB, IDO1, Ki67, phospho-PRAS40 (T246), and VISTA distinguishes EDAD from ODAD ROIs in this patient group (figure 2C). Nevertheless, our data also suggest further patient heterogeneity within EDAD ROIs. This was most marked for ARG1, which was absent from all EDAD ROIs in 8/20 patients. Although sample size precluded a formal analysis, this appeared unrelated to sex, place of death, duration of disease or cohort (online supplementary
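As an aid to reading the VIP threshold, the following is a minimal Python sketch of the commonly used VIP formula. It is not the authors' R code; the function name `vip_scores`, the toy weights and the explained sums of squares are all hypothetical and chosen only for illustration. Under this formula the squared VIP scores average to 1 across variables, so a cut-off of 1.3 retains only variables whose contribution is well above average.

```python
import math


def vip_scores(weights, ss_explained):
    """Variable importance in projection for a fitted PLS model.

    weights[a][j]   : weight of variable j on component a
    ss_explained[a] : response sum of squares explained by component a
    """
    n_vars = len(weights[0])
    total_ss = sum(ss_explained)
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ss_a in zip(weights, ss_explained):
            norm_sq = sum(w * w for w in w_a)  # normalise each weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_vars * acc / total_ss))
    return scores


# Hypothetical two-component model over four variables (toy numbers only).
toy_weights = [[0.8, 0.5, 0.3, 0.1], [0.1, 0.2, 0.9, 0.4]]
toy_ss = [6.0, 2.0]
toy_vip = vip_scores(toy_weights, toy_ss)

# Indices of variables passing the VIP > 1.3 cut-off used in the text.
selected = [j for j, v in enumerate(toy_vip) if v > 1.3]
```

In this toy model only the first variable passes the cut-off, because it dominates the component that explains most of the response variance.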

DISCUSSION
Using DSP to interrogate well-annotated lung tissue, we identified a core protein signature discriminating early from late phases of DAD. Not surprisingly given the targeted nature of our panel, the proteins we identified have well-known functions in inflammation and immunity, but they have not previously been evaluated in relation to DAD progression. ARG1 is elevated in the lungs of severe COVID-19 patients, being expressed by CD11b+ CD66b+ granulocytic myeloid-derived suppressor cells. 4 IDO1 was detected in lung tissue in another autopsy series, though most tryptophan-catabolizing activity was associated with IDO2. 5 CD127 expression on monocytes has been noted at sites of hyperinflammation 6, whereas VISTA has been proposed as a therapeutic target to minimise inflammation. 7 Finally, phosphorylation of PRAS40 at T246 releases mTORC1 to perform its many downstream functions and elevated phospho-PRAS40 (T246) has been used as a biomarker of PI3K/Akt/mTORC1 activation 8, a pathway implicated in idiopathic pulmonary fibrosis and
This study has limitations: 1. DSP quantifies protein expression across the entire ROI and cannot distinguish many cells with low target expression from few cells with high expression; 2. our patient cohort was too small to perform sub-group analysis based on age, gender, disease duration, or place of death; 3. we cannot rule out that patients had other forms of concurrent disease or different forms of DAD in areas of lung not sampled here, which may account for some of the inter-patient heterogeneity observed; 4. further validation is required in an independent patient cohort, preferably incorporating single-cell technologies.
Notwithstanding these limitations, to our knowledge this is the first study to apply highly multiplexed DSP to discriminate between EDAD and ODAD. The extent to which the many millions of COVID-19 survivors are at risk of developing pulmonary fibrosis is only beginning to be understood 10. Importantly, many of the protein targets we have identified as being highly expressed at the early stages of DAD are amenable to therapeutic intervention with existing drugs or drugs in development. Hence further exploration of these targets in pre-clinical models of SARS-CoV-2 infection could provide an evidence base for future intervention trials.

Acknowledgements
The authors would like to acknowledge the tissue donors and their families for their contribution to medical science.

Ethical Approval
Human samples used in this research project were partly obtained from the Newcastle

Figure 3 Differential target expression between EDAD and ODAD using linear mixed modelling. A and B, Differentially expressed (FDR 5%; FC = 1.5) protein targets between EDAD and ODAD (A) and MDAD and ODAD (B). Data derive from linear mixed modelling with patient repeat measures and cohort as a random effect. C, Individual ROI counts for EDAD, MDAD and ODAD ROIs for identified target proteins. ns, non-significant; **, p<0.01; ***, p<0.001; ****, p<0.0001 between indicated groups.

Patient samples
Autopsy lung tissues from 27 patients were obtained from biobanks at the University of Newcastle, University of Edinburgh, and Imperial College London. Patients were from both the first and second waves of the UK pandemic. These patients were selected from a larger histopathological study on the basis that they showed DAD in the absence of additional lung complications associated with pneumonia or heart failure. A description of the full cohort
lung tissue. ROI capture was performed using a GeoMx Spatial profiler instrument (NanoString, Seattle, WA, USA).
Digital count data were normalised to positive ERCC controls and to housekeeping (HK) controls (GAPDH and Histone H3). Housekeeping targets were selected based on high correlation with isotype controls. ROIs with abnormal levels of hybridisation, HK expression or low isotype control background were removed from the analysis. Targets were removed from analysis if signals were below the geometric mean of the isotype controls. Data were exported for further analysis in R (see below) and analysed by linear mixed modelling in GeoMx software (version 2.0) with patient ID and cohort selected as random variables.
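The housekeeping step can be illustrated with a short Python sketch. This is an assumption about how such a normalisation is typically done (scaling each ROI so that its housekeeping geometric mean matches the geometric mean across ROIs), not a description of the GeoMx software internals. It omits the ERCC step, and the function names and toy counts are hypothetical.

```python
import math


def geomean(values):
    """Geometric mean of strictly positive counts."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def housekeeping_normalise(rois, hk_names):
    """Scale each ROI so its housekeeping geometric mean matches the cohort-wide one.

    rois     : list of dicts mapping target name -> raw count for one ROI
    hk_names : names of the housekeeping targets (assumed present in every ROI)
    """
    hk_means = [geomean([roi[name] for name in hk_names]) for roi in rois]
    reference = geomean(hk_means)  # common scale shared by all ROIs
    return [
        {target: count * reference / hk for target, count in roi.items()}
        for roi, hk in zip(rois, hk_means)
    ]


# Toy counts: the second ROI has four times the signal of the first for every
# target, as would happen if it simply contained more tissue.
toy_rois = [
    {"GAPDH": 100.0, "Histone H3": 400.0, "ARG1": 50.0},
    {"GAPDH": 400.0, "Histone H3": 1600.0, "ARG1": 200.0},
]
normalised = housekeeping_normalise(toy_rois, ["GAPDH", "Histone H3"])
```

After scaling, both toy ROIs report the same ARG1 value, which is the intended effect: differences driven purely by overall signal are removed, leaving target-specific differences for downstream modelling.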
Volcano plots were generated in GeoMx® software and show significance scores with FDR correction (5%) based on the Benjamini, Krieger, and Yekutieli two-stage step-up method and a log2 fold change cut-off of 0.589 (1.5-fold change).
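The two criteria behind the volcano plots can be sketched in Python as follows. This is a plain re-implementation of the published two-stage step-up procedure for illustration, not the GeoMx software's code; the function names and the toy p-values are hypothetical. For reference, log2(1.5) is approximately 0.585, so the quoted cut-off of 0.589 sits very slightly above a 1.5-fold change.

```python
import math


def bh_reject_count(sorted_p, q):
    """Number of rejections made by the Benjamini-Hochberg step-up rule at level q."""
    m = len(sorted_p)
    k = 0
    for i, p in enumerate(sorted_p, start=1):
        if p <= q * i / m:
            k = i  # step-up: keep the largest rank that satisfies the bound
    return k


def bky_two_stage(p_values, q=0.05):
    """Two-stage step-up FDR control; returns True where a hypothesis is rejected."""
    m = len(p_values)
    sorted_p = sorted(p_values)
    q_prime = q / (1 + q)
    r1 = bh_reject_count(sorted_p, q_prime)  # stage 1
    if r1 == 0:
        return [False] * m
    if r1 == m:
        return [True] * m
    m0_hat = m - r1  # estimated number of true null hypotheses
    r2 = bh_reject_count(sorted_p, q_prime * m / m0_hat)  # stage 2
    threshold = sorted_p[r2 - 1]
    return [p <= threshold for p in p_values]


def passes_fold_change(log2_fc, cutoff=0.589):
    """True when the absolute log2 fold change meets the cut-off."""
    return abs(log2_fc) >= cutoff


# Toy p-values (hypothetical): the fifth is rescued only by the second stage.
toy_p = [0.001, 0.008, 0.012, 0.03, 0.1, 0.6]
rejected = bky_two_stage(toy_p, q=0.05)
```

With these toy values a single-stage Benjamini-Hochberg rule at 5% would reject four hypotheses, whereas the two-stage procedure rejects five, because the first stage suggests that few of the tested hypotheses are true nulls.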

Statistical analysis
Statistical analyses were carried out in R version 4.1.1. 1 The base R function prcomp was used for PCA, while the pls package 2 was used for PLS-R. Classification was performed using the plsgenomics R package 3 with LOPO cross-validation to avoid overfitting in this supervised approach. Here, all ROIs for each patient in turn were left out and the remaining data were used to build the model, which was then used to predict the class of the left-out ROIs.
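The LOPO scheme described above can be sketched in Python as follows. The splitting logic follows the description in the text, but the classifier is a simple nearest-centroid rule used purely as a stand-in: the study used PLS-LDA from the plsgenomics R package, which is not reproduced here. All function names and the toy data are hypothetical.

```python
def lopo_splits(patient_ids):
    """Yield (patient, train_indices, test_indices), leaving out one patient at a time."""
    for patient in sorted(set(patient_ids)):
        test = [i for i, p in enumerate(patient_ids) if p == patient]
        train = [i for i, p in enumerate(patient_ids) if p != patient]
        yield patient, train, test


def fit_centroids(rows, labels):
    """Stand-in model: the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centre):
        return sum((a - b) ** 2 for a, b in zip(row, centre))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def lopo_predictions(rows, labels, patient_ids):
    """Predict every ROI using a model that never saw ROIs from the same patient."""
    predicted = [None] * len(rows)
    for _, train, test in lopo_splits(patient_ids):
        model = fit_centroids([rows[i] for i in train], [labels[i] for i in train])
        for i in test:
            predicted[i] = predict(model, rows[i])
    return predicted


def per_class_accuracy(labels, predicted):
    """Fraction of correctly classified ROIs within each true class."""
    totals, hits = {}, {}
    for truth, guess in zip(labels, predicted):
        totals[truth] = totals.get(truth, 0) + 1
        hits[truth] = hits.get(truth, 0) + (truth == guess)
    return {label: hits[label] / totals[label] for label in totals}


# Toy data (hypothetical): two features per ROI, three patients.
toy_rows = [[5.0, 1.0], [4.8, 1.2], [1.0, 5.0], [5.2, 0.9],
            [1.1, 4.7], [0.9, 5.1], [4.9, 1.1]]
toy_labels = ["EDAD", "EDAD", "ODAD", "EDAD", "ODAD", "ODAD", "EDAD"]
toy_patients = ["P1", "P1", "P1", "P2", "P2", "P3", "P3"]

toy_predicted = lopo_predictions(toy_rows, toy_labels, toy_patients)
toy_accuracy = per_class_accuracy(toy_labels, toy_predicted)
```

Grouping the folds by patient rather than by ROI matters because ROIs from the same lung are correlated; leaving out single ROIs would let the model see the held-out patient during training and would likely overstate accuracy.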
Results are shown for the ROIs that were not used in model training. To show how predictive accuracy changed as the discriminatory threshold was varied, a receiver operating characteristic (ROC) curve was generated using the R package ROCR 4.
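To illustrate what such a curve represents, the following Python sketch builds ROC points by sweeping the decision threshold over held-out classifier scores and computes the area under the curve with the trapezoid rule. It is a stand-in for illustration and does not use or mimic the ROCR package; the function names and toy scores are hypothetical.

```python
def roc_points(scores, is_positive):
    """Return (false positive rate, true positive rate) pairs, from (0, 0) to (1, 1).

    scores      : classifier scores, higher meaning more likely positive
    is_positive : True for ROIs of the positive class, False otherwise
    """
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    ranked = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for idx, (score, positive) in enumerate(ranked):
        if positive:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all ROIs tied at this score have been counted.
        last_of_tie = idx + 1 == len(ranked) or ranked[idx + 1][0] != score
        if last_of_tie:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy held-out scores (hypothetical): three positive and three negative ROIs.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_positive = [True, True, False, True, False, False]
toy_curve = roc_points(toy_scores, toy_positive)
toy_auc = auc(toy_curve)
```

In the toy example eight of the nine positive-negative pairs are ranked correctly, so the area under the curve is 8/9; an area of 0.5 would correspond to ranking no better than chance.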