Extreme windstorms and sting-jets in convection-permitting climate simulations over Europe

Extra-tropical windstorms are one of the costliest natural hazards affecting Europe, and windstorms that develop a sting-jet are extremely damaging. A sting-jet is a mesoscale core of very high wind speeds that occurs in Shapiro-Keyser type cyclones, and high-resolution models are required to adequately model sting-jets. Here, we develop a low-cost methodology to automatically detect sting-jets, using the characteristic warm seclusion of Shapiro-Keyser cyclones and the slantwise descent of high wind speeds, within pan-European 2.2km convection-permitting climate model (CPM) simulations. Compared with observations, the CPM improves the representation of wind gusts relative to the ERA-Interim reanalysis; this is linked to better representation of cold conveyor belts and sting-jets in the CPM. Our analysis indicates that Shapiro-Keyser cyclones, and those that develop sting-jets, are the most damaging windstorms in present and future climates. The frequency of extreme windstorms is projected to increase by 2100 and a large contribution comes from sting-jet storms. Furthermore, extreme wind speeds and their future changes are underestimated in the global climate model (GCM) compared to the CPM. We conclude that the CPM adds value in the representation of extreme winds and surface wind gusts and can provide improved input for impact models compared to coarser resolution models.


Introduction
Extra-tropical windstorms are one of the costliest natural hazards affecting Europe and cyclones that follow the Shapiro-Keyser (SK) conceptual model (Shapiro and Keyser, 1990) account for a large proportion of the most damaging windstorms (Hewson and Neu, 2015). Within Shapiro-Keyser storms, there are three main sources of extreme surface wind gusts; the warm conveyor belt (WCB) (Harrold, 1973), the cold conveyor belt (CCB) (Carlson, 1980;Browning and Roberts, 1994), and a sting-jet, if present (Browning, 2004;Baker, 2009;Schultz and Browning, 2017;Clark and Gray, 2018). According to Hewson and Neu (2015), the highest and most damaging surface wind gusts are generally due to the sting-jet when it is present, followed in order by the CCB and WCB. Previous storms known to have produced sting-jets include the most damaging windstorm to hit the UK, the Great Storm of '87 (Browning, 2004;Clark et al., 2005), as well as windstorms Ulli and Friedhelm (Fox et al., 2012), Tini (Volonté et al., 2018), and Ellen (Met Éireann, 2020). The latter four storms each warranted a red level warning, the highest level to be issued by forecasting agencies due to the high likelihood of extreme impacts occurring. Reported impacts include large numbers (>150,000) of households without electricity (Perils, 2012) as well as high insured losses such as US$0.2bn for Windstorm Ulli and US$6.7 billion for the Great Storm of '87 (Roberts et al., 2014). In this article, we quantify the contribution of such windstorms to wind risk using current and future climate simulations produced by a convection-permitting climate model (CPM), and compare these with a 25km global climate model (GCM) to assess the added-value offered by a CPM in the representation of extreme wind speeds.
The WCB and CCB are low-level jets that occur within the warm and cold sectors, respectively, of extra-tropical cyclones, and both can lead to extremely damaging surface wind gusts (Hewson and Neu, 2015). A sting-jet (SJ) is a meso-scale slantwise airstream that descends from the mid-troposphere, within an SK cyclone, into what is known as the dry slot (Weldon and Holmes, 1991), in front of (or eastward of) the CCB (Clark and Gray, 2018). The sting-jet exhibits a number of features that make it particularly hazardous. Firstly, the dry slot is generally characterised by weak stability conditions that are favourable to high momentum air from the SJ being transferred down towards the surface (Clark et al., 2005;Hewson and Neu, 2015;Clark and Gray, 2018). Secondly, the sting-jet accelerates during its descent within the free atmosphere unopposed by friction, bringing high momentum air towards the surface (Slater et al., 2015). This descent can then be further accelerated by evaporative cooling (Browning, 2004;Clark et al., 2005;Browning et al., 2015;Rivière et al., 2020) and the release of a slantwise instability, known as conditional symmetric instability (Clark et al., 2005;Gray et al., 2011;Baker et al., 2014;Coronel et al., 2016;Volonté et al., 2018;Eisenstein et al., 2020). As a result, the presence of a sting-jet may yield wind speeds that can far exceed those expected when viewing isobars of a synoptic chart. Such characteristics and the short timescale of a sting-jet, compared to a CCB or WCB (1 to 12 hours vs. 12 to 36 hours: Hewson and Neu, 2015), present challenges for forecasters who can sometimes only identify the presence of a sting-jet in real-time (Fox et al., 2012).
To adequately simulate sting-jets using a numerical model, both high horizontal and vertical resolution are required. The minimum requirements suggested are horizontal grid spacings below 10-15km and vertical grid spacings of around 250m in the mid-troposphere, and smaller spacings below (Clark et al., 2005;Clark and Gray, 2018). Besides the small spatial scale of a sting-jet, high resolution is required in order to realise the instabilities that can accelerate a sting-jet during its descent. For instance, Volonté et al. (2018) compare a sting-jet's descent in a high-resolution model with that in a global model with reduced horizontal and vertical resolution. Although both simulations produce a sting-jet, the sting-jet's descent in the high-resolution simulation coincided with the release of conditional symmetric instability and increases in wind speed, while no such instability release was found in the lower resolution model. Consequently, the maximum sting-jet wind speeds at 850hPa were underestimated by 12m/s compared to the high-resolution model (48m/s vs. 60m/s). Similarly, Coronel et al. (2016) found that reducing both the horizontal and the vertical resolution results in reduced wind speeds and no sting-jet forming. However, there was little difference when only reducing the horizontal resolution (4km vs 20km). In other comparisons where only vertical resolution was decreased, studies reported a reduction in sting-jet strength (Clark et al., 2005) and in the area of intense wind speeds (Martínez-Alvarado et al., 2010). In contrast, Slater et al. (2015) reported no sensitivity to either horizontal (10km vs. 20km) or vertical resolution in idealised simulations, though possible changes in the sting-jet due to the release of slantwise instabilities were not included in the simulation.
Overall, the effect of resolution appears to vary and may depend on characteristics such as the size of the sting-jet and the contribution of conditional symmetric instability release. In cases where the latter is significant, it is evident that high vertical and horizontal resolution is required.
Quantifying the contribution of sting-jets to wind extremes, relative to the CCB or WCB, remains an understudied topic. This is partly because observations of wind speeds are sparse and may miss cores of very strong wind speeds produced by sting-jets, while high-resolution simulations are restricted to case studies due to the large costs they incur. Within such case studies, Smart and Browning (2014) find the sting-jet produced higher wind gusts than the CCB, but over a much smaller footprint and shorter timescale, while Martínez-Alvarado et al. (2014) show that the sting-jet and CCB may occur alongside one another, making it difficult to disentangle their relative contributions. The likelihood of sting-jets occurring in windstorms has so far been indirectly inferred using a large-scale precursor known as downdraught slantwise convective available potential energy (DSCAPE) (Martínez-Alvarado et al., 2012), which indicates the presence of slantwise instabilities or the environmental conditions that are conducive to sting-jets. When applied to ERA-Interim (Martínez-Alvarado et al., 2012;Hart et al., 2017), and a 10-year climate simulation (Martínez-Alvarado et al., 2018), the diagnostic was present in roughly 0.1 storms per month that pass over Northwest Europe. This was found to increase to 0.5 cyclones per month in a future climate simulation under the RCP8.5 scenario. Within these studies, Hart et al. (2017) and Martínez-Alvarado et al. (2018) characterised storm severity using the maximum 850hPa wind speed. They found that DSCAPE storms (that develop explosively) account for more than 40% of storms with maximum wind speeds > 40m/s in a present climate, while the frequency of these storms that exceed 35m/s was found to increase by 140% in a future simulation. The presence of DSCAPE can thus indicate potentially damaging windstorms and is currently used in operational forecasting to do so (Gray et al., 2020).
DSCAPE points to the high risk of a sting-jet developing in these storms, although the models it is applied to are too coarse in resolution to adequately capture sting-jets and the resulting wind speeds. Thus, the high-resolution climate simulations analysed in this study should provide a better opportunity to understand the contributions of sting-jets to extreme wind speeds in windstorms.
The aim of this study is twofold: to develop a low-cost method to identify sting-jets in high-resolution climate simulations; and to assess the added-value offered by a high resolution (2.2km) convection permitting climate model (CPM), compared to a 25km GCM, for wind extremes in extra-tropical windstorms with a focus on those that produce a sting-jet. The added-value will be assessed through a comparison with observed wind gusts as well as by identifying systematic differences between the CPM and GCM that would indicate improved process representation in the CPM. In particular, we aim to understand whether the likelihood of extreme wind speeds due to sting-jets is underestimated in the GCM compared to the CPM, and what implications this may have for impacts due to wind extremes in a current and future climate.

Model Simulations
Three simulations from a high-resolution convection permitting regional climate model (CPM) are analysed in this study: a hindcast driven by the ERA-Interim reanalysis dataset (ERAI) (Dee et al., 2011) at the lateral boundaries (March 1999-February 2018), as well as a control simulation with current climate forcing and a future climate simulation with RCP8.5 climate forcing, driven by a 25km N512 HadGEM3 global climate model (GCM), each 10 years in length. The CPM simulations are carried out with the UK Met Office Unified Model (UM) at a 2.2km horizontal resolution over a European domain shown in Figure 1a, and have previously been assessed in terms of their improved representation of precipitation, as well as future projections of precipitation. The CPM uses 70 terrain following vertical levels with a 40km top. The lowest level is 2.5m above the ground and vertical level spacings increase quadratically with height.
This gives grid spacings of 40, 140, 300, and 590m at heights of approximately 100m, 1 km, 5 km, and 16 km above sea level. The CPM therefore satisfies both horizontal and vertical resolution requirements for the representation of sting-jets as outlined in Clark and Gray (2018). A detailed summary of the model physics can be found in Berthou et al. (2020). The GCM providing the lateral boundaries is run at a 25km resolution with 85 vertical levels with an 85km top; further details on the model physics may be found in Williams et al. (2018) and Stratton et al. (2018).

Wind Data, Analysis and Metrics
We analyse wind speeds in the northern part of the domain only (grid cells north of the grey line in Figure 1a). We also mask the outer 100 grid cells to prevent contamination from the downscaling method. All months of the year are included in this analysis, as sting-jets are not limited to one season.

Surface Wind Gusts Over Land
We assess the CPM's representation of surface wind gusts over land compared to ERA-Interim (ERAI) and observations, using the hindcast simulation. Surface wind gusts are not studied for the control and future climate simulations as wind gust output was not available from the GCM climate simulations nor for the entirety of the CPM climate simulations. The CPM hindcast simulation outputs the maximum 3-second wind gust on a 3-hourly basis, that is, the maximum gust within each 3-hourly interval, as does ERAI. Observed surface wind gusts are obtained from the Met Office Integrated Data Archive System (MIDAS) weather station observations. These are reported as the maximum gust over a mixture of 1-, 3-, and 6-hourly intervals, depending on the given station. We include observed gusts for all 15 storms from the XWS (eXtreme WindStorms) catalogue (Roberts et al., 2014) during the period of 2001-2018. Storms prior to 2001 are not assessed as wind gust output from the CPM hindcast is not available before then. The XWS catalogue comprises the most extreme windstorms to hit Europe in the period 1979-2012, which were selected through a combination of consultation with insurance companies and an analysis of windstorm metrics to find the most important windstorms over this period. Additionally, Storms Christian and Tini, which occurred after their initial analysis period, were also included. In total, 15 storms are assessed.

850hPa Wind Speeds
As wind gusts are not available for the CPM and GCM control and future climate simulations, all analysis of wind speeds in the climate change simulations is done using 3-hourly instantaneous wind speeds at 850hPa. Winds at 850hPa are not directly comparable to surface wind gusts but they give an indication of the maximum surface wind gusts achievable in a storm, as argued by Hart et al. (2017).

Analysis of Storm Severity Over Land
The severity of storms is characterised as the 95th percentile of all 850hPa wind speeds that exceed 25m/s within the storm footprint. This is calculated by incorporating grid cells that occur within 500km of the cyclone centre. Only land grid cells are incorporated so that the metric is impact relevant. A threshold of 25m/s was chosen as this is recognised as the surface wind gust level at which damages due to wind start to occur (Roberts et al., 2014). This metric provides an indication of the extremity of wind speeds over a relevant area of the storm (5% of the area where winds exceed 25m/s). It is chosen instead of the maximum wind speed that has been used in previous studies with coarser resolution models (e.g. Zappa et al., 2013;Hart et al., 2017;Martínez-Alvarado et al., 2019), as the maximum wind speed from a single grid cell at 2.2km resolution may not always be representative of the overall storm severity.
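The severity metric described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code used in the study: the function names, the haversine distance calculation, and the treatment of a single time step (rather than a footprint accumulated over the storm lifetime) are assumptions made for the example.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat, lon, lat0, lon0):
    """Haversine distance (km) from each grid cell to the cyclone centre."""
    lat, lon, lat0, lon0 = map(np.radians, (lat, lon, lat0, lon0))
    a = (np.sin((lat - lat0) / 2) ** 2
         + np.cos(lat) * np.cos(lat0) * np.sin((lon - lon0) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def storm_severity(wind850, lat, lon, land_mask, centre_lat, centre_lon,
                   radius_km=500.0, threshold=25.0, percentile=95.0):
    """Percentile of land 850hPa wind speeds above `threshold` near the centre.

    Returns NaN when no land grid cell within `radius_km` exceeds the threshold.
    """
    dist = great_circle_km(lat, lon, centre_lat, centre_lon)
    selected = wind850[(dist <= radius_km) & land_mask & (wind850 > threshold)]
    if selected.size == 0:
        return np.nan
    return float(np.percentile(selected, percentile))
```

Because only cells above 25m/s enter the percentile, the result is insensitive to how much of the 500km disc is calm, which is what makes it comparable between storms of different size.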

Identification of Warm and Cold Sector Winds
850hPa wind speeds are classified into cold and warm sector wind speeds to identify winds related to the warm conveyor belt in the warm sector of storms and the cold conveyor belt and sting-jet in the cold sector of storms.
We include both land and ocean grid cells in this analysis as sting-jets often occur over the ocean. Following Hart et al. (2017), the cold and warm sectors of a storm are separated at 3-hourly intervals using a temperature threshold at 850hPa that represents the frontal boundary between the warm and cold sector. Wet-bulb potential temperature is more commonly used but we do not have this output available at 3-hourly intervals. The 850hPa temperature is firstly regridded to the 25km GCM grid and the threshold is identified in each storm, at the time of minimum MSLP within the CPM domain, as the mean 850hPa temperature from grid cells that exceed the 99th percentile of 850hPa temperature within a 1000km radius of the cyclone centre. Grid cells at elevations above 500m are masked before calculating the 99th percentile. The resulting threshold obtained from the application of this method is demonstrated by the thick blue line in Figure 3a.
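A minimal sketch of the threshold calculation is given below, assuming the 850hPa temperature, the distance to the cyclone centre, and the surface elevation are already available as arrays on the 25km grid. The function names are hypothetical, and the rule that cells at or above the threshold belong to the warm sector is an assumption made for illustration.

```python
import numpy as np


def sector_threshold(t850, dist_km, elevation_m, radius_km=1000.0,
                     max_elev_m=500.0, pct=99.0):
    """Mean 850hPa temperature of the cells exceeding the 99th percentile.

    Only cells within `radius_km` of the cyclone centre and at or below
    `max_elev_m` are used, mirroring the masking described in the text.
    """
    valid = (dist_km <= radius_km) & (elevation_m <= max_elev_m)
    values = t850[valid]
    p99 = np.percentile(values, pct)
    return float(values[values > p99].mean())


def split_sectors(t850, threshold):
    """Boolean masks (warm, cold); warm is assumed to mean t850 >= threshold."""
    warm = t850 >= threshold
    return warm, ~warm
```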

Storm Identification
Cyclone positions have been identified at 3-hourly intervals. They were initially tracked at a 6-hourly resolution (00h, 06h, 12h, 18h) in ERA-Interim and in the 25km GCM using the Hoskins and Hodges (2002) tracking algorithm, which identifies and tracks cyclones using 850hPa relative vorticity, regridded to a T42 grid (~300km resolution). As sting-jets may occur over timescales shorter than 6 hours, cyclone positions at 3-hourly timesteps between the 6-hourly timesteps (i.e. 03h, 09h, 15h, 21h) were subsequently identified. These positions were identified as the point with maximum relative vorticity at 850hPa within a search-region whose extent was bounded using storm coordinates at the previous and following 6-hourly timestep (e.g. 00h and 06h are used to define the search region for the storm position at 03h). The longitude and latitude coordinates of the storm at the 6-hourly timesteps are used to define the northern, southern, eastern and western bounds of the search region.
Each corner of the search-region is then extended outwards by two grid points to account for storms that remain relatively stationary over a 6-hour period, which would result in a very small search region.
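The search for the intermediate 3-hourly position can be illustrated as follows. This is a sketch under simplifying assumptions: a regular latitude-longitude grid with 1D coordinate arrays, positions given as (latitude, longitude) pairs, and a padding of two grid points on whichever grid the vorticity field is supplied on. The function name is hypothetical.

```python
import numpy as np


def intermediate_position(vort850, lats, lons, pos_prev, pos_next, pad=2):
    """Locate the 850hPa vorticity maximum between two 6-hourly positions.

    vort850 has shape (len(lats), len(lons)); pos_prev and pos_next are
    (lat, lon) tuples at the previous and following 6-hourly timesteps.
    """
    # Nearest grid indices of the two bounding storm positions.
    i_a, i_b = (int(np.argmin(np.abs(lats - p[0]))) for p in (pos_prev, pos_next))
    j_a, j_b = (int(np.argmin(np.abs(lons - p[1]))) for p in (pos_prev, pos_next))

    # Bounding box, extended outwards by `pad` grid points and clipped to the grid.
    i0, i1 = max(min(i_a, i_b) - pad, 0), min(max(i_a, i_b) + pad, len(lats) - 1)
    j0, j1 = max(min(j_a, j_b) - pad, 0), min(max(j_a, j_b) + pad, len(lons) - 1)

    box = vort850[i0:i1 + 1, j0:j1 + 1]
    di, dj = np.unravel_index(np.argmax(box), box.shape)
    return float(lats[i0 + di]), float(lons[j0 + dj])
```

The padding guarantees a search region of at least five by five grid points even when the storm does not move between the two 6-hourly timesteps.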

Diagnosing Warm Seclusion of Shapiro-Keyser Cyclones
Shapiro-Keyser (SK) cyclones are identified at a 6-hourly timescale (00h, 06h, 12h, 18h) using wet-bulb potential temperature at 850hPa and mean sea level pressure (MSLP). SK cyclones develop a warm seclusion during Stage IV of the SK conceptual model (Catto, 2016), which occurs as a result of a bent-back front that encloses or wraps around a core of relatively warmer air. An example of such a feature is provided in Figure 1. This exemplifies the secluded warm air feature, coincident with the MSLP core, which is surrounded by colder air to the north, south and west. The boundaries between the warm and cold air mark the position of the bent-back front, which appears as a cold front to the south, and a warm front to the north, of the cyclone's core in the SK conceptual model. Methods are available to identify the position of frontal boundaries (e.g. Hewson, 1998;Berry et al., 2011) based on temperature gradients between adjacent grid points in coarse resolution datasets.
However, at 2.2km resolution, grid spacings are too small to detect a frontal boundary whose temperature gradients occur over larger spatial scales. Furthermore, such approaches will not distinguish between fronts associated with a warm seclusion and those not. We therefore develop an approach that is better suited to this resolution that will identify the warm seclusion feature.
A latitudinal cross-section through the storm along the black line in Figure 1 (b) depicts the warm seclusion as a sharp peak in wet-bulb potential temperature in Figure 1 (c). Similarly, a longitudinal cross-section would show a sharp increase in wet-bulb potential temperature in the west-east direction (not shown). The warm seclusion diagnostic is based on the identification of such peaks in wet-bulb potential temperature that coincide with the MSLP core of the cyclone. Specifically, a warm seclusion is identified if the MSLP core of the cyclone coincides with an area where wet-bulb potential temperature is at least 2K higher than in surrounding areas. The steps of the identification procedure are outlined below:
1. Extract a region around the cyclone centre which extends 1000km to the north, south, east and west (red box in Figure 1 (a)).
2. Within each latitudinal row of the resulting matrix, identify the presence of a peak in wet-bulb potential temperature. A peak refers to a run of consecutive points at which wet-bulb potential temperature is greater than a threshold, which is equal to the wet-bulb potential temperature at the beginning of the peak (see Figure 1c). The beginning of a peak is identified at point when and a peak is only identified if an end point is found which indicates a reversal in the latitudinal gradient that is indicative of a warm seclusion.
Only points within a peak that exceed the threshold + 2K are considered as potential warm seclusion points. The threshold and the threshold + 2K are indicated in Figure 1 (c). The selection of 2K is a subjective choice based on experience and analysis of the known events in Table 1, though it is similar to the 1.5K threshold used to identify the warm core of Medicane cyclones (Tous et al., 2016).
3. The previous step is repeated in the longitudinal direction (west to east).
4. The MSLP core is identified in the same way, although a trough in MSLP is identified rather than a peak.
Once the presence of an MSLP trough is detected, the minimum MSLP is found and the MSLP core is identified as all points within the trough whose MSLP is less than this minimum + 6hPa. This threshold is also a subjective choice guided by experience and analysis of known events.

A warm seclusion is identified if the number of potential warm seclusion points that satisfy both steps 2 and 3 within the MSLP core exceeds 500 points. The number 500 was selected from experience to remove very small and incoherent cases that can occur due to a noisy field at 2.2km. To give some perspective, at 2.2km, 500 points would cover an area of around 2,400 km², which, as a circle, would have a radius of approximately 28km, far below the average radius of a cyclone core.
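The final decision step can be sketched as below. The per-row and per-column peak detection is not reproduced here; the sketch assumes it has already produced two boolean masks of potential warm seclusion points (one from the latitudinal pass, one from the longitudinal pass) together with a mask of the MSLP trough. All function and argument names are hypothetical.

```python
import numpy as np

GRID_KM = 2.2  # CPM grid spacing


def has_warm_seclusion(warm_rows, warm_cols, mslp_hpa, trough_mask,
                       core_margin_hpa=6.0, min_points=500):
    """Return (flag, count) for the warm seclusion test.

    warm_rows / warm_cols: boolean masks of potential warm seclusion points
    from the latitudinal and longitudinal passes (steps 2 and 3).
    trough_mask: boolean mask of the detected MSLP trough (step 4).
    """
    mslp_min = mslp_hpa[trough_mask].min()
    core = trough_mask & (mslp_hpa < mslp_min + core_margin_hpa)
    n_points = int(np.count_nonzero(warm_rows & warm_cols & core))
    return n_points > min_points, n_points


def equivalent_radius_km(n_points, grid_km=GRID_KM):
    """Radius of a circle with the same area as `n_points` grid cells."""
    area_km2 = n_points * grid_km ** 2
    return float(np.sqrt(area_km2 / np.pi))
```

Calling `equivalent_radius_km(500)` gives a quick way to relate the point-count threshold to a physical length scale when the threshold or the grid spacing is changed.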

Sting-Jet Diagnostic
The identification of sting-jets in this analysis was limited by the number of pressure levels available in the output. Five pressure levels (300, 500, 700, 850 and 925 hPa) were available, of which only two were useful to identify the slantwise descent of a sting-jet. The only existing automated method to detect sting-jets in high-resolution simulations, known as back-trajectory analysis (Martínez-Alvarado et al., 2014), requires output on many pressure levels at time intervals less than one hour (Eisenstein et al., 2020). Although this is a powerful method, its application is not feasible on a climatological timescale due to its large storage and computational costs. This motivates the need for a low-cost method with low data requirements. In this study, sting-jets are identified within the subset of diagnosed Shapiro-Keyser cyclones, at a 3-hourly timescale using wind speeds at 700hPa and 850hPa as well as relative humidity (RH) at 500hPa.
A sting-jet occurs in the cold sector of a cyclone where it descends from the mid-troposphere (~600hPa) to the top of the boundary layer (~850hPa) or lower, emerging from the edge of an extrusion of cloud on the northeast side of the cyclone called the cloud head (Bottger et al., 1975) into a region of low relative humidity known as the dry slot. In doing so it produces areas of low RH within the cloud head from where it has descended as well as a distinct region of strong winds at the level it descends to in front of the cloud head (Clark and Gray, 2018).
The diagnostic presented here is based on identifying the three characteristics described above, namely the slanted descent of a sting-jet, the reduction in RH and the distinct feature of high wind speeds in front of the cloud head around the southern flank of the cyclone. The slanted descent is identified by a reversal of the vertical wind speed gradient between winds at 700hPa and 850hPa, along streamlines at 850hPa.
See formula 1 in the supplementary files section.
If present, a sting-jet will produce a transition from positive to negative gradients along a streamline (see Figure 2b and 2c) that originates from an area of high RH ice (cloud head) into an area of low RH ice (dry slot), as can be seen from the comparison of RH ice and the location of the gradient reversal feature in Figure 2a and 2b respectively. The transition is identified by extracting streamlines (e.g. gold dashed line in Figure 2b) along wind directions at the 850hPa level, which are used to provide an approximation of the actual sting-jet trajectory, as if it were projected from above on to the 850hPa level. The streamlines are initiated from points (e.g. gold dot in Figure 2c) with negative gradients in the cold sector of the storm whose wind speed at 850hPa also exceeds the local 98th percentile of the winter season December, January, and February (DJF). This is calculated separately for the hindcast, while the 98th percentile value for the control is applied to both the control and future simulations.
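One simple way to extract such a streamline is to step upstream through the 850hPa wind field one grid length at a time. The sketch below is an illustration only and is not taken from the study: it assumes a regular grid in which rows increase northward and columns increase eastward, `u` and `v` are the eastward and northward wind components, and the streamline is capped at roughly 200km. The function name is hypothetical.

```python
import numpy as np


def backward_streamline(u, v, start, grid_km=2.2, max_len_km=200.0):
    """Trace grid cells upstream of `start` (row, col) along the 850hPa wind.

    Returns a list of (row, col) cells, beginning with the start point.
    """
    n_steps = int(max_len_km / grid_km)
    row, col = float(start[0]), float(start[1])
    path = [(int(start[0]), int(start[1]))]
    for _ in range(n_steps):
        i, j = path[-1]
        speed = np.hypot(u[i, j], v[i, j])
        if speed == 0.0:
            break
        # Step one grid length against the local wind direction.
        col -= u[i, j] / speed
        row -= v[i, j] / speed
        cell = (int(round(row)), int(round(col)))
        if not (0 <= cell[0] < u.shape[0] and 0 <= cell[1] < u.shape[1]):
            break  # left the domain
        if cell != path[-1]:
            path.append(cell)
    return path
```

The vertical wind speed gradient, RH ice at 500hPa and 850hPa wind speed can then be sampled at the returned cells, in order, to build the profiles that the detection criteria operate on.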
At each grid cell along the streamline, the vertical wind gradient is extracted along with RH ice at 500hPa and wind speed at 850hPa. Only streamlines to the south of a cyclone are considered, which are those whose backward direction is towards the west. Each streamline has a maximum length of ~200km. This is a subjective choice: Clark et al. (2005) show for one storm that the length of the sting-jet feature was around 150km, though this can vary between different storms (Hewson and Neu, 2015). For a sting-jet to be identified, several criteria should be met along a given streamline:
1. Moving backwards along a streamline, a continuous sequence of negative gradients, beginning from the initial point of the streamline, should be followed by a continuous sequence of positive gradients. An example of this transition along a streamline is provided in Figure 2c. Each positive and negative sequence must be at least 20 grid cells in length and occur within 10 grid cells of one another to account for possible fluctuations in the vertical gradient.
2. The median RH ice at 500hPa within the positive gradient points should be greater than 80%, indicating the descent originates from within the cloud head. The threshold of 80% is applied as this has been used in previous sting-jet analyses to indicate the location of cloud (Eisenstein et al., 2020;Volonté et al., 2018;Coronel et al., 2016). RH ice at 500hPa was chosen instead of at 700hPa, as the algorithm was found to perform better.
3. The median RH ice at 500hPa within the negative sequence should be, in absolute terms, at least 50% lower than that within the sequence of positive gradients. Together, the RH ice criteria indicate the emergence of the sting-jet from the cloud head region into the dry slot of the cyclone. The value of 50% was chosen to optimise the algorithm's performance, as described further below.
4. The maximum 850hPa wind speed of the negative sequence should be at least 6m/s higher than the maximum wind speed from the positive sequence. Along with the RH ice criteria, this indicates a distinct wind feature ahead of the cloud head. Similarly to RH ice, 6m/s was chosen to optimise the algorithm's performance (see below).
5. The above procedure is applied to each point in the cyclone's cold sector with a negative vertical gradient which exceeds the DJF 98th percentile described above. The number of these points that satisfy criteria 1 to 4 should exceed 50 points. This is an arbitrary selection to remove randomly identified events that occur due to the noisy nature of a 2.2km grid.
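The five criteria above can be sketched as a check on three profiles sampled backwards along one streamline. This is a hedged illustration rather than the authors' implementation: the gradient is assumed to be positive where the 700hPa wind is stronger than the 850hPa wind, "within 10 grid cells" is interpreted as the positive sequence starting no more than 10 cells after the negative sequence ends, and all function and parameter names are hypothetical.

```python
import numpy as np


def _run_length(mask, start):
    """Length of the run of True values in `mask` beginning at `start`."""
    n = 0
    while start + n < len(mask) and mask[start + n]:
        n += 1
    return n


def streamline_passes(grad, rh500, wind850, min_run=20, max_gap=10,
                      rh_cloud=80.0, rh_drop=50.0, wind_jump=6.0):
    """Apply criteria 1 to 4 to profiles ordered backwards from the start point."""
    grad = np.asarray(grad, dtype=float)
    rh500 = np.asarray(rh500, dtype=float)
    wind850 = np.asarray(wind850, dtype=float)

    # Criterion 1: negative run from the start point, then a positive run.
    n_neg = _run_length(grad < 0, 0)
    if n_neg < min_run:
        return False
    positive = grad > 0
    pos_start, n_pos = None, 0
    for start in range(n_neg, min(n_neg + max_gap + 1, len(grad))):
        if positive[start]:
            n_pos = _run_length(positive, start)
            if n_pos >= min_run:
                pos_start = start
                break
    if pos_start is None:
        return False

    neg = slice(0, n_neg)
    pos = slice(pos_start, pos_start + n_pos)
    rh_pos, rh_neg = np.median(rh500[pos]), np.median(rh500[neg])

    in_cloud_head = rh_pos > rh_cloud                                  # criterion 2
    dries_out = (rh_pos - rh_neg) >= rh_drop                           # criterion 3
    wind_feature = (wind850[neg].max() - wind850[pos].max()) >= wind_jump  # criterion 4
    return bool(in_cloud_head and dries_out and wind_feature)


def storm_has_sting_jet(streamlines, min_points=50):
    """Criterion 5: more than `min_points` start points must pass criteria 1 to 4.

    `streamlines` is an iterable of (grad, rh500, wind850) profile tuples.
    """
    n_pass = sum(streamline_passes(*profiles) for profiles in streamlines)
    return n_pass > min_points
```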

Performance and Limitations of Algorithm
The performance of the algorithm has been tested for 14 known or speculated storms (see Table 1) that produced sting-jets within the time period of the hindcast simulation (1999-2018). Some of these have been confirmed as having sting-jets through backward trajectory analysis (Clark and Gray, 2018), while others have been subjectively identified through visual analysis (Hewson and Neu, 2015). The algorithms to detect a warm seclusion and a sting-jet were applied to each time step of these storms. Each storm produced a warm seclusion which was detected by the algorithm, as indicated in Table 1, and we find this to be a robust feature in storms that is relatively easy to detect. As such, we have high confidence in the automated detection of warm seclusions. To assess the sting-jet algorithm's performance, we firstly subjectively identify if the simulation has produced a sting-jet in each of the 14 storms.

Subjective Identification of Sting-jets
Two subjective assessments were carried out independently by the first author and a Chief Operational Meteorologist at the UK Met Office, Dan Suri. The two assessments are used to minimise any unconscious bias and increase our confidence in the robustness of the subjective assessment. The first author used criteria recommended by Hewson and Neu (2015) in which a sting-jet is visually identified if the wind maximum at 850hPa occurs just ahead of the cloud head close to the cyclone's core on the equatorward side, as shown in Figure 2a. The algorithm is also based on these criteria. As detailed in Table 1, nine storms were found to exhibit these characteristics. The second subjective assessment by Dan Suri used the above information but also looked at surface wind gusts (where available) for evidence that a sting-jet had reached the surface. Eleven of the fourteen storms presented evidence of a sting-jet according to these criteria. To highlight the inherent subjectivity of this approach, and the difficulty in identifying if a sting-jet has occurred or not, Dan Suri categorised each of these eleven storms according to the degree of certainty (high, medium, or low) he had that the identified feature was produced by a sting-jet (see Table 1). Five fall into the high confidence category, three in medium, and three in low.
All nine storms identified by the lead author are included in the eleven storms identified by Dan Suri, and the timings of the identified sting-jets are generally the same, except for Storm Xynthia (Table 1). The remaining two of the eleven storms, which were not identified by the first author (Storms Tini and Gudrun/Erwin), presented no evidence of a distinct wind feature ahead of the cloud head but did show signs of a sting-jet in surface wind gusts. It is possible that sting-jets may be missed in 3-hourly instantaneous output but seen in the 3-hourly maximum wind gusts if they occur over a short period in between. It is also possible that the definition used for the cloud head may not be optimal for all storms and result in missed cases. However, there is a large overlap between the two assessments and so we are confident that the criteria used by the first author, which are also used by the algorithm, provide a good indicator of a sting-jet.

Optimisation and Performance of Sting-Jet Algorithm
As the criteria used by the algorithm are the same as those used in the subjective assessment by the lead author, the performance of the algorithm is optimised against the nine storms identified by the lead author. The RH ice difference threshold (criterion 3) and the wind speed difference threshold (criterion 4) were selected to optimise the algorithm's performance by maximising the number of correctly identified sting-jets whilst minimising the number of false alarms and missed events. A number of combinations of the RH ice threshold (-20% to -70%) and the wind speed threshold (2m/s to 10m/s) were tested, and the optimum parameter values found were a 50% reduction in RH ice and a 6m/s increase in wind speed. The performance of the algorithm with the chosen parameters is summarised in Table 1. The algorithm identified 6 of the 9 events and had 1 false alarm. The number of missed events increases from 3 to 5 when compared against the forecaster's assessment. However, it only identified 7 of the 22 sting-jet time steps from across all storms but had just 3 false alarms. These results indicate that the algorithm with the prescribed parameters has skill in identifying storms where the indicators of sting-jets are present, namely the distinct wind feature at 850hPa ahead of the cloud head. However, it has less skill in identifying all such timesteps in those storms.
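The parameter selection described above amounts to a small grid search scored against the subjective reference. The sketch below illustrates the idea only: the `detect` callable, the storm identifiers, and in particular the scoring rule (hits minus false alarms minus misses) are assumptions, since the text does not state how the three quantities were weighted.

```python
from itertools import product


def contingency(detected, reference):
    """Return (hits, false_alarms, misses) for two sets of storm identifiers."""
    hits = len(detected & reference)
    false_alarms = len(detected - reference)
    misses = len(reference - detected)
    return hits, false_alarms, misses


def grid_search(detect, storms, reference, rh_drops, wind_jumps):
    """Try every (rh_drop, wind_jump) pair and keep the best-scoring one.

    detect(storm, rh_drop, wind_jump) -> bool is a hypothetical wrapper around
    the sting-jet diagnostic. The score used here is an assumed objective.
    """
    best = None
    for rh_drop, wind_jump in product(rh_drops, wind_jumps):
        detected = {s for s in storms if detect(s, rh_drop, wind_jump)}
        hits, false_alarms, misses = contingency(detected, reference)
        score = hits - false_alarms - misses
        if best is None or score > best[0]:
            best = (score, rh_drop, wind_jump, hits, false_alarms, misses)
    return best
```

In practice `rh_drops` would span 20 to 70 (percentage points) and `wind_jumps` 2 to 10 (m/s), matching the ranges tested in the text, and the same loop can be repeated at the level of individual time steps rather than whole storms.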
The main source of the missed cases lies with the criteria applied to RH ice along the extracted streamlines. The comparison of the median RH ice in the positive and negative sequence of the gradient reversal feature (Figure 2 (b) and (c)) is problematic as it requires the positive and negative sequences to lie mostly within an area of high and low RH respectively, which is not always the case. Comparing the maximum and minimum RH in the positive and negative sequences respectively increases the number of hits to 17 of the 22 timesteps but is also accompanied by a large number of false alarms (23), which is not acceptable. We therefore decided to compare the median RH to remain conservative and are confident that the storms identified with this approach are identified for the right reasons. Other missed cases such as Storms Egon and Jeanette had no gradient reversal present. It is possible that no sting-jet is actually present in these storms, or that the sting-jet descends close to or over the cold conveyor belt, which is indicated in a case study analysis of Storm Egon (Eisenstein et al., 2020).
This occurrence may remove the positive gradient between 850hPa and 700hPa, leading to a missed case.
Some further quality control was carried out to investigate the overall performance for all storms. The algorithm was found to produce many false alarms in the Mediterranean and surrounding areas with complex orography as it identifies subsidence in downslope winds on the lee side of mountains as well as events such as the Mistral in the south of France that produces subsiding air and extreme wind speeds downstream of the Rhone valley. We therefore decided to remove the Mediterranean and surrounding areas with complex orography from the analysis as we found little evidence of sting-jets there in the simulations and it would be difficult to distinguish between the abovementioned features and sting-jets without applying arbitrary criteria. As such, the sting-jet algorithm is not applied to cyclones south of the grey line in Figure 1a.
The low data requirements, compared to the 3-dimensional fields needed for back-trajectory analysis, are a large advantage of this approach, which creates the possibility to automate the analysis of sting-jets in high-resolution climate simulations and to allow the further exploitation of such simulations that are continually increasing in number. There is also potential for it to be applied operationally to help the forecasting community with high-impact weather warnings. However, the low-cost advantage of the method does come with caveats, in that several parameters are required with arbitrary thresholds due to the low number of pressure levels available. In future applications, improvements might be achieved through increasing the number of pressure levels output from simulations between 900 and 500hPa, or potentially through applying an online diagnostic while the model is running which would remove the need for large volumes of output. It would also be helpful to use case study simulations of known sting-jets to fine-tune the method and guide the selection of required output from high-resolution climate simulations. Despite the caveats though, we are confident that the algorithm with the specified criteria is identifying storms for the right reasons and can be used to extract information related to sting-jets from these simulations.

Comparison with Observed Wind Gusts
We compare cumulative distribution functions (CDFs) of observed maximum 3-second wind gusts for 15 storms from the XWS catalogue to those produced by the hindcast simulation and ERAI (Figure 3). Only wind gusts from stations/grid-points below an elevation of 500m are considered. CDFs are firstly constructed for each storm from the maximum gust at each grid cell or station that occurred within 500km of a cyclone centre (Figure 3b-p). These CDFs are then pooled for the CPM, ERAI and observations and shown in Figure 3a. The CPM generally produces higher wind gusts than ERAI and compares better with observations both on the native grid and when regridded to the ERA Interim grid. Compared to the observations, the CPM underestimates wind gusts above 25m/s. This feature is also seen in Roberts et al. (2014) and Haas and Pinto (2012), who compared observed wind gusts to those simulated by a limited-area model run at coarser resolutions of 25km and 7km respectively. In Roberts et al. (2014), possible reasons for this underestimation included weaker pressure gradients in the model and the non-representation of convective gusts.
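The construction of the per-storm CDFs can be illustrated with a minimal sketch. The record layout (site identifier, gust, elevation, distance to the cyclone centre) and the function names are assumptions made for this example, and the empirical CDF uses the simple i/n plotting position, which may differ from the one used for Figure 3.

```python
def storm_maxima(records, max_elevation_m=500.0, max_distance_km=500.0):
    """Maximum gust per site for one storm.

    Each record is (site, gust_ms, elevation_m, distance_km). Only sites
    below the elevation limit and within the distance limit of the cyclone
    centre are kept.
    """
    best = {}
    for site, gust, elevation_m, distance_km in records:
        if elevation_m < max_elevation_m and distance_km <= max_distance_km:
            best[site] = max(gust, best.get(site, gust))
    return list(best.values())


def empirical_cdf(values):
    """Return the sorted values and their cumulative probabilities i/n."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered, [(i + 1) / n for i in range(n)]


def pooled_cdf(per_storm_maxima):
    """Pool the per-storm maxima from all storms into a single CDF."""
    pooled = [g for gusts in per_storm_maxima for g in gusts]
    return empirical_cdf(pooled)
```

The same functions would be applied separately to station observations, ERAI grid points and CPM grid cells so that the three pooled CDFs are built in a like-for-like way.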
However, when comparing CDFs on an event-by-event basis we see the CPM compares quite well with the observations in some events. In other events, the CPM can both overestimate and underestimate wind gusts, and so no systematic bias is seen in this respect. For those events where the CPM underestimates observed gusts, CPM simulated gusts are either similar to or better than ERAI (e.g. Figure 3b, e, g, h). This suggests that the CPM's performance may be limited by the use of lateral boundary conditions from ERAI which tends to underestimate the minimum MSLP in windstorms due to its resolution (Hewson and Neu, 2015). This bias is not necessarily corrected through dynamical downscaling, potentially contributing to underestimation of wind gusts by the CPM.
It is important to note that we do not expect one-to-one correspondence between simulated and observed gusts on an event-by-event basis, although we would expect the overall climatology of gusts to be captured. This is because the CPM only receives information on the observed state of the atmosphere at the lateral boundaries of the Europe-wide model domain, and within this domain it evolves freely. Furthermore, station observations at a given point in space are not directly comparable to gridded model data, as modelled gusts represent an areal mean for a grid cell that may tend to make the model gusts too low. On the other hand, observations may also miss peak wind gusts as they are sparsely distributed in space. There are no gridded products of observed wind gusts and so this is the best comparison we can make. Overall, the results provide an indication that the CPM can represent maximum wind gusts within extra-tropical cyclones, providing an improvement over ERAI.

Assessment of Storm Frequency and Severity
Storms are categorised into three storm types; non-Shapiro-Keyser storms (), Shapiro-Keyser storms without a sting-jet (), and those that produce a sting-jet (). The total number of cyclones within each category for the three simulations is provided in Table 2 be related to natural variability or differences in boundary conditions and downscaling ratios (75km to 2km for ERAI, and 25 to 2km for the GCM). The number of cyclones in the control and future simulations is the same, although, due to the overall reduction of SK cyclones in the future simulation, the proportion of SK cyclones with sting-jets is slightly higher in the future (4.7% vs. 4.3%). Compared to Hart et al. (2017) and Martínez-Alvarado et al. (2018) who report climatologies of 0.1 and 0.5 DSCAPE cyclones per month over Northwest Europe in present and future climates respectively, our present-day estimates are similar to these at 0.1 and 0.2 cyclones per month in the hindcast and control, though our future estimate is lower at 0.2 cyclones per month. The severity of storms is characterised by (see methods). CDFs of for each storm type are provided in Figure 4.
The conditional CDF for storms is shifted to higher values compared to storms, while the CDFs of storms are shifted to even higher values in each of the three simulations, indicating an increased probability of extreme wind speeds when a sting-jet occurs. The absolute contributions of each storm type to exceedances of above high thresholds is provided in Figure 5. Each bar represents the number of storms exceeding a given threshold, while the colours indicate the frequency contribution of each storm type. The hindcast is split into the same two 10-year periods in Table 1 to give an indication of natural variability in . The number of storms with exceeding 35m/s in each period differs by around 10 storms. This difference varies depending on the threshold. The control simulation has more storms exceeding 35m/s than both hindcast periods, but the numbers exceeding higher thresholds are quite comparable. The proportional contributions from each storm type in the hindcast and control are also similar. cyclones account for the majority of the most extreme windstorms at all thresholds, while storms show a large contribution at lower thresholds, but their contribution diminishes at higher thresholds. storms have a small representation at each bin, although their proportional contribution increases with increasing threshold. In the future simulation, there are increased exceedances for all storm types, although the changes are mostly driven by increased contributions from and storms as shown by numbers above each bar in Figure 5 (c). For instance, above 40m/s, the number of storms increases from 10 to 28, with 44% and 39% of that increase accounted for by and respectively. Overall, even with the difficulties found in identifying sting-jets themselves, the methodology developed here is identifying storms that have a chance of developing sting-jets (i.e. storms) which are generally the most impactful storms and dominate future increases in extreme threshold exceedances.
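The threshold counts behind a figure of this kind, and the share of a future increase attributable to each storm type, can be sketched as follows. The labels "A", "B" and "C" are placeholders for the three storm types, and the per-type counts in the usage note are hypothetical; only the totals of 10 and 28 storms above 40m/s come from the text.

```python
from collections import Counter


def exceedance_counts(storms, thresholds):
    """Count storms of each type whose severity exceeds each threshold.

    `storms` is a list of (storm_type, severity_ms) pairs.
    """
    return {
        t: Counter(kind for kind, severity in storms if severity > t)
        for t in thresholds
    }


def share_of_increase(control, future):
    """Fraction of the total increase in storm count due to each type."""
    total = sum(future.values()) - sum(control.values())
    if total == 0:
        return {}
    return {
        kind: (future.get(kind, 0) - control.get(kind, 0)) / total
        for kind in future
    }
```

For an increase from 10 to 28 storms (18 additional storms), shares of 44% and 39% correspond to roughly 8 and 7 storms. For example, hypothetical counts of {"A": 2, "B": 6, "C": 2} in the control and {"A": 5, "B": 14, "C": 9} in the future give shares of 8/18 and 7/18 for types "B" and "C".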

Sting-Jet Storms: Sources of Wind Extremes
The results above relate to overall storm severity (i.e. all extreme winds within a storm footprint and not just the sting-jet itself). We now examine the contribution of the sting-jet feature to the overall distribution of wind speeds relative to the warm and cold conveyor belts (WCB and CCB) for storms in which a sting-jet was identified (Figure 6). Contributions from the sting-jet, CCB and WCB are extracted by firstly separating wind speeds into warm and cold sector winds. All warm sector wind speeds are classified as WCB, while cold sector wind speeds are classified as a sting-jet at timesteps when a sting-jet was detected and CCB otherwise. It is possible that a CCB may be present at the same time as a sting-jet, and so sting-jet and CCB winds refer to cold sector winds in the presence or absence of a sting-jet. Wind speeds are then pooled across all storms within each wind type (i.e. SJ, CCB or WCB).
It is clear from Figure 6 that the CCB is the main driver of extreme 850hPa wind speeds, a finding that is in line with Hart et al. (2017). It accounts for more than 40% of winds greater than 35m/s and this proportion increases at higher thresholds. The WCB has a similar contribution at 35m/s, but this decreases with increasing threshold. Contributions from sting-jet winds are generally lower than the CCB and WCB, but increase at higher thresholds. The low estimate for sting-jet winds at 850hPa is partly due to its relatively smaller spatial and temporal scales compared to the CCB and WCB, but it is also likely to be underestimated given the number of timesteps on which sting-jets are missed (Table 2). Furthermore, it is possible that its contribution to extreme surface gusts would be greater than at 850hPa as it can more readily transfer momentum from aloft towards the surface than the CCB and WCB.
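The classification rule described above reduces to two flags per wind sample. A minimal sketch follows, assuming each sample already carries a warm-sector flag and a per-timestep sting-jet detection flag; how the warm sector is identified is described in the methods and is not reproduced here, and the function names are illustrative.

```python
def classify_wind(in_warm_sector, sting_jet_detected):
    """Assign a wind sample to WCB, SJ or CCB.

    Warm sector winds are always WCB. Cold sector winds are SJ on timesteps
    where a sting-jet was detected and CCB otherwise.
    """
    if in_warm_sector:
        return "WCB"
    return "SJ" if sting_jet_detected else "CCB"


def pool_by_type(samples):
    """Pool wind speeds across all storms within each wind type.

    `samples` is an iterable of (speed_ms, in_warm_sector, sting_jet_detected).
    """
    pooled = {"WCB": [], "CCB": [], "SJ": []}
    for speed, in_warm_sector, sting_jet_detected in samples:
        pooled[classify_wind(in_warm_sector, sting_jet_detected)].append(speed)
    return pooled
```

Because the SJ label is assigned by timestep rather than by location, any CCB winds present while a sting-jet is detected are pooled with the SJ winds, as noted in the text.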

Comparison with GCM
The comparison between the CPM and GCM aims to highlight systematic differences that would indicate potential added-value of the CPM through better process representation. For this comparison, CPM 850hPa wind speeds are regridded to the 25km grid of the GCM. Hereafter, the CPM on its native grid and that regridded to the 25km GCM grid will be referred to as CPMn and CPMr respectively. The metric is then calculated for the CPMr and GCM. Compared to CPMr, the GCM underestimates the frequency of storms exceeding high values of in both the control and future simulations (Figure 7a and 7b). In terms of future changes, the changes are higher in the GCM than CPMr for storms with < 40m/s but are higher in the CPMr for storms with > 40m/s (Figure 7f). The CPMn has similar or slightly higher changes than the GCM at all thresholds (Figure 7c). Overall, the future changes in the CPM and GCM are similar which indicates that changes in are dominated by large-scale changes in storms captured in the GCM that are inherited by the CPM. This is a similar finding to Leckebusch et al. (2006), though Donat et al. (2011) report differences between RCMs when driven by the same GCM, and so this result may differ depending on the RCM or CPM used. However, there is a difference in future changes of 3 storms at high thresholds (> 42m/s) which accounts for 33% of changes in the CPM. Over longer simulations, an accumulation of this difference may become significant if maintained, and it would be interesting to explore this as longer high-resolution climate simulations become available.
To compare within each storm type (i.e , or ), we apply the classification obtained from CPMn that is used for results in Section 3.3. Hence, we do not identify the presence of a warm seclusion or sting-jet on the 25km grid, but classify a GCM storm, for instance, as if a warm seclusion and a sting-jet is identified in that given storm within CPMn output. In fact, the given storm may not have a sting-jet in the GCM, due to the coarser resolution, but we have not been able to assess this as the performance of the algorithm when applied to a 25km grid was not acceptable, indicating that the algorithm requires high-resolution input. Thus, potential added-value of the CPM in capturing the actual sting-jet feature is not considered here, and instead we focus purely on the overall extreme wind distribution and any evidence of added-value in this. The distribution of absolute differences in between each storm in the CPMr and GCM is presented as boxplots in Figures 8d and 8e. is generally higher for CPMr storms than GCM storms, and the differences seen are consistent across each storm type. However, the differences are more relevant in and storms as wind speeds are generally higher in these storms (Figure 5).
Next, we look to understand the source of the differences between CPMr and GCM for and storms. To do so, we pool all 850hPa wind speeds that occur within 500km of a cyclone centre for both storm types in each simulation separately. The percentage difference between CPMr and GCM is then calculated for the difference in absolute number of 850hPa wind speed exceedances of specified thresholds (Figure 8) within four categories: 1) all 850hPa wind speeds within 500km of a cyclone centre (ALL); 2) warm conveyor belt winds (WCB); 3) cold conveyor belt winds (CCB); and 4) sting-jet winds (SJ). Winds are classified as WCB if they occur within the warm sector of a cyclone, identified separately in CPMr and GCM, while cold sector winds are classified as sting-jet on timesteps when a sting-jet was identified in the CPMn, and CCB otherwise. The approach of assigning wind speeds to sting-jet assumes that the stage at which a sting-jet develops (between stages 3 and 4 in the Shapiro-Keyser conceptual model; Clark and Gray, 2018) occurs simultaneously in the CPM and GCM, though it may not occur in the GCM at all. Storms can deviate in the CPM domain from the GCM, but qualitatively we do not observe large differences in the position of storms between GCM and CPM, and so we deem this a reasonable assumption to make.
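The exceedance comparison can be sketched in a few lines. The text does not state which simulation is used as the reference in the percentage difference; the sketch below assumes the difference is expressed relative to the CPMr count, so that negative values indicate fewer exceedances in the GCM. The function names are illustrative only.

```python
def count_exceedances(speeds, threshold):
    """Number of pooled 850hPa wind speed samples above the threshold."""
    return sum(1 for s in speeds if s > threshold)


def percent_difference(cpmr_speeds, gcm_speeds, threshold):
    """Difference in exceedance counts, GCM minus CPMr, as a percentage of CPMr.

    Returns None when CPMr has no exceedances, since the ratio is undefined.
    """
    cpmr = count_exceedances(cpmr_speeds, threshold)
    gcm = count_exceedances(gcm_speeds, threshold)
    if cpmr == 0:
        return None
    return 100.0 * (gcm - cpmr) / cpmr
```

The same calculation would be repeated for the ALL, WCB, CCB and SJ pools and for each threshold to produce one line per category.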
The GCM underestimates extreme wind speeds at 850hPa compared to the CPMr (black line in Figure 8) in both CPM-classified (Figure 8a and 8b) and storms (Figure 8c and 8d) within the control and future simulations. These differences increase with increasing threshold, where the underestimation can be between 20-40% for thresholds between 35 and 45m/s. Many of these differences are also deemed significantly different from zero, as indicated by points along each line (see figure caption for details on the construction of the uncertainty interval). The meaning of significance in this case is an indication that the given difference is robust across the event set and is insensitive to the occurrence of single events or minor differences in the areal extent of extremes in storms. The differences between CPMr and GCM are generally higher for the control simulation, particularly for storms (Figure 8c). In contrast, the differences are robust for storms in the control and future simulations (Figure 8a and 8b), which have a much larger sample size.
Differences seen between CPMr and GCM are likely due to an underestimation of wind speeds related to the sting-jet and CCB, as can be deduced through comparison of the relevant lines in Figure 8 with the black line representing ALL wind speeds, particularly for storms (Figure 8a and 8b). The largest differences in storms come from sting-jet winds (Figure 8c and 8d), though significant differences are not seen for the future simulation (Figure 8d), and so this finding may not be as robust as differences seen for CCB winds in storms. In contrast, the GCM tends to overestimate WCB winds though differences are smaller (<20%) and only significant for lower thresholds below 35m/s. Overall, the results suggest that the CPM and GCM are similar in their representation of WCB winds, while the GCM largely underestimates cold sector wind speeds at 850hPa related to the sting-jet and CCB, particularly for the extremes. It is therefore likely that the CPM will provide more reliable information on potential future changes in extreme 850hPa wind speeds, which may result in the GCM underestimating changes in frequency of the most severe windstorms as indicated by the differences in future changes in > 42m/s (Figure 7f). The underestimation at 850hPa may also lead to an underestimation in extreme surface wind gusts, though we cannot show this as wind gust output was not available from these simulations. Such an underestimation in wind gusts may have large implications if using GCM output in impact models, as the expected impact, or insured loss, increases with the cube of the wind speed. Consequently, a small underestimation in extreme wind speeds can yield a disproportionately larger underestimation in expected impacts (Klawa and Ulbrich, 2003).
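The effect of the cubic relationship can be made concrete with a short calculation. The sketch below assumes a loss that scales purely with the cube of the wind speed, as stated above; operational loss models such as that of Klawa and Ulbrich (2003) include further terms (for example a local exceedance threshold) that are not represented here.

```python
def loss_underestimate(speed_underestimate):
    """Fractional underestimate in a cubic loss for a fractional
    underestimate in wind speed (both expressed as fractions, e.g. 0.10).
    """
    return 1.0 - (1.0 - speed_underestimate) ** 3
```

Under this assumption, a 10% underestimate in wind speed gives 1 - 0.9^3 = 0.271, that is, a loss underestimated by about 27%.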

Summary & Conclusions
We have investigated wind extremes within extra-tropical windstorms using high-resolution convection-permitting model (CPM) simulations over Europe. This paper assesses the added-value of the CPM compared to a 25km GCM and proposes a novel, low-cost, automated methodology that detects cyclones which develop warm seclusions, namely Shapiro-Keyser cyclones, and sting-jets within those cyclones.
The added value of the CPM was firstly assessed by comparing simulated surface wind gusts from the hindcast simulation with those from ERA-Interim and observations. Secondly, we looked for systematic differences between the CPM and 25km GCM that would indicate added-value. The CPM is generally able to replicate the observed CDF of maximum wind gusts during 15 extreme windstorms from the XWS catalogue (Roberts et al., 2014), and offered an improvement compared to ERA Interim wind gusts. As wind gust output was not available for the climate simulations, we compared the CPM and GCM using 850hPa wind speeds. The GCM underestimates the frequency of extreme wind speeds compared to the CPM, and this underestimation is largely attributed to a better representation of cold-conveyor belts (CCBs) and sting-jets in the CPM. For the CCB, the CPM will capture tighter thermal and MSLP gradients close to the cyclone centre that will result in higher wind speeds. While this is also important for sting-jets, the CPM can also realise the release of slantwise instabilities that accelerate a sting-jet's descent, unlike the GCM, and improve the downward transfer of momentum inside the boundary layer (Rivière et al. 2020). Consequently, we expect that a similar underestimation by the GCM would be seen for surface wind gusts.
The automated detection methodology was applied to a hindcast, control and future climate simulation. Shapiro-Keyser cyclones were found to account for the majority of extreme windstorms in each simulation, while those that develop a sting-jet account for a smaller number. The method was found to be effective in detecting Shapiro-Keyser cyclones and sting-jets in historical windstorms, and we have high confidence in the identification of warm seclusions. We are also confident that the algorithm used to detect sting-jets is identifying events for the right reasons, though we have seen that it is prone to missing some sting-jet occurrences (and in particular does not identify all the time-steps within a given storm where a sting-jet is present). The low-cost advantage of the method comes with caveats in that several parameters are required with thresholds to be defined. Specific benefits and shortcomings of the method are discussed in the main text. In the future, it would be helpful to use case study simulations of known sting-jets to fine-tune the method and guide the selection of required output from high-resolution climate simulations. Nevertheless, here we have demonstrated that it is possible to diagnose sting-jets with significantly reduced information compared to that required for more precise methods such as back-trajectory analysis (Martínez-Alvarado et al., 2014; Volonté et al., 2018; Eisenstein et al., 2020). This advantage allows for the analysis of sting-jets in high-resolution climate simulations and may be valuable to forecasters (depending on performance) to help with the detection of sting-jets in output from deterministic weather prediction models.
A large increase in the frequency of extreme windstorms was found in the future simulation. This is mostly accounted for by Shapiro-Keyser cyclones as well as those in which sting-jets are detected. Although the 10-year simulations assessed here are relatively short, the response of the future climate simulation is similar to previous studies (Pinto et al., 2007; Gastineau and Soden, 2009; Donat et al., 2011; Zappa et al., 2013; Pinto et al., 2012; Vautard et al., 2019), and an increase of extreme windstorms seems a robust response to anthropogenic climate change across many studies (Feser et al., 2015). However, assessments of multi-model ensembles reveal a large spread in the simulated response from different models indicating a significant role of internal variability (Donat et al., 2011; Zappa et al., 2013). While a moister, warmer environment due to anthropogenic climate change is expected to lead to more favourable conditions for sting-jets (Martínez-Alvarado et al., 2018), the overall influence of anthropogenic climate change on the frequency of extreme windstorms remains uncertain and requires further investigation (Catto et al., 2019; Zappa, 2019).
The CPM and GCM are similar in their future change response, although there is a hint that the GCM underestimates changes in the most extreme windstorms compared to the CPM. Overall, we can see that changes in the frequency of extreme windstorms are dominated by large-scale changes in storms captured in the GCM and inherited by the CPM. However, since the CPM provides local added detail and an improved representation of sting-jets and cold conveyor belts, the projections of changes to extreme wind speeds are expected to be more reliable from the CPM. Further work is needed to confirm whether there are robust differences in future changes at convection-permitting scale and this will be investigated using longer high-resolution simulations from additional CPMs (e.g. Coppola et al., 2020) when they become available.
This study has implications for stakeholders and downstream impact modellers. In particular, the improved performance of the hindcast in representing surface wind gusts indicates that the CPM will provide better input data for impact models. Indeed, Dunn et al. (2018) demonstrate an improvement in modelling wind impacts to electrical infrastructure when moving to higher resolution simulations. We expect that the GCM will underestimate surface wind gusts driven by CCBs and sting-jets compared to the CPM, and importantly also their future changes. Consequently, impact models will likely require output from high-resolution simulations or bias corrected GCM output (e.g. Haas and Pinto, 2012; Roberts et al., 2014). For these to be reliable, however, large-scale features such as storm tracks must be adequately represented and the assumptions of the bias correction must be valid for future intense cyclones (Maraun et al., 2017).
Declarations

future projections of North Atlantic and European extratropical cyclones in the CMIP5 climate models. Journal of Climate, 26(16), pp.5846-5862.
Zappa, G., 2019. Regional climate impacts of future changes in the mid-latitude atmospheric circulation: a storyline view. Current Climate Change Reports, 5(4), pp.358-371.

Figure 3. Comparison of wind gust CDFs between observations (black lines), ERA Interim (blue line), and the CPM on its native grid (solid red line) and regridded to the ERA Interim grid (~75km) (dashed red line). CDFs are constructed by pooling wind gusts over land from stations/grid-points below a 500m elevation for (a) all 15 windstorms from the XWS Catalogue (Roberts et al., 2014), and for (b) to (p) each individual windstorm. Storms are ordered by their reported insured losses (if available) in the XWS catalogue.