Bridging the gap between AI and observations for sea ice forecasting: a new working group (ORCAS)

Artificial Intelligence (AI) is transforming environmental forecasting worldwide, but applying it to the complex, rapidly changing polar regions, the Arctic and Antarctica, remains a major scientific challenge. AI is revolutionising numerical weather prediction by providing tools that are often better performing, significantly faster, and more energy-efficient than traditional physics-based models. Major developments, such as ECMWF’s AIFS and Google’s WeatherNext, have demonstrated that AI can match, or even exceed, the predictive skill of traditional numerical weather prediction systems.

However, polar regions are characterised by sparse observational coverage and intricate climate processes, making them difficult to model accurately. Current AI models rely heavily on reanalysis products (such as ERA5), which, despite their strengths, are known to contain significant biases and shortcomings in the polar regions.

The growing demand for AI applications in polar regions, combined with their unique observational and physical challenges, calls for a comprehensive, community-driven effort.

Comparison of ECMWF’s AI-based forecasting system (AIFS) and the traditional physics-based model (IFS) weather forecasting accuracy for different forecast lead times. AIFS outperforms IFS across all forecast lead times, and both systems show improving accuracy over time. Challenges remain in both polar regions, and lower overall performance in the Southern Ocean reflects, among other things, the limited observations. Figure modified from Bauer 2024, https://doi.org/10.1016/j.jemets.2024.100002

Introducing the ORCAS Working Group

ORCAS (Observational Requirements in the Context of AI prediction Systems) is a community effort dedicated to ensuring that next-generation AI models are built on a solid foundation of real-world observations. Our team brings together experts in machine learning and polar observation to evaluate model performance and identify the optimal data needed for future forecasting systems.

ORCAS began as a task team under the WMO WWRP PCAPS (Polar Coupled Analysis and Prediction for Services) project. PCAPS aims to build on the legacy of the Year of Polar Prediction (YOPP) to improve the fidelity, actionability, and impact of forecasting services. Through PCAPS, we are strategically linked to international initiatives like Antarctica InSync and the planning for the 5th International Polar Year (IPY5).

Recognising the broader oceanographic implications of our work, ORCAS has now also been established as a SCOR (Scientific Committee on Oceanic Research) Working Group. This dual affiliation will enable us to deliver peer-reviewed publications on model verification, produce strategic reports on observational priorities, and provide accessible guidance and resources for the broader scientific community.

Sea ice melange in the Amundsen Sea, January 2024, showing a heterogeneous mixture of sea ice floes and iceberg fragments. The complex, fine-scale processes occurring here are not necessarily captured in the grid-scale analysis products typically used to train AI models. Photo credit: Clare Eayrs

Key Goals: Observations, Future Planning, and Trust

Maximising the value of observations: A primary goal is to ensure that specialised field campaign data, such as those from the MOSAiC expedition or the SvalMIZ-24 campaign, are no longer underused in AI development. While remote sensing fits well into current AI pipelines, in situ campaign data are vital for capturing fine-scale processes and extreme conditions that reanalysis datasets often miss.

Informing future campaigns: ORCAS identifies the specific data types, formats, and spatio-temporal resolutions that upcoming campaigns like Antarctica InSync and IPY5 must prioritise to optimally support AI training, initialisation, and validation. We seek to answer critical questions about observational density: for example, whether a wider spread of observations is more beneficial than localised, high-quality deep-profile data.

Probing physical consistency for building trust: For AI predictions to be useful, researchers and stakeholders must be confident that the outputs are physically meaningful and consistent. ORCAS assesses "physical consistency" by checking whether AI systems adhere to known physical constraints and conservation laws, such as the relationship among sea-ice drift, thickness, and temperature. We are leveraging emerging interpretability methods to understand how AI models arrive at their predictions. These techniques can also help identify which types of observations are most critical for model performance, directly informing our observational strategy recommendations.

A major challenge is that current AI models primarily simulate state variables (like sea ice concentration) but often lack explicit calculations for radiative and turbulent fluxes. ORCAS is investigating methods to recompute or predict these fluxes to ensure process-level understanding is maintained.
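As a rough illustration of what recomputing a turbulent flux from state variables can look like, the sketch below applies the standard bulk aerodynamic formula for sensible heat flux, H = rho * c_p * C_H * U * (T_s - T_a). This is not a description of the ORCAS method: the function name, the constant exchange coefficient C_H, and the air density are assumptions chosen for illustration, and in practice the coefficient depends on atmospheric stability and surface roughness.

```python
# Illustrative bulk estimate of sensible heat flux from state variables.
# All constants are assumed, typical-order values, not ORCAS settings.

RHO_AIR = 1.3      # air density in kg m^-3 (assumed, typical for cold air)
CP_AIR = 1005.0    # specific heat of air at constant pressure, J kg^-1 K^-1
C_H = 1.5e-3       # bulk heat transfer coefficient (assumed constant)


def sensible_heat_flux(wind_speed, surface_temp, air_temp,
                       rho=RHO_AIR, cp=CP_AIR, c_h=C_H):
    """Return the sensible heat flux in W m^-2.

    Positive values mean heat flows from the surface to the atmosphere.
    wind_speed is in m s^-1; temperatures may be in degC or K, since
    only their difference is used.
    """
    if wind_speed < 0:
        raise ValueError("wind_speed must be non-negative")
    return rho * cp * c_h * wind_speed * (surface_temp - air_temp)


# Made-up example values: 10 m/s wind, surface at -2 degC, air at -12 degC.
flux = sensible_heat_flux(10.0, -2.0, -12.0)
print(f"{flux:.1f} W m^-2")
```

A diagnostic like this only needs wind speed and two temperatures, which AI models commonly predict, so it can be applied to model output and to campaign observations alike and the two results compared.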

Current Activities: Testing the Models

To achieve these goals, ORCAS is developing systematic validation scenarios based on historical campaign events. Our first major test case uses the SvalMIZ-24 dataset, which documented a period of strong winds and wave activity in the Marginal Ice Zone north of Svalbard in April 2024.

We are evaluating three distinct architectures: ECMWF-AIFS, Ice-kNN (subseasonal analogues), and MET-AICE (high-resolution regional). Early results from the AIFS comparison show that models with explicit sea ice representation perform better at capturing local temperature evolution, particularly in the marginal ice zone, though all systems currently exhibit a persistent bias in the region. These early findings are helping us define standardised metrics and recommendations for handling the resolution differences between high-frequency buoy data and grid-scale AI models.
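One simple way to handle the mismatch between high-frequency buoy data and coarser model output is to average the observations in a time window around each model valid time before computing error metrics. The sketch below shows that idea with bias and root-mean-square error. It is a minimal example under stated assumptions, not the metric set ORCAS has adopted: the function names, the window width, and the sample numbers are all hypothetical, and spatial representativeness is ignored.

```python
from math import sqrt


def window_mean(obs_times, obs_values, centre, half_width):
    """Mean of observations with |time - centre| <= half_width (hours).

    Returns None when no observation falls inside the window.
    """
    selected = [v for t, v in zip(obs_times, obs_values)
                if abs(t - centre) <= half_width]
    return sum(selected) / len(selected) if selected else None


def bias_and_rmse(model_times, model_values, obs_times, obs_values,
                  half_width=3.0):
    """Compare model values with window-averaged observations.

    Returns (bias, rmse, n_pairs), where bias is model minus observation.
    Model times without any observation in their window are skipped.
    """
    errors = []
    for t, m in zip(model_times, model_values):
        o = window_mean(obs_times, obs_values, t, half_width)
        if o is not None:
            errors.append(m - o)
    if not errors:
        raise ValueError("no model time has matching observations")
    n = len(errors)
    bias = sum(errors) / n
    rmse = sqrt(sum(e * e for e in errors) / n)
    return bias, rmse, n


# Made-up numbers: 6-hourly model temperatures vs. hourly buoy readings (degC).
model_times, model_values = [0, 6], [-10.0, -8.0]
buoy_times = [0, 1, 2, 5, 6, 7]
buoy_values = [-12.0, -11.0, -13.0, -9.0, -10.0, -11.0]

bias, rmse, n = bias_and_rmse(model_times, model_values,
                              buoy_times, buoy_values, half_width=1.0)
print(f"bias={bias:.2f} degC, rmse={rmse:.2f} degC, pairs={n}")
```

Averaging the observations first keeps sub-grid, high-frequency variability in the buoy record from being counted as model error. The choice of window width is itself a judgment call that a standardised metric would need to fix.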

Equally important is understanding the development ecosystem itself: What training data are developers using beyond the standard ERA5 reanalysis? How are models being validated? Which observations are being incorporated, and which remain underutilised? Our February workshop is designed to address these questions directly, bringing together model developers and observational scientists to share practices, identify data gaps, and establish community standards.

Upcoming Online Community Workshop: February 2026

We are pleased to announce our inaugural online community workshop scheduled for 10 and 11 February 2026. This event will offer an opportunity to share our working group activities with the community and to gather broad feedback on our early findings and methods. We invite AI model developers, observational scientists, and stakeholders to join us in shaping the future of sea ice prediction.

The free workshop will run via Zoom in two sessions to accommodate global time zones: Tuesday, 10 February (13:00–15:00 UTC) and Wednesday, 11 February (03:00–05:00 UTC). Participants will examine how AI sea ice models currently use datasets for training, initialisation, and validation. The workshop will feature presentations from modelling and observational communities, quick introductions from participants, and discussions on data needs, collaboration opportunities, and future observation system design. We welcome anyone interested in contributing to this expanding community.

All participants are requested to prepare a single-slide introduction using a provided template. Registration closes on Monday, 9 February 2026. Please use the following form to register your attendance. Sessions will be recorded for those unable to attend live.

— ORCAS Task Team Members: Clare Eayrs, Lorenzo Zampieri, Malte Müller, Luisa von Albedyll, Sandra Barreira, David Bromwich, Petra Heil, Zachary Labe, Yafei Nie, Luciano Ponzi Pezzi, David Clemens-Sewell, Yonghan Choi, Wayne De Jager, Simon Driscoll, Lauren Hoffman, François Massonnet
