The first ORCAS Community Workshop, “Connecting Observations and AI Sea Ice Prediction Systems”, was held online on 10–11 February 2026, in two sessions to accommodate global time zones. It brought together participants from operational centres, universities, and non-profits on five continents to begin building a community around a question that matters for the whole polar prediction enterprise: what kinds of observations do AI sea-ice prediction systems actually need, and how do we know if those systems are getting the physics right?

Researchers from KOPRI and BAS measuring sea ice thickness and snow depth on an Antarctic sea ice floe, Amundsen Sea, January 2024. Thickness estimates from satellites remain uncertain, largely due to complex snow layering, making accurate field-derived measurements indispensable. (Photo credit: Clare Eayrs)

Why ORCAS, and why now?

ORCAS (Observational Requirements in the Context of AI Prediction Systems for Sea Ice) operates as both a PCAPS Task Team and a SCOR Working Group (WG 173). It was established to address the rapid rise of AI in polar forecasting. AI models are being developed faster than the common standards and frameworks needed to test them, and many of the observations most valuable for understanding why sea ice behaves as it does, such as ice thickness, deformation, snow depth, and turbulent fluxes, are not yet systematically integrated into AI workflows.

The timing is also urgent because two major polar field campaigns are in active planning. Antarctica InSync (2027–2030) and the fifth International Polar Year (IPY5, 2032–2033) will generate a wealth of new observational data. ORCAS has an opportunity right now to help shape what gets measured, and how, so that those datasets are as useful as possible for training and evaluating the next generation of AI prediction systems.

Polar prediction needs observations spanning the full Earth System. The MOSAiC expedition showed what coordinated, multi-disciplinary data collection can look like. ORCAS is working to ensure the next generation of polar campaigns delivers datasets that are ready for data-driven prediction systems. (Figure credit: AWI/MOSAiC Consortium)

A "noisy revolution"

One of the workshop's opening presentations described the current state of AI weather forecasting as a "noisy revolution." New AI systems are emerging quickly, often delivering forecasts faster and more cheaply than physics-based models, and sometimes with better accuracy. Specific sea ice tools, such as MET-AICE and ConvLSTM, are demonstrating real skill in predicting sea ice concentration and extent. Generative AI models are even beginning to show promise in maintaining physical consistency across related variables, ensuring, for example, that predictions of ice thickness are consistent with predictions of ice motion.

The community's focus is shifting beyond the simple measure of forecast skill to a more fundamental question: does the model behave physically? If you stress-test it by introducing perturbations, e.g., modifying the sea ice initial state, feeding it an extreme year, or masking out key inputs, does it still respond in physically plausible ways? Rather than duplicating the forecast skill intercomparisons already conducted by initiatives like the Sea Ice Outlook or SIPN South, ORCAS aims to complement them by probing physical realism through perturbation and consistency experiments. The ultimate goal, as one participant put it, is to build trust, not just measure skill.
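A perturbation experiment of the kind described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not an ORCAS protocol: `toy_model`, its melt rule, and the `thicker_ice_check` helper are hypothetical names invented for this example, and the chosen test (thicker initial ice should never produce a thinner forecast under the same forcing) is only one of many plausibility checks a real framework might include.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: (initial thickness per cell in m, air temperature in deg C)
# -> forecast thickness per cell in m.
ForecastModel = Callable[[Sequence[float], float], List[float]]


def toy_model(initial_thickness_m: Sequence[float], air_temp_c: float) -> List[float]:
    """Stand-in for an AI forecast system: uniform melt that grows with temperature."""
    # Illustrative melt rule only; not a real parameterisation.
    melt_m = max(0.0, 0.05 * air_temp_c)
    return [max(0.0, h - melt_m) for h in initial_thickness_m]


def thicker_ice_check(
    model: ForecastModel,
    initial_thickness_m: Sequence[float],
    air_temp_c: float,
    delta_m: float = 0.2,
    tol: float = 1e-9,
) -> List[int]:
    """Return indices of cells where thickening the initial ice gives a thinner forecast."""
    baseline = model(initial_thickness_m, air_temp_c)
    perturbed = model([h + delta_m for h in initial_thickness_m], air_temp_c)
    return [i for i, (b, p) in enumerate(zip(baseline, perturbed)) if p < b - tol]


if __name__ == "__main__":
    violations = thicker_ice_check(toy_model, [0.0, 0.5, 1.2, 2.0], air_temp_c=4.0)
    print("cells with implausible response:", violations)
```

The check treats the model as a black box, so the same function could in principle wrap any forecast system that exposes a comparable interface. An empty list means no cell violated the expected response.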

The ORCAS Community

A majority of the participants identified as early-career, reflecting how strongly the shift towards AI in polar science is being driven by the next generation. Registrations were received from all six inhabited continents, and the growing network represents a broad community that includes AI model developers, field observationalists, researchers in satellite remote sensing, and those working at the interface between these communities.

The discussions revealed a two-way need that ORCAS is well-placed to address. Field scientists want to understand what AI systems can and cannot do, so they can make a stronger case for the value of their measurements. AI developers, meanwhile, often don't know what observational data exists, and they lack standardised frameworks to validate their models against physical reality rather than just skill scores. ORCAS can provide a community space for exchanging ideas and designing frameworks to shape both modelling and observation strategies.

The challenge of connecting observations to AI

A core scientific thread running through the workshop was the difficulty of using field campaign data to train or validate AI prediction systems. Rich, process-level measurements, such as those collected during the extraordinary MOSAiC expedition, which saw a research vessel drifting in Arctic sea ice for an entire year, are not routinely incorporated into these architectures. The challenge is that AI systems typically need long, continuous records to learn from, and most field campaigns produce short, episodic datasets that do not meet that bar on their own.

Participants argued that campaign data should be understood as a source of process benchmarks, i.e., high-value datasets curated around specific physical themes (thermodynamics, ice dynamics, snow-ice interactions) that can be used to stress-test AI models and identify where they go wrong. For example, the MOSAiC expedition generated AI-derived products such as automated ice deformation fields from ship radar and snow classification from micro-penetrometer profiles. Such datasets can be used for validation even if the underlying measurements are too sparse or too short for training.

The guiding principle that emerged was "collection, not perfection": the priority is to gather campaign data into a well-described, unified dataset format that supports physical consistency testing, rather than to wait for a perfect dataset.
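To make the idea of a well-described, unified format concrete, here is one possible shape for a single benchmark record. This is a hypothetical sketch: the workshop did not specify a schema, so the `BenchmarkRecord` class, its field names, and the `validate` rules are all assumptions made for illustration. The theme names come from the examples given above, and the values in the usage block are placeholders, not real measurements.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

# Physical themes mentioned in the text; a real schema might define more.
ALLOWED_THEMES = {"thermodynamics", "ice dynamics", "snow-ice interactions"}


@dataclass(frozen=True)
class BenchmarkRecord:
    """One observation in a hypothetical unified process-benchmark format."""
    campaign: str
    theme: str
    variable: str
    units: str
    time_utc: str          # ISO 8601 string, e.g. "2020-01-15T12:00:00"
    latitude: float        # degrees north
    longitude: float       # degrees east
    value: float
    uncertainty: Optional[float] = None  # same units as value, if known


def validate(record: BenchmarkRecord) -> List[str]:
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []
    if record.theme not in ALLOWED_THEMES:
        problems.append(f"unknown theme: {record.theme!r}")
    if not record.units:
        problems.append("units are missing")
    if not -90.0 <= record.latitude <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= record.longitude <= 180.0:
        problems.append("longitude out of range")
    try:
        datetime.fromisoformat(record.time_utc)
    except ValueError:
        problems.append("time_utc is not ISO 8601")
    if record.uncertainty is not None and record.uncertainty < 0:
        problems.append("uncertainty is negative")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    rec = BenchmarkRecord("EXAMPLE-CAMPAIGN", "thermodynamics", "snow_depth",
                          "m", "2020-01-15T12:00:00", -72.0, -110.0, 0.25, 0.05)
    print(validate(rec))
```

Returning a list of problems rather than raising on the first one fits the "collection, not perfection" spirit: imperfect records can still be collected and flagged instead of rejected outright.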

Sea ice thickness emerged as a major data gap. Existing products from satellites, models, and field campaigns are fragmented and inconsistently formatted, and real-time thickness data are particularly limited. The Antarctic presents particular challenges: retrieval algorithms tuned for the Arctic don't transfer well to the Southern Ocean, where complex snow layers make thickness estimates much more uncertain.

A diverse future for AI sea ice prediction

Participants were clear that the future of AI sea ice prediction is not a single large model that does everything. User needs are simply too varied: a search-and-rescue team needs high-resolution, local forecasts hours ahead, while a national ice service needs regional predictions days to weeks out, and climate services need seasonal outlooks at the basin scale. The consensus leaned toward a diverse ecosystem of tailored AI systems, with ORCAS providing the common data and evaluation frameworks within which they can all be compared and tested.

*User needs for sea ice forecasts are diverse, spanning different variables, spatial scales and time frames. This heterogeneity calls for a diverse ecosystem of tailored prediction systems.*

There was excitement about emerging approaches that could help bridge the gap between sparse polar observations and the gridded fields that AI systems typically need. Graph neural networks, for example, can learn to map irregular, heterogeneous observations, e.g., buoys, campaign transects, satellite swaths, onto gridded sea ice fields, without requiring everything to be blended into a continuous product first. Direct Observation Prediction, which trains AI to predict future observations (like brightness temperature) rather than processed sea ice retrievals, was discussed as a promising complement to reanalysis-based approaches.
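The graph idea above can be illustrated without any machine learning library. The sketch below builds edges from irregular observations to nearby grid nodes, which is the kind of structure a graph neural network would take as input, and then performs one aggregation step. It is a simplification under stated assumptions: a fixed inverse-distance weighting stands in for the learned message passing of a real graph neural network, and the function names, the connection radius, and the `eps_km` softening term are all hypothetical choices made for this example.

```python
import math
from typing import List, Optional, Tuple

EARTH_RADIUS_KM = 6371.0

Observation = Tuple[float, float, float]  # (lat, lon, value), e.g. a buoy thickness
GridNode = Tuple[float, float]            # (lat, lon)
Edge = Tuple[int, int, float]             # (obs index, grid index, distance in km)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def build_edges(obs: List[Observation], grid: List[GridNode], radius_km: float) -> List[Edge]:
    """Connect each observation to every grid node within radius_km."""
    edges = []
    for i, (olat, olon, _) in enumerate(obs):
        for j, (glat, glon) in enumerate(grid):
            d = haversine_km(olat, olon, glat, glon)
            if d <= radius_km:
                edges.append((i, j, d))
    return edges


def aggregate(obs: List[Observation], n_grid: int, edges: List[Edge],
              eps_km: float = 1.0) -> List[Optional[float]]:
    """One fixed-weight aggregation step: inverse-distance mean of connected observations.

    Grid nodes with no connected observation get None instead of a blended value.
    """
    sums = [0.0] * n_grid
    weights = [0.0] * n_grid
    for i, j, d in edges:
        w = 1.0 / (d + eps_km)
        sums[j] += w * obs[i][2]
        weights[j] += w
    return [s / w if w > 0 else None for s, w in zip(sums, weights)]


if __name__ == "__main__":
    # Placeholder coordinates and values for illustration only.
    observations = [(80.0, 0.0, 1.0), (80.0, 0.0, 2.0)]
    grid_nodes = [(80.0, 0.0), (70.0, 0.0)]
    graph = build_edges(observations, grid_nodes, radius_km=100.0)
    print(aggregate(observations, len(grid_nodes), graph))
```

Note that unobserved grid nodes stay empty rather than being filled by interpolation. That mirrors the point made above: the observations keep their irregular form, and nothing has to be blended into a continuous product before the graph is built.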

Next steps and community engagement

The workshop ended with a clear sense of direction and a list of next steps. A community mailing list is being established to keep the wider registered community connected. Regular online check-ins, roughly every six months, will share progress and new developments. A PCAPS webinar series will showcase new models, datasets, and field campaign results. Over the longer term, ORCAS will build a web-based guidance platform: a community hub for tools, datasets, tutorials, and a contributor registry, intended to outlast the formal working group.

Above all, the workshop demonstrated that there is a globally distributed, early-career-led, interdisciplinary community ready to work together on these questions.

ORCAS (SCOR WG 173 / PCAPS Task Team) is co-chaired by Clare Eayrs (NYU), Lorenzo Zampieri (ECMWF), and Malte Müller (Norwegian Meteorological Institute). The workshop report and session recordings are available via the ORCAS mailing list. To get involved, please sign up for the mailing list.

ORCAS community workshop highlights


Polar Coupled Analysis and Prediction for Services
