2026.05.15 / School Project / CS675 / Group project / Computer vision

Adapting Vision GNN for SAR Iceberg Imagery Classification

A group project for CS675 in which we explored Vision Graph Neural Networks (ViG) for patch-level classification on Antarctic SAR imagery.

Abstract

A better understanding of iceberg and glacier processes (calving, drifting, fragmentation, and melting) is essential to climate science, to better climate models, and to the study of iceberg dynamics. These processes serve as indicators of global temperature shifts, sea level rise, and ecosystem change.

Synthetic Aperture Radar (SAR) has become a widely used tool for monitoring these phenomena due to its ability to capture high-resolution images regardless of weather or lighting conditions. Recent advances in deep learning have enabled automated analysis of SAR imagery, with Convolutional Neural Networks (CNNs) commonly used for detection and classification. However, CNNs are inherently limited by their grid-based structure, which may not fully capture the irregular spatial relationships present in SAR imagery.

Vision Graph Neural Networks (ViG) provide an alternative by representing images as graphs of interconnected patches, allowing more flexible modeling of spatial dependencies. In this work, we investigate whether ViG can effectively perform patch-level classification on Antarctic SAR imagery and create a foundation for future work in understanding iceberg dynamics.

SARViG repository ↗

Training strategies

We compared two ways to adapt the same ImageNet-pretrained ViG model for dense patch-level classification. Both use the same architecture and produce the same grid of per-cell class predictions; only the training pipeline differs. The Baseline fine-tunes directly on labeled SAR patches with cross-entropy loss, learning SAR-specific features from supervision alone. The Deluxe strategy first adds a self-supervised pretraining stage on unlabeled SAR data, using a pixel reconstruction objective (MSE) to nudge the model toward SAR texture and intensity patterns, and then runs the same supervised fine-tuning step. The idea is to separate representation learning from class discrimination. The caveat is that reconstruction can favor pixel fidelity over semantic separation, which may hurt when classes such as ice and ocean look visually similar.
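The two strategies differ only in which losses are applied and in what order. The sketch below illustrates that difference in plain Python and is not taken from the SARViG code: the function names (`mse_loss`, `grid_cross_entropy`, `training_plan`), the nested-list representation, and the two-class toy grid are hypothetical choices for illustration. A real implementation would operate on tensors in a deep learning framework.

```python
import math


def mse_loss(pred, target):
    """Mean squared error between two equally shaped 2-D pixel grids."""
    count = sum(len(row) for row in target)
    total = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    return total / count


def cell_cross_entropy(logits, label):
    """Cross-entropy for one grid cell: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def grid_cross_entropy(logit_grid, label_grid):
    """Average the per-cell cross-entropy over an H x W grid of predictions."""
    losses = [
        cell_cross_entropy(cell_logits, cell_label)
        for logit_row, label_row in zip(logit_grid, label_grid)
        for cell_logits, cell_label in zip(logit_row, label_row)
    ]
    return sum(losses) / len(losses)


def training_plan(strategy):
    """Return the ordered (stage, loss name) pairs for a strategy."""
    finetune = ("supervised fine-tuning on labeled SAR patches", "cross_entropy")
    if strategy == "baseline":
        return [finetune]
    if strategy == "deluxe":
        pretrain = ("self-supervised reconstruction on unlabeled SAR", "mse")
        return [pretrain, finetune]
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    # Toy 1 x 2 grid with two hypothetical classes (0 = ice, 1 = ocean).
    logits = [[[2.0, 0.5], [0.1, 1.7]]]
    labels = [[0, 1]]
    print("fine-tuning loss:", grid_cross_entropy(logits, labels))
    print("reconstruction loss:", mse_loss([[0.2, 0.9]], [[0.0, 1.0]]))
    for name in ("baseline", "deluxe"):
        print(name, "->", [loss for _, loss in training_plan(name)])
```

Under these assumptions, both plans end with the same cross-entropy stage, so any difference in final accuracy comes from the reconstruction stage that the Deluxe plan runs first.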

Report

Slides
