POLAR: A Portrait OLAT Dataset and Generative Framework for Illumination-Aware Face Modeling

Zhuo Chen1,2†, Chengqun Yang1†, Zhuo Su2*, Zheng Lv2, Jingnan Gao1, Xiaoyuan Zhang2, Xiaokang Yang1, Yichao Yan1*
1MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University, 2PICO.
( † denotes equal contribution, * denotes corresponding author. )

POLAR captures high-resolution OLAT facial data with diverse subjects and expressions, from which we synthesize large-scale HDR-relit portraits. POLARNet further learns to generate per-light OLAT responses from a single portrait, enabling scalable and physically consistent relighting under arbitrary HDR environments.

Abstract


Face relighting aims to synthesize realistic portraits under novel illumination while preserving identity and geometry. However, progress remains constrained by the limited availability of large-scale, physically consistent illumination data.

To address this, we introduce POLAR, a large-scale and physically calibrated One-Light-at-a-Time (OLAT) dataset containing over 200 subjects captured under 156 lighting directions, multiple views, and diverse expressions. Building upon POLAR, we develop POLARNet, a flow-based generative model that predicts per-light OLAT responses from a single portrait, capturing fine-grained, direction-aware illumination effects while preserving facial identity.

Unlike diffusion-based or background-conditioned methods that rely on statistical or contextual cues, our formulation models illumination as a continuous, physically interpretable transformation between lighting states, enabling scalable and controllable relighting. Together, POLAR and POLARNet form a unified illumination learning framework that links real data, generative synthesis, and physically grounded relighting, establishing a self-sustaining "chicken-and-egg" cycle for scalable and reproducible portrait illumination.

Method


Given a uniformly lit portrait, the encoder–decoder pair \( (\mathbf{E},\mathbf{D}) \) maps both the input and its target OLAT image into latent space. Latent Bridge Matching learns a continuous, direction-conditioned transport between these endpoints, supervised by the velocity field loss \( \mathcal{L}_{\mathrm{LBM}} \). A conditional U-Net predicts the latent drift \( v_{\theta}(z_t, t, c_{\text{dir}}) \), conditioned on the encoded light direction. During inference, a single forward step transports the latent \( z_u \) toward the illumination-specific latent \( z_l \), enabling efficient generation of per-light OLAT responses for all calibrated directions. These synthesized OLATs can be linearly composed to render realistic relighting under arbitrary HDR environments.
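To make the pipeline concrete, the PyTorch-style sketch below illustrates how a velocity-matching training step, the one-step latent transport at inference, and the linear OLAT composition could be implemented. The module names (encoder, decoder, velocity_unet, embed_direction), the linear-bridge parameterization, and the MSE form of \( \mathcal{L}_{\mathrm{LBM}} \) are illustrative assumptions for exposition, not the released implementation.

```python
# Hypothetical sketch of Latent Bridge Matching for per-light OLAT generation.
# All module names and hyperparameters are assumptions, not the authors' code.
import torch
import torch.nn.functional as F

def lbm_training_step(encoder, velocity_unet, embed_direction,
                      uniform_img, olat_img, light_dir):
    """One velocity-matching step between a uniformly lit portrait and its OLAT target."""
    z_u = encoder(uniform_img)           # latent of the uniformly lit input
    z_l = encoder(olat_img)              # latent of the target OLAT image
    t = torch.rand(z_u.shape[0], device=z_u.device).view(-1, 1, 1, 1)
    z_t = (1 - t) * z_u + t * z_l        # point on the linear bridge between endpoints
    target_velocity = z_l - z_u          # constant drift of the linear bridge
    c_dir = embed_direction(light_dir)   # encoded light direction as conditioning
    pred_velocity = velocity_unet(z_t, t.flatten(), c_dir)
    return F.mse_loss(pred_velocity, target_velocity)  # assumed form of L_LBM

@torch.no_grad()
def single_step_olat(encoder, decoder, velocity_unet, embed_direction,
                     uniform_img, light_dir):
    """Single-step transport z_u -> z_l, then decode the per-light OLAT image."""
    z_u = encoder(uniform_img)
    t0 = torch.zeros(z_u.shape[0], device=z_u.device)
    v = velocity_unet(z_u, t0, embed_direction(light_dir))
    z_l = z_u + v                        # one Euler step over the full interval
    return decoder(z_l)

def hdr_relight(olat_stack, weights):
    """Linearly compose N OLAT renders with per-direction HDR weights."""
    # olat_stack: (N, 3, H, W) linear-radiance OLAT images for the calibrated lights;
    # weights: (N, 3) RGB weights, assumed precomputed by integrating the HDR map
    # over each light's solid angle.
    return torch.einsum('nchw,nc->chw', olat_stack, weights)
```

Because illumination is linear in light intensity, summing the generated OLAT responses weighted by the HDR environment sampled at the calibrated directions yields the relit portrait in a single pass.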