Monocular surface normal estimators trained on large-scale RGB-normal data often perform poorly in the edge cases of reflective, textureless, and dark surfaces. Polarization encodes surface orientation independently of texture and albedo, offering a physics-based complement for these cases. Existing polarization methods, however, require multi-view capture or specialized training data, limiting generalization. We introduce Poppy, a training-free framework that refines normals from any frozen RGB backbone using single-shot polarization measurements at test time. Keeping backbone weights frozen, Poppy optimizes per-pixel offsets to the input RGB and output normal along with a learned reflectance decomposition. A differentiable rendering layer converts the refined normals into polarization predictions and penalizes mismatches with the observed signal. Across seven benchmarks and three backbone architectures (diffusion, flow, and feed-forward), Poppy reduces mean angular error by 23–26% on synthetic data and 6–16% on real data. These results show that guiding learned RGB-based normal estimators with polarization cues at test time refines normals on challenging surfaces without retraining.
Overview of the Poppy framework. Given a single RGB-polarization image pair, Poppy refines the normals from a frozen RGB backbone by optimizing input/output offsets and a reflectance decomposition, supervised by a differentiable polarization rendering loss.
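The refinement loop described above can be illustrated with a deliberately small sketch. It is not Poppy's implementation: it assumes a single pixel, a hypothetical `frozen_backbone` that returns only a normal azimuth, and a toy diffuse polarization model in which the angle of linear polarization (AoLP) equals that azimuth modulo π. It optimizes only an output offset, whereas Poppy also optimizes an input offset and a reflectance decomposition. All function names and the loss form are assumptions made for illustration.

```python
import math


def frozen_backbone(pixel_rgb):
    """Hypothetical stand-in for a frozen RGB normal estimator.

    Returns a fixed normal azimuth in radians; nothing here is ever updated.
    """
    return 0.5


def render_aolp(azimuth):
    """Toy diffuse polarization model (assumption): AoLP = azimuth mod pi."""
    return azimuth % math.pi


def polarization_loss(pred_aolp, obs_aolp):
    """Pi-periodic mismatch between predicted and observed AoLP."""
    return 1.0 - math.cos(2.0 * (pred_aolp - obs_aolp))


def refine_azimuth(pixel_rgb, obs_aolp, steps=200, lr=0.1):
    """Gradient descent on an output offset while the backbone stays frozen."""
    base = frozen_backbone(pixel_rgb)  # computed once, never changed
    offset = 0.0                       # the only optimized variable
    for _ in range(steps):
        pred = render_aolp(base + offset)
        # Analytic derivative of polarization_loss with respect to the offset.
        grad = 2.0 * math.sin(2.0 * (pred - obs_aolp))
        offset -= lr * grad
    return base + offset
```

Because the optimization starts from the backbone's prediction, the π-ambiguity of the AoLP resolves toward the solution nearest to that prediction in this toy setting.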
Qualitative comparison on real-world scenes from NeRSP, PISR, and SfPUEL, one scene per row. Input: the RGB image and its paired polarization image. Normal: MoGe-2 surface normals compared with Poppy-refined normals. Error Map: per-pixel angular error relative to ground truth (color range 0–50°); lower is better. SfPUEL image brightness is enhanced by ×4 for visualization.
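For reference, the angular error shown in the error maps is the angle between a predicted and a ground-truth normal. The sketch below uses the standard definition (arccosine of the dot product of the normalized vectors); the function names are illustrative, and it assumes normals are given as non-zero 3-vectors.

```python
import math


def angular_error_deg(n_pred, n_gt):
    """Angle in degrees between two non-zero 3D normals."""
    def normalize(v):
        norm = math.sqrt(sum(c * c for c in v))
        return [c / norm for c in v]

    a, b = normalize(n_pred), normalize(n_gt)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(dot))


def mean_angular_error_deg(pred_normals, gt_normals):
    """Mean angular error over paired lists of normals."""
    errors = [angular_error_deg(p, g) for p, g in zip(pred_normals, gt_normals)]
    return sum(errors) / len(errors)
```

A relative reduction in mean angular error is then `(mae_before - mae_after) / mae_before`.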
Radiance decomposition applications on real-world scenes from PANDORA and NeISF. Radiance Decomposition: compares the specular (Ls) and diffuse (Ld) components. Polarization Decomposition: compares the polarization specular and diffuse components. Metallic Appearance: isolated specular component with its hue shifted to achieve a metallic appearance. Recoloring: object recolored via the diffuse component. Ls brightness is enhanced by ×2 and polarization diffuse brightness is enhanced by ×4 for visualization.
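The recoloring and metallic-appearance edits can be sketched per pixel as a hue shift applied to one component. The sketch assumes the radiance is the additive sum of the diffuse and specular components (a common dichromatic assumption, not stated in the caption) and that pixel values lie in [0, 1]. The function names are hypothetical.

```python
import colorsys


def shift_hue(rgb, hue_shift):
    """Rotate the hue of an RGB triple in [0, 1]; hue_shift is a fraction of a full turn."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb((h + hue_shift) % 1.0, s, v)


def recolor_pixel(diffuse_rgb, specular_rgb, hue_shift):
    """Shift the diffuse hue, then add the untouched specular component back.

    Assumes additive composition (diffuse + specular), clipped to 1.0 for display.
    """
    new_diffuse = shift_hue(diffuse_rgb, hue_shift)
    return tuple(min(1.0, d + s) for d, s in zip(new_diffuse, specular_rgb))
```

Applying `shift_hue` to the specular component instead of the diffuse one corresponds to the hue-shifted specular edit; the specific hue values used for the figures are not given here.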
3D mesh reconstruction by VCR-GauS on the FROG scene from the NeRSP dataset. Normal: compares MoGe-2 surface normals with Poppy-refined normals. 3D Mesh Error Map: reconstructed 3D mesh colored by signed distance.
BibTeX will be posted upon publication.