Poppy: Polarization-Based Plug-and-Play Guidance for Enhancing Surface Normal Estimation

Stony Brook University  ·  Photon Intelligence Lab

Abstract

Monocular surface normal estimators trained on large-scale RGB-normal data often perform poorly in the edge cases of reflective, textureless, and dark surfaces. Polarization encodes surface orientation independently of texture and albedo, offering a physics-based complement for these cases. Existing polarization methods, however, require multi-view capture or specialized training data, limiting generalization. We introduce Poppy, a training-free framework that refines normals from any frozen RGB backbone using single-shot polarization measurements at test time. Keeping backbone weights frozen, Poppy optimizes per-pixel offsets to the input RGB and output normal along with a learned reflectance decomposition. A differentiable rendering layer converts the refined normals into polarization predictions and penalizes mismatches with the observed signal. Across seven benchmarks and three backbone architectures (diffusion, flow, and feed-forward), Poppy reduces mean angular error by 23–26% on synthetic data and 6–16% on real data. These results show that guiding learned RGB-based normal estimators with polarization cues at test time refines normals on challenging surfaces without retraining.

Method

[Figure: Poppy pipeline]

Overview of the Poppy framework. Given a single RGB-polarization image pair, Poppy refines the normals from a frozen RGB backbone by optimizing input/output offsets and a reflectance decomposition, supervised by a differentiable polarization rendering loss.
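To make the idea of a polarization rendering loss concrete, here is a minimal per-pixel sketch. It is not Poppy's rendering layer: it assumes a diffuse-dominant dielectric surface, an orthographic camera viewing along the z-axis, and a refractive index of 1.5, and the function names `diffuse_polarization` and `polarization_loss` are hypothetical. It uses the standard diffuse Fresnel model to map a normal to a degree and angle of linear polarization (DoLP, AoLP), then compares prediction and observation in a form that is insensitive to the π-ambiguity of the AoLP.

```python
import math

def diffuse_polarization(normal, ior=1.5):
    """Predict (DoLP, AoLP) for one pixel from a camera-facing normal.

    Assumptions (not taken from the paper): orthographic view along +z,
    diffuse-dominant reflection, dielectric with refractive index `ior`.
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    nx, ny, nz = nx / length, ny / length, nz / length
    # Zenith angle between the normal and the viewing direction.
    cos_t = max(-1.0, min(1.0, nz))
    sin2_t = 1.0 - cos_t * cos_t
    # Diffuse degree of linear polarization (Fresnel transmission model).
    num = (ior - 1.0 / ior) ** 2 * sin2_t
    den = (2.0 + 2.0 * ior ** 2
           - (ior + 1.0 / ior) ** 2 * sin2_t
           + 4.0 * cos_t * math.sqrt(ior ** 2 - sin2_t))
    dolp = num / den
    # Diffuse AoLP follows the azimuth of the normal, modulo pi.
    aolp = math.atan2(ny, nx) % math.pi
    return dolp, aolp

def polarization_loss(pred, obs):
    """Squared mismatch between two (DoLP, AoLP) pairs.

    Comparing DoLP * (cos 2*phi, sin 2*phi) makes phi and phi + pi equivalent.
    """
    dolp_p, aolp_p = pred
    dolp_o, aolp_o = obs
    dq = dolp_p * math.cos(2 * aolp_p) - dolp_o * math.cos(2 * aolp_o)
    du = dolp_p * math.sin(2 * aolp_p) - dolp_o * math.sin(2 * aolp_o)
    return dq * dq + du * du
```

In a test-time refinement loop, a loss of this kind would be summed over pixels and minimized with respect to the optimized offsets while the backbone stays frozen; a practical version would be vectorized and written in an autodiff framework rather than with scalar `math` calls.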

Results

[Figure: qualitative results on NeRSP, PISR, and SfPUEL. Panels: Input (Image, Polarization), Normal (MoGe-2, MoGe-2+Poppy), Error Map (MoGe-2, MoGe-2+Poppy).]

Each row shows a qualitative comparison on one real-world scene, drawn from NeRSP, PISR, or SfPUEL. Input: compares the RGB image with its paired polarization image. Normal: compares MoGe-2 surface normals with Poppy-refined normals. Error Map: angular error relative to ground truth (range 0–50°); lower is better. SfPUEL image brightness is enhanced by ×4 for visualization.
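For readers unfamiliar with the metric, the angular error in these maps is the angle between the predicted and ground-truth normal at each pixel. The sketch below shows that computation for individual vectors; the function names are hypothetical, and the 50° display cap is taken from the caption's stated range rather than from any released code.

```python
import math

def angular_error_deg(pred, gt):
    """Angle in degrees between two 3-vectors (normalized internally)."""
    dot = sum(p * g for p, g in zip(pred, gt))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(g * g for g in gt))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

def mean_angular_error_deg(preds, gts):
    """Mean angular error over paired lists of normals."""
    errors = [angular_error_deg(p, g) for p, g in zip(preds, gts)]
    return sum(errors) / len(errors)

def clip_for_display(error_deg, max_deg=50.0):
    """Cap an error value at the top of the color scale."""
    return min(error_deg, max_deg)
```

Averaging `angular_error_deg` over all valid pixels gives the mean angular error reported in the abstract's percentage reductions.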

Radiance Decomposition Application

[Figure: radiance decomposition on NeISF and PANDORA. Panels: Input (Image, Polarization), Radiance Decomposition (Ls, Ld), Polarization Decomposition (Specular, Diffuse), Metallic Appearance, Recoloring.]

Radiance decomposition applications on real-world scenes from PANDORA and NeISF. Radiance Decomposition: compares the specular (Ls) and diffuse (Ld) components. Polarization Decomposition: compares the polarization specular and diffuse components. Metallic Appearance: isolated specular component with its hue shifted to achieve a metallic appearance. Recoloring: object recolored via the diffuse component. Ls brightness is enhanced by ×2 and polarization diffuse brightness is enhanced by ×4 for visualization.
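As a rough illustration of recoloring via the diffuse component, the sketch below shifts the hue of a diffuse RGB value and adds the specular value back, so highlights keep their original color. It assumes the observed radiance is the sum Ls + Ld and that values lie in [0, 1]; both are assumptions for illustration, and `recolor_pixel` is a hypothetical helper, not the procedure used to produce the figure.

```python
import colorsys

def recolor_pixel(diffuse_rgb, specular_rgb, hue_shift):
    """Shift the hue of the diffuse component, then add the specular one back.

    `hue_shift` is a fraction of a full turn (0.5 is 180 degrees).
    """
    h, s, v = colorsys.rgb_to_hsv(*diffuse_rgb)
    shifted = colorsys.hsv_to_rgb((h + hue_shift) % 1.0, s, v)
    # Recombine under the additive assumption and clamp to the valid range.
    return tuple(min(1.0, d + sp) for d, sp in zip(shifted, specular_rgb))
```

For example, `recolor_pixel((1.0, 0.0, 0.0), (0.2, 0.2, 0.2), 0.5)` turns a red diffuse value cyan while leaving a gray highlight contribution untouched. Applying the same hue shift to Ls instead of Ld would be the analogous operation for the metallic-appearance panel.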

3D Mesh Reconstruction

[Figure: 3D mesh reconstruction on NeRSP (two rows). Panels: Input (Image, Polarization), Normal (MoGe-2, MoGe-2+Poppy), 3D Mesh Error Map (MoGe-2, MoGe-2+Poppy), with colorbar.]

3D mesh reconstruction by VCR-GauS on the FROG scene from the NeRSP dataset. Normal: compares MoGe-2 surface normals with Poppy-refined normals. 3D Mesh Error Map: reconstructed 3D mesh colored by signed distance.

BibTeX

BibTeX will be posted upon publication.