SIR: Multi-view Inverse Rendering with Decomposable Shadow for Indoor Scenes

1 The Hong Kong Polytechnic University
2 Laboratory for Artificial Intelligence in Design, HKSAR
arXiv

Abstract


We propose SIR, an efficient method for decomposing differentiable shadows in inverse rendering of indoor scenes from multi-view data, addressing the challenge of accurately separating materials from lighting conditions. Unlike previous methods, which struggle with shadow fidelity in complex lighting environments, our approach explicitly learns shadows to make material estimation more realistic under unknown light positions. Taking posed HDR images as input, SIR employs an SDF-based neural radiance field to represent the scene. SIR then integrates a shadow term with a three-stage material estimation approach to improve SVBRDF quality. Specifically, SIR learns a differentiable shadow, complemented by BRDF regularization, to improve inverse rendering accuracy. Extensive experiments on both synthetic and real-world indoor scenes show that SIR outperforms existing methods in both quantitative metrics and qualitative analysis. This decomposition ability enables editing applications such as free-view relighting, object insertion, and material replacement.

Figure 1: Given a set of posed multi-view HDR images of an indoor scene, SIR successfully disentangles the scene appearance into 3D neural fields of shape, global and spatially-varying illumination, soft shadows, and SVBRDFs, which can produce convincing results for several applications such as novel view synthesis, free-viewpoint relighting, object insertion, and material replacement.

Figure 2: The pipeline consists of three phases. 1) In phase 1, we sample a ray with direction $v$ and spatial point $x$ from the given posed HDR images. The geometry network $f_d$ learns the signed distance $d$, and the HDR-radiance network $f_c$ learns radiance $C$. Ray marching is then employed to obtain the surface point $\hat{x}$. 2) In phase 2, we sample diffuse incoming light $L_{i,d}$ from environment maps $E$ for learning irradiance $I_{ir}$. We also calculate the specular incoming light $L_{i,s}$ and the pseudo hard shadow $\xi$. 3) In phase 3, hard shadow $S_{hard}$ is learned using $\Theta_h$ with pseudo ground truth. We then initialize the parameters of $\Theta_s$ using the optimized parameters of $\Theta_h$. Instance-level BRDF regularizers are applied, and the whole rendering equation is optimized to update the soft shadow $S_{soft}$, albedo $\hat{A}$, and roughness $\hat{R}$.
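
The ray-marching step in phase 1 can be illustrated with a minimal sketch. It assumes sphere tracing, one common way to march a ray against a signed distance field; the page does not say which marching scheme SIR uses. The analytic `sphere_sdf` is a hypothetical stand-in for the learned geometry network, and all function names and parameters here are illustrative only.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Analytic signed distance to a sphere (stand-in for a learned SDF network)."""
    return math.dist(p, center) - radius

def sphere_trace(origin, direction, sdf, max_steps=128, eps=1e-5, t_max=20.0):
    """March along origin + t * direction until the SDF is within eps of zero.

    Returns the surface point as a tuple, or None if the ray misses.
    """
    # Normalize the direction so that t measures Euclidean distance.
    norm = math.sqrt(sum(c * c for c in direction))
    d = tuple(c / norm for c in direction)
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * c for o, c in zip(origin, d))
        dist = sdf(p)
        if dist < eps:
            return p  # close enough to the zero level set
        # The SDF value bounds the distance to the nearest surface,
        # so stepping by it cannot overshoot.
        t += dist
        if t > t_max:
            return None  # ray left the region of interest
    return None

# A ray along +z hits the sphere; a ray along +y misses it.
hit = sphere_trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = sphere_trace((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_sdf)
```

In this toy setup `hit` is the front surface point of the sphere on the z axis and `miss` is `None`. With a learned SDF the same loop would call the network instead of `sphere_sdf`, typically batched over many rays.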

Results

Comparison with Baselines




Results on predicted albedo, roughness, and synthetic image. Despite tackling a more challenging task, SIR outperforms existing methods in decomposing material attributes such as albedo and roughness and, more notably, in shadow extraction. Our method also handles complex lighting conditions in real-world indoor scenes, which is crucial for precise and reliable material estimation in inverse rendering.


Ablation Study

Results of the ablation study evaluating the impact of the shadow term, the differentiable soft shadow, and the albedo regularizer.

Editing Applications

Results of virtual object insertion on synthetic and real-world scenes. Our method generalizes well to both and consistently produces realistic appearance and shadows.

Citation


@misc{wei2024sir,
      title={SIR: Multi-view Inverse Rendering with Decomposable Shadow for Indoor Scenes}, 
      author={Xiaokang Wei and Zhuoman Liu and Yan Luximon},
      year={2024},
      eprint={2402.06136},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Paper


SIR: Multi-view Inverse Rendering with Decomposable Shadow for Indoor Scenes

Xiaokang Wei, Zhuoman Liu, Yan Luximon
