Reliable copyright protection for 3D assets requires watermark verification across arbitrary viewpoints. However, existing evaluations rely on dataset splits or ad hoc camera sampling schemes that overlook failure cases.
We introduce Multi-Shell Viewpoint Sampling (MSVS), which ensures uniform, distance-aware coverage via concentric, visibility-bounded shells with uniform spherical sampling on each shell. MSVS exposes significantly lower bit accuracy than dataset-provided test sets, even on high-quality renderings that an adversary would prefer.
Based on this evaluation standard, we further propose a greedy subsampling strategy that selects training views guided by a locality-aware kernel. For 3D-GSW, greedy subsampling improves MSVS bit accuracy by $+0.020$ on the Blender dataset and $+0.056$ on the Stanford-ORB dataset over random selection, and the gains persist under common image attacks.
MSVS establishes a comprehensive benchmark for 3D watermark evaluation, while greedy subsampling provides an efficient strategy to enhance watermark protection.
GuardSplat cameras are generated using the official implementation's default parameters: $r = 4.031128874$, $\theta \in [-180^\circ, 180^\circ]$, $\phi = -30^\circ$ (spherical coordinates: $r$ radius, $\theta$ azimuth, $\phi$ elevation).
Neither strategy covers the viewpoint space comprehensively; in particular, the GuardSplat cameras all lie on a single circle at fixed radius and elevation.
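For illustration, here is a minimal sketch contrasting the two camera layouts, assuming a z-up convention; the 100 azimuth steps for the ring and the MSVS shell radii and per-shell view counts are illustrative placeholders, not the exact settings used in our experiments:

```python
import numpy as np

# GuardSplat's default ring: fixed radius and elevation, azimuth sweep.
# 100 steps is an illustrative count; z-up convention assumed.
r, phi = 4.031128874, np.deg2rad(-30.0)
theta = np.linspace(-np.pi, np.pi, 100, endpoint=False)
ring = np.stack([r * np.cos(phi) * np.cos(theta),
                 r * np.cos(phi) * np.sin(theta),
                 np.full_like(theta, r * np.sin(phi))], axis=1)

def fibonacci_sphere(n):
    """Near-uniform unit-sphere directions via a Fibonacci lattice."""
    i = np.arange(n)
    golden = np.pi * (3.0 - np.sqrt(5.0))        # golden angle
    z = 1.0 - 2.0 * (i + 0.5) / n                # uniform in z
    s = np.sqrt(1.0 - z * z)
    return np.stack([s * np.cos(golden * i), s * np.sin(golden * i), z], axis=1)

def msvs_viewpoints(r_min, r_max, n_shells, views_per_shell):
    """Concentric, distance-aware shells of origin-facing camera centers."""
    radii = np.linspace(r_min, r_max, n_shells)  # visibility-bounded radii
    return np.concatenate([rad * fibonacci_sphere(views_per_shell)
                           for rad in radii])

# e.g. 3 shells between assumed visibility bounds, 64 views per shell.
# The ring is a single circle; MSVS covers the full sphere at multiple
# distances.
msvs = msvs_viewpoints(r_min=2.0, r_max=6.0, n_shells=3, views_per_shell=64)
```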
In all cases, bit accuracy on the MSVS set is much lower than on the dataset-provided test set, demonstrating that MSVS exposes failure viewpoints an adversary could exploit.
**3D-GSW on Blender:**

| Method | PSNR↑ | SSIM↑ | LPIPS↓ | Bit Accuracy (Test)↑ | Bit Accuracy (MSVS)↑ |
|---|---|---|---|---|---|
| 3D-GSW | 34.77 | 0.981 | 0.023 | 0.951 | 0.764 |
| + Random | 32.59 | 0.971 | 0.032 | 0.975 | 0.832 |
| + Greedy (Ours) | 32.37 | 0.970 | 0.033 | 0.975 | 0.852 |
**3D-GSW on Stanford-ORB:**

| Method | PSNR↑ | SSIM↑ | LPIPS↓ | Bit Accuracy (Test)↑ | Bit Accuracy (MSVS)↑ |
|---|---|---|---|---|---|
| 3D-GSW | 35.75 | 0.976 | 0.027 | 0.996 | 0.828 |
| + Random | 35.51 | 0.974 | 0.026 | 1.000 | 0.851 |
| + Greedy (Ours) | 34.99 | 0.972 | 0.029 | 1.000 | 0.907 |
The proposed greedy subsampling strategy improves bit accuracy on the MSVS set over random selection, markedly so for 3D-GSW, strengthening protection precisely on the viewpoints an adversary would exploit.
**GuardSplat on Blender:**

| Method | PSNR↑ | SSIM↑ | LPIPS↓ | Bit Accuracy (Test)↑ | Bit Accuracy (MSVS)↑ |
|---|---|---|---|---|---|
| GuardSplat | 27.15 | 0.931 | 0.054 | 0.653 | 0.557 |
| + Random | 26.91 | 0.927 | 0.058 | 0.706 | 0.599 |
| + Greedy (Ours) | 26.92 | 0.927 | 0.058 | 0.705 | 0.601 |
**GuardSplat on Stanford-ORB:**

| Method | PSNR↑ | SSIM↑ | LPIPS↓ | Bit Accuracy (Test)↑ | Bit Accuracy (MSVS)↑ |
|---|---|---|---|---|---|
| GuardSplat | 36.19 | 0.978 | 0.024 | 0.665 | 0.578 |
| + Random | 35.31 | 0.976 | 0.027 | 0.711 | 0.647 |
| + Greedy (Ours) | 35.34 | 0.976 | 0.027 | 0.711 | 0.650 |
**GaussianMarker on Blender:**

| Method | PSNR↑ | SSIM↑ | LPIPS↓ | Bit Accuracy (Test)↑ | Bit Accuracy (MSVS)↑ |
|---|---|---|---|---|---|
| GaussianMarker | 27.43 | 0.922 | 0.070 | 0.845 | 0.652 |
| + Random | 27.19 | 0.917 | 0.077 | 0.884 | 0.686 |
| + Greedy (Ours) | 27.20 | 0.918 | 0.077 | 0.882 | 0.691 |
**GaussianMarker on Stanford-ORB:**

| Method | PSNR↑ | SSIM↑ | LPIPS↓ | Bit Accuracy (Test)↑ | Bit Accuracy (MSVS)↑ |
|---|---|---|---|---|---|
| GaussianMarker | 35.68 | 0.970 | 0.038 | 0.803 | 0.623 |
| + Random | 35.02 | 0.965 | 0.043 | 0.848 | 0.668 |
| + Greedy (Ours) | 35.11 | 0.966 | 0.042 | 0.844 | 0.669 |
The gains in bit accuracy from greedy subsampling persist under common image attacks.
**3D-GSW on Blender (MSVS bit accuracy under attack):**

| Method | None | Noise ($\sigma = 0.1$) | JPEG ($50\%$) | Scaling ($75\%$) | Blur ($\sigma = 0.1$) |
|---|---|---|---|---|---|
| 3D-GSW | 0.764 | 0.631 | 0.726 | 0.727 | 0.764 |
| + Random | 0.832 | 0.678 | 0.789 | 0.794 | 0.832 |
| + Greedy (Ours) | 0.852 | 0.685 | 0.806 | 0.816 | 0.852 |
**3D-GSW on Stanford-ORB (MSVS bit accuracy under attack):**

| Method | None | Noise ($\sigma = 0.1$) | JPEG ($50\%$) | Scaling ($75\%$) | Blur ($\sigma = 0.1$) |
|---|---|---|---|---|---|
| 3D-GSW | 0.828 | 0.634 | 0.810 | 0.842 | 0.828 |
| + Random | 0.851 | 0.665 | 0.831 | 0.861 | 0.851 |
| + Greedy (Ours) | 0.907 | 0.688 | 0.888 | 0.914 | 0.907 |
**GuardSplat on Blender (MSVS bit accuracy under attack):**

| Method | None | Noise ($\sigma = 0.1$) | JPEG ($50\%$) | Scaling ($75\%$) | Blur ($\sigma = 0.1$) |
|---|---|---|---|---|---|
| GuardSplat | 0.557 | 0.574 | 0.562 | 0.558 | 0.557 |
| + Random | 0.599 | 0.598 | 0.599 | 0.601 | 0.599 |
| + Greedy (Ours) | 0.601 | 0.597 | 0.600 | 0.602 | 0.601 |
**GuardSplat on Stanford-ORB (MSVS bit accuracy under attack):**

| Method | None | Noise ($\sigma = 0.1$) | JPEG ($50\%$) | Scaling ($75\%$) | Blur ($\sigma = 0.1$) |
|---|---|---|---|---|---|
| GuardSplat | 0.578 | 0.585 | 0.583 | 0.578 | 0.578 |
| + Random | 0.647 | 0.616 | 0.636 | 0.648 | 0.647 |
| + Greedy (Ours) | 0.650 | 0.616 | 0.639 | 0.651 | 0.650 |
**GaussianMarker on Blender (MSVS bit accuracy under attack):**

| Method | None | Noise ($\sigma = 0.1$) | JPEG ($50\%$) | Scaling ($75\%$) | Blur ($\sigma = 0.1$) |
|---|---|---|---|---|---|
| GaussianMarker | 0.652 | 0.509 | 0.555 | 0.616 | 0.652 |
| + Random | 0.686 | 0.511 | 0.573 | 0.644 | 0.686 |
| + Greedy (Ours) | 0.691 | 0.511 | 0.575 | 0.649 | 0.691 |
**GaussianMarker on Stanford-ORB (MSVS bit accuracy under attack):**

| Method | None | Noise ($\sigma = 0.1$) | JPEG ($50\%$) | Scaling ($75\%$) | Blur ($\sigma = 0.1$) |
|---|---|---|---|---|---|
| GaussianMarker | 0.623 | 0.499 | 0.541 | 0.625 | 0.623 |
| + Random | 0.668 | 0.501 | 0.568 | 0.663 | 0.668 |
| + Greedy (Ours) | 0.669 | 0.501 | 0.567 | 0.666 | 0.669 |
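For reference, here is a minimal sketch of the attack pipeline implied by the column headers, applied to each MSVS rendering before watermark decoding; it assumes 8-bit RGB images in $[0, 1]$ and treats the blur $\sigma$ as the PIL Gaussian radius (both assumptions):

```python
import io
import numpy as np
from PIL import Image, ImageFilter

def attack(img, kind):
    """Apply one of the evaluated distortions to a rendering in [0, 1]."""
    if kind == "none":                           # no attack
        return img
    if kind == "noise":                          # additive Gaussian, sigma = 0.1
        return np.clip(img + np.random.normal(0.0, 0.1, img.shape), 0.0, 1.0)
    pil = Image.fromarray((img * 255).astype(np.uint8))
    if kind == "jpeg":                           # JPEG compression, quality 50
        buf = io.BytesIO()
        pil.save(buf, format="JPEG", quality=50)
        buf.seek(0)
        pil = Image.open(buf)
    elif kind == "scale":                        # downscale to 75%, restore size
        w, h = pil.size
        pil = pil.resize((int(w * 0.75), int(h * 0.75))).resize((w, h))
    elif kind == "blur":                         # Gaussian blur, sigma = 0.1
        pil = pil.filter(ImageFilter.GaussianBlur(radius=0.1))
    return np.asarray(pil, dtype=np.float32) / 255.0
```

Note that a Gaussian blur with $\sigma = 0.1$ is nearly an identity operation, which is consistent with the blur column matching the no-attack column in every table.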
GuardSplat attains very high training-set bit accuracy but much lower test-set accuracy, indicating limited spatial generalization of its watermark coverage. Adding a new training viewpoint therefore mainly boosts accuracy at that viewpoint, with little benefit transferring to nearby views. As a result, most MSVS viewpoints offer similar marginal gains, and the choice of subsampling strategy matters little.
GaussianMarker exhibits substantially lower training-set bit accuracy than the other methods, indicating that it sometimes fails to embed the watermark even at training viewpoints. In such cases, adding a viewpoint does not reliably improve bit accuracy, which violates the monotonic-improvement assumption of our kernel model, namely that adding a view multiplies $b_S(x)$ by $k(d) \ge 1$. This mismatch reduces the advantage of the greedy algorithm.
By contrast, 3D-GSW trains reliably and generalizes better across neighboring views, so greedy subsampling yields large improvements on MSVS.
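To make the kernel model concrete, here is a minimal sketch of the greedy selection under the multiplicative assumption above; the exponential kernel shape and the hyperparameters `b0`, `alpha`, and `tau` are illustrative placeholders, not our exact settings:

```python
import numpy as np

def greedy_subsample(views, n_select, b0=0.6, alpha=0.5, tau=1.0):
    """Greedy view selection under the locality-aware kernel model.

    Adding view v is modeled as multiplying the predicted bit accuracy
    b_S(x) at every viewpoint x by k(d(x, v)) >= 1, capped at 1, where
    k decays toward 1 with distance (the monotonic-improvement
    assumption discussed above).
    """
    def k(d):
        return 1.0 + alpha * np.exp(-d / tau)             # k(d) >= 1

    d = np.linalg.norm(views[:, None] - views[None, :], axis=-1)
    b = np.full(len(views), b0)                           # predicted accuracy
    selected = []
    for _ in range(n_select):
        # total predicted accuracy after hypothetically adding each view v
        gains = np.minimum(1.0, b[:, None] * k(d)).sum(axis=0) - b.sum()
        gains[selected] = -np.inf                         # no repeats
        v = int(np.argmax(gains))
        selected.append(v)
        b = np.minimum(1.0, b * k(d[:, v]))               # commit the update
    return selected

# e.g. pick 10 extra training views from the MSVS candidates sampled above
# extra_idx = greedy_subsample(msvs, n_select=10)
```

When the multiplicative model holds, each added view helps and the cap at $1$ yields diminishing returns; when training fails to embed the watermark at a chosen view, the assumed $k(d) \ge 1$ gain never materializes, which is the GaussianMarker failure mode described above.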
We also show example renderings with low bit accuracy from MSVS viewpoints; each is produced by a model trained on the original training set (no extra MSVS views were used for training).
The rendering quality of low-texture objects is only moderate, which points to optimization difficulty and may partly explain the low bit accuracy. However, even well-textured objects rendered at high quality can exhibit low bit accuracy. This suggests that existing methods, trained on splits curated for novel view synthesis, generalize well in image quality at unseen views but do not achieve equally broad watermark coverage.