# Depth Estimation → Geometry

**Track:** Creative ML & AI-in-the-Loop / Advanced Creative Coding / proposed (50)

**Framework / surface:** three.js

**Level:** Hard

**Prerequisites:** Run a Model Client-Side, Procedural Vertex Deformation

**In one line:** Monocular depth turning a webcam image into 3D displacement.

## Theory, aesthetics & inspiration

Monocular depth estimation asks a network to infer what one eye cannot measure: per-pixel distance from a single flat image. Depth Anything V2, trained on large amounts of synthetic and pseudo-labeled data, returns a dense, stable depth map; read as a heightfield, it displaces a mesh so that a photograph becomes relief. The aesthetic territory is 2.5D (parallax, extrusion, a portrait pushed toward sculpture), where the depth the camera discards is hallucinated back as form. The approach descends from MiDaS (Ranftl and colleagues at Intel), which established robust cross-dataset depth estimation. Coupled with WebGL displacement, an ordinary webcam frame becomes navigable terrain, lit and rotated like geometry rather than viewed like an image.
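To make the "depth map read as a heightfield" idea concrete, here is a minimal sketch of the displacement step on its own, with no model or renderer attached. The function name `depthToPositions` and the `scale` parameter are hypothetical, introduced only for illustration. It assumes the depth map arrives as a row-major `Float32Array` in which larger values mean closer to the camera; if your model reports the opposite, invert the normalized value.

```typescript
// Hypothetical helper: turn a row-major depth map into displaced grid vertices.
// Assumption: larger depth values mean "closer", so they push the surface
// toward the viewer along +z. Flip the sign if your model does the opposite.
function depthToPositions(
  depth: Float32Array,
  width: number,
  height: number,
  scale: number = 0.5 // assumed displacement strength, in world units
): Float32Array {
  if (depth.length !== width * height) {
    throw new Error(`expected ${width * height} depth values, got ${depth.length}`);
  }

  // Normalize to [0, 1], since a relative depth map has no fixed range.
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < depth.length; i++) {
    if (depth[i] < min) min = depth[i];
    if (depth[i] > max) max = depth[i];
  }
  const range = max - min || 1; // guard against a perfectly flat map

  // One vertex per pixel, three floats (x, y, z) per vertex.
  const positions = new Float32Array(width * height * 3);
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      const i = row * width + col;
      // Spread pixels over a plane spanning [-1, 1] on both axes.
      // Image rows grow downward, so y is flipped.
      // (This ignores the image aspect ratio; scale x or y to restore it.)
      positions[i * 3] = width > 1 ? (col / (width - 1)) * 2 - 1 : 0;
      positions[i * 3 + 1] = height > 1 ? 1 - (row / (height - 1)) * 2 : 0;
      positions[i * 3 + 2] = ((depth[i] - min) / range) * scale;
    }
  }
  return positions;
}
```

In three.js, an array like this could be attached with `geometry.setAttribute('position', new THREE.BufferAttribute(positions, 3))`, though you would still need an index buffer to draw triangles rather than points. Displacing on the CPU keeps the logic easy to inspect; for a live webcam feed, uploading the depth map as a texture and using the `displacementMap` and `displacementScale` properties of `MeshStandardMaterial` is likely the cheaper route, since the vertex shader then does the work.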