Image 2 (IMAGE)
Caption
Structure of the three-stage encoder for 3D body keypoint detection. The encoder consists of a human bounding-box detection model, a top-down human pose estimation model, and a 2D-to-3D keypoint lifting model. It takes a video as input and outputs 3D pose keypoints.
Credit
SUTD
Usage Restrictions
Image should be used with appropriate caption and credit.
License
Original content
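
The data flow described in the caption can be sketched roughly as follows. This is only an illustration of how the three stages might chain together, not the authors' implementation: the function names (`encode_video`, `dummy_detect`, `dummy_2d`, `dummy_lift`), the 17-joint skeleton, one person per frame, and lifting over the whole sequence at once are all assumptions made for the example. The three "models" are trivial stand-ins so the sketch runs without any trained weights.

```python
import numpy as np

# Hypothetical sketch: none of these names come from the source.
NUM_JOINTS = 17  # assumed skeleton size; the caption does not state it


def encode_video(frames, detect_box, estimate_2d, lift_to_3d):
    """Chain the three stages and return 3D keypoints of shape (T, J, 3)."""
    poses_2d = []
    for frame in frames:
        box = detect_box(frame)                    # stage 1: human bounding box
        poses_2d.append(estimate_2d(frame, box))   # stage 2: 2D keypoints (J, 2)
    seq_2d = np.stack(poses_2d)                    # (T, J, 2)
    return lift_to_3d(seq_2d)                      # stage 3: lift 2D to 3D


# Stand-in "models" so the sketch is runnable; real ones would be trained networks.
def dummy_detect(frame):
    h, w = frame.shape[:2]
    return (0.0, 0.0, float(w), float(h))          # whole frame as the box


def dummy_2d(frame, box):
    x0, y0, x1, y1 = box
    center = [(x0 + x1) / 2, (y0 + y1) / 2]
    return np.tile(center, (NUM_JOINTS, 1))        # every joint at the box center


def dummy_lift(seq_2d):
    depth = np.zeros(seq_2d.shape[:2] + (1,))      # placeholder depth of zero
    return np.concatenate([seq_2d, depth], axis=-1)


video = np.zeros((4, 64, 48, 3), dtype=np.uint8)   # 4 blank frames, 64x48 RGB
poses_3d = encode_video(video, dummy_detect, dummy_2d, dummy_lift)
print(poses_3d.shape)  # (4, 17, 3)
```

Passing the stages in as callables keeps the sketch independent of any particular detector, pose estimator, or lifting network.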