AllTracker: Efficient Dense Point Tracking at High Resolution (alltracker.github.io)
94 points by lnyan 21 hours ago | 10 comments
upghost 16 hours ago [-]
> The utility of optical flow (i.e., the instantaneous velocity of pixels [16]) toward this goal has long been obvious, yet it has remained challenging to upgrade flows into long-range tracks.

This sentence from the paper makes me feel a little bad that I don't understand why this goal is obvious. I am not tracking why we are tracking pixels.

Is this basically a competing technology with YOLO[1] or SAM[2]?

[1]: https://en.m.wikipedia.org/wiki/You_Only_Look_Once

[2]: https://ai.meta.com/sam2/

Edit: added annotations, should've done that initially

markisus 16 hours ago [-]
Back in my earlier days working on autonomous vehicles, I dreamed of something like this.

The issue with bounding boxes is missed detections, occlusions, and impoverished geometrical information. But if you have a hundred points being stably tracked on an object, it's now much easier to keep tracking it through partial occlusions, figure out its 3D geometry and kinematics, and even re-identify it coming in and out of occlusion.
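
To make the partial-occlusion point concrete, here is a minimal sketch of how many tracked points can still give a stable motion estimate when some of them are hidden. It assumes a tracker that returns per-point positions plus a visibility flag; the function name `object_shift` and its arguments are hypothetical and not taken from AllTracker or any other library.

```python
import numpy as np

def object_shift(prev_pts, next_pts, visible):
    """Estimate an object's 2D shift from the points that are still visible.

    prev_pts, next_pts: (N, 2) arrays of (x, y) positions in two frames.
    visible: length-N boolean mask, True where the point is visible in both.
    Returns the median (dx, dy) over visible points, or None if none are visible.
    """
    visible = np.asarray(visible, dtype=bool)
    if not visible.any():
        return None  # fully occluded: no estimate from this frame pair
    prev_pts = np.asarray(prev_pts, dtype=np.float64)
    next_pts = np.asarray(next_pts, dtype=np.float64)
    # Only visible points vote; the median tolerates a few bad tracks.
    disp = next_pts[visible] - prev_pts[visible]
    return np.median(disp, axis=0)
```

With a hundred points on an object, losing half of them to an occluder still leaves fifty votes, whereas a single bounding box has nothing comparable to fall back on.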

daemonologist 16 hours ago [-]
No, this performs the same task as CoTracker or TAPIR, but intended for running at a higher resolution. Point tracking is useful both for keeping track of the position of a target and for "inside-out" positioning of the camera.

YOLO is mostly concerned with detecting objects of certain classes in a single image, and SAM is concerned with essentially classifying pixels as belonging to an object or not.
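
As a rough illustration of what "upgrading flows into long-range tracks" means, here is a minimal sketch of the naive approach: chaining per-frame optical flow to carry points forward. It assumes the flow fields are already available as NumPy arrays; `chain_flows` is a hypothetical helper, and this is not how AllTracker, CoTracker, or TAPIR work.

```python
import numpy as np

def chain_flows(points, flows):
    """Naively advect points through a sequence of per-frame flow fields.

    points: (N, 2) array of (x, y) positions in frame 0.
    flows:  list of (H, W, 2) arrays; flows[t][y, x] is the (dx, dy)
            displacement of pixel (x, y) from frame t to frame t + 1.
    Returns a (T + 1, N, 2) array of positions, one slice per frame.
    """
    track = [np.asarray(points, dtype=np.float64)]
    for flow in flows:
        h, w = flow.shape[:2]
        cur = track[-1]
        # Nearest-neighbour lookup of the flow at each point's current position.
        xi = np.clip(np.rint(cur[:, 0]).astype(int), 0, w - 1)
        yi = np.clip(np.rint(cur[:, 1]).astype(int), 0, h - 1)
        track.append(cur + flow[yi, xi])
    return np.stack(track)
```

Each step looks up the flow at the previous estimate, so any small error is carried into every later frame, and nothing here notices when a point becomes occluded. Those two gaps are a large part of why dedicated point trackers exist.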

ipsum2 10 hours ago [-]
The paper is saying that using optical flow for point tracking is the obvious approach, not that the goal of tracking pixels is obvious.

Regarding your actual question, there are many use cases.

- Tracking players or balls in sports

- Surveillance

upghost 1 hours ago [-]
Thanks, ok. I think that was the missing piece for me -- we are not just tracking points/pixels but related points/pixels that, taken together, constitute an entity or identity. Yes, I can see how that would be quite useful.
sheepscreek 16 hours ago [-]
I’m not remotely familiar with either YOLO or SAM, but want to add my own question here. Does the utility of this invention have something to do with the tracking of subjects, like auto-focus for cameras and robotics (to keep the subject in view)?
upghost 16 hours ago [-]
Apologies, jargon meanings updated.
jcims 14 hours ago [-]
Object segmentation and tracking is such a natural and 'automatic' part of our visual perception that it's difficult to intuit how challenging it is to do with software.
thom 4 hours ago [-]
It takes freshly deployed humans a while to master, and only then with fairly high bandwidth training data, so I wouldn’t feel too bad about the complexity of implementing it in software.
jauntywundrkind 19 hours ago [-]
Crazy slick results. Nicely done team!