Reviving Papers with Code (paperswithcode.co)

172 points by nielz_r 2 days ago | 41 comments

jeffreysmith 6 hours ago [-]

I played a somewhat unusual role in this whole story. I was the guy who acquired the original Papers with Code and managed them after they joined Facebook/Meta.

It was super sad to see FB/M abandon the original mission of what PwC was building towards and let the original community resource rot. During the good times, we always talked about how PwC related to HF. So, I think there is a sort of poetry to PwC winding up as part of HF, where they probably always belonged. No company is perfect, but HF has been a better-than-average steward of open source and community resources.

For the younger folks on this thread, you probably have no real feel for just how frustratingly inefficient AI/ML research used to be before people like Robert and Ross of PwC came along to start to bring structure, sanity, and reproducibility to the information needed to do work of this kind. And of course, Clem, Julien, and Thomas of HF kicked off an even bigger effort to tame the previously scattered workflow of open AI research into some sort of sane stack.

It's clear that, in 2026, what PwC could be is something much more evolved than what we were able to do back in the day. LLMs + PwC is a huge design space. I hope nielz_r and friends at HF are able to make something truly useful for the community. AI research has gotten both way easier and much harder. E.g., we have a Fable, but Anthro won't let us use it to forward our science. Community resources for research are still very much needed.

Best of luck Son of PwC. May you thrive.

peterfirefly 3 hours ago [-]

PricewaterhouseCoopers? Hydrofluoric acid?

knicholes 3 hours ago [-]

Are you a bot? Look in the title. Papers with code.

peterfirefly 1 hours ago [-]

No, but I hate TLA overuse. Don't you?

"Are you a bot" is an obvious slur when it's easy to see for yourself that I'm not.

wahnfrieden 50 minutes ago [-]

It’s not overuse when first use in context is spelled out.

camdenreslink 3 hours ago [-]

To be fair PwC is a very well established initialism. It would be like saying HTTP but actually referring to something other than the well understood meaning of that thing.

stronglikedan 1 minutes ago [-]

> To be fair PwC is a very well established initialism.

If that were the case, I'd expect to be able to learn what it stands for on the first page of Google results, but alas...

And you know what is on the front page? PricewaterhouseCoopers

minimaxir 2 hours ago [-]

The context of the PwC acronym is extremely unambiguous in the comment.

throawayonthe 2 hours ago [-]

what's the well established initialism?

https://en.wikipedia.org/wiki/PWC is it one of these?

nielz_r 2 days ago [-]

Hi,

Niels here from the open-source team at Hugging Face. Like many others, I was a huge fan of paperswithcode, a website that made it easy to find the state-of-the-art (SOTA) across any domain of AI, from computer vision to language models to time-series forecasting. Sadly, that website is no longer maintained after its acquisition by Meta.

Hence, I've been working on reviving it. I obviously use AI agents to parse papers at scale and automatically generate leaderboards (for now I'm the one verifying results). So far, I've only parsed high-impact papers that I know are SOTA, like Qwen 3.5 and 3.6, RF-DETR for object detection, DINOv3, SOTA embedding models from the MTEB leaderboard, the Open ASR Leaderboard for automatic speech recognition models, etc.

For now, it includes the following:

> trending papers by default based on Github star velocity

> categorization by domain, e.g., [OCR](https://paperswithcode.co/tasks/ocr)

> methods, popular techniques used across AI papers, which PwC used to have as well, like [RLVR](https://paperswithcode.co/methods/rlvr) and

> eval results for high-impact papers, see e.g., Qwen 3.5 at the bottom

> leaderboards for each domain, e.g., MMTEB or COCO val 2017

> conferences, like [CVPR 2026](https://paperswithcode.co/conferences/cvpr-2026)

> support for citation counts (you can also see the most cited papers by domain!)

> automatically linked GitHub repos, project page URLs, and artifacts (+ multiple repos are supported on a paper page)

> support for external papers beyond Arxiv, see e.g., [DeepSeek v4](https://paperswithcode.co/paper/82956)

> Harness reports for coding agent benchmarks, e.g., Terminal Bench

> "Sign in with HF" and Storage Buckets are used to store thumbnails, paper PDFs, and overall data backups.

I'm curious about your feedback + feature requests!

Try it at https://paperswithcode.co
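
The thread does not say how "star velocity" is computed for the trending list above. As a rough sketch only, assuming it means stars gained per day between two snapshots of a repository (the `StarSnapshot` type, its fields, and the repo names are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class StarSnapshot:
    """Two star counts for one repo, taken days_between days apart (hypothetical shape)."""
    repo: str
    stars_then: int
    stars_now: int
    days_between: float


def star_velocity(s: StarSnapshot) -> float:
    """Stars gained per day; 0.0 if the time window is empty or invalid."""
    if s.days_between <= 0:
        return 0.0
    return (s.stars_now - s.stars_then) / s.days_between


def trending(snapshots: list[StarSnapshot]) -> list[str]:
    """Repo names sorted by star velocity, fastest-growing first."""
    ranked = sorted(snapshots, key=star_velocity, reverse=True)
    return [s.repo for s in ranked]


# Made-up numbers for illustration only.
snapshots = [
    StarSnapshot("org/steady", 100, 170, 7),
    StarSnapshot("org/big-but-slow", 5000, 5035, 7),
    StarSnapshot("org/new-and-hot", 10, 150, 7),
]
print(trending(snapshots))
```

Ranking by rate rather than by total stars is what lets a small, fast-growing repo outrank a large, established one.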

cyril_st_john 7 hours ago [-]

Any interest in expanding it beyond just AI papers? "Papers with Code" sounds like it could be much more broad than it currently is. I was excited to browse the "All Domains" section until I realized only AI topics are covered - just because so many of the papers that are relevant to my work would not fall under any of these categories.

vjsrinivas 8 hours ago [-]

What will happen to Huggingface's Trending Paper page? It's been my alternative since PWC closed, but they seem to have a level of overlap.

addandsubtract 5 hours ago [-]

> Sadly, that website is no longer maintained after its acquisition by Meta.

Wait, I thought it was acquired by Huggingface, because that's where the domain points to: https://huggingface.co/papers/trending

Anyway, as a huge fan of PWC, I'm glad to see it revived! One of the main annoyances of the old PWC was the searchability / discoverability of papers. I hope that now, you can create embeddings of the paper (summaries) to improve the search and make finding related papers easier.
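
A minimal sketch of the related-paper lookup suggested above, assuming each paper summary has already been turned into an embedding vector by some model. The toy 3-dimensional vectors and paper ids below are made up for illustration; a real system would use model-generated embeddings and likely a vector index.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query_vec: list[float], paper_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return up to k paper ids ranked by cosine similarity to query_vec, highest first."""
    scored = [(cosine(query_vec, vec), pid) for pid, vec in paper_vecs.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pid for _, pid in scored[:k]]


# Toy vectors standing in for real summary embeddings (hypothetical data).
papers = {
    "detection-paper": [0.9, 0.1, 0.0],
    "speech-paper": [0.0, 0.2, 0.9],
    "segmentation-paper": [0.8, 0.3, 0.1],
}
print(most_similar([1.0, 0.0, 0.0], papers, k=2))
```

The same function covers both use cases in the comment: pass an embedded search query to improve search, or pass an existing paper's vector to find related papers.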

somethingsome 7 hours ago [-]

Hi! Thanks for the effort!

It would be lovely to parse which datasets/benchmarks were used in the comparisons and select papers by dataset!

In many fields the datasets vary greatly depending on the subfield and it's very difficult to find what other benchmarks could be used.

2ap 9 hours ago [-]

This is great. To test it out I just submitted one of my papers on medRxiv and it was super straightforward to do.

Ajoha 10 hours ago [-]

Hi, this is really interesting and I’ll pin that URL. :) Is there something similar for papers regarding psychology, neuroscience and tech?

barrenko 8 hours ago [-]

Though I'm not sure, try checking out Hugging Face's dedicated science Discord.

caldarons 10 hours ago [-]

This is great work, thank you!

One feature I would love is to get notified via email when new papers are added (or periodically, once a week/daily).

abidlabs 3 hours ago [-]

You could subscribe to daily papers at the top here: https://huggingface.co/papers/trending

adithyaharish 9 hours ago [-]

This is great work, keep it going

adithyaharish 9 hours ago [-]

I have pinned the URL and added it to my tab group bookmarks

wanderlust123 10 hours ago [-]

Thank you, I think this is a noble effort. Knowledge is being created at a rapid rate and being able to surface interesting stuff is valuable.

sairali123 9 hours ago [-]

fr fr

Sharlin 7 hours ago [-]

Shame about the name, it feels better suited to a more general curated repo/hall-of-fame of papers in any field that come with easily rerunnable code to reproduce the paper’s results, or try out different datasets, or similar.

jekude 7 hours ago [-]

started doing this but life got in the way, would love to pick it back up at some point

https://github.com/planetlambert/turing

imadr 5 hours ago [-]

Is no one else tired of these repetitive, obviously Claude-made webdesigns?

Zopieux 3 hours ago [-]

Everyone is, see https://vorpus.github.io/performativeUI/#/components/status-... (satire of obvious slop-UI building blocks) and https://news.ycombinator.com/item?id=48445554

quibono 8 hours ago [-]

This is a bit off-topic (though tangentially related) - does anyone remember a similar blog where the author would do something like a "5 minute paper" review, i.e. they'd discuss findings and try to communicate the main point? It was usually a paper per week, mostly CompSci / maths papers IIRC

henrythewasp 8 hours ago [-]

Was it "The morning paper" by Adrian Colyer? - https://blog.acolyer.org/

ndr 7 hours ago [-]

Thank you for resurfacing this, it has been my daily commute read for years, it was great!

quibono 6 hours ago [-]

Yes, exactly this! Thank you!

addandsubtract 5 hours ago [-]

There is/was also 2-minute papers[0], but the videos lost their plot a couple of years ago.

[0] https://www.youtube.com/@TwoMinutePapers

lalaland1125 6 hours ago [-]

The fact that this even needs to exist means that our academic conferences aren't prioritizing the right things. Papers without code should be desk rejected.

nicce 6 hours ago [-]

And many of the papers in the medical area are published with closed data because collecting that data is so expensive and everyone wants to hold onto it. Nobody can verify the results. Yet they are marked as "peer reviewed".

steinvakt2 9 hours ago [-]

Yes please! I have been frustrated with the state of object detection models especially. Everyone claims SOTA. So you end up having to test manually to find out which one actually is. And unlike LLMs, it should be pretty easily quantifiable.

jamoio 9 hours ago [-]

Is there an RSS feed?

marcindulak 3 hours ago [-]

It does not seem they offer a web feed.

I asked for Atom/RSS about a year ago https://github.com/paperswithcode/paperswithcode-data/issues..., when the original paperswithcode feed disappeared.

It's a kind of coincidence that my own Atom feed of arXiv papers with code went live a few days ago https://code-available-feed.github.io/code-available-feed/.

It's independent from Hugging Face, and uses the arXiv API directly. The code is open source, Apache 2. No AI is involved in parsing the arXiv PDFs, only a pattern match for known code-hosting domains like github.com or gitlab.com. This means there are false positives, but that's probably fine, and the patterns could be tuned later.
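
As a rough illustration of that kind of pattern match (not the feed's actual code; the domain list, regex, and function name here are assumptions for the sketch):

```python
import re

# Hypothetical allow-list; the feed's real list of domains is not given in the thread.
CODE_HOSTS = ("github.com", "gitlab.com")

_HOSTS = "|".join(re.escape(host) for host in CODE_HOSTS)
# Match http(s)://<host>/<owner>/<repo> on any of the known hosts.
CODE_URL = re.compile(
    r"https?://(?:www\.)?(?:" + _HOSTS + r")/[\w.-]+/[\w.-]+",
    re.IGNORECASE,
)


def find_code_links(text: str) -> list[str]:
    """Return unique code-hosting URLs found in text, in order of first appearance."""
    seen: list[str] = []
    for match in CODE_URL.finditer(text):
        # Drop a trailing period picked up from the end of a sentence.
        url = match.group(0).rstrip(".")
        if url not in seen:
            seen.append(url)
    return seen


sample = (
    "Code: https://github.com/planetlambert/turing. "
    "See https://example.com/foo/bar and https://gitlab.com/a/b"
)
print(find_code_links(sample))
```

A match only shows that the text links to a repository, not that the repository belongs to the paper, which is one way the false positives mentioned above can arise.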

You can host your own feed, by forking the repo and setting GitHub Actions environment variables. For now I'm hosting cs.AI and cs.SD categories.

abidlabs 2 hours ago [-]

Would you be interested in an RSS feed for: https://huggingface.co/papers/trending?

marcindulak 2 hours ago [-]

https://huggingface.co/papers/ and https://paperswithcode.co/ are two different projects, right?

I don't get the idea of "trending" for scientific papers. I would like to see the full list of papers that have associated source code from the given research field, like the arXiv cs.SD category. The reason to see all papers is to get an overview of the field.

kozzion 9 hours ago [-]

Bring it back! Sing it back!
