NeurIPS 2025 Area Chair Experience

Introduction

This was my second time serving as an Area Chair for NeurIPS; the first time was in 2024. In this blog post, I would like to share my experience during the paper review process.

NeurIPS

The Merger of the Main Track and Datasets and Benchmarks Track Review Processes

Different from previous years, NeurIPS 2025 unified the recruitment of Area Chairs and reviewers for the Main Track and the Datasets and Benchmarks Track, as an initiative to align the review processes across tracks and ensure that papers from both tracks are of equally high quality.

When I received the invitation to serve as an Area Chair on OpenReview, however, no such information was communicated to me. A few days after I accepted the invitation, I realized from some notifications that I had been assigned to the Datasets and Benchmarks Track this year, which surprised me because the last time I worked on datasets and benchmarks was many years ago. I reached out to the Program Chairs asking whether I could switch to the Main Track so that I could stay focused on accelerated computing. Unfortunately, the Program Chairs informed me that the track assignments were final and could not be changed. Some of the Program Chairs suggested that the assignment was the result of considering individual backgrounds, self-indicated preferences, and the overall requirements for reviewer distribution across the two tracks. Perhaps some kind of matching or classification algorithm made the decision, and it certainly did not feel like it made the right call for me. At the very least, I was never asked for my preferences.

Nevertheless, since I was already on board, I had to adapt to the new role and serve the community to the best of my ability.

NeurIPS 2025 Area Chair Experience

Of the paper submissions assigned to me, more than 50% of the authors were Chinese, and 95% of the reviewers under my supervision were Chinese, although I did not check how many of them were located outside of China. Assuming the NeurIPS paper assignment system was not biased by nationality or primary language, this suggests that Jensen Huang’s statement that “50% of AI researchers are Chinese” was not an exaggeration.

This year, only 7 paper submissions were assigned to me, and in my opinion papers in the Datasets and Benchmarks Track are relatively easier to review and evaluate than papers in the Main Track. Because of that, I felt less stressed and could better balance my day job with my volunteer service to the community.

All the reviewers under my supervision submitted their reviews more or less on time, so I did not have to invite any additional emergency reviewers. Because I read the papers myself, I was able to catch some oversights and issues in the reviews. Most of the reviewers were responsive to my feedback and made the necessary revisions to their reviews. However, a few reviewers did not follow my guidance throughout the review process, leaving some of the critical topics I had raised undiscussed.

The NeurIPS OpenReview interface became somewhat similar to the ICML OpenReview interface, which in my opinion is more complicated and less convenient than the one we used last year. At first there was no “Comment” button, and the only way for me to point out minor issues in a review was the “Flag Insufficient Review” button, which felt like overkill. I reached out to the Program Chairs about this issue and they fixed it quickly. In addition, I found that I could not message the authors whenever I wanted during the review process, which was inconvenient.

Among the 7 papers assigned to me, 4 received very high ratings: each of them got a score of at least 4/5 from every reviewer. NeurIPS typically has an acceptance rate of around 20-25%. In my case, even though I thought some of the highly rated papers still had unresolved issues, those issues were relatively minor and it would have been very difficult to justify rejecting them. So I recommended accepting all 4 papers.

During the paper decision phase, I received a notification that one of the papers I had recommended for acceptance, which got a score of 4/5 from every reviewer, was rated below the average of all papers recommended for acceptance this year, and that I would have to negotiate with the Senior Area Chairs and provide more compelling reasons to support my recommendation. I reached out to the Senior Area Chairs, briefly described the situation and the pros and cons of the paper, and they agreed with my assessment.

However, when it came to the final decisions, I realized that only 2 of the 4 papers I recommended for acceptance were accepted by the Senior Area Chairs and Program Chairs. The two accepted papers both had an average rating of 4.5/5, while the two rejected papers had average ratings of 4.25/5 and 4/5, respectively. I also noticed that at the end of my meta-reviews for the papers I had recommended but that were rejected, the Program Chairs added a note saying: “The final decision for this paper has been taken by the program chairs after consultation with the SACs. All Senior Area Chairs have ranked papers according to the feedback from the AC during the review process. We decided to leave the original meta-review to reflect the opinion of the AC in light of the initial discussions with reviewers and SAC.” So it seems that this year there were too many submissions, and too many recommended acceptances, so the committee had to do some ranking at the Senior Area Chair level and reject some recommended papers that did not seem bad at all. I felt bad about this, but there was nothing I could do, because it is a zero-sum game in the end.

Finally, this year the Area Chairs were asked to provide ratings for the reviewers, just like in ICML 2025. However, I think the process could be improved by allowing adjustable ratings that remain invisible to the reviewers throughout the review process, so that reviewer performance can be recorded more accurately. I had suggested this to the ICML 2025 Program Chairs a few months earlier but never received any feedback.


Author

Lei Mao

Posted on

09-21-2025

Updated on

09-21-2025
