Overview
In the words of Charles Dickens, "It was the best of times, it was the worst of times..." The ever-evolving field of computer vision embodies this sentiment: the deep learning revolution has brought unprecedented progress and excitement, but also a looming concern that newcomers might overlook the rich history of the field that preceded it.
This graduate seminar course aims to bridge the gap and connect the dots between the surge in deep learning and the foundational work that laid the groundwork. While not all historical research directly contributed to the latest methods, overlooking the field's evolution risks missing out on essential insights. We will revisit seminal works still relevant today alongside modern approaches, identifying the most impactful ideas over time and exploring past lessons that continue to influence the future.
The course is organized around student presentations, with the first few lectures by the instructor setting the tone. The class will cover a span of topics, from low-level vision to generative models. Each session will discuss two papers -- one historical (pre-2009) and one modern (post-2010) -- to highlight the field's evolution. Beyond presentations, students will engage in a project that revisits old ideas with modern tools, offering a unique perspective in today's fast-paced research environment. This course is not just a retrospective; it's an opportunity to gain a fresh outlook on computer vision. Requirements include two presentations, a final project, peer grading, and active participation.
Prerequisites: A basic understanding of machine learning and some experience with programming.
Recommended prerequisite course: Introduction to Computer Vision: TTIC 31040
Requirements
Presentation (60%: 2 x 30%)
Students will present on two papers: a foundational paper and a modern paper on different topics. They will develop and deliver a lecture on an assigned paper on a predetermined date. For a detailed list of topics, see here.
- Signup: Shortly after the first lecture on Tuesday, March 19th, an email will be sent to all registered students with a link to the signup sheet. This link will also be posted on Slack. Topics are assigned on a first-come, first-served basis. However, if no one signs up for some topics, I may ask some students to switch to ensure coverage. Signup deadline for registered students is 2PM on March 21st. If you fail to sign up promptly, you will lose 10% of your presentation grade.
After the deadline for registered students, any unregistered students interested in taking the course or auditing the course are free to sign up for the remaining spots. The presentation schedule will be finalized by the end of Friday, March 22nd.
Keep in mind that the act of signing up for a topic is a commitment to your instructor, and all the other students in the class. Therefore, if you are unsure whether you will stay in the course, we urge you to make this decision now if at all possible.
- Practice presentation (30% of presentation grade): Each student must do a practice presentation with Anand about three days before the actual presentation date. This session aims to provide feedback to ensure your in-class presentation meets the highest standards. The practice presentation doesn't need to be fully polished or complete, but grading will assess the seriousness of your preparation. There are two available slots on Monday (for Thursday presentations) and two on Friday (for Tuesday presentations), each lasting 45 minutes. Please contact Anand to schedule your practice presentation at least one week in advance for one of these slots. You are also encouraged to consult with Anand during office hours about your paper or draft slides, even before the formal practice session.
- Slides (30% of presentation grade): By the end of the day before your scheduled presentation, you must send Anand your slides in either PowerPoint or PDF format via Slack. These slides will then be made available to all students on the course webpage. Remember to include your name on the title slide. Failing to submit your slides on time will result in a loss of this portion of your grade. Additionally, please pay close attention to the guidelines on credit attribution below, as not adhering to them could negatively impact your grade for this section and might even be regarded as a violation of academic integrity.
- In-class presentation (30% of presentation grade): The presentations will be evaluated on several key criteria: clarity of delivery, technical depth of the content, successful synthesis of the discussed material, the presenter's ability to engage and involve the audience, and how well the feedback from the practice presentation was incorporated and responded to.
- Peer grading (10% of presentation grade): Each presentation will be evaluated by three to five of your classmates. Their individual scores will then be averaged to determine this portion of your overall grade. The peer grading form is here. For details, see below.
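As a rough illustration of how the four components above combine, here is a minimal sketch in Python. The 0-100 scale for each component, the function name, and the example scores are assumptions made for illustration; the sketch also ignores the late-signup penalty and any other adjustments the instructor may apply.

```python
from statistics import mean

# Component weights within a single presentation grade, as listed above.
PRESENTATION_WEIGHTS = {
    "practice": 0.30,
    "slides": 0.30,
    "in_class": 0.30,
    "peer": 0.10,
}


def presentation_grade(practice, slides, in_class, peer_scores):
    """Combine component scores into one presentation grade.

    Every score is assumed to be on a 0-100 scale (an assumption,
    not something the syllabus specifies).
    """
    if not 3 <= len(peer_scores) <= 5:
        raise ValueError("expected scores from three to five peer graders")
    components = {
        "practice": practice,
        "slides": slides,
        "in_class": in_class,
        "peer": mean(peer_scores),  # individual peer scores are averaged
    }
    return sum(PRESENTATION_WEIGHTS[name] * components[name]
               for name in PRESENTATION_WEIGHTS)


# Hypothetical scores for one presentation.
example = presentation_grade(practice=90, slides=100, in_class=80,
                             peer_scores=[80, 90, 100])
print(f"Presentation grade: {example:.1f}")
```

Each of the two presentations would be scored this way separately, since each counts for 30% of the course grade.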
Guidelines for creating a successful presentation:
- There will be two papers to cover in each class. The first introduces the topic and covers the basics before getting into specifics, with a bonus for summarizing key developments from the historic to the modern paper in the last few slides. The second paper should relate back to the first, exploring how the two approaches might connect.
- Each presentation will last for 25 minutes, followed by 5 minutes for audience questions. The final 15 minutes of the class will be dedicated to a discussion led by me, based on questions about the papers submitted by students the day before.
- Think of yourself as a professor for the day. Aim to deliver a comprehensive and understandable lecture on a specific topic, ensuring the technical depth is appropriate for our audience.
- Be sure to place your topic in the context of the entire course. Whenever appropriate, take care to point out any specific connections to other presentations that came before.
- Where appropriate, feel free to bring a critical perspective to your topic. Go beyond simply describing the techniques. Question assumptions, expose possible flaws and limitations, suggest alternatives and/or directions for future research. Keep in mind that some of the papers you are covering are very recent, making skepticism about any extraordinary/unsubstantiated claims warranted.
- Be sure to involve the class. When you are developing your presentation, identify places where you can ask other students for input, or topics that you want to open up for discussion.
- Because timing is hard to predict, you need to maintain some flexibility in terms of the paper you will cover. It is a good idea to have one or two sections in the latter half of your slides that you can skip depending on the time. When you are presenting, keep an eye on the time and adjust the pacing towards the end accordingly.
- Use of external sources and credit attribution: Be sure to explicitly give credit whenever you use material from other sources. If you "borrow" any slides or graphics, be sure to give the original source in small font on the bottom of each slide. If you show a demo based on somebody's code, be sure to clearly announce this. Failure to follow these guidelines will hurt your score for the slides, and may even be considered an academic integrity violation. It is not acceptable to use an entire slide deck from another source "as is" as the basis for your presentation.
Group project (20%)
You're encouraged to collaborate on the project in groups, but working solo is also an option if you prefer. The project requires you to revisit historical ideas with modern tools. If possible, prepare a demo or present some results during your lecture or on the final day of class.
Project deliverables (submissions over Slack):
- Proposal (10% of project grade, due Monday, April 8th): The proposal should be sent in PDF format by one group member and should include: (1) names of group members; (2) a brief description of the proposed project, approximately half a page; (3) key references, including links to any resources you plan to use, especially code and data. Late submissions will receive no credit but must still be submitted to avoid further penalties on subsequent components of the project grade.
- Progress Update (10% of project grade, due Monday, April 29th): Provide a summary of your current efforts, with notes on any modifications to your original project goals. At a minimum, you should show evidence of successfully running baseline code (e.g., training an off-the-shelf model) on your target data.
- Project Presentation (30% of project grade, on May 16th): Spotlight-like presentation. Further details will depend on the number of projects.
- Final Deliverable (50% of project grade, due Thursday, May 16th): A report presenting results, formatted according to the CVPR style template, with a maximum of 4 pages.
Format for implementation report:
The final report should be submitted in PDF format by one designated group member on Slack.
It should be (the equivalent of) at most four pages following CVPR style, including figures but excluding references. The report should be written in the style of a research paper. It is not necessary to submit code. Here is the outline to follow for the report:
- Cover page (executive summary): List the title and authors. Briefly summarize your problem, line of attack, and most interesting/surprising findings. Be sure to include at least one diagram or example result figure.
- Introduction: Define and motivate the problem, discuss background material or related work, and briefly summarize your approach.
- Details of the approach: Include any formulas, pseudocode, or diagrams that are necessary to clearly explain your system and what you have done. If possible, illustrate the intermediate stages of your approach with result images.
- Results: Clearly describe your experimental protocols and identify any external code and datasets used. Present your quantitative evaluation (if any) and show some example outputs. If you are working with videos, put example output on YouTube or some other external repository and include links in your report.
- Discussion and conclusions: Summarize the main insights drawn from your analysis and experiments. You can get a good project grade with mostly negative results, as long as you show evidence of extensive exploration, thoughtfully analyze the causes of your negative results, and discuss potential solutions.
- Statement of individual contribution: Required if there is more than one group member. This is also excluded from the four-page limit.
- References: including URLs for any external code or data used.
Peer grading reports (10%)
Each student will be assigned to grade two classes and will have to turn in four peer grading reports (DOC, PDF) in the course of the semester. Together, the peer grading reports are worth 10% of your total course grade, so please take them seriously. These reports serve two purposes: to provide constructive feedback to your fellow students, and to encourage you to engage in depth with topics other than your own. Reports will be anonymous to the other students, but not to the instructor. The scores in the reports will be used to calculate the peer portion of the presentation grade for the respective team, and with rare exceptions, they will be shared with the team (but not with the class more broadly).
Reports should be submitted via Slack to Anand. For Tuesday presentations, the reports are due by the end of Friday of the same week, and for Thursday presentations, they are due by the end of the following Sunday. Late reports will be penalized 20% (or 1% of your total course grade) for each day they are late.
Participation (10%)
You are expected to come to class most days and participate in discussions both during class and on Slack (a thread for questions and comments will be created for each topic).
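The four requirements in this section (two presentations at 30% each, the project at 20%, peer grading reports at 10%, and participation at 10%) add up to the full course grade. The sketch below shows that weighting in Python; the 0-100 scale, the dictionary keys, and the example scores are illustrative assumptions, and late penalties are not modeled.

```python
# Weights of each requirement in the total course grade, as stated in the
# section headings above. Key names are hypothetical labels.
COURSE_WEIGHTS = {
    "presentation_1": 0.30,
    "presentation_2": 0.30,
    "project": 0.20,
    "peer_grading_reports": 0.10,
    "participation": 0.10,
}


def course_grade(scores):
    """Return the weighted course grade from per-requirement scores.

    `scores` maps each key in COURSE_WEIGHTS to a score that is assumed
    to be on a 0-100 scale (an assumption, not stated in the syllabus).
    """
    missing = COURSE_WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weight * scores[name]
               for name, weight in COURSE_WEIGHTS.items())


# Hypothetical scores for one student.
total = course_grade({
    "presentation_1": 90,
    "presentation_2": 80,
    "project": 85,
    "peer_grading_reports": 100,
    "participation": 100,
})
print(f"Course grade: {total:.1f}")
```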
Schedule (in progress)
- Presentation Signup
- Presenters: Practice presentation is required three days before your actual presentation. Finalized slides are due the night before your presentation. Come to class at least five minutes early to make sure that your laptop works with the projector.
- Peer graders: Reports for Tuesday presentations are due via Slack to Anand by the end of Friday, and reports for Thursday presentations are due by the end of the following Sunday.
Date | Topic | Historical Foundation | Modern Approach
March 19 | Class intro [Anand] (PPT, PDF) | N/A | N/A
March 21 | Intrinsic Images | Recovering Intrinsic Scene Characteristics From Images (PPT, PDF) [Anand] | StyleGAN Knows Normal, Depth, Albedo, and More (PPT, PDF) [Anand]
March 26 | Image-based Lighting | Rendering Synthetic Objects into Real Scenes (PPT, PDF) [Anand] | DiffusionLight: Light Probes for Free by Painting a Chrome Ball (PPT, PDF) [Anand]
March 28 | Light Field Modeling | The Plenoptic Function and the Elements of Early Vision (PPT, PDF) [Hanchen Li] | NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (key, PDF) [Joshua Ahn]
April 02 | Image Filters | Design and Use of Steerable Filters (key, PDF) [Xiao Zhang] | Visualizing and Understanding Convolutional Networks (PPT, PDF) [Tracy Zhu]
April 04 | Image Descriptors | Distinctive Image Features from Scale-Invariant Keypoints (PPT, PDF) [Richard Liu] | SuperPoint: Self-Supervised Interest Point Detection and Description (PPT, PDF) [Anand]
April 09 | Optical Flow | An Iterative Image Registration Technique with an Application to Stereo Vision (PPT, PDF) [Haochen Wang] | RAFT: Recurrent All-Pairs Field Transforms for Optical Flow (PPT, PDF) [Hanchen Li]
April 11 | Stereo Vision | Fast Approximate Energy Minimization via Graph Cuts (PPT, PDF) [Jiahao Li] | DUSt3R: Geometric 3D Vision Made Easy (PPT, PDF) [Haochen Wang]
April 16 | SFM | Photo tourism: exploring photo collections in 3D (PPT, PDF) [Xiaodan Du] | Pixel-Perfect Structure-from-Motion with Featuremetric Refinement (PPT, PDF) [Jiahao Li]
April 18 | Single Image to Depth | Texture gradient as a depth cue (PPT, PDF) [Gabriel Aguilar Perez] | Vision Transformers for Dense Prediction (PPT, PDF) [Richard Liu -> Anand]
April 23 | Selective Processing: good-to-know techniques | Bilateral Filters, Non Local Means, Total Variation Denoising (PPT, PDF) | Attention Is All You Need (PPT, PDF)
April 25 | Object Detection | Object Detection with Discriminatively Trained Part-Based Model (PPT, PDF) | Mask R-CNN (PPT, PDF)
April 30 | Grouping and Segmentation | SLIC Superpixels compared to state-of-the-art superpixel methods (PPT, PDF) | Segment Anything (PPT, PDF)
May 05 | Connecting Text and Images | Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary (PPT, PDF) | Learning Transferable Visual Models From Natural Language Supervision (PPT, PDF)
May 07 | Image Generation/Synthesis | Texture Synthesis by Non-parametric Sampling (PPT, PDF) | Image Style Transfer Using Convolutional Neural Networks (PPT, PDF)
May 09 | Image Generation/Synthesis | Image Analogies (PPT, PDF) | Generative Adversarial Nets (PPT, PDF)
May 09 | Image Generation/Synthesis | Scene Completion Using Millions of Photographs (PPT, PDF) | High-Resolution Image Synthesis with Latent Diffusion Models (PPT, PDF)
May 14 | Project presentations & Course Wrap Up | |
Acknowledgement
This class is inspired by Lana Lazebnik's Short Course: "Looking Back to Look Forward" at Georgia Tech in 2020. Her videos are available and viewing them is highly encouraged. The course follows the design, structure, policies and also the webpage template of another course by Lana Lazebnik: "Cutting-Edge Trends in Deep Learning and Recognition".
Special thanks to David Forsyth, David McAllister, Derek Hoiem, Greg Shakhnarovich, Lana Lazebnik and Shubham Tulsiani for their useful suggestions in designing this course.
Useful Resources
Similar Courses
Reading List