TTIC 41000: Past Meets Present: A Tale of Two Visions

Instructor: Anand Bhattad (bhattad -at- ttic.edu)
Lectures: T TH 2:00-3:20, 530 TTIC

Practice Presentation with Instructor:
M 1-2:30PM (for TH Presentations), 509 TTIC
F 1-2:30PM (for T Presentations), 509 TTIC

Office hours: F 3:30-4:15 PM or by appointment, 509 TTIC

Always check announcements on Slack for short-notice changes to instructor office hours!

Contents: requirements, topics & schedule, acknowledgement, resources

Overview

In the words of Charles Dickens, "It was the best of times, it was the worst of times..." The ever-evolving field of computer vision embodies this sentiment. The deep learning revolution has brought unprecedented progress and excitement, but also a looming concern: that the rich history of the field that preceded it might be overlooked by newcomers.

This graduate seminar aims to connect the recent surge in deep learning with the foundational work that preceded it. While not all historical research directly contributed to the latest methods, overlooking the field's evolution risks missing out on essential insights. We will revisit seminal works still relevant today alongside modern approaches, identifying the most impactful ideas over time and exploring past lessons that continue to influence the future.

The course is organized around student presentations, with the first few lectures by the instructor setting the tone. The class will cover a span of topics, from low-level vision to generative models. Each session will discuss two papers -- one historical (pre-2009) and one modern (post-2010) -- to highlight the field's evolution. Beyond presentations, students will engage in a project that revisits old ideas with modern tools, offering a unique perspective in today's fast-paced research environment. This course is not just a retrospective; it's an opportunity to gain a fresh outlook on computer vision. Requirements include two presentations, a final project, peer grading, and active participation.

Prerequisites: A basic understanding of machine learning and some experience with programming.
Recommended prerequisite course: Introduction to Computer Vision: TTIC 31040

Topic List and Papers

Presentation (60%: 2 x 30%)

Each student will present two papers on different topics: a foundational paper and a modern paper. For each, the student will develop and deliver a lecture on the assigned paper on a predetermined date. For a detailed list of topics, see here.

Signup: Shortly after the first lecture on Tuesday, March 19th, an email will be sent to all registered students with a link to the signup sheet. This link will also be posted on Slack. Topics are assigned on a first-come, first-served basis. However, if no one signs up for some topics, I may ask some students to switch to ensure coverage. Signup deadline for registered students is 2PM on March 21st. If you fail to sign up promptly, you will lose 10% of your presentation grade.

After the deadline for registered students, any unregistered students interested in taking the course or auditing the course are free to sign up for the remaining spots. The presentation schedule will be finalized by the end of Friday, March 22nd.

Keep in mind that signing up for a topic is a commitment to your instructor and to all the other students in the class. Therefore, if you are unsure whether you will stay in the course, we urge you to make this decision now if at all possible.

Practice presentation (30% of presentation grade): Each student must do a practice presentation with Anand about three days before the actual presentation date. This session aims to provide feedback to ensure your in-class presentation meets the highest standards. The practice presentation doesn't need to be fully polished or complete, but grading will assess the seriousness of your preparation. There are two available slots on Monday (for Thursday presentations) and two on Friday (for Tuesday presentations), each lasting 45 minutes. Please contact Anand to schedule your practice presentation at least one week in advance for one of these slots. You are also encouraged to consult with Anand during office hours about your paper or draft slides, even before the formal practice session.

Slides (30% of presentation grade): By the end of the day before your scheduled presentation, you must send Anand your slides in either PowerPoint or PDF format via Slack. These slides will then be made available to all students on the course webpage. Remember to include your name on the title slide. Failing to submit your slides on time will result in a loss of this portion of your grade. Additionally, please pay close attention to the guidelines on credit attribution below, as not adhering to them could negatively impact your grade for this section and might even be regarded as a violation of academic integrity.

In-class presentation (30% of presentation grade): The presentations will be evaluated on several key criteria: clarity of delivery, technical depth of the content, successful synthesis of the discussed material, the presenter's ability to engage and involve the audience, and how well the feedback from the practice presentation was incorporated and responded to.

Peer grading (10% of presentation grade): Each presentation will be evaluated by three to five of your classmates. Their individual scores will then be averaged to determine this portion of your overall grade. The peer grading form is here. For details, see below.
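
As a rough illustration of how the four components above combine, here is a minimal sketch. The weights and the averaging of peer scores come from the descriptions above; the 0-100 scale, the function name, and the example scores are assumptions made for illustration only. It does not model the signup or late-slide penalties, and it is not the official grading procedure.

```python
from statistics import mean

# Weights from the component descriptions above (fractions of one presentation grade).
WEIGHTS = {"practice": 0.30, "slides": 0.30, "in_class": 0.30, "peer": 0.10}


def presentation_grade(practice, slides, in_class, peer_scores):
    """Combine component scores, each assumed to be on a 0-100 scale."""
    if not 3 <= len(peer_scores) <= 5:
        raise ValueError("expected three to five peer scores")
    peer = mean(peer_scores)  # individual peer scores are averaged
    return (WEIGHTS["practice"] * practice
            + WEIGHTS["slides"] * slides
            + WEIGHTS["in_class"] * in_class
            + WEIGHTS["peer"] * peer)


# Hypothetical scores, for illustration only.
example = presentation_grade(90, 100, 85, [80, 90, 100])
print(round(example, 2))
```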

Guidelines for creating a successful presentation:

Each class covers two papers. The first presentation should introduce the topic and cover the basics before getting into specifics; there is a bonus for summarizing key developments between the historical and the modern paper in the last few slides. The second presentation should relate back to the first, exploring how the two approaches connect.
Each presentation will last for 25 minutes, followed by 5 minutes for audience questions. The final 15 minutes of the class will be dedicated to a discussion led by me, based on questions about the papers submitted by students the day before.
Think of yourself as a professor for the day. Aim to deliver a comprehensive and understandable lecture on a specific topic, ensuring the technical depth is appropriate for our audience.
Be sure to place your topic in the context of the entire course. Whenever appropriate, take care to point out any specific connections to other presentations that came before.
Where appropriate, feel free to bring a critical perspective to your topic. Go beyond simply describing the techniques. Question assumptions, expose possible flaws and limitations, suggest alternatives and/or directions for future research. Keep in mind that some of the papers you are covering are very recent, making skepticism about any extraordinary/unsubstantiated claims warranted.
Be sure to involve the class. When you are developing your presentation, identify places where you can ask other students for input, or topics that you want to open up for discussion.
Because timing is hard to predict, you need to maintain some flexibility in how much of the paper you cover. It is a good idea to have one or two sections in the latter half of your slides that you can skip depending on the time. When you are presenting, keep an eye on the time and adjust the pacing towards the end accordingly.
Use of external sources and credit attribution: Be sure to explicitly give credit whenever you use material from other sources. If you "borrow" any slides or graphics, be sure to give the original source in small font on the bottom of each slide. If you show a demo based on somebody's code, be sure to clearly announce this. Failure to follow these guidelines will hurt your score for the slides, and may even be considered an academic integrity violation. It is not acceptable to use an entire slide deck from another source "as is" as the basis for your presentation.

Group project (20%)

You're encouraged to collaborate on the project in groups, but working solo is also an option if you prefer. The project requires you to revisit historical ideas with modern tools. If possible, prepare a demo or present some results during your lecture or on the final day of class.

Project deliverables (submissions over Slack):

Proposal (10% of project grade, due Monday, April 8th): The proposal should be sent in PDF format by one group member and should include: (1) names of group members; (2) a brief description of the proposed project, approximately half a page; (3) key references, including links to any resources you plan to use, especially code and data. Late submissions will receive no credit but must still be submitted to avoid further penalties on subsequent components of the project grade.

Progress Update (10% of project grade, due Monday, April 29th): Provide a summary of your current efforts, with notes on any modifications to your original project goals. At a minimum, you should show evidence of successfully running baseline code (e.g., training an off-the-shelf model) on your target data.

Project Presentation (30% of project grade, on May 16th): Spotlight-like presentation. Further details will depend on the number of projects.

Final Deliverable (50% of project grade, due Thursday, May 16th): A report presenting results, formatted according to the CVPR style template, with a maximum of 4 pages.

Format for implementation report: The final report should be submitted in PDF format by one designated group member on Slack. It should be (the equivalent of) at most four pages following CVPR style, including figures but excluding references. The report should be written in the style of a research paper. It is not necessary to submit code. Here is the outline to follow for the report:

Cover page and executive summary: List title and authors. Briefly summarize your problem, line of attack, and most interesting/surprising findings. Be sure to include at least one diagram or example result figure.
Introduction: Define and motivate the problem, discuss background material or related work, and briefly summarize your approach.
Details of the approach: Include any formulas, pseudocode, diagrams -- anything that is necessary to clearly explain your system and what you have done. If possible, illustrate the intermediate stages of your approach with results images.
Results: Clearly describe your experimental protocols and identify any external code and datasets used. Present your quantitative evaluation (if any) and show some example outputs. If you are working with videos, put example output on YouTube or some other external repository and include links in your report.
Discussion and conclusions: Summarize the main insights drawn from your analysis and experiments. You can get a good project grade with mostly negative results, as long as you show evidence of extensive exploration, thoughtfully analyze the causes of your negative results, and discuss potential solutions.
Statement of individual contribution: Required if there is more than one group member. This is also excluded from the four-page limit.
References: including URLs for any external code or data used.

Peer grading reports (10%)

Each student will be assigned to grade two classes and will have to turn in four peer grading reports (DOC, PDF) in the course of the semester. The peer grading reports are worth 10% of your total course grade, so please take them seriously. These reports serve two purposes: to provide constructive feedback to your fellow students, and to encourage you to engage in depth with topics other than your own. Reports will be anonymous to the other students, but not to the instructor. The scores in the reports will be used to calculate the peer portion of the presentation grade for the respective team, and with rare exceptions, they will be shared with the team (but not with the class more broadly).

Reports should be submitted via Slack to Anand. For Tuesday presentations, the reports are due by the end of Friday of the same week, and for Thursday presentations, they are due by the end of the following Sunday. Late reports will be penalized 20% (or 1% of your total course grade) for each day they are late.

Participation (10%)

You are expected to come to class most days and participate in discussions both during class and on Slack (a thread for questions and comments will be created for each topic).
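
To see how the pieces add up to a course grade, here is a small sketch. The weights are the ones given in the section headings on this page (two presentations at 30% each, group project 20%, peer grading reports 10%, participation 10%); the 0-100 scale, the component names, and the example scores are hypothetical and used only for illustration.

```python
# Course-level weights as listed in the section headings of this page.
COURSE_WEIGHTS = {
    "presentation_1": 0.30,
    "presentation_2": 0.30,
    "project": 0.20,
    "peer_reports": 0.10,
    "participation": 0.10,
}


def course_grade(scores):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    missing = COURSE_WEIGHTS.keys() - scores.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(COURSE_WEIGHTS[name] * scores[name] for name in COURSE_WEIGHTS)


# Hypothetical scores, for illustration only.
total = course_grade({
    "presentation_1": 90,
    "presentation_2": 80,
    "project": 85,
    "peer_reports": 100,
    "participation": 100,
})
print(round(total, 2))
```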

Schedule (in progress)

Presentation Signup
Presenters: Practice presentation is required three days before your actual presentation. Finalized slides are due the night before your presentation. Come to class at least five minutes early to make sure that your laptop works with the projector.
Peer graders: Reports for Tuesday presentations are due via Slack to Anand by the end of Friday, and reports for Thursday presentations are due by the end of the following Sunday.

Date	Topic	Historical Foundation	Modern Approach
March 19	Class intro [Anand] PPT, PDF	N/A	N/A
March 21	Intrinsic Images	Recovering Intrinsic Scene Characteristics From Images PPT, PDF [Anand]	StyleGAN Knows Normal, Depth, Albedo, and More PPT, PDF [Anand]
March 26	Image-based Lighting	Rendering Synthetic Objects into Real Scenes PPT, PDF [Anand]	DiffusionLight: Light Probes for Free by Painting a Chrome Ball PPT, PDF [Anand]
March 28	Light Field Modeling	The Plenoptic Function and the Elements of Early Vision PPT, PDF [Hanchen Li]	NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis key, PDF [Joshua Ahn]
April 02	Image Filters	Design and Use of Steerable Filters key, PDF [Xiao Zhang]	Visualizing and Understanding Convolutional Networks PPT, PDF [Tracy Zhu]
April 04	Image Descriptors	Distinctive Image Features from Scale-Invariant Keypoints PPT, PDF [Richard Liu]	SuperPoint: Self-Supervised Interest Point Detection and Description PPT, PDF [Anand]
April 09	Optical Flow	An Iterative Image Registration Technique with an Application to Stereo Vision PPT, PDF [Haochen Wang]	RAFT: Recurrent All-Pairs Field Transforms for Optical Flow PPT, PDF [Hanchen Li]
April 11	Stereo Vision	Fast Approximate Energy Minimization via Graph Cuts PPT, PDF [Jiahao Li]	DUSt3R: Geometric 3D Vision Made Easy PPT, PDF [Haochen Wang]
April 16	SFM	Photo tourism: exploring photo collections in 3D PPT, PDF [Xiaodan Du]	Pixel-Perfect Structure-from-Motion with Featuremetric Refinement PPT, PDF [Jiahao Li]
April 18	Single Image to Depth	Texture gradient as a depth cue PPT, PDF [Gabriel Aguilar Perez]	Vision Transformers for Dense Prediction PPT, PDF [Richard Liu -> Anand]
April 23	Selective Processing: good-to-know techniques	Bilateral Filters, Non Local Means, Total Variation Denoising PPT, PDF	Attention Is All You Need PPT, PDF
April 25	Object Detection	Object Detection with Discriminatively Trained Part-Based Models PPT, PDF	Mask R-CNN PPT, PDF
April 30	Grouping and Segmentation	SLIC Superpixels compared to state-of-the-art superpixel methods PPT, PDF	Segment Anything PPT, PDF
May 02	Connecting Text and Images	Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary PPT, PDF	Learning Transferable Visual Models From Natural Language Supervision PPT, PDF
May 07	Image Generation/Synthesis	Texture Synthesis by Non-parametric Sampling PPT, PDF	Image Style Transfer Using Convolutional Neural Networks PPT, PDF
May 09	Image Generation/Synthesis	Image Analogies PPT, PDF	Generative Adversarial Nets PPT, PDF
May 09	Image Generation/Synthesis	Scene Completion Using Millions of Photographs PPT, PDF	High-Resolution Image Synthesis with Latent Diffusion Models PPT, PDF
May 14	Project presentations & Course Wrap Up

Acknowledgement

This class is inspired by Lana Lazebnik's Short Course: "Looking Back to Look Forward" at Georgia Tech in 2020. Her videos are available and viewing them is highly encouraged. The course follows the design, structure, policies and also the webpage template of another course by Lana Lazebnik: "Cutting-Edge Trends in Deep Learning and Recognition".

Special thanks to David Forsyth, David McAllister, Derek Hoiem, Greg Shakhnarovich, Lana Lazebnik and Shubham Tulsiani for their useful suggestions in designing this course.

Useful Resources

Similar Courses

Georgia Tech Looking Back to Look Forward course
UIUC Computer Vision: What Will Stand Test of Time? course
Berkeley Visual Scene Understanding course
CMU Visual Learning and Recognition course
UIUC 3D Vision course

Reading List

Lana's Reading List from her "Computer Vision: What Will Stand Test of Time?" class