TTIC 41000: Past Meets Present: A Tale of Two Visions

Instructor: Anand Bhattad  (bhattad -at- ttic.edu)
Lectures: T TH 2:00-3:20, 530 TTIC

Practice Presentation with Instructor:
M 1-2:30PM (for TH Presentations), 509 TTIC
F 1-2:30PM (for T Presentations), 509 TTIC

Office hours: F 3:30-4:15 PM or by appointment, 509 TTIC

Always check announcements on Slack for short-notice changes to instructor office hours!

Contents: requirements, topics & schedule, acknowledgement, resources

Overview

In the words of Charles Dickens, "It was the best of times, it was the worst of times..." The ever-evolving field of computer vision embodies this sentiment. The "deep learning revolution" has brought unprecedented progress and excitement, but also a looming concern: that newcomers might overlook the rich history of the field that preceded it.

This graduate seminar aims to connect the recent surge in deep learning with the foundational work that preceded it. While not all historical research directly contributed to the latest methods, overlooking the field's evolution risks missing essential insights. We will revisit seminal works that remain relevant today alongside modern approaches, identifying the most impactful ideas over time and exploring past lessons that continue to influence the future.

The course is organized around student presentations, with the first few lectures by the instructor setting the tone. The class will cover a span of topics, from low-level vision to generative models. Each session will discuss two papers -- one historical (pre-2009) and one modern (post-2010) -- to highlight the field's evolution. Beyond presentations, students will engage in a project that revisits old ideas with modern tools, offering a unique perspective in today's fast-paced research environment. This course is not just a retrospective; it's an opportunity to gain a fresh outlook on computer vision. Requirements include two presentations, a final project, peer grading, and active participation.

Prerequisites: A basic understanding of machine learning and some experience with programming.
Recommended prerequisite course: Introduction to Computer Vision: TTIC 31040

Topic List and Papers

Requirements

Presentation (60%: 2 x 30%)

Students will present two papers on different topics: one foundational paper and one modern paper. For each, they will develop and deliver a lecture on the assigned paper on a predetermined date. For a detailed list of topics, see the schedule below.

Guidelines for creating a successful presentation:

  • Each class covers two papers. The first presentation should introduce the topic and cover the basics before getting into specifics; there is a bonus for using the last few slides to summarize key developments from the historical paper to the modern one. The second presentation should relate back to the first and explore how the two approaches connect.
  • Each presentation will last for 25 minutes, followed by 5 minutes for audience questions. The final 15 minutes of the class will be dedicated to a discussion led by me, based on questions about the papers submitted by students the day before.
  • Think of yourself as a professor for the day. Aim to deliver a comprehensive and understandable lecture on a specific topic, ensuring the technical depth is appropriate for our audience.
  • Be sure to place your topic in the context of the entire course. Whenever appropriate, take care to point out any specific connections to other presentations that came before.
  • Where appropriate, feel free to bring a critical perspective to your topic. Go beyond simply describing the techniques. Question assumptions, expose possible flaws and limitations, suggest alternatives and/or directions for future research. Keep in mind that some of the papers you are covering are very recent, making skepticism about any extraordinary/unsubstantiated claims warranted.
  • Be sure to involve the class. When you are developing your presentation, identify places where you can ask other students for input, or topics that you want to open up for discussion.
  • Because timing is hard to predict, you need to maintain some flexibility in the material you will cover. It is a good idea to have one or two sections in the latter half of your slides that you can skip depending on the time. When you are presenting, keep an eye on the time and adjust the pacing towards the end accordingly.
  • Use of external sources and credit attribution: Be sure to explicitly give credit whenever you use material from other sources. If you "borrow" any slides or graphics, be sure to give the original source in small font on the bottom of each slide. If you show a demo based on somebody's code, be sure to clearly announce this. Failure to follow these guidelines will hurt your score for the slides, and may even be considered an academic integrity violation. It is not acceptable to use an entire slide deck from another source "as is" as the basis for your presentation.

Group project (20%)

You're encouraged to collaborate on the project in groups, but working solo is also an option if you prefer. The project requires you to revisit historical ideas with modern tools. If possible, prepare a demo or present some results during your lecture or on the final day of class.

Project deliverables (submissions over Slack):
  • Proposal (10% of project grade, due Monday, April 8th): The proposal should be sent in PDF format by one group member and should include: (1) names of group members; (2) a brief description of the proposed project, approximately half a page; (3) key references, including links to any resources you plan to use, especially code and data. Late submissions will receive no credit but must still be submitted to avoid further penalties on subsequent components of the project grade.

  • Progress Update (10% of project grade, due Monday, April 29th): Provide a summary of your current efforts, with notes on any modifications to your original project goals. At a minimum, you should show evidence of successfully running baseline code (e.g., training an off-the-shelf model) on your target data.

  • Project Presentation (30% of project grade, on May 16th): Spotlight-like presentation. Further details will depend on the number of projects.

  • Final Deliverable (50% of project grade, due Thursday, May 16th): A report presenting results, formatted according to the CVPR style template, with a maximum of 4 pages.
Format for implementation report: The final report should be submitted in PDF format by one designated group member on Slack. It should be (the equivalent of) at most four pages following CVPR style, including figures but excluding references. The report should be written in the style of a research paper. It is not necessary to submit code. Here is the outline to follow for the report:
  1. Cover page and executive summary: List title and authors. Briefly summarize your problem, line of attack, and most interesting/surprising findings. Be sure to include at least one diagram or example result figure.
  2. Introduction: Define and motivate the problem, discuss background material or related work, and briefly summarize your approach.
  3. Details of the approach: Include any formulas, pseudocode, diagrams -- anything that is necessary to clearly explain your system and what you have done. If possible, illustrate the intermediate stages of your approach with results images.
  4. Results: Clearly describe your experimental protocols and identify any external code and datasets used. Present your quantitative evaluation (if any) and show some example outputs. If you are working with videos, put example output on YouTube or some other external repository and include links in your report.
  5. Discussion and conclusions: Summarize the main insights drawn from your analysis and experiments. You can get a good project grade with mostly negative results, as long as you show evidence of extensive exploration, thoughtfully analyze the causes of your negative results, and discuss potential solutions.
  6. Statement of individual contribution: Required if there is more than one group member. This is also excluded from the four-page limit.
  7. References: including URLs for any external code or data used.

Peer grading reports (10%)

Each student will be assigned to grade two classes and will have to turn in four peer grading reports (DOC, PDF) in the course of the semester. Peer grading is worth 10% of your total course grade, so please take it seriously. These reports serve two purposes: to provide constructive feedback to your fellow students, and to encourage you to engage in depth with topics other than your own. Reports will be anonymous to the other students, but not to the instructor. The scores in the reports will be used to calculate the peer portion of the presentation grade for the respective team, and with rare exceptions, they will be shared with the team (but not with the class more broadly).

Reports should be submitted via Slack to Anand. For Tuesday presentations, the reports are due by the end of Friday of the same week, and for Thursday presentations, they are due by the end of the following Sunday. Late reports will be penalized 20% (or 1% of your total course grade) for each day they are late.

Participation (10%)

You are expected to come to class most days and participate in discussions both during class and on Slack (a thread for questions and comments will be created for each topic).

Schedule (in progress)

  • Presentation Signup
  • Presenters: Practice presentation is required three days before your actual presentation. Finalized slides are due the night before your presentation. Come to class at least five minutes early to make sure that your laptop works with the projector.
  • Peer graders: Reports for Tuesday presentations are due via Slack to Anand by the end of Friday, and reports for Thursday presentations are due by the end of the following Sunday.
Date | Topic | Historical Foundation | Modern Approach
-----|-------|-----------------------|----------------
March 19 | Class intro [Anand] (PPT, PDF) | N/A | N/A
March 21 | Intrinsic Images | Recovering Intrinsic Scene Characteristics From Images (PPT, PDF) [Anand] | StyleGAN Knows Normal, Depth, Albedo, and More (PPT, PDF) [Anand]
March 26 | Image-based Lighting | Rendering Synthetic Objects into Real Scenes (PPT, PDF) [Anand] | DiffusionLight: Light Probes for Free by Painting a Chrome Ball (PPT, PDF) [Anand]
March 28 | Light Field Modeling | The Plenoptic Function and the Elements of Early Vision (PPT, PDF) [Hanchen Li] | NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (key, PDF) [Joshua Ahn]
April 02 | Image Filters | Design and Use of Steerable Filters (key, PDF) [Xiao Zhang] | Visualizing and Understanding Convolutional Networks (PPT, PDF) [Tracy Zhu]
April 04 | Image Descriptors | Distinctive Image Features from Scale-Invariant Keypoints (PPT, PDF) [Richard Liu] | SuperPoint: Self-Supervised Interest Point Detection and Description (PPT, PDF) [Anand]
April 09 | Optical Flow | An Iterative Image Registration Technique with an Application to Stereo Vision (PPT, PDF) [Haochen Wang] | RAFT: Recurrent All-Pairs Field Transforms for Optical Flow (PPT, PDF) [Hanchen Li]
April 11 | Stereo Vision | Fast Approximate Energy Minimization via Graph Cuts (PPT, PDF) [Jiahao Li] | DUSt3R: Geometric 3D Vision Made Easy (PPT, PDF) [Haochen Wang]
April 16 | SFM | Photo tourism: exploring photo collections in 3D (PPT, PDF) [Xiaodan Du] | Pixel-Perfect Structure-from-Motion with Featuremetric Refinement (PPT, PDF) [Jiahao Li]
April 18 | Single Image to Depth | Texture gradient as a depth cue (PPT, PDF) [Gabriel Aguilar Perez] | Vision Transformers for Dense Prediction (PPT, PDF) [Richard Liu -> Anand]
April 23 | Selective Processing: good-to-know techniques | Bilateral Filters, Non Local Means, Total Variation Denoising (PPT, PDF) | Attention Is All You Need (PPT, PDF)
April 25 | Object Detection | Object Detection with Discriminatively Trained Part-Based Models (PPT, PDF) | Mask R-CNN (PPT, PDF)
April 30 | Grouping and Segmentation | SLIC Superpixels compared to state-of-the-art superpixel methods (PPT, PDF) | Segment Anything (PPT, PDF)
May 05 | Connecting Text and Images | Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary (PPT, PDF) | Learning Transferable Visual Models From Natural Language Supervision (PPT, PDF)
May 07 | Image Generation/Synthesis | Texture Synthesis by Non-parametric Sampling (PPT, PDF) | Image Style Transfer Using Convolutional Neural Networks (PPT, PDF)
May 09 | Image Generation/Synthesis | Image Analogies (PPT, PDF) | Generative Adversarial Nets (PPT, PDF)
May 09 | Image Generation/Synthesis | Scene Completion Using Millions of Photographs (PPT, PDF) | High-Resolution Image Synthesis with Latent Diffusion Models (PPT, PDF)
May 14 | Project presentations & Course Wrap Up | |

Acknowledgement

This class is inspired by Lana Lazebnik's Short Course: "Looking Back to Look Forward" at Georgia Tech in 2020. Her videos are available and viewing them is highly encouraged. The course follows the design, structure, policies and also the webpage template of another course by Lana Lazebnik: "Cutting-Edge Trends in Deep Learning and Recognition".

Special thanks to David Forsyth, David McAllister, Derek Hoiem, Greg Shakhnarovich, Lana Lazebnik and Shubham Tulsiani for their useful suggestions in designing this course.

Useful Resources

Similar Courses

Reading List