Course Outline
Fundamentals of Computer Vision
COMP 558
Fall 2009
Instructor: Professor Michael Langer
Office: McConnell Engineering, rm. 329
Tel: 514-398-3740
Email: langer@cim.mcgill.ca
Office Hours: by appointment (send me email)
Teaching Assistant (T.A.): Fahim Mannan
Office: ENGMC 337
Telephone:
Email: fmannan [at] cim.mcgill.ca
Office Hours: by appointment (send him email)
Overview
Computer vision involves the development of algorithms and software
that have the potential to mimic a biological organism's ability to
``see''. Though the sense of vision is immediate for most people, the
complexity of the task that the human visual system accomplishes is in
fact enormous. The field of computer vision has grown steadily over
the past few decades, and has advanced to the point where a set of
core algorithms and techniques now exist for solving specific vision
problems in constrained settings. Several international research
laboratories now exist and applications of computer vision techniques
in industry, robotics, and bio-medicine abound. This course seeks to
present the fundamentals of computer vision at an advanced
undergraduate/beginning graduate level. Students will become
familiar with the basic theoretical and practical tools of computer
vision, which would be needed to carry out research or to
find employment in this field.
Course Contents
The course in Fall 2009 consists
of three parts. Lectures will cover most of the specific
topics below.
Part 1: Image Formation
- image projection with a
pinhole camera
- motion field seen by a
moving observer (translation and rotation)
- camera model
(homogeneous coordinates, extrinsic and intrinsic parameters)
- thin lens model
(aperture, f-stop, blur, depth of field)
- lighting and
reflectance (light and shade, specularities, radiance, BRDF)
- image
capture (color, RGB, image irradiance, exposure, dynamic
range, noise, sampling)
Part 2: Image Analysis
- basic linear
systems (convolution, filtering, noise, derivatives, smoothing)
- edge detection
(Canny), corner detection (2nd moment matrix, Harris vs.
Laplacian)
- features and affine invariance (SIFT)
- motion estimation (aperture problem, Kanade-Lucas-Tomasi, Horn &
Schunck)
- linear algebra tools (matrix decompositions, QR, SVD)
- least squares minimization (pseudoinverse, non-linear methods)
- line fitting (least squares, Hough transform, RANSAC)
- finding a vanishing point
Part 3: 3D Vision
- shape from shading (sunny vs. cloudy days)
- shape from texture, defocus
- geometric camera calibration
- homographies and planar rectification
- egomotion (Rieger & Lawton)
- affine structure (factorization, Basri-Ullman, bas relief ambiguity)
- epipolar geometry, estimating the fundamental matrix
- dense stereo correspondence (dynamic programming, graph cuts)
Official Course Description from McGill Calendar
Biological vision, edge
detection, projective geometry and camera
modeling, shape from shading and texture, stereo vision, optical flow,
motion analysis, object representation, object recognition, graph
theoretic methods, high level vision, applications.
Note: this description applies to the previous version of the
course taught by Prof. Kaleem Siddiqi. While there is a large
overlap between these topics and those covered this year, there are
significant differences as well. This year, in particular, we will
not be covering object recognition or the differential geometry of
surfaces. These topics will be covered in Prof. Siddiqi's 7xx course
offered in Winter 2010.
Prerequisites
Students should have a solid background in basic Calculus,
Linear Algebra, Algorithms and Data structures, and
Programming. The official prerequisites for the
course are COMP 206, COMP 360, MATH 222, and MATH 223 (or equivalent).
It is strongly recommended that students only take this
course if they received A grades in MATH 222 and 223 (or equivalent).
Students who do
not have these prerequisites may still be allowed to take the course,
but only with the
permission of the instructor.
Lecture Notes, Readings, Textbooks
The material covered in the lectures will be available in the form of
Lecture Notes written by the
instructor and posted as PDFs
on the course web page, or as handouts. Readings will also be made available,
typically in electronic form.
There is no required textbook for the course. There are, however,
several good texts out there and students are encouraged to consult
them. These textbooks are available on reserve in the Schulich
Science and Engineering library.
Evaluation
B.Sc. students must achieve a grade of 55 (C) to pass this course.
M.Sc. and Ph.D. students must achieve a grade of 65 (B-).
There are three components to the grade (total 100%), with the midterm and final exams together counting as one component:
- Three assignments (40 %)
- Midterm exam (15 %)
- Final Exam (30 %)
- Term Paper (15 %)
A description of each of the three components is as follows:
- Assignments
(40 %)
- There will be three
assignments, of roughly equal weight.
- The assignments will
involve taking digital photographs, writing MATLAB programs to analyze
them, and answering questions related
to the lecture material. Students are not
required to know about digital photography or MATLAB prior
to the course. The principles of digital photography will be
covered in class. Examples of starter MATLAB code will be
given, and students will be asked
to extend these examples.
- Assignments must be
submitted in electronic form via WebCT.
- Policy on lateness and
other specific instructions will be specified on each assignment.
- Exams (midterm 15% + final 30% = 45 %)
The
midterm will take place during a lecture slot and will be 80
minutes long. It will occur approximately midway through the
semester and will cover approximately the first third of the lecture
material. The Final Exam will take place during
the Final Exam period and will cover the remainder of the lecture
material.
- Term
Paper (15 %)
- The student must summarize three original research articles on a
specific topic. These articles must have appeared in one of
the three main computer vision conferences, namely CVPR, ECCV, or ICCV
(links are to the most recent conference only). At least
one of these papers must be recent (since 2005).
The usual way to choose the three papers is to find
one recent paper which uses or replaces techniques developed
in two earlier papers.
- Three potential articles must be chosen by the end of
October. Students should submit the URLs of these articles (no attachments
please) along with a brief (say 100 word) description that justifies
the choice of each article. Feedback on the choice of articles will
be given within one week of submission. The choice of articles must
be finalized by Friday Nov. 13.
- A hard copy of the
Term Paper is due on the last official lecture day.
Late submissions will be given a
maximum grade of 10/15. NOTE: if the
Final Exam is scheduled very early in the Final Exam period,
then this Term Paper deadline will be extended beyond the
Final Exam period in order to give you time to study for the
Final Exam.
[ADDED Oct. 23, 2009] Because the Final Exam is being held
early in the Final Exam period, I am extending the
dates associated with the Term Paper as follows: Initial submission of
URLs and justification of articles (FRI. NOV. 13), Finalization of 3
articles (FRIDAY NOV. 27), Term Paper due (TUESDAY, DEC. 22, 5 pm).
- Further specifications:
- The submitted Term Paper must be between 1300 and 1500 words
according to the UNIX/LINUX command wc.
The instructor will verify the word count. If the word count
is not between 1300 and 1500 words,
the student will be asked to resubmit, and if the resubmission is beyond
the due date then the late penalty will be applied (see
above). This
word count does not include the Bibliography.
- The Term Paper must be written in the student's own words. It
is forbidden to copy phrases or sentences directly from the
articles unless the exact wording is crucial (for example, in a
definition). In this case, quotation marks and a proper
citation must be used. See the Student's Guide to Avoiding Plagiarism
mentioned below. It is also considered bad style to
paraphrase, and marks will be deducted if students merely re-order or
slightly change the authors' wording.
- Neither equations nor figures may be included in the Term Paper.
Instead, the equations and figures must be cited from the
article being summarized, e.g. see Eq. (5) in reference [3].
- The paper must be typeset -- either LaTeX or Word -- and
run through a spell checker. Marks will be taken off for
spelling mistakes.
- The student must
submit:
- one hard copy of the
Term Paper, including a
Bibliography
- one hard copy of
each of the three articles being
summarized
- one electronic ASCII
version of the Term Paper,
which the instructor will use to determine the word count.
The electronic version
must not
include the Bibliography.
- The Grading Scheme for
the term paper is as follows (15
points total)
- Organization (5
points)
- The Term Paper must
have a one paragraph
Introduction
outlining what the topic is. The article summaries should
then
follow. The key idea(s) of each article should be
explained in simple terms, and references to
particular equations and figures should be given to help the
reader follow the summary. Roughly equal weight should be
given to the three articles. Given the tight word
constraints, a
Conclusion section is optional and should only be included if it
includes ideas that were not covered in the Introduction.
- Possible ways of
organizing the Term Paper include:
- the development of
a particular computational
method or
theory
- the development of
one particular author's work
- a comparison of
different experimental or
computational methods for solving a particular problem
- Clarity/Readability (5
points)
- The Instructor
should be
able
to understand the main
ideas of the summarized articles by reading only the Term
Paper. For details that are not fully explained in
the summaries, the Instructor should be able
to easily index from the statements in the Term Paper to
relevant
locations in the articles. For example, give pointers to
page/paragraphs where ideas are elaborated.
- Choice of
Detail (5 points)
- Because the word
count is severely limited, it is
not possible to cover all the details in each
article. Students
must therefore decide which details to include and which to leave
out. Students will be penalized
for including details that are not central to the main ideas of the
article(s).
- The paper should be
written at a level appropriate
for the instructor, as only he will read the paper. The
student
should not re-explain concepts covered in class, nor explain standard
methods
that are used (but not introduced) by the authors.
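The 1300 to 1500 word limit for the Term Paper is measured with wc. As a rough self-check before submitting, the count can be approximated by splitting the ASCII text on whitespace, which is how `wc -w` delimits words. The sketch below is only an illustration: the function names and the sample text are made up here, the limits are assumed to be inclusive, and the instructor's own count with wc remains the one that matters.

```python
MIN_WORDS = 1300  # lower limit stated in this outline (assumed inclusive)
MAX_WORDS = 1500  # upper limit stated in this outline (assumed inclusive)


def count_words(text: str) -> int:
    """Count whitespace-delimited tokens, approximating `wc -w`."""
    return len(text.split())


def within_limit(text: str) -> bool:
    """Return True if the word count lies in [MIN_WORDS, MAX_WORDS]."""
    return MIN_WORDS <= count_words(text) <= MAX_WORDS


if __name__ == "__main__":
    # Hypothetical stand-in for the ASCII version of a Term Paper
    # (without the Bibliography, which is excluded from the count).
    sample = "word " * 1400
    print(count_words(sample), within_limit(sample))
```

On a UNIX/LINUX system the same check is simply `wc -w` applied to the ASCII file.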
In accord with McGill University's Charter of Students' Rights,
students in this course have the right to submit in English or in
French any written work that is to be graded.
Time
Management (revised post hoc: Dec. 14, 2009)
How much time should you plan on spending on this course?
If you are an undergraduate and taking a 5 course load, then this
course should take at least 20% of your time (probably around 25%).
I say "at least",
because it is a 500 level course which is the highest level for ugrad
courses. I am expecting you are in U3 and you
will only take this course if you have done well in U1 and U2,
in particular, in the prereq courses. (If you have a weak background,
then for sure you will need to do extra work to make up for it, and
so it will take more than 20% of your time.) I am also expecting
that you are taking a mix of courses at different levels, including
3xx and 4xx and maybe even 1xx and 2xx in other departments/faculties,
and so this course will be a bit more challenging for you and will
demand slightly more time.
If you are an MSc graduate student, you likely will be taking
three courses and two of these courses will be 4 credits (not 3,
like this one). In terms of workload, your 11 credits of courses together
will be the equivalent of just under four 5xx level courses (12
credits). I consider four 5xx courses to be equivalent in workload to a
five course load that is a mix of 2xx, 3xx, 4xx, and 5xx level
courses.
So let's assume you spend at least 20% of your time on this
course. The semester is 13 weeks long, plus 2 weeks
for the Final Exam period (total 15 weeks).
I believe that to get B grades, a typical McGill student
will need to work at least 40 hours per week over all
courses. (To get A grades, you will need to work more.)
Hence the total number of hours you
should spend on this course is at least 120 hours (= 40 hours per week
* 15 weeks / 5 courses). For this course, this 120 hours should break down roughly to:
- 40 hours for attending lectures
- 40 hours for reviewing lecture material to study for exams and
to review material needed to do the assignments.
- 10 hours for the Term Paper
- 10 hours for each of the three Assignments (30 total). Note:
this does not include the time needed to review lecture material
that the assignment is based on
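The arithmetic behind the 120 hour estimate and its breakdown can be checked with a few lines of code. This is just a sketch using the numbers stated above; the variable names are illustrative and a five course load is assumed.

```python
HOURS_PER_WEEK = 40  # minimum weekly workload over all courses
WEEKS = 15           # 13 weeks of semester + 2 weeks of Final Exam period
COURSES = 5          # assumed undergraduate course load

# Hours available for this one course.
course_hours = HOURS_PER_WEEK * WEEKS // COURSES

# Rough breakdown of those hours, as listed above.
breakdown = {
    "lectures": 40,
    "review": 40,
    "term_paper": 10,
    "assignments": 3 * 10,  # three assignments, 10 hours each
}

total = sum(breakdown.values())
print(course_hours, total)
```

Both numbers come out to 120, so the breakdown accounts for the whole budget.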
Related Courses
MSc or PhD students who are interested in computer vision, graphics,
robotics, or related areas are encouraged to concentrate their course
choices in these areas. Particularly relevant COMP courses are:
- COMP 557 Fundamentals of Computer Graphics (Fall 2009)
- COMP 646 Machine Learning (Fall 2009)
- COMP 599 Computer Animation (Winter 2010)
- COMP 755 Mobile Robotics (Winter 2010)
- COMP 766 Shape Analysis in Computer Vision (Winter 2010)
Other
useful Links
Academic Integrity
McGill University values
academic integrity. Therefore, all students
must understand the meaning and consequences of cheating,
plagiarism and other academic offenses
under the Code of Student Conduct and Disciplinary Procedures.
See
here for more information