Fahim Mannan's Webpage

About
Course Work
Teaching
Projects
Research

I'm currently a Computer Vision and Imaging Researcher at Algolux, working on optimizing Image Signal Processors (ISPs) for computer vision applications. Previously, I was a Postdoc with Derek Nowrouzezahrai and worked in collaboration with Jean-François Lalonde and Algolux. I obtained my PhD (2011 - 2016) and MSc from the School of Computer Science under the supervision of Michael Langer.

PhD Thesis: On Optimal Depth from Defocus

PhD Supervisor: Michael Langer

Department: School of Computer Science

Research Interests: Computer Vision (3D reconstruction, defocus, stereo, segmentation, scene understanding),
Robotics (vision-based navigation, motion planning and multirobot systems),
and Computer Graphics (physically-based rendering, fluid simulation).

Related Publications:

3DV 2016: Discriminative Filters for Depth from Defocus

CRV 2016: Blur Calibration for Depth from Defocus

CRV 2016: What is a Good Model for Depth from Defocus?

3DV 2015: Optimal Camera Parameters for Depth from Defocus

Other Activities:

Jan 2013 - April 2013: MITACS Accelerate Research Intern @ Hololabs Inc.

Summer 2012: Visiting student at the Computer Vision lab @ University of Western Ontario working with Prof. Olga Veksler

Summer 2011: Attended ICVSS 2011 and CVML 2011

Jun - Dec 2010: Intern @ INRIA-Saclay (Galen)/École Centrale Paris (MAS), supervised by Prof. Nikos Paragios.

Winter 2014

COMP-766 Shape Analysis [Project: Inferring Differential Structure from Defocused Images]

Winter 2012

COMP-567 Integer Programming

Winter 2011

COMP-642 Numerical Estimation

MATH-560 Optimization

Winter 2010

MATH-552 Combinatorial Optimization

Winter 2008

COMP-646 Computational Perception

ECSE-626 Statistical Computer Vision [Project: Interactive Image Segmentation, project page]

COMP-764B Advanced Computer Graphics and Animation

Fall 2007

COMP-765 Planning Algorithms [Project: Comparative Study of PCD and PRM, report]

COMP-652 Machine Learning [Project: Gender Classification from face image, report]

COMP-558 Fundamentals of Computer Vision

COMP-557 Computer Graphics

Winter 2015

Teaching Assistant for COMP-557 Computer Graphics

Fall 2009

Teaching Assistant for COMP-558 Fundamentals of Computer Vision

Fall 2008

Teaching Assistant for COMP-557 Computer Graphics

Winter 2008

Teaching Assistant for COMP-322 Programming in C++

Teaching Assistant for COMP-310/ECSE-427 Operating Systems

Winter 2013

Hololabs and Mitacs funded project on segmentation.

Summer 2012 and 2013

Graph Cut based segmentation using soft shape priors.

Winter 2012

Flight Scheduling Problem: This was a course project for Discrete Optimization 2. We developed a linear programming model for scheduling flights between cities under constraints on the maximum distance travelled, fuel consumption, plane capacity, etc. A Matlab script took the problem description and, after pre-processing, generated a ZIMPL description of the linear program, which was then solved using CPLEX.

June - Dec 2010

Optical flow using Dual Decomposition @ ECP/INRIA

Summer 2008

Interactive Mask Editor for Hugin/Panotools (Google Summer of Code 2008)

As part of Google Summer of Code 2008, I developed an interactive mask editor for Hugin/Panotools.

Winter 2008

Interactive Image Segmentation

Statistical Computer Vision

This project is based on Boykov & Jolly's "Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images", ICCV, July 2001. The objective of the project is to implement the MRF model outlined in the paper and to solve the optimization problem using the efficient max-flow/min-cut implementation by Yuri Boykov and Vladimir Kolmogorov.
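
As a rough illustration of the min-cut idea behind this project, the sketch below segments a tiny 1D "image" into object and background. It is not the project code: the function names, the absolute-difference regional cost, the constant boundary weight, and the plain Edmonds-Karp solver are all simplifying assumptions made for this example (the paper uses likelihood-based regional terms and contrast-sensitive boundary terms, and the project used the Boykov-Kolmogorov solver).

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow. Returns (flow, nodes on the source side of a min cut)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck
    # Nodes still reachable from the source form the source side of the cut.
    return flow, {i for i in range(n) if parent[i] != -1}


def segment_1d(intensities, fg_seeds, bg_seeds, fg_mean, bg_mean, smoothness=2):
    """Label each pixel 1 (object) or 0 (background) with a graph cut."""
    n = len(intensities)
    source, sink = n, n + 1
    hard = 10**6  # effectively infinite capacity for user-marked seeds
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for i, value in enumerate(intensities):
        if i in fg_seeds:
            cap[source][i] = hard
        elif i in bg_seeds:
            cap[i][sink] = hard
        else:
            cap[source][i] = abs(value - bg_mean)  # paid if pixel ends up background
            cap[i][sink] = abs(value - fg_mean)    # paid if pixel ends up object
        if i + 1 < n:
            # Constant (Potts-like) penalty for separating neighbouring pixels.
            cap[i][i + 1] = cap[i + 1][i] = smoothness
    _, source_side = max_flow_min_cut(cap, source, sink)
    return [1 if i in source_side else 0 for i in range(n)]


labels = segment_1d([9, 8, 9, 2, 1, 2], fg_seeds={0}, bg_seeds={5},
                    fg_mean=9, bg_mean=1)
print(labels)
```

The dense adjacency matrix keeps the example short; a real image would need a sparse graph and a solver tuned for grid graphs.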

Wave Particles

Advanced Computer Graphics and Animation

This project is based on the paper "Wave Particles" by Cem Yuksel et al. published in SIGGRAPH 2007.

Fall 2007

Comparative Study of Probabilistic Cell Decomposition and Probabilistic Roadmap

Planning Algorithms

Implemented the Probabilistic Cell Decomposition and Probabilistic Roadmap methods and compared them using different obstacles and workspaces. The implementation was done in C++ and OpenGL, and the Proximity Query Package (PQP) was used for collision detection. [report]
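
For readers unfamiliar with probabilistic roadmaps, here is a minimal 2D sketch of the general technique in Python. It is not the project's C++ code: the unit-square workspace, circular obstacles, point robot, and all function and parameter names are assumptions made for this illustration, and a simple geometric test stands in for PQP.

```python
import heapq
import math
import random


def segment_hits_circle(p, q, center, radius):
    """True if the segment p-q comes within `radius` of `center`."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0
    else:
        # Parameter of the point on the segment closest to the circle centre.
        t = ((center[0] - p[0]) * dx + (center[1] - p[1]) * dy) / length_sq
        t = max(0.0, min(1.0, t))
    nearest = (p[0] + t * dx, p[1] + t * dy)
    return math.dist(nearest, center) <= radius


def is_free(p, q, obstacles):
    """True if the straight segment p-q avoids every (center, radius) obstacle."""
    return not any(segment_hits_circle(p, q, c, r) for c, r in obstacles)


def prm_path(start, goal, obstacles, n_samples=200, connect_radius=0.3, seed=0):
    """Build a roadmap in the unit square and return a path, or None if not found."""
    rng = random.Random(seed)
    nodes = [start, goal]
    # Sampling phase: keep random configurations that are collision-free.
    # (Assumes the obstacles do not cover the whole workspace.)
    while len(nodes) < n_samples + 2:
        p = (rng.random(), rng.random())
        if is_free(p, p, obstacles):
            nodes.append(p)
    # Connection phase: link nearby nodes whose connecting segment is free.
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= connect_radius and is_free(nodes[i], nodes[j], obstacles):
                edges[i].append((j, d))
                edges[j].append((i, d))
    # Query phase: Dijkstra from node 0 (start) to node 1 (goal).
    best, parent, heap = {0: 0.0}, {}, [(0.0, 0)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == 1:
            break
        if cost > best.get(u, math.inf):
            continue  # stale heap entry
        for v, d in edges[u]:
            if cost + d < best.get(v, math.inf):
                best[v] = cost + d
                parent[v] = u
                heapq.heappush(heap, (cost + d, v))
    if 1 not in best:
        return None
    path = [1]
    while path[-1] != 0:
        path.append(parent[path[-1]])
    return [nodes[i] for i in reversed(path)]


obstacles = [((0.5, 0.5), 0.2)]
path = prm_path((0.1, 0.1), (0.9, 0.9), obstacles)
print("no path found" if path is None else f"path with {len(path)} waypoints")
```

Because sampling is random, a returned path is collision-free but not guaranteed to exist for every seed or sample count.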

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM

Machine Learning

Used dimensionality reduction techniques (PCA and ICA) to project images onto a low-dimensional subspace and then classified them using an SVM. Experiments were carried out by varying the training set size, normalization, histogram equalization and input data scaling. Matlab and LibSVM were used for this project. [report]

Estimating Distance and Spatial Separation between Obstacles using Stereo Vision

Fundamentals of Computer Vision

Implemented an algorithm for estimating the depth of obstacles by first determining the contours of the obstacles and then determining the distance of the contour points from the stereo system.
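
The distance step can be illustrated with standard stereo triangulation. The sketch below assumes a rectified stereo pair and a pinhole camera model; the function name and the camera parameters (focal length in pixels, baseline, principal point) are hypothetical values chosen for the example, not taken from the project.

```python
import math


def triangulate(x_left, x_right, y, focal_px, baseline_m, cx, cy):
    """Back-project a matched pixel pair into 3D (metres, left-camera frame)."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    z = focal_px * baseline_m / disparity  # depth is inversely proportional to disparity
    x = (x_left - cx) * z / focal_px
    y3d = (y - cy) * z / focal_px
    return (x, y3d, z)


# Two contour points on different obstacles (hypothetical matches).
a = triangulate(420.0, 380.0, 240.0, focal_px=800.0, baseline_m=0.1, cx=320.0, cy=240.0)
b = triangulate(200.0, 180.0, 240.0, focal_px=800.0, baseline_m=0.1, cx=320.0, cy=240.0)
print("distance to first obstacle:", a[2])
print("separation between obstacles:", math.dist(a, b))
```

Because depth varies as the inverse of disparity, a one-pixel matching error matters far more for distant obstacles than for near ones.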

Past Projects

Vision-based Line Following Robot

For this project, we developed a differential-drive mobile robot that can follow a line. We used a webcam as the primary sensor. We also developed a line extraction algorithm using OpenCV and control software that uses the data provided by the vision system to generate appropriate motor signals for controlling the robot's motion. My contribution was in developing the control software. As the team lead, I also oversaw the mechanical and electrical aspects of the robot.

Robot Arm Controller

In this project, a 3-DoF robot arm controller was developed. The user provides the source and destination locations, and the algorithm determines the shortest collision-free path from the source to the goal. Finally, it sends stepper motor control signals to the actual robot arm via the LPT port.

Online Motion Planning for Cooperative Multirobot System

I developed this as part of my undergraduate project work. The project is concerned with developing a distributed, decentralized motion planner for a cooperative multirobot system.

Network Monitoring Tool

This project was developed as part of an Advanced Networking project. The tool captures packets from the network using WinPcap and analyzes the contents of those packets. In this respect, it works like network packet capturing tools such as Ethereal. However, the tool is also capable of detecting other packet capturing programs on the network. There are several ways of doing this. One approach is to send fake ARP requests, which should be ignored by all nodes except those running a sniffer. Additionally, I implemented an ARP cache poisoning feature to experiment with how it works.

Port Scanner

This is a basic port scanner that was implemented using WinPcap. It works by sending SYN request packets and listening for SYN-ACK responses. Additionally, I’ve also implemented FIN-based scanning. To avoid being blocked by firewalls, it sends packets in small chunks.

Publications:

Journal

J1. M. Langer and F. Mannan, Visibility in three-dimensional cluttered scenes, J. Opt. Soc. Am. A 29, 1794-1807 (2012).

Conference

C6. F. Mannan and M. S. Langer, Discriminative Filters for Depth from Defocus, 3DV 2016

C5. F. Mannan and M. S. Langer, Blur Calibration for Depth from Defocus, CRV 2016

C4. F. Mannan and M. S. Langer, What is a Good Model for Depth from Defocus?, CRV 2016

C3. F. Mannan and M. S. Langer, Optimal Camera Parameters for Depth from Defocus, 3DV 2015

C2. F. Mannan and M. S. Langer, Performance of Stereo Methods in Cluttered Scenes, 8th Canadian Conference on Computer and Robot Vision, 2011

C1. F. Mannan and M. S. Langer, Performance of MRF-based Stereo Algorithms for 3D Cluttered Scenes, in Adv. in Intelligent and Soft Computing, Springer, 2010, Vol. 83/2010, 125-136.

PhD Work:

Depth-from-Defocus:

My PhD work is about finding novel ways of improving the performance of Depth-from-Defocus (DFD) algorithms. Depth-from-Defocus works by estimating the depth-dependent defocus blur at every pixel. A finite-aperture lens can focus on only a single plane at a time. Everything outside that focal plane is defocused, and the amount of defocus varies with the depth of the object. The problem of estimating depth from defocus has been studied since the late 1980s. My goal is to find ways to improve the performance of DFD based on the scene, camera parameters and setup.
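
To make the depth dependence of defocus concrete, the sketch below computes the blur circle diameter under the textbook thin-lens model. This is only an illustration of the geometry, not the blur model or calibration used in the thesis; the function name and the example lens settings are assumptions.

```python
def blur_circle_diameter(depth, focus_depth, focal_length, f_number):
    """Thin-lens blur circle diameter on the sensor (all lengths in metres).

    With aperture diameter A = f / N and the sensor at distance s behind the
    lens such that 1/f = 1/focus_depth + 1/s, a point at `depth` produces a
    blur circle of diameter A * s * |1/focus_depth - 1/depth|.
    """
    if depth <= focal_length or focus_depth <= focal_length:
        raise ValueError("depths must be larger than the focal length")
    aperture = focal_length / f_number
    sensor_distance = 1.0 / (1.0 / focal_length - 1.0 / focus_depth)
    return aperture * sensor_distance * abs(1.0 / focus_depth - 1.0 / depth)


# Hypothetical 50 mm lens at f/2, focused at 1 m.
for depth in (0.75, 1.0, 1.5, 2.0):
    blur_mm = 1000.0 * blur_circle_diameter(depth, 1.0, 0.05, 2.0)
    print(f"depth {depth:4.2f} m -> blur diameter {blur_mm:.3f} mm")
```

Under this model the blur is zero on the focal plane and grows on both sides of it, so the blur size in a single image does not say which side of the focal plane a point lies on.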

Segmentation

Besides DFD, I also worked on scene segmentation. Here the goal was to develop graph cut based algorithms with soft shape constraints.

Master's Thesis:

Markov Random Field based Methods for Cluttered Scene Stereo

Abstract:

This thesis studies the performance of different Markov Random Field (MRF) based stereo formulations for cluttered scenes. Cluttered scenes have objects of a specific size distribution placed randomly in 3D space. Real-world examples of such scenes include forest canopies, bushes, and foliage in general. One characteristic of such scenes is that they contain many depth discontinuities and partially visible pixels. A natural question, which is addressed in this thesis, is how well existing stereo algorithms perform on such scenes. Some of the widely used benchmark datasets do not contain stereo pairs with dense clutter. Therefore, we use a cluttered scene model [1] to generate synthetic scenes with different scene parameters such as the size and density of objects and the range of depth. In our experiments we apply algorithms with basic and visibility constraints. In the basic category we use Expansion, Swap, Max Product Belief Propagation (BP-M), Sequential Tree Reweighted Message Passing (TRW-S) and Sequential Belief Propagation (BP-S) with different forms of data and smoothness terms. In the visibility constraint category we use KZ1 and KZ2, proposed in [2, 3, 4]. The algorithms are applied to the input dataset with different parameter settings. To compare the performance, we consider the percentage of mislabeled pixels, the errors in certain regions, and the contribution of the errors in those regions to the total error. We also analyze the cause of those errors using the underlying scene statistics. For the basic formulation, the Potts model performs surprisingly well in all the experiments, in the sense that binocularly visible surface points are correctly labeled. In particular, Expansion, TRW-S, and BP-M perform equally well. Algorithms with visibility constraints also perform equally well for binocular pixels, and in some cases slightly better than the basic formulation. We did not observe any clear improvement in labeling binocular pixels. However, visibility constraints perform considerably better than the basic formulation when all the pixels are considered. This is also reflected in the energy measure. Algorithms based on the basic formulation show a large gap between the ground truth and output energy, whereas formulations with visibility constraints have energy values closer to the ground truth. This is because the visibility constraint restricts the search space to disparity labels that are consistent. We conclude that methods like KZ1 can primarily improve the labeling of monocular pixels. For binocular pixels, there is still room for improvement in both formulations, especially in the case of off-by-one errors (i.e. cases where the assigned labels differ from the ground truth by a single disparity).
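
The evaluation quantities mentioned in the abstract (Potts-model energy, percentage of mislabeled pixels, off-by-one errors) can be sketched as follows. This is a toy illustration rather than the thesis code: the function names, the 4-connected grid, the tiny label maps, and the absolute-difference data cost are assumptions made for the example.

```python
def potts_energy(labels, data_cost, smoothness_weight):
    """MRF energy: per-pixel data costs plus a constant Potts penalty for
    every pair of 4-connected neighbours with different labels."""
    rows, cols = len(labels), len(labels[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            energy += data_cost[r][c][labels[r][c]]
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                energy += smoothness_weight
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                energy += smoothness_weight
    return energy


def label_errors(labels, truth):
    """Return (% mislabeled pixels, % pixels that are off by exactly one label)."""
    total = wrong = off_by_one = 0
    for label_row, truth_row in zip(labels, truth):
        for label, true_label in zip(label_row, truth_row):
            total += 1
            if label != true_label:
                wrong += 1
                if abs(label - true_label) == 1:
                    off_by_one += 1
    return 100.0 * wrong / total, 100.0 * off_by_one / total


truth = [[0, 0, 2], [0, 2, 2]]    # ground-truth disparity labels
output = [[0, 1, 2], [0, 2, 0]]   # labels from a hypothetical stereo algorithm
# Illustrative data cost: distance of each candidate label from the true label.
data_cost = [[[abs(l - t) for l in range(3)] for t in row] for row in truth]

print("ground-truth energy:", potts_energy(truth, data_cost, 1.0))
print("output energy:", potts_energy(output, data_cost, 1.0))
print("mislabeled %, off-by-one %:", label_errors(output, truth))
```

Comparing the output energy with the ground-truth energy is the kind of energy gap the abstract refers to; in practice the data cost comes from image matching rather than from the ground truth.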

Download from eScholarship@McGill

Fahim Mannan

Ph.D. in Computer Science