The Visual Learning Group researches methods for learning models of the real world from images and video. Our most recent work leverages deep learning to address challenging problems at the boundary between computer vision and machine learning. Projects include image categorization, action recognition, depth estimation from a single photo, and 3D reconstruction of human movement from monocular video.


news


  1. New paper to appear at NIPS 2018:

    Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization,
    with Bruno Korbar and Du Tran.

  2. Three papers presented at ECCV 2018:

    Object Detection in Video with Spatiotemporal Sampling Networks,
    with Gedas Bertasius and Jianbo Shi.

    MaskConnect: Connectivity Learning by Gradient Descent,
    with Karim Ahmed.

    Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset,
    with Jamie Ray, Heng Wang, Du Tran, Yufei Wang, Matt Feiszli, and Manohar Paluri.

  3. Three papers on video models presented at CVPR 2018:

    A Closer Look at Spatiotemporal Convolutions for Action Recognition,
    with Du Tran, Heng Wang, Jamie Ray, Yann LeCun, and Manohar Paluri.

    Detect-and-Track: Efficient Pose Estimation in Videos,
    with Rohit Girdhar, Georgia Gkioxari, Manohar Paluri, and Du Tran.

    What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets,
    with De-An Huang, Vignesh Ramanathan, Dhruv Mahajan, Juan Carlos Niebles, Fei-Fei Li, and Manohar Paluri.

  4. Together with collaborators, we organized two workshops at CVPR 2018:

    Brave New Ideas for Video Understanding

    DeepGlobe: A Challenge for Parsing the Earth through Satellite Images

  5. New paper presented at WACV 2018:

    BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections,
    with Karim Ahmed.

  6. New paper to appear in Computer Vision and Image Understanding:

    Multiple Hypothesis Colorization and Its Application to Image Compression,
    with Haris Baig.

  7. We released VideoMCC, our new dataset for video comprehension, which includes over 600 hours of video. Give it a try!

sponsors

We thank the following sources for supporting our research.

Logo design by Christine Claudino