C3D: Generic Features for Video Analysis

Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

C3D is a modified version of BVLC Caffe [2] that supports 3-dimensional convolutional networks (3D ConvNets). C3D can be used to train, test, or fine-tune 3D ConvNets efficiently. We also provide our pre-trained C3D model, which was trained on the Sports-1M dataset [3], along with the tools needed to extract video features.


Sport classification using C3D on the Sports-1M dataset. Video frames are visualized with the top 2 predictions.




This project is supported by:

Facebook AI Research Visual Learning Group


If you find C3D helpful for your research, please cite the following papers:

  1. D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning Spatiotemporal Features with 3D Convolutional Networks, ICCV 2015.
  2. Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, Caffe: Convolutional Architecture for Fast Feature Embedding, arXiv 2014.
  3. A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei, Large-scale Video Classification with Convolutional Neural Networks, CVPR 2014.

Need further help? Email me: trandu -at- fb.com or post here.