Selected Publications (* indicates corresponding author)
Jinxuan Li, Zihang Lin, Jian-Fang Hu*, Chaolei Tan, Tianming Liang, Zhi Jin, and Wei-Shi Zheng, "Collaborative Static and
Dynamic Vision-Language Learning for Spatio-Temporal Video Grounding", arXiv, 2025.
Jianwei Tang, Jian-Fang Hu*, Tianming Liang, Xiaotong Lin, Jiangxin Sun, Wei-Shi Zheng, Jianhuang Lai, "Human Motion
Prediction via Continual Prior Compensation", arXiv, 2025.
Jinxuan Li, Yi Zhang, Jian-Fang Hu*, Chaolei Tan, Tianming Liang, Beihao Xia, "TubeRMC: Tube-conditioned Reconstruction
with Mutual Constraints for Weakly-supervised Spatio-Temporal Video Grounding", arXiv, 2025.
Heng Li, Xiaotong Lin, Ling-An Zeng, Yulei Kang, Shuai Li, Jian-Fang Hu*, "MotionHiFlow: Text-to-Motion via Hierarchical
Flow Matching", arXiv, 2025.
Xiaotong Lin, Tianming Liang, Jian-Fang Hu*, Kun-Yu Lin, Yulei Kang, Chunwei Tian, Jianhuang Lai, Wei-Shi Zheng,
"CoopDiff: Anticipating 3D Human-object Interactions via Contact-consistent Decoupled Diffusion", arXiv, 2025.
Tianming Liang, Haichao Jiang, Yuting Yang, Chaolei Tan, Shuai Li, Wei-Shi Zheng, Jian-Fang Hu*, "Long-RVOS: A
Comprehensive Benchmark for Long-term Referring Video Object Segmentation", arXiv, 2025.
Shenghao Fu, Qize Yang, Yuan-Ming Li, Yi-Xing Peng, Kun-Yu Lin, Xihan Wei, Jian-Fang Hu*, Xiaohua Xie, Wei-Shi Zheng,
"ViSpeak: Visual Instruction Feedback in Streaming Videos", International Conference on Computer Vision (ICCV), 2025.
Tianming Liang, Kun-Yu Lin, Chaolei Tan, Jianguo Zhang, Wei-Shi Zheng, Jian-Fang Hu*, "ReferDINO: Referring Video Object
Segmentation with Visual Grounding Foundations", International Conference on Computer Vision (ICCV), 2025.
Jianwei Tang, Hong Yang, Tengyue Chen, Jian-Fang Hu*, "Stochastic Human Motion Prediction with Memory of Action
Transition and Action Characteristic", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
Wei-Jin Huang, Yuan-Ming Li, Zhi-Wei Xia, Yu-Ming Tang, Kun-Yu Lin, Jian-Fang Hu*, Wei-Shi Zheng, "Modeling Multiple
Normal Action Representations for Error Detection in Procedural Tasks", IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2025.
Dian Zheng, Cheng Zhang, Xiao-Ming Wu, Cao Li, Chengfei Lv, Jian-Fang Hu*, Wei-Shi Zheng, "Panorama Generation From NFoV
Image Done Right", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
Yulei Kang, Teng-Yue Chen, Xiaotong Lin, Siyu Jiang, Jian-Fang Hu*, "Recovering Human Mesh from Videos by 2D and 3D
Deformable Attentions", IEEE International Conference on Multimedia and Expo (ICME), 2025.
Heng Li, Xing Liufu, Xiaotong Lin, Jian Zhu, Jian-Fang Hu*, "Efficient Text-to-Motion via Multi-Head Generative Masked
Modeling", IEEE International Conference on Multimedia and Expo (ICME), 2025.
Yaokun Zhong, Siyu Jiang, Jian Zhu, Jian-Fang Hu*, "Context Consistency Learning via Sentence Removal for Semi-Supervised
Video Paragraph Grounding", IEEE International Conference on Multimedia and Expo (ICME), 2025.
Fuxing Liu, Chaolei Tan, Xiaotong Lin, Yonggang Qi, Jinxuan Li, Jian-Fang Hu*, "SAUGE: Taming SAM for
Uncertainty-Aligned Multi-Granularity Edge Detection", AAAI Conference on Artificial Intelligence (AAAI), 2025.
Tianming Liang, Linhui Li, Jian-Fang Hu*, Xiangyang Yu, Wei-Shi Zheng, and Jianhuang Lai, "Rethinking Temporal Context
in Video-QA: A Comprehensive Study of Single-frame Static Bias", IEEE Transactions on Multimedia (TMM), 2024.
Jin Li, Ziqiang He, Anwei Luo, Jian-Fang Hu, Z. Jane Wang, and Xiangui Kang, "AdvAD: Exploring Non-Parametric Diffusion
for Imperceptible Adversarial Attacks", Conference on Neural Information Processing Systems (NeurIPS), 2024.
Jia-Run Du, Jia-Chang Feng, Kun-Yu Lin, Fa-Ting Hong, Zhongang Qi, Ying Shan, Jian-Fang Hu*, and Wei-Shi Zheng*,
"Weakly-Supervised Temporal Action Localization by Progressive Complementary Learning", IEEE Transactions on Circuits
and Systems for Video Technology (IEEE TCSVT), 2024.
Linhui Li, Xiaotong Lin, Yejia Huang, Zizhen Zhang*, and Jian-Fang Hu*, "Beyond Minimum-of-N: Rethinking the Evaluation
and Methods of Pedestrian Trajectory Prediction", IEEE Transactions on Circuits and Systems for Video Technology (IEEE
TCSVT), 2024.
Chaolei Tan, Zihang Lin, Junfu Pu, Zhongang Qi, Wei-Yi Pei, Zhi Qu, Yexin Wang, Ying Shan, Wei-Shi Zheng, and
Jian-Fang Hu*, "SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses", ACM
Multimedia (ACM MM), 2024.
Xiaotong Lin, Tianming Liang, Jianhuang Lai, and Jian-Fang Hu*, "Progressive Pretext Task Learning for Human Trajectory
Prediction", European Conference on Computer Vision (ECCV), 2024.
Tianming Liang, Chaolei Tan, Beihao Xia, Wei-Shi Zheng, and Jian-Fang Hu*, "Ranking Distillation for Open-Ended Video
Question Answering with Insufficient Labels", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Chaolei Tan, Jianhuang Lai, Wei-Shi Zheng, and Jian-Fang Hu*, "Siamese Learning with Joint Alignment and Regression for
Weakly-Supervised Video Paragraph Grounding", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
Dian Zheng, Xiao-Ming Wu, Shu-Zhou Yang, Jian Zhang, Jian-Fang Hu, and Wei-Shi Zheng, "Selective Hourglass Mapping for
Universal Image Restoration Based on Diffusion Model", IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
2024.
Jianwei Tang, Jiangxin Sun, Xiaotong Lin, Lifang Zhang, Wei-Shi Zheng, and Jian-Fang Hu*, "Temporal Continual Learning
with Prior Compensation for Human Motion Prediction", Conference on Neural Information Processing Systems (NeurIPS),
2023.
Zihang Lin, Chaolei Tan, Jian-Fang Hu*, Zhi Jin, Tiancai Ye, and Wei-Shi Zheng, "Collaborative Static and Dynamic
Vision-Language Streams for Spatio-Temporal Video Grounding", IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), 2023. (Winner of ACM MM 2022 Workshop Person In Context Challenge)
Chaolei Tan, Zihang Lin, Jian-Fang Hu*, Wei-Shi Zheng, and Jianhuang Lai, "Hierarchical Semantic Correspondence Networks
for Video Paragraph Grounding", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Jiangxin Sun, Chunyu Wang, Huang Hu, Hanjiang Lai, Zhi Jin, and Jian-Fang Hu*, "You Never Stop Dancing: Non-freezing
Dance Generation via Bank-constrained Manifold Projection", Conference on Neural Information Processing Systems
(NeurIPS), 2022.
Jiangxin Sun, Zihang Lin, Xintong Han, Jian-Fang Hu*, Jia Xu, and Wei-Shi Zheng, "Action-guided 3D Human Motion
Prediction", Conference on Neural Information Processing Systems (NeurIPS), 2021.
Jian-Fang Hu#, Jiangxin Sun#, Zihang Lin, Jianhuang Lai, Wenjun Zeng, and Wei-Shi Zheng, "APANet: Auto-Path Aggregation
for Future Instance Segmentation Prediction", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI),
44 (7), 3386-3403, 2022.
Jian-Fang Hu*, Wei-Shi Zheng, Lianyang Ma, Gang Wang, Jianhuang Lai, and Jianguo Zhang, "Early Action Prediction by Soft
Regression", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 41 (11), 2568-2583, 2019.
Jian-Fang Hu, Wei-Shi Zheng, Jianhuang Lai, and Jianguo Zhang, "Jointly Learning Heterogeneous Features for RGB-D
Activity Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 39 (11), 2186-2200, 2017.
Zihang Lin#, Jiangxin Sun#, Jian-Fang Hu*, Qizhi Yu, Jianhuang Lai, Wei-Shi Zheng, "Predictive Feature Learning for
Future Segmentation Prediction", International Conference on Computer Vision (ICCV), 2021.
Xionghui Wang, Jian-Fang Hu*, Jianhuang Lai, Jianguo Zhang, and Wei-Shi Zheng, "Progressive Teacher-student Learning
for Early Action Prediction", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
Guoliang Pang, Xionghui Wang, Jian-Fang Hu*, Qing Zhang, and Wei-Shi Zheng, "DBDNet: Learning Bi-directional Dynamics
for Early Action Prediction", International Joint Conference on Artificial Intelligence (IJCAI), 2019.
Jian-Fang Hu, Wei-Shi Zheng, Jiahui Pan, Jianhuang Lai, and Jianguo Zhang, "Deep Bilinear Learning for RGB-D Action
Recognition", European Conference on Computer Vision (ECCV), 2018.
Jiafeng Xie, Bing Shuai, Jian-Fang Hu*, Jingyang Lin, and Wei-Shi Zheng, "Improving Fast Segmentation with
Teacher-student Learning", British Machine Vision Conference (BMVC), 2018.
Shaofan Lai, Wei-Shi Zheng, Jian-Fang Hu, and Jianguo Zhang, "Global-Local Temporal Saliency Action Prediction",
IEEE Transactions on Image Processing (TIP), 27 (5), 2272-2285, 2018.
Jiachi He, Jian-Fang Hu, Xi Lu, Wei-Shi Zheng, "Multi-Task Mid-Level Feature Learning for Micro-Expression
Recognition", Pattern Recognition (PR), 66 (6), 44-52, 2017.
Jian-Fang Hu, Wei-Shi Zheng, Xiaohua Xie, and Jianhuang Lai, "Sparse Transfer for Facial Shape-from-Shading",
Pattern Recognition (PR), 68 (8), 272-285, 2017.
Jian-Fang Hu, Wei-Shi Zheng, Lianyang Ma, Gang Wang, and Jianhuang Lai, "Real-time RGB-D Activity Prediction by
Soft Regression", European Conference on Computer Vision (ECCV), 280-296, 2016.
Zhaoze Zhou, Wei-Shi Zheng, Jian-Fang Hu, Yong Xu, Jane You, "One-pass Online Learning: A Local Approach",
Pattern Recognition (PR), 51, 346-357, 2016.
Jian-Fang Hu, Wei-Shi Zheng, Jianhuang Lai, and Jianguo Zhang, "Jointly Learning Heterogeneous Features for
RGB-D Activity Recognition", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5344-5352, 2015.
Jian-Fang Hu, Wei-Shi Zheng, Jianhuang Lai, Shaogang Gong, and Tao Xiang, "Recognising Human-Object Interaction
via Exemplar based Modelling", International Conference on Computer Vision (ICCV), 3144-3151, 2013.
Jian-Fang Hu, Guocan Feng, Jianhuang Lai, and Wei-Shi Zheng, "Asymmetric Facial Shape based on Some Symmetry
Assumptions", Chinese Conference on Biometric Recognition (CCBR), 42-49, 2011.