Jie (Jay) Mei
I am a fifth-year Ph.D. student in the Information Processing Lab at the University of Washington, Seattle, where I am fortunate to be advised by Prof. Jenq-Neng Hwang. My research involves deep learning, lifelong learning, multimodal learning (vision + language), and 3D vision. I recently finished a real-time NeRF rendering project as a deep learning research intern at Apple in summer 2023. Before that, I worked on a vision-language pre-training project as a research intern at Google Brain. In summer 2022, I was a research scientist intern on the Maps CV team at Reality Labs, Meta Platforms, Inc., working on panoptic segmentation of LiDAR point clouds. I was also a software engineer intern at Megvii, China, in summer 2019, working on few-shot object detection. Prior to my Ph.D. studies, I was fortunate to be advised by Distinguished Prof. Demetri Terzopoulos during the UCLA CSST program. As an undergraduate at Beijing Institute of Technology, I received the university's highest honor, the Principal 'Teli Xu' Scholarship. I was also fortunate to be advised by Prof. Shengjin Wang of Tsinghua University on my graduation project.
3D Vision, Apple Maps | Deep Learning Research Intern | (Jun 2023 - Sep 2023)
Vision and Language Team, Google Brain | Research Intern + Part-time Student Researcher | (Sep 2022 - Apr 2023)
Maps CV Team, Reality Labs, Meta | Research Scientist Intern | (Jun 2022 - Sep 2022)
Image and Video Group, Megvii | Software Engineer Intern | (Jun 2019 - Sep 2019)
Scale-up NeRF Pipeline and Real-time Rendering "In this project, we present a scaled-up NeRF pipeline that enables real-time on-device rendering."
SLVP: Self-supervised Language-Video Pre-training for Referring Video Object Segmentation "In this paper, we present a general self-supervised language-video pre-training (SLVP) strategy that brings a non-negligible improvement on the downstream pixel-level Referring-VOS task."
@inproceedings{mei2024slvp,
  title={SLVP: Self-Supervised Language-Video Pre-Training for Referring Video Object Segmentation},
  author={Mei, Jie and Piergiovanni, AJ and Hwang, Jenq-Neng and Li, Wei},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={507--517},
  year={2024}
}
HCIL: Hierarchical Class Incremental Learning for Longline Fishing Visual Monitoring "This work introduces a Hierarchical Class Incremental Learning (HCIL) model, which significantly improves on state-of-the-art hierarchical classification methods under the CIL scenario."
@misc{mei2022hcil,
  title={HCIL: Hierarchical Class Incremental Learning for Longline Fishing Visual Monitoring},
  author={Jie Mei and Suzanne Romain and Craig Rose and Kelsey Magrane and Jenq-Neng Hwang},
  year={2022},
  eprint={2202.13018},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
Unsupervised Severely Deformed Mesh Reconstruction (DMR) from a Single-View Image "This paper proposes an unsupervised mesh reconstruction method for severely deformed objects from a single-view image."
@misc{mei2022unsupervised,
  title={Unsupervised Severely Deformed Mesh Reconstruction (DMR) from a Single-View Image},
  author={Jie Mei and Jingxi Yu and Suzanne Romain and Craig Rose and Kelsey Magrane and Graeme LeeSon and Jenq-Neng Hwang},
  year={2022},
  eprint={2201.09373},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
Instance Tracking and Semantic Segmentation "This work achieved first place in the ICCV 2021 BMTT Challenge."
arxiv (KITTI) /
arxiv (MOT) /
bibtex
@article{wanghvps,
  title={HVPS: A Human Video Panoptic Segmentation Framework},
  author={Wang, Yizhou and Zhang, Haotian and Jiang, Zhongyu and Mei, Jie and Yang, Cheng-Yen and Cai, Jiarui and Hwang, Jenq-Neng and Kim, Kwang-Ju and Kim, Pyong-Kun}
}
@article{zhangu3d,
  title={U3D-MOLTS: Unified 3D Monocular Object Localization, Tracking and Segmentation},
  author={Zhang, Haotian and Wang, Yizhou and Jiang, Zhongyu and Yang, Cheng-Yen and Mei, Jie and Cai, Jiarui and Hwang, Jenq-Neng and Kim, Kwang-Ju and Kim, Pyong-Kun}
}
Absolute 3D Pose Estimation and Length Measurement of Severely Deformed Fish from Monocular Videos in Longline Fishing "This video-based method estimates the absolute 3D pose and length of a fish from single-view 2D segmentation masks alone."
@inproceedings{mei2021absolute,
  title={Absolute 3D Pose Estimation and Length Measurement of Severely Deformed Fish from Monocular Videos in Longline Fishing},
  author={Mei, Jie and Hwang, Jenq-Neng and Romain, Suzanne and Rose, Craig and Moore, Braden and Magrane, Kelsey},
  booktitle={ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={2175--2179},
  year={2021},
  organization={IEEE}
}
Video-based Hierarchical Species Classification for Longline Fishing Monitoring "This paper proposes a hierarchical classification dataset and a method enforcing the hierarchical data structure. It also introduces an efficient training and inference strategy for video-based fisheries data classification."
Website Credits to Georgia Gkioxari