Kenneth Li

I am a Ph.D. student at Harvard SEAS, advised by Hanspeter Pfister.

Prior to pursuing my Ph.D., I received a B.E. from the Chinese Academy of Sciences, where I primarily worked on applications of computer vision.

I have a broad interest in machine learning, particularly in unsupervised learning, with applications to vision and its intersection with language and audio.

Email  |  Google Scholar  |  LinkedIn  |  Twitter  |  Github

Pose Recognition with Cascade Transformers
Ke Li*, Shijie Wang*, Xiang Zhang*, Yifan Xu, Weijian Xu, Zhuowen Tu
CVPR, 2021
[Arxiv] | [Code] | [Video]
By using the Transformer architecture, we can clean up, in an end-to-end fashion, the heuristic designs that have long bedeviled pose estimation models.

Unsupervised Discriminative Learning of Sounds for Audio Event Classification
Sascha Hornauer, Ke Li, Stella X. Yu, Shabnam Ghaffarzadegan, Liu Ren
ICASSP, 2021
[Arxiv] | [Slides]
A self-supervised learning model that can transfer knowledge across audio datasets and deliver performance on par with ImageNet pre-training.

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
Difei Gao*, Ke Li*, Ruiping Wang, Shiguang Shan, Xilin Chen
CVPR, 2020
[Arxiv] | [Code] | [Supp] | [Video]
Text spotted in everyday images can be rare, polysemous, and ambiguous, but we can pin down its semantics through cross-modal message passing.

Design and source code from Jon Barron's website
Last update: 09/2021