About
I am a Director of Applied Science at Amazon Alexa AI. My research focuses on computer…
Experience
Education
Volunteer Experience
-
Adjunct Professor and Dean of the School of Artificial Intelligence
Xi'an Jiaotong University
- Present 3 years 6 months
-
Chair, Technical Community of Pattern Analysis and Machine Intelligence (TC-PAMI)
IEEE Computer Society
- Present 5 months
Science and Technology
I am the current Chair of TC-PAMI, the technical community that oversees a premier journal, the IEEE Trans. on Pattern Analysis and Machine Intelligence (T-PAMI), and premier AI conferences including the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR) and the IEEE/CVF International Conf. on Computer Vision (ICCV). T-PAMI, CVPR, and ICCV are all ranked at the top in terms of journal and conference impact factor.
Publications
-
Video Event Detection Using Temporal Pyramids of Visual Semantics with Kernel Optimization and Model Subspace Boosting
ICME 2012
(To be presented) In this study, we present a system for video event classification that generates a temporal pyramid of static visual semantics using minimum-value, maximum-value, and average-value aggregation techniques. Kernel optimization and model subspace boosting are then applied to customize the pyramid for each event. SVM models are independently trained for each level in the pyramid using kernel selection according to 3-fold cross-validation. Kernels that both enforce static temporal order and permit temporal alignment are evaluated.
Model subspace boosting is used to select the best combination of pyramid levels and aggregation techniques for each event. The NIST TRECVID Multimedia Event Detection (MED) 2011 dataset was used for evaluation. Results demonstrate that kernel optimizations using both temporally static and dynamic kernels together achieve better performance than any one particular method alone. In addition, model subspace boosting reduces the size of the model by 80%, while maintaining 96% of the performance gain.
Towards Large Scale Land-cover Recognition of Satellite Images
Invited submission to the 8th International Conference on Information, Communications and Signal Processing (ICICS 2011), Singapore, December 2011
The entire Earth surface has been documented with satellite imagery. The amount of data continues to grow as higher resolutions and temporal information become available. With this increasing amount of surface and temporal data, recognition, segmentation, and event detection in satellite images with a highly scalable system become more and more desirable. In this paper, a semantic taxonomy is constructed for the land-cover classification of satellite images. Both the training and running of the classifiers are implemented in a distributed Hadoop computing platform. Publicly available high-resolution datasets were collected and divided into tiles of fixed dimensions as training data. The training data was manually indexed into the semantic taxonomy categories, such as "Vegetation", "Building", and "Pavement". A scalable modeling system implemented in the Hadoop MapReduce framework is used for training the classifiers and performing subsequent image classification. A separate, larger test dataset of the San Diego region, acquired from Microsoft Bing Maps, was used to demonstrate the efficacy of our system at large scale. The presented methodology of land-cover recognition provides a scalable solution for automatic satellite imagery analysis, especially when GIS data is not readily available or when surface change may occur due to catastrophic events such as floods, hurricanes, and snowstorms.
Honors & Awards
-
IEEE Fellow
IEEE
for Contributions to Facial Recognition in Images and Videos.
-
IAPR Fellow
International Association for Pattern Recognition
for Contributions to Visual Computing and Learning from Unconstrained Images and Videos
-
ACM Distinguished Scientist
Association for Computing Machinery
for Contributions to Multimedia and Computer Vision
-
IAPR Young Biometrics Investigator Award
International Association for Pattern Recognition
for Contributions to Unconstrained Face Recognition in Images and Videos
-
NIH R01 Award "NRI: An Egocentric Computer Vision based Active Learning Co-Robot Wheelchair"
National Institutes of Health (NIH)
R01 Grant from the National Robotics Initiative Program (Sole PI when awarded)
-
Google Faculty Award
Association for Computing Machinery
for Project on Interactive Video Recognition
-
Microsoft Ship-It Award
Microsoft Bing Multimedia Search
for Shipping Image Content Filters for Faceted Image Search to Bing Multimedia Search