Guang-Tong Zhou

Bellevue, Washington, United States
3K followers 500+ connections

About

I am currently a Research Scientist at Meta. Before joining Meta, I worked as a Machine…

Experience

  • Meta

    Greater Seattle Area

  • -

    Vancouver, Canada Area

  • -

    Vancouver, BC, Canada

  • -

    Burnaby, BC, Canada

  • -

    Vancouver, BC, Canada

  • -

    Pittsburgh, PA, USA

  • -

    Burnaby, BC, Canada

  • -

    Hangzhou, China

  • -

    Jinan, China

  • -

    Churchill, VIC, Australia

  • -

    Nanjing, China

Education

Publications

  • Learning Structured Inference Neural Networks with Label Relations

    IEEE Computer Vision and Pattern Recognition (CVPR)

    Images of scenes have various objects as well as abundant attributes, and diverse levels of visual categorization are possible. A natural image could be assigned fine-grained labels that describe major components, coarse-grained labels that depict high-level abstraction, or a set of labels that reveal attributes. Such categorization at different concept layers can be modeled with label graphs encoding label information. In this paper, we exploit this rich information with a state-of-the-art deep learning framework, and propose a generic structured model that leverages diverse label relations to improve image classification performance. Our approach employs a novel stacked label prediction neural network, capturing both inter-level and intra-level label semantics. We evaluate our method on benchmark image datasets, and empirical results illustrate the efficacy of our model.

  • Discovering Human Interactions in Videos with Limited Data Labeling

    Workshop on Group and Crowd Behavior Analysis and Understanding (at CVPR)

    We present a novel approach for discovering human interactions in videos. Activity understanding techniques usually require a large number of labeled examples, which are not available in many practical cases. Here, we focus on recovering semantically meaningful clusters of human-human and human-object interaction in an unsupervised fashion. A new iterative solution is introduced based on Maximum Margin Clustering (MMC), which also accepts user feedback to refine clusters. This is achieved by formulating the whole process as a unified constrained latent max-margin clustering problem. Extensive experiments have been carried out over three challenging datasets, Collective Activity, VIRAT, and UT-interaction. Empirical results demonstrate that the proposed algorithm can efficiently discover perfect semantic clusters of human interactions with only a small amount of labeling effort.

  • Discovering Video Clusters from Visual Features and Noisy Tags

    European Conference on Computer Vision (ECCV)

    We present an algorithm for automatically clustering tagged videos. Collections of tagged videos are commonplace; however, it is not trivial to discover video clusters therein. Direct methods that operate on visual features ignore the regularly available, valuable source of tag information. Clustering videos solely on these tags is error-prone since the tags are typically noisy. To address these problems, we develop a structured model that considers the interaction between visual features, video tags, and video clusters. We model tags from visual features, and correct noisy tags by checking visual appearance consistency. In the end, videos are clustered from the refined tags as well as the visual features. We learn the clustering through a max-margin framework, and demonstrate empirically that this algorithm can produce more accurate clustering results than baseline methods based on tags or visual features, or both. Further, qualitative results verify that the clustering results can discover sub-categories and more specific instances of a given video category.

    Other authors
    • Arash Vahdat
    • Greg Mori
  • Latent Maximum Margin Clustering

    Neural Information Processing Systems (NIPS)

    We present a maximum margin framework that clusters data using latent variables. Using latent representations enables our framework to model unobserved information embedded in the data. We implement our idea by large margin learning, and develop an alternating descent algorithm to effectively solve the resultant non-convex optimization problem. We instantiate our latent maximum margin clustering framework with tag-based video clustering tasks, where each video is represented by a latent tag model describing the presence or absence of video tags. Experimental results obtained on three standard datasets show that the proposed method outperforms non-latent maximum margin clustering as well as conventional clustering approaches.

  • Learning Class-to-Image Distance with Object Matchings

    IEEE Computer Vision and Pattern Recognition (CVPR)

    We conduct image classification by learning a class-to-image distance function that matches objects. The set of objects in training images for an image class is treated as a collage. When presented with a test image, the best matching between this collage of training image objects and those in the test image is found. We validate the efficacy of the proposed model on the PASCAL 07 and SUN 09 datasets, showing that our model is effective for object classification and scene classification tasks. State-of-the-art image classification results are obtained, and qualitative results demonstrate that objects can be accurately matched.

  • Mass Estimation

    Machine Learning Journal

    This paper introduces mass estimation -- a base modelling mechanism that can be employed to solve various tasks in machine learning. We present the theoretical basis of mass and efficient methods to estimate mass. We show that mass estimation solves problems effectively in tasks such as information retrieval, regression and anomaly detection. The models, which use mass in these three tasks, perform at least as well as and often better than eight state-of-the-art methods in terms of task-specific performance measures. In addition, mass estimation has constant time and space complexities.

  • Relevance Feature Mapping for Content-Based Multimedia Information Retrieval

    Pattern Recognition

    This paper presents a novel ranking framework for content-based multimedia information retrieval (CBMIR). The framework introduces relevance features and a new ranking scheme. Each relevance feature measures the relevance of an instance with respect to a profile of the targeted multimedia database. We show that the task of CBMIR can be done more effectively using the relevance features than the original features. Furthermore, additional performance gain is achieved by incorporating our new ranking scheme which modifies instance rankings based on the weighted average of relevance feature values. Experiments on image and music databases validate the efficacy and efficiency of the proposed framework.

    Other authors
    • Kai Ming Ting
    • Fei Tony Liu
    • Yilong Yin
  • K-means Based Fingerprint Segmentation with Sensor Interoperability

    EURASIP Journal on Advances in Signal Processing

    A critical step in an automatic fingerprint recognition system is the segmentation of fingerprint images. Existing methods are usually designed to segment fingerprint images originating from a certain sensor, so their performance is significantly affected when dealing with fingerprints collected by different sensors. This work studies the sensor interoperability of fingerprint segmentation algorithms, which refers to an algorithm’s ability to adapt to raw fingerprints obtained from different sensors. We empirically analyze the sensor interoperability problem, and address the issue by proposing a k-means based segmentation method called SKI. SKI clusters foreground and background blocks of a fingerprint image based on the k-means algorithm, where a fingerprint block is represented by a 3-dimensional feature vector consisting of block-wise coherence, mean, and variance (abbreviated as CMV). SKI also employs morphological postprocessing to achieve favorable segmentation results. We perform SKI on each fingerprint to ensure sensor interoperability. The interoperability and robustness of our method are validated by experiments performed on a number of fingerprint databases obtained from various sensors.

  • Mass Estimation and Its Applications

    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)

    This paper introduces mass estimation -- a base modelling mechanism in data mining. It provides the theoretical basis of mass and an efficient method to estimate mass. We show that it solves problems very effectively in tasks such as information retrieval, regression and anomaly detection. The models, which use mass in these three tasks, perform at least as well as and often better than a total of eight state-of-the-art methods in terms of task-specific performance measures. In addition, mass estimation has constant time and space complexities.

  • Relevance Feature Mapping for Content-Based Image Retrieval

    International Workshop on Multimedia Data Mining (at KDD)

    This paper presents a ranking framework for content-based image retrieval using relevance feature mapping. Each relevance feature measures the relevance of an image to some profile underlying the image database. The framework is a two-stage process. In the off-line modeling stage, it constructs a collection of models which maps all images in the database to the relevance feature space. In the on-line retrieval stage, it assigns a weight to every relevance feature based on the query image, and then ranks images in the database according to their weighted average feature values. The framework also incorporates relevance feedback, which modifies the ranking based on the feedback through reweighted features. We show that the power of the proposed framework comes from the relevance features. Experiments on a large image database validate the efficacy and efficiency of the proposed framework.

    Other authors
    • Kai Ming Ting
    • Fei Tony Liu
    • Yilong Yin

Patents

  • Technique For Validating A Prognostic-Surveillance Mechanism In An Enterprise Computer System

    Filed US ORA160807-US-NP

    The disclosed embodiments generally relate to prognostic-surveillance techniques for enterprise computer systems. More specifically, the disclosed embodiments relate to a technique for validating the performance of a prognostic-surveillance mechanism, which is used to detect operational anomalies that arise during operation of an enterprise computer system.

  • A Fingerprint Segmentation Method with Sensor Interoperability

    Issued CN ZL200910019788.X

  • A Fast Fingerprint Segmentation Method by Co-Training

    Issued CN ZL200810138118.5

Courses

  • Design and Analysis of Algorithms

    CMPT 705

  • Illumination in Images

    CMPT 828

  • Internet Architecture and Protocols

    CMPT 771

  • Machine Learning

    CMPT 726

Projects

  • KeyBridge @ Oracle Labs: Deep Learning for Cloud Security

    -

    Enhance cloud security by leveraging advanced deep learning algorithms for log analysis, telemetry monitoring, and traffic data analysis.

  • Learning Parts for Scenes

    -

    - Automatic discovery of scenery parts for a given scene.
    - Learn discriminative and representative parts in a unified part clustering and scene classification framework.

    Other creators
    • Sungju Hwang
    • Leonid Sigal
    • Greg Mori
  • Web Master of KDD 2012

    -

    Build the website for the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2012.

    Other creators
    • Weike Pan
  • OLAP for Large-Scale Keyword Graphs

    -

    Design online analytical processing (OLAP) for large-scale keyword graphs, e.g., social networks in which each person carries keywords, or academic citation graphs in which each paper has its own keywords.

    Other creators
    • Jian Pei
    • Zhi Yu
  • Mass Estimation

    -

    - Mass estimation is a new data modeling mechanism that can be employed to solve various data mining and machine learning tasks, e.g. anomaly detection, information retrieval, regression, classification, and clustering.
    - Mass-based methods often perform better than or at least as well as the state-of-the-art methods, but they run orders of magnitude faster than their distance/density-based counterparts.
    - Publications in KDD 2010, KDD Workshop 2010, PR 2012 & MLJ 2013.
    - Wikipedia page available for more details <https://en.wikipedia.org/wiki/Mass_estimation>.

  • Online Metric Learning with Qsim Idea for CBIR

    -

    Design an online distance metric learning method for content-based image retrieval. Relevance feedback is incorporated to refine the learned metric.

  • Fingerprint Segmentation Using Co-Training

    -

    Propose a semi-supervised fingerprint image segmentation method based on co-training.

  • Predicting Cross-Selling with Ensemble SVMs

    -

    Develop an ensemble learning method for the cross-selling problem in PAKDD 2007 Data Mining Competition.

Languages

  • Chinese

    -

  • English

    -
