“I worked closely with Guang-Tong at Oracle Labs (Vancouver). He was also my colleague at the SFU Vision and Media Lab, supervised by Prof. Greg Mori, so overall I have known him for more than 8 years. He is a very dedicated, organized, and detail-oriented researcher who is passionate about machine learning and research in general. I was very grateful that I could rely on him at work to complete his tasks promptly and perfectly. Having someone like him on my side gave me peace of mind that I was not alone. He is also a very nice and generous friend of mine.”
Guang-Tong Zhou
Bellevue, Washington, United States
About
I am currently a Research Scientist at Meta. Before joining Meta, I worked as a Machine…
Activity
-
🎉 Excited to share that our paper, “LassoFlexNet: A Flexible Neural Architecture for Tabular Data,” is accepted at ICML 2026! 🎉 Paper:…
Liked by Guang-Tong Zhou
-
An incredible week at MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) with the RBC Borealis. Spending time with the CSAIL…
Liked by Guang-Tong Zhou
-
How do you navigate a major career pivot — especially into a field you've never worked in before? Ming Hua, vice president of wearables devices at…
Liked by Guang-Tong Zhou
Experience
Education
Publications
-
Learning Structured Inference Neural Networks with Label Relations
IEEE Computer Vision and Pattern Recognition (CVPR)
Images of scenes have various objects as well as abundant attributes, and diverse levels of visual categorization are possible. A natural image could be assigned fine-grained labels that describe major components, coarse-grained labels that depict high-level abstraction, or a set of labels that reveal attributes. Such categorization at different concept layers can be modeled with label graphs encoding label information. In this paper, we exploit this rich information with a state-of-the-art deep learning framework, and propose a generic structured model that leverages diverse label relations to improve image classification performance. Our approach employs a novel stacked label prediction neural network, capturing both inter-level and intra-level label semantics. We evaluate our method on benchmark image datasets, and empirical results illustrate the efficacy of our model.
-
Discovering Human Interactions in Videos with Limited Data Labeling
Workshop on Group and Crowd Behavior Analysis and Understanding (at CVPR)
We present a novel approach for discovering human interactions in videos. Activity understanding techniques usually require a large number of labeled examples, which are not available in many practical cases. Here, we focus on recovering semantically meaningful clusters of human-human and human-object interaction in an unsupervised fashion. A new iterative solution is introduced based on Maximum Margin Clustering (MMC), which also accepts user feedback to refine clusters. This is achieved by formulating the whole process as a unified constrained latent max-margin clustering problem. Extensive experiments have been carried out over three challenging datasets, Collective Activity, VIRAT, and UT-interaction. Empirical results demonstrate that the proposed algorithm can efficiently discover perfect semantic clusters of human interactions with only a small amount of labeling effort.
-
Discovering Video Clusters from Visual Features and Noisy Tags
European Conference on Computer Vision (ECCV)
We present an algorithm for automatically clustering tagged videos. Collections of tagged videos are commonplace; however, it is not trivial to discover video clusters therein. Direct methods that operate on visual features ignore the regularly available, valuable source of tag information. Solely clustering videos on these tags is error-prone since the tags are typically noisy. To address these problems, we develop a structured model that considers the interaction between visual features, video tags and video clusters. We model tags from visual features, and correct noisy tags by checking visual appearance consistency. In the end, videos are clustered from the refined tags as well as the visual features. We learn the clustering through a max-margin framework, and demonstrate empirically that this algorithm can produce more accurate clustering results than baseline methods based on tags or visual features, or both. Further, qualitative results verify that the clustering results can discover sub-categories and more specific instances of a given video category.
-
Latent Maximum Margin Clustering
Neural Information Processing Systems (NIPS)
We present a maximum margin framework that clusters data using latent variables. Using latent representations enables our framework to model unobserved information embedded in the data. We implement our idea by large margin learning, and develop an alternating descent algorithm to effectively solve the resultant non-convex optimization problem. We instantiate our latent maximum margin clustering framework with tag-based video clustering tasks, where each video is represented by a latent tag model describing the presence or absence of video tags. Experimental results obtained on three standard datasets show that the proposed method outperforms non-latent maximum margin clustering as well as conventional clustering approaches.
-
Learning Class-to-Image Distance with Object Matchings
IEEE Computer Vision and Pattern Recognition (CVPR)
We conduct image classification by learning a class-to-image distance function that matches objects. The set of objects in training images for an image class is treated as a collage. When presented with a test image, the best matching between this collage of training image objects and those in the test image is found. We validate the efficacy of the proposed model on the PASCAL 07 and SUN 09 datasets, showing that our model is effective for object classification and scene classification tasks. State-of-the-art image classification results are obtained, and qualitative results demonstrate that objects can be accurately matched.
-
Mass Estimation
Machine Learning Journal
This paper introduces mass estimation -- a base modelling mechanism that can be employed to solve various tasks in machine learning. We present the theoretical basis of mass and efficient methods to estimate mass. We show that mass estimation solves problems effectively in tasks such as information retrieval, regression and anomaly detection. The models, which use mass in these three tasks, perform at least as well as and often better than eight state-of-the-art methods in terms of task-specific performance measures. In addition, mass estimation has constant time and space complexities.
-
Relevance Feature Mapping for Content-Based Multimedia Information Retrieval
Pattern Recognition
This paper presents a novel ranking framework for content-based multimedia information retrieval (CBMIR). The framework introduces relevance features and a new ranking scheme. Each relevance feature measures the relevance of an instance with respect to a profile of the targeted multimedia database. We show that the task of CBMIR can be done more effectively using the relevance features than the original features. Furthermore, additional performance gain is achieved by incorporating our new ranking scheme which modifies instance rankings based on the weighted average of relevance feature values. Experiments on image and music databases validate the efficacy and efficiency of the proposed framework.
-
K-means Based Fingerprint Segmentation with Sensor Interoperability
EURASIP Journal on Advances in Signal Processing
A critical step in an automatic fingerprint recognition system is the segmentation of fingerprint images. Existing methods are usually designed to segment fingerprint images originating from a certain sensor. Thus, their performance is significantly affected when dealing with fingerprints collected by different sensors. This work studies the sensor interoperability of fingerprint segmentation algorithms, which refers to the algorithm’s ability to adapt to the raw fingerprints obtained from different sensors. We empirically analyze the sensor interoperability problem, and effectively address the issue by proposing a k-means based segmentation method called SKI. SKI clusters foreground and background blocks of a fingerprint image based on the k-means algorithm, where a fingerprint block is represented by a 3-dimensional feature vector consisting of block-wise coherence, mean, and variance (abbreviated as CMV). SKI also employs morphological postprocessing to achieve favorable segmentation results. We perform SKI on each fingerprint to ensure sensor interoperability. The interoperability and robustness of our method are validated by experiments performed on a number of fingerprint databases obtained from various sensors.
-
Mass Estimation and Its Applications
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
This paper introduces mass estimation -- a base modelling mechanism in data mining. It provides the theoretical basis of mass and an efficient method to estimate mass. We show that it solves problems very effectively in tasks such as information retrieval, regression and anomaly detection. The models, which use mass in these three tasks, perform at least as well as and often better than a total of eight state-of-the-art methods in terms of task-specific performance measures. In addition, mass estimation has constant time and space complexities.
-
Relevance Feature Mapping for Content-Based Image Retrieval
International Workshop on Multimedia Data Mining (at KDD)
This paper presents a ranking framework for content-based image retrieval using relevance feature mapping. Each relevance feature measures the relevance of an image to some profile underlying the image database. The framework is a two-stage process. In the off-line modeling stage, it constructs a collection of models which maps all images in the database to the relevance feature space. In the on-line retrieval stage, it assigns a weight to every relevance feature based on the query image, and then ranks images in the database according to their weighted average feature values. The framework also incorporates relevance feedback, which modifies the ranking based on the feedback through reweighted features. We show that the power of the proposed framework comes from the relevance features. Experiments on a large image database validate the efficacy and efficiency of the proposed framework.
Patents
-
Technique For Validating A Prognostic-Surveillance Mechanism In An Enterprise Computer System
Filed US ORA160807-US-NP
The disclosed embodiments generally relate to prognostic-surveillance techniques for enterprise computer systems. More specifically, the disclosed embodiments relate to a technique for validating the performance of a prognostic-surveillance mechanism, which is used to detect operational anomalies that arise during operation of an enterprise computer system.
Courses
-
Design and Analysis of Algorithms
CMPT 705
-
Illumination in Images
CMPT 828
-
Internet Architecture and Protocols
CMPT 771
-
Machine Learning
CMPT 726
Projects
-
KeyBridge @ Oracle Labs: Deep Learning for Cloud Security
-
Enhance cloud security by leveraging advanced deep learning algorithms for log analysis, telemetry monitoring as well as traffic data analysis.
-
Learning Parts for Scenes
-
- Automatic discovery of scenery parts for a given scene.
- Learn discriminative and representative parts in a unified part clustering and scene classification framework.
-
Web Master of KDD 2012
-
Built the website for the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2012.
-
OLAP for Large-Scale Keyword Graphs
-
Design online analytical processing (OLAP) for large-scale keyword graphs, e.g. social networks where each person carries keywords, and academic citation graphs where each paper has its own keywords.
-
Mass Estimation
-
- Mass estimation is a new data modeling mechanism that can be employed to solve various data mining and machine learning tasks, e.g. anomaly detection, information retrieval, regression, classification, and clustering.
- Mass-based methods often perform better than or at least as well as the state-of-the-art methods, but they run orders of magnitude faster than their distance/density-based counterparts.
- Publications in KDD 2010, KDD Workshop 2010, PR 2012 & MLJ 2013.
- Wikipedia page available for more details <https://en.wikipedia.org/wiki/Mass_estimation>.
Languages
-
Chinese
-
-
English
-
Recommendations received
1 person has recommended Guang-Tong
More activity by Guang-Tong
-
After an incredible and productive 4.5-year journey, I've moved on from Ecopia AI. I am grateful for the time spent with talented, motivated, and…
Liked by Guang-Tong Zhou
-
I’m sure you’ve heard by now that we’re building a monetization team at OpenAI. We’re totally reimagining how ads should be delivered from a…
Liked by Guang-Tong Zhou
-
We’re growing our team and looking for Senior/Principal Scientists to lead the development of frontier Large Video Models (LVMs) for visual…
Liked by Guang-Tong Zhou
-
We're Hiring: Fundamental ML Innovation for Conversion Modeling @ Meta (with Jiayin Ge and Aaron Tao) We're part of Meta's Ads Ranking System…
Liked by Guang-Tong Zhou
-
In computer vision, when we worked with a handful of images, researchers (literally) knew the pixels in images they worked with. As we transitioned…
Liked by Guang-Tong Zhou
-
Today marks a new chapter for me — I’m excited to share that I’ve joined Apple AIML as a Staff Machine Learning Engineer. Last week, I wrapped up my…
Liked by Guang-Tong Zhou
-
Sometimes you have to draw it out to see the big picture. The AI landscape can get overwhelming with new terminology dropping every day. I find it…
Liked by Guang-Tong Zhou
-
Vision is ascending as the new primary medium for presenting information —vastly more intuitive, information-dense, and precise in capturing…
Liked by Guang-Tong Zhou
-
Heading to NeurIPS 2025 in San Diego! Meet up with me and others on the RBC Borealis team to learn about our work in cutting edge machine learning…
Liked by Guang-Tong Zhou
-
I'll be at #NeurIPS in SD next week (Dec. 3-7). If you're around and interested to learn about opportunities at #xAI, let's grab a coffee together…
Liked by Guang-Tong Zhou
-
Thrilled to share Meta’s latest milestone in ads AI: our 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝘃𝗲 𝗔𝗱𝘀 𝗠𝗼𝗱𝗲𝗹 (𝗚𝗘𝗠) — the “central brain” accelerating…
Liked by Guang-Tong Zhou