“Anwar built the LLM evaluation pipeline our team relies on for content classification, and he did it with the rigor of someone who actually cares about data quality and model drift, not just shipping a demo. His work on risk-based country categorization improved our detection coverage across global markets in a measurable way. Honestly, though, the best way to understand Anwar is to watch what happens when something breaks at scale. He stays calm and traces the issue methodically. He communicates clearly the whole time and leaves the system better than he found it.”
Anwar Shaikh
San Francisco, California, United States
2K followers
500+ connections
About
Experienced Machine Learning Engineer with a demonstrated history of working in the…
Experience
Education
-
Indiana University Bloomington
3.97
-
Associate Instructor for INFO-I 201 Mathematical Foundation of Informatics.
-
-
-
Licenses & Certifications
Volunteer Experience
-
Volunteer at Inclusive Innovations Expo 2013 - A National Event
Persistent Systems
- Present 12 years 7 months
Science and Technology
Inclusive Innovation was an event where innovators from all across the country showcased their ideas at booths. The event was tremendously successful, drawing more than 100,000 visitors.
The Inclusive Innovation event 2013 was an acknowledgement and celebration of Indian design. It helped discover new paradigms of design in India that could answer the call to make Indian industry and manufacturing more competitive and innovative.
-
Volunteer at iShare - a company-wide technology event
Persistent Systems
- Present 11 years 8 months
Science and Technology
iShare is a company-wide event that provides a platform for employees to exchange ideas and showcase extraordinary work.
Publications
-
Parameter fitting in a multiscale model: Parameter Scanning vs. Particle Swarm Optimization
Interagency Modeling and Analysis Group (IMAG) 2018
Patents
Courses
-
Advanced Natural Language Processing
LING-L 645
-
Algorithm Analysis and Design
CSCI-B 503
-
Applied Machine Learning
CSCI-B 659
-
Artificial Intelligence
-
-
Big Data Analytics and Text Mining
ILS-Z 604
-
Data Mining
CSCI-B 565
-
Data Science for Drug Discovery
INFO-I 590
-
Data Structures
-
-
Design Patterns - Training
-
-
Design and Analysis of Algorithm
-
-
Discrete Structures
-
-
Elements of Artificial Intelligence
CSCI-B 551
-
HTML 5 - Training
-
-
Introduction to Statistics
STAT-S 520
-
Python - Training
-
-
Search (Information Retrieval)
ILS-Z 534
-
Software Engineering
-
-
Theory of Computation
-
Projects
-
Heterogeneous Graph Mining
- Present
Conducting an independent research study under Prof. Xiazhong Liu to improve personalized recommendation algorithms on heterogeneous graphs. Identifying the reasoning behind each recommendation by analyzing key pathways. Also generating a personalized textual summary for each recommended article.
-
Wikipedia Text Mining
- Present
In this project we are analyzing a large Wikipedia dataset (>100 GB) to find major events and their relationships. For example, for a financial crisis such as the ‘Subprime Crisis’, which other events or sub-events preceded or succeeded it, and what are the causal relationships among them?
The objective is to understand the phenomena behind the events and their implications.
-
Predictive Modeling of IU Bus Transit System
-
Yelp Data-set Challenge
We were provided a large Yelp dataset with 1.6M reviews by 366K users for 61K businesses across 10 diverse cities. For a specific city, we analyzed the features that contribute substantially to a user's choice of restaurant. To do this, we designed a mathematical formula that calculates a score for each feature based on the weights of the users and reviews that mention the feature.
-
Semantic Web - Case Study (Calais, LinkedIn)
As part of Artificial Intelligence coursework, I prepared a case study on applications of artificial intelligence in the Semantic Web. It covers the combined use of AI technologies such as Natural Language Processing (NLP) and Machine Learning with Semantic Web methodologies such as Resource Description Framework (RDF) and Schema Query Language.
Presented a well-received case study of LinkedIn and Calais using the above technologies to prepare and process metadata.
-
Expedia Hotel Recommendation- Kaggle Competition
-
With hundreds, even thousands, of hotels to choose from at every destination, it's difficult to know which will suit your personal preferences. Expedia wants to take the proverbial rabbit hole out of hotel search by providing personalized hotel recommendations to their users. This is no small task for a site with hundreds of millions of visitors every month! Currently, Expedia uses search parameters to adjust their hotel recommendations, but there aren't enough customer specific data to personalize them for each user. In this project, we have taken up the challenge to contextualize customer data and predict the likelihood a user will stay at 100 different hotel groups.
-
Littler Case Smart
-
At Littler I was fortunate to learn from and work with the Director for Data Analytics, Zev Eigen. We had data on EEOC charges for the past 6 years. We applied boosted decision tree and neural network algorithms to predict the probability of an EEOC charge turning into litigation, as well as the probable settlement value and duration of the charge.
-
Genomic Data Analysis
-
SanGeniX is Persistent Systems' solution for analyzing and visualizing NGS data. SanGeniX is a web based NGS data analysis suite with a highly intuitive user interface where data processing can be achieved through ‘SanGeniX Titanium and Platinum’ – a lite desktop and cluster edition respectively. SanGeniX integrates multiple robust and validated algorithms in the form of predefined workflows and offers flexibility to construct custom workflows.
-
Systems Biology of Skin Cells
-
Cell-in-Silico is a Systems Biology of Skin Cells tool that aims to provide an alternative to clinical trials of cosmetic products on animals. The tool helps biologists create and analyze models of skin cells and study the effects of different chemical compounds on skin.
We used Qt (C++) for development, along with the SBML (Systems Biology Markup Language) library developed by Caltech. This project was developed in collaboration with National Chemical Laboratory (NCL) and National Institute of Immunology (NII).
Working in a smaller team offered opportunities to play roles beyond developer, including requirement analysis, module design and quality control, which stretched my competency as a full-stack developer.
-
ShareSpot - A Media Sharing Android Application
-
ShareSpot is an Android application that lets a user upload, download and browse photos or videos on different media sites such as Picasa, Flickr and YouTube from a single application.
It enables the end user to browse the local file system and select media files to upload. The files can be uploaded to the supported websites through a single user-friendly interface, without logging in to each website separately to select and upload media files.
Honors & Awards
-
You Made A Difference
Persistent Systems
Awarded "You Made A Difference" for substantial contribution to the research work at the company's R&D unit.
Languages
-
English
Full professional proficiency
-
Hindi
Native or bilingual proficiency
-
Marathi
Native or bilingual proficiency
Recommendations received
5 people have recommended Anwar