Zach Wentz
New York, New York, United States
4K followers
500+ connections
Activity
-
Zach Wentz reposted this:
Week 3 nearly in the books at Reflection. A few quick thoughts — and some roles we are hiring for.
On the company:
The mission — It's big, bold, and resonates with everyone. There's a clarity to it that has people genuinely aligned and moving in the same direction. That feels great.
The vibes — Incredibly collaborative and open. There's a ton of work to do, and people are eager to pitch in across teams. Infra, product, research, policy, legal — it all just works seamlessly, with effectively no boundaries.
The talent — Has eclipsed my wildest expectations. We're hiring some of the best people in the space from OpenAI, Anthropic, Google, and Meta, and that's attracting more top talent. It feels like things are snowballing in a way that's building a really strong team and culture.
I can't talk yet about what we're working on, but I'll be doing talks and interviews soon — stay tuned.
What we're hiring for:
- Safety — a safety lead and a safety PM. Core to what we do, and to doing it openly.
- Policy — so much happening on the global stage; hiring the best in regulatory, safety and preparedness, and more.
- Legal — great work to be done on open source, commercial deals, and beyond.
- Comms and community — events, hacks, technical blogs, publications. Looking for someone to help drive it.
- SWEs/MLEs — we're building a ton in open source, and hiring engineers who are builders first: people who engage communities around their projects and evangelize in the ecosystem. Truly unique work, and unapologetically open source.
- GPU infrastructure & compute — We're developing foundational models at very large scale, and hiring engineers to build GPU and compute infra across multiple clouds.
- Post-training — Actively recruiting candidates to lead post-training. Given our roots, we're heavily invested in RL — so if large-scale RL, environments, and shaping foundational model behavior excites you, this could be a great fit.
If any of this resonates, please submit at: https://lnkd.in/gjkZgE8f Cheers! Bonus: https://lnkd.in/gXgEyQp7
-
Zach Wentz shared this:
This is my last week at Meta/MSL. I'm leaving for an opportunity I could have only dreamed of — which is wild, given that two years ago the role I'm now leaving was that dream opportunity. The reason I have such an opportunity is the stunning colleagues I met and learned from at Meta, in Core Systems, and in MSL Infra. I'm not going to tag everyone, because it's impossible to list everyone without seeming disingenuous, and hard to know where to draw the line. But they know who they are.
A few things I've learned in my tenure:
1. Relationships > Scope — I got this wrong a few times, and right a few times. "Meta, Metamates, Me" continues to be the right guide. (Corollary: work with people who get this.)
2. Do what lights your fire — Related to the previous: scope and org decisions don't matter much if you're working on what makes you excited to wake up in the morning. You're going to perform better. You're not going to burn out, even in the face of a faster pace and higher volume of work than you've had in your entire career (which I got to experience as part of MSL 😅).
These lessons would come as no surprise to my colleagues at Meta; "build what you want with the people you want" was among the most common advice I received in those initial 1:1s. So I want to close this "badge post" with gratitude to my Metamates: thanks for the advice; I take my next role in heed of it. Au plaisir.
As to specifics, stay tuned, but I'm thrilled to be joining one of the most talent-dense teams I've ever come across.
-
Zach Wentz reposted this:
Excited to be speaking tomorrow at AI Native Conf in San Francisco, hosted by Together AI! Sanyam Bhutani and I will be diving into PyTorch Native RL at Scale — how we're pushing reinforcement learning forward using PyTorch-native tooling at Meta. If you're at The Midway tomorrow (March 5), come say hi. Huge lineup of speakers from Cursor, NVIDIA, Cartesia, Collinear, Gamma, and more. Big thanks to the Together AI team for putting this together. https://lnkd.in/gAwZPVwE #AINativeConf #ReinforcementLearning #PyTorch #Meta #TogetherAI
-
Zach Wentz reposted this:
Agentic RL hackathon this weekend! Mentors from PyTorch, Hugging Face, and Unsloth AI will guide you in building agentic environments to win from a $100K prize pool — plus free compute and token credits just for attending! Lock in Mar 7-8 in SF: https://lnkd.in/eHvT7VZx
-
Zach Wentz reposted this:
The Meta-PyTorch team and Cerebral Valley are hosting the OpenEnv hackathon, where developers will create RL training environments and push core model capability frontiers, including long-horizon planning, grounded interaction, recursive self-improvement, and multi-agent strategic reasoning. To provide the best mentorship to developers, we are partnering with the people at the forefront of post-training: Hugging Face, University of California, Berkeley, Unsloth AI, CoreWeave, Northflank, OpenPipe, Fleet AI, Inc., Mercor, Snorkel AI, Scaler AI Labs, Patronus AI, Halluminate (YC S25), Scale AI, and Cursor. Registrations below:
San Francisco, Mar 7-8: https://lnkd.in/erN6C7DZ
-
Zach Wentz reposted this:
AI models will get better only if we can accurately measure their performance in real-life scenarios. As the Meta PyTorch team, we are partnering with Snorkel AI to launch the Open Benchmarks Grants program, aiming to grow the open-source AI benchmarking ecosystem.
Our ability to measure AI has been outpaced by our ability to develop it, and this evaluation gap is one of the most important problems in AI. Open benchmarks are one of the most important levers for advancing AI safely and responsibly—but the academic and open-source teams driving them often hit resource constraints, especially in the face of the exponentially expanding complexity of what tomorrow's benchmarks need to cover. That's why we're launching Open Benchmarks Grants: a $3M commitment from Snorkel AI, with support from Hugging Face, Prime Intellect, Together AI, Factory HQ, Harbor, and PyTorch to back the teams building benchmarks that define new vectors of progress for agentic AI. The next wave of benchmarks must close the gap across three core dimensions: environment complexity, autonomy horizon, and output complexity. This grant provides funding, expert data development support, and research collaboration to support those innovating in these areas (and more we haven't thought of!). We're excited to support the pioneers and builders defining AI benchmarks in the open.
👉 Read more here: https://lnkd.in/geBsSFcx
👉 Submit your applications here: https://lnkd.in/gehJcCjN
-
Zach Wentz shared this:
We're live! We'll be on for about 2.5 hours with a bunch of good peeps talking RL, agents, OpenEnv, and sandboxing. https://lnkd.in/e5gmVv29
-
Zach Wentz reposted this:
I'm excited to kick off 2026 with a bang! My team at Meta-PyTorch is partnering with Hugging Face and Unsloth AI as key sponsors of the new OpenEnv Custom Track in the AgentX-AgentBeats challenge at Berkeley RDI — fueling open, community-driven progress on RL environments and general intelligence benchmarks. This collaboration underscores our commitment to open standards, accessible tooling, and community innovation in agentic systems — and we can't wait to see what builders around the world create with these shared resources. A personal shoutout to Daniel Han, Michael Han (Unsloth), Sanyam Bhutani, Hamid Shojanazeri, Dawn Song, Lysandre Debut, Ben Burtenshaw, Lewis Tunstall, Davide Testuggine, Zach Wentz, Emre Guven and many others for making this happen! And a big congrats to the teams participating, and thanks to Berkeley RDI for driving this incredible ecosystem forward - I look forward to the collaboration!
Details: https://lnkd.in/gvVHsYDq
Agentic AI Weekly | Berkeley RDI | January 7, 2026
-
Zach Wentz reposted this:
We're partnering with PyTorch and Hugging Face on the OpenEnv Challenge! Use Unsloth AI for RL & OpenEnv to win $10K in HF credits! As part of the University of California, Berkeley AgentBeats Competition, there is a special track just for reinforcement learning! If you can:
1. Create an RL environment and publish it to the HF Hub
2. Publish training notebooks with Unsloth and HF
3. Write a blog on HuggingFace
Then submit an entry! You also get to publish a PyTorch blog! The AgentBeats competition details are at https://lnkd.in/gNDkHMnb For the OpenEnv track details, see https://lnkd.in/gZ93sqdd
-
Zach Wentz reacted on this:
I'm excited to share that I joined Reflection AI last month on the Strategy and Operations team, where I'm working on Special Projects. In my first few weeks here, I've been energized by the mission and the people, and I'm already learning so much. We're hiring, so if you're interested in learning more, reach out!
-
Zach Wentz reacted on this:
1. Our new team at Meta (John Yang, Kilian Lieret Ph.D., Jeffrey Ma & me) just posted a super-tough new coding benchmark that challenges agents to code entire apps end-to-end. 2. The top score is 0%. 3. We will be making the benchmark harder.
Introducing ProgramBench: 200 whole-repo generation tasks rigorously evaluated in cleanroom settings (no internet, no decompilation, no leaked source, no systracing, ...). Best score is **0**. Our agent ONLY gets a target executable and some readme/usage files. The agent must choose a language, design abstraction layers, and architect the entire program. No internet access or any other way of cheating. The executable cannot be decompiled or inspected. This makes our tasks much more realistic (and harder). Our 200 instances range from simple (jq, ripgrep, …) to brutal (ffmpeg, sqlite, compilers). Giving credit for partially resolved tasks is very misleading, so we don't. But Opus 4.7 gets quite close on some 6 instances and resolved more than 95% of their unit tests. You can pip install programbench && programbench eval <your solution> with all required images on Docker Hub. Opening leaderboard for submissions soon. Work co-led with John Yang and with Jeffrey Ma, Parth Thakkar, Dmitrii Pedchenko, Sten Sootla, Emily McMilin, Pengcheng Yin, Rui Hou, Gabriel Synnaeve, Diyi Yang, and Ofir Press.
-
Zach Wentz liked this:
Grateful to Sam for helping us with marketing for our CritPt benchmark (super-tough physics questions). critpt.com
-
Zach Wentz reacted on this:
It's fun having a seat on a rocket ship. Check out Misha's CNBC interview this week talking about our latest raise at a $25B valuation. Key takeaways from the interview:
• China is winning open-source AI right now — DeepSeek, Qwen, and Kimi are the most capable open models available, and the U.S. doesn't yet have a competitive answer
• The stakes are geopolitical, not just technical — enterprises and sovereign governments won't adopt Chinese models due to legal and security concerns, creating a critical gap
• Reflection's approach: open weights — released freely for developers to build on, while keeping training data and pipelines in-house (think Meta's Llama model), with a big focus on open safety (think Purple Llama!) and an ecosystem around it all.
Full interview 👉 https://lnkd.in/gKJqsAd9 Cheers!
Reflection CEO confirms latest funding round closed at $25 billion pre-money valuation
-
Zach Wentz liked this:
Muse Spark debuts at #7 in the Code Arena - making AI at Meta the #3 lab, right behind Anthropic's Claude Sonnet 4.6 and Z.ai's GLM-5.1, and surpassing Gemini-3.1-Pro and GPT-5.4. Code Arena evaluates agentic coding on real-world tasks - building live websites and apps, ranked by users on real workflows. Huge congrats to AI at Meta on this impressive milestone!
-
Zach Wentz liked this:
Shinsegae taps Reflection AI for retail overhaul, drops OpenAI talks
-
Zach Wentz reacted on this:
Reflection AI made the 2026 Forbes AI 50. Still a bit surreal to be part of this team.
-
Zach Wentz liked this:
Today was a big milestone. After addressing feedback from friends & family, I presented and opened the v0 of my product to the first cohort of "stranger" customers. 5 out of 20 said they are willing to pay for it (from the presentation), and everyone said they want to try it. This burst of energy and the adrenaline rush can't be compared to any executive presentation I did at big tech companies. Can't take my 👀s off the Mixpanel dashboard... Let's go!
Experience
Projects
-
OpenEnv: Building the Open Environment Ecosystem
Contributed to the launch of OpenEnv, a collaborative initiative between Meta and Hugging Face that establishes an open-source standard for agentic environments. OpenEnv defines secure, sandboxed environments that specify exactly what tools, APIs, and execution context AI agents need to perform tasks safely and effectively.
Key Contributions & Features:
- Developed standardized environment interfaces using step(), reset(), and close() APIs that enable both training and deployment of AI agents
- Created reusable, shareable environments through the OpenEnv Hub on Hugging Face, allowing developers to build, validate, and iterate on agentic tasks
- Established environment specifications (RFC 0.1) covering core architecture, packaging, isolation, MCP tool integration, and unified action schemas
- Integrated with reinforcement learning frameworks including TRL, TorchForge, VeRL, and SkyRL for scalable RL post-training
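The step()/reset()/close() contract described above can be sketched in a few lines of Python. This is a minimal illustrative toy, not the actual OpenEnv API: the `EchoEnv`, `Action`, and `Observation` names here are invented for the example, and a real OpenEnv environment would run sandboxed with MCP tool integration.

```python
from dataclasses import dataclass


@dataclass
class Action:
    message: str  # the agent's chosen action (hypothetical schema)


@dataclass
class Observation:
    text: str      # what the agent sees next
    reward: float  # scalar feedback for the last action
    done: bool     # whether the episode has ended


class EchoEnv:
    """Toy environment following a step()/reset()/close() contract."""

    def __init__(self, max_steps: int = 3):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self) -> Observation:
        # Begin a new episode and return the initial observation.
        self.steps = 0
        return Observation(text="ready", reward=0.0, done=False)

    def step(self, action: Action) -> Observation:
        # Apply the action, advance state, report reward and termination.
        self.steps += 1
        done = self.steps >= self.max_steps
        return Observation(text=f"echo: {action.message}", reward=1.0, done=done)

    def close(self) -> None:
        # Release any resources (containers, sockets) held by the env.
        pass


if __name__ == "__main__":
    env = EchoEnv()
    obs = env.reset()
    while not obs.done:
        obs = env.step(Action(message="hi"))
    env.close()
    print(obs.text, obs.reward, obs.done)  # echo: hi 1.0 True
```

Because the same three-method loop serves both training and deployment, an RL trainer and a production agent runner can drive identical environment code.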
Impact:
OpenEnv addresses the critical challenge of safely exposing tools to autonomous AI agents by providing clear semantics, sandboxed execution, and authenticated API access. The platform enables reproducibility of state-of-the-art methods, end-to-end deployment pipelines, and collaborative environment creation across the open-source AI community.
Technologies: Python, PyTorch, Reinforcement Learning, Docker, API Integration, Agent-based Systems
-
Polly.JS
-
Open-sourced Polly.JS, a JavaScript library for recording, replaying, and stubbing HTTP interactions to simplify mocking and stubbing in tests. Polly taps into multiple request APIs across Node and browsers to intercept requests and responses with minimal configuration via a simple, powerful API. Polly enabled scalable testing of a complex data model with many relationships that would otherwise have required difficult, non-reusable mocking.
Honors & Awards
-
Amazon Gen AI Hackathon - 1st Place
Amazon Music Executive Leadership
Led our hackathon team of 14 engineers, designers, data scientists, and a product manager in a week-long AI hackathon to develop a Gen AI feature for Amazon Music; the project won Best Overall, Best Demo, and Most Compelling.
Languages
-
French
Limited working proficiency
Recommendations received
2 people have recommended Zach
Explore more posts
-
Viktor Bezdek
Groupon • 3K followers
Thrilled to share my latest article: "The Mixed-Initiative Interface: Designing Control Handoffs Between Humans and AI." In a world where AI handles more decisions, the real UX frontier is the seamless transfer of control—ensuring humans can step in without losing context, state, or accountability. We're moving beyond rigid autonomy to dynamic systems where handoffs are graceful and empowering. Key insights include: - The Handoff Problem: Mixed-initiative systems allow bidirectional control shifts; the challenge is preserving state during transitions to avoid confusion. - Decision Ledger: An append-only log of actions, confidence scores, and reasoning to brief humans on AI's prior steps. - Non-Destructive Overrides: Use proposal queues for AI suggestions that can be reviewed or modified without halting workflows. - Calibrating Confidence: Dynamic thresholds based on consequence, reversibility, and confidence—err toward escalation to build trust. - Reversibility and Intervention Windows: Leverage event sourcing for undos and strategic delays to enable human oversight in high-stakes scenarios. This is essential for anyone building AI-powered products: think pricing optimization, fraud detection, or any domain blending human judgment with machine efficiency. Does this align with challenges you're facing in your work? Read the full piece here: https://lnkd.in/dX_y7YMa
-
Jay Garmon
Finvi • 1K followers
"The issue is basically that 'negative reward for lying and stealing' looks the same as 'negative reward for getting caught lying and stealing'. ... [so] AI will wind up with the latter motivation. The reward function will miss sufficiently sneaky misaligned behavior, and so the AI will come to feel like that kind of behavior is good, and this tendency will generalize in a very bad way." https://lnkd.in/eKSXkt7y
-
Future AGI
27K followers
The voice AI industry has fundamentally misunderstood its own product-market fit. We're obsessing over conversational sophistication while failing at basic speech synthesis. These "performance barriers" Rishav calls out aren't technical limitations; they're the predictable result of prioritizing flashy demos over robust engineering. Companies won't adopt voice agents not because the technology isn't ready, but because vendors have consistently oversold capabilities while underdelivering on reliability. It’s about time someone said it out loud: the barriers to building RELIABLE voice AI (with stats). Hear it out 👇
-
Matt Weagle
Machinify • 1K followers
Some themes I'm hearing from people having success with AI coding agents are explicit context, documentation through time, and automated feedback mechanisms.
Explicit context includes the practices and assumptions that had previously been shared via IC senior/junior mentorships: explicitly articulating architectural patterns, constraints, domain boundaries, language-specific design patterns, the definition of done, refactoring signals, error handling policies, optimization opportunities, etc. What would have been incrementally, lazily, and only verbally surfaced during an apprenticeship relationship is promoted to one or more documents.
Documentation over time is the history behind why code works a certain way: special cases, environment-specific behaviors, why some IPC was chosen over another. While this information is often available in commit history or cryptic comments, it's less often made first class. Artifacts like ADRs capture the evolution of code. This historical documentation provides counterbalancing evidence against what is intentionally absent from the current codebase version.
Automated feedback mechanisms include style, code organization, linters, compiler warnings as errors, semgrep policies, OWASP Web Testing Guidance, fitness functions for correctness, unit tests, fuzzers, and similar types of functional and non-functional feedback mechanisms. These are guardrails that operate at agent speed and provide actionable corrective signals to ensure alignment.
These are also hallmarks of pre-AI high-performing engineering teams. They highlight the importance of shared understanding, history, and actionable feedback. In my experience these practices help most teams to be more effective, independent of how they are constituted.
-
Ganesh Ramanarayanan
Ganramstyle Labs • 3K followers
If you're keen on real AI use cases, this is actually quite powerful, and I encourage you to read further (and help us build: https://lnkd.in/gsYnAfKK). At Hex, we've found our Threads product (https://lnkd.in/gJKCurJB) is the best way to summarize customer support load, because we're not limited to what's in the customer support portal - we can ask in terms of active users, customer size, feature adoption, and still get great answers. This is based on a strategy of keeping your data warehouse as a source of truth, and all related data work in one place - Hex - so that AI understands concepts in your business and makes smart connections between things. Arguably, ALL data analysis for a particular vertical is eventually best achieved in a tool like Hex, connected to curated data in your data warehouse. Wait - can't I do this directly in the customer support tool? Can't I connect an MCP server and get functionality like this in one of my other AI-enabled products? Maybe, but you have 100+ SaaS apps, so play this scenario out: - How many places are you going to sync your core data sets? - How many cross-product integrations are going to exist, and how many of them are you going to enable? (100 choose 2 is ~5000) - In how many of these products will all users have seats? - In how many of these products will these seats have permissions to use AI features? (Our support tool just told me I don't have permission to ask the agent questions, probably because I don't have the right seat type, which means more $$) My bet is on a world with ONE place for all analysis.
-
Srinivasan Venkatachary
6K followers
Gemini 3 is here. It’s our most intelligent model – and it’s available now in Google Search, starting with AI Mode. Gemini 3 brings incredible reasoning power to Search, making it easier to ask anything on your mind. It also unlocks incredibly rich new generative UI capabilities, so AI Mode can now dynamically create interactive tools and simulations on the fly to help you understand complex topics. Take, for example, asking AI Mode to help solve a Rubik's Cube: with Gemini 3's new gen UI capabilities, it created an interactive visualization offering strategies to master the puzzle. Proud of the phenomenal collaboration across Google Research, Google DeepMind, and Google Search on this launch. Dive deeper into the core principles that enabled our implementation of generative UI in our recently published Google Research paper, “Generative UI: LLMs are Effective UI Generators”: https://lnkd.in/gJmJR-EH It’s also an exciting week because today we launched Nano Banana Pro in Search via AI Mode. Built on Gemini 3, Nano Banana Pro can generate precise, detailed, and contextually rich visuals with factual accuracy. Access Gemini 3 and Nano Banana Pro in Search by selecting the “Thinking” model in the AI Mode drop-down menu. It’s available now to all Google AI Pro and Ultra subscribers in the U.S. and is coming to everyone in the U.S. soon.