Evolving SkyRL into a Highly-Modular RL Framework https://lnkd.in/eiCEAdS6 SkyRL-v0.1: Making SkyRL Modular In the [original release of SkyRL](https://lnkd.in/eDQFxZef), we introduced an agentic layer in the RL stack for multi-turn tool-use LLMs, optimized for long-horizon, real-environment tasks like SWE-Bench. Today, we are upgrading SkyRL to a highly modular RL framework for training LLMs, with two key additions: 1) A **modular**, **performant** **RL framework** for training LLMs. SkyRL makes it easy to prototype new training algorithms, environments, and training execution plans without compromising usability or speed. 2) A **gymnasium of tool-use tasks** with a simple environment interface and an **out-of-the-box library of popular tasks** such as math, code, search, and SQL. SkyRL’s modularity enables easy implementation of real-world improvements, such as async training, heterogeneous hardware, and new environments, with **under 100 LoC** and up to **1.8× faster training**.
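To make the "simple environment interface" idea concrete, here is a minimal sketch of what a gym-style, multi-turn task environment could look like. It is not SkyRL's actual API: `StepResult`, `MathEnv`, `rollout`, and the `reset`/`step` signatures are hypothetical names chosen for illustration, and the policy is a plain function standing in for an LLM.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str  # text fed back to the model for the next turn
    reward: float     # scalar reward for this turn
    done: bool        # True once the episode is over


class MathEnv:
    """Hypothetical task environment: the model must output the exact answer."""

    def __init__(self, question: str, answer: str, max_turns: int = 3):
        self.question = question
        self.answer = answer
        self.max_turns = max_turns
        self.turns = 0

    def reset(self) -> str:
        # Start a new episode and return the initial prompt.
        self.turns = 0
        return self.question

    def step(self, action: str) -> StepResult:
        # Score one model response; end on success or when turns run out.
        self.turns += 1
        if action.strip() == self.answer:
            return StepResult("correct", 1.0, True)
        done = self.turns >= self.max_turns
        return StepResult("incorrect, try again", 0.0, done)


def rollout(env, policy) -> float:
    """Run one episode with `policy` (observation -> action); return total reward."""
    obs = env.reset()
    total = 0.0
    while True:
        result = env.step(policy(obs))
        total += result.reward
        if result.done:
            return total
        obs = result.observation
```

With an interface of this shape, adding a new task mostly means writing another class with `reset` and `step`; the rollout loop does not change.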
Frédéric Barbaresco’s Post
More Relevant Posts
-
Cursor 2.0 now has a built-in browser that their agents can use. Now, put the following in perspective, beyond the fact that it is a code editor: – Cursor can run multiple agents, with multiple LLMs in parallel. – Cursor has full access to the file system (can read/write files). – Cursor can write code and run it alongside the other tasks. – Cursor can browse the internet, navigating websites and reading webpages. This is the recipe for an advanced, powerful platform to do pretty much everything you can think of. Explorations will follow. https://lnkd.in/duDQUFA4 ––––––––––– Shameless plug: ✓ Run a boring tool (MCP ready) to automate GSlides: https://deckserve.com ✓ Build AI-assisted and agentic internal tools: https://eloquentops.com
-
PDF: HTML_Service_Integration_Analysis.md.pdf Here is why Claude is so good at code, based on this prompt (together with 80k tokens of code from the main repo + 4 key test files + the mitmproxy interceptor code): "Hi we have been adding support for the Html Service to the attached Mitmproxy service" "can you first confirm if that service is now available via the main mitmproxy workflow (I also included the code for the fastapi_interceptor which is the code that runs on the vm with the mitmproxy (which then talks to this service)" "note that the reason we are adding this Html Service is that the original design (still in the code) used URLs to get the data and then transform the Html, which was a bad workflow since we already get the HTML from the Mitmproxy" ... Claude created the analysis you can see in the PDF below. This came after a longish internal 'reasoning' thread, which is also quite interesting to read (interestingly, I didn't have Claude's 'Extended Thinking' mode activated). And btw, this analysis is spot on; I just wanted to make sure I had not missed anything before I continue with the next part of this integration :) 〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️〰️ ⤵️ Listed in my collection of: 📍 Examples from day-to-day GenAI powered development workflows (Part 3) 👉 https://lnkd.in/eNuAaP-h
-
Cursor 2.0 just launched, and it's transforming software. While building my latest project, I found the new features to be game-changers. Here's what stands out: ⚡ Composer: 4x faster responses. Most tasks complete in under 30 seconds. Whether implementing complex features, integrations, or refactoring, speed matters. 🤖 Multi-Agent Interface: 8 agents in parallel. On a single prompt, you can run agents working on backend API routes, frontend components, database migrations, and authentication flows, all simultaneously, each in its own isolated codebase copy. This dramatically cuts development time. 🌐 Embedded Browser (GA): test without leaving the editor. Test and refine code directly in Cursor. No more switching between IDE and browser. 📋 Improved Code Review: see all changes at once. Reviewing agent-generated changes across multiple files in one view streamlines the process. The multi-agent interface is the standout. Running multiple features in parallel without conflicts saves hours on complex projects. If you're building software with AI assistance, this release is worth exploring.
-
💡 𝐃𝐚𝐲 𝟏𝟎 𝐨𝐟 #𝟏𝟎𝟎𝐃𝐚𝐲𝐬𝟏𝟎𝟎𝐓𝐨𝐨𝐥𝐬 𝐓𝐨𝐝𝐚𝐲’𝐬 𝐭𝐨𝐨𝐥: ToolFK.com → https://toolfk.com/ 𝐖𝐡𝐚𝐭 𝐢𝐬 𝐢𝐭? ToolFK.com is an extensive online toolbox offering hundreds of free utilities spanning images, video, text, code, conversions and more—all under one roof. You’ll find online compilers, AI-image generators, PDF converters, regex testers, QR generators and much more. Why I like it? It’s the Swiss-Army knife for digital creators, developers and anyone needing quick online utilities. No signup required for many tools — ready when you are. Covers a huge range: from “image to video” AI workflows to “base64-to-file” conversions and code compilers. 𝐐𝐮𝐢𝐜𝐤 𝐭𝐢𝐩: Next time you need a fast tool (e.g., convert a PDF to PPT, run a quick code snippet, or generate an AI image from a prompt), bookmark ToolFK.com. Search by the type of task you have (e.g., “image to video”, “online compiler”, “QR code generator”) and jump straight in—no installations, no fuss. Have you used ToolFK.com (or similar all-in-one tool hubs)? What was the utility you found most helpful? Drop your experience below 👇 Stay tuned for Tool #11 tomorrow! #100Days100Tools #ToolOfTheDay #DevTools #Productivity #OnlineUtilities
-
🚀 Form Filling MCP Agent 😎 Filling out forms online always felt like a chore 🥱: switching tabs, figuring out which site is official, hunting for which documents to upload, and not knowing what’s next. That frustration while applying for a PAN card recently led me to build something better. 🚀 Form Filling MCP Agent is a complete multi-agent system powered by Model Context Protocol (MCP) and integrated with Claude Desktop. Just say “Help me Apply for a PAN card” and the agents take it from there 🙌 😄. This project runs through three dedicated MCP server agents working together: 🔍 Research Agent: finds the right websites and requirements 🤖 Automation Agent: fills and navigates across form UIs 🎯 Coordinator: manages and synchronizes the entire workflow through MCP ✅ Simple forms: working beautifully ⚡ Complex forms: 50–60% automated (improving daily) 📄 Document uploads: in progress. Built with Python, Playwright, Pydantic, and a full MCP integration with Claude Desktop, this setup already handles end-to-end form workflows intelligently, from discovery to submission. The vision is simple: to make online processes completely effortless. No endless research, no messy UI hopping, no second-guessing steps, just a seamless, guided, AI-driven experience. Watching Claude and the agents collaborate in real time, researching, reasoning, and filling forms across different websites, feels like a glimpse of what human-AI collaboration will look like next. This is still an evolving project; document uploads and complex multi-step workflows are the next milestones. But the potential here is huge: a world where applying for anything online becomes as natural as having a conversation. Attached is a short demo 🎥 showing it in action. Would love to hear your thoughts, feedback, or ideas for where this can go next.
Claude Deepak Kamboj Aishwarya Naresh Reganti Himanshu Kriplani #AI #MCP #ClaudeAI #Automation #Innovation #Productivity #Agents #AgenticAI #Python #ArtificialIntelligence #AITools #TechInnovation #AIRevolution #FutureOfWork #GenAI #WebAutomation #AIProductivity #ClaudeDesktop #DigitalTransformation #MachineLearning #A2A #Claude #AIinAction
-
Never ship AI code you cannot explain, line by line. That single rule is how OptiPhoenix uses Cursor, Claude, and Convert to speed up A/B tests without losing code ownership. In this guide, Shashi Ranjan and Kamal Sahni break down how they use these tools to accelerate A/B test development while maintaining full code ownership and visibility. Guardrails for AI-assisted experiments - Scope: boilerplate, utilities, HTML and CSS, refactors, debugging, docs - Block: secrets, client data, internal URLs, proprietary logic - Process: strict .cursorrules, human review, security scan, QA checklist - Outcome: faster builds, cleaner diffs, zero mystery code Quick setup you can copy - Node.js 20+ - npm i @convertcom/mcp-server - Cursor Settings → Tools and MCP, add Convert server with API key and secret - Create .cursorrules with inputs, directory layout, file templates, coding standards - Prompt template to start a new test, then let Cursor scaffold and compile QA checklist - No console errors, selectors resolve - Desktop and mobile visuals verified - Events fire, sticky CTAs proxy original behavior - CSS specificity checked, minimal conflicts What this delivers - 60 to 70 percent less manual coding on typical UI and layout changes - Reproducible structure across projects - Clear handoff notes for QA and analytics How are you setting guardrails for AI-generated code in experimentation work? #CRO #ABTesting #DataAnalytics #MachineLearning #FutureTech
-
<< FileSearch - How File Search works File Search accelerates your development workflow by handling the complexities of RAG for you. It provides a user-friendly alternative to a self-managed setup. Simple, integrated developer experience: We've streamlined the entire RAG process. File Search automatically manages file storage, optimal chunking strategies, embeddings and the dynamic injection of retrieved context into your prompts. It works within the existing `generateContent` API, making it easy to adopt. Powerful vector search: Powered by our latest state-of-the-art Gemini Embedding model, File Search uses vector search to understand the meaning and context of a user's query. It can find relevant information from your documents, even if the exact words aren't used. Built-in citations: The model’s responses automatically include citations that specify which parts of your documents were used to generate the answer, simplifying verification. Support for a wide range of formats: You can build a comprehensive knowledge base using a vast array of file formats, including PDF, DOCX, TXT, JSON and many common programming language file types (see the full list of supported formats in the docs) >>
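The pipeline described in that excerpt (chunk files, embed the chunks, run a vector search, return citations) can be sketched in a few lines. This is a toy illustration, not the File Search API: every function name here is hypothetical, and a bag-of-words count vector stands in for the Gemini Embedding model, so unlike a real embedding it only matches exact words rather than meaning.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into fixed-size word windows (a deliberately naive strategy)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase token counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(files: dict[str, str], size: int = 8) -> list[dict]:
    """Chunk and embed every file, keeping the source name for citations."""
    index = []
    for name, text in files.items():
        for i, piece in enumerate(chunk(text, size)):
            index.append({"source": name, "chunk": i, "text": piece, "vec": embed(piece)})
    return index


def search(index: list[dict], query: str, k: int = 1) -> list[dict]:
    """Return the k chunks most similar to the query, each carrying its citation."""
    q = embed(query)
    return sorted(index, key=lambda e: cosine(q, e["vec"]), reverse=True)[:k]
```

In a full RAG loop, the `text` of the returned chunks would be injected into the prompt, and the `source` and `chunk` fields would back the citations shown with the answer.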
-
update after another ~3 months. still seeing steady growth. some commentary: i have been bearish on mcp because it doesn't have a solution for tool pollution. i think the pipe dream for everyone is to have a chatbot that can access all their data from any system of record and synthesize it for good use. the fundamental limitation is the expressiveness of the tools available and the LLM's capability to compose those tools. an analogy from my prof that i really liked is this: "when an engineer sits down to write code that solves a problem, they don't write out the individual functions then compose those together. instead they do the opposite, they write the complete script to perform the solution then after the fact, they may refactor to composable units." so all that to say, this code execution direction from Anthropic is compelling: https://lnkd.in/gxBV5-Zq what's more expressive than a programming language? data source: https://lnkd.in/gFpKBUA8 original post: https://lnkd.in/gF3ACufb update 1 post: https://lnkd.in/gkE9G6cA
-
The OpenAI node in n8n now allows using built-in tools like web search and code interpreter. Code execution is pretty handy if you want to run any dynamic calculation or data processing as part of your request. LLMs are notoriously bad at math, so just prompting one to do any sort of calculation is a bad idea. But they excel at writing code. So instead, you can now ask the model to generate code and also run it for you. Combine that with a web search and you get a nice data fetching and processing flow within a single node.
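The "generate code, then run it" pattern can be sketched without any LLM or n8n involved. In the assumed setup below, the model returns an arithmetic expression as text and a small evaluator computes the result instead of trusting the model's own arithmetic. `run_expression` is a hypothetical helper, not part of n8n or the OpenAI API, and it accepts only numbers and basic operators rather than arbitrary code.

```python
import ast
import operator

# Whitelisted operators; anything else in the expression is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def run_expression(source: str):
    """Evaluate a model-written arithmetic expression without using eval()."""

    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")

    return ev(ast.parse(source, mode="eval"))


# Pretend this string came back from the model instead of a guessed number.
model_output = "1234 * 5678"
print(run_expression(model_output))  # 7006652
```

A hosted code interpreter runs full programs in a sandbox; the whitelist here is just the smallest safe stand-in for that idea.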