Great strides in artificial intelligence development during the last five years have produced agents that are now commonplace at work and home. It is humbling to note that virtually all frontier large language models today trace back to a preprint introducing the transformer neural network architecture – a fifteen-page paper that profoundly ...
1 Introduction
Differencing is one of the most common transformations in time series analysis.
It is also one of the easiest transformations to misunderstand.
In many ARIMA-style workflows, differencing is introduced almost mechanically: i...
We are excited to introduce the new team of mentors for the rOpenSci 2026 Champions Program! This year we have eleven individuals committed to open science, bringing together a rich diversity of backgrounds and perspectives. The t... [Read more...]
Introduction
Differential Machine Learning (DML), as introduced in the recent arXiv paper (Differential Machine Learning for 0DTE Options with Stochastic Volatility and Jumps), extends supervised learning by incorporating not only function values but also their derivatives. In financial contexts, this often means sensitivities such as Greeks. However, when direct derivatives ...
I tend to write a lot of functions that create specific graphics implemented with ggplot2. Although I try to pick graphic parameters (e.g. colors, text size, etc.) that are reasonable, I will typically define all relevant aesthetics as param... [Read more...]
JAGS 5.0.0-beta is now available from SourceForge. The beta release is for two groups of people:
Please send feedback via the JAGS forums or file a bug report.
The JAGS library
The following packages are available:
The rjags package
In … Continue reading →
I’m getting more and more into data engineering these days and having used R for
a long time, I’m seeing a lot of problems that look nail-shaped to my R-shaped
hammer. The available tools to solve those problems exist for (presumably) very
good reasons, so I wanted to ...
Have you ever looked at a freshly plotted scatter plot and immediately thought, “Ah, this is clearly a logarithmic curve with some heteroskedastic noise,” without running a single line of modeling code? How do you do that? You don’t perform gradient descent in your head. You use your intuition! ...
Snow in Inwood, New York. Photograph by the author.
Recently I’ve been looking at hourly ridership data from the New York City Subway. Last time we learned that people go to work in the morning and come home in the eve... [Read more...]
A note to myself on survival analysis — KM curves, log-rank tests & Cox models 🧮 If I wrote it the way I understood it, maybe I’ll actually remember it 🤞
Motivations
We see survival analysis or more generally call...
rvflnet is an R package that implements a Random Vector Functional Link (RVFL) network. It is a nonlinear expressive version of glmnet that can be used for regression, classification and survival analysis.
Frank Harrell’s Regression Modeling Strategies online seminar will take place May 14, 15, 18, and 19. This workshop covers principled strategies for building, validating, and interpreting multivariable regression models for a wide range of outcomes, with emphasis on predictive accuracy, avoiding overfitting, and interpreting estimated effects. It explores spline methods, data reduction, benefits ... [Read more...]
Join our workshop on Reactive Shiny Apps and Deployment with Google Cloud Run: Intermediate R Shiny Workshop, which is part of our workshops for Ukraine series! Here’s some more info:
Title: Reactive Shiny Apps and Deployment with Google Cloud Run: Intermediate R Shiny Workshop
Date: Thursday, May 21st, 18:00 – 20:00 ... [Read more...]
The Missing Standard
CDISC released the Population Pharmacokinetic (PopPK) Implementation Guide in 2023, giving the clinical programming community a clear structural blueprint for PK analysis datasets. But Exposure-Response (ER) modeling — wh... [Read more...]
Dear rOpenSci friends, it’s time for our monthly news roundup! You can read this post on our blog. Now let’s dive into the activity at and around rOpenSci!
Tomáš Kalibera (1978–2026)
The rOpenSci team is deeply saddened at the los... [Read more...]
dplyr verbs are descriptive: let’s make them more verbose!
Yet another pipe for R.
Repost for better image handling on r-bloggers.
Motivation
In SAS, every DATA step prints a log:
NOTE: There were 120000 observations read f...
Hello, I’d like to share CougarStats, a free and open-source R Shiny web app I developed to support the teaching and learning of Statistics. CougarStats runs entirely in a browser and is designed for accessibility and ease of use. You can explore the app here: https://www.cougarstats.ca/ ... [Read more...]
Introduction
Universities are increasingly using collaborative learning pedagogies, which can benefit learners through deeper understanding of course content and teamwork skills. However, the realisation of these sought-after benefits depends on how educators assign learners to groups. Educators have formulated various mathematical models to perform this assignment. Some have developed ... [Read more...]
Understanding R’s describe() Function: A Complete Guide to Summary Statistics
Table of Contents: Introduction to describe(); Breaking Down the Output Columns; Key Statistics and Their Interpretation; Practical Examples; When to Use Which Statistic; Extending the Functionality; Conclusion
Introduction to describe()
The describe() function from R’s psych package (Revelle, 2023) ... [Read more...]