R Validation Hub

Status Report & Workshop

Doug Kelkhoff
on behalf of the R Validation Hub team

2023-09-18

Slides Available!

pharmar.github.io/events-positconf2023

👋 Who We Are

The R Validation Hub is a collaboration to support the adoption of R within a biopharmaceutical regulatory setting (pharmaR.org)

  • Grew out of R/Pharma 2018
  • Led by participants from ~10 organizations
  • With frequent involvement from health authorities (primarily the FDA)
  • And subscribers from ~60 organizations spanning multiple industries

🤝 Affiliates: PSI/AIMS (CAMIS)

Comparing Analysis Method Implementations in Software
A cross-industry group formed of members from PHUSE, PSI, and ASA.

  • Released a white paper providing guidance on appropriate use of stats methods, for example:
    • Don’t default to the defaults (see the sketch after this list)
    • Be specific when drafting analysis plans, including precise methods & options
  • A resource for knowing the details of methods across languages
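
A concrete example of “don’t default to the defaults” (a minimal sketch; the mapping of SAS’s default percentile definition, PCTLDEF=5, to R’s type = 2 follows the CAMIS comparison tables):

x <- 1:4

# R's default quantile algorithm (type = 7)
quantile(x, probs = 0.25)
#>  25%
#> 1.75

# SAS's default percentile definition corresponds to R's type = 2
quantile(x, probs = 0.25, type = 2)
#> 25%
#> 1.5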

🤝 Affiliates: PSI/AIMS (CAMIS)

CAMIS Comparisons Resources
Methods                     | R | SAS | Comparison
----------------------------|---|-----|-----------
Summary Statistics Rounding | R | SAS | R vs SAS
Summary Statistics          | R | SAS | R vs SAS

🤝 Affiliates: R Consortium

Works with and provides support to the R Foundation and to the key organizations developing, maintaining, distributing and using R software

Key Activities

  • The R Validation Hub
  • R Submission Working Group
  • R Repositories Working Group (i.e., CRAN enhancements, future development)

👷‍♂️ What We Do (pharmaR.org)

Products

White Paper

Guidance on compliant use of R and management of packages

New! Repositories

Building a public, validation-ready resource for R packages

Coline Zeballos

New! Communications

Connecting validation experts across the industry

Juliane Manitz

{riskmetric}

Gather and report on risk heuristics to support validation decision-making

Eric Milliman
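
A minimal sketch of the documented three-step {riskmetric} workflow (the package names here are arbitrary examples):

library(riskmetric)

# reference packages, run assessments, then roll assessments up into scores
pkg_ref(c("riskmetric", "utils")) |>
  pkg_assess() |>
  pkg_score()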

{riskassessment}

A web interface to {riskmetric}, supporting review, annotation and cataloging of decisions

Aaron Clark
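
To try the app locally (a sketch assuming the golem-style run_app() entry point that {riskassessment} exports):

install.packages("riskassessment")

# launch the Shiny app in a local R session
riskassessment::run_app()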

New! {riskscore}

An R data package capturing risk metrics across all of CRAN

Aaron Clark
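
Because {riskscore} is a data package, its contents can be browsed with base R alone; a sketch that assumes no particular dataset names:

install.packages("riskscore")

# list the datasets bundled with {riskscore}
data(package = "riskscore")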

📊 A Quick Survey

Keep your hand raised if…

  • It’s early morning and you need an excuse to stretch
  • This is your first time hearing about the R Validation Hub
  • You’re missing Andy’s posh accent
  • Your org contributes to the R Validation Hub
  • Your org leverages the R Validation Hub guidelines
  • Your org uses R Validation Hub tools ({riskmetric}, {riskassessment})

🗓️ Agenda

  • Updates (20 min)
  • Established Workstream Recap (10 min)
    past, present & future
  • Repositories Workstream Introduction (15 min)
  • Table Discussions: Leaps of Faith (20 min)
  • Room Discussion: Design Lab (20 min)
  • Closing

📣 Updates

🗝 Key Policy Updates!

If nothing else, take this home!

  • The FDA appears to accept .R files through their eSUB portal.
  • The FDA has released a draft of a new Computer Software Assurance guideline that increasingly seems to be the basis for their evaluation of R.

🗝 Key Policy Updates!

If nothing else, take this home!

Identifying Intended Use

Software used directly as part of production or the quality system automates inspection, testing, or the collection and processing of production data; software may also support development, monitoring, and automated testing. A manufacturer should use a risk-based analysis to determine the appropriate assurance activities.

🗝 Key Policy Updates!

If nothing else, take this home!

Determining the Appropriate Assurance Activities

Assurance can include ad-hoc testing, exploratory testing (active package use), error-guessing (regression testing), robust scripted testing, and limited scripted testing (traceable, reproducible testing suites).

“This approach may apply scripted testing for high-risk features”

Change of Leadership

  • You may have noticed that I am not Andy Nicholls.
  • Last year, Andy decided to step down to focus on his growing responsibilities as Head of Data Science at GSK.

Pulse Check

  • We looked back on how we had been working
  • Identified new opportunities
    1. Refine our holistic strategic direction
    2. Be more mindful about communication and organization
  • We have a new Communications workstream! (and awesome new slides!)
  • More ways to get involved

📜 Workstream Report

Case Studies

7 Companies Shared Their Approach to Package Validation

Commonalities

  • Risk categorized (high/medium/low)
  • Unit testing heavily weighted
  • Base & Recommended packages “trusted”

Differences

  • Risk stratification (e.g., coverage cutoffs)
  • Managing risk
    • human-in-the-middle review
    • restricted package subset
    • adding bespoke testing

Themes

  • Time & resource intensive
  • Requires a unique intersection of expertise
  • Lifecycle management of the ecosystem is challenging

{riskmetric} Roadmap

  • Ease of use:
    Wrapper functions for a complete workflow, prettier outputs
  • Metric completeness:
    Implement metrics for as many package sources as possible; chain sources together to create more complete assessments
  • Modular additions:
    Allow users to easily add custom assessments; create optional assessments based on community packages (e.g., oysteR, srr, pkgstats)
  • Focus on metrics and scoring:
    Make custom weighting more robust and convenient, with guidance materials on weighting specific assessments based on community feedback and our own views on best practices
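
Until that convenience layer lands, custom weighting can be sketched by hand on the score table (column names such as covr_coverage and downloads_1yr are illustrative; inspect names(scores) for what your {riskmetric} version returns):

library(riskmetric)

scores <- pkg_ref(c("riskmetric", "utils")) |>
  pkg_assess() |>
  pkg_score()

# hand-rolled weighted score: weight test coverage 3x relative to downloads
# (illustrative columns; a real weighting should cover all assessments in use)
scores$custom <- (3 * scores$covr_coverage + scores$downloads_1yr) / 4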

{riskassessment} App

Feature Recap

  • Facelifts for the Report Builder & Database View
  • Better dependency inspection
  • Org-level customizations via a config file
  • Admin user-role management
  • Package file explorer

{riskassessment} App

New Test explorer! (Code provided by GSK)

📦 Repositories

Repositories Workstream

Supporting a transparent, open, dynamic, cross-industry approach to establishing and maintaining a repository of R packages.

  • Taking ample time to engage stakeholders
    • Validation leads across the industry
    • Active health authority involvement
    • Analytic environment admins and developers
  • Considering the possibilities
    • Mapping needs to solutions that meet the industry where it is
    • …while building the path for it to move forward

How did we get here?

  • Our whitepaper is widely adopted
  • But implementing it is inconsistent & laborious
    • Variations throughout industry pose uncertainty
    • Sharing software with health authorities is a challenge
    • Health authorities, overwhelmed by technical inconsistencies, are more likely to question software use
  • We feel the most productive path forward is a shared ecosystem

Work to-date

Building consensus in package evaluation and distribution…

  1. Who needs a repository anyways?
  2. Stakeholder engagement (3 mo)
  3. Product refinement and proof-of-concept planning (1 mo)
  4. POC development (2 mo)

✋ Hold up! Why a repository?

“Every successful team starts with a small existential crisis”
unknown

  • Tools for building evaluation in-house?
  • Sharing of extra testing resources?
  • Curation of packages?
  • A stricter CRAN?

Embracing change

Old dog, new trick

  • Modern package ecosystems are the stats world’s new trick
  • Methods are provided directly by statisticians and academics, rarely by vendors.
  • Risk is managed not by itemized requirements, but by good development practices.
  • We need to learn how to manage risk in a constantly evolving ecosystem

Comparing Approaches

Vendored Stats Products                           | Data Science Ecosystem
--------------------------------------------------|-----------------------------------------------------------
Off-the-shelf cohort.                             | A “snapshot” of a living repository.
Internal tools developed against cohort packages. | Internal tools developed against latest packages.
New package versions risk incompatibility.        | New packages can be reviewed and upgraded at will.
Steep upgrade cost (time, development).           | Living ecosystem, constantly vetted against new releases.
System-specific mix of packages.                  | More likely what is used by HAs.
Tied to current validation expectations.          | Adaptable as R best practices evolve.

Challenges shipping in-house code.

Interesting Stakeholder Findings

  • Health authority primary concerns
    • Avoiding security vulnerabilities while using R
    • Visible discussions vetting methodology and relevance
  • Industry validation leads
    • Relieved that open-source tools are public, less need to audit vendored tools
  • System administrators, users and developers
    • Want clarity and consistency internally and externally

Prototyping

Running three prototypes to explore specific needs

  • Test case exchange format (repo)
  • Communication channels for methods discussion & considerations (Google Doc)
  • Risk filters and transparency of known vulnerabilities (repo)

Prototyping

Test Case Exchange Format

{
  "rPackage": {
    "name": "stats",
    "link": "https://cran.r-project.org/package=stats"
  },
  "CSADocPkg": {
    "function": {
      "name": "t.test",
      "assuranceActivity": {
        "activityType": "Scripted Testing: Robust",
        "definition": "Scripted testing efforts in which the risk of the computer system or automation includes evidence of repeatability, traceability to requirements, and auditability.",
        "parameters": {
          "testObjectives": [
            {
              "uuid": "6cde1b0f-3e41-4878-8cd5-79c87be88a7d",
              "objective": "Verify that p values produced by stats::t.test are uniformly distributed",
              "keywords": ["t.test", "p values", "uniform distribution"],
              "testCases": [
                {
                  "uuid": "4fa03a8d-2e39-4866-9cd3-69b77bd78a6b",
                  "testName": "t test produces calibrated p values",
                  "description": "This test checks that the p values produced by stats::t.test do not deviate substantially from the expected uniform distribution.",
                  "code": "set.seed(42)\\nm <- 100\\nfor (n in c(5, 50, 500)) {\\n# repeatedly sample data under null and record p value\\nres <- numeric(m)\\nfor (i in 1:m) {\\nres[i] <- t.test(rnorm(n))$p.value\\n}\\n# expect non significant result\\nexpect_true(\\nks.test(res, 'punif')$p.value > 0.05\\n)\\n}",
                  "result": "pass",
                  "environment": {
                    "container": "rocker/tidyverse:4.3.1",
                    "runtime": "singularity",
                    "runtimeVersion": "3.8",
                    "renvLockfile": ""
                  }
                }
  ...
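
Unescaped and wrapped in {testthat}, the "code" payload above runs as-is (the seed, sample sizes, and 0.05 threshold come straight from the prototype test case):

library(testthat)

test_that("t test produces calibrated p values", {
  set.seed(42)
  m <- 100
  for (n in c(5, 50, 500)) {
    # repeatedly sample data under the null and record the p value
    res <- numeric(m)
    for (i in 1:m) {
      res[i] <- t.test(rnorm(n))$p.value
    }
    # p values under the null should be indistinguishable from uniform
    expect_true(ks.test(res, "punif")$p.value > 0.05)
  }
})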

Prototyping

Communication Channels

Prototyping

Package Security & Risk Filters

install.packages("options")
#> Security vulnerabilities found in packages to be installed.
#> To proceed with installation, re-run with `accept_vulnerabilities = TRUE`
#> 
#> ── Vulnerability overview ──
#> 
#> ℹ 1 package was scanned
#> ℹ 1 package was found in the Sonatype database
#> ℹ 1 package had known vulnerability
#> ℹ A total of 1 known vulnerability was identified
#> ℹ See https://github.com/sonatype-nexus-community/oysteR/ for details.
nrow(available.packages())  # only "low risk" packages are visible
#> [1] 5

options(available_packages_filters = NULL)
nrow(available.packages())  # without filters, all packages are visible
#> [1] 17
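
The filtered view above can be wired up with base R's documented available_packages_filters option; in this sketch, low_risk_pkgs is a hypothetical allow-list (e.g., derived from {riskscore} scores):

# hypothetical allow-list of vetted, low-risk packages
low_risk_pkgs <- c("options", "riskmetric", "rlang")

# a user-supplied filter receives the availability matrix and returns a subset
risk_filter <- function(av) {
  av[av[, "Package"] %in% low_risk_pkgs, , drop = FALSE]
}

# `add = TRUE` appends the filter to R's defaults instead of replacing them
options(available_packages_filters = list(add = TRUE, risk_filter))
nrow(available.packages())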

A fork in the road

Given the key capabilities and the tools to address them, how do we bundle these solutions to address industry needs?

Support our industry today

Delivering in-house solutions for you to pick and choose from

  • Consistent processes to apply
  • Local tools to deploy in-house
  • Community forum for knowledge sharing

Build what we want the industry to be

Drive change through transparency and consistency

  • Lead by example with a public solution
  • Make it easier to adopt than re-build
  • Transparency-first solutions

What does a solution look like?

Closing the CRAN gap for the Pharma Use Case

  • Reproducibility guidelines (see the sketch after this list)
  • Standard, public assessment of packages
  • Avenues for communicating about implementations, bugs, security
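
One possible shape for the reproducibility piece (a sketch; the dated snapshot URL is illustrative of Posit Package Manager's date-based CRAN snapshots):

# pin installs to a dated repository snapshot so environments are reproducible
options(repos = c(CRAN = "https://packagemanager.posit.co/cran/2023-09-18"))
install.packages("riskmetric")

# record the exact versions in a lockfile for a later renv::restore()
renv::snapshot()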

The Proposal so Far

🧗 “Leaps of Faith”

  • A “Golden” Base Image
    to establish ground truth for testing.
  • Rethinking requirements
    if testing, external vetting (CRAN), and adoption are sufficient for Scripted Testing needs, are new requirements necessary?
  • Expectations of Public Communication
    industry-standard communication channels.
  • Nearly all meaningful assessment can be automated
    edge cases (malicious code, methods debates) are better handled by transparent community engagement.

The Proposal so Far

🧗 “Leaps of Faith” Discussion

How would your operations change if the industry adopted…

  • A “Golden” Base Image
    to establish ground truth for testing.
  • Rethinking requirements
    if testing, external vetting (CRAN), and adoption are sufficient for Scripted Testing needs, are new requirements necessary?
  • Expectations of Public Communication
    industry-standard communication channels.
  • Nearly all meaningful assessment can be automated
    edge cases (malicious code, methods debates) are better handled by transparent community engagement.

Let’s Discuss!

  • Choose one of the “Leaps of Faith” to discuss at your table
  • Sticky Notes!
    List effects on your organization (good or bad)
  • Leap Landed or Falls Short
    We’ll collect stickies into categories

We’ll discuss as a room in ~10 minutes

Regroup

As we share key discussion points, let’s try to find:

  • 2 newly enabled activities
  • 2 pitfalls to avoid

Workflow Dreams

Open Discussion

Let’s paint a “perfect” regulatory R organization. How would you like to see it work?

  • How is analysis shared? Files, images, packages?
  • How do you ensure quality? Do new workflow opportunities open up?
  • What about integrated solutions (Shiny, cross-language solutions)?
  • How are internal tools layered on open-source tools? How are they reproducibly shared with health authorities?
  • How does hiring/recruitment change when workflow is more public?

Closing

Thank you for your engagement!

We’re excited to champion new ways of working that bring pharma companies together.