R Validation Hub

Status Report & Workshop

Doug Kelkhoff
on behalf of the R Validation Hub team

2023-09-18

Slides Available!

pharmar.github.io/events-positconf2023

👋 Who We Are

The R Validation Hub is a collaboration to support the adoption of R within a biopharmaceutical regulatory setting (pharmaR.org)

  • Grew out of R/Pharma 2018
  • Led by participants from ~10 organizations
  • With frequent involvement from health authorities (primarily the FDA)
  • And subscribers from ~60 organizations spanning multiple industries

🤝 Affiliates: PSI/AIMS (CAMIS)

Comparing Analysis Method Implementations in Software
A cross-industry group formed of members from PHUSE, PSI, and ASA.

  • Released a white paper providing guidance on appropriate use of stats methods, for example:
    • Don’t default to the defaults (see the sketch after this list)
    • Be specific when drafting analysis plans, including precise methods & options
  • A resource for knowing the details of methods across languages
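
A concrete example of “don’t default to the defaults” (a minimal sketch; the mapping of SAS’s default percentile definition, PCTLDEF=5, to R’s type = 2 follows the CAMIS comparison tables):

x <- 1:4

# R's default quantile algorithm (type = 7)
quantile(x, probs = 0.25)
#>  25%
#> 1.75

# SAS's default percentile definition corresponds to R's type = 2
quantile(x, probs = 0.25, type = 2)
#> 25%
#> 1.5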

🤝 Affiliates: PSI/AIMS (CAMIS)

CAMIS Comparisons Resources
Methods                     | R | SAS | Comparison
----------------------------|---|-----|-----------
Summary Statistics Rounding | R | SAS | R vs SAS
Summary Statistics          | R | SAS | R vs SAS

🤝 Affiliates: R Consortium

Works with and provides support to the R Foundation and to the key organizations developing, maintaining, distributing and using R software

Key Activities

  • The R Validation Hub
  • R Submission Working Group
  • R Repositories Working Group (i.e., CRAN enhancements, future development)

👷‍♂️ What We Do (pharmaR.org)

Products

White Paper

Guidance on compliant use of R and management of packages

New! Repositories

Building a public, validation-ready resource for R packages

Coline Zeballos

New! Communications

Connecting validation experts across the industry

Juliane Manitz

{riskmetric}

Gather and report on risk heuristics to support validation decision-making

Eric Milliman
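
A minimal sketch of the documented three-step {riskmetric} workflow (the package names here are arbitrary examples):

library(riskmetric)

# reference packages, run assessments, then roll assessments up into scores
pkg_ref(c("riskmetric", "utils")) |>
  pkg_assess() |>
  pkg_score()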

{riskassessment}

A web interface to {riskmetric}, supporting review, annotation and cataloging of decisions

Aaron Clark
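
To try the app locally (a sketch assuming the golem-style run_app() entry point that {riskassessment} exports):

install.packages("riskassessment")

# launch the Shiny app in a local R session
riskassessment::run_app()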

New! {riskscore}

An R data package capturing risk metrics across all of CRAN

Aaron Clark
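
Because {riskscore} is a data package, its contents can be browsed with base R alone; a sketch that assumes no particular dataset names:

install.packages("riskscore")

# list the datasets bundled with {riskscore}
data(package = "riskscore")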

📊 A Quick Survey

Keep your hand raised if…

  • It’s early morning and you need an excuse to stretch
  • This is your first time hearing about the R Validation Hub
  • You’re missing Andy’s posh accent
  • Your org contributes to the R Validation Hub
  • Your org leverages the R Validation Hub guidelines
  • Your org uses R Validation Hub tools ({riskmetric}, {riskassessment})

🗓️ Agenda

  • Updates (20 min)
  • Established Workstream Recap (10 min)
    past, present & future
  • Repositories Workstream Introduction (15 min)
  • Table Discussions: Leaps of Faith (20 min)
  • Room Discussion: Design Lab (20 min)
  • Closing

📣 Updates

🗝 Key Policy Updates!

If nothing else, take this home!

  • The FDA appears to accept .R files through their eSUB portal.
  • The FDA has released a draft of a new Computer Software Assurance guideline that increasingly seems to be the basis for their evaluation of R.

🗝 Key Policy Updates!

If nothing else, take this home!

Identifying Intended Use

Software used directly as part of production or the quality system automates inspection, testing, or the collection and processing of production data; software may also support development, monitoring, and automated testing. A manufacturer should use a risk-based analysis to determine the appropriate assurance activities.

🗝 Key Policy Updates!

If nothing else, take this home!

Determining the Appropriate Assurance Activities

Assurance can include ad-hoc testing, exploratory testing (active package use), error-guessing (regression testing), robust scripted testing, and limited scripted testing (traceable, reproducible testing suites).

“This approach may apply scripted testing for high-risk features”

Change of Leadership

  • You may have noticed that I am not Andy Nicholls.
  • Last year, Andy decided to step down to focus on his growing responsibilities as Head of Data Science at GSK.

Pulse Check

  • We looked back on how we had been working
  • Identified new opportunities
    1. Refine our holistic strategic direction
    2. Be more mindful about communication and organization
  • We have a new Communications workstream! (and awesome new slides!)
  • More ways to get involved

📜 Workstream Report

Case Studies

7 Companies Shared Their Approach to Package Validation

Commonalities

  • Risk categorized (high/medium/low)
  • Unit testing heavily weighted
  • Base & Recommended packages “trusted”

Differences

  • Risk stratification (e.g., coverage cutoffs)
  • Managing risk
    • human-in-the-middle review
    • restricted package subset
    • adding bespoke testing

Themes

  • Time & resource intensive
  • Requires a unique intersection of expertise
  • Lifecycle management of the ecosystem is challenging

{riskmetric} Roadmap

  • Ease of use:
    Wrapper functions for a complete workflow, prettier outputs
  • Metric completeness:
    Implement metrics for as many package sources as possible; chain sources together to create more complete assessments
  • Modular additions:
    Allow users to easily add custom assessments; create optional assessments based on community packages (e.g., oysteR, srr, pkgstats)
  • Focus on metrics and scoring:
    Make custom weighting more robust and convenient, with guidance materials on weighting specific assessments based on community feedback and our own views on best practices
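
Until that convenience layer lands, custom weighting can be sketched by hand on the score table (column names such as covr_coverage and downloads_1yr are illustrative; inspect names(scores) for what your {riskmetric} version returns):

library(riskmetric)

scores <- pkg_ref(c("riskmetric", "utils")) |>
  pkg_assess() |>
  pkg_score()

# hand-rolled weighted score: weight test coverage 3x relative to downloads
# (illustrative columns; a real weighting should cover all assessments in use)
scores$custom <- (3 * scores$covr_coverage + scores$downloads_1yr) / 4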

{riskassessment} App

Feature Recap

  • Facelifts for the Report Builder & Database View
  • Better dependency inspection
  • Org-level customizations via a config file
  • Admin user-role management
  • Package file explorer

{riskassessment} App

New Test explorer! (Code provided by GSK)

📦 Repositories

Repositories Workstream

Supporting a transparent, open, dynamic, cross-industry approach to establishing and maintaining a repository of R packages.

  • Taking ample time to engage stakeholders
    • Validation leads across the industry
    • Active health authority involvement
    • Analytic environment admins and developers
  • Considering the possibilities
    • Mapping needs to solutions that meet the industry where it is
    • …while building the path for it to move forward

How did we get here?

  • Our whitepaper is widely adopted
  • But implementing it is inconsistent & laborious
    • Variations throughout industry pose uncertainty
    • Sharing software with health authorities is a challenge
    • Health authorities, overwhelmed by technical inconsistencies, are more likely to question software use
  • We feel the most productive path forward is a shared ecosystem

Work to-date

Building consensus in package evaluation and distribution…

  1. Who needs a repository anyways?
  2. Stakeholder engagement (3 mo)
  3. Product refinement and proof-of-concept planning (1 mo)
  4. POC development (2 mo)

✋ Hold up! Why a repository?

“Every successful team starts with a small existential crisis”
unknown

  • Tools for building evaluation in-house?
  • Sharing of extra testing resources?
  • Curation of packages?
  • A stricter CRAN?

Embracing change

Old dog, new trick

  • Modern package ecosystems are the stats world’s new trick
  • Methods are provided directly by statisticians and academics, rarely by vendors.
  • Risk is managed not by itemized requirements, but by good development practices.
  • We need to learn how to manage risk in a constantly evolving ecosystem

Comparing Approaches

Vendored Stats Products                           | Data Science Ecosystem
--------------------------------------------------|-----------------------------------------------------------
Off-the-shelf cohort.                             | A “snapshot” of a living repository.
Internal tools developed against cohort packages. | Internal tools developed against latest packages.
New package versions risk incompatibility.        | New packages can be reviewed and upgraded at will.
Steep upgrade cost (time, development).           | Living ecosystem, constantly vetted against new releases.
System-specific mix of packages.                  | More likely what is used by HAs.
Tied to current validation expectations.          | Adaptable as R best practices evolve.

Challenges shipping in-house code.

Interesting Stakeholder Findings

  • Health authority primary concerns
    • Avoiding security vulnerabilities while using R
    • Visible discussions vetting methodology and relevance
  • Industry validation leads
    • Relieved that open-source tools are public, less need to audit vendored tools
  • System administrators, users and developers
    • Want clarity and consistency internally and externally

Prototyping

Running three prototypes to explore specific needs

  • Test case exchange format (repo)
  • Communication channels for methods discussion & considerations (Google Doc)
  • Risk filters and transparency of known vulnerabilities (repo)

Prototyping

Test Case Exchange Format

{
  "rPackage": {
    "name": "stats",
    "link": "https://cran.r-project.org/package=stats"
  },
  "CSADocPkg": {
    "function": {
      "name": "t.test",
      "assuranceActivity": {
        "activityType": "Scripted Testing: Robust",
        "definition": "Scripted testing efforts in which the risk of the computer system or automation includes evidence of repeatability, traceability to requirements, and auditability.",
        "parameters": {
          "testObjectives": [
            {
              "uuid": "6cde1b0f-3e41-4878-8cd5-79c87be88a7d",
              "objective": "Verify that p values produced by stats::t.test are uniformly distributed",
              "keywords": ["t.test", "p values", "uniform distribution"],
              "testCases": [
                {
                  "uuid": "4fa03a8d-2e39-4866-9cd3-69b77bd78a6b",
                  "testName": "t test produces calibrated p values",
                  "description": "This test checks that the p values produced by stats::t.test do not deviate substantially from the expected uniform distribution.",
                  "code": "set.seed(42)\\nm <- 100\\nfor (n in c(5, 50, 500)) {\\n# repeatedly sample data under null and record p value\\nres <- numeric(m)\\nfor (i in 1:m) {\\nres[i] <- t.test(rnorm(n))$p.value\\n}\\n# expect non significant result\\nexpect_true(\\nks.test(res, 'punif')$p.value > 0.05\\n)\\n}",
                  "result": "pass",
                  "environment": {
                    "container": "rocker/tidyverse:4.3.1",
                    "runtime": "singularity",
                    "runtimeVersion": "3.8",
                    "renvLockfile": ""
                  }
                }
  ...
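
Unescaped and wrapped in {testthat}, the "code" payload above runs as-is (the seed, sample sizes, and 0.05 threshold come straight from the prototype test case):

library(testthat)

test_that("t test produces calibrated p values", {
  set.seed(42)
  m <- 100
  for (n in c(5, 50, 500)) {
    # repeatedly sample data under the null and record the p value
    res <- numeric(m)
    for (i in 1:m) {
      res[i] <- t.test(rnorm(n))$p.value
    }
    # p values under the null should be indistinguishable from uniform
    expect_true(ks.test(res, "punif")$p.value > 0.05)
  }
})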

Prototyping

Communication Channels

Prototyping

Package Security & Risk Filters

install.packages("options")
#> Security vulnerabilities found in packages to be installed.
#> To proceed with installation, re-run with `accept_vulnerabilities = TRUE`
#> 
#> ── Vulnerability overview ──
#> 
#> ℹ 1 package was scanned
#> ℹ 1 package was found in the Sonatype database
#> ℹ 1 package had known vulnerability
#> ℹ A total of 1 known vulnerability was identified
#> ℹ See https://github.com/sonatype-nexus-community/oysteR/ for details.
nrow(available.packages())  # only "low risk" packages are visible
#> [1] 5

options(available_packages_filters = NULL)
nrow(available.packages())  # without filters, all packages are visible
#> [1] 17
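
The filtered view above can be wired up with base R's documented available_packages_filters option; in this sketch, low_risk_pkgs is a hypothetical allow-list (e.g., derived from {riskscore} scores):

# hypothetical allow-list of vetted, low-risk packages
low_risk_pkgs <- c("options", "riskmetric", "rlang")

# a user-supplied filter receives the availability matrix and returns a subset
risk_filter <- function(av) {
  av[av[, "Package"] %in% low_risk_pkgs, , drop = FALSE]
}

# `add = TRUE` appends the filter to R's defaults instead of replacing them
options(available_packages_filters = list(add = TRUE, risk_filter))
nrow(available.packages())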

A fork in the road

Given the key capabilities and the tools to address them, how do we bundle these solutions to address industry needs?

Support our industry today

Delivering in-house solutions for you to pick and choose from

  • Consistent processes to apply
  • Local tools to deploy in-house
  • Community forum for knowledge sharing

Build what we want the industry to be

Drive change through transparency and consistency

  • Lead by example with a public solution
  • Make it easier to adopt than re-build
  • Transparency-first solutions

What does a solution look like?

Closing the CRAN gap for the Pharma Use Case

  • Reproducibility guidelines (see the sketch after this list)
  • Standard, public assessment of packages
  • Avenues for communicating about implementations, bugs, security
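
One possible shape for the reproducibility piece (a sketch; the dated snapshot URL is illustrative of Posit Package Manager's date-based CRAN snapshots):

# pin installs to a dated repository snapshot so environments are reproducible
options(repos = c(CRAN = "https://packagemanager.posit.co/cran/2023-09-18"))
install.packages("riskmetric")

# record the exact versions in a lockfile for a later renv::restore()
renv::snapshot()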

The Proposal so Far

🧗 “Leaps of Faith”

  • A “Golden” Base Image
    to establish ground truth for testing.
  • Rethinking requirements
    if testing, external vetting (CRAN), and adoption are sufficient for Scripted Testing needs, are new requirements necessary?
  • Expectations of Public Communication
    industry-standard communication channels.
  • Nearly all meaningful assessment can be automated
    edge cases (malicious code, methods debates) are better handled by transparent community engagement.

The Proposal so Far

🧗 “Leaps of Faith” Discussion

How would your operations change if the industry adopted…

  • A “Golden” Base Image
    to establish ground truth for testing.
  • Rethinking requirements
    if testing, external vetting (CRAN), and adoption are sufficient for Scripted Testing needs, are new requirements necessary?
  • Expectations of Public Communication
    industry-standard communication channels.
  • Nearly all meaningful assessment can be automated
    edge cases (malicious code, methods debates) are better handled by transparent community engagement.

Let’s Discuss!

  • Choose one of the “Leaps of Faith” to discuss at your table
  • Sticky Notes!
    List effects on your organization (good or bad)
  • Leap Landed or Falls Short
    We’ll collect stickies into categories

We’ll discuss as a room in ~10 minutes

Regroup

As we share key discussion points, let’s try to find:

  • 2 newly enabled activities
  • 2 pitfalls to avoid

Workflow Dreams

Open Discussion

Let’s paint a “perfect” regulatory R organization. How would you like to see it work?

  • How is analysis shared? Files, images, packages?
  • How do you ensure quality? Do new workflow opportunities open up?
  • What about integrated solutions (Shiny, cross-language solutions)?
  • How are internal tools layered on open-source tools? How are they reproducibly shared with health authorities?
  • How does hiring/recruitment change when workflow is more public?

Closing

Thank you for your engagement!

We’re excited to champion new ways of working that bring pharma companies together.