riskmetric is designed to be readily extensible. This is done through use of the S3 method dispatch system and a conscious acknowledgement of the varying needs that someone may have when assessing package risk. With this in mind, every user facing function is designed first and foremost to be flexible.

Here we’ll walk through a trivial example in which we extend riskmetric with new assessment, scoring, and risk summary functions to determine the risk associated with a package based on whether its name starts with the letter “r”.

Adding an Assessment

Assessments are the atomic unit of the riskmetric package, and are used to kick off an individual metric evaluation. Each assessment is a generic function starting with an assess_ prefix, which can dispatch based on the subclass of the pkg_ref object.

Assessment Example

As an example, take a look at how assess_has_news has been implemented. We’ll focus on just the generic and the pkg_install functions:

#> assess_has_news <- function (x, ...) 
#> {
#>     UseMethod("assess_has_news")
#> }
#> attr(,"column_name")
#> [1] "has_news"
#> attr(,"label")
#> [1] "number of discovered NEWS files" 
#> 
#> assess_has_news.pkg_install <- function (x, ...) 
#> {
#>     # count the NEWS files discovered for the installed package, returning
#>     # the count wrapped in a pkg_metric with a unique subclass
#>     pkg_metric(length(x$news), class = "pkg_metric_has_news")
#> }

There are a couple of things to note. First, the S3 system is used to dispatch functionality for the appropriate package reference class. Since the way we’d assess the inclusion of a NEWS file might differ between an installed package and remotely sourced metadata, we can define distinct methods to process each data type appropriately.

Second, a cosmetic "column_name" attribute is used, which replaces the function name as the name of the new column after using the assess() verb.

Finally, a pkg_metric object is returned, which stores arbitrary data pertaining to the metric and importantly adopts a unique subclass for the assessment function.

Writing a New Assessment

Now we’ll write our assessment. Eventually we want to consider a package high risk if the name does not start with “r”. We’ll need to make a pkg_metric object containing the first letter of the name.

assess_name_first_letter <- function(x, ...) {
  UseMethod("assess_name_first_letter")
}
attr(assess_name_first_letter, "column_name") <- "name_first_letter"

assess_name_first_letter.pkg_ref <- function(x, ...) {
  pkg_metric(substr(x$name, 1, 1), class = "pkg_metric_name_first_letter")
}

Adding pkg_ref Metadata

Perhaps we want to store the metadata used when assessing the first letter so that it can be reused by other assessments. For particularly taxing metadata, such as metadata that requires a query against a public API, scraping a web page or a large data download, it’s important to cache it so that other assessment functions can reuse it.

To handle this, we define a function for pkg_ref_cache to dispatch to.

Example Metadata Caching

This is how the riskmetric package handles parsing the DESCRIPTION file so that it can feed into all downstream assessments without having to re-parse the file each time or copy the code to do so.

#> pkg_ref_cache.description <- function (x, name, ...) 
#> {
#>     UseMethod("pkg_ref_cache.description")
#> } 
#> 
#> pkg_ref_cache.description.pkg_install <- function (x, name, ...) 
#> {
#>     read.dcf(file.path(x$path, "DESCRIPTION"))
#> }

Once these are defined, they’ll be automatically called when the field is first accessed by the pkg_ref object, and then stored for any downstream uses.

library(riskmetric)
package <- pkg_ref("riskmetric")
package
#> <pkg_install, pkg_ref> riskmetric v0.1.0.9000
#> $path
#>   [1] "/home/user/username/R/3.6/Resources/library/riskmetric"
#> $source
#>   [1] "pkg_install"
#> $version
#>   [1] '0.1.0.9000'
#> $name
#>   [1] "riskmetric"
#> $description...
#> $help...
#> $help_aliases...
#> $news...

Notice that upon initialization, the description field indicates that it hasn’t yet been evaluated with a trailing ... in the name. When accessed, the object will call a caching function to go out and grab the package metadata and return the newly derived value.

Because the pkg_ref object stores an environment, caching these values once makes them available for any future attempts to access the field. This is helpful because we, as developers of the package, don’t need to think critically about the order in which assessments are performed, and users can redefine the order of assessments without worrying about how metadata is acquired.

Writing a Metadata Cache

Now, for our new metric, we want to cache the package name’s first letter. We need to add a new pkg_ref_cache function for our field. Thankfully, any subclass of pkg_ref can access the first letter the same way, so we just need the one function.
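Mirroring the pkg_ref_cache.description example above, a minimal caching method might look like the following sketch (the field name name_first_letter is our own choice, not an existing riskmetric field):

```r
pkg_ref_cache.name_first_letter <- function(x, name, ...) {
  # derive the cached field from the package name; this works identically
  # for every pkg_ref subclass, so a single method is enough
  substr(x$name, 1, 1)
}
```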

After adding this caching function, we need to make a small modification to assess_name_first_letter.pkg_ref in order to use our newly cached value.

assess_name_first_letter.pkg_ref <- function(x, ...) {
  pkg_metric(x$name_first_letter, class = "pkg_metric_name_first_letter")
} 

Let’s try it out!
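Assuming the caching method described above has been defined, exercising the new assessment might look like this sketch:

```r
library(riskmetric)

package <- pkg_ref("riskmetric")

# accessing the field triggers the cache function on first use
package$name_first_letter

# the assessment now simply wraps the cached value
assess_name_first_letter(package)
```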

Defining an Assessment Scoring Function

Next, we need a function for scoring our assessment output. In this case, our output is a pkg_metric object whose data is the first letter of the package name.

We’ll add a dispatched function for the score function. As a convention, these functions return a numeric value representing how well the package conforms to best practices with values between 0 (poor practice) and 1 (best practice).
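Following that convention, a sketch of a scoring method for our metric might look like this (we assume the scoring generic dispatches on the pkg_metric subclass, as riskmetric’s metric_score does for the built-in metrics):

```r
# hypothetical scoring method: 1 (best practice) when the package name
# starts with "r", 0 (poor practice) otherwise
metric_score.pkg_metric_name_first_letter <- function(x, ...) {
  as.numeric(x == "r")
}
```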

Adding our Assessment to the assess() Verb

The assess function accepts a list of functions to apply. riskmetric provides a shorthand, all_assessments(), to collect all the included assessment functions, and you’re free to add to that list to customize your own assessment toolkit.

Our scoring function will automatically get picked up and used by the score method.
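Putting the pieces together, a sketch of the pipeline might look like the following (the exact argument layout of assess() is assumed from the description above):

```r
library(riskmetric)

# append our new assessment to the built-in list
my_assessments <- c(all_assessments(), list(assess_name_first_letter))

package <- pkg_ref("riskmetric")
assessed <- assess(package, my_assessments)

# our scoring method for pkg_metric_name_first_letter is dispatched
# automatically alongside the built-in scoring functions
scored <- score(assessed)
```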

We can also define our own summarizing function to aggregate our scores into a single numeric risk.
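For example, a hypothetical summarizing function (the name summarize_package_risk is ours, not part of riskmetric) might report risk as the complement of the average score:

```r
# aggregate a collection of 0-1 scores into a single numeric risk,
# where higher values indicate higher risk
summarize_package_risk <- function(scores) {
  1 - mean(unlist(scores), na.rm = TRUE)
}

summarize_package_risk(list(has_news = 1, name_first_letter = 1))
#> [1] 0
```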

How you can help…

The riskmetric package was designed to be easily extensible. You can develop dispatched functions in your development environment, hone them into well-formed assessments and contribute them back to the core riskmetric package once you’re done.

If you’d like feedback before embarking on developing a new metric, please feel free to file an issue on the riskmetric GitHub.