In this vignette you can find details on
Modules are first-class citizens in the sense that they can be treated like any other data structure in R. They are represented as lists, so that:
library("modules")
m <- module({
  foo <- function() "foo"
})
is.list(m)
#> [1] TRUE
class(m)
#> [1] "module" "list"
S3 methods may be defined for the class module. The package itself only implements a method for the generic function print.
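As a minimal sketch of what a user-defined S3 method could look like: the generic describe below is hypothetical and defined only for this illustration; it is not part of the modules package. The method relies only on the fact that a module is a list of its exported objects.

```r
library("modules")

# 'describe' is a hypothetical generic defined here for illustration;
# it is not provided by the modules package.
describe <- function(x, ...) UseMethod("describe")

# Method for objects of class "module": list the exported names.
describe.module <- function(x, ...) {
  paste0("module exporting: ", paste(sort(names(x)), collapse = ", "))
}

m <- module({
  foo <- function() "foo"
  bar <- function() "bar"
})

describe(m)
```

Because dispatch happens on the class attribute, any generic can be extended this way without touching the module definition itself.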
Nested modules are modules defined inside other modules. In this case dependencies of the top level module are accessible to its children:
m <- module({
  import("stats", "median")
  anotherModule <- module({
    foo <- function() "foo"
  })
  bar <- function() "bar"
})
getSearchPathContent(m)
#> List of 5
#> $ modules:root : chr [1:2] "anotherModule" "bar"
#> $ modules:stats : chr "median"
#> $ modules:internals: chr [1:10] "attach" "depend" "export" "expose" ...
#> $ base : chr [1:1268] "!" "!.hexmode" "!.octmode" "!=" ...
#> $ R_EmptyEnv : chr(0)
#> - attr(*, "class")= chr [1:2] "SearchPathContent" "list"
getSearchPathContent(m$anotherModule)
#> List of 7
#> $ modules:root : chr "foo"
#> $ modules:internals: chr [1:10] "attach" "depend" "export" "expose" ...
#> $ modules:root : chr [1:2] "anotherModule" "bar"
#> $ modules:stats : chr "median"
#> $ modules:internals: chr [1:10] "attach" "depend" "export" "expose" ...
#> $ base : chr [1:1268] "!" "!.hexmode" "!.octmode" "!=" ...
#> $ R_EmptyEnv : chr(0)
#> - attr(*, "class")= chr [1:2] "SearchPathContent" "list"
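The search paths above suggest that a nested module can call a function imported by its parent without importing it again. A small sketch of that, where the names inner and center are chosen only for this illustration:

```r
library("modules")

m <- module({
  import("stats", "median")
  # The nested module does not import anything itself; 'median' is
  # found on the search path inherited from the enclosing module.
  inner <- module({
    center <- function(x) median(x)
  })
})

m$inner$center(c(1, 2, 10))
```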
Sometimes it can be useful to pass arguments to a module. If you have a background in object-oriented programming, you may find this natural. From a functional perspective, we define parameters shared by a list of closures. This is achieved by making the enclosing environment of the module available to the module itself.
m <- function(param) {
  amodule({
    fun <- function() param
  })
}
m(1)$fun()
#> [1] 1
amodule is a wrapper around module to abstract the following pattern:
m <- function(param) {
  module(topEncl = environment(), {
    fun <- function() param
  })
}
m(1)$fun()
#> [1] 1
Using one of these approaches you construct a local namespace definition with the option to pass down some arguments.
This can be very useful to handle dependencies between two modules. Instead of:
a <- module({
  foo <- function() "foo"
})
b <- module({
  a <- use(a)
  foo <- function() a$foo()
})
which would hard code the dependency, we can write:
B <- function(a) {
  amodule({
    foo <- function() a$foo()
  })
}
b <- B(a)
There are many good reasons to follow such a strategy. As an example, consider the case in which module a introduces side effects. By leaving it open as an argument, we can later decide what exactly we pass down to the constructor of b. This may be important when we want to mock a database, disable logging, or otherwise handle access to external resources.
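To make the logging case concrete, here is a hedged sketch of dependency injection with a silent stand-in. The constructors Logger, QuietLogger and Worker are hypothetical names introduced for this illustration only.

```r
library("modules")

# Hypothetical module with a side effect: it writes to the console.
Logger <- function() {
  amodule({
    info <- function(msg) cat("LOG:", msg, "\n")
  })
}

# Hypothetical stand-in with the same interface but no side effect.
QuietLogger <- function() {
  amodule({
    info <- function(msg) invisible(NULL)
  })
}

# The dependency is an argument, so the caller decides which logger is used.
Worker <- function(logger) {
  amodule({
    run <- function() {
      logger$info("running")
      "done"
    }
  })
}

Worker(Logger())$run()       # logs to the console, then returns "done"
Worker(QuietLogger())$run()  # returns "done" without logging
```

The worker never refers to a concrete logger, so swapping the implementation in tests does not require touching its definition.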
You can put not only functions into your bag (module) but any R object. This includes data: modules can be stateful. To illustrate this, we define a module that encapsulates a value and provides a get and a set method for it:
mutableModule <- module({
  .num <- NULL
  get <- function() .num
  set <- function(val) .num <<- val
})
mutableModule$get()
#> NULL
mutableModule$set(2)
In the next module we can use mutableModule and rebuild the interface to .num.
complectModule <- module({
  suppressMessages(use(mutableModule, attach = TRUE))
  getNum <- function() get()
  set(3)
})
mutableModule$get()
#> [1] 2
complectModule$getNum()
#> [1] 3
Depending on your expectations about the code above, it may come as a surprise that we can get and set that value from an attached module; furthermore, the value is not changed in mutableModule. This is because use triggers a re-initialization of any module you plug in. You can override this behaviour:
complectModule <- module({
  suppressMessages(use(mutableModule, attach = TRUE, reInit = FALSE))
  getNum <- function() get()
  set(3)
})
mutableModule$get()
#> [1] 3
complectModule$getNum()
#> [1] 3
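The effect of reInit can also be observed without attaching anything. The following is a sketch under the assumption that use returns the (re-initialized or original) module when called outside of a module definition; the counter module and its names are hypothetical.

```r
library("modules")

# Hypothetical stateful module: a private counter and one exported function.
counter <- module({
  .n <- 0
  increment <- function() {
    .n <<- .n + 1
    .n
  }
})

fresh  <- use(counter)                  # reInit = TRUE: a new instance with its own state
shared <- use(counter, reInit = FALSE)  # the same instance, state is shared

counter$increment()  # first increment of the original instance
shared$increment()   # continues counting on the original instance
fresh$increment()    # starts counting from its own initial state
```

With the default, every consumer gets a private copy of the state; reInit = FALSE is the explicit opt-in to shared state.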
In contrast to systems of object orientation, modules do not provide a formal mechanism of inheritance. Instead we can use various modes of composition. Inheritance is often used to reuse code or to add functionality to an existing module.
In this context we may use parameterized modules, use, expose and extend. The first two have already been discussed, as has dependency injection as a strategy to encode relationships between modules.
expose is most useful when we want to re-export functions from another module:
A <- function() {
  amodule({
    foo <- function() "foo"
  })
}
B <- function(a) {
  amodule({
    expose(a)
    bar <- function() "bar"
  })
}
B(A())$foo()
#> [1] "foo"
B(A())$bar()
#> [1] "bar"
Here we can easily add functionality to a module, or reuse only parts of it. Another way to achieve this is to use extend. The difference is that with expose we re-export existing functionality unchanged, whereas with extend we add lines of code to an existing module definition. This means we can (a) override private members of that module and (b) generally gain access to all implementation details. Hence the following two definitions are equivalent:
Variant A
a <- module({
  foo <- function() "foo"
  bar <- function() "bar"
})
a
#> bar:
#> function()
#>
#>
#> foo:
#> function()
Variant B
a <- module({
  foo <- function() "foo"
})
a <- extend(a, {
  bar <- function() "bar"
})
a
#> bar:
#> function()
#>
#>
#> foo:
#> function()
extend should be used with great care. It is possible and easy to break functionality of the module you extend. This is not possible, or at least more challenging, using expose.
The real use case for extend is to add unit tests to a module. You can think of using one of two patterns:
Variant A
a <- module({
  foo <- function() "foo"
  test <- function() {
    stopifnot(foo() == "foo")
  }
})
Variant B
a <- module({
  foo <- function() "foo"
})
extend(a, {
  stopifnot(foo() == "foo")
})
#> foo:
#> function()
The latter alternative will keep the interface clean and gives access to private member functions. Sometimes this can be very useful for testing.
Of course, a good way to write R code is to write packages. Modules inside packages make a lot of sense, because within a package we also have only one scope to work with. Modules provide more options.
modules::module: connects to the package's namespace by default. Functions defined inside modules have access to the internal scope of the package.
modules::amodule: provides a slightly safer way and requires explicit registration of objects from the package's namespace. This can happen via dependency injection or modules::use.
If you write constructor functions for your modules (see example below), you automatically take advantage of R CMD check. R CMD check provides some static code analysis tools which are generally helpful.
As you would avoid using library inside of packages, you should also avoid using modules::import. The R package namespace mechanism is more than capable of handling all dependencies.
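One way to follow this advice, sketched under assumptions: the constructor Center and the file location are hypothetical, and stats is assumed to be declared under Imports in the package's DESCRIPTION. Instead of calling modules::import inside the module, the dependency is referenced with a qualified call.

```r
# Sketch of a file inside a package, e.g. R/center.R (hypothetical path).
# 'stats' is assumed to be listed under Imports in DESCRIPTION.
Center <- function() {
  modules::module({
    # Qualified call instead of modules::import("stats", "median").
    center <- function(x) stats::median(x)
  })
}

Center()$center(c(1, 2, 10))
```

Writing the module as a constructor function also means R CMD check sees an ordinary function definition and can analyse it like any other code in the package.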