An Introduction to excerptr

Andreas Dominik Cullmann

2021-08-03, 16:26:49

excerptr is an R interface to the Python package excerpts. See that package's documentation for the motivation behind it.

Suppose you have a script

path <- system.file("tests", "files", "some_file.R", package = "excerptr")
cat(readLines(path), sep = "\n")
#######% % All About Me
#######% % Me
####### The above defines a pandoc markdown header.
####### This is more text that will not be extracted.
#######% **This** is an example of a markdown paragraph: markdown 
#######% recognizes only six levels of heading, so we use seven or
#######% more levels to mark "normal" text.
#######% Here you can use the full markdown 
#######% [syntax](http://daringfireball.net/projects/markdown/syntax).
#######% *Note* the trailing line: markdown needs an empty line to end
#######% a paragraph.
#######%

#% A section
##% A subsection
### Not a subsubsection but a plain comment.
############% Another markdown paragraph.
############%
####### More text that will not be extracted.

and you want to extract the comments marked with ‘%’ into a file that gives you a table of contents of your script. Then

excerptr::excerptr(file_name = path, run_pandoc = FALSE, output_path = tempdir())
## [1] 0

gives you

cat(readLines(file.path(tempdir(), sub("\\.R$", ".md", basename(path)))), 
    sep = "\n")
% All About Me
% Me
**This** is an example of a markdown paragraph: markdown 
recognizes only six levels of heading, so we use seven or
more levels to mark "normal" text.
Here you can use the full markdown 
[syntax](http://daringfireball.net/projects/markdown/syntax).
*Note* the trailing line: markdown needs an empty line to end
a paragraph.

# A section
## A subsection
Another markdown paragraph.
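
The marker convention visible in the input and output above can be sketched in a few lines of plain R. This is only an illustration inferred from the example, not the implementation used by excerptr or excerpts: the function name `sketch_excerpt` and its arguments are hypothetical, and the rule it encodes (one to six comment characters become a heading, seven or more become plain text) is an assumption read off the output shown.

```r
# Hypothetical helper, NOT part of excerptr: mimics the marker convention
# as it appears in the example above.
sketch_excerpt <- function(lines, comment = "#", magic = "%") {
    # A marked line is: one or more comment characters, the magic
    # character, an optional blank, then the text to keep.
    pattern <- paste0("^(", comment, "+)", magic, " ?(.*)$")
    marked <- grepl(pattern, lines)
    hashes <- sub(pattern, "\\1", lines[marked])
    text <- sub(pattern, "\\2", lines[marked])
    # Assumption: up to six comment characters form a markdown heading,
    # seven or more mark ordinary text.
    is_heading <- nchar(hashes) <= 6
    ifelse(is_heading, paste(hashes, text), text)
}

input <- c("#######% % All About Me",
           "####### This line is not extracted.",
           "#% A section",
           "##% A subsection",
           "### A plain comment.",
           "############% Another markdown paragraph.",
           "############%")
cat(sketch_excerpt(input), sep = "\n")
```

Unmarked comments and code lines are simply dropped, which is why only the ‘%’ lines of the script show up in the markdown file.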

If you have pandoc installed, you can convert the markdown output into HTML:

# Both executables must be found on the search path.
is_pandoc_installed <- nchar(Sys.which("pandoc")) > 0 &&
    nchar(Sys.which("pandoc-citeproc")) > 0
is_pandoc_version_sufficient <- FALSE
if (is_pandoc_installed) {
    # Minimum pandoc version checked for.
    reference <- "1.12.3"
    # Take the second token of the first line of `pandoc --version`.
    version <- strsplit(system2(Sys.which("pandoc"), "--version", stdout = TRUE), 
                        split = " ")[[1]][2]
    if (utils::compareVersion(version, reference) >= 0)
        is_pandoc_version_sufficient <- TRUE
}
if (is_pandoc_version_sufficient) 
    excerptr::excerptr(file_name = path, pandoc_formats = "html", 
                       output_path = tempdir())

This runs pandoc on your excerpted comments and generates an HTML file whose source you can print via:

if (is_pandoc_version_sufficient) 
    cat(readLines(file.path(tempdir(), sub("\\.R$", ".html", basename(path)))), 
        sep = "\n")
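
The version check used above takes the second space-separated token of the first line of `pandoc --version`. A slightly more defensive variant, sketched here, picks the first dotted number from that line and compares it with base R's `numeric_version`. The helper name `parse_pandoc_version` is hypothetical and not part of excerptr. The minimum version is the one from the check above.

```r
# Hypothetical helper, NOT part of excerptr: extract the first dotted
# version number from a line of text, or NULL if there is none.
parse_pandoc_version <- function(first_line) {
    found <- regmatches(first_line,
                        regexpr("[0-9]+(\\.[0-9]+)+", first_line))
    if (length(found) == 0) return(NULL)
    numeric_version(found)
}

pandoc <- Sys.which("pandoc")
if (nchar(pandoc) > 0) {
    # Only query pandoc if the executable was found.
    first_line <- system2(pandoc, "--version", stdout = TRUE)[1]
    version <- parse_pandoc_version(first_line)
    if (!is.null(version)) {
        message("pandoc ", as.character(version), " is sufficient: ",
                version >= "1.12.3")
    }
}
```

`numeric_version` objects compare component by component, so "1.12.10" is correctly treated as newer than "1.12.3".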

You can open the generated HTML file in a browser via:

browseURL(file.path(tempdir(), sub("\\.R$", ".html", basename(path))))
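
The output file name is built several times above by swapping the `.R` extension of the script. A small wrapper, sketched here, keeps that in one place and only opens the browser when the file actually exists. The function name `excerpt_output_path` and its arguments are hypothetical and not part of excerptr. The naming rule (script base name with a new extension, placed in the output directory) is taken from the calls shown above.

```r
# Hypothetical helper, NOT part of excerptr: derive the name of an
# output file from the script path, as done by hand above.
excerpt_output_path <- function(path, extension, output_dir = tempdir()) {
    file.path(output_dir,
              sub("\\.R$", paste0(".", extension), basename(path)))
}

html_file <- excerpt_output_path("some_file.R", "html")
# Open the file only in an interactive session and only if pandoc
# actually produced it.
if (interactive() && file.exists(html_file)) {
    utils::browseURL(html_file)
}
```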