Text formatters

Max Gordon

2024-07-20

Bundled with this package are some text formatting functions. Their purpose is to convert numeric values into character strings that look more pleasant in publication tables.

txtRound

While base::round() is an excellent function in most cases, we often want a table to retain trailing zeros. For example:

library(htmlTable)
library(dplyr)
library(magrittr)
data("mtcars")

mtcars %<>%
  mutate(am = factor(am, levels = 0:1, labels = c("Automatic", "Manual")),
         vs = factor(vs, levels = 0:1, labels = c("V-shaped", "straight")))

mtcars %>% 
  head(3) %>% 
  select(Transmission = am, Gas = mpg, Weight = wt) %>% 
  htmlTable()
              Transmission Gas  Weight
Mazda RX4     Manual       21   2.62
Mazda RX4 Wag Manual       21   2.875
Datsun 710    Manual       22.8 2.32

This output doesn’t look that great visually. Instead, we would prefer something like this:

mtcars %>% 
  head(3) %>% 
  select(Transmission = am, Gas = mpg, Weight = wt) %>% 
  txtRound(digits = 1) %>% 
  htmlTable()
              Transmission Gas  Weight
Mazda RX4     Manual       21.0 2.6
Mazda RX4 Wag Manual       21.0 2.9
Datsun 710    Manual       22.8 2.3
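
The difference comes down to numbers versus strings: a rounded number drops its trailing zeros when converted to text, whereas a formatted string keeps them. A minimal base R sketch of that idea, using formatC() as a stand-in (an illustration only, not necessarily how txtRound works internally):

```r
# Three values taken from the table above
x <- c(21, 2.875, 22.8)

# round() returns numbers, so trailing zeros vanish once converted to text
as.character(round(x, 1))
## [1] "21"   "2.9"  "22.8"

# formatC() returns fixed-decimal strings, which keep the trailing zero
formatC(x, format = "f", digits = 1)
## [1] "21.0" "2.9"  "22.8"
```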

Single/vector values

At the core of txtRound is the single/vector value conversion:

txtRound(c(1, 1.1034), digits = 2)
## [1] "1.00" "1.10"
# Use a character to convert
txtRound("1.2333", digits = 2)
## [1] "1.23"

If some values need thousand separators, you can also pass txtInt_args.

# Large numbers can be combined with the txtInt option
txtRound(12345.12, digits = 1, txtInt_args = TRUE)
## [1] "12,345.1"
txtRound(12345.12, digits = 1, txtInt_args = list(language = "se", html = FALSE))
## [1] "12 345.1"

Data frames

As seen in the introduction, data frames can be used as input. Here we can also choose which columns to convert and rename the converted columns:

mtcars %>% 
  head(3) %>% 
  select(mpg, wt) %>% 
  txtRound(mpg, wt_txt = wt, digits = 1)
##                mpg    wt wt_txt
## Mazda RX4     21.0 2.620    2.6
## Mazda RX4 Wag 21.0 2.875    2.9
## Datsun 710    22.8 2.320    2.3

And we can specify the number of decimals that we’re interested in per column:

mtcars %>% 
  head(3) %>% 
  select(mpg, qsec, wt) %>% 
  txtRound(digits = list(wt = 2, .default = 1))
##                mpg qsec   wt
## Mazda RX4     21.0 16.5 2.62
## Mazda RX4 Wag 21.0 17.0 2.88
## Datsun 710    22.8 18.6 2.32
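
The per-column lookup can be pictured as "use the column's own entry if there is one, otherwise fall back to .default". Below is a base R sketch of that lookup. The helper pick_digits is hypothetical and formatC() stands in for the rounding, so this illustrates the idea rather than the package's actual code:

```r
df <- head(mtcars[, c("mpg", "qsec", "wt")], 3)
digits <- list(wt = 2, .default = 1)

# Hypothetical helper: column-specific digits if present, else .default
pick_digits <- function(col) {
  if (!is.null(digits[[col]])) digits[[col]] else digits[[".default"]]
}

# Format each column as fixed-decimal strings with its own digits
out <- df
for (col in names(df)) {
  out[[col]] <- formatC(df[[col]], format = "f", digits = pick_digits(col))
}
out
```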

Matrix

We can also feed a matrix into txtRound:

mtcars_matrix <- mtcars %>% 
  select(mpg, qsec, wt) %>% 
  head(3) %>% 
  as.matrix()

mtcars_matrix %>% 
  txtRound(digits = 1)
##               mpg    qsec   wt   
## Mazda RX4     "21.0" "16.5" "2.6"
## Mazda RX4 Wag "21.0" "17.0" "2.9"
## Datsun 710    "22.8" "18.6" "2.3"

Here we have the option of excluding columns/rows using regular expressions:

mtcars_matrix %>% 
  txtRound(excl.cols = "^wt$",
           excl.rows = "^Mazda RX4$",
           digits = 1)
##               mpg    qsec    wt     
## Mazda RX4     "21"   "16.46" "2.62" 
## Mazda RX4 Wag "21.0" "17.0"  "2.875"
## Datsun 710    "22.8" "18.6"  "2.32"
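
Because the patterns are regular expressions, the anchors matter: in the output above "^Mazda RX4$" leaves only the first row untouched, while "Mazda RX4 Wag" is still rounded. The base R sketch below shows how such patterns match the row names, assuming they are applied to the names in the usual grepl() fashion:

```r
rows <- c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710")

# Anchored at both ends: only the exact name matches
grepl("^Mazda RX4$", rows)
## [1]  TRUE FALSE FALSE

# Without the trailing $, both Mazda rows match
grepl("^Mazda RX4", rows)
## [1]  TRUE  TRUE FALSE
```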

As with data frames, we can use the same syntax to pick column-specific digits:

mtcars_matrix %>% 
  txtRound(digits = list(mpg = 0, wt = 2, .default = 1))
##               mpg  qsec   wt    
## Mazda RX4     "21" "16.5" "2.62"
## Mazda RX4 Wag "21" "17.0" "2.88"
## Datsun 710    "23" "18.6" "2.32"

txtInt

While scientific format is useful for readers familiar with the syntax, it can be difficult to grasp for scholars with a less mathematical background. The thousand separator style, also known as digit grouping, can therefore be quite useful:

txtInt(1e7)
## [1] "10,000,000"

As Swedish and many other languages use a space as the separator (the SI standard), we can specify the language as a parameter. Since we don’t want line breaks within a number, we can use a non-breaking space to keep the number intact (the HTML code is &nbsp;):

txtInt(1e7, language = "SI", html = FALSE)
## [1] "10 000 000"
txtInt(1e7, language = "SI", html = TRUE)
## [1] "10&nbsp;000&nbsp;000"
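
For comparison, the same kinds of strings can be produced in base R with format(). This is only a sketch of what digit grouping and the non-breaking space amount to, not a description of how txtInt is implemented:

```r
# Comma grouping; scientific = FALSE avoids "1e+07"
format(1e7, big.mark = ",", scientific = FALSE)
## [1] "10,000,000"

# Space grouping, then swap plain spaces for non-breaking HTML spaces
spaced <- format(1e7, big.mark = " ", scientific = FALSE)
gsub(" ", "&nbsp;", spaced, fixed = TRUE)
## [1] "10&nbsp;000&nbsp;000"
```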

Note that there are the options htmlTable.language and htmlTable.html that you can use to supply these parameters.

txtPval

The p-value is perhaps the most controversial statistical output. Nevertheless, it is still needed and, used correctly, it has its use. P-values are frequently rounded, as the exact decimals are not that important. txtPval is a convenience function with defaults that correspond to typical uses in medical publications.

txtPval(c(0.1233213, 0.035, 0.001, 0.000001), html = FALSE)
## [1] "0.12"     "0.035"    "0.001"    "< 0.0001"
# The < sign is less-than in html code '&lt;'
txtPval(c(0.05, 0.001, 0.000001), html = TRUE)
## [1] "0.050"       "0.001"       "&lt; 0.0001"
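
The lower reporting limit seen above can be sketched in a few lines of base R. The helper fmt_p and its limit argument are hypothetical and cover only the "< limit" behaviour. The sketch does not reproduce the other txtPval rules, such as the padded "0.050".

```r
# Hypothetical helper illustrating only the lower reporting limit
fmt_p <- function(p, limit = 1e-4, html = FALSE) {
  lt <- if (html) "&lt;" else "<"
  vapply(p, function(x) {
    if (x < limit) {
      # Below the limit: report "< limit" instead of the value itself
      paste(lt, format(limit, scientific = FALSE))
    } else {
      # Otherwise keep two significant digits
      format(signif(x, 2), scientific = FALSE)
    }
  }, character(1))
}

fmt_p(c(0.1233213, 0.000001))
## [1] "0.12"     "< 0.0001"
fmt_p(0.000001, html = TRUE)
## [1] "&lt; 0.0001"
```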

txtMergeLines

In HTML we indicate a new line using <br /> while the LaTeX style uses \hbox. To help with both, there is txtMergeLines, which merges lines into one properly formatted unit:

txtMergeLines("Line 1",
              "Line 2",
              "Line 3")

Line 1
Line 2
Line 3

Note that you can also use a single multi-line string:

txtMergeLines("Line 1
               Line 2
               Line 3")

Line 1
Line 2
Line 3

txtMergeLines("Line 1
               Line 2
               Line 3",
              html = FALSE)
## [1] "\\vbox{\\hbox{\\strut Line 1}\\hbox{\\strut Line 2}\\hbox{\\strut Line 3}}"
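
The two output styles can be approximated in base R as follows. This is a sketch under the assumption that the HTML form simply joins the lines with <br /> (the exact HTML that txtMergeLines emits may differ). The LaTeX form reproduces the string shown above.

```r
txt <- c("Line 1", "Line 2", "Line 3")

# HTML style: join the lines with a line-break tag
paste(txt, collapse = "<br />")
## [1] "Line 1<br />Line 2<br />Line 3"

# LaTeX style: wrap each line in \hbox{\strut ...} inside a single \vbox
latex <- sprintf("\\vbox{%s}",
                 paste0("\\hbox{\\strut ", txt, "}", collapse = ""))
latex
## [1] "\\vbox{\\hbox{\\strut Line 1}\\hbox{\\strut Line 2}\\hbox{\\strut Line 3}}"

# A single multi-line string can be split and trimmed into the same vector
trimws(strsplit("Line 1\n   Line 2\n   Line 3", "\n", fixed = TRUE)[[1]])
## [1] "Line 1" "Line 2" "Line 3"
```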