Reading and Writing Data Files

While eg_read() and eg_write() handle raw file transfers, you probably spend most of your time working with data files that you want to load directly into R as data frames. egnyte provides a set of functions that handle the download-read or write-upload workflow in a single step.

Prerequisites

Before you can read or write files, you need to authenticate. See vignette("configuration") or vignette("authorization") if you haven’t set that up yet.

library(egnyte)

eg_auth()

Reading Data Files

egnyte provides read functions for common data formats. Each function:

Downloads the file from Egnyte to a temporary location
Reads it using the appropriate R package
Returns the data as a data frame (or tibble)
Cleans up the temporary file
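
The four steps above can be sketched in base R. This is only an illustration of the pattern, not egnyte's actual implementation: read_via_temp() and fake_download() are hypothetical names, and a local file copy stands in for the Egnyte download.

```r
# Hypothetical helper illustrating the download-read-cleanup pattern.
# `download` is any function(remote, dest) that fetches a file to `dest`;
# `reader` is any function that reads a local file path.
read_via_temp <- function(remote, download, reader, ...) {
  ext <- tools::file_ext(remote)
  tmp <- tempfile(fileext = if (nzchar(ext)) paste0(".", ext) else "")
  on.exit(unlink(tmp), add = TRUE)  # step 4: clean up, even on error
  download(remote, tmp)             # step 1: fetch to a temporary location
  reader(tmp, ...)                  # steps 2-3: read and return the data
}

# Stand-in for a real download: copy a local file (assumption for the demo).
fake_download <- function(remote, dest) {
  file.copy(remote, dest, overwrite = TRUE)
}

# Create a small source file so the sketch runs without authentication.
src <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, value = c(1.5, 2.5, 3.5)), src, row.names = FALSE)

dat <- read_via_temp(src, fake_download, utils::read.csv)
```

Registering the cleanup with on.exit() before the download means the temporary file is removed even if the download or the read fails partway through.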

CSV Files

# Read a CSV file
dat <- eg_read_csv("/Shared/Data/analysis.csv")

This uses readr::read_csv() under the hood. You can pass any additional arguments that read_csv() accepts:

# Specify column types
dat <- eg_read_csv(
  "/Shared/Data/analysis.csv",
  col_types = readr::cols(
    id = readr::col_integer(),
    name = readr::col_character(),
    value = readr::col_double()
  )
)

# Skip rows, select columns, etc.
dat <- eg_read_csv(
  "/Shared/Data/analysis.csv",
  skip = 2,
  col_select = c(id, name, value)
)

Delimited Files

For files with delimiters other than commas:

# Tab-delimited file (tab is the default delimiter)
dat <- eg_read_delim("/Shared/Data/data.tsv")

# Pipe-delimited file
dat <- eg_read_delim("/Shared/Data/data.txt", delim = "|")

# Semicolon-delimited (common in European CSVs)
dat <- eg_read_delim("/Shared/Data/european.csv", delim = ";")

Excel Files

# Read an Excel file (first sheet by default)
dat <- eg_read_excel("/Shared/Data/workbook.xlsx")

# Read a specific sheet by name
dat <- eg_read_excel("/Shared/Data/workbook.xlsx", sheet = "Summary")

# Read a specific sheet by position
dat <- eg_read_excel("/Shared/Data/workbook.xlsx", sheet = 3)

Additional readxl::read_excel() arguments work here too:

# Specify a range
dat <- eg_read_excel(
  "/Shared/Data/workbook.xlsx",
  range = "B2:F100"
)

# Skip rows
dat <- eg_read_excel(
  "/Shared/Data/workbook.xlsx",
  skip = 5
)

SAS Files

# Read a SAS7BDAT file
dat <- eg_read_sas("/Shared/Data/dataset.sas7bdat")

# Read a SAS transport file (.xpt)
dat <- eg_read_xpt("/Shared/Data/dataset.xpt")

These use haven::read_sas() and haven::read_xpt(). The haven package preserves SAS attributes such as variable labels and formats.
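
To see where those attributes end up, here is a small base R sketch. It assumes the haven convention of storing a variable label in the "label" attribute and a SAS format in the "format.sas" attribute of each column. The data frame is built by hand to mimic a haven result, and var_labels() is a hypothetical helper, not part of egnyte or haven.

```r
# Mimic a data frame as haven would return it: metadata lives in attributes.
dat <- data.frame(age = c(34, 51), sex = c("F", "M"))
attr(dat$age, "label") <- "Age in years"
attr(dat$sex, "label") <- "Sex"
attr(dat$age, "format.sas") <- "BEST12."

# Hypothetical helper: collect the variable label of every column,
# returning NA for columns that have none.
var_labels <- function(df) {
  vapply(df, function(col) {
    lab <- attr(col, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)
  }, character(1))
}

labels <- var_labels(dat)
age_format <- attr(dat$age, "format.sas", exact = TRUE)
```

The same helper should work on the result of eg_read_sas() or eg_read_xpt(), since those return whatever haven produces.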

Stata Files

# Read a Stata file
dat <- eg_read_stata("/Shared/Data/dataset.dta")

SPSS Files

# Read an SPSS file
dat <- eg_read_spss("/Shared/Data/dataset.sav")

R Objects (RDS)

# Read an RDS file
obj <- eg_read_rds("/Shared/Data/model.rds")

RDS files can contain any R object, not just data frames. This is useful for saving fitted models, lists, or other complex objects.
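
As a local illustration of why RDS suits non-tabular objects, the sketch below round-trips a fitted model with base R's saveRDS() and readRDS(). It assumes eg_write_rds() and eg_read_rds() behave the same way around the upload and download steps, which the dependency table suggests but this example does not verify.

```r
# Fit a small model on a built-in dataset.
fit <- lm(mpg ~ wt, data = mtcars)

# Save to a local temporary file and read it back.
path <- tempfile(fileext = ".rds")
saveRDS(fit, path)
restored <- readRDS(path)
unlink(path)

# The restored object is still a usable model.
same_coefs <- isTRUE(all.equal(coef(fit), coef(restored)))
pred <- predict(restored, newdata = data.frame(wt = 3))
```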

Writing Data Files

The write functions work in reverse: they take an R object, write it to a temporary file, and upload that file to Egnyte.
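
A minimal base R sketch of that reverse pattern follows. The names write_via_temp() and fake_upload() are hypothetical, and a local file copy stands in for the upload, so this shows the shape of the workflow rather than egnyte's real code.

```r
# Hypothetical helper illustrating the write-upload-cleanup pattern.
# `writer` is any function(x, path, ...) that writes `x` to a local path;
# `upload` is any function(src, remote) that sends a local file somewhere.
write_via_temp <- function(x, remote, writer, upload, ...) {
  ext <- tools::file_ext(remote)
  tmp <- tempfile(fileext = if (nzchar(ext)) paste0(".", ext) else "")
  on.exit(unlink(tmp), add = TRUE)  # always remove the temporary file
  writer(x, tmp, ...)               # write the object locally
  upload(tmp, remote)               # send the local file to its destination
  invisible(remote)
}

# Stand-in for a real upload: copy to another local path (assumption).
fake_upload <- function(src, remote) {
  file.copy(src, remote, overwrite = TRUE)
}

dest <- file.path(tempdir(), "results.csv")
write_via_temp(
  data.frame(id = 1:2, name = c("a", "b")),
  dest,
  writer = function(x, path, ...) utils::write.csv(x, path, row.names = FALSE, ...),
  upload = fake_upload
)

uploaded <- utils::read.csv(dest)
```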

CSV Files

# Write a data frame to CSV
eg_write_csv(dat, "/Shared/Data/results.csv")

By default, the write fails if the file already exists. Set overwrite = TRUE to replace it:

# Overwrite an existing file
eg_write_csv(dat, "/Shared/Data/results.csv", overwrite = TRUE)

You can pass additional arguments to readr::write_csv():

# Don't include column names
eg_write_csv(dat, "/Shared/Data/results.csv", col_names = FALSE)

# Use a different NA representation
eg_write_csv(dat, "/Shared/Data/results.csv", na = ".")

Delimited Files

# Write a tab-delimited file
eg_write_delim(dat, "/Shared/Data/results.tsv")

# Write with a different delimiter
eg_write_delim(dat, "/Shared/Data/results.txt", delim = "|")

Excel Files

# Write a single sheet
eg_write_excel(dat, "/Shared/Data/results.xlsx")

# Write multiple sheets by passing a named list
eg_write_excel(
  list(
    "Summary" = summary_df,
    "Details" = details_df,
    "Raw" = raw_df
  ),
  "/Shared/Data/workbook.xlsx"
)
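
If the sheets come from one data frame, base R's split() already produces the kind of named list shown above. The sketch below uses the built-in mtcars data. The commented upload path is a made-up example and the call needs an authenticated session.

```r
# Split one data frame into a named list, one element per group.
sheets <- split(mtcars, mtcars$cyl)

# Give each sheet a descriptive name; list names become sheet names.
names(sheets) <- paste0("cyl_", names(sheets))

# Upload as a multi-sheet workbook (hypothetical path, requires eg_auth()):
# eg_write_excel(sheets, "/Shared/Data/by_cyl.xlsx")
```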

SAS Transport Files

# Write a SAS transport file
eg_write_xpt(dat, "/Shared/Data/results.xpt")

Note: egnyte can write XPT files but not native SAS7BDAT files. If you need SAS7BDAT, you’ll need to use SAS itself or another tool.

Stata Files

# Write a Stata file
eg_write_stata(dat, "/Shared/Data/results.dta")

SPSS Files

# Write an SPSS file
eg_write_spss(dat, "/Shared/Data/results.sav")

R Objects (RDS)

# Save any R object
eg_write_rds(fitted_model, "/Shared/Data/model.rds")

# Control compression
eg_write_rds(large_data, "/Shared/Data/data.rds", compress = "xz")

Optional Dependencies

The format-specific functions require additional packages that aren’t installed by default with egnyte:

Function	Required Package
`eg_read_csv()`, `eg_write_csv()`	readr
`eg_read_delim()`, `eg_write_delim()`	readr
`eg_read_excel()`	readxl
`eg_write_excel()`	writexl
`eg_read_sas()`, `eg_read_xpt()`, `eg_write_xpt()`	haven
`eg_read_stata()`, `eg_write_stata()`	haven
`eg_read_spss()`, `eg_write_spss()`	haven
`eg_read_rds()`, `eg_write_rds()`	(base R - no extra package)

If you try to use a function without the required package installed, egnyte will prompt you to install it:

The readr package is required for this function.
Would you like to install it? (yes/no)
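
In scripts that run unattended, it can be simpler to check for the optional packages up front rather than rely on a prompt. The sketch below uses base R's requireNamespace(). The package list is taken from the table above, and how egnyte itself behaves in a non-interactive session is not covered here.

```r
# Optional packages named in the table above.
needed <- c("readr", "readxl", "writexl", "haven")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing optional packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install them
}
```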

A Few Tips

Working with Large Files

The read functions download the entire file before reading it. For large files, this can take a while and use significant memory. A few suggestions:

If you only need specific columns, use the col_select argument (for CSV/delim files) to avoid loading unnecessary data into memory; the full file is still downloaded
Consider whether you really need the whole file, or if you can work with a sample
For very large datasets, you might want to use eg_read() to download the file once, then work with it locally
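
Once a file has been downloaded, you can preview it cheaply before committing to a full read. The sketch below uses base R only and a generated CSV as a stand-in for the downloaded copy. With a real file, local_copy would be the path you downloaded to.

```r
# Stand-in for a file already downloaded from Egnyte (assumption for the demo).
local_copy <- tempfile(fileext = ".csv")
write.csv(
  data.frame(id = 1:1000, value = rnorm(1000), note = "x"),
  local_copy,
  row.names = FALSE
)

# Read only the first 100 rows and skip the `note` column entirely.
# A colClasses entry of "NULL" tells read.csv() to drop that column.
preview <- utils::read.csv(
  local_copy,
  nrows = 100,
  colClasses = c("integer", "numeric", "NULL")
)

unlink(local_copy)
```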

Passing Arguments Through

All the format-specific functions accept ... arguments that get passed to the underlying read/write function. If you’re familiar with readr::read_csv() or haven::read_sas(), you can use all the same options:

# readr options
dat <- eg_read_csv(
  "/Shared/Data/data.csv",
  col_types = "ccdd",
  locale = readr::locale(decimal_mark = ","),
  na = c("", "NA", "N/A", ".")
)

# haven options
dat <- eg_read_sas(
  "/Shared/Data/data.sas7bdat",
  encoding = "latin1"
)
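
If the ... mechanism is unfamiliar, this base R sketch shows how a wrapper forwards extra arguments to the function it wraps. read_csv_wrapper() is a hypothetical stand-in and utils::read.csv() plays the role of the underlying reader. The egnyte functions are assumed to forward their arguments in the same general way.

```r
# Hypothetical wrapper: anything passed in `...` goes straight to read.csv().
read_csv_wrapper <- function(path, ...) {
  utils::read.csv(path, ...)
}

# A small file in which "N/A" marks a missing value.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,N/A", "2,3.5"), tmp)

# na.strings is not an argument of the wrapper; it is forwarded via `...`.
dat <- read_csv_wrapper(tmp, na.strings = c("", "NA", "N/A"))
unlink(tmp)
```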

File Extensions Matter

egnyte uses the file extension to determine the format when downloading. Make sure your files have the correct extension:

.csv for CSV files
.xlsx or .xls for Excel files
.sas7bdat for SAS data files
.xpt for SAS transport files
.dta for Stata files
.sav for SPSS files
.rds for R objects
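
To check paths before passing them on, a small lookup based on tools::file_ext() can help. guess_format() below is a hypothetical helper built from the list above. It is not how egnyte dispatches internally.

```r
# Hypothetical helper: map a file extension to a format name.
# Returns NA for extensions that are not in the list above.
guess_format <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv = "csv",
    xlsx = ,
    xls = "excel",
    sas7bdat = "sas",
    xpt = "xpt",
    dta = "stata",
    sav = "spss",
    rds = "rds",
    NA_character_
  )
}

paths <- c(
  "/Shared/Data/analysis.csv",
  "/Shared/Data/workbook.XLS",
  "/Shared/Data/dataset.sas7bdat",
  "/Shared/Data/notes"
)
formats <- vapply(paths, guess_format, character(1), USE.NAMES = FALSE)
```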

Comparison with Base File Transfer

Here’s the difference between using the format-specific functions vs. the base eg_read()/eg_write():

Using eg_read_csv() (recommended for data files):

# One step - download and read
dat <- eg_read_csv("/Shared/Data/data.csv")

Using eg_read() (manual approach):

# Two steps - download, then read
temp_file <- eg_read("/Shared/Data/data.csv")
dat <- readr::read_csv(temp_file)
unlink(temp_file)  # Clean up

The format-specific functions handle all this for you and clean up temporary files automatically.
