% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ReadNetCDF.R
\name{ReadNetCDF}
\alias{ReadNetCDF}
\alias{GlanceNetCDF}
\title{Read NetCDF files.}
\usage{
ReadNetCDF(
  file,
  vars = NULL,
  out = c("data.frame", "vector", "array"),
  subset = NULL,
  key = FALSE
)

GlanceNetCDF(file, ...)
}
\arguments{
\item{file}{source to read from. Must be one of:
\itemize{
\item A string representing a local file with read access.
\item A string representing a URL readable by \code{\link[ncdf4:nc_open]{ncdf4::nc_open()}}
(this includes DAP URLs).
\item A netcdf object returned by \code{\link[ncdf4:nc_open]{ncdf4::nc_open()}}.
}}

\item{vars}{one of:
\itemize{
\item \code{NULL}: reads all variables.
\item a character vector with the name of the variables to read.
\item a function that takes a vector with all the variables and returns either
a character vector with the name of variables to read or a numeric/logical
vector that indicates a subset of variables.
}}

\item{out}{character indicating the type of output desired: one of \code{"data.frame"}, \code{"vector"} or \code{"array"}.}

\item{subset}{a list of subsetting objects. See below.}

\item{key}{if \code{TRUE}, returns a data.table keyed by the dimensions of the data.}

\item{...}{ignored in \code{\link[=GlanceNetCDF]{GlanceNetCDF()}}. It is there for convenience, so that a call to \code{\link[=ReadNetCDF]{ReadNetCDF()}} is
also valid for \code{\link[=GlanceNetCDF]{GlanceNetCDF()}}.}
}
\value{
The return format is specified by \code{out}. It can be a data table in which each
column is a variable and each row an observation; an array with named
dimensions; or a vector. Since it's possible to return multiple arrays or
vectors (one for each variable), for consistency the return type of these
two options is always a list. Both of them are much faster than the
first, since the most time-consuming part is melting the array
returned by \code{\link[ncdf4:ncvar_get]{ncdf4::ncvar_get()}}. \code{out = "vector"} is particularly useful for
adding new variables to an existing data frame with the same dimensions.
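
As an illustration of what melting involves, here is a minimal base R sketch
(not the code used by \code{ReadNetCDF()}; the array and its dimension names are
made up) that turns an array with named dimensions into one row per gridpoint:

\preformatted{
# Toy 2 x 3 array with named dimensions (hypothetical data).
arr <- array(1:6, dim = c(2, 3),
             dimnames = list(lon = c("0", "1"), lat = c("-1", "0", "1")))

# One row per cell; expand.grid() varies the first dimension fastest,
# which matches the column-major order of c(arr).
long <- cbind(expand.grid(dimnames(arr), stringsAsFactors = FALSE),
              value = c(arr))
}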

When the variables specified in \code{vars} do not all have the same number of dimensions,
those with fewer dimensions will be recycled. E.g. if reading a 3D pressure field
and a 2D surface temperature field, the latter will be turned into a 3D field
with the same values repeated along the missing dimension.
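
A minimal base R sketch of this kind of recycling (an illustration with made-up
data, not the package's internal code):

\preformatted{
# Hypothetical grid: 2 longitudes, 3 latitudes, 4 levels.
pressure <- array(1:24, dim = c(2, 3, 4))
surface  <- matrix(c(280, 281, 282, 283, 284, 285), nrow = 2, ncol = 3)

# Recycle the 2D field along the missing third dimension.
surface3d <- array(surface, dim = dim(pressure))

# Every level now holds the same surface values.
all(surface3d[, , 1] == surface)
all(surface3d[, , 4] == surface)
}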

\code{GlanceNetCDF()} returns a list of variables and dimensions included in the
file with a nice printing method.
}
\description{
Reads a NetCDF file using the \code{\link[ncdf4]{ncdf4-package}} package. The advantage
over using \code{\link[ncdf4]{ncvar_get}} is that the output is a tidy data.table
with proper dimensions.
}
\section{Subsetting}{

In its most basic form, \code{subset} is a named list whose names must match
the dimensions specified in the NetCDF file, and each element must be a vector
whose range defines a contiguous subset of data. You don't need to provide an exact range that
matches the actual gridpoints of the file; the closest gridpoint will be selected.
Furthermore, you can use \code{NA} to refer to the existing minimum or maximum.
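
The lookup described above can be pictured with a small base R sketch.
\code{closest_range()} is a hypothetical helper written for this illustration;
it is not part of the package and may differ from how \code{ReadNetCDF()} does it:

\preformatted{
# Hypothetical helper: map a requested range onto the indices of a grid.
closest_range <- function(grid, limits) {
  # NA stands for the existing minimum or maximum of the dimension.
  if (is.na(limits[1])) limits[1] <- min(grid)
  if (is.na(limits[2])) limits[2] <- max(grid)
  # Pick the gridpoint closest to each limit.
  start <- which.min(abs(grid - limits[1]))
  end   <- which.min(abs(grid - limits[2]))
  seq(start, end)
}

lat <- seq(-90, 90, by = 2.5)
lat[closest_range(lat, c(-31, -19))]  # -30, -27.5, -25, -22.5, -20
lat[closest_range(lat, c(NA, -85))]   # -90, -87.5, -85
}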

So, if you want to get Southern Hemisphere data from a file that defines
latitude as \code{lat}, then you can use:
\preformatted{
subset = list(lat = -90:0)
}

To use dimension indices instead of values, wrap the expression in \code{\link[base:AsIs]{base::I()}}.
For example, to read the first 10 timesteps of a file:

\preformatted{
subset = list(time = I(c(1, 10)))
}
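
\code{I()} takes a single object and only tags it with the class \code{"AsIs"}.
The base R snippet below shows how the two forms can be told apart; whether
\code{ReadNetCDF()} performs exactly this check is an assumption:

\preformatted{
by_value <- c(1, 10)     # interpreted as values of the dimension
by_index <- I(c(1, 10))  # interpreted as positions along the dimension

inherits(by_value, "AsIs")  # FALSE
inherits(by_index, "AsIs")  # TRUE
}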

Negative indices are interpreted as starting from the end.
So to read the last 10 timesteps of a file:

\preformatted{
subset = list(time = I(c(-10, 0)))
}

More complex subsetting operations are supported. If you want to read non-contiguous
chunks of data, you can specify each chunk as a list inside \code{subset}. For example,
this subset
\preformatted{
subset = list(list(lat = -90:-70, lon = 0:60),
              list(lat = 70:90, lon = 300:360))
}
will return two contiguous chunks: one on the South-West corner and one on the
North-East corner. Alternatively, if you want to get the four corners that
are combinations of those two conditions, use:

\preformatted{
subset = list(lat = list(-90:-70, 70:90),
              lon = list(0:60, 300:360))
}
Both operations can be mixed together. So, for example, this

\preformatted{
subset = list(list(lat = -90:-70,
                   lon = 0:60),
              time = list(c("2000-01-01", "2000-12-31"),
                          c("2010-01-01", "2010-12-31")))
}

returns one spatial chunk for each of two temporal chunks.

The general idea is that named elements define 'global' subset ranges that will be
applied to every other subset, while each unnamed element defines one contiguous chunk.
In the above example, \code{time} defines two temporal ranges that every subset of data will
have.

The above example, then, is equivalent to

\preformatted{
subset = list(list(lat = -90:-70,
                   lon = 0:60,
                   time = c("2000-01-01", "2000-12-31")),
              list(lat = -90:-70,
                   lon = 0:60,
                   time = c("2010-01-01", "2010-12-31")))
}

but demands much less typing.
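
To make the expansion rule concrete, here is a minimal base R sketch.
\code{expand_subset()} is a hypothetical helper written for this illustration;
it is not exported by the package and is not necessarily how \code{ReadNetCDF()}
processes \code{subset} internally:

\preformatted{
# Hypothetical helper: expand a mixed subset into a list of explicit chunks.
expand_subset <- function(subset) {
  nms <- names(subset)
  if (is.null(nms)) nms <- rep("", length(subset))
  # Unnamed elements are the contiguous chunks.
  chunks <- subset[nms == ""]
  if (length(chunks) == 0) chunks <- list(list())
  # Named elements are 'global' ranges applied to every chunk.
  for (dim_name in nms[nms != ""]) {
    ranges <- subset[[dim_name]]
    if (!is.list(ranges)) ranges <- list(ranges)
    expanded <- list()
    for (chunk in chunks) {
      for (r in ranges) {
        chunk[[dim_name]] <- r
        expanded[[length(expanded) + 1]] <- chunk
      }
    }
    chunks <- expanded
  }
  chunks
}

mixed <- list(list(lat = -90:-70,
                   lon = 0:60),
              time = list(c("2000-01-01", "2000-12-31"),
                          c("2010-01-01", "2010-12-31")))
str(expand_subset(mixed))  # two chunks, each with lat, lon and time
}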
}

\examples{
file <- system.file("extdata", "temperature.nc", package = "metR")
# Get a list of variables.
variables <- GlanceNetCDF(file)
print(variables)

# The object returned by GlanceNetCDF is a list with lots
# of information
str(variables)

# Read only the first one, with name "var".
field <- ReadNetCDF(file, vars = c(var = names(variables$vars[1])))
# Add a new variable.
# Make sure it's on the exact same grid!
field[, var2 := ReadNetCDF(file, out = "vector")]

\dontrun{
# Using a DAP url
url <- "http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.GMAO/.GEOS_V2p1/.hindcast/.ua/dods"
field <- ReadNetCDF(url, subset = list(M = 1,
                                       P = 10,
                                       S = "1999-01-01"))

# In this case, opening the netcdf file takes a non-negligible
# amount of time. So if you want to iterate over many dimensions,
# then it's more efficient to open the file first and then read it.

ncfile <- ncdf4::nc_open(url)
field <- ReadNetCDF(ncfile, subset = list(M = 1,
                                          P = 10,
                                          S = "1999-01-01"))


# Using a function in `vars` to read all variables that
# start with "radar_".
ReadNetCDF(radar_file, vars = function(x) startsWith(x, "radar_"))

}
}
