read_dataset() is the function in our R package that downloads our data. You can find the following documentation by running ?read_dataset in R. Please see our blog post on the functionality.
read_dataset {econdatar} | R Documentation
read_dataset
Description
Returns the data for the given data set, ECONDATA:id(version), as a list or as tidy data.tables. Available data sets can be looked up using read_database() or from the web platform. Tidying can be done directly within read_dataset(), or ex-post using tidy_data().
Usage
read_dataset(id, tidy = TRUE, ...)
## S3 method for class 'eds_dataset'
tidy_data(x, wide = TRUE, ...)
Arguments
id | Data set identifier.
x | A raw API return object to be tidied. Tidying can also be done directly in read_dataset().
wide | Specifies whether the tidied data should be returned in wide or long format.
... | Further optional arguments:
tidy | logical. Return data and metadata in tidy data.tables (see Value), by passing the result through tidy_data().
wide | logical, default: TRUE. Returns data in a column-based format, with "label" and "source_identifier" attributes on columns (when available) and an overall "metadata" attribute on the table; otherwise a long format is returned. See Value.
prettify | logical, default: TRUE. Attempts to make the returned metadata more human-readable by replacing each code category and enumeration with its name. It is advisable to leave this set to TRUE; where speed is paramount, you may want to set it to FALSE. If multiple datasets are being queried, this option is automatically set to FALSE.
combine | logical, default: FALSE. If wide = FALSE, setting combine = TRUE will combine all data and metadata into a single long table, whereas the default FALSE returns data and metadata in separate tables, for more efficient storage.
Details
An EconData account is required to use this function. The user must provide an API token, which can be found on the Account page of the online portal; a GUI dialog will prompt the user for their API token. Where client credentials are available, they can instead be supplied by setting the ECONDATA_CREDENTIALS environment variable using the syntax "client_id;client_secret", e.g. Sys.setenv(ECONDATA_CREDENTIALS="client_id;client_secret").
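A minimal sketch of the environment-variable route, using only base R. The variable name and the "client_id;client_secret" syntax come from the text above; the two values are placeholders, not real credentials.

```r
# Placeholder values: substitute the credentials from your own account.
client_id     <- "my_client_id"
client_secret <- "my_client_secret"

# Join the two parts with a semicolon, as the documented syntax requires.
credentials <- paste(client_id, client_secret, sep = ";")
Sys.setenv(ECONDATA_CREDENTIALS = credentials)

# Confirm the variable is set for the current R session.
Sys.getenv("ECONDATA_CREDENTIALS")
```

Sys.setenv() only affects the running R session, so the variable has to be set again in each new session.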
Value
If tidy = FALSE, a list of data frames is returned, where the names of the list are the EconData series codes, and each data frame has a single column named ‘OBS_VALUE’ containing the data, with corresponding dates attached as row names. Each data frame further has a "metadata" attribute providing information about the series. The entire list of data frames also has a "metadata" attribute, providing information about the dataset. If multiple datasets (or versions of a dataset if version is specified as ‘all’) are being queried, a list of such lists is returned.
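To make the shape of the tidy = FALSE result concrete, the sketch below builds a stand-in object by hand in base R and shows how each part would be accessed. It does not call the API: the observations, dates and metadata fields are invented, and the real "metadata" attributes may be structured differently.

```r
# Hand-built stand-in for the tidy = FALSE return value; no API call is made.
# The series code "MIN001.I.S" appears in the Examples; the values, dates and
# metadata fields below are invented for illustration.
series <- data.frame(
  OBS_VALUE = c(101.2, 99.8, 100.5),
  row.names = c("2020-01-01", "2020-02-01", "2020-03-01")
)
attr(series, "metadata") <- list(name = "Hypothetical series")

raw <- list("MIN001.I.S" = series)
attr(raw, "metadata") <- list(name = "Hypothetical dataset")

names(raw)                               # series codes
raw[["MIN001.I.S"]]$OBS_VALUE            # observations
as.Date(rownames(raw[["MIN001.I.S"]]))   # dates stored as row names
attr(raw[["MIN001.I.S"]], "metadata")    # series-level metadata
attr(raw, "metadata")                    # dataset-level metadata
```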
If tidy = TRUE and wide = TRUE (the default), a single data.table is returned where the first column is the date, and the remaining columns are series named by their EconData codes. Each series has two attributes: "label" provides a variable label combining important metadata from the "metadata" attribute in the non-tidy format, and "source_identifier" gives the series code assigned by the original data provider where available. The table has the same dataset-level "metadata" attribute as the list of data frames if tidy = FALSE. If multiple datasets are being queried, a list of such data.tables is returned.
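The column attributes described here can be read with attr(). The sketch below uses a hand-built base R data.frame as a stand-in for the returned data.table, so it runs without the package or an account; the column name date, the values and the attribute contents are all invented for illustration.

```r
# Hand-built stand-in for the wide result; a plain data.frame is used here
# instead of a data.table, and every value is hypothetical.
wide <- data.frame(
  date = as.Date(c("2020-01-01", "2020-02-01")),
  `MIN001.I.S` = c(101.2, 99.8),
  check.names = FALSE
)
attr(wide[["MIN001.I.S"]], "label") <- "Hypothetical label"
attr(wide[["MIN001.I.S"]], "source_identifier") <- "HYPOTHETICAL_CODE"
attr(wide, "metadata") <- list(name = "Hypothetical dataset")

# Collect the label of every series column (all columns except the date).
series_labels <- sapply(wide[-1], attr, "label")
series_labels

attr(wide[["MIN001.I.S"]], "source_identifier")  # provider's own code
attr(wide, "metadata")                           # dataset-level metadata
```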
If tidy = TRUE and wide = FALSE and combine = FALSE (the default), a named list of two data.tables is returned. The first, "data", has columns ‘series_key’, ‘time_period’ and ‘obs_value’ providing the data in a long format. The second, "metadata", provides dataset and series-level metadata, with one row for each series. If combine = TRUE, these two datasets are combined, where all repetitive content is converted to factors for more efficient storage. If multiple datasets are being queried, combine = FALSE gives a nested list, whereas combine = TRUE binds everything together into a single long frame.
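The relationship between the separate "data" and "metadata" tables and the combined long table can be illustrated with a base R merge. This is only a sketch on hand-built stand-ins, not the package's own implementation: the values are invented, and it assumes the metadata table carries a series_key column to join on.

```r
# Hand-built stand-ins for the two tables; no API call is made.
# All values are invented, and the presence of a series_key column in the
# metadata table is an assumption.
long <- list(
  data = data.frame(
    series_key  = c("MIN001.I.S", "MIN001.I.S", "MIN002.I.S"),
    time_period = as.Date(c("2020-01-01", "2020-02-01", "2020-01-01")),
    obs_value   = c(101.2, 99.8, 87.4)
  ),
  metadata = data.frame(
    series_key = c("MIN001.I.S", "MIN002.I.S"),
    label      = c("Hypothetical series A", "Hypothetical series B")
  )
)

# Attach the series-level metadata to every observation of that series.
combined <- merge(long$data, long$metadata, by = "series_key")

# Repeated text is stored once per level when converted to a factor.
combined$series_key <- factor(combined$series_key)
combined$label      <- factor(combined$label)
str(combined)
```

Keeping the two tables separate avoids repeating each metadata row once per observation, which is why combine = FALSE is described as the more storage-efficient default.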
Examples
# library(econdatar)
# Mining production and sales
read_dataset(id = "MINING")
# Tidy options
(MINING <- read_dataset(id = "MINING", tidy = FALSE))
# Same as: read_dataset(id = "MINING", tidy = TRUE)
tidy_data(MINING, wide = TRUE)
tidy_data(MINING, wide = FALSE)
tidy_data(MINING, wide = FALSE, combine = TRUE, prettify = FALSE)
# Can query a specific version by adding e.g. version = "1.0.0" to the call
read_dataset(id = "MINING", version = "all")
read_dataset(id = "MINING", version = "1.0.0")
# Using the series key
read_dataset(id = "MINING", series_key = "MIN001+MIN002..S")
read_dataset(id = "MINING", series_key = c("MIN001+MIN002..S", "MIN009.I.N"))
# Using start and end dates
read_dataset(id = "MINING",
             series_key = "MIN001.I.S",
             start_date = "2010-01-01",
             end_date = Sys.Date() - 365)
# Returns 5-10 years (daily average bond yields) not yet contained in the latest release
# (particularly useful for daily data that is released monthly)
read_dataset(id = "MARKET_RATES",
             series_key = "CMJD003.B.A",
             release = "unreleased")
# as_tibble() and view() below come from the tibble package
library(tibble)
POP <- read_dataset(id = "POPULATION_DATA_REG",
                    series_key = "POP...80",
                    tidy = TRUE,
                    wide = FALSE)
str(POP)
print(names(POP$data))
print(names(POP$metadata))
as_tibble(POP$data)
as_tibble(POP$metadata) |> view()