How to append all the National Treasury dataflows together

In EconData, we now tend to split complicated data sources into multiple dataflows to keep them manageable. This reduces the amount of data you download when you are focusing on a specific section of the data source. This blog post shows how to append all our National Treasury dataflows together, in case you want all the data in one place.

We have designed our National Treasury series keys so that these dataflows can be bound together into one overall dataset. (The “series keys” are the codes we use as IDs for each series; they help normalize the data into a data table that is linked to a metadata table.) Specifically, the first dimension of our National Treasury data identifies the section: the national budget (N), the national budget annexures (A), or the provincial budgets (P). The table code (usually the second dimension in the series key) identifies the budget table from which the data originates. Please see our user guide for all the one-character table codes.
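As a rough illustration of how the first two dimensions can be read off a key, the sketch below splits a few made-up series keys. The key strings and the "." separator are assumptions for illustration only, not actual EconData keys; the user guide remains the reference for the real layout.

```r
# Hypothetical series keys: the values and the "." separator are assumptions,
# not actual EconData keys.
example_keys <- c("N.1.AAA", "A.C.BBB", "P.E.CCC")

# Split each key into its dimensions.
dims <- strsplit(example_keys, ".", fixed = TRUE)

# First dimension: section code. Second dimension: table code.
section_code <- vapply(dims, `[`, character(1), 1)
table_code   <- vapply(dims, `[`, character(1), 2)

# Map the section codes named in this post to readable labels.
section_labels <- c(N = "National budget",
                    A = "National budget annexures",
                    P = "Provincial budgets")

data.frame(key     = example_keys,
           section = unname(section_labels[section_code]),
           table   = table_code)
```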

Please use the following code as a template for downloading and binding all the dataflows together.

# Load the packages used in this post.
packages <- c("econdatar",
              "tidyverse")
invisible(lapply(packages, library, character.only = TRUE))

# Start with an empty tibble and append each dataflow to it in turn.
treasury_full <- tibble()

for (NTflow in c("GOVTBUDGET",
                 "GOVTREVENUE",
                 "GOVTTRANSFERS",
                 "GOVT_CON_ECON",
                 "GOVT_CON_BUDGET",
                 "GOVTDEBT",
                 "GOVTGUARANTEES",
                 "GOVTANNEX_CONS",
                 "GOVTANNEX_PROV")) {
    # Download the latest release of this dataflow in tidy, long format,
    # then drop the data_provider_ref column.
    treasury_table <- read_dataset( id      = NTflow,
                                    release = "latest",
                                    tidy    = TRUE,
                                    wide    = FALSE,
                                    combine = TRUE  ) |>
                    select(-data_provider_ref)

    # bind_rows() matches columns by name when stacking the tables.
    treasury_full <- treasury_full |> bind_rows(treasury_table)
}
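An equivalent pattern, sketched below, collects each table in a list and binds once at the end, which avoids growing a tibble inside the loop. So that the sketch runs without an EconData connection, fetch_flow() is a hypothetical stand-in for the read_dataset() call above, and its column names and values are made up.

```r
library(dplyr)

# Hypothetical stand-in for the read_dataset() call above; it returns a small
# made-up table so the sketch runs offline.
fetch_flow <- function(flow_id) {
  tibble(dataflow    = flow_id,
         time_period = as.Date(c("2023-03-31", "2024-03-31")),
         obs_value   = c(1, 2))
}

flows <- c("GOVTBUDGET", "GOVTREVENUE", "GOVTDEBT")

# Fetch every dataflow into a list, then bind once.
# bind_rows() matches columns by name and fills missing ones with NA.
treasury_list <- lapply(flows, fetch_flow)
treasury_demo <- bind_rows(treasury_list)

# Row count per dataflow, as a quick sanity check.
count(treasury_demo, dataflow)
```

In real use, the body of fetch_flow() would be replaced by the read_dataset() call from the template.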

You may change the release = "latest" argument if you want a previous vintage. In that case, please use the respective release description, as listed by read_release(NTflow, tidy=TRUE) %>% bind_rows().

Table names

print(unique(treasury_full$budget_table) |> sort())

The above command prints the following vector of table names; the unnumbered tables come from the appendices.

  • 01. Government Main Budget
  • 02. Main Budget Estimate of National Revenue (Summary)
  • 03. Main Budget Estimate of National Revenue (Detailed)
  • 05. Consolidated Transfers (Economic Classification)
  • 07. Consolidated Government (Economic Classification)
  • 09. Consolidated Government Budget Balance
  • 10. Government Debt
  • 11. Government Contingent Liabilities
  • Changes Over Baseline
  • Direct and Indirect Transfers to Local Government
  • Division of Nationally Raised Revenue
  • Grants to Provinces
  • Medium-Term Macroeconomic Assumptions
  • Risk Adjusted Subcomponent Shares
  • Schedule 1 of the Division of Revenue Bill
  • Transfers to Local Government
  • Basic Component Shares
  • Equitable Shares
  • Health Shares
  • Implemented Equitable Shares
  • Output Subcomponent Shares
  • School Enrolment
  • Share of GDP
  • Total Transfers
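Because only the main budget tables carry a numeric prefix, a regular expression can separate them from the unnumbered ones. The following is a minimal sketch using a handful of the table names listed above as sample input; the variable names are illustrative.

```r
# A few of the table names listed above, used as sample input.
table_names <- c("01. Government Main Budget",
                 "10. Government Debt",
                 "Grants to Provinces",
                 "Equitable Shares")

# Numbered tables start with two digits, a full stop, and a space.
is_numbered <- grepl("^[0-9]{2}\\. ", table_names)

numbered_tables   <- table_names[is_numbered]
unnumbered_tables <- table_names[!is_numbered]

print(numbered_tables)
print(unnumbered_tables)
```

The same pattern could be applied to the combined dataset, for example with filter(grepl("^[0-9]{2}\\. ", budget_table)), to keep only the numbered tables.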

Dian
