Working with EconData's Public Finance Data

Introduction

In this tutorial, we use the econdatar R package to work with the Treasury Budget data on EconData. We will create three visualisations: an area chart, a hairy line plot, and a stacked bar chart. The econdatar package helps you streamline and automate your workflows by importing the latest data available on the platform directly into R.

The R scripts used to create each plot are given under each visualisation's heading, so you can reproduce the visuals step by step.

Treasury Main Budget Table 10 (Government Debt): Area chart of debt composition

Creating an Area Chart of Composition of Debt

In this example, we work with EconData’s “Treasury Main Budget: Government Debt” tables and show how to use them for analysis.

Load necessary packages:

packages <- c( "tidyverse",
               "econdatar",
               "scales"
             )

# Load each package by name; invisible() suppresses the printed return value
invisible(lapply(packages, library, character.only = TRUE))

Learn how to install the econdatar package here.

Create a custom color palette for the plots (Optional):

palette <- c("#02acba", "#273b8d", "#be1e2d", "#c0c5cb",
             "#3F3A34", "#D8C472", "#4A4ACC", "#B3EFB2",
             "#AA78A6", "#41E1D2", "#9a8c98", "#E8E288",
             "#1D3461", "#F7C1BB", "#7280D9", "#EDD4B2",
             "#FD5345", "#BFB6BB", "#306B34", "#D499B9",
             "#C7F0BD", "#DDA448", "#F2542D", "#43AA8B",
             "#95B8D1")

Import all series of Table 10 (Government Debt) data from EconData

Assign the imported data to the variable debt.

debt <- read_dataset(id = "GOVTDEBT", 
                     version = "1.0.0",
                     series_key = "N.D.",
                     release = "2024 (2024-02-21)", # This specifies the budget release of 2024
                     # The following arguments let EconData import the data in a long tidy format
                     wide = FALSE,
                     combine = TRUE,
                     tidy = TRUE
)

By default, the econdatar package imports economic and financial data as a list of data frames, each indexed by date like a time series object. Lists in R are versatile and can store complex data structures efficiently. For plotting, however, we usually need the data in a long format, and the econdatar package can do that too. The code above overrides the default arguments (wide = FALSE, combine = TRUE, tidy = TRUE) and loads the data and metadata in a long, tidy format.
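
To make the difference between the two shapes concrete, here is a minimal sketch that uses made-up numbers instead of EconData output. The series names and values are purely illustrative, and the list that econdatar actually returns may carry extra metadata that this toy example ignores.

```r
# Toy stand-in for a list of data frames, one per series (values are invented)
series_list <- list(
  SERIES_A = data.frame(time_period = as.Date(c("2022-01-01", "2023-01-01")),
                        obs_value   = c(1.0, 2.0)),
  SERIES_B = data.frame(time_period = as.Date(c("2022-01-01", "2023-01-01")),
                        obs_value   = c(3.0, 4.0))
)

# Stack the list into one long data frame, keeping the series name as a column
long <- do.call(rbind, lapply(names(series_list), function(key) {
  data.frame(series_key = key, series_list[[key]])
}))
rownames(long) <- NULL

long
```

The long version has one row per series and date, which is the shape ggplot2 expects when a column such as the series name is mapped to colour or fill.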

Prepare the Data:

First, filter the data for the subsection of Table 10 we are interested in: line items related to the composition of public debt (all Mnemonic codes starting with ‘N.D.C’). You can learn more about the Mnemonic codes in our Public Finance and Accounts User Guide.

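filtered_debt <- debt %>% 
  # Anchor the pattern and escape the dots so that only keys starting with the literal text "N.D.C" match
  filter(grepl("^N\\.D\\.C", series_key),
         !series_key %in% c("N.D.CDD", "N.D.CMD"))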
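
A short note on matching mnemonic prefixes: in a regular expression the dot matches any character, and an unanchored pattern can match anywhere in the string. The sketch below shows the difference on three example keys. The last key is hypothetical and is included only to show the pitfall.

```r
# Example keys; "XNADXC" is NOT a real mnemonic, it only illustrates the pitfall
keys <- c("N.D.CDD", "N.D.PGL", "XNADXC")

# Unanchored regex where "." matches any character
loose <- grepl("N.D.C", keys)
loose
# TRUE FALSE TRUE

# Literal prefix comparison
strict <- startsWith(keys, "N.D.C")
strict
# TRUE FALSE FALSE

# Anchored regex with escaped dots gives the same result as startsWith()
anchored <- grepl("^N\\.D\\.C", keys)
anchored
# TRUE FALSE FALSE
```

startsWith() requires a character vector, so the anchored regex is the safer choice if the key column might be stored as a factor.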

Plot the data:

area_chart <- filtered_debt %>%
  
  # Aggregate or summarise to calculate the total contribution of each series_name
  group_by(series_name) %>%
  mutate(total_contribution = sum(obs_value, na.rm = TRUE)) %>%
  ungroup() %>%
  
  # Optional: Reorder based on the total contribution
  mutate(
    series_name = fct_reorder(series_name, total_contribution, .desc = FALSE)
  ) %>%
  
  # Create the ggplot
  ggplot(
    aes(x = time_period, 
        y = obs_value, 
        fill = series_name)
  ) +
  geom_area() +  # Create area chart
  scale_fill_manual(values = palette) +  # Custom color palette
  theme_bw() +  # Apply theme
  labs(x = "Date",
       y = "Proportion of Total Debt",
       title = "Composition of Debt in South Africa",
       fill = "Debt Component")
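
The y-axis above is labelled as a proportion, while geom_area() stacks the values exactly as supplied. This tutorial does not state whether the series are stored as levels or as shares, so check before relying on the label. If they turn out to be levels, one way to convert them to shares of each period's total is sketched below on invented data.

```r
# Invented example data: two debt components over two periods
toy <- data.frame(
  time_period = as.Date(c("2022-03-31", "2022-03-31", "2023-03-31", "2023-03-31")),
  series_name = c("Component A", "Component B", "Component A", "Component B"),
  obs_value   = c(30, 70, 45, 55)
)

# Share of each component in the total for its period
toy$share <- ave(toy$obs_value, as.character(toy$time_period),
                 FUN = function(x) x / sum(x))

toy
```

Mapping y to the share column, or using geom_area(position = "fill"), then gives a chart whose stacked height is 1 in every period.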

Treasury Main Budget Table 10 (Government Debt): Hairy line plot of debt over time

Next we produce a hairy line plot. Hairy lines are a useful way to visualise how forecasts differ from realised values over time. We use all the available vintages (previous and current budget releases) to draw one line per vintage for Gross Government Debt (% of GDP).

To do this, we need to add the data vintage by vintage / release by release.

Define the releases

r <- c("2024 (2024-02-21)",
       "2023 (2023-02-22)",
       "2022 (2022-02-23)",
       "2021 (2021-02-24)",
       "2020 (2020-02-26)",
       "2019 (2019-02-20)",
       "2018 (2018-02-21)",
       "2017 (2017-02-22)",
       "2016 (2016-02-24)",
       "2015 (2015-02-25)",
       "2014 (2014-02-26)",
       "2013 (2013-02-27)")

Each release label gives the budget year followed by the date on which that year's budget data were released, covering every year from 2013 to 2024.
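
If you need the release date or budget year as a proper value, for example to sort or label vintages, the labels can be parsed with base R. This is a small sketch that assumes every label follows the "YYYY (YYYY-MM-DD)" pattern shown above.

```r
# Two of the release labels defined above
releases <- c("2024 (2024-02-21)", "2023 (2023-02-22)")

# Pull out the text between the parentheses and convert it to a Date
release_date <- as.Date(sub("^.*\\((.*)\\)$", "\\1", releases))

# Pull out the leading four-digit budget year
budget_year <- as.integer(substr(releases, 1, 4))

data.frame(release = releases, release_date, budget_year)
```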

Import the data:

# Loop through each release and import from EconData
total_debt <- lapply(r, function(release) {
  read_dataset(
    id = "GOVTDEBT",
    version = "1.0.0",
    series_key = "N.D.PGL",
    release = release,
    wide = FALSE,
    tidy = TRUE,
    combine = TRUE
  ) %>% 
    mutate(vintage = release) # Record which budget release each row came from
}) %>% 
  bind_rows()
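
The pattern used here (import once per release, tag each result with its release, then stack everything) can be tried without downloading anything. In the sketch below, fake_read() is a hypothetical stand-in for read_dataset() that returns invented numbers, so only the shape of the result is meaningful.

```r
# Hypothetical stand-in for read_dataset(): returns invented numbers, no download
fake_read <- function(release) {
  values <- if (release == "2024 (2024-02-21)") c(0.71, 0.74) else c(0.72, 0.73)
  data.frame(time_period = as.Date(c("2023-03-31", "2024-03-31")),
             obs_value   = values)
}

vintages <- c("2024 (2024-02-21)", "2023 (2023-02-22)")

stacked <- do.call(rbind, lapply(vintages, function(v) {
  out <- fake_read(v)
  out$vintage <- v  # tag every row with the release it came from
  out
}))

# The same period now appears once per vintage
stacked[stacked$time_period == as.Date("2024-03-31"), ]
```

Because every row carries its vintage, the same time period can appear several times with different values, which is what lets the plot draw one line per release.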

Plot the hairy line:

hairy_line <- total_debt %>% 
  filter(time_period >= as.Date("2000-01-01")) %>%
  mutate(OBS_STATUS = ifelse(OBS_STATUS == "A", "Realised", "Forecast")) %>% 
  ggplot(aes( x = time_period, 
              y = obs_value, 
              group = vintage, # So that we have multiple lines, each representing a vintage
              color = OBS_STATUS # To differentiate between observed and forecasted values
  )) +
  
  geom_line(lwd = 1) +
  
  theme_bw() + # general theme
  scale_color_manual(values = palette) + # here using the custom color palette
  
  labs(title = "Gross Debt Forecasts vs. Actual Values",
       x = "Date",
       y = "Observed Value (% of GDP)",
       color = "Observation Status") + # Legend title for the mapped colour aesthetic
  
  theme(legend.position = "bottom") +
  
  scale_y_continuous(labels = percent_format(), limits = c(0,1)) # Format y-axis as percentages
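
The y-axis formatting above treats the values as fractions between 0 and 1. Whether the EconData series is stored as a fraction or in percentage points is not stated in this tutorial, so inspect obs_value first. The sketch below shows how the scales labellers behave in each case; the input numbers are made up.

```r
library(scales)

# Default: values are treated as fractions and multiplied by 100
frac_labels <- percent_format()(c(0.25, 0.7))
frac_labels
# "25%" "70%"

# With scale = 1 the values are treated as already being in percent
pct_labels <- label_percent(scale = 1)(c(25, 70))
pct_labels
# "25%" "70%"
```

If the data are in percentage points, the axis limits would also need to change from c(0, 1) to something like c(0, 100).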

Treasury Budget Annexure AW22 (Total Transfers to Provinces): Stacked bar chart of conditional grants and provincial equitable share over time

In this example, we work with the Treasury Provincial Annexure Tables and show how to analyse them. A stacked bar chart is a convenient way to see how allocations evolve over time.

Import the data from EconData:

For the provincial tables, the series_key is structured as "Section"."Province"."Table+Line Item". Here we use Section “A”, All the provinces, Table “PP” and Line Items “E” and “G” (Equitable Share & Conditional Grants to Provinces).
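
Rather than typing the key by hand, you can assemble it from its three parts. This is a minimal sketch of the structure described above; the variable names are illustrative only.

```r
section   <- "A"
provinces <- c("EC", "FS", "GP", "KN", "LM", "MP", "NW", "NC", "WC")
items     <- c("PPG", "PPE")  # Table "PP" with line items "G" and "E"

# Alternatives within a dimension are joined by "+", dimensions by "."
series_key <- paste(section,
                    paste(provinces, collapse = "+"),
                    paste(items, collapse = "+"),
                    sep = ".")

series_key
# "A.EC+FS+GP+KN+LM+MP+NW+NC+WC.PPG+PPE"
```

Building the key this way makes it easy to request a subset, for example by shortening the provinces vector.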

pes <- read_dataset(
  id = "GOVTANNEX_PROV",
  version = "1.0.0",
  series_key = "A.EC+FS+GP+KN+LM+MP+NW+NC+WC.PPG+PPE",
  release = "2024 (2024-02-21)",
  # The following arguments let EconData import the data in a long tidy format
  wide = FALSE,
  combine = TRUE,
  tidy = TRUE
  )

Plot the stacked bar chart:

stacked_bar <- ggplot(pes, aes(x = time_period, y = obs_value, group = category, fill = south_african_province)) +
  
  geom_col(position = "fill") + # Stacks the bars and normalizes their height
  
  facet_wrap(~category, scales = "free_y") + # Facet by ITEM
  
  theme_bw() +
  
  labs(
    title = "National Treasury Allocations to Provinces over Time",
    x = "Year",
    y = "Share of Allocation",
    fill = "Province"
  ) +
  scale_fill_manual(values = palette) +
  
  scale_y_continuous(labels = percent_format()) + # Format y-axis as percentages
  scale_x_date(date_breaks = "2 years", date_labels = "%Y") # Dynamic date breaks every 2 years
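
Each script above stores its plot in an object (area_chart, hairy_line, stacked_bar) without drawing it. The sketch below shows one way to display and save such an object, using a small invented plot as a stand-in. The file name and dimensions are arbitrary choices.

```r
library(ggplot2)

# Small invented plot standing in for area_chart, hairy_line or stacked_bar
toy_plot <- ggplot(data.frame(x = 1:3, y = c(2, 4, 3)), aes(x = x, y = y)) +
  geom_line()

# Draw the plot in the active graphics device
print(toy_plot)

# Write the plot to a PNG file in the session's temporary directory
out_file <- file.path(tempdir(), "toy_plot.png")
ggsave(out_file, plot = toy_plot, width = 8, height = 5, dpi = 300)
```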

Conclusion

The Treasury’s budget data contain many more data series than the three used here, and this tutorial should equip you to use them effectively. For more detail on how to use EconData, along with some R programming essentials, see EconData’s User Guide.
