---
title: "Importing & exporting bulk data"
output: rmarkdown::html_vignette
description: >
  This article is your quick-start guide to the core import/export functions  
  in `bulkreadr`. You’ll learn how to bring in data from multi-sheet Excel workbooks  
  (`read_excel_workbook()`, `read_excel_files_from_dir()`), Google Sheets (`read_gsheets()`),  
  and directories of CSV files (`read_csv_files_from_dir()`). Then, you’ll see how  
  to automate exporting every sheet of an Excel file to individual CSVs with  
  `write_excel_sheets_to_csv()`.  
author: "Ezekiel Ogundepo and Ernest Fokoué"
vignette: >
  %\VignetteIndexEntry{Importing & exporting bulk data}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
---


```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse      = TRUE,
  message       = FALSE,
  warning       = FALSE,
  comment       = "#>",
  fig.path      = "man/figures/",
  out.width     = "100%"
)

options(
  tibble.print_min = 5,
  tibble.print_max = 5,
  rmarkdown.html_vignette.check_title = FALSE
)
```

## Introduction

Welcome to **bulkreadr**! This article demonstrates how to efficiently import and export large-scale tabular data:

- **Excel workbooks**  
  - `read_excel_workbook()` pulls in every sheet from a single `.xlsx` or `.xls` file and returns a combined data frame.  
  - `read_excel_files_from_dir()` searches a directory for all Excel files, reads each one, and appends their contents into one data frame.  
  - `write_excel_sheets_to_csv()` takes any Excel workbook and writes each sheet out as its own CSV file in a folder you specify.

- **Google Sheets**  
  - `read_gsheets()` retrieves one or more sheets from a Google Sheets document (by URL or sheet ID) and merges them into a single data frame.

- **CSV files**  
  - `read_csv_files_from_dir()` loads every `.csv` file in a given directory and binds them into one data frame, preserving the file order.

By the end of this article, you’ll have a reproducible pipeline for moving data between Excel, Google Sheets, and CSV formats.

## Note on Example Data

> All examples below use sample files shipped with the `bulkreadr` package (accessed via `system.file()`). To apply them in your own project, simply replace those paths with the paths to your local files or folders.

```{r}
library(bulkreadr)
library(dplyr)
```
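
For instance, in your own project the same calls might look like the sketch below. The file and folder paths are placeholders, not files shipped with the package, so the chunk is not evaluated.

```{r, eval=FALSE}
# Hypothetical local paths -- replace with your own files and folders
my_workbook  <- "data/sales_2023.xlsx"
my_excel_dir <- "data/excel_reports/"
my_csv_dir   <- "data/csv_exports/"

sales   <- read_excel_workbook(path = my_workbook)
reports <- read_excel_files_from_dir(dir_path = my_excel_dir)
exports <- read_csv_files_from_dir(dir_path = my_csv_dir)
```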

## `read_excel_workbook()`

The `read_excel_workbook()` function reads **all sheets** from a single Excel workbook and combines them into one tidy data frame. This is ideal when you have multiple related sheets in one file and want to work with them as a single table.

```{r}
# Path to the Excel workbook bundled with the package
path <- system.file("extdata", "Diamonds.xlsx", package = "bulkreadr")

# Import every sheet into one data frame
df_all_sheets <- read_excel_workbook(path = path)

# View the result
df_all_sheets
```
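
Because every sheet now lives in one data frame, you can pipe the result straight into `dplyr`. The sketch below assumes the combined Diamonds data contains `cut` and `price` columns (mirroring the familiar diamonds dataset); swap in your own column names, so the chunk is left unevaluated.

```{r, eval=FALSE}
# A minimal follow-up sketch: summarise the combined sheets with dplyr.
# The `cut` and `price` columns are assumptions -- adjust to your data.
df_all_sheets %>%
  group_by(cut) %>%
  summarise(
    n_rows    = n(),
    avg_price = mean(price, na.rm = TRUE)
  )
```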

## `read_excel_files_from_dir()`

`read_excel_files_from_dir()` scans a directory for all `.xlsx`/`.xls` files, reads each workbook, and appends their sheets into a single data frame. This is useful for batch-processing many files at once.

```{r}
# Directory containing multiple Excel workbooks
directory <- system.file("xlsxfolder", package = "bulkreadr")

# Read and combine all workbooks in that folder
combined_excel <- read_excel_files_from_dir(dir_path = directory)

# View the result
glimpse(combined_excel)
```
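
As a quick sanity check, you can compare the Excel files found in the directory against the combined result; base R's `list.files()` is enough for this.

```{r}
# Excel files picked up from the directory
list.files(directory, pattern = "\\.xlsx?$")

# Total number of rows after combining all workbooks
nrow(combined_excel)
```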

## `write_excel_sheets_to_csv()`

With a single call to `write_excel_sheets_to_csv()`, you can convert every worksheet in an Excel file into a separate CSV file in a directory of your choice. The function reads each sheet from the Excel file and writes it out as an individual CSV:

```{r, message=TRUE}
# Excel file with multiple sheets
excel_file <- system.file("extdata", "Diamonds.xlsx", package = "bulkreadr")

# Output directory for the exported CSV files (here, a temporary directory)
output_dir <- tempdir()

# Export each sheet of the workbook to its own CSV file
write_excel_sheets_to_csv(
  excel_path = excel_file,
  output_dir = output_dir
)
```

This function ensures consistent file naming, automates the export process, and supports reproducible reporting workflows.
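
To confirm the export worked, list the CSV files that now sit in the output directory:

```{r}
# CSV files created by write_excel_sheets_to_csv()
list.files(output_dir, pattern = "\\.csv$")
```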

## `read_csv_files_from_dir()`

`read_csv_files_from_dir()` reads every `.csv` in a specified directory and binds them row-wise into one data frame, following the directory’s file order.

```{r}
# Directory with CSV files
directory <- system.file("csvfolder", package = "bulkreadr")

# Import all CSVs at once
all_csv_data <- read_csv_files_from_dir(dir_path = directory)

# View the result
all_csv_data
```
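
The same function also closes the loop with `write_excel_sheets_to_csv()`: the CSVs exported to `output_dir` earlier can be read straight back into a single data frame. This is shown as an unevaluated sketch because a temporary directory may contain other CSV files on your system.

```{r, eval=FALSE}
# Re-import the CSVs exported from Diamonds.xlsx earlier in this article
reimported <- read_csv_files_from_dir(dir_path = output_dir)

glimpse(reimported)
```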

## `read_gsheets()`

For Google Sheets data, `read_gsheets()` lets you pull one or more sheets from a spreadsheet (by URL or sheet ID) and combine them into a single data frame.

```{r}
# De-authenticate since we're accessing a public or already-shared sheet
googlesheets4::gs4_deauth()

# Google Sheet ID (or full URL)
sheet_id <- "1izO0mHu3L9AMySQUXGDn9GPs1n-VwGFSEoAKGhqVQh0"

# Read and merge all sheets
gsheet_data <- read_gsheets(ss = sheet_id)

# Inspect the imported data
glimpse(gsheet_data)
```
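
The example above reads a public sheet, so `gs4_deauth()` is enough. For a private spreadsheet you would authenticate first with `googlesheets4::gs4_auth()`; the sketch below uses a placeholder sheet ID and triggers an interactive Google sign-in, so it is not evaluated here.

```{r, eval=FALSE}
# For private spreadsheets, authenticate before reading (interactive prompt)
googlesheets4::gs4_auth()

# Placeholder ID -- replace with your own sheet ID or URL
private_data <- read_gsheets(ss = "your-sheet-id-or-url")
```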

---

With these functions in `bulkreadr`, you can effortlessly import and export bulk data, leaving you more time for analysis and insight.