---
title: "ChinAPIs: Access Chinese Data via Public APIs and Curated Datasets"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{ChinAPIs: Access Chinese Data via Public APIs and Curated Datasets}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(ChinAPIs)
library(ggplot2)
library(dplyr)
```

# Introduction

The `ChinAPIs` package provides a unified interface to access open data from the **World Bank API**, **Nager.Date API**, and the **REST Countries API**, with a focus on **China**. It allows users to retrieve up-to-date information on topics such as economic indicators, population statistics, unemployment rates, holidays, and basic geopolitical details.

In addition to API-access functions, the package includes one of the largest curated collections of open datasets related to **China** and **Hong Kong**. These datasets cover areas such as air quality, demographic indicators, input-output economic tables, epidemiology, administrative divisions, name distributions, political structure, and various social indicators.

`ChinAPIs` is designed to support users working with data related to **China** by combining international API sources with structured datasets from public, academic, and governmental sources in a single, easy-to-use R package.

## Functions for ChinAPIs

The `ChinAPIs` package provides several core functions to access real-time and structured information about **China** from public APIs such as the **World Bank API**, **Nager.Date**, and **REST Countries**.
Below is a list of the main functions included in the package:

- `get_country_info_cn()`: Get key geopolitical and administrative information about China from the REST Countries API.
- `get_china_gdp()`: Get China's Gross Domestic Product (current US$) from the World Bank.
- `get_china_cpi()`: Get China's Consumer Price Index from the World Bank.
- `get_china_population()`: Get China's total population from the World Bank.
- `get_china_literacy_rate()`: Get China's adult literacy rate (age 15+) from the World Bank.
- `get_china_life_expectancy()`: Get life expectancy at birth for China from the World Bank.
- `get_china_unemployment()`: Get China's unemployment rate from the World Bank.
- `get_china_energy_use()`: Get China's energy use per capita (kg of oil equivalent) from the World Bank.
- `get_china_child_mortality()`: Get the under-5 mortality rate (per 1,000 live births) in China from the World Bank.
- `get_china_hospital_beds()`: Get hospital beds per 1,000 people in China from the World Bank.
- `get_china_holidays()`: Get official public holidays in China for a given year, e.g. `get_china_holidays(2025)`.
- `view_datasets_ChinAPIs()`: List all curated datasets included in the `ChinAPIs` package.

These functions give users structured information on **China** that can be combined with tools like **dplyr** and **ggplot2** to support a wide range of data analysis, visualization, and research tasks.

In the following sections, you'll find examples of how to work with `ChinAPIs` in practical scenarios.

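
Because the functions listed above call remote APIs, a request can fail when the network or the service is unavailable. The following is a minimal sketch of a defensive wrapper. `safe_fetch()` is a hypothetical helper written for illustration, not part of `ChinAPIs`, and it assumes nothing about how the package handles errors internally.

```r
# Hypothetical helper (not part of ChinAPIs): call a fetch function and
# return NULL instead of stopping if the call raises an error.
safe_fetch <- function(fetch_fun, ...) {
  tryCatch(
    fetch_fun(...),
    error = function(e) {
      message("Request failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Usage with the package loaded (left commented out because it needs
# network access):
# gdp <- safe_fetch(get_china_gdp)
# holidays <- safe_fetch(get_china_holidays, 2025)
# if (!is.null(gdp)) str(gdp)
```

Checking for `NULL` before further processing keeps a script or vignette from aborting midway when a single request fails.
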

### China's GDP (Current US$) from World Bank, 2017–2022

```{r china-gdp, echo = TRUE, message = FALSE, warning = FALSE, results = 'markup'}
china_gdp <- head(get_china_gdp())

print(china_gdp)
```

### China's Life Expectancy at Birth from World Bank, 2017–2022

```{r china-life-expectancy, echo = TRUE, message = FALSE, warning = FALSE, results = 'markup'}
life_expectancy <- head(get_china_life_expectancy())

print(life_expectancy)
```

### China's Total Population from World Bank, 2017–2022

```{r china-population, echo = TRUE, message = FALSE, warning = FALSE, results = 'markup'}
china_population <- head(get_china_population())

print(china_population)
```

## Total Population by District in Hong Kong

```{r hk-population-plot, message = FALSE, warning = FALSE, fig.width = 7, fig.height = 5}
# Plot total population by district, with comma-formatted population labels
hk_population_tbl_df %>%
  arrange(desc(TotalPopulation)) %>%
  ggplot(aes(x = reorder(District_EN, TotalPopulation), y = TotalPopulation)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  scale_y_continuous(labels = function(x) format(x, big.mark = ",", scientific = FALSE)) +
  labs(
    title = "Total Population by District in Hong Kong",
    x = "District",
    y = "Total Population"
  ) +
  theme_minimal()
```

## Dataset Suffixes

Each dataset in `ChinAPIs` is labeled with a *suffix* to indicate its structure and type:

- `_df`: A standard data frame.
- `_tbl_df`: A tibble data frame object.
- `_list`: A list object.
- `_matrix`: A matrix object.

## Datasets Included in ChinAPIs

In addition to API access functions, `ChinAPIs` provides one of the largest curated collections of open datasets focused on **China** and **Hong Kong**. These preloaded datasets cover topics such as air quality, administrative divisions, input-output tables, names, demographics, infrastructure, and public health.
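
Because every dataset name ends in one of the suffixes listed above, the object type can be read off the name before the dataset is loaded. The following is a minimal sketch of that lookup. `dataset_suffix()` is a hypothetical helper written for illustration, not a function exported by `ChinAPIs`.

```r
# Hypothetical helper (not part of ChinAPIs): return the suffix a dataset
# name carries, or NA if it matches none of the documented suffixes.
dataset_suffix <- function(name) {
  # "_tbl_df" is tested before "_df" because every name ending in
  # "_tbl_df" also ends in "_df".
  suffixes <- c("_tbl_df", "_df", "_list", "_matrix")
  for (s in suffixes) {
    if (endsWith(name, s)) {
      return(s)
    }
  }
  NA_character_
}

dataset_suffix("hk_population_tbl_df")
dataset_suffix("family_name_df")
```

The order of the checks matters: testing `_df` first would classify every tibble dataset as a plain data frame.
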
Below are some featured datasets:

- `hk_population_tbl_df`: Hong Kong population by district and age group.
- `chinese_cities_tbl_df`: Information about 367 prominent cities in China.
- `family_name_df`: Chinese surnames and their national frequency (1930–2008).

## Conclusion

The `ChinAPIs` package offers a comprehensive interface to access curated datasets and structured data about **China**, encompassing a wide range of topics relevant to the country's environment, economy, demography, and public infrastructure.

Unlike other tools focused solely on API connections, `ChinAPIs` provides preloaded datasets that include information on air quality in **Beijing**, corruption perception indices, inter-industry input-output tables across multiple years, detailed demographic records, COVID-19 and SARS statistics in Hong Kong, as well as data on Chinese dams, pandas, administrative divisions, and given and family names. These datasets enable users to analyze patterns in urban development, public health, environmental quality, political structure, and social trends.

The package serves as a resource for researchers, educators, journalists, and developers interested in **China**'s contemporary landscape, offering localized, high-resolution data in tidy formats ready for direct use in R. Overall, `ChinAPIs` helps bridge the gap between complex Chinese open data sources and accessible, reproducible, and transparent data science workflows in R.