---
title: "boilerplate Package Architecture"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{boilerplate Package Architecture}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

# Overview

The `boilerplate` package is designed to manage and generate standardised text for scientific reports. It uses a unified database architecture with a hierarchical path system and template variable substitution.

# Core Architecture Components

## 1. Unified Database System

The package uses a unified database structure where all content types share a common interface:

```
boilerplate_db (unified)
├── methods/
│   ├── statistical/
│   │   ├── regression/
│   │   └── longitudinal/
│   └── sampling/
├── measures/
│   ├── psychological/
│   └── demographic/
├── results/
├── discussion/
├── appendix/
└── template/
```

### Key Design Principles

- **Single Source of Truth**: All content managed through one unified database
- **Consistent Interface**: Same functions work across all content types
- **Hierarchical Organisation**: Dot notation paths for nested content
- **Format Agnostic**: Supports both RDS (legacy) and JSON formats

## 2. Path System

Content is organised using dot notation paths:

```r
# Access nested content
"methods.statistical.regression.linear"
"measures.psychological.anxiety.gad7"
"results.descriptive.demographics"
```

### Path Operations

- **Navigation**: `get_nested_folder()` traverses the hierarchy
- **Modification**: `modify_nested_entry()` adds/updates/removes entries
- **Wildcards**: `methods.statistical.*` matches all statistical methods
- **Validation**: `boilerplate_path_exists()` checks path validity

## 3. Template Variable System

Dynamic content substitution using `{{variable}}` syntax:

```r
# Template text
"We analysed {{n}} participants using {{method}} regression."
# Variables
list(n = 100, method = "linear")
# Result
"We analysed 100 participants using linear regression."
```

### Variable Scoping

1. **Global Variables**: Available to all sections
2. **Section Variables**: Override globals for specific sections
3. **Text Overrides**: Direct text replacement

## 4. File Organisation

```
R/
├── Core Functions
│   ├── init-functions.R           # Database initialisation
│   ├── import-export-functions.R  # I/O operations
│   └── utilities.R                # Core utilities
│
├── User Interface
│   ├── manage-measures.R          # Measure management
│   ├── generate-text.R            # Text generation
│   └── generate-measures.R        # Measure generation
│
├── Data Operations
│   ├── merge-databases.R          # Database merging
│   ├── path-operations.R          # Path manipulation
│   └── category-helpers.R         # Category extraction
│
├── Format Support
│   ├── json-support.R             # JSON operations
│   ├── migration-utilities.R      # Format migration
│   └── bibliography-support.R     # Citation handling
│
└── Batch Operations
    ├── boilerplate_batch_edit_functions.R
    └── boilerplate_standardise_measures.R
```

# Data Flow Architecture

## 1. Initialisation Flow

```
boilerplate_init()
├── Creates directory structure
├── Initialises empty databases
└── Saves as unified.json/rds
```

## 2. Import Flow

```
External Data → boilerplate_import()
├── Detects format (JSON/RDS/CSV)
├── Validates structure
├── Merges with existing
└── Updates unified database
```

## 3. Text Generation Flow

```
boilerplate_generate_text()
├── Load unified database
├── Extract category paths
├── Apply template variables
├── Handle text overrides
└── Return formatted text
```

# Key Design Patterns

## 1. Function Naming Convention

```r
# Public API
boilerplate_() # Main functions
boilerplate__() # Category-specific
# Internal functions
_() # No prefix for internals
```

## 2. Error Handling Strategy

- User confirmation prompts for destructive operations
- Informative error messages with suggestions
- Validation before operations
- Backup creation for critical operations

## 3. Extensibility Points

### Adding New Categories

1. Define default content in `default-databases.R`
2. Add accessor function following pattern
3. Update unified structure
4. Add tests

### Adding New Formats

1. Implement read/write functions in format-specific file
2. Add format detection in `detect_database_type()`
3. Update import/export functions
4. Ensure round-trip compatibility

## 4. Performance Considerations

- Lazy loading of large databases
- Efficient path traversal using recursive algorithms
- Minimal file I/O with in-memory operations
- Batch operations for multiple edits

# Database Schema

## Unified Database Structure

```r
list(
  methods = list(
    category1 = list(
      entry1 = list(
        text = "Method description with {{variables}}",
        reference = "@citation2023",
        keywords = c("keyword1", "keyword2")
      )
    )
  ),
  measures = list(
    category1 = list(
      measure1 = list(
        name = "measure_name",
        description = "Description",
        type = "continuous|categorical|ordinal|binary",
        ...
      )
    )
  ),
  template = list(
    global = list(var1 = "value1"),
    methods = list(var2 = "value2")
  )
)
```

## Entry Types

### Text Entries (methods, results, discussion)

```r
list(
  text = "Content with {{variables}}",      # Required
  reference = "@citation",                  # Optional
  keywords = c("keyword1", "keyword2"),     # Optional
  large = "Extended version",               # Optional variant
  brief = "Short version"                   # Optional variant
)
```

### Measure Entries

```r
list(
  name = "measure_id",                      # Required
  description = "Description",              # Required
  type = "continuous",                      # Required
  values = c(1, 2, 3),                      # For categorical
  value_labels = c("Low", "Med", "High"),
  range = c(0, 100),                        # For continuous
  unit = "points",
  reference = "@citation"
)
```

# Testing Architecture

## Test Organisation

```
tests/testthat/
├── test-init-functions.R     # Initialisation tests
├── test-import-export.R      # I/O operations
├── test-generate-text.R      # Text generation
├── test-path-operations.R    # Path system
├── test-json-support.R       # JSON functionality
└── test-utilities.R          # Core utilities
```

## Testing Strategy

1. **Unit Tests**: Each function tested in isolation
2. **Integration Tests**: Full workflows tested
3. **Format Tests**: Round-trip compatibility
4. **Edge Cases**: Invalid inputs, empty databases

# Security Considerations

1. **File Operations**: Validated paths, no arbitrary file access
2. **User Input**: Sanitised for path traversal attacks
3. **Confirmations**: Required for destructive operations
4. **Backups**: Automatic for critical operations

# Future Architecture Considerations

## Planned Enhancements

1. **Plugin System**: Allow custom content types
2. **Version Control**: Built-in change tracking
3. **Validation Rules**: Custom validation per category
4. **Performance**: Caching for large databases

## Backwards Compatibility

- RDS format support maintained
- Automatic migration utilities
- Deprecation warnings for old functions
- Version detection in files
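
# Illustrative Sketch: Paths and Template Variables

The dot notation paths, `{{variable}}` substitution, and variable scoping described earlier can be illustrated with a few lines of base R. This is only a minimal sketch under stated assumptions, not the package's implementation: `get_by_path()` and `apply_template_vars()` are hypothetical helpers written for this example, and merging `template$global` with `template$methods` via `utils::modifyList()` is an assumed reading of "section variables override globals".

```r
# Hypothetical helpers for illustration only; they are not part of the
# boilerplate package and do not reflect its internal code.

# Walk a nested list using a dot notation path such as
# "methods.statistical.regression.linear".
get_by_path <- function(db, path) {
  keys <- strsplit(path, ".", fixed = TRUE)[[1]]
  node <- db
  for (key in keys) {
    if (!is.list(node) || is.null(node[[key]])) {
      stop("Path not found: ", path, " (failed at '", key, "')")
    }
    node <- node[[key]]
  }
  node
}

# Replace each {{name}} placeholder with its value.
# Placeholders with no matching variable are left unchanged.
apply_template_vars <- function(text, vars) {
  for (name in names(vars)) {
    placeholder <- paste0("{{", name, "}}")
    text <- gsub(placeholder, as.character(vars[[name]]), text, fixed = TRUE)
  }
  text
}

# A tiny database following the unified structure shown in the schema section.
db <- list(
  methods = list(
    statistical = list(
      regression = list(
        linear = list(
          text = "We analysed {{n}} participants using {{method}} regression."
        )
      )
    )
  ),
  template = list(
    global  = list(n = 100, method = "logistic"),
    methods = list(method = "linear")  # assumed to override the global value
  )
)

entry  <- get_by_path(db, "methods.statistical.regression.linear")
vars   <- utils::modifyList(db$template$global, db$template$methods)
result <- apply_template_vars(entry$text, vars)
print(result)
#> [1] "We analysed 100 participants using linear regression."
```

Using fixed-string matching in `gsub()` avoids having to escape the braces as regular expression metacharacters, and stopping with the failing key in `get_by_path()` follows the "informative error messages" goal listed under the error handling strategy.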