---
title: "Working with Measures in boilerplate"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Working with Measures in boilerplate}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

```{r setup}
library(boilerplate)
```

# Overview

This vignette is a guide to working with measures in the boilerplate package. Measures are a special type of content that describes the variables, instruments, and scales used in research. The package provides tools for managing and standardising measures and for generating formatted text about them.

# Quick Start: Adding and Using Measures

## Basic Workflow

```{r}
# Initialise and import the database
# Using a temporary directory for this example
temp_measures <- file.path(tempdir(), "measures_workflow_example")
boilerplate_init(data_path = temp_measures, create_dirs = TRUE, create_empty = FALSE, confirm = FALSE, quiet = TRUE)
unified_db <- boilerplate_import(data_path = temp_measures, quiet = TRUE)

# Add a measure directly to the unified database
# IMPORTANT: Measures must be at the top level, not nested in categories
unified_db$measures$anxiety_gad7 <- list(
  name = "generalised anxiety disorder scale (GAD-7)",
  description = "anxiety was measured using the GAD-7 scale.",
  type = "ordinal",
  reference = "spitzer2006",
  waves = "1-3",
  keywords = c("anxiety", "mental health", "gad"),
  items = list(
    "feeling nervous, anxious, or on edge",
    "not being able to stop or control worrying",
    "worrying too much about different things",
    "trouble relaxing",
    "being so restless that it is hard to sit still",
    "becoming easily annoyed or irritable",
    "feeling afraid, as if something awful might happen"
  )
)

# Save the database
boilerplate_save(unified_db, data_path = temp_measures, confirm = FALSE, quiet = TRUE)

# Generate formatted text about the measure
measures_text <- boilerplate_generate_measures(
  variable_heading = "Anxiety Measure",
  variables = "anxiety_gad7",
  db = unified_db,
  heading_level = 3,
  print_waves = TRUE
)

cat(measures_text)
```

# Understanding Measure Structure

## Required Fields

Every measure must have these fields:

- **name**: A descriptive name for the measure
- **description**: A brief description of what the measure assesses
- **type**: The measurement type (continuous, categorical, ordinal, or binary)

## Optional Fields

Additional fields provide more detail:

- **reference**: Citation for the measure
- **waves**: Data collection waves where the measure was used
- **keywords**: Terms for searching and categorisation
- **items**: List of individual items/questions
- **values**: Possible response values (for categorical/ordinal)
- **value_labels**: Labels for the response values
- **range**: Min and max values (for continuous measures)
- **unit**: Unit of measurement
- **cutoffs**: Clinical or meaningful cutoff values
- **scoring**: Information about how to score the measure
- **subscales**: Details of any subscales

## Common Mistakes to Avoid

### ❌ Incorrect: Nesting measures under categories

```{r}
# DON'T DO THIS - measures should not be nested under categories
unified_db$measures$psychological$anxiety <- list(...) # WRONG
```

### ✅ Correct: Top-level measure entries

```{r}
# DO THIS - add measures directly at the top level
unified_db$measures$anxiety_gad7 <- list(...) # CORRECT
unified_db$measures$depression_phq9 <- list(...) # CORRECT
```

# Managing Multiple Measures

## Adding Multiple Measures at Once

```{r}
# Add several psychological measures
unified_db$measures$depression_phq9 <- list(
  name = "patient health questionnaire-9 (PHQ-9)",
  description = "depression symptoms were assessed using the PHQ-9.",
  type = "ordinal",
  reference = "kroenke2001",
  waves = "1-3",
  items = list(
    "little interest or pleasure in doing things",
    "feeling down, depressed, or hopeless",
    "trouble falling or staying asleep, or sleeping too much",
    "feeling tired or having little energy",
    "poor appetite or overeating",
    "feeling bad about yourself — or that you are a failure",
    "trouble concentrating on things",
    "moving or speaking slowly, or being fidgety or restless",
    "thoughts that you would be better off dead"
  ),
  values = c(0, 1, 2, 3),
  value_labels = c("not at all", "several days", "more than half the days", "nearly every day")
)

unified_db$measures$self_esteem <- list(
  name = "rosenberg self-esteem scale",
  description = "self-esteem was measured using a 3-item version of the Rosenberg scale.",
  type = "continuous",
  reference = "rosenberg1965",
  waves = "5-current",
  range = c(1, 7),
  items = list(
    "On the whole, I am satisfied with myself.",
    "I take a positive attitude toward myself.",
    "I feel that I am a person of worth, at least on an equal plane with others."
  )
)

# Save all changes
boilerplate_save(unified_db, data_path = temp_measures, confirm = FALSE, quiet = TRUE)
```

## Interactive Management

Browse and edit measures programmatically:

```{r}
# View all measures
names(unified_db$measures)

# Access a specific measure (if it exists)
if ("anxiety" %in% names(unified_db$measures)) {
  unified_db$measures$anxiety
} else if ("anxiety_gad7" %in% names(unified_db$measures)) {
  unified_db$measures$anxiety_gad7
}

# Add or update measures using boilerplate_add_entry() or boilerplate_update_entry()
```

# Standardising Measures

The standardisation process cleans and enhances your measure entries for consistency and completeness.

## What Standardisation Does

1. **Extracts scale information** from descriptions
2. **Identifies reversed items** marked with (r)
3. **Cleans formatting** issues
4. **Ensures complete structure** with all standard fields
5. **Standardises references** for consistency

## Running Standardisation

```{r}
# Check quality before standardisation
boilerplate_measures_report(unified_db$measures)

# Standardise all measures
unified_db$measures <- boilerplate_standardise_measures(
  unified_db$measures,
  extract_scale = TRUE,       # Extract scale info from descriptions
  identify_reversed = TRUE,   # Identify reversed items
  clean_descriptions = TRUE,  # Clean up description text
  verbose = TRUE              # Show progress
)

# Check quality after standardisation
boilerplate_measures_report(unified_db$measures)

# Save the standardised database
boilerplate_save(unified_db, data_path = temp_measures, confirm = FALSE, quiet = TRUE)
```

## Example: Before and After Standardisation

### Before:

```{r}
# Messy measure entry
unified_db$measures$perfectionism <- list(
  name = "perfectionism scale",
  description = "Perfectionism (1 = Strongly Disagree, 7 = Strongly Agree). Higher scores indicate greater perfectionism.",
  items = list(
    "Doing my best never seems to be enough.",
    "My performance rarely measures up to my standards.",
    "I am hardly ever satisfied with my performance. (r)"
  )
)
```

### After standardisation:

```{r}
# Clean, standardised entry
# The standardisation process will:
# - Extract scale: "1 = Strongly Disagree, 7 = Strongly Agree"
# - Clean description: "Perfectionism. Higher scores indicate greater perfectionism."
# - Identify reversed items: item 3 marked as reversed
# - Add missing fields: type, scale_info, scale_anchors, reversed_items
```

# Generating Quality Reports

## Basic Quality Assessment

```{r}
# Get a quality overview
boilerplate_measures_report(unified_db$measures)

# Output shows:
# - Total measures
# - Completeness percentages
# - Missing information
# - Standardisation status
```

## Detailed Quality Analysis

```{r}
# Get detailed report as data frame
quality_report <- boilerplate_measures_report(
  unified_db$measures,
  return_report = TRUE
)

# Find measures missing critical information
missing_refs <- quality_report[!quality_report$has_reference, ]
missing_items <- quality_report[!quality_report$has_items, ]

# View specific issues
cat("Measures without references:", missing_refs$measure_name, sep = "\n")
cat("\nMeasures without items:", missing_items$measure_name, sep = "\n")
```

# Batch Operations on Measures

## Finding Entries to Clean

```{r}
# Find measures with specific characters in references
problematic_refs <- boilerplate_find_chars(
  db = unified_db,
  field = "reference",
  chars = c("@", "[", "]", " "),
  category = "measures"
)

print(problematic_refs)
```

## Batch Cleaning

```{r}
# Preview the reference clean-up first. The result is deliberately not
# assigned, so unified_db is unchanged whatever the preview returns.
boilerplate_batch_clean(
  db = unified_db,
  field = "reference",
  remove_chars = c("@", "[", "]"),
  replace_pairs = list(" " = "_"),
  trim_whitespace = TRUE,
  category = "measures",
  preview = TRUE # Preview first
)

# If preview looks good, run without preview and keep the result
unified_db <- boilerplate_batch_clean(
  db = unified_db,
  field = "reference",
  remove_chars = c("@", "[", "]"),
  replace_pairs = list(" " = "_"),
  trim_whitespace = TRUE,
  category = "measures"
)
```

## Batch Editing

```{r}
# Preview a reference update for multiple measures
# (not assigned, so the database is unchanged)
boilerplate_batch_edit(
  db = unified_db,
  field = "reference",
  new_value = "sibley2024",
  target_entries = c("political_orientation", "social_dominance"),
  category = "measures",
  preview = TRUE
)

# Update wave information using wildcards
unified_db <- boilerplate_batch_edit(
  db = unified_db,
  field = "waves",
  new_value = "1-16",
  target_entries = "political_*", # All political measures
  category = "measures"
)

# Update based on current values
unified_db <- boilerplate_batch_edit(
  db = unified_db,
  field = "waves",
  new_value = "1-current",
  match_values = c("1-15", "1-16"), # Update these specific values
  category = "measures"
)
```

# Generating Formatted Output

## Basic Measure Text

```{r}
# Generate text for a single measure
exposure_text <- boilerplate_generate_measures(
  variable_heading = "Exposure Variable",
  variables = "perfectionism",
  db = unified_db,
  heading_level = 3,
  subheading_level = 4,
  print_waves = TRUE
)

cat(exposure_text)
```

## Multiple Measures with Categories

```{r}
# Generate text for multiple measures grouped by type
psychological_measures <- boilerplate_generate_measures(
  variable_heading = "Psychological Measures",
  variables = c("anxiety_gad7", "depression_phq9", "self_esteem"),
  db = unified_db,
  heading_level = 3,
  subheading_level = 4,
  print_waves = TRUE,
  sample_items = 3 # Show only first 3 items
)

demographic_measures <- boilerplate_generate_measures(
  variable_heading = "Demographic Variables",
  variables = c("age", "gender", "education"),
  db = unified_db,
  heading_level = 3,
  subheading_level = 4,
  print_waves = FALSE # Don't show waves for demographics
)

# Combine into methods section
methods_measures <- paste(
  "## Measures\n\n",
  psychological_measures,
  "\n\n",
  demographic_measures,
  sep = ""
)
```

## Advanced Formatting Options

```{r}
# Table format for enhanced presentation
measures_table <- boilerplate_generate_measures(
  variable_heading = "Study Measures",
  variables = c("anxiety_gad7", "perfectionism"),
  db = unified_db,
  table_format = TRUE,        # Use table format
  sample_items = 3,           # Show only 3 items
  check_completeness = TRUE,  # Note missing information
  quiet = TRUE                # Suppress progress messages
)

cat(measures_table)
```

# Complete Workflow Example

Here's a complete workflow from adding measures to generating a methods section:

```{r}
# 1. Initialise and import
# Create a new temp directory for this complete example
temp_complete <- file.path(tempdir(), "complete_measures_example")
boilerplate_init(data_path = temp_complete, create_dirs = TRUE, create_empty = FALSE, confirm = FALSE, quiet = TRUE)
unified_db <- boilerplate_import(data_path = temp_complete, quiet = TRUE)

# 2. Add your measures
unified_db$measures$political_orientation <- list(
  name = "political orientation",
  description = "political orientation on a liberal-conservative spectrum",
  type = "continuous",
  reference = "jost2009",
  waves = "all",
  range = c(1, 7),
  items = list("Please rate your political orientation")
)

unified_db$measures$social_wellbeing <- list(
  name = "social wellbeing scale",
  description = "social wellbeing measured using the Keyes Social Well-Being Scale",
  type = "continuous",
  reference = "keyes1998",
  waves = "5-current",
  items = list(
    "I feel like I belong to a community",
    "I feel that people are basically good",
    "I have something important to contribute to society",
    "Society is becoming a better place for everyone"
  )
)

# 3. Standardise the measures
unified_db$measures <- boilerplate_standardise_measures(
  unified_db$measures,
  verbose = FALSE
)

# 4. Check quality
boilerplate_measures_report(unified_db$measures)

# 5. Save the database
boilerplate_save(unified_db, data_path = temp_complete, confirm = FALSE, quiet = TRUE)

# 6. Generate formatted output
exposure_text <- boilerplate_generate_measures(
  variable_heading = "Exposure Variable",
  variables = "political_orientation",
  db = unified_db,
  heading_level = 3
)

outcome_text <- boilerplate_generate_measures(
  variable_heading = "Outcome Variable",
  variables = "social_wellbeing",
  db = unified_db,
  heading_level = 3
)

# 7. Combine with other methods text
sample_text <- boilerplate_generate_text(
  category = "methods",
  sections = "sample.default", # Use a valid section path
  global_vars = list(
    population = "New Zealand adults",
    timeframe = "2020-2024"
  ),
  db = unified_db
)

# 8. Create complete methods section
methods_section <- paste(
  "# Methods\n\n",
  "## Participants\n\n",
  sample_text,
  "\n\n",
  "## Measures\n\n",
  exposure_text,
  "\n\n",
  outcome_text,
  sep = ""
)

cat(methods_section)
```

# Best Practices

## 1. Measure Organisation

- Keep measure names descriptive but concise
- Use consistent naming conventions (e.g., `scale_abbreviation`)
- Group related measures using consistent prefixes

## 2. Quality Control

- Run standardisation after importing data
- Review the quality report regularly
- Keep references consistent and complete
- Document any special scoring requirements

## 3. Workflow Tips

- Export your database before major changes
- Use preview mode for batch operations
- Test on a few measures before applying to all
- Keep the original item text exact for reproducibility

## 4. Integration with Text Generation

- Define measures before referencing them in text
- Pass the exact measure key (the list name, such as `anxiety_gad7`, not the `name` field) to `boilerplate_generate_measures()`
- Consider your audience when choosing format options
- Combine measure descriptions with method text for complete sections

# Troubleshooting

## Common Issues

### Measure not found

```{r}
# Error: Measure 'anxiety' not found
# Solution: Check exact name
names(unified_db$measures) # List all measure names
```

### Standardisation warnings

```{r}
# Warning: Some measures already standardised
# Solution: This is normal - already standardised measures are skipped
```

### Missing required fields

```{r}
# Error: Measure missing required field 'type'
# Solution: Add the missing field
unified_db$measures$my_measure$type <- "continuous"
```

## Getting Help

If you encounter issues:

1. Check the measure structure matches the examples
2. Run `boilerplate_measures_report()` to identify problems
3. Use `verbose = TRUE` in functions for detailed output
4. Consult the package documentation: `?boilerplate_generate_measures`

# Summary

The boilerplate package provides a complete workflow for managing research measures:

1. **Add** measures to the unified database with proper structure
2. **Standardise** entries for consistency and completeness
3. **Assess** quality using the reporting tools
4. **Edit** multiple measures efficiently with batch operations
5. **Generate** professional formatted output for publications

By following this workflow, you can maintain a high-quality, consistent database of measures that integrates with your research documentation.
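
The first step of the workflow depends on every entry carrying the required fields. The following is a minimal sketch of such a check in base R, assuming measures are plain named lists as in the examples in this vignette. `check_required_fields()` and `toy_measures` are hypothetical names introduced for illustration and are not part of boilerplate; the field names come from the Required Fields section.

```{r}
# Hypothetical helper (not a boilerplate function): report which required
# fields are absent from each measure in a named list of measures.
required_fields <- c("name", "description", "type")

check_required_fields <- function(measures, required = required_fields) {
  # For each measure, keep the required field names that are not present
  missing <- lapply(measures, function(m) setdiff(required, names(m)))
  # Drop measures that already have every required field
  missing[lengths(missing) > 0]
}

# Toy data mirroring the list structure used in this vignette
toy_measures <- list(
  anxiety_gad7 = list(
    name = "generalised anxiety disorder scale (GAD-7)",
    description = "anxiety was measured using the GAD-7 scale.",
    type = "ordinal"
  ),
  perfectionism = list(
    name = "perfectionism scale",
    description = "Perfectionism. Higher scores indicate greater perfectionism."
  )
)

# Named list: one element per incomplete measure, holding its missing fields
missing_fields <- check_required_fields(toy_measures)
print(missing_fields)
```

On a real database the same helper could be applied to `unified_db$measures` before saving, which complements rather than replaces `boilerplate_measures_report()`.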