In this vignette we will discuss about how to customize the summary statistics using ExpCustomStat function from SmartEDA. The output of this function returns matrix object containing descriptive information on all input variables for each level/combination of levels in categorical/group variable. Also, while running the analysis we can filter row/cases of the data. We can apply the filters at individual variable level or complete data like base subsetting.
Function definition:
ExpCustomStat(data,Cvar=NULL,Nvar=NULL,stat=NULL,gpby=TRUE,filt=NULL,dcast=FALSE)Key functionalities of ExpCustomStat are:
ExpCustomStat functionWill open the carseats data from ISLR package and drive different types of use cases using ExpCustomStat function.
In this vignette, we will be using a simulated data set containing sales of child car seats at 400 different stores.
Data Source ISLR package.
Function source SmartEDA package
Carseats data from ISLR package:
##   Sales CompPrice Income Advertising Population Price ShelveLoc Age Education Urban  US
## 1  9.50       138     73          11        276   120       Bad  42        17   Yes Yes
## 2 11.22       111     48          16        260    83      Good  65        10   Yes Yes
## 3 10.06       113     35          10        269    80    Medium  59        12   Yes Yes
## 4  7.40       117    100           4        466    97    Medium  55        14   Yes Yes
## 5  4.15       141     64           3        340   128       Bad  38        13   Yes  NoCategorical summaries to describe the distribution for a qualitative variables.
The number of observations for particular category
"US", "Urban" and "ShelveLoc" variable##      Level  Group_by Count  Prop
##  1:    Yes        US   258 64.50
##  2:     No        US   142 35.50
##  3:    Yes     Urban   282 70.50
##  4:     No     Urban   118 29.50
##  5:    Bad ShelveLoc    96 24.00
##  6:   Good ShelveLoc    85 21.25
##  7: Medium ShelveLoc   219 54.75
##  8:     17 Education    49 12.25
##  9:     10 Education    48 12.00
## 10:     12 Education    49 12.25
## 11:     14 Education    40 10.00
## 12:     13 Education    43 10.75
## 13:     16 Education    47 11.75
## 14:     15 Education    36  9.00
## 15:     18 Education    40 10.00
## 16:     11 Education    48 12.00OR we can use similar analysis using ExpCTable function from same package, this functions includes cumulative percentages and Total
##     Variable  Valid Frequency Percent CumPercent
## 1  ShelveLoc    Bad        96   24.00      24.00
## 2  ShelveLoc   Good        85   21.25      45.25
## 3  ShelveLoc Medium       219   54.75     100.00
## 4  ShelveLoc  TOTAL       400      NA         NA
## 5      Urban     No       118   29.50      29.50
## 6      Urban    Yes       282   70.50     100.00
## 7      Urban  TOTAL       400      NA         NA
## 8         US     No       142   35.50      35.50
## 9         US    Yes       258   64.50     100.00
## 10        US  TOTAL       400      NA         NA
## 11 Education     10        48   12.00      12.00
## 12 Education     11        48   12.00      24.00
## 13 Education     12        49   12.25      36.25
## 14 Education     13        43   10.75      47.00
## 15 Education     14        40   10.00      57.00
## 16 Education     15        36    9.00      66.00
## 17 Education     16        47   11.75      77.75
## 18 Education     17        49   12.25      90.00
## 19 Education     18        40   10.00     100.00
## 20 Education  TOTAL       400      NA         NA"US", "Urban" and "ShelveLoc" variable##     Level  Group_by Count  Prop
## 1:    Yes        US   258 64.50
## 2:     No        US   142 35.50
## 3:    Yes     Urban   282 70.50
## 4:     No     Urban   118 29.50
## 5:    Bad ShelveLoc    96 24.00
## 6:   Good ShelveLoc    85 21.25
## 7: Medium ShelveLoc   219 54.75To produce cross tables which calculate counts and proportions for each combination of categorical variables we can use ExpCustomStat
NOTE: For crosstabulation change input gpby=TRUE
"US" Vs "Urban"##     US Urban Count Prop
## 1: Yes   Yes   186 46.5
## 2:  No   Yes    96 24.0
## 3: Yes    No    72 18.0
## 4:  No    No    46 11.5We can also produce multidimensional tables based on three or more categorical variables
"US" Vs "Urban" Vs "ShelveLoc"##      US Urban ShelveLoc Count  Prop
##  1: Yes   Yes       Bad    51 12.75
##  2: Yes   Yes      Good    39  9.75
##  3: Yes   Yes    Medium    96 24.00
##  4:  No   Yes       Bad    23  5.75
##  5: Yes    No       Bad    11  2.75
##  6:  No   Yes    Medium    55 13.75
##  7:  No    No    Medium    29  7.25
##  8: Yes    No    Medium    39  9.75
##  9:  No   Yes      Good    18  4.50
## 10: Yes    No      Good    22  5.50
## 11:  No    No      Good     6  1.50
## 12:  No    No       Bad    11  2.75If we want to understand the number of stores in US and location is Urban for Population size greater than 150
Population is > 150"US" Vs "Urban"##     US Urban         Filter Count  Prop
## 1: Yes   Yes Population>150   138 48.25
## 2:  No   Yes Population>150    63 22.03
## 3: Yes    No Population>150    53 18.53
## 4:  No    No Population>150    32 11.19Population > 150 and Urban==“Yes”"US" Vs "ShelveLoc"##     US ShelveLoc                        Filter Count  Prop
## 1: Yes       Bad Urban=='Yes' & Population>150    40 19.90
## 2: Yes      Good Urban=='Yes' & Population>150    31 15.42
## 3: Yes    Medium Urban=='Yes' & Population>150    67 33.33
## 4:  No       Bad Urban=='Yes' & Population>150    13  6.47
## 5:  No    Medium Urban=='Yes' & Population>150    37 18.41
## 6:  No      Good Urban=='Yes' & Population>150    13  6.47Numerical summaries to describe the distribution for quantitative variables.
options(width = 150)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','var','sd','min','max','IQR'))##     Attribute Count       mean       sum          var         sd min    max    IQR
## 1: Population   400 264.840000 105936.00 21719.813935 147.376436  10 509.00 259.50
## 2:      Sales   400   7.496325   2998.53     7.975626   2.824115   0  16.27   3.93
## 3:  CompPrice   400 124.975000  49990.00   235.147243  15.334512  77 175.00  20.00
## 4:     Income   400  68.657500  27463.00   783.218239  27.986037  21 120.00  48.25##     Attribute min  p0.25 median  p0.75    max
## 1: Population  10 139.00 272.00 398.50 509.00
## 2:      Sales   0   5.39   7.49   9.32  16.27
## 3:  CompPrice  77 115.00 125.00 135.00 175.00
## 4:     Income  21  42.75  69.00  91.00 120.00Filter rows/cases of complete dataset where conditions are true
options(width = 150)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','var','min','median','max'),filt="Urban=='Yes'")##     Attribute       Filter Count       mean      sum          var   min median    max
## 1: Population Urban=='Yes'   282 259.886525 73288.00 20913.488857 12.00 274.00 508.00
## 2:      Sales Urban=='Yes'   282   7.468191  2106.03     8.044141  0.37   7.42  16.27
## 3:  CompPrice Urban=='Yes'   282 125.634752 35429.00   246.545834 77.00 125.00 175.00
## 4:     Income Urban=='Yes'   282  69.343972 19555.00   743.706885 21.00  70.00 120.00options(width=150)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','median','IQR'),filt="Urban=='Yes' & Population>150")##     Attribute                        Filter Count       mean      sum median    IQR
## 1: Population Urban=='Yes' & Population>150   201 332.641791 66861.00 331.00 165.00
## 2:      Sales Urban=='Yes' & Population>150   201   7.573383  1522.25   7.49   3.74
## 3:  CompPrice Urban=='Yes' & Population>150   201 125.134328 25152.00 125.00  19.00
## 4:     Income Urban=='Yes' & Population>150   201  68.223881 13713.00  69.00  47.00This will be useful when we need to exclude redundant values like ‘999’ or ‘9999’ or ‘-9’ or ‘-1111’, or ‘888’ etc from each selected variable.
Eg:dat = data.frame(x = c(23,24,34,999,12,12,23,999,45), y = c(1,3,4,999,0,999,0,8,999,0)
Exclude 999:
x = c(23,24,34,12,12,23,45) y = c(1,3,4,0,0,8,0)
data_sam = Carseats[,]
data_sam[sample(1:400,30),"Sales"] <- 999
data_sam[sample(1:400,20),"CompPrice"] <- -9
data_sam[sample(1:400,45),"Income"] <- 999
ExpCustomStat(data_sam,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','min'),filt="All %ni% c(999,-9)")##     Attribute                   Filter Count       mean       sum min
## 1: Population Population%ni% c(999,-9)   400 264.840000 105936.00  10
## 2:      Sales      Sales%ni% c(999,-9)   370   7.589027   2807.94   0
## 3:  CompPrice  CompPrice%ni% c(999,-9)   380 124.984211  47494.00  77
## 4:     Income     Income%ni% c(999,-9)   355  68.447887  24299.00  21Different filters for each numeric variable. For example, below are the conditions (logic) for each variable summary analysis.
"Population" - Consider only Good ShelveLoc (the quality of the shelving location for the car seats at each site) ShelveLoc=='Good'
"Sales" - Inculde only those store belongs to Urban location (Urban==Yes)
"CompPrice" - Exclude Price is greater than 150
"Education" - All stores
"Income" - Inculde only stores in US (US==Yes)
Table: Descriptive summary for Price, Population, Sales, CompPrice, Income based on the filters.
options(width = 150)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Education","Income"),stat = c('Count','mean','sum','var','sd','IQR','median'),filt=c("ShelveLoc=='Good'^Urban=='Yes'^Price>=150^All^US=='Yes'"))##     Attribute            Filter Count       mean      sum          var         sd      IQR median
## 1: Population ShelveLoc=='Good'    85 267.047059 22699.00 16193.473950 127.253581 177.0000 272.00
## 2:      Sales      Urban=='Yes'   282   7.468191  2106.03     8.044141   2.836219   3.9175   7.42
## 3:  CompPrice        Price>=150    32 141.875000  4540.00   181.596774  13.475785  18.2500 142.50
## 4:  Education               All   400  13.900000  5560.00     6.867168   2.620528   4.0000  14.00
## 5:     Income         US=='Yes'   258  70.515504 18193.00   782.849953  27.979456  48.0000  70.00Descriptive summary for numerical variable by group level.
options(width = 150)
ExpCustomStat(Carseats,Cvar = c("Urban","ShelveLoc"), Nvar=c("Population","Sales"), stat = c('Count','Prop','mean','min','P0.25','median','p0.75','max'),gpby=FALSE)##      Level  Attribute  Group_by Count  Prop       mean   min    P0.25 median    p0.75    max
##  1:    Yes Population     Urban   282 70.50 259.886525 12.00 139.0000 274.00 376.7500 508.00
##  2:     No Population     Urban   118 29.50 276.677966 10.00 144.0000 271.00 408.0000 509.00
##  3:    Yes      Sales     Urban   282 70.50   7.468191  0.37   5.3750   7.42   9.2925  16.27
##  4:     No      Sales     Urban   118 29.50   7.563559  0.00   5.4400   7.67   9.3350  14.90
##  5:    Bad Population ShelveLoc    96 24.00 275.291667 10.00 145.5000 296.00 400.5000 501.00
##  6:   Good Population ShelveLoc    85 21.25 267.047059 14.00 176.0000 272.00 353.0000 503.00
##  7: Medium Population ShelveLoc   219 54.75 259.401826 12.00 124.0000 261.00 405.0000 509.00
##  8:    Bad      Sales ShelveLoc    96 24.00   5.522917  0.37   4.0525   5.21   7.4625  11.67
##  9:   Good      Sales ShelveLoc    85 21.25  10.214000  3.58   8.3300  10.50  11.9600  16.27
## 10: Medium      Sales ShelveLoc   219 54.75   7.306575  0.00   5.6250   7.38   8.7750  13.36options(width = 150)
ExpCustomStat(Carseats,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS','min','max','IQR','sd'), gpby = TRUE)##     Urban  US ShelveLoc Attribute Count  Prop      mean   sum    PS min max   IQR        sd
##  1:   Yes Yes       Bad CompPrice    51 12.75 124.88235  6369 12.74  93 161 17.00 14.768408
##  2:   Yes Yes      Good CompPrice    39  9.75 127.00000  4953  9.91  89 156 22.00 15.998355
##  3:   Yes Yes    Medium CompPrice    96 24.00 125.05208 12005 24.01  85 175 20.00 15.410434
##  4:   Yes  No       Bad CompPrice    23  5.75 125.52174  2887  5.78  86 157 21.50 17.367340
##  5:    No Yes       Bad CompPrice    11  2.75 122.45455  1347  2.69  88 149  8.50 17.494934
##  6:   Yes  No    Medium CompPrice    55 13.75 126.00000  6930 13.86  77 159 20.00 16.611018
##  7:    No  No    Medium CompPrice    29  7.25 121.86207  3534  7.07  89 154 21.00 16.012772
##  8:    No Yes    Medium CompPrice    39  9.75 126.33333  4927  9.86  96 150 18.50 14.411862
##  9:   Yes  No      Good CompPrice    18  4.50 126.94444  2285  4.57  95 157 21.00 16.049454
## 10:    No Yes      Good CompPrice    22  5.50 122.36364  2692  5.39 107 147 16.25 11.986284
## 11:    No  No      Good CompPrice     6  1.50 126.50000   759  1.52  96 142 13.75 16.610238
## 12:    No  No       Bad CompPrice    11  2.75 118.36364  1302  2.60 106 131 15.00  9.080449
## 13:   Yes Yes       Bad    Income    51 12.75  76.31373  3892 14.17  28 119 40.50 26.657449
## 14:   Yes Yes      Good    Income    39  9.75  66.89744  2609  9.50  21 117 49.00 28.163802
## 15:   Yes Yes    Medium    Income    96 24.00  68.83333  6608 24.06  25 120 41.25 27.070538
## 16:   Yes  No       Bad    Income    23  5.75  67.52174  1553  5.65  25 114 34.00 25.414508
## 17:    No Yes       Bad    Income    11  2.75  72.72727   800  2.91  21 120 52.50 33.400871
## 18:   Yes  No    Medium    Income    55 13.75  66.14545  3638 13.25  22 118 42.50 26.857248
## 19:    No  No    Medium    Income    29  7.25  62.44828  1811  6.59  22 120 60.00 32.077569
## 20:    No Yes    Medium    Income    39  9.75  69.05128  2693  9.81  21 119 55.00 30.884291
## 21:   Yes  No      Good    Income    18  4.50  69.72222  1255  4.57  24 113 62.25 31.765326
## 22:    No Yes      Good    Income    22  5.50  72.31818  1591  5.79  22 115 35.00 27.425026
## 23:    No  No      Good    Income     6  1.50  53.83333   323  1.18  30  78 40.25 22.542552
## 24:    No  No       Bad    Income    11  2.75  62.72727   690  2.51  25 105 17.00 23.524842
##     Urban  US ShelveLoc Attribute Count  Prop      mean   sum    PS min max   IQR        sdoptions(width = 150)
ExpCustomStat(Carseats,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS','median','IQR'), gpby = TRUE,filt="Urban=='Yes'")##     Urban  US ShelveLoc Attribute       Filter Count  Prop      mean   sum    PS median   IQR
##  1:   Yes Yes       Bad CompPrice Urban=='Yes'    51 18.09 124.88235  6369 17.98  125.0 17.00
##  2:   Yes Yes      Good CompPrice Urban=='Yes'    39 13.83 127.00000  4953 13.98  128.0 22.00
##  3:   Yes Yes    Medium CompPrice Urban=='Yes'    96 34.04 125.05208 12005 33.88  125.0 20.00
##  4:   Yes  No       Bad CompPrice Urban=='Yes'    23  8.16 125.52174  2887  8.15  125.0 21.50
##  5:   Yes  No    Medium CompPrice Urban=='Yes'    55 19.50 126.00000  6930 19.56  127.0 20.00
##  6:   Yes  No      Good CompPrice Urban=='Yes'    18  6.38 126.94444  2285  6.45  122.0 21.00
##  7:   Yes Yes       Bad    Income Urban=='Yes'    51 18.09  76.31373  3892 19.90   81.0 40.50
##  8:   Yes Yes      Good    Income Urban=='Yes'    39 13.83  66.89744  2609 13.34   69.0 49.00
##  9:   Yes Yes    Medium    Income Urban=='Yes'    96 34.04  68.83333  6608 33.79   69.0 41.25
## 10:   Yes  No       Bad    Income Urban=='Yes'    23  8.16  67.52174  1553  7.94   67.0 34.00
## 11:   Yes  No    Medium    Income Urban=='Yes'    55 19.50  66.14545  3638 18.60   69.0 42.50
## 12:   Yes  No      Good    Income Urban=='Yes'    18  6.38  69.72222  1255  6.42   73.5 62.25options(width = 150)
data_sam = Carseats[,]
data_sam[sample(1:400,30),"Sales"] <- 888
data_sam[sample(1:400,20),"CompPrice"] <- 999
data_sam[sample(1:400,45),"Income"] <- 999
ExpCustomStat(data_sam,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("Sales","CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS'), gpby = TRUE,filt="All %ni% c(888,999)")##     Urban  US ShelveLoc Attribute                   Filter Count  Prop       mean      sum    PS
##  1:   Yes Yes       Bad     Sales     Sales%ni% c(888,999)    45 12.16   5.602000   252.09  9.16
##  2:   Yes Yes      Good     Sales     Sales%ni% c(888,999)    36  9.73  10.905000   392.58 14.27
##  3:   Yes Yes    Medium     Sales     Sales%ni% c(888,999)    91 24.59   7.464396   679.26 24.68
##  4:   Yes  No       Bad     Sales     Sales%ni% c(888,999)    23  6.22   5.359130   123.26  4.48
##  5:    No Yes       Bad     Sales     Sales%ni% c(888,999)     8  2.16   5.486250    43.89  1.59
##  6:   Yes  No    Medium     Sales     Sales%ni% c(888,999)    51 13.78   6.816471   347.64 12.63
##  7:    No  No    Medium     Sales     Sales%ni% c(888,999)    25  6.76   6.422400   160.56  5.83
##  8:    No Yes    Medium     Sales     Sales%ni% c(888,999)    36  9.73   7.684722   276.65 10.05
##  9:   Yes  No      Good     Sales     Sales%ni% c(888,999)    18  4.86   9.210556   165.79  6.02
## 10:    No Yes      Good     Sales     Sales%ni% c(888,999)    20  5.41   9.993500   199.87  7.26
## 11:    No  No      Good     Sales     Sales%ni% c(888,999)     6  1.62   8.968333    53.81  1.96
## 12:    No  No       Bad     Sales     Sales%ni% c(888,999)    11  2.97   5.135455    56.49  2.05
## 13:   Yes Yes       Bad CompPrice CompPrice%ni% c(888,999)    46 12.11 124.413043  5723.00 12.06
## 14:   Yes Yes      Good CompPrice CompPrice%ni% c(888,999)    38 10.00 126.631579  4812.00 10.14
## 15:   Yes Yes    Medium CompPrice CompPrice%ni% c(888,999)    91 23.95 125.505495 11421.00 24.07
## 16:   Yes  No       Bad CompPrice CompPrice%ni% c(888,999)    23  6.05 125.521739  2887.00  6.08
## 17:    No Yes       Bad CompPrice CompPrice%ni% c(888,999)    11  2.89 122.454545  1347.00  2.84
## 18:   Yes  No    Medium CompPrice CompPrice%ni% c(888,999)    53 13.95 125.603774  6657.00 14.03
## 19:    No  No    Medium CompPrice CompPrice%ni% c(888,999)    27  7.11 122.259259  3301.00  6.96
## 20:    No Yes    Medium CompPrice CompPrice%ni% c(888,999)    38 10.00 126.763158  4817.00 10.15
## 21:   Yes  No      Good CompPrice CompPrice%ni% c(888,999)    16  4.21 125.750000  2012.00  4.24
## 22:    No Yes      Good CompPrice CompPrice%ni% c(888,999)    21  5.53 121.809524  2558.00  5.39
## 23:    No  No      Good CompPrice CompPrice%ni% c(888,999)     5  1.32 124.000000   620.00  1.31
## 24:    No  No       Bad CompPrice CompPrice%ni% c(888,999)    11  2.89 118.363636  1302.00  2.74
## 25:   Yes Yes      Good    Income    Income%ni% c(888,999)    37 10.42  66.729730  2469.00 10.07
## 26:   Yes Yes    Medium    Income    Income%ni% c(888,999)    88 24.79  68.568182  6034.00 24.61
## 27:   Yes  No       Bad    Income    Income%ni% c(888,999)    20  5.63  69.850000  1397.00  5.70
## 28:    No Yes       Bad    Income    Income%ni% c(888,999)    11  3.10  72.727273   800.00  3.26
## 29:   Yes  No    Medium    Income    Income%ni% c(888,999)    47 13.24  68.042553  3198.00 13.05
## 30:    No  No    Medium    Income    Income%ni% c(888,999)    26  7.32  63.692308  1656.00  6.76
## 31:    No Yes    Medium    Income    Income%ni% c(888,999)    31  8.73  67.741935  2100.00  8.57
## 32:   Yes  No      Good    Income    Income%ni% c(888,999)    13  3.66  68.846154   895.00  3.65
## 33:    No Yes      Good    Income    Income%ni% c(888,999)    20  5.63  74.250000  1485.00  6.06
## 34:   Yes Yes       Bad    Income    Income%ni% c(888,999)    45 12.68  77.066667  3468.00 14.15
## 35:    No  No      Good    Income    Income%ni% c(888,999)     6  1.69  53.833333   323.00  1.32
## 36:    No  No       Bad    Income    Income%ni% c(888,999)    11  3.10  62.727273   690.00  2.81
##     Urban  US ShelveLoc Attribute                   Filter Count  Prop       mean      sum    PSDifferent base for each numeric variable.
"Population" - Consider only Good ShelveLoc (the quality of the shelving location for the car seats at each site) ShelveLoc=='Good'
"Sales" - Inculde only those store belongs to Urban location (Urban==Yes)
"CompPrice" - Exclude Price is greater than 150
##     Urban  US  Attribute            Filter Count  Prop       mean      sum          var      IQR
##  1:   Yes Yes Population ShelveLoc=='Good'    39 45.88 277.538462 10824.00 18259.676113 176.0000
##  2:   Yes  No Population ShelveLoc=='Good'    18 21.18 219.888889  3958.00 13625.633987 155.5000
##  3:    No Yes Population ShelveLoc=='Good'    22 25.88 283.318182  6233.00 15122.512987 205.5000
##  4:    No  No Population ShelveLoc=='Good'     6  7.06 280.666667  1684.00 13183.066667 125.2500
##  5:   Yes Yes      Sales      Urban=='Yes'   186 65.96   7.710968  1434.24     8.734665   4.0775
##  6:   Yes  No      Sales      Urban=='Yes'    96 34.04   6.997813   671.79     6.445127   3.4175
##  7:   Yes Yes  CompPrice        Price>=150    16 50.00 142.187500  2275.00   223.362500  17.5000
##  8:    No Yes  CompPrice        Price>=150     7 21.88 138.428571   969.00   180.952381  16.0000
##  9:   Yes  No  CompPrice        Price>=150     7 21.88 143.285714  1003.00   156.571429  14.5000
## 10:    No  No  CompPrice        Price>=150     2  6.25 146.500000   293.00   112.500000   7.5000Reshapes a grouped data
options(width = 150)
ExpCustomStat(Carseats,Cvar = c("Urban"), Nvar=c("Population","Sales"), stat = c('Count','Prop'),gpby=TRUE,dcast=TRUE)## Row value   : Attribute
## Column value: Urban
## Statistics  : Count + Prop##     Attribute Count_No Count_Yes Prop_No Prop_Yes
## 1: Population      118       282    29.5     70.5
## 2:      Sales      118       282    29.5     70.5Example scripts
##Frequency table for categorical variables
ExpCustomStat(Carseats,Cvar=c("US","Urban","ShelveLoc"),gpby=FALSE)
##Crosstabulation between categorical variables
ExpCustomStat(Carseats,Cvar=c("US","Urban"),gpby=TRUE,filt=NULL)
ExpCustomStat(Carseats,Cvar=c("US","Urban","ShelveLoc"),gpby=TRUE,filt=NULL)
##Adding filters for custom tables
ExpCustomStat(Carseats,Cvar=c("US","Urban"),gpby=TRUE,filt="Population>150")
ExpCustomStat(Carseats,Cvar=c("US","ShelveLoc"),gpby=TRUE,filt="Urban=='Yes' & Population>150")
## Numeric variable summary
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','var','min','max'))
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('min','p0.25','median','p0.75','max'))
## Adding filters for complete data (like base Subset)
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','var'),filt="Urban=='Yes'")
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum'),filt="Urban=='Yes' & Population>150")
## Filter unique value from all the numeric variables
ExpCustomStat(data_sam,Nvar=c("Population","Sales","CompPrice","Income"),stat = c('Count','mean','sum','min'),filt="All %ni% c(999,-9)")
## Adding filters at variable level
ExpCustomStat(Carseats,Nvar=c("Population","Sales","CompPrice","Education","Income"),stat = c('Count','mean','sum','var','sd','IQR','median'),filt=c("ShelveLoc=='Good'^Urban=='Yes'^Price>=150^ ^US=='Yes'"))
##Numerical summaries by category
##Variable summary report (One group variable)
ExpCustomStat(Carseats,Cvar = c("Urban","ShelveLoc"), Nvar=c("Population","Sales"), stat = c('Count','Prop','mean','min','P0.25','median','p0.75','max'),gpby=FALSE)
##Variable summary report (More than One group variable)
ExpCustomStat(Carseats,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS','min','max','IQR','sd'), gpby = TRUE)
##Variable summary report (More than One group variable) with filter
ExpCustomStat(Carseats,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS','P0.25','median','p0.75'), gpby = TRUE,filt="Urban=='Yes'")
ExpCustomStat(data_sam,Cvar = c("Urban","US","ShelveLoc"), Nvar=c("Sales","CompPrice","Income"), stat = c('Count','Prop','mean','sum','PS'), gpby = TRUE,filt="All %ni% c(888,999)")
ExpCustomStat(Carseats,Cvar = c("Urban","US"), Nvar=c("Population","Sales","CompPrice"), stat = c('Count','Prop','mean','sum','var','min','max'), filt=c("ShelveLoc=='Good'^Urban=='Yes'^Price>=150"))