---
title: "Checking and Improving Results of package Synth"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Checking and Improving Results of package Synth}
  %\VignetteEncoding{UTF-8}
output: 
  html_vignette:
    toc: true
bibliography: ../inst/REFERENCES.bib
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
  fig.width  = 7,
  fig.height = 4,
  fig.align  = "center",
#  cache      = TRUE,
  autodep    = TRUE
)
```

## Introduction

This vignette illustrates the usage of `improveSynth`. For a more general introduction to package `MSCMT` see its [main vignette](WorkingWithMSCMT.html).

Estimating an SCM model involves searching for an approximate solution of a nested optimization problem. 
Although the formulation of the optimization problem is quite simple, finding a (good approximate) solution can be hard for several reasons; see @Mafia and @FastReliable.
While implementing package `MSCMT`, we put a lot of effort into the design of a smart and robust (but still fast) optimization procedure.
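
As a rough sketch of the nested structure (the symbols below are shorthand introduced here for illustration, not notation taken from the package documentation), let $X_1$ and $X_0$ hold the predictors of the treated unit and of the control units, and let $Z_1$ and $Z_0$ hold the corresponding values of the dependent variable over the $T_0$ optimization periods:

$$
\begin{aligned}
\min_{v \ge 0}\quad & \frac{1}{T_0}\,\lVert Z_1 - Z_0\, w^*(v) \rVert^2 \\
\text{where}\quad & w^*(v) \in \arg\min_{w \ge 0,\ \sum_j w_j = 1} (X_1 - X_0 w)^\top \operatorname{diag}(v)\,(X_1 - X_0 w).
\end{aligned}
$$

The outer objective corresponds to the dependent loss discussed below; the scaling by $1/T_0$ is an assumption that does not change the minimizer. A pair $(v, w)$ is feasible only if $w$ actually solves the inner problem for that $v$.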

Apart from function `mscmt` for the estimation of SCM models based on our model syntax, we also included the convenience function `improveSynth`,
which implements checks for feasibility and optimality of results delivered by package `Synth`. 
Below, we illustrate how to use `improveSynth`.


## First Example

We exemplify the usage of `improveSynth` based on the first example of function `synth` in package `Synth`.

### Generating the result of package `Synth`

The following code is essentially taken from the `Examples` section of the corresponding help page (all comments have been removed):

```{r}
library(Synth)
data(synth.data)
dataprep.out <-
  dataprep(
    foo = synth.data,
    predictors = c("X1", "X2", "X3"),
    predictors.op = "mean",
    dependent = "Y",
    unit.variable = "unit.num",
    time.variable = "year",
    special.predictors = list(
      list("Y", 1991, "mean"),
      list("Y", 1985, "mean"),
      list("Y", 1980, "mean")
    ),
    treatment.identifier = 7,
    controls.identifier = c(29, 2, 13, 17, 32, 38),
    time.predictors.prior = c(1984:1989),
    time.optimize.ssr = c(1984:1990),
    unit.names.variable = "name",
    time.plot = 1984:1996
  )

synth.out <- synth(dataprep.out)
```

### Checking the result

We check the result by applying function `improveSynth` to `synth.out` and `dataprep.out`:

```{r}
library(MSCMT)
synth2.out <- improveSynth(synth.out,dataprep.out)
```

Package `Synth` generated a (slightly) infeasible solution: the returned weight vector `w` for the control units is (slightly) suboptimal.
More importantly, the predictor weights `v` themselves are (considerably) suboptimal:
both the original dependent loss of `r round(synth.out$loss.v,6)` and
the dependent loss `r round(synth2.out$new.loss.v,6)` for the corrected `w`
are considerably larger than the dependent loss `r round(synth2.out$loss.v,6)` 
for the optimal predictor weights obtained by `improveSynth`.

## Second Example

In the second example, we modify the first example by allowing package `Synth` to use `genoud` as the (outer) optimization algorithm.

### Generating the result of package `Synth`

`genoud` is switched on by setting the function argument `genoud = TRUE`. Because the output is **very** verbose, we capture it with `capture.output`. Since the calculation is also quite lengthy, the results have been cached.^[To reproduce from scratch, please delete `"synth3.out.RData"` from the `vignettes` folder.]

```{r}
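if (file.exists("synth3.out.RData")) {
  # use the cached result if it is available
  load("synth3.out.RData")
} else {
  set.seed(42)
  # capture.output suppresses the very verbose output of genoud
  out <- capture.output(synth3.out <- synth(dataprep.out, genoud = TRUE))
}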
```
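
The chunk above only reads an existing cache file. As a minimal sketch of a load-or-compute pattern that also writes the cache, the following uses a hypothetical helper `cached` (not part of `MSCMT` or `Synth`) together with `saveRDS`/`readRDS` instead of `load`, and a temporary file instead of the `vignettes` folder:

```{r}
# Hypothetical helper, for illustration only (not part of MSCMT or Synth).
cached <- function(file, expr) {
  if (file.exists(file)) return(readRDS(file))  # cache hit: expr is never evaluated
  value <- expr                                 # cache miss: force the lazy argument
  saveRDS(value, file)                          # store the result for later runs
  value
}

cache_file <- tempfile(fileext = ".rds")        # temporary file, for illustration only
first  <- cached(cache_file, sum(1:10))         # computed and written to cache_file
second <- cached(cache_file, stop("not evaluated on a cache hit"))
identical(first, second)
```

Because R evaluates function arguments lazily, the expensive expression is only run when no cache file exists, so a lengthy call such as `synth(dataprep.out, genoud = TRUE)` could be wrapped in the same way.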

### Checking the result

We again check the result by applying function `improveSynth` to `synth3.out` and `dataprep.out`:

```{r}
synth4.out <- improveSynth(synth3.out,dataprep.out)
```

Now, package `Synth` generated a solution with a dependent loss of `r round(synth3.out$loss.v,6)` which is even smaller than the dependent loss `r round(synth2.out$loss.v,6)` obtained by `improveSynth`.
However, the solution generated by `Synth` is severely **infeasible**: the inner optimization failed and returned a suboptimal weight vector `w` for the control units, which in turn led to a wrong calculation of the dependent loss (which, of course, depends on `w`). 
Plugging in the true optimal `w` (depending on `v`) leads to a large increase of the dependent loss, which uncovers the suboptimality of `v`.
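
The mechanism can be reproduced with a toy problem. The following sketch is not how `improveSynth` is implemented; it uses made-up numbers, only two control units (so that `w = (a, 1 - a)` and the inner problem can be solved with base R's `optimize`), and assumes the dependent loss is the mean squared error over the optimization periods:

```{r}
# Toy data (all numbers are made up for illustration).
X1 <- c(1.0, 2.0)                                # predictors of the treated unit
X0 <- cbind(c(0.5, 2.5), c(1.5, 1.0))            # predictors of two controls (columns)
Z1 <- c(3.0, 3.2, 3.4)                           # dependent variable, treated unit
Z0 <- cbind(c(2.0, 2.1, 2.2), c(4.5, 4.6, 4.7))  # dependent variable, controls

# With two controls, w = (a, 1 - a), so the feasible set is the interval [0, 1].
inner_loss <- function(a, v) sum(v * (X1 - X0 %*% c(a, 1 - a))^2)
dep_loss   <- function(a) mean((Z1 - Z0 %*% c(a, 1 - a))^2)

v          <- c(0.9, 0.1)                        # some fixed predictor weights
a_star     <- optimize(inner_loss, interval = c(0, 1), v = v)$minimum
a_reported <- 0.56                               # hypothetical output of a failed inner optimizer

# Feasibility check: the reported w must attain the inner optimum for this v.
feasible <- inner_loss(a_reported, v) <= inner_loss(a_star, v) + 1e-8

c(reported = dep_loss(a_reported), corrected = dep_loss(a_star))
feasible
```

In this toy setting the reported `w` yields a smaller dependent loss than the corrected `w`, but it is not the inner optimum for the given `v`, so the reported loss cannot be attained by any feasible solution with these predictor weights.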

`improveSynth` is able to detect this severe problem and calculates an improved *and feasible* solution 
(the improved solution `r if(!isTRUE(all.equal(round(synth4.out$loss.v,6),round(synth2.out$loss.v,6)))) "essentially" else ""` matches the solution obtained from the first call to `improveSynth` above, 
with a dependent loss of `r round(synth4.out$loss.v,6)``r if(!isTRUE(all.equal(round(synth4.out$loss.v,6),round(synth2.out$loss.v,6)))) paste0(" as compared to ",round(synth2.out$loss.v,6)," above") else ""`).

## Summary

Issues with the inner and outer optimizers used in `synth` from package `Synth` may lead to infeasible or suboptimal solutions.
This vignette illustrated the usage of the convenience function `improveSynth` from package `MSCMT` for checking and potentially improving results obtained from `synth`. 

## References