Signal Extraction

Normal Updating

  • Random variable $X \sim N(\mu_X, \sigma^2_X)$, not directly observable
  • A signal $Y$ is observed, where $Y \sim N(\mu_Y, \sigma^2_Y)$

Assuming $X$ and $Y$ are jointly normal, the posterior is also normal, with the Bayesian updating formula

$$X|y \sim N(\mu_X + \frac{Cov(X,Y)}{Var(Y)}(y-\mu_Y), \sigma^2_X-\frac{Cov(X,Y)^2}{Var(Y)})$$

Or, using the fact that $Cov(X,Y) = \rho_{X,Y}\sqrt{Var(X)Var(Y)} = \rho_{X,Y}\sigma_X\sigma_Y$, we can rewrite this as

$$E(X|y) =\mu_X + \rho_{X,Y}\frac{\sigma_X}{\sigma_Y}(y-\mu_Y)$$

$$Var(X|y) =\sigma^2_X(1-\rho_{X,Y}^2)$$

For instance, if $\rho_{X,Y}=0$, so that $X$ and $Y$ are uncorrelated, then both the mean and the variance remain the same. If instead they are perfectly correlated ($\rho_{X,Y}=\pm 1$), seeing $Y$ is equivalent to knowing $X$: the posterior variance is zero and the posterior mean is $\mu_X \pm \frac{\sigma_X}{\sigma_Y}(y-\mu_Y)$.

This holds no matter how $X$ and $Y$ are correlated.
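
As a quick numerical sanity check of these formulas, here is a minimal Monte Carlo sketch (the parameter values and the observed signal $y$ below are arbitrary choices for illustration): among many joint draws of $(X,Y)$, the draws whose $Y$ lands near the observed value should have mean and variance close to what the posterior formulas predict.

In [ ]:
import numpy as np

# Arbitrary illustrative parameters (not from the text above)
mu_x, sigma_x = 1.0, 2.0
mu_y, sigma_y = 0.0, 3.0
rho = 0.6
y_obs = 1.5   # a hypothetical observed signal

# Posterior implied by the updating formula
post_mean = mu_x + rho*sigma_x/sigma_y*(y_obs - mu_y)
post_var = sigma_x**2*(1 - rho**2)

# Monte Carlo check: among joint draws, keep those whose Y lands near y_obs
rng = np.random.default_rng(0)
cov = [[sigma_x**2, rho*sigma_x*sigma_y],
       [rho*sigma_x*sigma_y, sigma_y**2]]
x, y = rng.multivariate_normal([mu_x, mu_y], cov, 2_000_000).T
keep = np.abs(y - y_obs) < 0.01
print(post_mean, x[keep].mean())   # should be close
print(post_var, x[keep].var())     # should be close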

A commonly seen example is when the signal is the true variable plus a noise.

$$Y=X+\epsilon$$

where $X$ and $\epsilon$ are independent and $\epsilon \sim N(0,\sigma^2_{\epsilon})$.

Then,

$$Cov(X,Y) =Cov(X,X+\epsilon) = Var(X)=\sigma^2_X$$

$$\rho_{X,Y} = \frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}}=\frac{\sigma^2_X}{\sigma_X \sigma_Y} =\frac{\sigma_X}{\sqrt{\sigma^2_X+\sigma^2_{\epsilon}}}$$

Then the posterior is (noting that $\mu_Y = E(X+\epsilon) = \mu_X$)

$$E(X|y) = \mu_X + \frac{\sigma^2_X}{\sigma^2_X+\sigma^2_{\epsilon}}(y-\mu_X)$$

$$Var(X|y) = \sigma^2_X\left(1-\frac{\sigma^2_X}{\sigma^2_Y}\right)=\sigma^2_X\left(1-\frac{\sigma^2_X}{\sigma^2_X+\sigma^2_{\epsilon}}\right)=\frac{\sigma^2_X\sigma^2_{\epsilon}}{\sigma^2_X+\sigma^2_{\epsilon}}$$

Typically, we like to write the updating equation as a function of precision, the inverse of variance: here $\phi_X = 1/\sigma^2_X$ and $\phi_{\epsilon} = 1/\sigma^2_{\epsilon}$.

$$E(X|y) = \mu_X + \frac{\phi_{\epsilon}}{\phi_{\epsilon}+\phi_X}(y-\mu_X)= \frac{\phi_X}{\phi_X+\phi_{\epsilon}}\mu_X+\frac{\phi_{\epsilon}}{\phi_{\epsilon}+\phi_X}y$$

$$Var(X|y) =\frac{1}{\phi_{\epsilon}+\phi_X}$$

A few intuitive observations from this:

  • The posterior mean is simply a weighted average of the prior mean and the signal.
  • A signal reduces variance, as $Var(X|y) = \frac{1}{\phi_{\epsilon}+\phi_X}< \frac{1}{\phi_X}=Var(X)$.
  • The greater $\phi_{\epsilon}$ is, meaning the less noise in the signal $Y$, the more the posterior mean is adjusted toward the signal.

More importantly, a few convenient properties of normal signal extraction:

  • The posterior variance does not depend on the realization of the signal $y$; it is a constant that depends only on the precisions.
  • The posterior mean is a linear combination of the prior mean and the signal.
  • These properties are super useful in many models we will study.
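
The equivalence between the variance form and the precision form is easy to verify numerically. A minimal sketch, with arbitrary illustrative values for $\mu_X$, $\sigma_X$, $\sigma_{\epsilon}$ and the realization $y$:

In [ ]:
import numpy as np

# Arbitrary illustrative values (assumptions, not from the text)
mu_x, sigma_x, sigma_eps = 0.0, 1.0, 0.5
y_obs = 0.8   # hypothetical signal realization

# Variance form
mean_v = mu_x + sigma_x**2/(sigma_x**2 + sigma_eps**2)*(y_obs - mu_x)
var_v = sigma_x**2*sigma_eps**2/(sigma_x**2 + sigma_eps**2)

# Precision form: posterior mean is a precision-weighted average
phi_x, phi_eps = 1/sigma_x**2, 1/sigma_eps**2
mean_p = phi_x/(phi_x + phi_eps)*mu_x + phi_eps/(phi_x + phi_eps)*y_obs
var_p = 1/(phi_x + phi_eps)

print(np.isclose(mean_v, mean_p), np.isclose(var_v, var_p))   # True True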

Multiple Signals

Another nice property is the linearity of updating with multiple signals. Suppose we observe $n$ noisy signals of the same $X$:

$$Y_i = X+\epsilon_i \quad \forall i=1,2,\dots,n$$

where the noises $\epsilon_i \sim N(0,\sigma^2_{\epsilon_i})$ are independent of $X$ and of one another. Then the posterior remains a linear combination of the signals, now with a higher precision.

$$E(X|\{y_i\}) = \frac{\phi_X}{\phi_X+\sum_i \phi_{\epsilon_i}}\mu_X + \sum_i \frac{\phi_{\epsilon_i}}{\phi_X+\sum_j \phi_{\epsilon_j}}y_i$$

$$Var(X|\{y_i\})=\frac{1}{\phi_X+\sum_i \phi_{\epsilon_i}}$$
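
One way to see this linearity in action: updating with the signals one at a time, treating each posterior as the next prior, gives exactly the same answer as the batch formula above. A minimal sketch with made-up numbers:

In [ ]:
import numpy as np

# Three independent noisy signals of the same X (all numbers are arbitrary)
mu_x, phi_x = 0.0, 1.0
phi_eps = np.array([4.0, 1.0, 2.0])   # assumed noise precisions
y = np.array([0.5, -0.2, 0.3])        # hypothetical realizations

# Batch formula
phi_post = phi_x + phi_eps.sum()
mean_batch = (phi_x*mu_x + (phi_eps*y).sum()) / phi_post

# Sequential updating: each posterior becomes the next prior
m, phi = mu_x, phi_x
for p, yi in zip(phi_eps, y):
    m = (phi*m + p*yi) / (phi + p)
    phi += p

print(np.isclose(mean_batch, m), np.isclose(1/phi_post, 1/phi))   # True True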

Simulation using Python

In [2]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
In [3]:
def NormalUpdating(mu_x, sigma_x, mu_y, sigma_y, rho_xy, n):
    # Draw n pairs (x, y) from a bivariate normal and compute E(x|y) for each draw
    if rho_xy > 1 or rho_xy < -1:
        raise ValueError('correlation coefficient only lies between -1 and 1')
    mean = [mu_x, mu_y]
    cov = [[sigma_x**2, rho_xy*sigma_x*sigma_y],
           [rho_xy*sigma_x*sigma_y, sigma_y**2]]
    true_draws, signals = np.random.multivariate_normal(mean, cov, n).T
    noise = signals - true_draws
    # Posterior mean formula: E(x|y) = mu_x + rho * (sigma_x/sigma_y) * (y - mu_y)
    post_mean = mu_x + rho_xy*sigma_x/sigma_y*(signals - mu_y)
    return true_draws, signals, post_mean, noise
In [4]:
## An example
true_mean = 0
true_sigma = 0.1
sig_mean = 0
sig_sigma = 0.2
rho_true_sig = 0.5
nn = 100

x, y, post_mean, noise = NormalUpdating(true_mean, true_sigma, sig_mean, sig_sigma, rho_true_sig, nn)
In [5]:
plt.figure(figsize=(10,7))
plt.title('Normal Updating')
plt.plot(x,'-',label=r'x')
plt.plot(y,'-',label=r'y')
plt.plot(post_mean,'r*',label=r'E(x|y)')
plt.legend(loc=2)

Impact of Different Degrees of Correlation

In [6]:
rho_list = np.linspace(0.01, 1, 20)
post_var_sim = np.zeros(len(rho_list))
nn = 1000

# For each correlation, simulate and record how much the posterior mean moves
for i in range(len(rho_list)):
    rho_true_sig = rho_list[i]
    x, y, post_mean, noise = NormalUpdating(true_mean, true_sigma, sig_mean, sig_sigma, rho_true_sig, nn)
    post_var_sim[i] = np.var(post_mean)

plt.title('Responsiveness to Signal as a Function of Correlation')
plt.plot(rho_list, post_var_sim)
plt.ylabel(r'$Var(E(X|y))$')
plt.xlabel(r'$\rho_{x,y}$')

An Application

The classical permanent income hypothesis assumes that agents observe permanent and transitory income shocks perfectly. This assumption may be too stringent. So instead of full-information rationality, we assume agents observe only total income and past permanent and transitory income, and have to infer the current-period permanent income.

Specifically,

$$Y_t = P_t + \epsilon_t$$

$$P_t = P_{t-1}+\theta_t$$

  • $\epsilon_t$ is the transitory shock, $\epsilon_t \sim N(0,\sigma^2_{\epsilon})$
  • $\theta_t$ is the permanent shock, $\theta_t \sim N(0,\sigma^2_{\theta})$

At time $t$, the agent can only observe $P_{t-1}$, $\epsilon_{t-1}$, and $Y_t$; she does not know $P_t$, and therefore does not know $\epsilon_t$ either.

She needs to form a conditional expectation of $P_t$ using the signal extraction model.

$$E(P_t|Y_t, P_{t-1}) =?$$

Prior mean for $P_t$ is $$E(P_t|P_{t-1})=P_{t-1}$$

Signal is $Y_t$,

$$Y_t = P_t+ \epsilon_t$$

where $\epsilon_t$ is i.i.d. noise.

Therefore we can apply the updating formula we derived above.

$$E(P_t|Y_t, P_{t-1})= \underbrace{P_{t-1}}_{\text{Prior Mean}} + \frac{Cov(Y_t,P_t|P_{t-1})}{Var(Y_t|P_{t-1})}(\underbrace{Y_t}_{\text{Signal}}-E(Y_t|P_{t-1}))$$

We know that, conditional on $P_{t-1}$, $$\frac{Cov(Y_t,P_t|P_{t-1})}{Var(Y_t|P_{t-1})}=\frac{\sigma^2_\theta}{\sigma^2_\theta+\sigma^2_\epsilon}$$

and $$E(Y_t|P_{t-1}) = P_{t-1}$$

So the posterior is

$$E(P_t|Y_t,P_{t-1}) = \frac{\sigma^2_\epsilon}{\sigma^2_\theta+\sigma^2_\epsilon}P_{t-1} + \frac{\sigma^2_\theta}{\sigma^2_\theta+\sigma^2_\epsilon}Y_t $$
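
To see the filter in action, here is a minimal simulation sketch of this income process (the shock volatilities below are assumed values, not a calibration): the agent's estimate $E(P_t|Y_t,P_{t-1})$ tracks the true permanent income $P_t$ while smoothing through the transitory noise.

In [ ]:
import numpy as np
import matplotlib.pyplot as plt

# Illustrative shock volatilities (assumed values)
sigma_theta, sigma_eps = 0.1, 0.2
T = 100
rng = np.random.default_rng(1)

theta = rng.normal(0, sigma_theta, T)   # permanent shocks
eps = rng.normal(0, sigma_eps, T)       # transitory shocks
P = np.cumsum(theta)                    # permanent income, P_0 normalized to 0
Y = P + eps                             # observed total income

# Signal weight on Y_t; the agent observes last period's true P_{t-1}
k = sigma_theta**2/(sigma_theta**2 + sigma_eps**2)
P_lag = np.concatenate(([0.0], P[:-1]))
P_hat = (1 - k)*P_lag + k*Y

plt.title('Signal Extraction of Permanent Income')
plt.plot(P, '-', label=r'$P_t$')
plt.plot(P_hat, 'r*', label=r'$E(P_t|Y_t,P_{t-1})$')
plt.legend(loc=2)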