Finance and Economics Discussion Series: 2014-75

Robust Dynamic Optimal Taxation and Environmental Externalities *

Xin Li
IMF
Borghan Narajabad
Federal Reserve Board
Ted Temzelides
Rice University

First Version: 11/26/2012
Current Version: 05/29/2014



Abstract:

We study a dynamic stochastic general equilibrium model in which agents are concerned about model uncertainty regarding climate change. An externality from greenhouse gas emissions damages the economy's capital stock. We assume that the mapping from climate change to damages is subject to uncertainty, and we use robust control theory techniques to study efficiency and optimal policy. We obtain a sharp analytical solution for the implied environmental externality and characterize dynamic optimal taxation. A small increase in the concern about model uncertainty can cause a significant drop in optimal fossil fuel use. The optimal tax that restores the socially optimal allocation is Pigouvian. Under more general assumptions, we develop a recursive method and solve the model computationally. We find that the introduction of uncertainty matters qualitatively and quantitatively. We study optimal output growth in the presence and in the absence of concerns about uncertainty and find that these concerns can lead to substantially different conclusions.

Keywords: Climate change, optimal dynamic taxation, uncertainty, robustness

JEL Classification: Q54, Q21, Q58, Q48, D81

1 Introduction

We study optimal taxation in a dynamic stochastic general equilibrium model in which agents are concerned about model uncertainty. We assume that an externality through global temperature changes resulting from greenhouse gas (GHG) emissions adversely affects the economy's capital stock and, thus, its output. The precise effects of this externality are subject to uncertainty. Most existing approaches, however, incorporate the uncertainty associated with climate change only in a limited way (Stern, 2013). To fill this gap, we focus on the implications of this uncertainty. In order to model the effect of the emissions created by economic activity on the environment, we employ the framework used in Golosov, Hassler, Krusell, and Tsyvinski (2013, GHKT hereafter).1 While they assume that the mapping from climate change to damages is subject to risk, in our model this mapping is subject to Knightian uncertainty. We study the implications of this assumption using a robust control approach. We believe that this is an appropriate application of uncertainty in economic modeling. After all, man-made climate change is unprecedented, and there is an ongoing heated debate about its potential effects. Although our model does not include the risks of large-scale human migration or conflict resulting from climate change, it proposes a robust control approach as an alternative to standard probability distribution-based modeling. More specifically, a social planner in our model who is concerned about model uncertainty maximizes social welfare under a "worst-case scenario."

Beyond our treatment of model uncertainty, our assumptions differ from those in GHKT in two ways. First, we find it convenient to assume that the environmental externality affects output indirectly, through the capital stock. As a result, the theoretical analysis in our model yields different results, although the two assumptions lead to identical results if we assume 100 percent capital depreciation (as we do in the computational part). Second, our estimates of total fossil fuel supplies are significantly larger than theirs. This is partly due to the addition of the supply of unconventional oil and gas, but mainly due to our consideration of estimated methane hydrate resources.2

Under additional assumptions, we obtain a sharp analytical solution for the implied pollution externality, and we characterize dynamic optimal taxation. A small increase in the concern about model uncertainty can cause a significant drop in optimal energy extraction. The optimal tax, which restores the socially optimal allocation, is Pigouvian. Under more general assumptions, we develop a simple recursive method that allows us to solve the model computationally. We find that the introduction of uncertainty matters in the sense that our model produces results that are qualitatively different from those of GHKT, for example in terms of oil consumption. At the same time, concerns about uncertainty do not affect renewable energy adoption. The reason is that the margin that determines short-term decisions regarding energy sources is driven not by renewable energy use but by two factors: the trade-off between higher and lower total energy consumption, and the choice of coal versus gas/oil. We find that oil use in our model can be flat for some parametrizations. We study optimal output growth in the presence and in the absence of concerns about uncertainty and find that the results can be very different. In the worst-case scenario, optimality implies that a small sacrifice in yearly output can prevent a large future welfare loss.

As the green energy sector does not create emissions in our model, we find that the optimal path for the use of green energy does not directly depend on the level of concern about model uncertainty. However, since green energy, coal, and oil are substitutes, model uncertainty does indirectly affect the use of green energy, through its impact on coal and oil. We also find that an increase in the concern about model uncertainty causes a significant decline in the use of coal, while the use of oil is slightly delayed. Holding other parameters fixed, the optimal path of oil consumption is jointly determined by the resource scarcity effect and the model uncertainty effect. Naturally, we do not find a significant difference in oil consumption when the scarcity effect dominates. However, when we consider a higher level of initial resources of fossil fuel, the concern about model uncertainty substantially discourages the use of oil.

Kolstad (1996) discusses uncertainty in integrated assessment models but does not employ techniques from robust control. Existing work that employs robust control or related techniques in order to address issues related to model uncertainty includes Hennlock (2008, 2009), Funke and Paetz (2010), Sterner and Hennlock (2011), and Lemoine and Traeger (2011). These papers employ a version of Nordhaus's DICE model, whereas we build our analysis closely on GHKT (2013), which is consistent with the DICE model. Using GHKT allows us to derive analytical results under a set of additional assumptions. In related recent work, Weitzman (2014) considers the social costs of carbon when catastrophic climate-related events follow a fat-tailed distribution.3 In addition to building on GHKT, our paper relies on existing work in robust control theory from both economics and engineering. In the traditional stochastic control literature, uncertainties in the system are modeled using probability distributions. The goal there is to derive a policy that works best "on average." In contrast, given a bound on uncertainty, robust control is concerned with optimizing performance under a so-called worst-case scenario.4 Hansen and Sargent (2001) introduce techniques from robust control theory to dynamic economic decision-making problems.5 They point out the connection between the max-min expected utility theory of Gilboa and Schmeidler (1989) and the applications of robust control theory proposed by Anderson et al. (2000) and Dupuis et al. (1998). Hansen, Sargent, Turmuhambetova and Williams (2006) give a thorough introduction to the robust control approach. They discuss applications to a wide range of problems within the linear-quadratic-Gaussian framework.6

As is standard in the robust control literature, our paper postulates the problem of optimal fossil fuel extraction as a two-person zero-sum dynamic game: in each stage, a social planner (a representative household in the decentralized version) maximizes social welfare (lifetime utility) by choosing the level of energy extraction, consumption, labor and capital investment. Subsequently, a malevolent player chooses alternative distributions in order to minimize the respective payoff. Our work contributes to the existing literature of applications of robust control in economics in two ways. First, it explores a class of models under a non-quadratic objective and non-linear constraints. In that regard, we demonstrate that models of the type used in GHKT (2013) can be restated in a robust control framework. We then derive some sharp analytical results, and compute the resulting model numerically. Second, we employ the exponential distribution as the approximating distribution. While existing studies usually employ the linear-quadratic model combined with Gaussian distributions in order to produce analytical solutions, our work shows that the approximating distribution for models with log-utility and full depreciation of capital can be drawn from either the normal or the exponential family.

The paper proceeds as follows. Section 2 presents the basic model. Section 3 studies the model analytically, while Section 4 presents our numerical and quantitative findings. A brief conclusion follows. Technical material appears in the appendices.

2 The Model

In order to characterize the optimal policy for the case where there is a concern about climate change and model uncertainty, we first formulate a general framework for the robust planner's problem, a benchmark that we will subsequently compare to decentralized market solutions.

Time,  t , is discrete and the horizon is infinite. The world economy is populated by a  [0,1] -continuum of infinitely lived representative agents with utility

\displaystyle E_{0}\sum\limits _{t=0}^{\infty}\beta^{t}u(C_{t}). (1)

The function  u is a standard concave period utility function,  C_{t} represents final-good consumption in period  t , and  \beta\in(0,1) is the discount factor. The final goods sector uses energy,  E , capital,  K , and labor,  N , to produce output. Labor supply is inelastic. The economy's capital stock depreciates at rate  \delta\in(0,1) . Henceforth,  \tilde{K} represents the end-of-period capital (before interacting with the climate factor through the process described below). The feasibility constraint in the final goods sector is given by
\displaystyle C_{t}+\tilde{K}_{t+1}=Y_{t}+(1-\delta)K_{t}. (2)

There are four production sectors. The final-goods sector, indexed by  i=0 , produces the consumption good. The corresponding production function is given by  Y=F(K,N_{0},E) . Thus, in addition to capital and labor, production of the final good requires the use of energy,  E . The three energy-producing sectors for oil, coal, and green energy (labelled by  i=1,2,3 , respectively) produce energy amounts  E_{1} ,  E_{2} and  E_{3} (measured in carbon equivalents). The oil sector is assumed to produce oil at zero cost. We denote by  R the total oil energy stock, and we impose the resource constraint,  R_{t}\geq0 , for all  t . Both the coal and the green energy sectors use linear technologies
\displaystyle E_{i}=A_{i}N_{i},\quad i=2,3. (3)

We follow GHKT in modeling a simplified carbon cycle as follows. The variable  S (measured in units of carbon content) represents the GHG concentration in the atmosphere in excess of the pre-industrial level. We denote by  P and  T the permanent and temporary components of  S , respectively. These evolve according to the following equations.


\displaystyle P^{\prime}=P+\phi_{L}(E_{1}+E_{2}), (4)
\displaystyle T^{\prime}=(1-\phi)T+(1-\phi_{L})\phi_{0}(E_{1}+E_{2}), (5)
\displaystyle S^{\prime}=P^{\prime}+T^{\prime}. (6)
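
As a rough illustration of equations (4)-(6), the following Python sketch iterates the carbon cycle for a constant fossil energy input. The function name `carbon_step` and every numerical value, including the defaults for phi_L, phi_0, and phi, are placeholders chosen for illustration; they are not the calibration used in this paper.

```python
def carbon_step(P, T, E_fossil, phi_L=0.2, phi_0=0.393, phi=0.0228):
    """Advance the carbon cycle one period, following equations (4)-(6).

    P, T     : permanent and temporary components of S
    E_fossil : fossil energy use E_1 + E_2 in the current period
    The default parameter values are illustrative assumptions only.
    """
    P_next = P + phi_L * E_fossil                                # equation (4)
    T_next = (1.0 - phi) * T + (1.0 - phi_L) * phi_0 * E_fossil  # equation (5)
    S_next = P_next + T_next                                     # equation (6)
    return P_next, T_next, S_next


# Iterate three periods from a clean start with constant fossil energy use.
P, T = 0.0, 0.0
path = []
for _ in range(3):
    P, T, S = carbon_step(P, T, E_fossil=10.0)
    path.append((P, T, S))
```

Setting `phi_L=0` and `phi=0` in this sketch collapses the system to the single law of motion S' = S + phi_0 E.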

We introduce model uncertainty regarding climate change through a stochastic variable,  \gamma , which reduces the end-of-period capital stock  \tilde{K}^{\prime} by a factor of  h(S^{\prime},\gamma) to  K^{\prime} . That is,  K^{\prime}=h(S^{\prime},\gamma)\tilde{K}^{\prime} . While  \gamma directly affects output in GHKT, we find it convenient to assume that  \gamma adversely affects the economy's capital stock. The two assumptions are identical under a Cobb-Douglas production function and an exponential damage function (which we assume throughout this paper).7 We use  \pi(\gamma) to denote the approximating distribution of  \gamma , while  \hat{\pi}(\gamma) denotes the welfare-minimizing distribution and  m(\gamma)=\frac{\hat{\pi}(\gamma)}{\pi(\gamma)} is the likelihood ratio. The distance,  \rho , between  \hat{\pi}(\gamma) and  \pi(\gamma) is measured by relative entropy:
\displaystyle \rho(\hat{\pi}(\gamma),\pi(\gamma))\equiv E[m(\gamma)\log m(\gamma)]\equiv\hat{E}[\log m(\gamma)]\equiv\int[m(\gamma)\log m(\gamma)]\pi(\gamma)\mathrm{d}\gamma. (7)

As is standard in robust control, the concern about model uncertainty is represented by a two-person zero-sum dynamic game in which, after observing the choice of a social planner, a malevolent player chooses the worst specification of the model in each period. We restrict attention to a particular type of equilibrium, the so-called Markov perfect (or feedback) equilibrium, which is strongly time-consistent. The game proceeds as follows. At the beginning of a period, the state value of  (K,N,P,T,R) is revealed. Then, the planner chooses  (C,E_{i},N_{i},\tilde{K}^{\prime},P^{\prime},T^{\prime},S^{\prime},R^{\prime}) in order to maximize social welfare. After observing the planner's choice, nature (the "malevolent player") chooses an alternative distribution  \hat{\pi}(\gamma) or, equivalently,  m(\gamma) , to minimize welfare. Any deviation from the approximating distribution is penalized by adding  \alpha\rho(\hat{\pi}(\gamma),\pi(\gamma)) to the objective function. Here,  \alpha represents the magnitude of the "punishment." A greater  \alpha means a greater penalty associated with the deviation of  \gamma from its approximating distribution and thus a lower concern about robustness.

This leads to the following social planner's problem:

\displaystyle V(K,N,P,T,R)=\max_{\{C,E_{i},N_{i},\tilde{K}^{\prime},P^{\prime},T^{\prime},S^{\prime},R^{\prime}\}}\min_{m(\gamma)}\left\{ u(C)+\beta\int\!\left[m(\gamma)V(K^{\prime},N^{\prime},P^{\prime},T^{\prime},R^{\prime})+\alpha m(\gamma)\log m(\gamma)\right]\pi(\gamma)\mathrm{d}\gamma\right\}
\displaystyle \text{s.t.}
\displaystyle E_{i}=A_{i}N_{i},\quad i=2,3
\displaystyle E=(\kappa_{1}E_{1}^{\rho}+\kappa_{2}E_{2}^{\rho}+\kappa_{3}E_{3}^{\rho})^{1/\rho}
\displaystyle N=N_{0}+N_{2}+N_{3}
\displaystyle \tilde{K}^{\prime}=F(K,N_{0},E)+(1-\delta)K-C
\displaystyle K^{\prime}=h(S^{\prime},\gamma)\tilde{K}^{\prime}
\displaystyle R^{\prime}=R-E_{1}\geq0
\displaystyle N^{\prime}=A_{N}N
\displaystyle P^{\prime}=P+\phi_{L}(E_{1}+E_{2})
\displaystyle T^{\prime}=(1-\phi)T+(1-\phi_{L})\phi_{0}(E_{1}+E_{2})
\displaystyle S^{\prime}=P^{\prime}+T^{\prime}
\displaystyle 1=\int m(\gamma)\pi(\gamma)\mathrm{d}\gamma

The social planner's problem can be solved analytically under a set of additional assumptions, and we will focus on the analytical solution first. We will discuss the decentralized problem and show that the socially optimal allocation can be restored by imposing appropriate fossil fuel taxes on the energy-producing sector.

3 The Analytical Solution

In this section, we will make the following additional assumptions. While these assumptions are admittedly strong, they allow us to fully solve the model analytically. As we shall see, certain aspects of the solution remain instructive in the next section, when the restrictive assumptions are dropped and the model is solved numerically.

(A1) The period utility function is given by  u(C)=\log(C) .

(A2) Capital depreciates fully; i.e.,  \delta=1 .

(A3) The production function is given by  F(K,N_{0},E)=A_{0}K^{\theta}N_{0}^{1-\theta-\nu}E^{\nu} .

(A4) The damage function is given by  h(S^{\prime},\gamma)=e^{-S^{\prime}\gamma} .8

(A5) The approximating distribution for  \gamma is exponential with mean  \lambda^{-1} and variance  \lambda^{-2} ; i.e.,  \pi(\gamma)=\lambda e^{-\lambda\gamma} .9

(A6.1)  \phi_{L}=0 .10

(A6.2)  \phi=0 .

(A7) There is a single fossil energy sector producing oil at zero cost. Production is subject to a resource feasibility constraint:  R^{\prime}\geq0 . As a result,  N_{1}=0 and  N_{0}=N .

(A8) There is no population growth, and the aggregate labor supply is normalized to 1 . That is,  A_{N}=1 and  N=1 in all periods.

(A9) There is no technology improvement. That is,  A_{0} is constant over time. We normalize  A_{0}=1 .

(A10) The resource feasibility constraint is not binding.

We will first solve the social planner's problem. We will then discuss the decentralized problem and show that the socially optimal allocation can be restored by implementing fossil fuel taxes on the energy-producing sector.

Under A1-A10, the social planner's problem can be rewritten as:

\displaystyle V(K,S)=\max_{\{C,E,\tilde{K}^{\prime},S^{\prime}\}}\min_{m(\gamma)}\left\{ u(C)+\beta\int\!\left[m(\gamma)V(K^{\prime},S^{\prime})+\alpha m(\gamma)\log m(\gamma)\right]\pi(\gamma)\mathrm{d}\gamma\right\} (8)
\displaystyle \text{s.t.}
\displaystyle \tilde{K}^{\prime}=F(K,E)-C
\displaystyle K^{\prime}=h(S^{\prime},\gamma)\tilde{K}^{\prime}
\displaystyle S^{\prime}=S+\phi_{0}E
\displaystyle 1=\int m(\gamma)\pi(\gamma)\mathrm{d}\gamma (9)

where  h(S^{\prime},\gamma)=e^{-S^{\prime}\gamma} and  F(K,E)=K^{\theta}E^{\nu} . To solve this problem, we first guess that  V(\cdot) takes the form
\displaystyle V(K^{\prime},S^{\prime})=f(S^{\prime})+\bar{A}\log(K^{\prime})+\bar{D}=f(S^{\prime})+\bar{A}\log(h(S^{\prime},\gamma)\tilde{K}^{\prime})+\bar{D} (10)

where  \bar{A} and  \bar{D} are undetermined coefficients. The functional form for  f(\cdot) will be derived when we solve the minimizing player's problem.

First, we define the robustness problem (the inner minimization problem) by

\displaystyle \mathcal{R}(V)(\tilde{K}^{\prime},S^{\prime})=\min_{m(\gamma)}\int\left[m(\gamma)V(K^{\prime},S^{\prime})+\alpha m(\gamma)\log m(\gamma)\right]\pi(\gamma)\mathrm{d}\gamma
\displaystyle \text{s.t.}
\displaystyle K^{\prime}=e^{-S^{\prime}\gamma}\tilde{K}^{\prime}
\displaystyle 1=\int m(\gamma)\pi(\gamma)\mathrm{d}\gamma

The first-order condition for  m(\gamma) implies that
\displaystyle m^{\ast}(\gamma)=\frac{\exp(-\frac{V(K^{\prime},S^{\prime})}{\alpha})}{\int\exp(-\frac{V(K^{\prime},S^{\prime})}{\alpha})\pi(\gamma)\mathrm{d}\gamma}=(1-\Delta S^{\prime})e^{\Delta S^{\prime}\lambda\gamma}    

or, equivalently,
\displaystyle \hat{\pi}^{\ast}(\gamma)=m^{\ast}(\gamma)\pi(\gamma)=\lambda^{\ast}e^{-\lambda^{\ast}\gamma},    

where we define  \Delta=\frac{\bar{A}}{\alpha\lambda} and  \lambda^{\ast}=\lambda(1-\Delta S^{\prime}) . The worst-case distribution of  \gamma remains exponential with a distorted mean  (\lambda^{\ast})^{-1} and variance  (\lambda^{\ast})^{-2} . Therefore,
\displaystyle \mathcal{R}(V)(\tilde{K}^{\prime},S^{\prime})=\int\left[m^{\ast}(\gamma)V(K^{\prime},S^{\prime})+\alpha m^{\ast}(\gamma)\log m^{\ast}(\gamma)\right]\pi(\gamma)\mathrm{d}\gamma
\displaystyle =-\alpha\log\left[\int\exp\left(-\frac{V(K^{\prime},S^{\prime})}{\alpha}\right)\pi(\gamma)\mathrm{d}\gamma\right]. (11)
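
The equality in (11) does not depend on the exponential specification and can be checked numerically. The Python sketch below replaces the continuous shock with a hypothetical three-point distribution. The values in `V`, `pi`, and `alpha`, as well as all function names, are illustrative assumptions rather than objects from the model.

```python
import math


def robust_value(V, pi, alpha):
    """Closed form in (11): -alpha * log E_pi[exp(-V / alpha)]."""
    return -alpha * math.log(sum(p * math.exp(-v / alpha) for v, p in zip(V, pi)))


def worst_case_m(V, pi, alpha):
    """Likelihood ratio m* proportional to exp(-V / alpha), with E_pi[m*] = 1."""
    weights = [math.exp(-v / alpha) for v in V]
    norm = sum(p * w for p, w in zip(pi, weights))
    return [w / norm for w in weights]


def penalised_objective(m, V, pi, alpha):
    """E_pi[m * V + alpha * m * log(m)] for an arbitrary likelihood ratio m."""
    return sum(p * (mi * v + alpha * mi * math.log(mi))
               for mi, v, p in zip(m, V, pi))


# Hypothetical continuation values and approximating probabilities.
V = [1.0, 0.2, -0.5]
pi = [0.5, 0.3, 0.2]
alpha = 2.0

m_star = worst_case_m(V, pi, alpha)
value_at_m_star = penalised_objective(m_star, V, pi, alpha)
value_undistorted = penalised_objective([1.0, 1.0, 1.0], V, pi, alpha)
```

In this example the penalised objective evaluated at `m_star` coincides with `robust_value(V, pi, alpha)` up to rounding, and it lies below the undistorted expectation obtained with m = 1.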

Substituting equation (10) into equation (11), we obtain
\displaystyle \mathcal{R}(V)(\tilde{K}^{\prime},S^{\prime})=f(S^{\prime})+\bar{A}\log(\tilde{K}^{\prime})+\bar{D}+H(S^{\prime};\alpha,\bar{A}),    

where  H(S^{\prime};\alpha,\bar{A}) , the robust version of the externality from carbon emissions, is given by
\displaystyle H(S^{\prime};\alpha,\bar{A})=-\alpha\log[\int h^{-\frac{\bar{A}}{\alpha}}(S^{\prime},\gamma)\pi(\gamma)\mathrm{d}\gamma]    

It follows from (A4)-(A5) that
\displaystyle H(S^{\prime};\alpha,\bar{A})=\alpha\log(1-\Delta S^{\prime}).    
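
For completeness, the integral behind this expression can be written out as follows. This is a sketch of the intermediate step under (A4)-(A5); it assumes $\Delta S^{\prime}<1$, so that the integral converges.

```latex
\begin{align*}
\int_{0}^{\infty} h^{-\bar{A}/\alpha}(S^{\prime},\gamma)\,\pi(\gamma)\,\mathrm{d}\gamma
  &= \int_{0}^{\infty} e^{(\bar{A}/\alpha)S^{\prime}\gamma}\,\lambda e^{-\lambda\gamma}\,\mathrm{d}\gamma
   = \frac{\lambda}{\lambda-\bar{A}S^{\prime}/\alpha}
   = \frac{1}{1-\Delta S^{\prime}},\\
H(S^{\prime};\alpha,\bar{A})
  &= -\alpha\log\frac{1}{1-\Delta S^{\prime}}
   = \alpha\log(1-\Delta S^{\prime}).
\end{align*}
```

Since $\Delta=\bar{A}/(\alpha\lambda)$, a first-order expansion of the logarithm gives $H\rightarrow-\bar{A}S^{\prime}/\lambda$ as $\alpha\rightarrow\infty$, which is $\bar{A}$ times the expected log damage, $-S^{\prime}/\lambda$, under the approximating distribution.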

Next, we define the optimal choice problem (the outer maximization problem). Using the analysis above, this problem can be written as
\displaystyle V(K,S)=\max_{\{C,E,\tilde{K}^{\prime},S^{\prime}\}}\{\log(C)+\beta\mathcal{R}(V)(\tilde{K}^{\prime},S^{\prime})\}    

or equivalently,
\displaystyle f(S)+\bar{A}\log(K)+\bar{D}=\max_{C,E}\left\{ \log(C)+\beta\left[f(S^{\prime})+\bar{A}\log(\tilde{K}^{\prime})+\bar{D}+H(S^{\prime};\alpha,\bar{A})\right]\right\}
\displaystyle \text{s.t.}
\displaystyle \tilde{K}^{\prime}=F(K,E)-C
\displaystyle S^{\prime}=S+\phi_{0}E
\displaystyle H(S^{\prime};\alpha,\bar{A})=\alpha\log(1-\Delta S^{\prime}).

The first-order conditions imply
\displaystyle C=\frac{F(K,E)}{1+\beta\bar{A}} (12)
\displaystyle -\phi_{0}\left[\frac{\partial f(S^{\prime})}{\partial S^{\prime}}+\frac{\partial H(S^{\prime};\alpha,\bar{A})}{\partial S^{\prime}}\right]=\frac{1+\beta\bar{A}}{\beta}\frac{\frac{\partial F(K,E)}{\partial E}}{F(K,E)}. (13)

Noting that  H(S;\alpha,\bar{A}) is a logarithmic function of  S , we guess that  f(S)=\bar{B}\log(1-\Delta S) , where  \bar{B} is an undetermined coefficient. As a result, the first-order conditions above simplify to
\displaystyle C=\frac{K^{\theta}E^{\nu}}{1+\beta\bar{A}}
\displaystyle \frac{\beta\phi_{0}\Delta(\alpha+\bar{B})}{1-\Delta S^{\prime}}=\frac{\nu(\beta\bar{A}+1)}{E}

After some derivations, we obtain
\displaystyle \bar{A}=\frac{\theta}{1-\beta\theta}
\displaystyle \bar{B}=\frac{1}{1-\beta}\left[\alpha\beta+\frac{\nu}{1-\beta\theta}\right]

The expression for  \bar{D} is more complicated and less intuitive. Substituting  \bar{A}=\frac{\theta}{1-\beta\theta} into the first-order conditions, we obtain the optimal allocation. We summarize the above discussion in the following.


\begin{proposition} Assume that (A1)-(A10) hold. The... ...Delta}$ and $\lambda^{\ast}=\lambda(1-\Delta S^{\prime\ast})$. \end{proposition}
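
The simplified first-order condition for  E , together with the coefficients  \bar{A} and  \bar{B} above, pins down the constant in an affine rule of the form  E=c_{E}(1-\Delta S) . The Python sketch below works under that guess. The closed form used for `c_E` is our own algebra from the displayed first-order condition, not a formula quoted from the proposition, and the parameter values and function names are illustrative assumptions.

```python
def coefficients(alpha, beta, theta, nu, lam, phi0):
    """Return (A_bar, B_bar, Delta, c_E) for the guess E = c_E * (1 - Delta * S)."""
    A_bar = theta / (1.0 - beta * theta)
    B_bar = (alpha * beta + nu / (1.0 - beta * theta)) / (1.0 - beta)
    Delta = A_bar / (alpha * lam)
    # Assumption: c_E is obtained by substituting the affine guess into the
    # simplified first-order condition for E and solving for the constant.
    c_E = (1.0 - beta) * nu / (
        phi0 * Delta * (alpha * beta * (1.0 - beta * theta) + nu))
    return A_bar, B_bar, Delta, c_E


def emissions(S, alpha, beta, theta, nu, lam, phi0):
    """Emissions implied by the affine rule at carbon stock S."""
    _, _, Delta, c_E = coefficients(alpha, beta, theta, nu, lam, phi0)
    return c_E * (1.0 - Delta * S)


def foc_residual(S, E, alpha, beta, theta, nu, lam, phi0):
    """Left side minus right side of the simplified first-order condition for E."""
    A_bar, B_bar, Delta, _ = coefficients(alpha, beta, theta, nu, lam, phi0)
    S_next = S + phi0 * E
    lhs = beta * phi0 * Delta * (alpha + B_bar) / (1.0 - Delta * S_next)
    rhs = nu * (beta * A_bar + 1.0) / E
    return lhs - rhs


# Illustrative parameter values (not the paper's calibration).
params = dict(beta=0.9, theta=0.3, nu=0.04, lam=1.0, phi0=0.5)
E_low = emissions(0.5, alpha=2.0, **params)
E_high = emissions(0.5, alpha=4.0, **params)
```

With these placeholder values the residual of the first-order condition at `E_low` is zero up to rounding, and emissions are higher under the larger penalty parameter.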

A few technical remarks are in order. First, the function  V(K,S) is increasing in  K , decreasing in  S , and jointly concave in  K and  S . The value of  \bar{A} is the same as in the model without concern about model uncertainty. Both  E^{\ast} and  S^{\prime\ast} are affine functions of  S . In addition, it can be shown that, given  S , both  E^{\ast} and  S^{\prime\ast} are increasing functions of  \alpha . This is intuitive, since a greater  \alpha implies a larger penalty from a deviation of  \gamma from its approximating distribution, and thus a lower concern about model uncertainty. Note that  C^{\ast} is affected by  S only through  E^{\ast} . This is due to logarithmic utility. As a result, a greater concern about model uncertainty will lower both  E^{\ast} and  C^{\ast} . The value of the externality from one unit of emissions evaluated at  E^{\ast} is given by

\displaystyle \lambda^{s}=-\beta\frac{\partial V(K^{\prime},S^{\prime})}{\partial E}\vert _{K^{\prime\ast},S^{\prime\ast}}=\frac{\beta\phi_{0}\Delta(\bar{B}+\alpha)}{1-\Delta S^{\prime\ast}}=\frac{\nu}{c_{E}(1-\beta\theta)(1-\Delta S)}=\frac{\nu}{(1-\beta\theta)E^{\ast}}    

Our model so far is similar to the oil regime in GHKT, except that we assume that the resource constraint is not binding. Since  S_{t+1}=S_{t}+\phi_{0}E_{t} , we arrive at the following expression for the aggregate oil extraction
\displaystyle \sum\limits _{t=0}^{+\infty}E_{t}=\lim\limits _{t\rightarrow+\infty}\phi_{0}^{-1}(S_{t}-S_{0})=\phi_{0}^{-1}(\frac{1}{\Delta}-S_{0})    

Thus, the resource constraint is not binding if and only if the aggregate oil reserves are greater than  \phi_{0}^{-1}(\frac{1}{\Delta}-S_{0}) . Figures 1, 2, and 3 below illustrate how  E^{\ast} responds to a concern about model uncertainty. Figures 1 and 2 show how  E^{\ast} reacts to a change in the penalty parameter,  \alpha .
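
The limit behind the aggregate extraction formula can be illustrated by iterating  S_{t+1}=S_{t}+\phi_{0}E_{t} under an affine rule  E_{t}=c_{E}(1-\Delta S_{t}) . In the Python sketch below, `c_E`, `Delta`, `phi0`, and `S0` are treated as free inputs with placeholder values, and the function name is hypothetical. The only requirement assumed is  0<\Delta\phi_{0}c_{E}<1 , so that  1-\Delta S_{t} shrinks geometrically.

```python
def cumulative_extraction(S0, c_E, Delta, phi0, periods=200):
    """Sum E_t over `periods` steps of S_{t+1} = S_t + phi0 * c_E * (1 - Delta * S_t)."""
    S = S0
    total = 0.0
    for _ in range(periods):
        E = c_E * (1.0 - Delta * S)  # affine emissions rule
        total += E
        S += phi0 * E                # carbon stock law of motion
    return total, S


# Placeholder inputs: Delta * phi0 * c_E = 0.2, so 1 - Delta * S_t decays by 0.8 each period.
total, S_final = cumulative_extraction(S0=1.0, c_E=2.0, Delta=0.2, phi0=0.5)
limit = (1.0 / 0.2 - 1.0) / 0.5      # phi0^{-1} * (1 / Delta - S0)
```

In this example the carbon stock approaches  1/\Delta  and cumulative extraction approaches `limit`.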

Another natural measurement for model uncertainty is the distance between  \hat{\pi}^{\ast}(\gamma) and  \pi(\gamma) ,  \delta , given by the relative entropy

\displaystyle \delta\equiv\rho(\hat{\pi}^{\ast}(\gamma),\pi(\gamma))=\log(1-\Delta S^{\prime\ast})+\frac{\Delta S^{\prime\ast}}{1-\Delta S^{\prime\ast}},    

where  \rho(\hat{\pi}^{\ast}(\gamma),\pi(\gamma)) can be viewed as the maximum deviation allowed from the approximating model,  \pi(\gamma) , given any penalty parameter,  \alpha . It is straightforward to verify that  \rho(\hat{\pi}^{\ast}(\gamma),\pi(\gamma)) is decreasing as  \alpha increases. Figure 3 shows how  E^{\ast} changes as we relax  \delta , allowing for more uncertainty about the approximating model. In the appendix we show that  \frac{\partial E^{\ast}}{\partial\delta}\vert _{\delta=0}=-\infty . That is, even an infinitesimal concern about model uncertainty can cause a significant drop in the optimal energy extraction.
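
The closed form for  \delta can be compared with a direct numerical evaluation of the relative entropy in equation (7). The Python sketch below is a minimal check, assuming the exponential approximating distribution (A5) and the likelihood ratio  m^{\ast}(\gamma)=(1-x)e^{x\lambda\gamma} with  x=\Delta S^{\prime\ast}\in[0,1) . The function names, the integration grid, and the value of `x` are illustrative choices.

```python
import math


def relative_entropy_closed_form(x):
    """delta as a function of x = Delta * S'*, valid for 0 <= x < 1."""
    return math.log(1.0 - x) + x / (1.0 - x)


def relative_entropy_numeric(lam, x, upper=200.0, n=200_000):
    """Trapezoid approximation of the integral of m*log(m)*pi over [0, upper]."""
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        g = i * h
        log_m = math.log(1.0 - x) + x * lam * g   # log of the likelihood ratio
        density = lam * math.exp(-lam * g)        # approximating density pi
        f = math.exp(log_m) * log_m * density
        total += f * (0.5 if i in (0, n) else 1.0)
    return total * h


x = 0.25  # hypothetical value of Delta * S'*
delta_closed = relative_entropy_closed_form(x)
delta_numeric = relative_entropy_numeric(lam=1.0, x=x)
```

The two numbers agree up to the discretisation error of the trapezoid rule, and the closed form equals zero at `x = 0`, where the worst-case and approximating distributions coincide.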

Figure 1: The Effect of Penalty Parameter  \alpha on Optimal Carbon Emissions,  E

This figure, a basic line graph, illustrates the effect of the penalty parameter, alpha, on optimal carbon emissions, E. The x-axis is the penalty parameter, alpha, and ranges from 0 to 30. The y-axis is optimal carbon emissions, E, and ranges from 2000 to 2350. When the penalty parameter is zero, optimal carbon emissions are around 2040. They increase sharply to approximately 2300 when the penalty parameter equals 5. Beyond 5, optimal carbon emissions level off slightly above 2300 and remain stable as the penalty parameter increases further.


Figure 2: The Effect of  \alpha^{-1} on  E

This figure, a basic line graph, illustrates the effect of the inverse of the penalty parameter, alpha^{-1}, on optimal carbon emissions, E. The x-axis is the inverse of the penalty parameter and ranges from 0 to 2. The y-axis is optimal carbon emissions and ranges from 2000 to 2350. When the inverse of the penalty parameter is 0, optimal carbon emissions are approximately 2350. Optimal carbon emissions are inversely related to the inverse of the penalty parameter: they decline steadily until they reach approximately 2050 when the inverse of the penalty parameter equals 2.


Figure 3: The Effect of Model Deviation as Measured by Entropy,  \delta , on  E

This figure, a basic line graph, illustrates the effect of model deviation, as measured by entropy, delta, on optimal carbon emissions, E. The x-axis is the entropy measure and ranges from 0 to 8. The y-axis is optimal carbon emissions and ranges from 2000 to 2350. When the entropy measure is 0, optimal carbon emissions are approximately 2400. They decline steadily to approximately 2150 when the entropy measure is 2. When the entropy measure is greater than 2, optimal carbon emissions and the entropy measure remain inversely related, and optimal carbon emissions decline to approximately 2100 when the entropy measure is approximately 6.25.


Robust control modeling can be introduced in a variety of ways. So far, we have used a closed-loop zero-sum dynamic game in which the social planner moves first in each period. Alternatively, we can construct a game with the same information structure by interchanging the order of  \max and  \min in equation (8). The two games differ only in terms of the timing protocol. However, both lead to the same (unique) feedback saddle-point equilibrium if certain conditions are satisfied. More precisely, if (A1)-(A10) hold, then the objective in (8) is strictly concave in  C and  E , and strictly convex in  m(\gamma) . Consequently, the two closed-loop zero-sum dynamic games admit the same unique pure strategy saddle-point Nash equilibrium, which is the one described in Proposition 1.

Let us now turn to the decentralized problem. Suppose a percentage tax,  \tau_{t} , is imposed on emissions,  E_{t} . Since the extraction cost of energy (the cost of creating emissions) is zero, it must be true that

\displaystyle \tau_{t}=p_{t}=\frac{\partial F(K_{t},E_{t})}{\partial E_{t}}=\nu K_{t}^{\theta}E_{t}^{\nu-1}    

The above equation captures the one-to-one relationship between  E_{t} and  \tau_{t} . Therefore, to achieve the optimal emissions level,  E_{t}=c_{E}(1-\Delta S) in equation (14), we must impose  \tau_{t}=\nu c_{E}^{\nu-1}(1-\Delta S_{t})^{\nu-1}K_{t}^{\theta} . It is straightforward to show that  \tau_{t}=\frac{\lambda^{s}}{u^{\prime}(C_{t}^{\ast})} , where  C_{t}^{\ast} is the optimal consumption, given by equation (14). That is, the optimal tax on emissions is equal to the corresponding GHG externality measured in units of the consumption good. It remains to show that  C_{t}^{\ast} can be recovered under the optimal tax. This can be shown using the representative household's problem as follows. Since we have established a one-to-one relationship between  E_{t} and  \tau_{t} , we may assume without loss of generality that the planner chooses  E_{t} . Further, assume that  E_{t} is chosen as a function of  S_{t} only. This is without loss of generality, since our goal is to recover the optimal emissions in equation (14), which depends only on  S_{t} . Given  E=E(S) ,  k ,  K , and  S , a representative household solves:
\displaystyle V(k,K,S)=\max_{c,\tilde{k}^{\prime}}\min_{\hat{\pi}(\gamma)}\left\{ u(c)+\beta\hat{E}_{\gamma}\left[V(k^{\prime},K^{\prime},S^{\prime})+\alpha\log\left(\frac{\hat{\pi}(\gamma)}{\pi(\gamma)}\right)\right]\right\}
\displaystyle \text{s.t.}
\displaystyle c+\tilde{k}^{\prime}=r(K,S)k+\tau(K,S)E(S)+\pi^{profit}
\displaystyle \tilde{K}^{\prime}=G(K,S)
\displaystyle k^{\prime}=e^{-\gamma S^{\prime}}\tilde{k}^{\prime}
\displaystyle K^{\prime}=e^{-\gamma S^{\prime}}\tilde{K}^{\prime}
\displaystyle S^{\prime}=S+\phi_{0}E(S)

where  u(c)=\log(c) ,  r(K,S)=\theta K^{\theta-1}[E(S)]^{\nu} ,  \tau(K,S)=\nu K^{\theta}[E(S)]^{\nu-1} ,  \pi^{profit} is the firm's profit, and  \tilde{K}^{\prime}=G(K,S) is the equilibrium transition law for the aggregate capital stock. Here,  (k,K,S) stands for the beginning-of-period and  (\tilde{k}^{\prime},\tilde{K}^{\prime},S^{\prime}) for the end-of-period state. Notice that  (\tilde{k}^{\prime},\tilde{K}^{\prime}) is not equal to the beginning-of-next-period state,  (k^{\prime},K^{\prime}) , due to capital deterioration by a factor  e^{-\gamma S^{\prime}} . In addition,  \hat{E}_{\gamma} is calculated with respect to the worst-case distribution for  \gamma ,  \hat{\pi}(\gamma) , as chosen by the minimizing player. Since the minimizing player moves after the maximizing player, the worst distribution is, in general, conditional on the end-of-period state,  (\tilde{k}^{\prime},\tilde{K}^{\prime},S^{\prime}) . It can be shown that the optimal consumption sequence satisfies the following Euler equation:
\displaystyle u^{\prime}(c^{\ast})=\beta\frac{\int e^{-\gamma S^{\prime}}r(K^{\prime},S^{\prime})u^{\prime}(c^{\prime\ast})e^{-\frac{V(k^{\prime},K^{\prime},S^{\prime})}{\alpha}}\pi(\gamma)d\gamma}{\int e^{-\frac{V(k^{\prime},K^{\prime},S^{\prime})}{\alpha}}\pi(\gamma)d\gamma}    

This yields the following proposition.


\begin{proposition}Assume that (A1) - (A10) hold. The optimal energy consumption is $E=c_{E}(1-\Delta S)$. The optimal tax is $\tau_{t}=\frac{\lambda^{s}}{u^{\prime}(C^{\ast})}$, with tax proceeds rebated lump-sum to the representative consumer. The resulting competitive equilibrium allocation coincides with the solution to the planner's problem. That is, $c^{\ast}=C^{\ast}=(1-\beta\theta)K^{\theta}[c_{E}(1-\Delta S)]^{\nu}$. \end{proposition}

4 The Computational Solution and Calibration

In this section we first extend the analytical model by relaxing assumptions (A6.1) and (A6.2). For our baseline model, we will assume that  \pi(\gamma) , the approximating distribution of  \gamma , is exponential. As we now allow for  \phi_{L}>0 , we need to introduce two additional state variables ( P and  T ), since keeping track of the sum  S=P+T will no longer suffice. We will also relax (A7) by incorporating a coal and a green sector into the model. Furthermore, we will relax (A8) and (A9) by allowing  A_{2}N_{2} and  A_{3}N_{3} to grow at a rate of 2 percent per year. Last, we will drop (A10).

The social planner's problem becomes:

    \displaystyle V(K,N,P,T,R)=\max_{\{C,E_{1},E_{2},E_{3},E,\tilde{K}^{\prime},P^{\prime},T^{\prime},S^{\prime},R^{\prime}\}}\min_{m(\gamma)}  
    \displaystyle \{u(C)+\beta\int\!\left[m(\gamma)V(K^{\prime},N^{\prime},P^{\prime},T^{\prime},R^{\prime})+\alpha m(\gamma)\log m(\gamma)\right]\pi(\gamma)\mathrm{d}\gamma\}  
  \displaystyle s.t.    
\displaystyle E \displaystyle = \displaystyle (\kappa_{1}E_{1}^{\rho}+\kappa_{2}E_{2}^{\rho}+\kappa_{3}E_{3}^{\rho})^{1/\rho}  
\displaystyle \tilde{K}^{\prime} \displaystyle = \displaystyle F\left(K,N(1-\frac{E_{2}}{A_{2}N}-\frac{E_{3}}{A_{3}N}),E\right)-C  
\displaystyle K^{\prime} \displaystyle = \displaystyle h(S^{\prime},\gamma)\tilde{K}^{\prime}  
\displaystyle A_{2}^{\prime}N^{\prime} \displaystyle = \displaystyle (1+g)A_{2}N  
\displaystyle A_{3}^{\prime}N^{\prime} \displaystyle = \displaystyle (1+g)A_{3}N  
\displaystyle R^{\prime} \displaystyle = \displaystyle R-E_{1}\geq0  
\displaystyle P^{\prime} \displaystyle = \displaystyle P+\phi_{L}(E_{1}+E_{2})  
\displaystyle T^{\prime} \displaystyle = \displaystyle (1-\phi)T+(1-\phi_{L})\phi_{0}(E_{1}+E_{2})  
\displaystyle S^{\prime} \displaystyle = \displaystyle P^{\prime}+T^{\prime}  
\displaystyle 1 \displaystyle = \displaystyle \int m(\gamma)\pi(\gamma)\mathrm{d}\gamma  
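
As an illustration of the inner minimization over  m(\gamma) , the sketch below uses a standard result from robust control: the minimizing distortion is an exponential tilt,  m^{\ast}(\gamma)\propto e^{-V/\alpha} , and the minimized value equals  -\alpha\log E[e^{-V/\alpha}] . The three-point grid, the probabilities `pi`, the continuation values `V`, and the value of `alpha` are hypothetical numbers chosen only for illustration; they are not taken from the model.

```python
import math
import random

def objective(m, V, pi, alpha):
    """Discrete analogue of the integral of [m*V + alpha*m*log(m)] * pi."""
    return sum((mi * vi + alpha * mi * math.log(mi)) * p
               for mi, vi, p in zip(m, V, pi))

def worst_case_distortion(V, pi, alpha):
    """Exponential tilt m*(gamma) = exp(-V/alpha) / E[exp(-V/alpha)]."""
    weights = [math.exp(-v / alpha) for v in V]
    norm = sum(w * p for w, p in zip(weights, pi))
    return [w / norm for w in weights]

def risk_sensitive_value(V, pi, alpha):
    """Closed-form minimized value: -alpha * log E[exp(-V/alpha)]."""
    return -alpha * math.log(sum(math.exp(-v / alpha) * p for v, p in zip(V, pi)))

def random_feasible_m(pi, rng):
    """A random positive distortion rescaled so that sum(m * pi) = 1."""
    raw = [rng.uniform(0.1, 3.0) for _ in pi]
    norm = sum(r * p for r, p in zip(raw, pi))
    return [r / norm for r in raw]

# Hypothetical inputs (illustration only).
pi = [0.2, 0.5, 0.3]      # approximating probabilities on a 3-point grid for gamma
V = [1.0, 0.4, -0.5]      # continuation values at each grid point
alpha = 0.8               # penalty on relative entropy

m_star = worst_case_distortion(V, pi, alpha)
value_at_m_star = objective(m_star, V, pi, alpha)
closed_form = risk_sensitive_value(V, pi, alpha)
```

The tilt places more weight on grid points with low continuation values, which is the sense in which the minimizing player selects a worst-case distribution; a larger `alpha` makes the tilt milder.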

To solve this problem we first argue that most of the analysis conducted in Section 3 carries over. The only difference is that the function  f(\cdot) no longer has a closed form expression. We will again apply the outer-inner loop method used in Section 3. The inner loop minimization problem is unchanged, while the outer loop maximization problem will be solved in parts. In that regard, it is important to note that solving the optimization problem for  E_{i} ,  P^{\prime} ,  T^{\prime} , and  R^{\prime} can be carried out separately from solving for  C and  \tilde{K}^{\prime} . Furthermore, the solution to the second optimization problem remains the same as in Section 3; i.e.,  C^{\ast}=(1-\beta\theta)Y^{\ast} and  \tilde{K}^{\prime\ast}=\beta\theta Y^{\ast} , where  Y^{\ast} denotes the optimal output level. After substituting for  C^{\ast} , the optimization problem for  E_{i} ,  P^{\prime} ,  T^{\prime} , and  R^{\prime} can be simplified, leading to the dynamic programming problem below:
\displaystyle f(N,P,T,R)=\max\limits _{E_{1},E_{2},E_{3},E,P^{\prime},T^{\prime},S^{\prime},R^{\prime}}\left\{ \frac{1}{1-\beta\theta}\log\left[\left(1-\frac{E_{2}}{A_{2}N}-\frac{E_{3}}{A_{3}N}\right)^{1-\theta-\nu}E^{\nu}\right]+\beta\left[f(N^{\prime},P^{\prime},T^{\prime},R^{\prime})+\alpha\log(1-\Delta S^{\prime})\right]\right\} \qquad (14)
  \displaystyle s.t.    
    \displaystyle E=(\kappa_{1}E_{1}^{\rho}+\kappa_{2}E_{2}^{\rho}+\kappa_{3}E_{3}^{\rho})^{1/\rho}  
    \displaystyle N^{\prime}=(1+g)N  
    \displaystyle R^{\prime}=R-E_{1}\geq0  
    \displaystyle P^{\prime}=P+\phi_{L}(E_{1}+E_{2})  
    \displaystyle T^{\prime}=(1-\phi)T+(1-\phi_{L})\phi_{0}(E_{1}+E_{2})  
    \displaystyle S^{\prime}=P^{\prime}+T^{\prime}  
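
The carbon-cycle constraints above can be iterated forward directly. The sketch below does so with the calibrated values of  \phi ,  \phi_{L} ,  \phi_{0} ,  P_{0} , and  T_{0} reported in this section. The emissions path (a constant 100 GtC per ten-year period) is a hypothetical input used only for illustration, and the function names are not from the paper.

```python
PHI = 0.0228    # decay rate of the transitory carbon stock T (per period)
PHI_L = 0.2     # share of emissions that enters the permanent stock P
PHI_0 = 0.393   # share of the remaining emissions that enters T

def carbon_step(P, T, emissions):
    """One period of the laws of motion for P, T, and S = P + T.

    `emissions` stands for E_1 + E_2 (oil plus coal) in the period.
    """
    P_next = P + PHI_L * emissions
    T_next = (1 - PHI) * T + (1 - PHI_L) * PHI_0 * emissions
    return P_next, T_next, P_next + T_next

def simulate(P0, T0, emissions_path):
    """Return the list of (P, T, S) triples implied by an emissions path."""
    path = []
    P, T = P0, T0
    for emissions in emissions_path:
        P, T, S = carbon_step(P, T, emissions)
        path.append((P, T, S))
    return path

# Calibrated initial stocks; hypothetical constant emissions for 5 periods.
trajectory = simulate(P0=103.0, T0=699.0, emissions_path=[100.0] * 5)
```

Because only  T depreciates, a permanent cut in emissions lowers  P^{\prime} one-for-one with  \phi_{L} , while its effect on  T^{\prime} fades at rate  \phi .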

Next, we characterize the optimality conditions for  E_{3} ,  E_{2} , and  E_{1} , respectively. The first-order condition for  E_{3} implies
\displaystyle \frac{\nu\kappa_{3}}{E_{3}^{1-\rho}E^{\rho}}=\frac{1-\theta-\nu}{A_{3}N_{0}}.    

The first-order condition for  E_{2} gives
    \displaystyle \frac{1-\theta-\nu}{A_{2}N_{0}}  
  \displaystyle = \displaystyle \frac{\nu\kappa_{2}}{E_{2}^{1-\rho}E^{\rho}}+(1-\beta\theta)\beta\left[\phi_{L}\left(\frac{\partial f}{\partial P^{\prime}}-\frac{\alpha\Delta}{1-\Delta S^{\prime}}\right)+(1-\phi_{L})\phi_{0}\left(\frac{\partial f}{\partial T^{\prime}}-\frac{\alpha\Delta}{1-\Delta S^{\prime}}\right)\right].  

Applying the envelope theorem to  P and  T gives
\displaystyle \frac{\partial f}{\partial P} \displaystyle = \displaystyle \beta\left(\frac{\partial f}{\partial P^{\prime}}-\frac{\alpha\Delta}{1-\Delta S^{\prime}}\right) (15)
\displaystyle \frac{\partial f}{\partial T} \displaystyle = \displaystyle \beta(1-\phi)\left(\frac{\partial f}{\partial T^{\prime}}-\frac{\alpha\Delta}{1-\Delta S^{\prime}}\right). (16)

Defining  \hat{\Lambda}^{P}=-(1-\beta\theta)\frac{\partial f}{\partial P} and  \hat{\Lambda}^{T}=-(1-\beta\theta)\frac{\partial f}{\partial T} to be the marginal values of the externality caused by  P and  T , respectively, the first-order condition for  E_{2} becomes
\displaystyle \frac{1-\theta-\nu}{A_{2}N_{0}}=\frac{\nu\kappa_{2}}{E_{2}^{1-\rho}E^{\rho}}-\left[\phi_{L}\hat{\Lambda}^{P}+\frac{(1-\phi_{L})\phi_{0}}{1-\phi}\hat{\Lambda}^{T}\right]    

The marginal externality of  S ,  \hat{\Lambda}^{S} , can be calculated as the consequence of a unitary increase in  E_{1}+E_{2} . Note that increasing  E_{1}+E_{2} by one unit is equivalent to simultaneously increasing  P by  \phi_{L} units and  T by  \frac{(1-\phi_{L})\phi_{0}}{1-\phi} units. Therefore,  \hat{\Lambda}^{S} is given by
\displaystyle \hat{\Lambda}^{S}=\phi_{L}\hat{\Lambda}^{P}+\frac{(1-\phi_{L})\phi_{0}}{1-\phi}\hat{\Lambda}^{T}.    

Thus, we obtain
\displaystyle \frac{\nu\kappa_{2}}{E_{2}^{1-\rho}E^{\rho}}-\hat{\Lambda}^{S}=\frac{1-\theta-\nu}{A_{2}N_{0}}.    

This equation has the same form as the corresponding equation in GHKT, but under a different interpretation for  \hat{\Lambda}^{S} . To see the difference, it is convenient to restore the time index,  t . From equation (15) and equation (16) we have
\displaystyle \hat{\Lambda}_{t}^{P} \displaystyle = \displaystyle (1-\beta\theta)\alpha\Delta\sum\limits _{j=1}^{+\infty}\frac{\beta^{j}}{1-\Delta S_{t+j}}=\theta\bar{\gamma}\sum\limits _{j=1}^{+\infty}\frac{\beta^{j}}{1-\Delta S_{t+j}}  
\displaystyle \hat{\Lambda}_{t}^{T} \displaystyle = \displaystyle (1-\beta\theta)\alpha\Delta\sum\limits _{j=1}^{+\infty}\frac{[\beta(1-\phi)]^{j}}{1-\Delta S_{t+j}}=\theta\bar{\gamma}\sum\limits _{j=1}^{+\infty}\frac{[\beta(1-\phi)]^{j}}{1-\Delta S_{t+j}}.  

The second equality in either equation is obtained by using  (1-\beta\theta)\alpha\Delta=(1-\beta\theta)\alpha\frac{\bar{A}}{\alpha\lambda}=\theta\lambda^{-1}=\theta\bar{\gamma} , where  \lambda^{-1}=\bar{\gamma} is the mean of  \gamma under the approximating model. It follows immediately that  \hat{\Lambda}_{t}^{S} can be expressed as
\displaystyle \hat{\Lambda}_{t}^{S}=\theta\bar{\gamma}\sum\limits _{j=1}^{+\infty}\left[\phi_{L}\frac{\beta^{j}}{1-\Delta S_{t+j}}+\frac{(1-\phi_{L})\phi_{0}}{1-\phi}\frac{[\beta(1-\phi)]^{j}}{1-\Delta S_{t+j}}\right].    
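
To give a sense of magnitudes, the sketch below evaluates  \Delta and a truncated version of the sum for  \hat{\Lambda}_{t}^{S} . It uses  \Delta=\theta\bar{\gamma}/[(1-\beta\theta)\alpha] , which follows from the identity  (1-\beta\theta)\alpha\Delta=\theta\bar{\gamma} , together with the calibrated parameter values reported in this section. The constant path  S_{t+j}=800 and the truncation at 500 periods are hypothetical choices made only for illustration.

```python
THETA = 0.3
BETA = 0.985 ** 10
GAMMA_BAR = 2.379e-5          # mean of gamma under the approximating model
PHI, PHI_L, PHI_0 = 0.0228, 0.2, 0.393

def delta(alpha):
    """Delta implied by (1 - beta*theta) * alpha * Delta = theta * gamma_bar."""
    return THETA * GAMMA_BAR / ((1 - BETA * THETA) * alpha)

def lambda_S(S_path, alpha):
    """Truncated sum for the marginal externality along a given path of S_{t+j}.

    S_path[0] is S_{t+1}, S_path[1] is S_{t+2}, and so on.
    """
    d = delta(alpha)
    total = 0.0
    for j, S in enumerate(S_path, start=1):
        if d * S >= 1.0:
            raise ValueError("path reaches the point where 1 - Delta*S <= 0")
        weight = (PHI_L * BETA ** j
                  + (1 - PHI_L) * PHI_0 / (1 - PHI) * (BETA * (1 - PHI)) ** j)
        total += weight / (1 - d * S)
    return THETA * GAMMA_BAR * total

# Hypothetical constant path, used only to compare the two values of alpha.
S_path = [800.0] * 500
lam_robust = lambda_S(S_path, alpha=0.01)
lam_non_robust = lambda_S(S_path, alpha=100.0)
critical_S = 1.0 / delta(0.01)   # level of S at which 1 - Delta*S reaches zero
```

Under these assumptions the externality is several times larger for  \alpha=0.01 than for  \alpha=100 , and the term  1-\Delta S reaches zero at a carbon stock slightly above 1000 when  \alpha=0.01 .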

It is instructive to consider the case when  \alpha\rightarrow+\infty ; i.e., when there is no concern about model uncertainty. Observe that  \Delta\rightarrow0 as  \alpha\rightarrow+\infty . Therefore,
\displaystyle \lim\limits _{\alpha\rightarrow+\infty}\hat{\Lambda}_{t}^{S} \displaystyle = \displaystyle \theta\bar{\gamma}\sum\limits _{j=1}^{+\infty}\left[\phi_{L}\beta^{j}+\frac{(1-\phi_{L})\phi_{0}}{1-\phi}[\beta(1-\phi)]^{j}\right]  
  \displaystyle = \displaystyle \theta\beta\bar{\gamma}\left[\frac{\phi_{L}}{1-\beta}+\frac{(1-\phi_{L})\phi_{0}}{1-(1-\phi)\beta}\right] (17)

Contrasting this equation with the corresponding equation (12) in GHKT,  \hat{\Lambda}_{t}^{S}=\bar{\gamma}\left[\frac{\phi_{L}}{1-\beta}+\frac{(1-\phi_{L})\phi_{0}}{1-(1-\phi)\beta}\right], we identify two differences. First, equation (17) contains an additional term ( \theta ). This is because GHG directly affect aggregate capital instead of output in our model. Second, the externality related to  P and  T is weighted by  \beta in equation (17). This is because GHG, in our model, affect next period's capital rather than the capital of the current period.
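
The two differences can be checked numerically. The sketch below evaluates equation (17) and the GHKT expression at the calibrated parameter values reported in this section and confirms that their ratio is  \theta\beta . It also compares the closed form with a truncated version of the infinite sum. The function names and the truncation length are illustrative choices.

```python
THETA = 0.3
BETA = 0.985 ** 10
GAMMA_BAR = 2.379e-5
PHI, PHI_L, PHI_0 = 0.0228, 0.2, 0.393

def bracket():
    """Common bracketed term: phi_L/(1-beta) + (1-phi_L)*phi_0/(1-(1-phi)*beta)."""
    return PHI_L / (1 - BETA) + (1 - PHI_L) * PHI_0 / (1 - (1 - PHI) * BETA)

def lambda_S_limit():
    """Equation (17): the externality as alpha goes to infinity."""
    return THETA * BETA * GAMMA_BAR * bracket()

def lambda_S_ghkt():
    """The corresponding GHKT expression, without the theta and beta factors."""
    return GAMMA_BAR * bracket()

def lambda_S_truncated(n_terms=2000):
    """Truncated infinite sum that precedes equation (17)."""
    total = 0.0
    for j in range(1, n_terms + 1):
        total += (PHI_L * BETA ** j
                  + (1 - PHI_L) * PHI_0 / (1 - PHI) * (BETA * (1 - PHI)) ** j)
    return THETA * GAMMA_BAR * total

ratio = lambda_S_limit() / lambda_S_ghkt()
```

With  \theta=0.3 and  \beta=0.985^{10} , the ratio is roughly one quarter, so the two expressions differ noticeably in level even though they share the same bracketed term.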

Finally, the first-order condition for  E_{1} yields

\displaystyle \frac{\nu\kappa_{1}}{E_{1}^{1-\rho}E^{\rho}}-\hat{\Lambda}^{S}=\beta\left[\frac{\nu\kappa_{1}}{(E_{1}^{\prime})^{1-\rho}(E^{\prime})^{\rho}}-(\hat{\Lambda}^{S})^{\prime}\right]    

Note that the operator  \mathbb{E}_{t} does not appear on the right-hand side, as the planner optimizes under the worst-case scenario rather than averaging over all cases. As the planner's problem has a structure similar to that of the analytical model, it can be shown that analogues of Propositions 1 and 2 hold in this environment. We numerically solve the above problem for the cases where  \alpha =0.01 and  \alpha=100 . We use the same parameter values as in GHKT, except for  R_{0} , which is set to 800, as in Rogner (1997). The parameter values are listed below. Figures 4 through 6 plot the computed optimal paths.

Parameter   \phi     \phi_{L}   \phi_{0}   \theta       \nu          \beta        \rho      1+g
Value       0.0228   0.2        0.393      0.3          0.04         0.985^{10}   -0.058    1.02^{10}

Parameter   P_{0}    T_{0}      R_{0}      \kappa_{1}   \kappa_{2}   A_{2,0}      A_{3,0}   \lambda^{-1}
Value       103      699        800        0.5008       0.08916      7,693        1,311     2.379\times10^{-5}


Figure 4: Optimal Use of Energy

Figure 4: Optimal Use of Energy. This figure contains four separate graphs: the Optimal Green Energy Use graph (upper left), the Optimal Coal Use graph (upper right), the Optimal Oil Use graph (lower left), and the Optimal Carbon Concentration (net of the preindustrial level) graph (lower right). Each is a line graph depicting two separate trends. In every graph the x-axis is the year and it ranges from 0 to 200, and the legend defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = 100).
Upper left graph, Optimal Green Energy Use: The y-axis is gigatons of carbon (GtC) per year and it ranges from 0 to 120. The legend is in the upper left corner. The two trends completely overlap each other at each data point. At year 0, the optimal green energy use is 0. At year 50, it is 5. As the year increases, the optimal green energy use increases to 120 for both the robust optimal path and the non-robust optimal path.
Upper right graph, Optimal Coal Use: The y-axis is GtC per year and it ranges from 0 to 15. The legend is on the middle right side of the graph. The robust optimal path (solid line) starts at year 0 at a GtC of 4. It rises slightly from year 0 to year 50, reaching a GtC of 4.5. It levels off at approximately 4.6 from year 150 to approximately year 175. After year 175, there is a slight decline until it ends at approximately 4. The non-robust optimal path (dotted line) starts at year 0 at a GtC of 4.5. It increases along an almost diagonal path until it ends at year 198 at a GtC of 15.
Lower left graph, Optimal Oil Use: The y-axis is GtC per year and it ranges from 0 to 12. The legend is on the upper right side of the graph. The robust optimal path (solid line) and the non-robust optimal path (dotted line) almost completely overlap. There are minor differences, with the non-robust optimal path being slightly higher from year 0 to year 50. The trend starts at year 0 at a GtC of 10. It continues with a downward curving slope until year 100 at a GtC of 3. The trend ends at year 198 at a GtC of 1.
Lower right graph, Optimal Carbon Concentration (net of the preindustrial level): The y-axis is GtC and it ranges from 200 to 1400. The legend is on the lower right side of the graph. The robust optimal path (solid line) starts at year 0 at a GtC of 200. At year 50, the GtC is 400. At year 100, the GtC is 600. At year 150, the GtC is approximately 750, and at year 200, the GtC is 800. The non-robust optimal path (dotted line) rises steadily with the year. At year 0, the GtC is 200, and it ends at year 200 at a GtC of 1400.


Figure 4 describes the optimal paths for the use of green energy, coal, and oil, as well as the resulting carbon concentration in the atmosphere, conditional on different levels of concern about model uncertainty. For simplicity, we refer to the optimal path under  \alpha=100 as the non-robust optimal path, and to the path under  \alpha =0.01 as the robust optimal path. Since the green energy sector does not inject carbon into the atmosphere, the optimal path for the use of green energy does not directly depend on the level of concern about model uncertainty regarding the externality from carbon emissions. However, since green energy, coal, and oil are substitutes, through its impact on the "dirty" energy sectors, model uncertainty considerations do affect the use of green energy indirectly.

We find that an increase in the concern about model uncertainty causes a significant decline in the use of coal. In contrast, the use of oil is delayed, but only slightly. As the supply of oil is finite, the rate at which oil use declines depends not only on model uncertainty, but also on resource scarcity. As we will show in the next section, an initial stock of oil equaling  R_{0}=800GtC is low enough so that the resource scarcity effect overwhelms the model uncertainty effect in determining the optimal use of oil in the economy. This explains why we do not observe a sharp decrease in the optimal use of oil when the concern about model uncertainty increases. Finally, straightforward calculation shows that the difference in energy use in the two optimal paths leads to a significant difference in the associated carbon accumulation. Our model predicts that if there is only a small concern about model uncertainty (  \alpha=100 ), or if model uncertainty is not incorporated into the model (  \alpha\rightarrow+\infty ), atmospheric carbon concentrations will reach a level as high as  1350GtC (net of preindustrial levels) after 180 years. However, this number is reduced by about 40 percent, to about  800GtC , if concerns about model uncertainty are incorporated and addressed through the corresponding optimal tax, restoring the optimal energy path under  \alpha =0.01 .

Figure 5: Increases in Global Temperature

Figure 5: Increases in Global Temperature (based on the RICE model). This figure is a line graph illustrating the increase in global temperature over time along two separate trends. The x-axis is the year and it ranges from 0 to 200. The y-axis is the temperature measured in degrees Celsius and it ranges from 1 to 5.5. The lower right side of the graph contains a legend. The legend defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = 100). The robust optimal path (solid line) starts at year 0 at a temperature of 1.4C. The temperature increases slowly to 3C around year 80 and continues to rise slowly to 3.8C at year 198. The non-robust optimal path (dotted line) rises steadily over time. At year 0, the temperature is 1.4C. At year 198, the temperature reaches 5.375C.


Figure 5 demonstrates a direct consequence of the above analysis: based on the mapping from carbon concentrations to global temperatures used in the RICE model,  T(S_{t})=3\ln(\frac{S_{t}}{\bar{S}})/\ln2 , the global average temperature will rise by 3.8 degrees Celsius 180 years from now if the concern about model uncertainty is addressed, and by 5.3 degrees Celsius otherwise.
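
The temperature mapping is easy to evaluate directly. In the sketch below, the preindustrial stock  \bar{S}=581 GtC is an assumption taken from the GHKT calibration; it is not stated in this section. Because the carbon concentrations quoted above are net of preindustrial levels, the helper adds  \bar{S} back before applying the formula.

```python
import math

S_BAR = 581.0  # assumed preindustrial carbon stock in GtC (GHKT calibration)

def temperature_rise(S_total, S_bar=S_BAR):
    """T(S) = 3 * ln(S / S_bar) / ln(2): three degrees per doubling of carbon."""
    return 3.0 * math.log(S_total / S_bar) / math.log(2.0)

def temperature_rise_from_net(S_net, S_bar=S_BAR):
    """Same mapping when the concentration is given net of the preindustrial level."""
    return temperature_rise(S_net + S_bar, S_bar)

# Net concentrations quoted in the text for the two paths after 180 years.
rise_robust = temperature_rise_from_net(800.0)
rise_non_robust = temperature_rise_from_net(1350.0)

# Initial condition implied by the calibration, P_0 + T_0 = 103 + 699.
rise_initial = temperature_rise(103.0 + 699.0)
```

Under this assumed value of  \bar{S} , the results are close to the temperature increases reported in the text, and the initial value is close to the starting point of the paths in Figure 5.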

Figure 6: Capital Stock and Output

Figure 6: Capital Stock and Output. This figure contains six separate graphs. The left column is based on the approximating model and the right column on the worst case model. The upper graphs show Total Damages as a Percent of Capital Stock, the middle graphs show the Capital Stock, and the lower graphs show GDP per Period. Each is a line graph depicting two separate trends. In every graph the x-axis is the year and it ranges from 0 to 200, and the legend defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = 100).
Upper left graph, Total Damages as a Percent of Capital Stock (based on the approximating model): The y-axis is percent of capital stock and it ranges from 0 to 4. The legend is on the upper left side of the graph. The robust optimal path (solid line) starts at year 0 at around 0.5 percent. It increases slowly to 1.7 percent at year 100. The trend starts to level off at year 150 at approximately 2 percent. The non-robust optimal path (dotted line) shows a slowly increasing trend. It also starts at year 0 at approximately 0.5 percent. At year 200, it is around 3.8 percent.
Upper right graph, Total Damages as a Percent of Capital Stock (based on the worst case model): The y-axis is percent of capital stock and it ranges from 0 to 40. The legend is on the upper left side of the graph. The robust optimal path (solid line) starts at year 0 at 0 percent. It increases modestly to 1 percent by year 50. The trend increases slowly until year 198, where it is 8 percent. The non-robust optimal path (dotted line) starts at year 0 at 0 percent. It follows a path similar to the robust optimal path until around year 60, where it starts to increase. By year 100 it is approximately 8 percent. At year 125, the trend increases faster and is close to 17 percent. At year 147, it reaches 38 percent.
Middle left graph, Capital Stock (based on the approximating model): The y-axis is the capital stock and it ranges from 0.1 to 0.2. The legend is on the lower right side. The two trends overlap almost completely, with slight differences toward the end of the period. At year 0, the capital stock is 0.1. At year 5, the capital stock is just below 0.15. At year 50, the trends rise to a capital stock of around 0.175 and level off for the remainder of the time period. The non-robust optimal path (dotted line) is slightly lower toward the end of the period.
Middle right graph, Capital Stock (based on the worst case model): The y-axis is the capital stock and it ranges from 0.1 to 0.2. The legend is on the lower right side. The two trends overlap almost completely until year 70. At this point the non-robust optimal path (dotted line) trends downward, while the robust optimal path (solid line) decreases minimally. At year 0, the capital stock is 0.1. At year 50, the capital stock is around 0.175. At year 70, the non-robust optimal path starts to decline until year 105, when the capital stock is around 0.14. The robust optimal path also starts to decline but at a much slower pace, reaching a capital stock of around 0.152 at year 198.
Lower left graph, GDP per Period (based on the approximating model): The y-axis is the GDP per period (10 years) and it ranges from 0.5 to 0.7. The legend is on the lower right side. The two trends completely overlap for the whole sample period. At year 0, the GDP per period is approximately 0.58. It increases until around year 5, where the GDP per period is approximately 0.65. At year 50, the GDP per period is around 0.68 and levels off for the remaining time period.
Lower right graph, GDP per Period (based on the worst case model): The y-axis is the GDP per period (10 years) and it ranges from 0.5 to 0.7. The legend is on the lower right side. The two trends almost completely overlap until around year 100. At that point, the non-robust optimal path (dotted line) starts to decline while the robust optimal path (solid line) declines at a slower rate. The trends follow the same path as in the approximating model until around year 80, where the non-robust optimal path reaches a GDP per period of 0.66. It continues to decline noticeably until year 130, where the GDP per period is approximately 0.58 (the starting point). The robust optimal path declines much more modestly, and at year 198 the GDP per period is approximately 0.64.


The graphs in the first (second) column in Figure 6 describe the paths of total damages as a percentage of the capital stock, of the capital stock, and of output, assuming that the approximating model (worst-case model) for  \gamma is the true model.11 In each graph, the green-dashed line (blue-solid line) represents the outcome when energy is extracted based on the non-robust (robust) optimal path. The main findings can be summarized as follows. If the approximating model for  \gamma is the true model, pursuing the robust optimal path for energy consumption would reduce total damages by an additional 1 percent 180 years from now. However, due to a more conservative use of oil and coal in the final good sector, such a policy will also reduce both capital stock and output in the long run. Since utility depends only on consumption (which is proportional to output), and the two output paths in the first column are nearly identical, this implies that the welfare loss from overestimating the concern about uncertainty would be rather small. In contrast, if the true distribution of  \gamma evolves according to the worst-case model in each period (second column of Figure 6), the cost of implementing the non-robust optimal policy is rather large. In fact, the non-robust policy, which overlooks concerns about model uncertainty, will dramatically reduce the entire capital stock in 120 years, resulting in a large reduction in output and welfare.12

4.1 Varying the Approximating Distribution

Here we further explore the implications of assumption (A5). To this end, we now assume that the approximating distribution of  \gamma is normal with mean  \bar{\gamma} and variance  \sigma^{2} ; i.e.,  \pi(\gamma)=\frac{1}{\sqrt{2\pi\sigma^{2}}}e^{-\frac{(\gamma-\bar{\gamma})^{2}}{2\sigma^{2}}} . This creates two key differences. First, the normal distribution provides us with two degrees of freedom: the mean,  \bar{\gamma} , reflecting the planner's prior expectation regarding damages, and the variance,  \sigma^{2} , indicating the prior regarding model uncertainty. In comparison, recall that the exponential distribution only used one parameter,  \lambda , which determined both the mean and the variance of  \gamma . Second, as we shall see below, assuming that  \gamma is normally distributed can also eliminate the "breaking point" for  S , which is always present when  \gamma follows an exponential distribution. This is because the exponential distribution has a "fat" tail, thus allowing more room for nature to create a worst-case scenario given a level of penalty,  \alpha . We have

\displaystyle H(S^{\prime};\alpha,\bar{A}) \displaystyle = \displaystyle -(\bar{\gamma}+\frac{\bar{A}\sigma^{2}}{2\alpha}S^{\prime})\bar{A}S^{\prime}  
\displaystyle \hat{\pi}^{\ast}(\gamma) \displaystyle \sim \displaystyle \mathcal{N}(\bar{\gamma}+\frac{\bar{A}\sigma^{2}}{\alpha}S^{\prime},\sigma^{2}).  

It is straightforward to show that  H(\cdot) is strictly negative, strictly increasing in  \alpha , and strictly decreasing in both  \bar{\gamma} and  \sigma^{2} . In addition, the worst-case distribution for  \gamma also follows a normal distribution, and  \hat{\pi}^{\ast}(\gamma) and  \pi(\gamma) differ only in their means. That is, when choosing the worst-case model, nature only alters the mean of  \gamma , rather than its variance. As a by-product, the relative entropy of  \hat{\pi}^{\ast}(\gamma) with respect to  \pi^{\ast}(\gamma) is given by
\displaystyle \rho(\hat{\pi}^{\ast}(\gamma),\pi^{\ast}(\gamma))=\frac{{\bar{A}}^{2}\sigma^{2}S^{\prime2}}{2\alpha^{2}}.    
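
The expressions for  H(\cdot) , the worst-case mean, and the relative entropy can be cross-checked against each other. Tilting a normal density by  e^{\bar{A}\gamma S^{\prime}/\alpha} shifts its mean by  \bar{A}\sigma^{2}S^{\prime}/\alpha and leaves its variance unchanged, and the minimized objective equals the expected damage term under the worst-case mean plus  \alpha times the relative entropy. The sketch below verifies this identity numerically. It uses  \bar{A}=\theta/(1-\beta\theta) , which follows from the identity  (1-\beta\theta)\alpha\Delta=\theta\lambda^{-1} with  \Delta=\bar{A}/(\alpha\lambda) . The value  S^{\prime}=800 is a hypothetical input, and the function names are illustrative.

```python
import math

THETA = 0.3
BETA = 0.985 ** 10
A_BAR = THETA / (1 - BETA * THETA)

def H(S_next, alpha, gamma_bar, sigma2, A_bar=A_BAR):
    """H(S'; alpha, A_bar) = -(gamma_bar + A_bar*sigma2*S'/(2*alpha)) * A_bar * S'."""
    return -(gamma_bar + A_bar * sigma2 * S_next / (2 * alpha)) * A_bar * S_next

def worst_case_mean(S_next, alpha, gamma_bar, sigma2, A_bar=A_BAR):
    """Mean of gamma after tilting the normal density by exp(A_bar*gamma*S'/alpha)."""
    return gamma_bar + A_bar * sigma2 * S_next / alpha

def relative_entropy(S_next, alpha, sigma2, A_bar=A_BAR):
    """Relative entropy of the worst-case normal with respect to the approximating one."""
    return A_bar ** 2 * sigma2 * S_next ** 2 / (2 * alpha ** 2)

# Values of gamma_bar and sigma2 as in the normal calibration; S_next is hypothetical.
gamma_bar, sigma2, alpha, S_next = 7.93e-5, 2.65e-8, 0.01, 800.0

mean_hat = worst_case_mean(S_next, alpha, gamma_bar, sigma2)
entropy = relative_entropy(S_next, alpha, sigma2)

# Expected damage term under the worst-case mean, plus the entropy penalty.
decomposed = -A_BAR * S_next * mean_hat + alpha * entropy
direct = H(S_next, alpha, gamma_bar, sigma2)

# Relative entropy between two normals with equal variance: (shift in mean)^2 / (2*sigma2).
kl_from_means = (mean_hat - gamma_bar) ** 2 / (2 * sigma2)
```

Since  H(\cdot) is quadratic in  S^{\prime} rather than logarithmic, it stays finite for every level of  S^{\prime} , which is why no "breaking point" arises in the normal case.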

To complete the model, we need to replace the term  \alpha\log(1-\Delta S^{\prime}) in equation (14) with  -(\bar{\gamma}+\frac{\bar{A}\sigma^{2}}{2\alpha}S^{\prime})\bar{A}S^{\prime} . Accordingly, the optimality conditions for  E_{1} ,  E_{2} , and  E_{3} remain intact, except that the values of the externality associated with  P ,  T , and  E_{2} (or  E_{1} ), respectively, are now as follows:
\displaystyle \hat{\Lambda}_{t}^{P} \displaystyle = \displaystyle \frac{\beta\theta\bar{\gamma}}{1-\beta}+\frac{\theta\bar{A}\sigma^{2}}{\alpha}\sum\limits _{j=1}^{+\infty}\beta^{j}S_{t+j}  
\displaystyle \hat{\Lambda}_{t}^{T} \displaystyle = \displaystyle \frac{\beta(1-\phi)\theta\bar{\gamma}}{1-\beta(1-\phi)}+\frac{\theta\bar{A}\sigma^{2}}{\alpha}\sum\limits _{j=1}^{+\infty}[\beta(1-\phi)]^{j}S_{t+j}  
\displaystyle \hat{\Lambda}_{t}^{S} \displaystyle = \displaystyle \phi_{L}\hat{\Lambda}_{t}^{P}+\frac{(1-\phi_{L})\phi_{0}}{1-\phi}\hat{\Lambda}_{t}^{T}.  

Note that  \hat{\Lambda}_{t}^{S} reduces to the previous expression as  \alpha\rightarrow+\infty , or as  \sigma^{2}\rightarrow0 . That is,
\displaystyle \hat{\Lambda}_{t}^{S}=\theta\bar{\gamma}\left[\frac{\phi_{L}\beta}{1-\beta}+\frac{(1-\phi_{L})\phi_{0}\beta}{1-(1-\phi)\beta}\right]\quad\text{as }\alpha\rightarrow+\infty\text{ or }\sigma^{2}\rightarrow0.

We will consider three cases regarding the initial stock of fossil fuel:  R_{0}=253.8GtC ,  R_{0}=8000GtC , and  R_{0}=\infty . While the  R_{0}=\infty case is for expository purposes only, the other two cases are of interest. Indeed, the total stock of oil and gas is estimated to exceed  8000GtC if methane hydrates are included. Estimated resources of methane hydrates vary, but they alone can amount to as much as  2.1\times10^{4}GtC .13 For each case, we numerically solve the above problem for  \alpha =0.01 and for  \alpha=  +\infty . To draw an even closer comparison with GHKT, we have rescaled  \gamma by a factor of  1/\theta , where  \theta is the share of capital. The reason is that, given a Cobb-Douglas specification in final goods production, and given 100 percent depreciation of capital, a proportional damage of  e^{-\gamma S^{\prime}} on capital is equivalent to a proportional damage of  e^{-\theta\gamma S^{\prime}} on output. Accordingly, the mean and variance of  \gamma in the approximating model are set to  \bar{\gamma}=7.93\times10^{-5} and  \sigma^{2}=2.65\times10^{-8} , respectively.
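
The rescaling can be verified with a short calculation. The sketch below divides the baseline mean  \lambda^{-1}=2.379\times10^{-5} from the calibration table by  \theta=0.3 and checks the stated equivalence between capital and output damages under a Cobb-Douglas technology. The capital level, the carbon stock, and the composite `other_inputs` (standing for the labor and energy terms) are hypothetical numbers used only for illustration.

```python
import math

THETA = 0.3                 # capital share
GAMMA_BASELINE = 2.379e-5   # lambda^{-1} in the calibration table

# Rescaling gamma by 1/theta gives the mean used in the normal case.
gamma_bar_capital = GAMMA_BASELINE / THETA

def output(K, other_inputs, theta=THETA):
    """Cobb-Douglas output K^theta * other_inputs, with non-capital inputs lumped together."""
    return K ** theta * other_inputs

def output_with_capital_damage(K, other_inputs, gamma, S_next, theta=THETA):
    """Output when capital is scaled by exp(-gamma * S') before production."""
    return output(math.exp(-gamma * S_next) * K, other_inputs, theta)

def output_with_output_damage(K, other_inputs, gamma, S_next, theta=THETA):
    """Output when output itself is scaled by exp(-theta * gamma * S')."""
    return math.exp(-theta * gamma * S_next) * output(K, other_inputs, theta)

# Hypothetical inputs (illustration only).
K, other_inputs, S_next = 0.15, 1.2, 900.0
y_capital = output_with_capital_damage(K, other_inputs, gamma_bar_capital, S_next)
y_output = output_with_output_damage(K, other_inputs, gamma_bar_capital, S_next)
```

The two damaged output levels coincide, and  \theta times the rescaled mean recovers the baseline value, so a damage coefficient of  7.93\times10^{-5} on capital corresponds to  2.379\times10^{-5} on output.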

Below we plot the same quantities as those shown in Figure 4 through Figure 6, but under the assumption that the approximating distribution of  \gamma is normal. Our focus here is to compare the effects of model uncertainty on optimal oil use under different values of  R_{0} . As we have discussed earlier, holding other parameters fixed, the optimal path of oil consumption is determined jointly by the resource scarcity effect and the model uncertainty effect. First, note that we can hardly identify a difference between the robust and the non-robust optimal paths for oil consumption when the scarcity effect dominates, that is, when  R_{0} is sufficiently small. Figure 7 shows that when  R_{0}=253.8GtC , the non-robust optimal paths replicate their counterparts in GHKT. In this case, model uncertainty delays the optimal use of oil only slightly. However, Figure 10 displays an altogether different pattern. When  R_{0} is set to  8000GtC , although both paths are still decreasing over time, model uncertainty discourages the use of oil substantially. Finally, as  R_{0} goes to infinity, as shown in Figure 12, we observe a qualitative difference between the two paths. On the one hand, the non-robust optimal path allows the use of oil to grow unboundedly, partially due to the technological progress in the coal and green sectors. On the other hand, under the robust optimal path, the increasing trend in oil consumption is curbed due to the externality caused by carbon emissions.

Figure 7: Optimal Use of Energy when  R_{0}=253.8

Figure 7: Optimal Use of Energy when R0 = 253.8. This figure contains four separate graphs: the optimal green energy use graph (upper left), the optimal coal use graph (upper right), the optimal oil use graph (lower left), and the optimal carbon concentration (net of the preindustrial level) graph (lower right). Each is a line graph depicting two separate trends. In every graph the x-axis is the year and it ranges from 0 to 200, and the legend defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = inf).
Upper left graph, optimal green energy use: The y-axis is gigatons of carbon (GtC) per year and it ranges from 0 to 80. The legend is on the middle left side of the graph. The two trends completely overlap. At year 0, the GtC per year is 0. At year 50, the GtC per year is approximately 5. At year 100, the GtC per year is slightly larger than 20. At year 155, the GtC per year is 63.
Upper right graph, optimal coal use: The y-axis is GtC per year and it ranges from 0 to 5. The legend is on the middle right side of the graph. The robust optimal path (solid line) remains constant from year 0 to year 155. The GtC per year is just below 1 for this entire period. The non-robust optimal path (dotted line) starts at year 0 at a GtC per year of approximately 2.5. The trend increases until year 100, at which point the GtC per year is 0.425. It ends around year 155 at a GtC per year of approximately 5.
Lower left graph, optimal oil use: The y-axis is GtC per year and it ranges from 0.5 to 3.5. The legend is on the upper right side of the graph. The robust optimal path (solid line) starts at year 0 at a GtC per year of 2.8. It declines steadily until year 100, where the GtC per year is 1. By year 152 the GtC per year is 0.5. The non-robust optimal path (dotted line) starts at year 0 at a GtC per year of 3.5. It declines sharply until year 50, where the GtC per year is 1.6. After year 50, the non-robust optimal path is similar to the robust optimal path, with each GtC per year value being slightly lower toward the end of the period.
Lower right graph, optimal carbon concentration (net of the preindustrial level): The y-axis is GtC and it ranges from 200 to 600. The legend is on the upper left side of the graph. The robust optimal path (solid line) starts at year 0 at a GtC of 200. By year 50 the GtC is approximately 300. By year 155 the GtC is approximately 325. The non-robust optimal path (dotted line) starts at year 0 at a GtC of 200 and rises steadily with the year. At year 155, the GtC is 600.


Figure 8: Increases in Global Temperature when  R_{0}=253.8

Figure 8: Increases in Global Temperature when R0 = 253.8. This is a line graph depicting two separate trends. The x-axis is the year and it ranges from 1 to 160. The y-axis is the temperature measured in degrees Celsius and it ranges from 0 to 3.2. The legend is on the bottom right side of the graph. It defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = inf). The robust optimal path (solid line) starts at year 0 with a temperature of 1.4C. At year 70, the temperature has risen to 1.9C. At year 160, it has risen to 2.0C. The non-robust optimal path (dotted line) starts at year 0 with a temperature of 1.4C, and the temperature increases over time. At year 80, the temperature has risen to approximately 2.2C. At year 160, it has increased to 3.0C.


Figure 9: Capital Stock and Output when  R_{0}=253.8

Figure 9: Capital Stock and Output when R0 = 253.8. This figure contains six line graphs, each depicting two trends. The upper left graph shows total damages as a percent of capital stock (based on the approximating model), and the upper right graph shows the same quantity based on the worst case model. The middle left graph shows the capital stock (based on the approximating model), and the middle right graph shows the capital stock (based on the worst case model). The lower left graph shows GDP per period (based on the approximating model), and the lower right graph shows GDP per period (based on the worst case model). In every graph, the x-axis is the year and it ranges from 0 to 200, and the legend defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = inf).

Upper left graph (total damages as a percent of capital stock, approximating model): The y-axis is the percent of capital stock and it ranges from 0 to 6. The legend is on the upper left side. The robust optimal path (solid line) shows modest increases, with the trend leveling off around year 100. At year 0, damages are just below 2 percent of the capital stock. By year 100, they are approximately 3 percent. After year 100, they level off at around 3.1 percent. The non-robust optimal path (dotted line) shows a sharper increase over time. At year 0, damages are just below 2 percent of the capital stock. By year 100, they are around 3.25 percent. By year 155, they are approximately 5 percent.

Upper right graph (total damages as a percent of capital stock, worst case model): The y-axis is the percent of capital stock and it ranges from 0 to 40. The legend is on the upper left side. The robust optimal path (solid line) shows modest increases, with the trend leveling off around year 100. At year 0, damages are just below 5 percent of the capital stock. By year 100, they are approximately 15 percent. After year 100, they level off at around 17 percent. The non-robust optimal path (dotted line) shows a sharper increase over time. At year 0, damages are just below 5 percent of the capital stock. By year 100, they are around 22 percent. By year 155, they are approximately 38 percent.

Middle left graph (capital stock, approximating model): The y-axis is the capital stock and it ranges from 0.1 to 0.2. The legend is on the lower right side. Both trends completely overlap until around year 100, where the non-robust optimal path (dotted line) trends slightly lower than the robust optimal path (solid line). At year 0, the capital stock is 0.1. At year 25, it is 0.16. At year 50, the trend levels off at a capital stock of 0.17.

Middle right graph (capital stock, worst case model): The y-axis is the capital stock and it ranges from 0.05 to 0.2. The legend is on the lower left side. Both trends overlap until year 48, at which point the non-robust optimal path (dotted line) trends considerably lower. At year 0, the capital stock is 0.1. At year 48, the robust optimal path (solid line) levels off at a capital stock of 0.148. At the same point, the non-robust optimal path starts to decline steadily, reaching a capital stock of 0.08 at year 155.

Lower left graph (GDP per period, approximating model): The y-axis is the GDP per period (10 years) and it ranges from 0.5 to 0.7. The legend is on the lower right side. Both trends almost completely overlap at each data point. At year 0, the GDP per period is 0.58. At year 50, it has risen to 0.65, at which point it starts to level off. At year 155, the GDP per period is 0.585.

Lower right graph (GDP per period, worst case model): The y-axis is the GDP per period (10 years) and it ranges from 0.55 to 0.65. The legend is on the lower left side. The two trends almost completely overlap until year 48, at which point the non-robust optimal path (dotted line) declines sharply while the robust optimal path (solid line) levels off. At year 0, the GDP per period is 0.56. At year 48, it is approximately 0.635. After year 48, the non-robust optimal path decreases sharply until it ends in year 155 with a GDP per period of 0.557.


Figure 10: Optimal Use of Energy when  R_{0}=8000

Figure 10: Optimal Use of Energy when R0 = 8000. This figure contains four line graphs, each depicting two trends. The upper left graph shows optimal green energy use, the upper right graph optimal coal use, the lower left graph optimal oil use, and the lower right graph the optimal carbon concentration (net of the preindustrial level). In every graph, the x-axis is the year and it ranges from 0 to 200, and the legend defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = inf).

Upper left graph (optimal green energy use): The y-axis is gigatons of carbon (GtC) per year and it ranges from 0 to 150. The legend is on the upper left side. The two trends almost completely overlap. The non-robust optimal path (dotted line) increases slightly more than the robust optimal path (solid line) from around year 100. At year 0, the GtC per year is 0. At year 50, it is around 5. At year 100, it is approximately 20. After year 100, it increases more rapidly. By year 200, the GtC per year is approximately 130 for the robust optimal path and slightly higher for the non-robust optimal path.

Upper right graph (optimal coal use): The y-axis is GtC per year and it ranges from 0 to 6. The legend is on the middle right side. The robust optimal path (solid line) remains constant for the entire period at approximately 0.8 GtC per year. The non-robust optimal path (dotted line) starts at year 0 at about 2.5 GtC per year and increases steadily until around year 198, when it is approximately 5.9 GtC per year.

Lower left graph (optimal oil use): The y-axis is GtC per year and it ranges from 0 to 30. The legend is on the middle right side. The robust optimal path (solid line) remains fairly constant for the entire period at approximately 4.5-5 GtC per year. The non-robust optimal path (dotted line) starts at year 0 at approximately 27 GtC per year and remains constant until approximately year 150. After year 150, it declines until year 198, at which point it is around 23 GtC per year.

Lower right graph (optimal carbon concentration, net of the preindustrial level): The y-axis is GtC and it ranges from 0 to 3000. The legend is on the upper left side. The robust optimal path (solid line) starts at approximately 200 GtC at year 0 and increases modestly to 500 GtC at year 198. The non-robust optimal path (dotted line) increases over time. At year 0, it is approximately 200 GtC. At year 198, it is approximately 2800 GtC.


Figure 11: Increases in Global Temperature when  R_{0}=8000

Figure 11: Increases in Global Temperature when R0 = 8000. This figure is a line graph depicting two separate trends. The x-axis is the year and it ranges from 0 to 200. The y-axis is the temperature measured in degrees Celsius and it ranges from 1 to 8. The legend is on the upper left side of the graph. It defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = inf). The robust optimal path (solid line) starts at year 0 with a temperature of 1.5C. At year 198, the temperature has risen to 3C. In the non-robust optimal path (dotted line), the temperature increases steadily over time. At year 0, the temperature is 1.5C. At year 198, it has risen to 7.8C.


Figure 12: Optimal Use of Energy when  R_{0}=\infty

Figure 12: Optimal Use of Energy when R0 = infinity. This figure contains four line graphs, each depicting two trends. The upper left graph shows optimal green energy use, the upper right graph optimal coal use, the lower left graph optimal oil use, and the lower right graph the optimal carbon concentration (net of the preindustrial level). The results are similar to the results in Figure 10, except for the optimal oil use graph. In Figure 10, the non-robust optimal path (dotted line) starts to decline around year 150. In Figure 12, the non-robust optimal path does not decline but makes a modest increase from 23 to 25 GtC per year. In every graph, the x-axis is the year and it ranges from 0 to 200, and the legend defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = inf).

Upper left graph (optimal green energy use): The y-axis is gigatons of carbon (GtC) per year and it ranges from 0 to 150. The legend is on the upper left side. The two trends almost completely overlap. The non-robust optimal path (dotted line) increases slightly more than the robust optimal path (solid line) from around year 100. At year 0, the GtC per year is 0. At year 50, it is around 5. At year 100, it is approximately 20. After year 100, it increases more rapidly. By year 200, the GtC per year is approximately 130 for the robust optimal path and slightly higher for the non-robust optimal path.

Upper right graph (optimal coal use): The y-axis is GtC per year and it ranges from 0 to 5. The legend is on the middle right side. The robust optimal path (solid line) remains constant for the entire period at approximately 0.8 GtC per year. The non-robust optimal path (dotted line) starts at year 0 at about 2.25 GtC per year and increases steadily until around year 198, when it is approximately 4.9 GtC per year.

Lower left graph (optimal oil use): The y-axis is GtC per year and it ranges from 0 to 30. The legend is on the middle right side. The robust optimal path (solid line) remains fairly constant for the entire period at approximately 4.5-5 GtC per year. The non-robust optimal path (dotted line) starts at year 0 at approximately 23 GtC per year and increases slightly. At year 198, it is approximately 25 GtC per year.

Lower right graph (optimal carbon concentration, net of the preindustrial level): The y-axis is GtC and it ranges from 0 to 3000. The legend is on the upper left side. The robust optimal path (solid line) starts at approximately 200 GtC at year 0 and increases modestly to 500 GtC at year 198. The non-robust optimal path (dotted line) increases over time. At year 0, it is approximately 200 GtC. At year 198, it is approximately 2800 GtC.


We now turn to a comparative analysis of the damages resulting from fossil fuel consumption. GHKT assume  R_{0}=253.8 GtC and estimate damages of $56.9/ton of carbon using an annual discount rate of 1.5% and $496/ton under a rate of 0.1%. When  \beta=  0.985^{10} , and if there is no concern about model uncertainty (  \alpha =\infty ), the welfare loss implied by our model equals  0.985^{10}\times56.4=\$48.5/ ton. This number is independent of the approximating distribution for  \gamma , the initial stock of oil, and the future path of the GHG concentration. When  \alpha =0.01 , however, these factors can matter substantially, as seen below. If the approximating distribution is normal, the losses (in dollars per ton of carbon) are given in the following table, where rows correspond to the initial stock of oil  R_{0} and columns to the value of  \alpha .

 R_{0} \ \alpha     0.01      0.1       1         100       \infty
 253.8 GtC          239.60    70.65     50.85     48.52     48.49
 8000 GtC           276.60    90.60     55.08     48.57     48.49
 \infty             318.70    103.06    63.42     56.49     48.49
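
The arithmetic behind the no-uncertainty benchmark, and the size of the robustness premium in the table, can be checked with a short script. This is only an illustrative sketch: the numbers are copied from the text and table above, while the function and variable names (baseline_loss, ratio_to_baseline, LOSSES) are hypothetical and not part of the original model code.

```python
import math

# Values of alpha heading the table columns (alpha = inf means no concern
# about model uncertainty).
ALPHAS = [0.01, 0.1, 1.0, 100.0, math.inf]

# Losses in $ per ton of carbon, copied from the table above.
# Keys are the initial oil stock R_0.
LOSSES = {
    "253.8 GtC": [239.60, 70.65, 50.85, 48.52, 48.49],
    "8000 GtC": [276.60, 90.60, 55.08, 48.57, 48.49],
    "infinite": [318.70, 103.06, 63.42, 56.49, 48.49],
}


def baseline_loss(annual_factor=0.985, years_per_period=10, undiscounted=56.4):
    """No-uncertainty loss as stated in the text: 0.985**10 * 56.4."""
    return annual_factor ** years_per_period * undiscounted


def ratio_to_baseline(row):
    """Divide each entry of a table row by that row's alpha = inf entry."""
    return [value / row[-1] for value in row]


if __name__ == "__main__":
    print(f"baseline loss: {baseline_loss():.2f} $/ton")
    for r0, row in LOSSES.items():
        for alpha, ratio in zip(ALPHAS, ratio_to_baseline(row)):
            print(f"R_0 = {r0}, alpha = {alpha}: {ratio:.2f}x baseline")
```

Expressing each entry as a multiple of the  \alpha =\infty column makes it easy to see how quickly the implied loss grows as  \alpha falls, for each assumed initial stock of oil.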

4.2 Varying the Resource Feasibility Constraint

In order to further explore the model's implications, we now report the results for the limit case where oil is in infinite supply, while coal is constrained by an initial stock  R_{coal}=666 GtC. This case demonstrates that the optimal use of oil mimics that in the case where both oil and coal are in infinite supply. In addition, the use of coal increases steadily at the beginning and then starts to drop.

Figure 13: Optimal Use of Energy when  R_{coal}=666

Figure 13: Optimal Use of Energy when Rcoal = 666. This figure contains four line graphs, each depicting two trends. The upper left graph shows optimal green energy use, the upper right graph optimal coal use, the lower left graph optimal oil use, and the lower right graph the optimal carbon concentration (net of the preindustrial level). The results are similar to those in Figure 12, except for the optimal coal use and optimal oil use graphs. In every graph, the x-axis is the year and it ranges from 0 to 200, and the legend defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = inf).

Upper left graph (optimal green energy use): The y-axis is gigatons of carbon (GtC) per year and it ranges from 0 to 150. The legend is on the upper left side. The two trends almost completely overlap. The non-robust optimal path (dotted line) increases slightly more than the robust optimal path (solid line) from around year 100. At year 0, the GtC per year is 0. At year 50, it is around 5. At year 100, it is approximately 20. After year 100, it increases more rapidly. By year 200, the GtC per year is approximately 130 for the robust optimal path and slightly higher for the non-robust optimal path.

Upper right graph (optimal coal use): The y-axis is GtC per year and it ranges from 0 to 3.5. The legend is on the middle left side. The robust optimal path (solid line) remains constant for the entire period at approximately 0.5-0.8 GtC per year. The non-robust optimal path (dotted line) starts at year 0 at about 2.25 GtC per year. At year 70, it reaches its highest level of around 3.25 GtC per year. After year 105, it declines, reaching 1.8 GtC per year at year 198.

Lower left graph (optimal oil use): The y-axis is GtC per year and it ranges from 0 to 30. The legend is on the middle right side. The robust optimal path (solid line) remains fairly constant for the entire period at approximately 4.5-5 GtC per year. The non-robust optimal path (dotted line) starts at 27 GtC per year at year 0 and increases modestly to 29 GtC per year at year 198.

Lower right graph (optimal carbon concentration, net of the preindustrial level): The y-axis is GtC and it ranges from 0 to 3000. The legend is on the upper left side. The robust optimal path (solid line) starts at approximately 200 GtC at year 0 and increases modestly to 500 GtC at year 198. The non-robust optimal path (dotted line) increases over time. At year 0, it is approximately 200 GtC. At year 198, it is approximately 2800 GtC.


Figure 14: Increases in Global Temperature when  R_{coal}=666

Figure 14: Increases in Global Temperature when Rcoal = 666. This figure is a line graph depicting two separate trends. The x-axis is the year and it ranges from 0 to 200. The y-axis is the temperature measured in degrees Celsius and it ranges from 1 to 8. The legend is on the upper left side of the graph. It defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = inf). The robust optimal path (solid line) starts at year 0 with a temperature of 0.5C. The temperature increases over time, reaching 3C at year 198. The non-robust optimal path (dotted line) starts at 0.5C at year 0, and the temperature increases more rapidly. At year 198, it has risen to 7.8C.


Figure 15: Capital Stock and Output when  R_{coal}=666

Figure 15: Capital Stock and Output when Rcoal = 666. This figure contains six line graphs, each depicting two trends. The upper left graph shows total damages as a percent of capital stock (based on the approximating model), and the upper right graph shows the same quantity based on the worst case model. The middle left graph shows the capital stock (based on the approximating model), and the middle right graph shows the capital stock (based on the worst case model). The lower left graph shows GDP per period (based on the approximating model), and the lower right graph shows GDP per period (based on the worst case model). In every graph, the x-axis is the year and it ranges from 0 to 200, and the legend defines the solid line as the robust optimal path (alpha = 0.01) and the dotted line as the non-robust optimal path (alpha = inf).

Upper left graph (total damages as a percent of capital stock, approximating model): The y-axis is the percent of capital stock and it ranges from 0 to 30. The legend is on the upper left side. The robust optimal path (solid line) remains roughly constant, ranging from 4.5 to 5 percent throughout. For the non-robust optimal path (dotted line), damages increase over time. At year 0, damages are 0 percent of the capital stock. At year 198, they are approximately 20 percent.

Upper right graph (total damages as a percent of capital stock, worst case model): The y-axis is the percent of capital stock and it ranges from 0 to 100. The legend is on the upper left side. For the robust optimal path (solid line), damages increase over time. At year 0, damages are 0 percent of the capital stock. At year 198, they are approximately 45 percent. The non-robust optimal path (dotted line) is an increasing line with a slight curvature near year 50 that levels off near year 100. At year 0, damages are 0 percent of the capital stock. At year 100, they are just below 100 percent. At year 150, they are 100 percent.

Middle left graph (capital stock, approximating model): The y-axis is the capital stock and it ranges from 0.1 to 0.2. The legend is on the lower right side. The two trends overlap until around year 50, when the non-robust optimal path (dotted line) starts to decline while the robust optimal path (solid line) starts a modest increase. At year 0, the capital stock is 0.1. At year 25, it is 0.175. At year 50, the non-robust optimal path starts to decline, and by year 150 its capital stock is 0.15. At year 50, the capital stock is 0.175 for the robust optimal path, and by year 198 it is 0.18.

Middle right graph (capital stock, worst case model): The y-axis is the capital stock and it ranges from 0.1 to 0.2. The legend is on the lower left side. At year 0, both paths have a capital stock of 0.1. By year 10, the non-robust optimal path (dotted line) starts to decline. By year 105, its capital stock is 0 and remains constant for the rest of the period. The robust optimal path (solid line) decreases modestly. At year 25, its capital stock is 0.15. Shortly after year 25, it starts a modest decline. By year 198, its capital stock is 0.1 (its original starting point).

Lower left graph (GDP per period, approximating model): The y-axis is the GDP per period (10 years) and it ranges from 0.4 to 0.8. The legend is on the lower left side. Both trends follow a similar path, with the non-robust optimal path (dotted line) slightly higher at each data point. Each trend increases slightly in the early period but levels off shortly after year 25. At year 0, the GDP per period is 0.59 for the robust optimal path (solid line) and 0.6 for the non-robust optimal path. At year 25, the trends reach their highest point: the GDP per period is 0.65 for the robust optimal path and 0.7 for the non-robust optimal path. After year 25, both trends level off, remaining between 0.65 and 0.7 for the robust optimal path and at 0.7 for the non-robust optimal path.

Lower right graph (GDP per period, worst case model): The y-axis is the GDP per period (10 years) and it ranges from 0 to 1. The legend is on the lower left side. The robust optimal path (solid line) remains roughly constant for the entire period, ranging from 0.6 to 0.65. The non-robust optimal path (dotted line) follows a similar trend to the robust optimal path until around year 25. After year 25, the non-robust optimal path starts to decrease. At year 50, its GDP per period is approximately 0.45. By year 150, it is 0.1. By year 198, it is slightly above 0.


5 Discussion

We studied optimal taxation in a dynamic stochastic general equilibrium model where agents are concerned about model uncertainty regarding climate change. We used robust control theory in order to model the uncertainty associated with climate change. Our work builds heavily on the model introduced in GHKT. While admittedly restrictive, this framework allows us to derive an analytical solution. In contrast to the existing literature, we used an estimate of fossil fuel that includes methane hydrates as part of the supply of unconventional natural gas. While this huge resource is not readily available with today's technology, we believe that it is appropriate to include it given the long-term modeling that we follow throughout this exercise. Finally, we assumed a fat-tailed distribution of damages as a way to capture the extreme effects discussed in Stern (2013).

We obtained a sharp analytical solution for the implied externality, and we characterized the optimal tax. We found that a small increase in the concern about model uncertainty can cause a significant drop in optimal energy extraction. The optimal tax that restores the socially optimal allocation was shown to be Pigouvian. Under more general assumptions, we developed a recursive method that allowed us to solve the model computationally. We showed that the introduction of uncertainty matters in a number of ways, both qualitatively and quantitatively. The size of this effect depends heavily on specific assumptions about the magnitude of fossil fuel reserves. As our model is based on GHKT, it is worth discussing some of the main differences in our results.

Several of the variables in the model developed in GHKT can be thought of as being subject to uncertainty. These include the variables governing the dynamics of CO _{2} concentration, those governing productivity growth and hence future production, the costs of alternative sources of energy (coal, oil-&-gas, and renewable), and, finally, the damages caused by the concentration of atmospheric CO _{2} . In this paper we concentrate on the uncertainty associated with damages from CO _{2} concentration. As in GHKT, we conclude that the consumption of coal should be constrained. However, as we consider a higher stock of hydrocarbons, we derive different results regarding total consumption of fossil fuel. In particular, we show that under a less binding resource constraint, hydrocarbon use declines significantly as the concern about model uncertainty increases.

The core theoretical result in GHKT is that, when expressed as a proportion of GDP, the optimal tax on CO _{2} emissions depends only on the discount factor, the measure of the expected damage, and the depreciation of atmospheric CO _{2} . In particular, the tax rate is independent of the stochastic value of future output and the stock of atmospheric CO _{2} . They derive this result based on three main assumptions: (i) logarithmic utility (which implies a constant saving rate), (ii) the climate damage is proportional to GDP and has constant elasticity with respect to the level of atmospheric CO _{2} , and (iii) the stock of CO _{2} is linear in past and current emissions. We show that once we consider model uncertainty, the Pigouvian tax can still implement the optimal allocation, as in GHKT. However, the expected level of damage is no longer sufficient for determining the optimal tax. Specifically, the optimal tax rises as the concern about uncertainty increases, even though the expected damages remain unchanged.

Our model can be extended in many ways. For comparison purposes, we tried to stay close to the parametrization used in GHKT. We could study versions of the model under different parametrizations. In the current version, the growth rate of renewables is assumed to be independent of the concern about model uncertainty. It would be interesting to endogenize growth in renewable energy productivity. A related extension could involve using a distortionary tax on labor to subsidize R&D in renewables in order to study the effects on energy composition and growth. Additionally, we could study a benchmark case where coal supply is constrained, while assuming an infinite supply of gas and oil. Finally, at the cost of significant additional computational complexity, we could consider more involved climate dynamics.

6 Appendix

6.1 Model Uncertainty and Optimal Energy Extraction

We demonstrate that the optimal level of GHG,  E^{\ast} , has the following properties:  \frac{\partial E^{\ast}}{\partial\delta}<0 and  \frac{\partial E^{\ast}}{\partial\delta}\vert _{\delta=0}=-\infty , where  \delta is the upper bound for entropy allowed in the constraint game.

Proof: Recall that  E^{\ast}=c_{E}(1-\Delta S) and  \delta=\log(1-\Delta S^{\prime\ast})+\frac{\Delta S^{\prime\ast}}{1-\Delta S^{\prime\ast}} , where  S^{\prime\ast}=S+\phi_{0}c_{E}(1-\Delta S) . Define  a=\alpha^{-1} and  b=1-\Delta S^{\prime\ast}=(1-\Delta\phi_{0}c_{E})(1-\Delta S) . It follows immediately that  E^{\ast} is decreasing in  a . In addition, since both  \Delta and  c_{E} are functions of  a , it follows that  b is a function of  a :

\displaystyle b(a)=[1-\Delta(a)\phi_{0}c_{E}(a)][1-\Delta(a)S]    

It is easy to see that  b is decreasing in  a . Thus, it defines  a as an implicit function of  b , with a negative slope. Moreover, we can rewrite  \delta as:
\displaystyle \delta=\log b+\frac{1-b}{b}    

which defines  b as an implicit function of  \delta . Direct calculation shows that  \frac{\partial b}{\partial\delta}=-\frac{b^{2}}{1-b}<0 , as  b\in(0,1) . Thus,
\displaystyle \frac{\partial E^{\ast}}{\partial\delta}=\frac{\partial E^{\ast}}{\partial a}\frac{\partial a}{\partial b}\frac{\partial b}{\partial\delta}<0    

Evaluating this at  \delta=0 , we obtain
\displaystyle \frac{\partial E^{\ast}}{\partial\delta}\vert _{\delta=0}=\left(\frac{\partial E^{\ast}}{\partial a}\vert _{a=0}\right)\left(\frac{\partial a}{\partial b}\vert _{b=1}\right)\left(\frac{\partial b}{\partial\delta}\vert _{\delta=0}\right)    

It is straightforward to show that the first two terms on the right hand side in the above expression are strictly negative and finite, and the last term goes to  -\infty . Therefore,  \frac{\partial E^{\ast}}{\partial\delta}\vert _{\delta=0}=-\infty .
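The key step of the proof, the slope  \frac{\partial b}{\partial\delta}=-\frac{b^{2}}{1-b}  and its divergence as  b\rightarrow1 , can be checked numerically. The following is a minimal sketch rather than part of the original derivation; the function names and the step size are our own illustrative choices.

```python
import math

def entropy_bound(b):
    """delta as a function of b = 1 - Delta * S'^*, for b in (0, 1]."""
    return math.log(b) + (1.0 - b) / b

def db_ddelta(b):
    """Closed-form slope of b with respect to delta, as stated in the proof."""
    return -b * b / (1.0 - b)

def db_ddelta_numeric(b, h=1e-6):
    """Slope obtained by inverting a central finite difference of delta(b)."""
    ddelta_db = (entropy_bound(b + h) - entropy_bound(b - h)) / (2.0 * h)
    return 1.0 / ddelta_db

# As b approaches 1 (that is, as delta approaches 0) the slope diverges to minus infinity.
for b in (0.5, 0.9, 0.99, 0.999):
    print(b, entropy_bound(b), db_ddelta(b), db_ddelta_numeric(b))
```

The printed slopes grow without bound in absolute value as  b  approaches one, which is the source of the infinite derivative of  E^{\ast}  at  \delta=0 .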

6.2 Equivalence Between the Recursive Game and the Date-0 Game

Here we discuss the equivalence between the recursive Stackelberg game and its date-0 counterpart. We concentrate on the one-sector model. In the recursive version of the Stackelberg game, the worst-case model for  \gamma_{t+1} depends on the endogenous state  S_{t} and on the choice variable  E_{t} . This feature can be difficult to interpret.14 Alternatively, we can construct a date-0 Stackelberg game in which the malevolent player, as the leader of the game, first chooses the distorted models of  \{\gamma_{t+1}\} , namely  \{\hat{\pi}(\gamma_{t+1})\} . This leads to  \{\hat{\pi}(\gamma_{t+1})\} being independent of the endogenous states. We then show that, on the equilibrium path, the worst-case models derived from the date-0 Stackelberg game coincide with those derived from the recursive game. We demonstrate the equivalence by using the big  K little  k result, as in Chapter 7 of Hansen and Sargent (2008).

Consider the date-0 Stackelberg game in which, at date zero, the minimizing player chooses the distorted probability process  \{\hat{\pi}(\gamma_{t+1})\} , followed by the maximizing player choosing the control process  \{u_{t}=(C_{t},E_{t})\} :

\displaystyle \inf_{m\in\mathcal{M}}\sup_{u\in\mathcal{U}}   \displaystyle E\left[\sum\limits _{t=0}^{\infty}\beta^{t}M_{t}\left(u(C_{t})+\beta\alpha m_{t+1}\log m_{t+1}\right)\vert S_{0},K_{0}\right] (18)
\displaystyle s.t.      
\displaystyle M_{t+1} \displaystyle = \displaystyle M_{t}m_{t+1}  
\displaystyle S_{t+1} \displaystyle = \displaystyle S_{t}+\phi_{0}E_{t} (19)
\displaystyle K_{t+1} \displaystyle = \displaystyle h(S_{t+1},\gamma_{t+1})[F(K_{t},E_{t})-C_{t}] (20)

where  \mathcal{U} denotes the space of control processes  u=\{u_{t}:t=0,1,...\} and  \mathcal{M} denotes the space of likelihood ratio processes  m=\{m_{t+1}=\frac{\hat{\pi}(\gamma_{t+1})}{\pi(\gamma_{t+1})}:t=0,1,...\} .

We introduce an exogenous state vector process  \{(\hat{S}_{t},\hat{K}_{t})\} which evolves as:

\displaystyle \hat{S}_{t+1} \displaystyle = \displaystyle \hat{S}_{t}+\phi_{0}\hat{E}_{t}(\hat{S}_{t}),  
\displaystyle \hat{K}_{t+1} \displaystyle = \displaystyle h(\hat{S}_{t+1},\gamma_{t+1})[F(\hat{K}_{t},\hat{E}_{t}(\hat{S}_{t}))-\hat{C}_{t}(\hat{S}_{t},\hat{K}_{t})] (21)

where  \hat{E}_{t}(\hat{S}_{t})=c_{E}(1-\Delta\hat{S}_{t}) and  \hat{C}_{t}(\hat{S}_{t},\hat{K}_{t})=(1-\beta\theta)\hat{K}_{t}^{\theta}[\hat{E}_{t}(\hat{S}_{t})]^{\nu} .15 Note that  \{\hat{S}_{t},\hat{K}_{t},\hat{E}_{t},\hat{C}_{t}\} are independent of the control variables  \{E_{t},C_{t}\} . Moreover, we set  (S_{0},K_{0})=(\hat{S}_{0},\hat{K}_{0}) .

Define the distorted process  \{\gamma_{t+1}\} as

\displaystyle \gamma_{t+1}\sim\hat{\pi}(\gamma_{t+1})=\hat{\lambda}(\hat{S}_{t})e^{-\hat{\lambda}(\hat{S}_{t})\gamma_{t+1}}     (22)

where the distorted parameter,  \hat{\lambda} , is given by  \hat{\lambda}(\hat{S}_{t})=\lambda(1-\Delta\hat{S}_{t+1})=\lambda(1-\Delta\phi_{0}c_{E})(1-\Delta\hat{S}_{t}) . The last equality results from equation (48) in the main text; it also follows directly by substituting  \hat{S}_{t+1}=\hat{S}_{t}+\phi_{0}c_{E}(1-\Delta\hat{S}_{t}) . Clearly,  u_{t} does not affect  \hat{S}_{t+1} , and thus does not affect the distorted distribution  \hat{\pi}(\gamma_{t+1}) .
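To illustrate how the exogenous state pins down the distorted rate, the sketch below iterates the law of motion for  \hat{S}_{t}  and compares  \lambda(1-\Delta\hat{S}_{t+1})  with the factored form  \lambda(1-\Delta\phi_{0}c_{E})(1-\Delta\hat{S}_{t}) . All parameter values are hypothetical and chosen only for illustration; they are not the paper's calibration.

```python
# Hypothetical parameter values, for illustration only (not the paper's calibration).
DELTA = 0.001   # the constant Delta; 1 / DELTA bounds the admissible stock
PHI0 = 0.5      # phi_0
C_E = 20.0      # c_E
LAM = 2.0       # lambda, the rate of the approximating exponential distribution

def next_stock(s_hat):
    """Law of motion for the exogenous stock with E_hat = c_E * (1 - Delta * S_hat)."""
    return s_hat + PHI0 * C_E * (1.0 - DELTA * s_hat)

def distorted_rate(s_hat):
    """lambda_hat(S_hat_t) = lambda * (1 - Delta * S_hat_{t+1})."""
    return LAM * (1.0 - DELTA * next_stock(s_hat))

def distorted_rate_factored(s_hat):
    """The same rate written as lambda * (1 - Delta*phi_0*c_E) * (1 - Delta*S_hat_t)."""
    return LAM * (1.0 - DELTA * PHI0 * C_E) * (1.0 - DELTA * s_hat)

s = 100.0   # hypothetical initial stock, below 1 / DELTA
path = []
for t in range(200):
    path.append((s, distorted_rate(s)))
    s = next_stock(s)
```

Along the simulated path the stock rises toward  \frac{1}{\Delta}  while the distorted rate falls, so the worst-case mean of  \gamma  increases over time; none of this depends on the controls  u_{t} .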

Given the above exogenous distorted process, the maximizing player chooses  \{u_{t}\} at date zero to maximize the social welfare given in equation (18). With the aid of the exogenous state, this maximization problem can be expressed in a recursive form as:

\displaystyle \tilde{V}(S_{t},K_{t},\hat{S}_{t},\hat{K}_{t})=\max\limits _{C_{t},E_{t}}\left\{ u(C_{t})+\alpha\beta\int\hat{\pi}(\gamma_{t+1})\log m_{t+1}d\gamma_{t+1}+\beta\int\tilde{V}(S_{t+1},K_{t+1},\hat{S}_{t+1},\hat{K}_{t+1})\hat{\pi}(\gamma_{t+1})d\gamma_{t+1}\right\} ,
subject to equation (19), equation (20), and equation (21). The relative entropy  \int\hat{\pi}(\gamma_{t+1})\log m_{t+1}d\gamma_{t+1} equals  \log(\frac{\hat{\lambda}(\hat{S}_{t})}{\lambda})+\frac{\lambda-\hat{\lambda}(\hat{S}_{t})}{\hat{\lambda}(\hat{S}_{t})} , as shown in the main text. Since  \tilde{V}(\cdot) depends on  (\hat{S}_{t},\hat{K}_{t}) only through  \hat{\pi}(\gamma_{t+1}) or, equivalently,  \hat{\lambda}(\hat{S}_{t}) , the exogenous state  \hat{K}_{t} is eliminated from  \tilde{V}(\cdot) . Consequently, the above problem can be rewritten as:


\displaystyle \tilde{V}(S_{t},K_{t},\hat{S}_{t})=\max\limits _{C_{t},E_{t}}\left\{ u(C_{t})+\alpha\beta\left[\log(\frac{\hat{\lambda}(\hat{S}_{t})}{\lambda})+\frac{\lambda-\hat{\lambda}(\hat{S}_{t})}{\hat{\lambda}(\hat{S}_{t})}\right]+\beta\int\tilde{V}(S_{t+1},K_{t+1},\hat{S}_{t+1})\hat{\pi}(\gamma_{t+1})d\gamma_{t+1}\right\} ,     (23)

subject to equation (19), equation (20), and equation (21).
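The closed form for the relative entropy between the distorted and the approximating exponential densities can be verified by direct numerical integration. The sketch below uses a plain trapezoid rule; the truncation point and the number of grid points are our own assumptions, and the rates in the loop are arbitrary test values.

```python
import math

def relative_entropy_closed_form(lam_hat, lam):
    """log(lam_hat / lam) + (lam - lam_hat) / lam_hat, as used in equation (23)."""
    return math.log(lam_hat / lam) + (lam - lam_hat) / lam_hat

def relative_entropy_numeric(lam_hat, lam, n=200_000):
    """Trapezoid rule for the integral of pi_hat * log(pi_hat / pi) over [0, upper]."""
    upper = 60.0 / lam_hat          # the tail mass beyond this point is negligible
    h = upper / n

    def integrand(g):
        # log of the likelihood ratio m = pi_hat / pi for two exponential densities
        log_ratio = math.log(lam_hat / lam) - (lam_hat - lam) * g
        return lam_hat * math.exp(-lam_hat * g) * log_ratio

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, n))
    return total * h

for lam_hat, lam in ((1.5, 2.0), (0.5, 2.0), (2.0, 2.0)):
    print(lam_hat, lam,
          relative_entropy_closed_form(lam_hat, lam),
          relative_entropy_numeric(lam_hat, lam))
```

The two columns agree up to discretization error, and the entropy is zero when  \hat{\lambda}=\lambda , that is, when the planner does not distort the approximating model.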

We proceed to find the solution to this date-0 problem of the maximizing agent given the distorted process  \{\gamma_{t+1}\} in equation (22). Then we will argue that this solution is identical to the Markov perfect equilibrium of the sequential game defined in the main text. We implement a guess-and-verify method. We first guess that  \tilde{V}(\cdot) takes the form

\displaystyle \tilde{V}(S_{t},K_{t},\hat{S}_{t})=f(S_{t},\hat{S}_{t})+\tilde{A}\log(K_{t})+\tilde{D}
where  \tilde{A} and  \tilde{D} are undetermined coefficients. The functional form for  f(\cdot) will be derived later. Using the analysis above and simplifications in the main text, the problem can be written as


    \displaystyle f(S_{t},\hat{S}_{t})+\tilde{A}\log(K_{t})+\tilde{D}  
  \displaystyle = \displaystyle \max\limits _{C_{t},E_{t}}\left\{ \log(C_{t})+\alpha\beta\left[\log(\frac{\hat{\lambda}(\hat{S}_{t})}{\lambda})+\frac{\lambda-\hat{\lambda}(\hat{S}_{t})}{\hat{\lambda}(\hat{S}_{t})}\right]+\beta\left[f(S_{t+1},\hat{S}_{t+1})+\tilde{A}\log(F(K_{t},E_{t})-C_{t})+\tilde{D}-\frac{\tilde{A}S_{t+1}}{\hat{\lambda}(\hat{S}_{t})}\right]\right\} ,  

subject to equation (19) and equation (21).

Furthermore, we guess that  f(\cdot) takes the form  \tilde{B}\log(1-\Delta\hat{S}_{t})+\frac{\tilde{G}S_{t}}{1-\Delta\hat{S}_{t}} where  \tilde{B} and  \tilde{G} are undetermined coefficients. After some tedious derivations, we obtain

\displaystyle \tilde{A} \displaystyle = \displaystyle \frac{\theta}{1-\beta\theta} (24)
\displaystyle \tilde{G} \displaystyle = \displaystyle \frac{\beta\theta}{(1-\beta\theta)\lambda(\beta-1+\Delta\phi_{0}c_{E})} (25)

and
\displaystyle E_{t}^{opt} \displaystyle = \displaystyle \frac{\nu(1-\Delta\hat{S}_{t+1})}{(1-\beta\theta)\beta\phi_{0}(\frac{\theta}{(1-\beta\theta)\lambda}-\tilde{G})}=c_{E}(1-\Delta\hat{S}_{t}) (26)
\displaystyle C_{t}^{opt} \displaystyle = \displaystyle \frac{F(K_{t},E_{t})}{1+\beta\tilde{A}}=(1-\beta\theta)K_{t}^{\theta}(E_{t}^{opt})^{\nu}. (27)
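
The step from the guess to equation (27) is short enough to spell out. The sketch below only uses the first-order condition of the maximization problem above with respect to  C_{t}  and the value of  \tilde{A}  in equation (24); it takes  F(K_{t},E_{t})=K_{t}^{\theta}E_{t}^{\nu}  as implied by the second equality in equation (27).

```latex
\begin{align*}
\frac{1}{C_{t}} &= \frac{\beta\tilde{A}}{F(K_{t},E_{t})-C_{t}}
  \quad\Longrightarrow\quad
  C_{t}^{opt}=\frac{F(K_{t},E_{t})}{1+\beta\tilde{A}}, \\
1+\beta\tilde{A} &= 1+\frac{\beta\theta}{1-\beta\theta}=\frac{1}{1-\beta\theta}
  \quad\Longrightarrow\quad
  C_{t}^{opt}=(1-\beta\theta)F(K_{t},E_{t}).
\end{align*}
```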

When  (S_{0},K_{0})=(\hat{S}_{0},\hat{K}_{0}) , we obtain  E_{t}^{opt}=\hat{E}_{t}(\hat{S}_{t})=E_{t}^{\ast}(S_{t}) ,  C_{t}^{opt}=\hat{C}_{t}(\hat{S}_{t},\hat{K}_{t})=C_{t}^{\ast}(S_{t},K_{t}) ,  \hat{S}_{t+1}=S_{t+1} , and  \hat{K}_{t+1}=K_{t+1} for  t=0,1,... . In addition,  \hat{\pi}(\gamma_{t+1})=\hat{\lambda}(\hat{S}_{t})e^{-\hat{\lambda}(\hat{S}_{t})\gamma_{t+1}}, where  \hat{\lambda}(\hat{S}_{t})=\lambda(1-\Delta\phi_{0}c_{E})(1-\Delta\hat{S}_{t})=\lambda(1-\Delta\phi_{0}c_{E})(1-\Delta S_{t}) . That is, the optimal choices in the date-0 game coincide with the Markov perfect equilibrium allocation in the recursive game.

6.3 The Numerical Solution for the Model

Here we provide a brief description of our numerical procedure. Assume (i) 100 percent capital depreciation, (ii) Cobb-Douglas production function, and (iii) exponential damage function. Then, it follows from the analysis in Sections 3 and 4 that the value function given in equation (61) takes the form

\displaystyle V(K,N,P,T,R)=f(N,P,T,R)+\bar{A}\log(K)+\bar{D}
where  \bar{A}=\frac{\theta}{1-\beta\theta} and  \bar{D} is a constant. The inner loop minimization problem for  \hat{\pi}(\gamma) remains the same as in the one-sector model in Section 3. Furthermore, the outer loop maximization problem for  E_{i} ,  P^{\prime} ,  T^{\prime} , and  R^{\prime} can be carried out separately from the optimization problem for  C and  \tilde{K}^{\prime} . The solution to the latter also remains the same as in Section 3; i.e.,  C^{\ast}=(1-\beta\theta)Y^{\ast} and  \tilde{K}^{\prime\ast}=\beta\theta Y^{\ast} , where  Y^{\ast} denotes the optimal output level. After substituting for  C^{\ast} , the optimization problem for  E_{i} ,  P^{\prime} ,  T^{\prime} , and  R^{\prime} can be simplified, leading to the standard dynamic programming problem below:
    \displaystyle f(N,P,T,R)=\max\limits _{E_{1},E_{2},E_{3},E,P^{\prime},T^{\prime},S^{\prime},R^{\prime}}  
    \displaystyle \left\{ \frac{1}{1-\beta\theta}\log[(1-\frac{E_{2}}{A_{2}N}-\frac{E_{3}}{A_{3}N})^{1-\theta-\nu}E^{\nu}]+\beta\lbrack f(N^{\prime},P^{\prime},T^{\prime},R^{\prime})+\alpha\log(1-\Delta S^{\prime})]\right\}  
       
  \displaystyle s.t.    
\displaystyle E \displaystyle = \displaystyle (\kappa_{1}E_{1}^{\rho}+\kappa_{2}E_{2}^{\rho}+\kappa_{3}E_{3}^{\rho})^{1/\rho}  
\displaystyle N^{\prime} \displaystyle = \displaystyle (1+g)N  
\displaystyle R^{\prime} \displaystyle = \displaystyle R-E_{1}\geq0  
\displaystyle P^{\prime} \displaystyle = \displaystyle P+\phi_{L}(E_{1}+E_{2})  
\displaystyle T^{\prime} \displaystyle = \displaystyle (1-\phi)T+(1-\phi_{L})\phi_{0}(E_{1}+E_{2})  
\displaystyle S^{\prime} \displaystyle = \displaystyle P^{\prime}+T^{\prime}  
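
The constraint set above maps the current state and the energy choices into next period's state. A minimal sketch of that mapping follows; the class name, the function names, and every parameter value are hypothetical and serve only to make the example runnable.

```python
from dataclasses import dataclass

@dataclass
class Params:
    # Hypothetical values for illustration only (not the paper's calibration).
    kappa: tuple = (0.5, 0.1, 0.4)   # (kappa_1, kappa_2, kappa_3)
    rho: float = 0.5                 # CES exponent in the energy composite
    g: float = 0.02                  # growth rate of N
    phi: float = 0.02                # decay rate of the transitory stock T
    phi_L: float = 0.2               # share of emissions entering the permanent stock P
    phi_0: float = 0.4               # scaling of emissions entering T

def composite_energy(E1, E2, E3, p):
    """E = (kappa_1 E1^rho + kappa_2 E2^rho + kappa_3 E3^rho)^(1/rho)."""
    k1, k2, k3 = p.kappa
    return (k1 * E1 ** p.rho + k2 * E2 ** p.rho + k3 * E3 ** p.rho) ** (1.0 / p.rho)

def transition(N, P, T, R, E1, E2, p):
    """Return (N', P', T', R', S') implied by the constraints of the problem."""
    if E1 > R:
        raise ValueError("E1 exceeds the remaining resource stock R")
    emissions = E1 + E2                       # only E1 and E2 add to the carbon stocks
    N_next = (1.0 + p.g) * N
    R_next = R - E1                           # E1 draws down the finite stock R
    P_next = P + p.phi_L * emissions
    T_next = (1.0 - p.phi) * T + (1.0 - p.phi_L) * p.phi_0 * emissions
    S_next = P_next + T_next
    return N_next, P_next, T_next, R_next, S_next
```

Because  S^{\prime}  is determined by  P^{\prime}  and  T^{\prime} , it does not need to be carried as a separate state variable, which is why  f  has only four arguments.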

We then solve for  f(N,P,T,R) using a 4-dimensional Chebyshev polynomial approximation method. The above simplification significantly reduces the computational burden of solving a dynamic max-min game, allowing us to use the parallel toolbox of MATLAB on an 8-processor computer. Table 1 and Table 2 below report the grid specifications used in the complete model, as well as its variations, for  \alpha =0.01 and  \alpha =\infty , respectively.
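
To convey the idea behind the approximation step, the sketch below fits a one-dimensional Chebyshev interpolant on a bounded interval. It is an illustration in Python rather than the MATLAB implementation used for the paper, it covers one dimension instead of four, and the test function is an arbitrary smooth stand-in, not the value function  f . The interval and the number of nodes merely echo the  R  grid reported below.

```python
import math

def cheb_nodes(n, lo, hi):
    """Roots of the degree-n Chebyshev polynomial, mapped from [-1, 1] to [lo, hi]."""
    return [0.5 * (lo + hi) + 0.5 * (hi - lo) * math.cos(math.pi * (k + 0.5) / n)
            for k in range(n)]

def cheb_coeffs(f, n, lo, hi):
    """Coefficients c_0, ..., c_{n-1} of the interpolant through the n nodes."""
    fx = [f(x) for x in cheb_nodes(n, lo, hi)]
    return [2.0 / n * sum(fx[k] * math.cos(math.pi * j * (k + 0.5) / n)
                          for k in range(n))
            for j in range(n)]

def cheb_eval(coeffs, x, lo, hi):
    """Evaluate sum_j c_j T_j(z) - c_0 / 2, with z the image of x in [-1, 1]."""
    z = (2.0 * x - lo - hi) / (hi - lo)
    theta = math.acos(max(-1.0, min(1.0, z)))   # T_j(z) = cos(j * arccos(z))
    return sum(c * math.cos(j * theta) for j, c in enumerate(coeffs)) - 0.5 * coeffs[0]

def smooth_test_function(x):
    """Arbitrary smooth function standing in for one slice of the value function."""
    return math.exp(-x / 500.0)

LO, HI, N_NODES = 1.0, 900.0, 10    # echoes the R grid: 10 points on [1, 900]
coeffs = cheb_coeffs(smooth_test_function, N_NODES, LO, HI)
grid = [LO + (HI - LO) * i / 200.0 for i in range(201)]
max_error = max(abs(cheb_eval(coeffs, x, LO, HI) - smooth_test_function(x)) for x in grid)
print(max_error)
```

In the multidimensional case one common choice is a tensor product of such one-dimensional bases, so the number of coefficients grows with the product of the grid sizes; this is one reason why the number of grid points per state is kept small.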


Table 1: Grid Specifications of Chebyshev Polynomial Approximation (  \alpha =0.01 )

Columns:
(1) Exponential  \gamma ,  R_{0}=800 GtC (Oil&Gas)
(2) Normal  \gamma ,  R_{0}=253.8 GtC (Oil&Gas)
(3) Normal  \gamma ,  R_{0}=8000 GtC (Oil&Gas)
(4) Normal  \gamma ,  R_{0}=\infty GtC (Oil&Gas)
(5) Normal  \gamma ,  R_{0}=666 GtC (Coal)

                           (1)           (2)           (3)           (4)           (5)
# of grid points for P     6             6             6             6             6
# of grid points for T     7             7             7             7             7
# of grid points for R     10            10            10            NA            10
# of grid points for N     10            10            10            10            10
[P_{min},P_{max}]          [-200,1000]   [-200,1000]   [-200,1000]   [-200,1000]   [-200,1000]
[T_{min},T_{max}]          [-200,1000]   [-200,1000]   [-200,1000]   [-200,1000]   [-200,1000]
[R_{min},R_{max}]          [1,900]       [1,300]       [1,9000]      NA            [1,750]
[N_{min},N_{max}]          [0.8,100]     [0.8,100]     [0.8,100]     [0.8,100]     [0.8,100]


Table 2: Grid Specifications of Chebyshev Polynomial Approximation (  \alpha =\infty )

Columns:
(1) Exponential  \gamma ,  R_{0}=800 GtC (Oil&Gas)
(2) Normal  \gamma ,  R_{0}=253.8 GtC (Oil&Gas)
(3) Normal  \gamma ,  R_{0}=8000 GtC (Oil&Gas)
(4) Normal  \gamma ,  R_{0}=\infty GtC (Oil&Gas)
(5) Normal  \gamma ,  R_{0}=666 GtC (Coal)

                           (1)           (2)           (3)            (4)            (5)
# of grid points for P     6             6             4              6              6
# of grid points for T     7             7             4              7              7
# of grid points for R     10            10            30             NA             10
# of grid points for N     10            10            10             10             10
[P_{min},P_{max}]          [-200,2000]   [-200,1000]   [-200,20000]   [-200,20000]   [-200,3000]
[T_{min},T_{max}]          [-200,2000]   [-200,1000]   [-200,20000]   [-200,20000]   [-200,3000]
[R_{min},R_{max}]          [1,900]       [1,300]       [1,9000]       NA             [1,750]
[N_{min},N_{max}]          [0.8,100]     [0.8,100]     [0.8,100]      [0.8,100]      [0.8,100]

Bibliography

Acemoglu, D., P. Aghion, L. Bursztyn, and D. Hemous (2012): The Environment and Directed Technical Change, American Economic Review 102(1) p. 131-166.
Adao, B., Narajabad B., and T. Temzelides (2012): A Model with Spillovers in the Adaptation of New Renewable Technologies, James A. Baker III Institute for Public Policy Working Paper.
Anderson, E., Hansen, L. P. and T. J. Sargent (2000): Robustness, Detection and the Price of Risk, Mimeo, University of Chicago.
Boswell, R., and T.S. Collett (2011): Current Perspectives on Gas Hydrate Resources, Energy and Environmental Science, 4, 1206-1215.
Barrage (2013): Sensitivity Analysis for Golosov, Hassler, Krusell, and Tsyvinski (2013): Optimal Taxes on Fossil Fuel in General Equilibrium, Manuscript, University of Maryland.
Bidder, R.M., and M.E. Smith (2012): Robust Animal Spirits, Journal of Monetary Economics, 59, 738-750.
Chandrasekharan, P. C. (1996): Robust Control of Linear Dynamical Systems, Academic Press.
Cogley, T., Colacito, R., Hansen, L. P. and T. J. Sargent (2008): Robustness and U.S. Monetary Policy Experimentation, Journal of Money, Credit and Banking, 40 (8), 1599-1623.
Dasgupta and Heal (1974): The Optimal Depletion of Exhaustible Resources, Review of Economic Studies, 41.
Dupuis, P., James, M. R. and I. Petersen (1998): Robust Properties of Risk Sensitive Control, Discussion Paper No. LCDS 98-15, Brown University.
Ellsberg, D. (1961): Risk, Ambiguity and the Savage Axioms, Quarterly Journal of Economics, 75, 643-669.
Epstein, L. G. and T. Wang (1994): Intertemporal Asset Pricing under Knightian Uncertainty, Econometrica, 62 (3), 283-322.
Funke, M., and M. Paetz (2010): Environmental Policy Under Model Uncertainty: A Robust Optimal Control Approach, Climatic Change 107, no. 3-4: 225-239.
Gars, J., Golosov M., and A. Tsyvinski (2009): Carbon Taxing and Alternative Energy, Manuscript, Yale University.
Golosov M., J. Hassler, P. Krusell, and A. Tsyvinski (2013): Optimal Taxes on Fossil Fuel in General Equilibrium, Econometrica, 82 (1), 41-88.
Gilboa, I., and D. Schmeidler (1989): Maxmin Expected Utility with Non-unique Prior, Journal of Mathematical Economics, 18, 141-153.
Hansen L., and T. Sargent (2008): Robustness, Princeton University Press.
Hansen, L. P., Sargent, T. J., and T. Tallarini (1999): Robust Permanent Income and Pricing, Review of Economic Studies, 66, 873-907.
Hansen, L. P., and T. J. Sargent (2001): Robust Control and Model Uncertainty, American Economic Review, 91 (2), 60-66.
Hansen, L. P., and T. J. Sargent (2003): Robust Control of Forward-looking Models, Journal of Monetary Economics, 50 (3), 581-604.
Hansen, L. P., and T. J. Sargent (2005): Robust Control and Model Misspecification, Journal of Economic Theory, 128 (1), 45-90.
Hansen, L. P., and T. J. Sargent (2010): Wanting Robustness in Macroeconomics, Manuscript.
Hansen, L. P., Sargent, T. J., Turmuhambetova, G. A., and N. Williams (2006): Robust Control, Min-Max Expected Utility, and Model Misspecification, Journal of Economic Theory, 128, 45-90.
Hansen, L. P., Sargent, T. J., and N. E. Wang (2002): Robust Permanent Income and Pricing with Filtering, Macroeconomic Dynamics, 6, 40-84.
Hartley, P., Medlock III K., Temzelides T., and X. Zhang (2012): Energy Sector Innovation and Growth, James A. Baker III Institute for Public Policy Working Paper.
Hennlock, M. (2008): A Robust Feedback Nash Equilibrium in a Climate Change Policy Game, Chapter 18 in S. K. Neogy, et al., eds., Mathematical Programming and Game Theory for Decision Making. Statistical Science and Interdisciplinary Research: Volume 1. World Scientific Publishing Co. Pte. Ltd.
Hennlock, M. (2009): Robust Control in Global Warming Management - An Analytical Dynamic Integrated Assessment, Resources for the Future Discussion Paper RFF DP 09-19, May.
Hoel, M. (1978): Climate Change and Carbon Tax Expectations, CESifo Working Paper Series 2966.
Hotelling (1931): The Economics of Exhaustible Resources, Journal of Political Economy, 39:2.
Nordhaus, W., and J. Boyer (2000): Warming the World: Economic Modeling of Global Warming, MIT Press, Cambridge, MA.
Kolstad, C. D. (1996): Learning and Stock Effects in Environmental Regulation: The Case of Greenhouse Gas Emissions, Journal of Environmental Economics and Management, 31, 1-18.
Knight, F. H. (1921): Risk, Uncertainty and Profit, Houghton Mifflin Company.
Krusell P., and A. Smith (2009): Macroeconomics and Global Climate Change: Transition for a Many-Region Economy, Presentation Slides, Yale University.
Lewis, F. L. (1986): Optimal Estimation with an Introduction to Stochastic Control Theory, John Wiley & Sons.
Lemoine D.M., and C. Traeger (2011): Tipping Points and Ambiguity in the Economics of Climate Change, CUDARE Working paper 1111R, Department of Agricultural and Resource Economics, University of California at Berkeley.
Nordhaus W. (2008): A Question of Balance: Weighing the Options on Global Warming Policies, Yale University Press.
Rogner, H.-H. (1997): An Assessment of World Hydrocarbon Resources, Annual Review of Energy and the Environment, 22.
Savage, L. J. (1954): The Foundations of Statistics, John Wiley & Sons.
Sinn, H.-W. (2008): Public Policies against Global Warming: A Supply Side Approach, International Tax and Public Finance, 15(4).
Stern, N. (2007): The Economics of Climate Change: The Stern Review, Cambridge University Press.
Stern N. (2013): The Structure of Economic Modeling of the Potential Impacts of Climate Change: Grafting Gross Underestimation of Risk onto Already Narrow Science Models, Journal of Economic Literature 51(3).
Sterner, T., and M. Hennlock (2011): Knightian Uncertainty and Endogenous Growth, Manuscript.
van der Ploeg, F., and C. Withagen (2012): Too Much Coal, Too Little Oil, Journal of Public Economics, 96(1-2), 62-77.
van der Ploeg, F., and C. Withagen (2012): Growth, Renewables and the Optimal Carbon Tax, International Economic Review.
Weitzman, M.L. (2014): Fat Tails and the Social Cost of Carbon, Manuscript.
Williams, N. (2008): Robust Control, in The New Palgrave Dictionary of Economics, 2nd Edition (S. Durlauf and L. Blume, eds.), Palgrave Macmillan.



Footnotes

* We thank Lars Peter Hansen, participants at the 2013 Midwest Macro Conference and at the Econometric Society Summer 2013 meetings for comments and suggestions. The opinions expressed do not necessarily reflect those of the Federal Reserve System.
1. Acemoglu, Aghion, Bursztyn, and Hemous (2012) study related issues. See Nordhaus and Boyer (2000) and Stern (2007) for earlier work that also points to the importance of uncertainty.
2. See Boswell and Collett (2011), Hartley, Medlock, Temzelides, and Zhang (2012), and references therein for a more detailed discussion on total estimated fossil fuel resources.
3. See also Barrage (2013). Other related work includes Hotelling (1931), Dasgupta and Heal (1974), Nordhaus (2000, 2008), Hoel (1978), Stern (2007), Sinn (2008), Gars, Golosov, and Tsyvinski (2009), Krusell and Smith (2009), and van der Ploeg and Withagen (2012, 2012). GHKT (2013) provide an excellent review of this literature.
4. See, for example, Lewis (1986) and Chandrasekharan (1996).
5. See Knight (1921), Savage (1954), Ellsberg (1961), Gilboa and Schmeidler (1989), Hansen and Sargent (2001 and 2010) for related research.
6. Related work includes Hansen, Sargent and Tallarini (1999), Hansen and Sargent (2003), and Cogley, Colacito, Hansen, and Sargent (2008). See Williams (2008) for a review. In a recent paper, Bidder and Smith (2012) use robust control theory to study the implications of model uncertainty for business cycles generated through "animal spirits."
7. Our specification allows us to assume that, in each period, the social planner moves before nature. The resulting max-min game is easier to analyze. To see the equivalence with GHKT, assume that the economy enters the current period with capital  K and carbon concentration  S . In GHKT, the final good production is given by  A_{0}e^{-\gamma S}K^{\theta}N_{0}^{1-\theta-\nu}E^{\nu} , while in our model the final good production is given by  A_{0}(e^{-\gamma S}K)^{\theta}N_{0}^{1-\theta-\nu}E^{\nu}=A_{0}e^{-\theta\gamma S}K^{\theta}N_{0}^{1-\theta-\nu}E^{\nu} . The two production technologies are identical if the damage parameter,  \gamma , in our model is scaled up by a factor of  \frac{1}{\theta} .
8. There exists a constant,  \Delta , such that if the GHG concentration,  S , is greater than  \frac{1}{\Delta} , the system cannot be "robustified," in the sense that the value of the game goes to negative infinity. However, if the economy starts with an initial  S_{0}<\frac{1}{\Delta} , then  S_{t} will converge to  \frac{1}{\Delta} as  t\rightarrow+\infty .
9. The exponential distribution with mean  \lambda^{-1} is the maximum-entropy distribution among all continuous distributions supported on  [0,\infty) that have mean  \lambda^{-1} . The worst-case distribution for  \gamma is also exponential with mean  (\lambda^{\ast})^{-1} and variance  (\lambda^{\ast})^{-2} , where  \lambda^{\ast}=\lambda(1-\Delta S^{\prime\ast})=\lambda(1-\Delta\phi_{0}c_{E})(1-\Delta S) . That is,  \pi^{\ast}(\gamma)=\lambda^{\ast}e^{-\lambda^{\ast}\gamma} . Since  \lambda^{\ast}=\lambda(1-\Delta S^{\prime\ast})<\lambda , the worst-case mean of  \gamma ,  (\lambda^{\ast})^{-1} , is strictly greater than the approximating mean,  \lambda^{-1} .
10. If  \phi_{L}>0 , we need to depict the dynamics of  P and  T separately before we sum them in order to obtain the dynamics of  S . Assuming that  \phi_{L}=0 allows us to express the dynamics of  S without the need to consider  P and  T separately. That is,  S^{\prime}=(1-\phi)S+\phi_{0}E . Moreover, (A6.1) and (A6.2) imply that  S^{\prime}=S+\phi_{0}E , which is necessary for an analytical solution.
11. To obtain smooth paths,  \gamma is set to be the mean of the approximating (worst-case) distribution(s) in each period.
12. The dramatic effects on capital, output, and social welfare are partly due to the assumption that the approximating distribution of  \gamma is exponential. As we discuss next, the losses are somewhat reduced, though still large, if the approximating distribution of  \gamma is assumed to be normal. The exponential distribution is one way to capture the extreme effects in Stern (2013) in the context of our model.
13. Of course, only a small fraction of these resources is recoverable using today's technologies. See Boswell and Collett (2011). See also Hartley, Medlock, Temzelides, and Zhang (2012) and references therein.
14. We thank Lars Hansen for bringing this point to our attention and for suggesting the use of the big  K little  k result as a way to bypass this difficulty.
15. The exogenous processes  \hat{E}_{t}(\hat{S}_{t}) and  \hat{C}_{t}(\hat{S}_{t},\hat{K}_{t}) are constructed to mimic the optimal controls  E_{t}^{*}(S_{t}) and  C_{t}^{*}(S_{t},K_{t}) in equation (47) and equation (46) by replacing the endogenous state  (S_{t},K_{t}) with the exogenous state  (\hat{S}_{t},\hat{K}_{t}) .
