Finance and Economics Discussion Series: 2008-14

Firm Dynamics with Infrequent Adjustment and Learning

Eugénio Pinto1
July 2008

Keywords: Adjustment costs, learning, young firms


We propose an explanation for the rapid post-entry growth of surviving firms found in recent studies. At the core of our theory is the interaction between adjustment costs and learning by entering firms about their efficiency. We show that linear adjustment costs, i.e., proportional costs, create incentives for firms to enter smaller and for successful firms to grow faster after entry. Initial uncertainty about profitability makes entering firms prudent since they want to avoid incurring superfluous costs on jobs that prove to be excessive ex post. Because higher adjustment costs imply less pruning of inefficient firms and faster growth of surviving firms, the contribution of survivors to growth in a cohort's average size increases. For the cohort of 1988 entrants in the Portuguese economy, we conclude that survivors' growth is the main factor behind growth in the cohort's average size. However, initial selection is higher and the survivors' contribution to growth is smaller in services than in manufacturing. An estimation of the model shows that the proportional adjustment cost is the key parameter to account for the high empirical survivors' contribution. In addition, firms in manufacturing learn relatively less initially about their efficiency and are subject to larger adjustment costs than firms in services.

JEL Classification: E24, L11, L16

1 Introduction

In recent years there has been renewed interest in explaining patterns of firm dynamics, with new longitudinal datasets confirming heterogeneities between firms of different size and age. In particular, small and young (surviving) firms tend to grow faster and have higher failure rates than large and old firms, and both job creation due to the scaling-up of firm size and job destruction due to firm exit decrease with age.2 Moreover, entering firms tend to be small, but survivors grow rapidly after entry and are the main factor behind the shift to the right of a cohort's size distribution.3 These patterns differ markedly across sectors and countries, suggesting that both technological differences and country specific factors matter.4

This paper proposes an explanation for the leading role of survivors' growth in post-entry firm dynamics based on the interaction between adjustment costs and a learning-about-efficiency mechanism. Following a literature that uses adjustment costs to account for some dynamic properties of firms' labor demand, such as Campbell and Fisher (2000), we show that proportional costs can impact the lifetime dynamics of firms' labor demand in a way consistent with the data. To implement our theory, we use a standard model of firm dynamics with passive learning. In order to check the empirical fit of our model, we also assume that inefficient firms are pruned from the market, although the predictions of our theory hold even in the absence of a selection mechanism (e.g., when exit is not allowed).

Our contribution is twofold. First, we contribute to the empirical literature by introducing a decomposition of the change in a cohort's average size into a survivor component and a non-survivor component, and by using this decomposition as the centerpiece in a structural estimation of adjustment costs. Given the emphasis on survivors' growth, our measure allows a quick assessment of how well a particular theory matches the data in that respect. We apply our decomposition to the 1988 cohort of entrants in the Portuguese economy, using the Quadros de Pessoal dataset. Like Cabral and Mata (2003), we find that growth of survivors is the main force behind the change in the cohort's average firm size. However, we also find that growth of survivors is especially intense in the initial years after entry and that there are significant cross-sector differences in terms of our decomposition. In particular, initial exit rates are smaller and the survivors' contribution to changes in size is higher in manufacturing than in services.

Second, we contribute to the theoretical literature by introducing linear adjustment costs into a model of Bayesian learning about efficiency. Our assumption of linear or proportional costs is justified by the finding of high inaction rates in employment adjustment, in varying degrees across sectors. Our model builds on Jovanovic (1982) by adding proportional costs that apply not only to regular labor adjustment, but also to job creation at entry and job destruction at exit. We show that proportional adjustment costs create incentives for firms to start smaller and, if successful, grow faster after entry. We prove this analytically in a simplified model in which there is no exit of firms. This result shows that proportional costs can generate firm growth without selection. When firms are allowed to exit, selection intensifies the effects of adjustment costs on firm growth, while costs to adjustment reduce exit rates. Therefore, adjustment costs increase the contribution of surviving firms to growth in the cohort's average size.

All that is needed for firm growth under linear adjustment costs is the existence of a learning environment that generates a stochastic process for perceived efficiency with both persistence and decreasing uncertainty in age.5 The intuition for why firms grow faster and display smaller exit rates under proportional adjustment costs is that initial uncertainty about true profitability makes entering firms prudent; that is, they enter small and "wait and see" since they want to avoid incurring superfluous entering/hiring costs and firing/shutdown costs on jobs that prove to be excessive ex post. This implies that surviving firms will grow faster, even though adjustment costs imply that there are fewer firms exiting the market and therefore less pruning of inefficient firms.

The assumption that entering firms face a Bayesian learning problem concerning their efficiency is standard in selection theories and has been advanced as an explanation for the high rates of exit, job creation, and job destruction among young firms. The initial literature on adjustment costs used a (strictly) convex specification in an attempt to explain the sluggishness in input responses to aggregate shocks. However, the assumption that costs of adjustment are linear is now standard in dynamic labor demand models, following a number of studies since the late 1980s that have documented the importance of inaction in employment adjustment at the micro level.6 Since strictly convex costs imply smooth adjustments over time, whereas linear costs imply immediate adjustment when it occurs, allowing for strictly convex costs, instead of linear costs, in the context where they also apply at entry and exit, would bias our analysis and eventually make our argument stronger. In the case of hiring/entering costs, entering firms would prefer to start smaller and adjust gradually to their optimal size, even if their perceived productivity remained unchanged or learning was absent. For firing/exiting costs, firms experiencing large declines in perceived productivity would adjust downwards in several steps, a scenario that would make firms start smaller to attenuate its effects. Therefore, by avoiding a bias towards firm growth, our decision to assume linear costs is conservative and permits a simplification of the methods employed to measure the effects of adjustment costs.7

To assess our model quantitatively, we calibrate and estimate a version of the model with finite learning horizon and positive dispersion in entry size. We conclude that linear costs are the key element to account for the high empirical contribution of survivors to changes in a cohort's average size. A calibration/estimation for the manufacturing and services cohorts also suggests that firms in manufacturing learn relatively less initially about their efficiency and are subject to substantially larger adjustment costs than firms in services.

This paper is related to the literature on both adjustment costs and firm dynamics. Within the literature on adjustment costs, the paper is associated with theories that use linear adjustment costs to explain certain aspects of the dynamic behavior of labor demand and job flows. Well-known examples are Bentolila and Bertola (1990), Hopenhayn and Rogerson (1993), and Campbell and Fisher (2000). Bentolila and Bertola (1990) and Hopenhayn and Rogerson (1993) analyze the effects of proportional firing (and hiring) costs on the dynamics of hiring and firing decisions, and on average labor demand. Both papers conclude that high firing costs make hiring and firing adjustments more sluggish, but they disagree on the implications of that for long-run employment. Campbell and Fisher (2000) use proportional costs of job creation and job destruction to explain the higher aggregate volatility of job destruction found in the U.S. manufacturing sector. These costs imply that in reaction to aggregate wage shocks employment changes at contracting firms are larger than employment changes at expanding firms. What is new in our paper is the assumption that adjustment costs apply equally to the entry/exit decisions and the hiring/firing decisions.8

Within the literature on firm dynamics, the paper is connected with theories that attempt to explain the stylized facts on the lifecycle dynamics outlined above. The two main explanations for these facts are theories based on selection of firms and theories based on financing constraints.9 Selection theories stress the tendency for firms that accumulate bad realizations of productivity to exit the market and for firms that accumulate good realizations to survive and expand. This implies a composition bias towards larger and more efficient firms as smaller, inefficient, and slow-growing firms gradually exit the industry. Representative papers of selection theories are Jovanovic (1982), Hopenhayn (1992), Ericson and Pakes (1995), and Luttmer (2007). In all cases productivity realizations are exogenous, except in Ericson and Pakes (1995) where they are to some extent endogenous.

Meanwhile, theories employing financing constraints argue that some imperfection in financial markets causes young firms to have limited access to credit, forcing them to enter at a suboptimally small scale. As firms get older and survive, they establish creditworthiness and build up internal resources that enable them to expand to their optimal size. Important contributions to this literature are those of Cooley and Quadrini (2001), Albuquerque and Hopenhayn (2004), and Clementi and Hopenhayn (2006). In Cooley and Quadrini (2001) a transaction cost on equity and a default cost on debt imply that equity and debt are not perfect substitutes, resulting in a positive dependence of firm size on the amount of equity. In Albuquerque and Hopenhayn (2004) and Clementi and Hopenhayn (2006) lenders introduce credit constraints because of limited liability of borrowers and enforcement of debt contracts, in the first case, and because of asymmetric information on the use of funds or the return on investment, in the second case.

Cabral and Mata (2003) analyze whether these two theories are consistent with the evolution of a cohort's size distribution in the Portuguese manufacturing sector. They find that, as the cohort ages, the firm size distribution shifts to the right largely due to growth of surviving firms rather than exit of small firms. In addition, they find that in the first year after entry younger business owners are associated with smaller firms but that is no longer the case once the cohort gets to age seven. Assuming that age is a proxy for the entrepreneur's initial wealth, the authors conclude that the age-size effect supports the idea of financially constrained firms starting at a suboptimal size and present a model with financing constraints capturing this effect.

More recently, Angelini and Generale (2008) use survey and balance sheet information for Italian manufacturing firms to analyze the impact of financing constraints on the evolution of the firm size distribution. They find that financially constrained firms tend to be small and young, although this does not have a significant effect on the overall firm size distribution. Moreover, they find that financing constraints decrease firm growth, with this effect being entirely due to small firms. In particular, being young and financially constrained does not have any additional effect. Based on these results and the fact that young firms grow faster than old firms, the authors conclude that financing constraints are not the main factor behind the evolution of the firm size distribution. In line with their argument, this paper interprets the facts presented in Cabral and Mata (2003) and other cross-sector evidence as the result of the interaction between adjustment costs and learning about efficiency.10

To our knowledge, this work is the first to suggest adjustment costs as an explanation for differences in firm dynamics by age. The paper by Cabral (1995) is nearest to this paper. In his model, firms must pay a proportional sunk cost to increase their production capacity. He argues that, in a model with Bayesian learning, a proportional capacity cost would make small entering firms grow faster than large entering firms. The reason is that small entrants are those whose initial profitability signals were not good, so their exit probabilities are higher, and therefore they choose to invest more gradually. Unlike our model, Cabral's model depends on the existence of selection. Also, by analyzing a size-growth relationship, his model is not able to explain why some large entering firms also grow substantially.

The paper is organized as follows. In section 2, we present evidence of firm dynamics for a cohort of entering firms. In section 3, we build the general model, obtain optimality conditions, and provide heuristic arguments explaining the effects of adjustment costs. In section 4, using a simplified version of the model we analytically prove the effect of linear adjustment costs on survivors' growth. In section 5, we calibrate and estimate a finite learning horizon version of the model and quantify the contribution of adjustment costs to firm dynamics. Section 6 concludes. All proofs are left for an appendix.

2 Firm Dynamics in a Cohort of Entering Firms

There is a well established literature on the identification and explanation of differences in behavior between young and old firms. In this section, we analyze firm dynamics in a cohort of entering firms. We use Quadros de Pessoal, a database containing information on all Portuguese firms with paid employees. This dataset originates from a mandatory annual survey run by the Ministry of Employment, which collects information about the firm, its establishments, and its workers. All economic sectors except public administration are included. The panel we have access to covers the period 1985-2000. Information refers to March of each year from 1985 through 1993, and to October of each year since the reformulation of the survey in 1994. On average the dataset contains 250,000 firms, 300,000 establishments, and 2,500,000 workers in each year.

The literature on firm dynamics typically finds that young firms grow faster than old firms. Using kernel density estimates of the firm size distribution in a cohort of entrants, Cabral and Mata (2003) argue graphically that the cohort's evolution is mostly due to growth of survivors rather than exit of small firms. Their analysis points to the need for a measure of the contribution of survivors versus non-survivors to the growth in a given cohort's average size. To accomplish this, we propose a decomposition of the cohort's cumulative growth that will later allow an assessment of the empirical relevance of adjustment costs. We consider the following decomposition:

\begin{multline*} \frac{1}{N\left( S_{\tau}\right) }\sum_{i\in S_{\tau}}l_{i,\tau}-\frac {1}{N\left( S_{0}\right) }\sum_{i\in S_{0}}l_{i,0}=\underset{\text{Survivor Component}}{\underbrace{\frac{1}{N\left( S_{\tau}\right) }\sum_{i\in S_{\tau}}l_{i,\tau}-\frac{1}{N\left( S_{\tau}\right) }\sum_{i\in S_{\tau} }l_{i,0}}}~+\ \underset{\text{Non-Survivor Component}}{\underbrace{\frac{N\left( D^{\tau }\right) }{N\left( S_{0}\right) }\left( \frac{1}{N\left( S_{\tau}\right) }\sum_{i\in S_{\tau}}l_{i,0}-\frac{1}{N\left( D^{\tau}\right) }\sum_{i\in D^{\tau}}l_{i,0}\right) }} \end{multline*}

where  \tau is the firm's age,  l_{i,\tau}=\ln\left( L_{i,\tau}\right) is log-employment at firm  i in period  \tau,  S_{\tau} is the set of age- \tau surviving firms,  D^{\tau} is the set of age- \tau non-surviving firms, so that  \left\{ S_{\tau},D^{\tau}\right\} is a partition of  S_{0}, and  N\left( X\right) is the number of firms in set  X.11

In general, the growth in a cohort's average size can originate from growth of survivors or from smaller initial size of non-survivors. Any theory of firm dynamics should consider both these sources of growth. Our measure allows a check on whether a particular theory can explain the key source of growth in a cohort's average size. The survivor component compares the current average size of period  \tau survivors with their initial average size, so that it measures how much survivors have grown. The non-survivor component compares the average initial size of period  \tau non-survivors with the average initial size of period  \tau survivors, so that it measures how relatively small non-survivors were initially.
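The decomposition can be illustrated with a minimal Python sketch. The function name and the dictionary-based inputs are hypothetical choices made for illustration and are not taken from the paper or from the Quadros de Pessoal data layout; the sketch only assumes that entry employment is known for every entrant and that age- \tau employment is known for survivors.

```python
import math


def decompose_cohort_growth(entry_size, size_at_tau):
    """Split the change in a cohort's mean log size into two components.

    entry_size:  dict firm_id -> employment at age 0, for all entrants (S_0)
    size_at_tau: dict firm_id -> employment at age tau, survivors only (S_tau)
    Returns (survivor_component, non_survivor_component, total_change).
    """
    survivors = [i for i in entry_size if i in size_at_tau]
    exiters = [i for i in entry_size if i not in size_at_tau]
    if not survivors:
        raise ValueError("the decomposition needs at least one survivor")

    def mean(values):
        return sum(values) / len(values)

    # Mean log size of survivors at age tau and at entry.
    surv_now = mean([math.log(size_at_tau[i]) for i in survivors])
    surv_entry = mean([math.log(entry_size[i]) for i in survivors])
    survivor_component = surv_now - surv_entry

    # Share of exiters times the initial size gap between survivors and exiters.
    if exiters:
        exit_entry = mean([math.log(entry_size[i]) for i in exiters])
        share_exit = len(exiters) / len(entry_size)
        non_survivor_component = share_exit * (surv_entry - exit_entry)
    else:
        non_survivor_component = 0.0

    total = survivor_component + non_survivor_component
    return survivor_component, non_survivor_component, total


# Made-up cohort: firms 3 and 4 exit before age tau.
entry = {1: 2, 2: 4, 3: 1, 4: 1}
later = {1: 4, 2: 8}
surv, non_surv, total = decompose_cohort_growth(entry, later)
survivor_share = surv / total
```

In this made-up cohort the two components sum to the change in mean log size computed directly, and `survivor_share` corresponds to the share of growth attributed to survivors.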

We can obtain a similar decomposition for employment-weighted moments. The weighted decomposition contains information about the entire distribution of employment, not just its cross-sectional mean, and is affected both by within- and between-firm growth. Therefore, the weighted decomposition would be more relevant for assessing a richer model that considers the reallocation of employment shares between firms within the cohort. In the results that follow we focus on the unweighted decomposition because it analyzes within-firm growth, which in our model is the most relevant statistic to assess the effect of adjustment costs on the incentives for firms to grow.12

We can also produce a decomposition based on the cohort's annual growth instead of the cohort's cumulative growth. However, the annual version of the above decomposition is more sensitive to two aspects that would complicate the analysis in the paper. First, the annual survivor component is significantly affected by the business cycle, especially after the first few years of life. To control for this, we would need to somehow remove the cyclical part of the survivor component. Second, as the age of the cohort increases, the annual survivor component becomes increasingly sensitive to downsizing and exit by some survivors that become technologically outdated and consequently relatively less efficient. To fully consider this aspect of the data would force us to introduce additional parameters into the model that we present in section 3. Therefore, we believe that by employing a decomposition based on the cohort's cumulative growth we avoid having to adjust the analysis for these two aspects, and instead focus on how intense survivors' growth is while learning-about-efficiency effects are significant.

In table 6, we present the evolution of exit rates and the share of firm growth due to the survivor component in the 1988 cohort of entering firms for the overall economy.13 In 1988 there were  22,810 entering firms. The exit rate is very high initially but tends to decrease as firms get older.14 However, ten years after entry only  41.5\% of the initial entrants remain active. There is significant growth in the cohort's average size, especially in the first few years, which is mostly due to the growth of survivors rather than to the exit of small inefficient firms: survivors' growth contributes around  69\% to the growth in the cohort's average size.15

Table 6 presents similar evidence on cohort dynamics for the manufacturing and services sectors.16 We include the employment shares of each sector in the 1988 cohort of entering firms, which are close to shares in the overall economy. Although manufacturing has a much higher employment share than services, the number of entering firms in services surpasses that of manufacturing ( 6074 and  4834, respectively). Both sectors display a cumulative exit rate around  58\% by 1999. However, initial differences in exit rates are more significant, with manufacturing displaying the smallest values, and services displaying the highest values. In terms of initial size, manufacturing has the largest entrants, and services the smallest. Although manufacturing has the largest entrants, it exhibits more growth in average employment and a larger contribution of survivors to that growth than services.

We perform two robustness checks on the previous findings. First, we redo our calculations using establishments rather than firms as the unit of analysis. For the 1988 establishment cohort, we obtain similar results, although exit rates and the survivor component are higher than in the case of firms. Second, we examine an alternative cohort to make sure our results are not driven by business cycle conditions. The Portuguese economy experienced an expansion between 1986 and 1991, a period of slow growth with a recession between 1992 and 1994, and another weaker expansion between 1995 and 2000. The growth rates of real GDP were  6.4\% in 1989,  1.1\% in 1992, and  4.3\% in 1995, so that the 1991 cohort did not face as favorable a macroeconomic environment as the 1988 cohort. However, the results for the 1991 cohort are, in all dimensions, very similar to those presented above. The results for the 1994 cohort are also very similar, but with slightly smaller values for the survivor component in the first few years after entry.17

In table 6, we provide evidence on the properties of labor adjustment in the 1988 cohort of entering firms. Namely, we present three characteristics of the distribution of adjusted growth rates, conditional on survival, in 1989 and 1993: the fraction of firms that do not adjust employment,  NA, and the fractions of firms that increase/reduce their size by less than  30\%,  P30/ N30.18 The table shows that the incidence of inaction is very high, increases with age, and is higher in services than in manufacturing. This may reflect technology-induced differences in adjustment costs, or job indivisibilities affecting to a larger extent the services sector for having a higher share of small firms. The table also shows that the large majority of firms have adjustment rates within the  (-30\%,30\%) interval. A high rate of inaction and small adjustment is usually considered consistent with the presence of linear or proportional adjustment costs. In addition, comparing the columns  P30 and  N30 it appears that the 1989 growth distributions are more left-skewed than the 1993 distributions, suggesting that survivors tend to grow more initially, especially in manufacturing. The evidence on inaction justifies our assumption of linear adjustment costs in the model that we present next.
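As a rough illustration of how such adjustment statistics can be computed, the sketch below classifies surviving firms by their employment growth. It assumes a simple growth rate, the change in employment divided by last period's employment, which may differ from the adjusted growth rate used in the table; the function name and inputs are hypothetical.

```python
def adjustment_summary(prev_size, curr_size, band=0.30):
    """Return (NA, P, N) for firms that survive between two periods.

    NA: share of firms with unchanged employment
    P:  share of firms that grow by less than `band`
    N:  share of firms that shrink by less than `band`
    Assumes the growth rate (curr - prev) / prev, an illustrative choice.
    """
    if len(prev_size) != len(curr_size) or not prev_size:
        raise ValueError("inputs must be non-empty and of equal length")
    growth = [(c - p) / p for p, c in zip(prev_size, curr_size)]
    n = len(growth)
    no_adjust = sum(1 for g in growth if g == 0) / n
    small_pos = sum(1 for g in growth if 0 < g < band) / n
    small_neg = sum(1 for g in growth if -band < g < 0) / n
    return no_adjust, small_pos, small_neg


# Made-up employment levels for five surviving firms.
na, p30, n30 = adjustment_summary([10, 10, 10, 10, 10], [10, 12, 9, 20, 10])
```

With integer employment counts, inaction shows up as an exact zero growth rate, so no tolerance is needed when counting firms that do not adjust.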

3 A Model of Learning with Linear Adjustment Costs

3.1 Assumptions and Solution

In this section, we introduce linear adjustment costs into a model of Bayesian learning about efficiency. We derive conditions for optimal employment over time and present heuristic arguments about the effects of adjustment costs on the path of employment. Our model is based on Jovanovic (1982), adding adjustment costs and using a different specification for the idiosyncratic shock.

We assume an industry with competitive output and input markets. Current profits of a representative firm are defined by

\displaystyle \Pi\left( L,\theta\right) =F\left( L\right) \theta-wL,
where  F\left( L\right) \theta is the production function;  L is the amount of labor input;  \theta is a productivity shock; and  w is the wage rate. The output price is normalized to unity, so that all monetary values are expressed in units of the output price. Given the competitive environment, the firm treats  w as a constant.

Concerning technology we make the following assumption.

Assumption 1   The two components of the production function satisfy:

(a)  F:\mathbb{R}_{+}\rightarrow\mathbb{R}_{+} is  C^{2},  F^{\prime}>0,  F^{\prime\prime}<0,  F\left( 0\right) =0,  F^{\prime }\left( 0^{+}\right) =\infty, and  F^{\prime}\left( \infty\right) =0.

(b) Letting  \tau denote the firm's age and 0 the period in which the firm enters, the stochastic process of  \theta is defined by

\begin{subequations}\begin{gather}\theta_{\tau}=\xi\left( \eta_{\tau}\right) \text{, \ \ \ \ \ }\eta_{\tau} =\mu+\varepsilon_{\tau}\text{, \ \ \ \ \ }\mu=\mu_{0}+\mu_{1}\text {, \ \ \ \ \ }\tau=0,1,\dots\\ \varepsilon_{\tau}\sim N\left( 0,\sigma^{2}\right) \text{, \ \ \ \ \ }\mu _{0}\sim N\left( \bar{\mu},\sigma_{\mu_{0}}^{2}\right) \text{, \ \ \ \ \ } \mu_{1}\sim N\left(0,\sigma_{\mu_{1}}^{2}\right) \text{,}\end{gather}\end{subequations}


where  \mu_{0},  \mu_{1},  \left\{ \varepsilon_{\tau}\right\} _{\tau\geq 0} are mutually independent,  \xi:\mathbb{R}\rightarrow\mathbb{R}_{++} is  C^{1},  \xi^{\prime}>0 and  \xi\left( -\infty\right) =\nu_{1}\geq0,  \xi\left( \infty\right) =\nu_{2}<\infty.

Part (a) ensures a well defined interior optimum. In some of the analyses below, we will assume that  F is a power function. Meanwhile, part (b) establishes that in each period productivity is stochastic with a constant mean over the firm's lifetime. The productivity component,  \mu, is made of two parts:  \mu_{0}, which is observed before entry, and  \mu_{1}, which is never directly observed by the firm. Intuitively,  \mu_{0} can be thought of as indexing ex ante efficiency, measuring initial technology choice, while  \mu_{1} indexes ex post productivity, measuring how well a firm performs within its technology choice.

The introduction of  \mu_{0} is essential to obtain a non-degenerate distribution of firms' entry size, allowing an analysis of the contribution of survivors to growth in the cohort's average size. In contrast, the absence of  \mu_{0} in Jovanovic's (1982) model generates a degenerate distribution of firms' entry size. Under this scenario, for any period after entry, survivors and non-survivors have the same average initial size, implying a value of  100\% for our measure of the survivors' component. By assuming  \sigma_{\mu_{0}}>0, we avoid this aspect of Jovanovic's model.

Before entry the firm knows the parameters governing the stochastic process of  \theta, i.e.,  \bar{\mu},  \sigma_{\mu_{0}}^{2},  \sigma_{\mu_{1}}^{2}, and  \sigma^{2}, and learns its ex ante productivity,  \mu_{0}, after paying a research cost,  I. After entry, the firm will learn about its specific ex post productivity,  \mu_{1}, over time as it observes the realizations of productivity,  \theta. In particular, the firm forecasts period- \tau productivity based on the ex ante efficiency parameter  \mu_{0} and on the past realizations of productivity,  \left\{ \theta _{s}\right\} _{s=0}^{\tau-1}. Similarly to Zellner (1971), a firm with age  \tau has the following Bayesian posterior distribution for  \mu at the beginning of period  \tau:

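\begin{subequations}\begin{gather}\left. \mu\right\vert \left( \mu_{0},\eta_{0},\dots,\eta_{\tau-1}\right) \sim N\left( \frac{\sigma^{2}\mu_{0}+\sigma_{\mu_{1}}^{2}\sum_{s=0}^{\tau-1}\eta_{s}}{\sigma^{2}+\tau\sigma_{\mu_{1}}^{2}},\ \frac{\sigma^{2}\sigma_{\mu_{1}}^{2}}{\sigma^{2}+\tau\sigma_{\mu_{1}}^{2}}\right) \text{.}\end{gather}\end{subequations}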



In lemma 2 of appendix A we show that, for purposes of prediction, the period- \tau information set,  \Omega_{\tau}, can be summarized by the period- \tau forecast of the productivity coefficient, that is, the forecast based on the information available at the beginning of period  \tau. Throughout,  E_{\tau} denotes the expectation conditional on the period- \tau information set.
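The learning step can be sketched numerically with the textbook normal-normal update implied by Assumption 1(b). The sketch assumes that each signal  \eta_{s} is recovered from the observed  \theta_{s} by inverting  \xi, which is possible because  \xi is strictly increasing. The function and argument names are hypothetical and need not match the paper's notation.

```python
def posterior_mu(mu0, etas, sigma2, sigma2_mu1):
    """Posterior mean and variance of mu after observing signals eta_s.

    Prior given mu0:  mu ~ N(mu0, sigma2_mu1), since mu = mu0 + mu1.
    Signals:          eta_s = mu + eps_s with eps_s ~ N(0, sigma2).
    This is the standard conjugate normal update, used here as an assumption.
    """
    if sigma2 <= 0 or sigma2_mu1 <= 0:
        raise ValueError("variances must be strictly positive")
    tau = len(etas)  # number of signals observed so far (the firm's age)
    denom = sigma2 + tau * sigma2_mu1
    mean = (sigma2 * mu0 + sigma2_mu1 * sum(etas)) / denom
    variance = sigma2 * sigma2_mu1 / denom
    return mean, variance


# Posterior variance by age for a made-up signal history.
signals = [1.4, 0.9, 1.2, 1.1]
variances = [posterior_mu(1.0, signals[:age], 1.0, 0.5)[1]
             for age in range(len(signals) + 1)]
```

The posterior variance depends only on the number of signals, not on their values, so uncertainty about  \mu falls deterministically with age. This is the property of decreasing uncertainty in age that the argument relies on.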

We now lay out the timing assumptions.

Assumption 3   A potential entering firm, at the beginning of period 0, takes the following actions:

(i.a) Research cost and ex ante productivity: the firm pays a fixed cost  I, associated with the process of initial research, after which it observes a realization of ex ante productivity,  \mu_{0}.

(i.b) Entry decision and entry cost: based on the idiosyncratic realization of  \mu_{0}, the firm chooses whether to enter the industry or not. In case of entry, the firm pays  W for acquiring the (exogenously determined) capital stock.

(i.c) Initial employment and production decisions: conditional on entering the industry, the firm chooses how much labor to use and how much output to produce in period 0.

A firm of age  \tau>0 takes the following actions:
(ii.a) Update of posterior productivity: at the beginning of period  \tau, the firm updates its posterior expectation of  \theta_{\tau},  \theta_{\tau}^{\ast}, based on the observation of  \theta_{\tau-1}=\xi\left( \eta_{\tau-1}\right) at the end of period  \tau-1.

(ii.b) Exit decision: given the new posterior productivity estimate,  \theta_{\tau}^{\ast}, and employment from last period,  L_{\tau-1}, the firm chooses whether to stay or exit the industry. In case of exit, the firm sells the capital stock for the value initially paid,  W (no depreciation).

(ii.c) Employment and production decisions: conditional on staying, the firm chooses how much labor to use and how much output to produce in the current period. At the end of period  \tau, the firm observes the productivity realization,  \theta_{\tau}, and the process repeats until the firm decides to leave the industry.19

In the absence of adjustment costs, while deciding whether to stay one more period or to exit, the firm compares the expected profit in case it stays,  V , with the opportunity cost of doing so,  W, the value it would recover by selling the (exogenous) capital initially acquired, i.e.,

\displaystyle V\left( \theta_{\tau}^{\ast},\tau\right) =\max_{L_{\tau}}\left\{ \Pi\left( L_{\tau},\theta_{\tau}^{\ast}\right) +\beta E_{\tau}\left[ \max\left\{ W,V\left( \theta_{\tau+1}^{\ast},\tau+1\right) \right\} \right] \right\} (3)

where  V represents expected profits conditional on staying in period  \tau.

At entry, we have  \Omega_{0}\equiv\mu_{0}, and in equilibrium expected profits must compensate for the cost of acquiring capital, i.e.,  V^{EN}\left( \theta_{0}^{\ast}\right) >W. Since markets are competitive and there is no friction in the entry and exit processes, in equilibrium the research cost equals expected gains at the research phase, i.e.,  E(V^{EN}\left( \theta_{0}^{\ast}\right) )=I. If  E\left( V^{EN}\right) >I more firms will initiate research and later enter the industry, causing a decrease in output price until equality is restored. A strictly positive fixed research cost,  I>0, is essential to avoid the extreme situation where trial research is so high that only the highest productivity firms enter and survive. Because there is no reliable capital stock variable in Quadros de Pessoal, we do not make the capital decision endogenous to the model. Instead, we assume that firms are homogeneous along the capital dimension and face the same opportunity cost of remaining in activity,  W.

Up to this point, the only differences between our model and Jovanovic (1982) are that in the latter model the efficiency parameter implicitly affects the cost function and the cohort's entry size distribution is degenerate. Therefore, without adjustment costs there would be no intertemporal linkages in our model aside from the exit decision. As in Jovanovic, because  V is strictly increasing in  \theta^{\ast}, the exit decision is characterized by an age-dependent exit threshold. For values of  \theta_{\tau}^{\ast} above or equal to that threshold, the firm would stay and choose employment to maximize current period profits. For values of  \theta_{\tau}^{\ast} below that threshold, the firm would leave the industry, since its expected profitability is below the opportunity cost. The increasing confidence the firm puts in  \theta_{\tau}^{\ast} as it grows older implies that the exit threshold is increasing with age. This is the driving force underlying Jovanovic's result that the size distribution and the survival probability increase with age.

We now introduce linear adjustment costs into the model. The adjustment cost for continuing firms,  C^{S}, is defined as

\displaystyle C^{S}\left( L_{\tau},L_{\tau-1}\right) =P~\left\vert L_{\tau}-L_{\tau -1}\right\vert
where  P is the cost per unit of adjustment. Since this is a model with endogenous entry and exit of firms, we consider that this cost also applies to the entry and exit decisions, so that the costs for entering and exiting firms,  C^{EN} and  C^{EX} respectively, are given by
\displaystyle C^{EN}\left( L_{0}\right) =P~L_{0},    \displaystyle C^{EX}\left( L_{\tau}\right) =P~L_{\tau}.

With adjustment costs, the problem now becomes,

\begin{multline} V^{S}\left( L_{\tau-1},\theta_{\tau}^{\ast},\tau\right) =\max_{L_{\tau} }\left\{ \left[ \Pi\left( L_{\tau},\theta_{\tau}^{\ast}\right) -C^{S}\left( L_{\tau},L_{\tau-1}\right) \right] +\right. \\ \left. \beta E_{\tau}\left[ \max\left\{ V^{EX}\left( L_{\tau}\right) ,V^{S}\left( L_{\tau},\theta_{\tau+1}^{\ast},\tau+1\right) \right\} \right] \right\} \text{,} \end{multline} (4)

for all periods after entry ( \tau\geq1) in which the firm remains in the industry, and
\displaystyle V^{EN}\left( \theta_{0}^{\ast}\right) =\max_{L_{0}}\left\{ \left[ \Pi\left( L_{0},\theta_{0}^{\ast}\right) -C^{EN}\left( L_{0}\right) \right] +\beta E_{0}\left[ \max\left\{ V^{EX}\left( L_{0}\right) ,V^{S}\left( L_{0},\theta_{1}^{\ast},1\right) \right\} \right] \right\}   , (5)

for the entry period, where  V^{EX}, the value of exiting, is defined as
\displaystyle V^{EX}\left( L_{\tau}\right) =W-C^{EX}\left( L_{\tau}\right)   .
Note that contrary to the case without adjustment costs, the previous period employment is a state variable for the current period optimization problem. Also, in  V^{EN} and  V^{EX} the costs of hiring at entry and firing at exit are taken into account.

In general, we could allow for asymmetry among the cost parameters in  C^{S},  C^{EN}, and  C^{EX}. However, asymmetries between the cost of regular firing and the cost of firing at exit or between the cost of regular hiring and the cost of hiring at entry lead to biases in entry and exit decisions. For example, if the per unit regular hiring cost is higher than the per unit entry hiring cost, then firms will hire more workers at entry in order to save on expected future higher hiring costs. Similarly, if the per unit regular firing cost is smaller than the per unit exit firing cost, then firms facing the prospect of exit will fire workers before exiting the industry, saving on expected future higher exit firing costs. To avoid these biases, throughout the paper we assume symmetry between the parameters in  C^{S},  C^{EN}, and  C^{EX}. A more interesting distinction is between firing and hiring costs. We will see below that the conclusion of the paper is immune to asymmetries between the costs of adding and subtracting workers.

In solving the firm's problem, we consider a two-step optimization procedure where the firm first chooses optimal employment in each of three possible scenarios, and then selects the scenario with the highest pay-off. More precisely,

\displaystyle V^{S}\left( \cdot\right) =\max\left\{ V^{SD}\left( \cdot\right) ,V^{SN}\left( \cdot\right) ,V^{SU}\left( \cdot\right) \right\}   ,
where  V^{SD} and  V^{SU} are obtained by maximizing the objective function in (4) over  L_{\tau}\leq L_{\tau-1} and  L_{\tau}\geq L_{\tau-1}, respectively, and  V^{SN} is obtained by choosing  L_{\tau}=L_{\tau-1} in (4). Although the adjustment cost function introduces a non-differentiability of the objective function at the frontiers between adjustment and non-adjustment, the usual properties of the value function  V^{S} and its associated optimal exit policy function hold:

Proposition 4  
Let  V^{S} be defined as in (4). Then:
(a) There exists a unique value function  V^{S}\left( L_{\tau -1},\theta_{\tau}^{\ast},\tau\right) satisfying (4) that is bounded, continuous in  \left( L_{\tau-1},\theta_{\tau}^{\ast}\right) , and strictly increasing in  \theta_{\tau}^{\ast}.
(b) There exists a unique optimal exit policy function  \chi_{\tau}^{\ast}\left( L_{\tau -1},\theta_{\tau}^{\ast}\right) =\mathbf{1}\left( \theta_{\tau}^{\ast }<\theta^{EX}\left( L_{\tau-1},\tau\right) \right) , where  \theta ^{EX}\left( L_{\tau-1},\tau\right) is a unique continuous function in  L_{\tau-1}.
Proof. See appendix A.20


In contrast, the non-differentiability of the objective function generates an inaction region in the employment policy, within which optimal employment does not vary with changes in productivity.

Proposition 6  
For any period  \tau>0, if the firm adjusts upwards, optimal employment satisfies
\displaystyle \left[ F^{\prime}\left( L_{\tau}^{\ast}\right) \theta_{\tau}^{\ast }-w\right] +\sum_{s=1}^{\infty}E_{\tau}\beta^{s}\left\{ \tilde{\chi} _{\tau+s}^{\ast}\left( -P\right) +\hat{\chi}_{\tau+s}^{\ast}\left[ F^{\prime}\left( L_{\tau+s}^{\ast}\right) \theta_{\tau+s}^{\ast}-w\right] \right\} =P , (6)

whereas if the firm adjusts downwards optimal employment satisfies
\displaystyle \left[ F^{\prime}\left( L_{\tau}^{\ast}\right) \theta_{\tau}^{\ast }-w\right] +\sum_{s=1}^{\infty}E_{\tau}\beta^{s}\left\{ \tilde{\chi} _{\tau+s}^{\ast}\left( -P\right) +\hat{\chi}_{\tau+s}^{\ast}\left[ F^{\prime}\left( L_{\tau+s}^{\ast}\right) \theta_{\tau+s}^{\ast}-w\right] \right\} =-P , (7)

In period 0, the firm enters the industry if  V^{EN}\left( \theta_{0} ^{\ast}\right) \geq W, in which case optimal employment satisfies
\displaystyle \left[ F^{\prime}\left( L_{0}^{\ast}\right) \theta_{0}^{\ast}-w\right] +\sum_{s=1}^{\infty}E_{0}\beta^{s}\left\{ \tilde{\chi}_{s}^{\ast}\left( -P\right) +\hat{\chi}_{s}^{\ast}\left[ F^{\prime}\left( L_{s}^{\ast }\right) \theta_{s}^{\ast}-w\right] \right\} =P. (8)

 L_{\tau+s}^{\ast} is the optimal employment in period  \tau+s, and  \tilde{\chi}_{\tau+s}^{\ast},  \hat{\chi}_{\tau+s}^{\ast} are functions of the optimal exit decision,  \chi_{\tau+j}^{\ast}, in periods  \tau+1 to  \tau+s, such that  \tilde{\chi}_{\tau+s}^{\ast} equals one when the firm has remained in the industry until period  \tau+s-1 , but decides to exit in period  \tau+s, and  \hat{\chi}_{\tau+s}^{\ast} equals one when the firm is still in the industry in period  \tau+s.

Proof. See appendix A.


Equations (6), (7) and (8) are marginal conditions, similar to the smooth pasting conditions in the (S, s) model literature, and they state that if the firm adjusts then the marginal adjustment cost must equal the expected present discounted value of the marginal revenue product for all future periods in which the firm is still in the industry, minus the increase in the exit cost when the firm decides to exit. This is the discrete-time analog of the continuous-time result present in Nickell (1986) and Bentolila and Bertola (1990) , adjusted for the fact that now we also have an exit decision. Because the firm will not change employment if the marginal cost of adjustment exceeds its marginal benefit for the first unit of adjustment, proportional costs imply inaction in the employment decision of the firm.

Although the results in proposition 6 do not allow a formal proof of the effects on firm growth of adjustment costs in this general model, the following corollary of proposition 6 enables us to make qualitative heuristic statements about those effects.

Corollary 7: For any period  \tau\geq0, the marginal benefit of one additional unit of labor, that is, the LHS of expressions (6), (7), and (8), can be recursively represented as

\displaystyle MB_{\tau}=\left( F^{\prime}\left( L_{\tau}^{\ast}\right) \theta_{\tau }^{\ast}-w\right) +\beta E_{\tau}\left[ \chi_{\tau+1}^{\ast}\left( -P\right) +\left( 1-\chi_{\tau+1}^{\ast}\right) MB_{\tau+1}\right] (9)

where  L_{\tau}^{\ast}=L_{\tau}^{\ast}\left( L_{\tau-1},\theta_{\tau}^{\ast }\right) and  \chi_{\tau}^{\ast}=\chi_{\tau}^{\ast}\left( L_{\tau -1},\theta_{\tau}^{\ast}\right) are the optimal employment and exit decisions.

Proof. See appendix A.
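As a reading aid (the formal proof is in appendix A), the step from (6) to (9) can be sketched as follows. The shorthand  m_{t} and the product representation of  \hat{\chi} and  \tilde{\chi} are notation introduced here for illustration, based on the verbal definitions given after (8); a superscript  (\tau+1) marks indicators counted from period  \tau+1 rather than from  \tau.

```latex
% Shorthand (assumed here): m_t is the current-period net marginal product.
m_{t}\equiv F^{\prime}\left( L_{t}^{\ast}\right) \theta_{t}^{\ast}-w,\qquad
\hat{\chi}_{\tau+s}^{\ast}=\prod_{j=1}^{s}\left( 1-\chi_{\tau+j}^{\ast}\right),\qquad
\tilde{\chi}_{\tau+s}^{\ast}=\chi_{\tau+s}^{\ast}\prod_{j=1}^{s-1}\left( 1-\chi_{\tau+j}^{\ast}\right).

% Split the sum in the LHS of (6) at s = 1 and factor out (1 - chi_{tau+1}) for s >= 2:
MB_{\tau}=m_{\tau}
+\beta E_{\tau}\left[ \chi_{\tau+1}^{\ast}\left( -P\right)
+\left( 1-\chi_{\tau+1}^{\ast}\right) m_{\tau+1}\right]
+\sum_{s=2}^{\infty}\beta^{s}E_{\tau}\left[ \left( 1-\chi_{\tau+1}^{\ast}\right)
\left\{ \tilde{\chi}_{\tau+s}^{\ast(\tau+1)}\left( -P\right)
+\hat{\chi}_{\tau+s}^{\ast(\tau+1)}m_{\tau+s}\right\} \right].

% Apply the law of iterated expectations (chi_{tau+1} is known at tau+1) and re-index:
MB_{\tau}=m_{\tau}+\beta E_{\tau}\left[ \chi_{\tau+1}^{\ast}\left( -P\right)
+\left( 1-\chi_{\tau+1}^{\ast}\right)
\underbrace{\left( m_{\tau+1}+\sum_{s=1}^{\infty}\beta^{s}E_{\tau+1}\left\{
\tilde{\chi}_{\tau+1+s}^{\ast(\tau+1)}\left( -P\right)
+\hat{\chi}_{\tau+1+s}^{\ast(\tau+1)}m_{\tau+1+s}\right\} \right) }_{MB_{\tau+1}}\right].
```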


3.2 Linear Adjustment Cost and Firm Growth

As we have seen above, in the absence of adjustment costs, optimal employment is determined solely to maximize current period profits, so that  F^{\prime}\left( L^{\ast}\right) \theta^{\ast}=w. Therefore, firms' growth is essentially a by-product of a selection mechanism: those firms that are inefficient, and therefore small, exit, while those firms that are efficient survive and grow. There is an additional source of positive growth when the frictionless employment decision rule,  L^{\ast}\left( \theta^{\ast}\right) , is convex in  \theta^{\ast}. Because of Jensen's inequality and because  \theta_{\tau}^{\ast} is a Martingale, surviving firms will grow over time:  E_{\tau}\left[ L^{\ast}\left( \theta_{\tau+1}^{\ast}\right) \right] >L^{\ast}\left[ E_{\tau}\left( \theta_{\tau+1}^{\ast}\right) \right] =L^{\ast}\left( \theta_{\tau}^{\ast}\right) . However,  L^{\ast} will not be convex in  \theta^{\ast} for general  F\left( L\right) .21
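The Jensen's inequality effect can be illustrated numerically. The sketch below assumes a power production function  F(L)=L^{\alpha} with the output price normalized to one, so that the frictionless rule solves  \alpha\theta^{\ast}L^{\alpha-1}=w; the parameter values and the two-point mean-preserving spread are purely hypothetical.

```python
def frictionless_employment(theta, alpha=0.6, w=1.0):
    """Solve alpha * theta * L**(alpha - 1) = w for L (assumes F(L) = L**alpha)."""
    return (alpha * theta / w) ** (1.0 / (1.0 - alpha))


def jensen_growth(theta_now, spread, alpha=0.6, w=1.0):
    """Expected change in frictionless employment when next period's belief is
    theta_now +/- spread with equal probability (a martingale by construction)."""
    up = frictionless_employment(theta_now + spread, alpha, w)
    down = frictionless_employment(theta_now - spread, alpha, w)
    expected_next = 0.5 * (up + down)
    return expected_next - frictionless_employment(theta_now, alpha, w)


# Positive expected growth despite E[theta_next] == theta_now,
# because L*(theta) is convex when F is a power function.
growth = jensen_growth(theta_now=1.0, spread=0.2)
```

Because the exponent  1/(1-\alpha) exceeds one for  \alpha\in(0,1), the rule is convex in this special case; as noted in the text, this need not hold for general  F.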

In arguing heuristically about the impact of the proportional cost on firm growth we use the property that  MB_{\tau} is weakly increasing in  \theta_{\tau}^{\ast}, and that  L_{\tau}^{\ast} is locally weakly increasing in  \theta_{\tau}^{\ast}. Because it is not immediately obvious why firing and hiring costs should give similar incentives for firm growth, we analyze these two costs separately.22 We present in figure 1 the case where there is a hiring cost,  P^{H}>0, and no firing cost,  P^{F}=0. This figure assumes a given  L_{\tau-1}. For that specific value of  L_{\tau-1},  \theta^{SU} and  \theta^{SD} are the frontiers between non-adjustment and upward and downward adjustment, respectively. Therefore, if  \theta_{\tau}^{\ast}\in\left[ \theta^{SD},\theta^{SU}\right] there will be no adjustment and the marginal benefit of an additional unit of labor (represented by the dashed line) is contained in the interval  \left[ 0,P^{H}\right] . To simplify the argument, we consider a firm whose sequence of productivity draws is such that in every period it has a perceived productivity equal to the unconditional mean of  \theta^{\ast}, even though the firm's uncertainty over next period  \theta^{\ast} decreases with age.

Case 1: Hiring Cost:  P^{H}>0,  P^{F}=0

Because the firm starts at the hiring margin, we must have  MB_{0}=P^{H} at entry, and  MB_{\tau}\in\left[ 0,P^{H}\right] , for all subsequent periods,  \tau=1,2,\dots, with the two extremes of the interval representing firing and hiring of workers, respectively. Consider first a situation where exit is not allowed. Under this assumption, (9) would become

\displaystyle MB_{\tau}=\left( F^{\prime}\left( L_{\tau}^{\ast}\right) \theta_{\tau }^{\ast}-w\right) +\beta E_{\tau}MB_{\tau+1}
For the entry period, we have  MB_{0}\left( \theta_{0}^{\ast}\right) =P^{H} , which implies that the firm will start smaller when  P^{H}>0 than when  P^{H}=0.23 Since  MB_{1}\left( \theta_{1}^{\ast }\right) \in\left[ 0,P^{H}\right] ,  E_{0}\left( MB_{1}\right) <P^{H} and thus we must have  F^{\prime}\left( L_{0}^{\ast}\right) \theta_{0}^{\ast}-w>0, for all  \beta\in\left( 0,1\right) , if  P^{H}>0. In the following period, firms will adjust upwards as frequently with  P^{H}>0 as when  P^{H}=0, because they start at the hiring margin and  E_{0}\theta_{1}^{\ast}=\theta_{0}^{\ast} , even though they might have smaller magnitudes of adjustment due to the hiring cost. The proportional hiring cost implies that firms will adjust downwards only if  \theta_{1}^{\ast}<\theta_{1}^{SD}, so that there is a region of inaction when  P^{H}>0 that is not present when  P^{H}=0. That is, firms hire fewer workers initially because the resulting smaller probability of having to fire them, and therefore wasting the initial hiring cost, compensates for the expected decrease in profits this period. Consequently, in period 1 more firms will hire than fire, and this tendency towards growth in young firms will persist for several periods.

The Bayesian learning mechanism implies both persistence and a reduction in variance with age in the Markov process associated with  \theta^{\ast}. The effect of persistence, that is, the fact that  E\left( \theta_{\tau+1}^{\ast }\mid\theta_{\tau}^{\ast},\tau\right) =\theta_{\tau}^{\ast} , was analyzed in the previous paragraph. The reduced uncertainty in the posterior estimate of productivity will be reflected in a smaller inaction region as firms accumulate information on realized productivity; that is,  \theta^{SU} decreases with  \tau. This causes an increase in  E_{\tau}\left( MB_{\tau +1}\right) for those firms already at the hiring margin, which must be balanced by an increase in  L_{\tau}^{\ast} for the right hand side of (9) to remain equal to  P^{H}. As firms become more certain about their true productivity they are more willing to adjust to their long run optimal size. Because most firms are at the hiring margin, this will cause a further increase in average size.

Consider now the possibility of exit. In this case, the uncertainty reduction as the firm ages implies a decrease in the exit probability, and a further increase in the future-periods component of  MB in (9). Consequently,  L_{\tau}^{\ast} needs to increase further in order to offset that.24 On the other hand, the smaller exit probability implies less pruning of inefficient slow-growing firms as a cohort ages, which tends to make growth in average firm size smaller. Therefore, we will have less cohort growth due to non-survivors and more cohort growth due to survivors, so that survivors' contribution to average firm growth in the cohort should increase when exit is allowed.

Case 2: Firing Cost:  P^{F}>0,  P^{H}=0

In this case, we have  MB_{0}=0,  MB_{\tau}\in\left[ -P^{F},0\right] ,  \tau=1,2,\dots. Assume first that exit is not allowed. The intuition is the same as in case 1. In comparison with  P^{F}=0, when  P^{F}>0 firms start smaller and subsequently hire more frequently than they fire. As firms age, the reduction in variance of  \theta^{\ast} causes an increase in  E_{\tau}\left( MB_{\tau +1}\right) , which must be compensated by an increase in  L_{\tau}^{\ast} for firms at the hiring margin. When exit is possible, those effects become more intense, since the exit probability will decrease as firms age.

From the heuristic intuition we have just given it becomes clear that proportional hiring and proportional firing costs reinforce each other in creating incentives for firms to grow. In the end, our assessment of the relevance of linear adjustment costs for firm growth will depend on how well a pure selection model can fit the empirical evidence, and on how much adjustment costs improve the fit. Before we move into a quantitative assessment, we present analytical results for a simple version of the general model.

4 Model with One Period Learning Horizon and No Exit

In this section, we analyze a model where firms' efficiency is revealed after the first period of life and where firms' lifetime horizon is known with certainty at entry. We assume that firms live for  \bar{T} periods, where  \bar{T} is any integer greater than  1, and that no exit is allowed prior to age  \bar{T}. These two simplifications allow us to determine the effect of linear adjustment costs on firm growth.

The introduction of adjustment costs implies an additional expected operating cost for entering firms. Therefore, the equilibrium price must increase to generate higher expected future profits that compensate for the costs incurred while adjusting to optimal size. As a consequence, pre-entry pruning of inefficient firms should increase while post-entry pruning should decrease. This is optimal from a social point of view, since with higher adjustment costs there should be less experimentation in order to save in unrecoverable costs. Therefore, the assumption that exit is exogenous is not critical for the results in this section. Since adjustment costs attenuate post-entry pruning, even if exit was endogenous to the model, the relative contribution of survivors to growth in the cohort's average size would increase through this channel. By eliminating any exit prior to  \bar{T} we focus only on the incentives for survivors to grow.

To formulate the problem, we use the fact that once the firm learns its true efficiency in period 2, it will adjust once and for all to its long run employment level.25 Then, if upon exit at age  \bar{T} firms recover the initial investment net of exit costs, in period 2 we have

\displaystyle V^{S}\left( L_{1},\theta_{2}^{\ast}\right) =\max_{L_{2}}\left\{ \delta_{\bar{T}}\,\Pi\left( L_{2},\theta_{2}^{\ast}\right) -C^{S}\left( L_{2},L_{1}\right) +\beta^{\bar{T}-1}V^{EX}\left( L_{2}\right) \right\} (10)

where  \delta_{\bar{T}}\equiv\sum_{s=0}^{\bar{T}-2}\beta^{s}=\left( 1-\beta^{\bar{T}-1}\right) /\left( 1-\beta\right) . In period  1, we then have
\displaystyle V^{EN}\left( \theta_{1}^{\ast}\right) =\max_{L_{1}}\left\{ \Pi\left( L_{1},\theta_{1}^{\ast}\right) -C^{EN}\left( L_{1}\right) +\beta E_{1} \left[ V^{S}\left( L_{1},\theta_{2}^{\ast}\right) \right] \right\} .
Finally, in equilibrium potential entrants break even, i.e.,  E_{0}\left[ V^{EN}\left( \theta_{1}^{\ast}\right) \right] =I.

We examine the impact of adjustment costs on the log growth rate of employment, rather than the standard growth rate, in order to attenuate the effect of Jensen's inequality on firm growth.26 In this simple model, the inaction region of optimal employment can be expressed as an interval:  \Theta^{SN}=\left[ \theta ^{SD},\theta^{SU}\right] . Therefore, the average log growth rate between period 1 and period 2, conditional on  \theta_{1}^{\ast}, is defined as

\begin{multline*} g\left( \theta_{1}^{\ast}\right) =E\left[ \ln\left( L_{2}^{\ast}\right) -\ln\left( L_{1}^{\ast}\right) \right] = \\ \int_{\nu_{1}}^{\theta^{SD}}\left\{ \ln\left( L_{2}^{\ast SD}\right) -\ln\left( L_{1}^{\ast}\right) \right\} dF_{\theta_{1}^{\ast}}\left( \theta_{2}^{\ast}\right) +\int_{\theta^{SU}}^{\nu_{2}}\left\{ \ln\left( L_{2}^{\ast SU}\right) -\ln\left( L_{1}^{\ast}\right) \right\} dF_{\theta_{1}^{\ast}}\left( \theta_{2}^{\ast}\right) \end{multline*}

where  \Theta\equiv\left[ \nu_{1},\nu_{2}\right] is the support of the distribution of  \theta_{2}^{\ast}, and  \theta^{SD}\left( L_{1}^{\ast }\right) and  \theta^{SU}\left( L_{1}^{\ast}\right) are the frontiers between non-adjustment and downward and upward adjustment, respectively. Depending on the specific value of  \theta_{1}^{\ast} and the magnitude of the adjustment cost parameters, we might have  \theta^{SD}\left( L_{1}^{\ast }\right) =\nu_{1} and/or  \theta^{SU}\left( L_{1}^{\ast}\right) =\nu_{2}. However, in the results that follow, we assume that  \theta_{1}^{\ast} and the adjustment cost parameters are such that both downward adjustment and upward adjustment occur with positive probability, i.e.,  \theta^{SD}\left( L_{1}^{\ast}\right) >\nu_{1} and  \theta^{SU}\left( L_{1}^{\ast}\right) <\nu_{2}. Since we assume exogenous exit, we ignore the indirect effect of adjustment costs that works through changes in the equilibrium price, and implicitly assume that the research cost,  I, adjusts to maintain an equilibrium. These indirect price effects influence average firm size in both periods, but are of second order importance for the average log-growth rate.27

Optimal employment in period 2 is determined by

\begin{displaymath} L_{2}^{\ast}\left( L_{1},\theta_{2}^{\ast}\right) =\left\{ \... ...^{SD}\left( L_{1}\right) >\theta_{2}^{\ast} \end{array}\right. \end{displaymath}
where the frontiers of adjustment are defined as
\displaystyle \theta^{SU}\left( L_{1}\right) \equiv\frac{w+\frac{\beta^{\bar{T}-1}P^{F} }{\delta_{\bar{T}}}+\frac{P^{H}}{\delta_{\bar{T}}}}{F^{\prime}\left( L_{1}\right) }\text{, ~~~}\theta^{SD}\left( L_{1}\right) =\frac {w+\frac{\beta^{\bar{T}-1}P^{F}}{\delta_{\bar{T}}}-\frac{P^{F}}{\delta _{\bar{T}}}}{F^{\prime}\left( L_{1}\right) }\text{.}
Note that the numerator of  \theta^{SU} equals the pro-rated per-period cost of adding another worker, including the wage, the marginal hiring cost, and the discounted cost of firing the worker after period  \bar{T}. The numerator of  \theta^{SD} has a similar interpretation, as the benefit of shedding a worker.
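To make the inaction band concrete, the following sketch evaluates the frontiers  \theta^{SD} and  \theta^{SU} defined above and the implied period-2 employment. It is an illustration under stated assumptions rather than the procedure used in the paper: it takes  F(L)=L^{\alpha}, a period profit  \Pi(L,\theta)=\theta F(L)-wL with the output price normalized to one, and obtains adjusted employment from the first-order conditions of (10). Function names and parameter values are hypothetical.

```python
def delta_bar(beta, T_bar):
    """delta_{T_bar} = sum_{s=0}^{T_bar-2} beta**s."""
    return (1.0 - beta ** (T_bar - 1)) / (1.0 - beta)


def unit_costs(w, beta, T_bar, P_H, P_F):
    """Pro-rated per-period cost of adding a worker and benefit of shedding one
    (the numerators of theta^SU and theta^SD)."""
    d = delta_bar(beta, T_bar)
    exit_firing = beta ** (T_bar - 1) * P_F / d
    return w + exit_firing + P_H / d, w + exit_firing - P_F / d


def adjustment_frontiers(L1, alpha, w, beta, T_bar, P_H, P_F):
    """Return (theta_SD, theta_SU) for a given period-1 employment L1."""
    cost_up, cost_down = unit_costs(w, beta, T_bar, P_H, P_F)
    marginal_product = alpha * L1 ** (alpha - 1.0)  # F'(L1) for F(L) = L**alpha
    return cost_down / marginal_product, cost_up / marginal_product


def period2_employment(L1, theta2, alpha, w, beta, T_bar, P_H, P_F):
    """Hire above theta_SU, fire below theta_SD, otherwise keep L1.
    Assumes the downward unit cost is positive, i.e. P_F < w / (1 - beta)."""
    theta_sd, theta_su = adjustment_frontiers(L1, alpha, w, beta, T_bar, P_H, P_F)
    cost_up, cost_down = unit_costs(w, beta, T_bar, P_H, P_F)
    if theta2 > theta_su:
        cost = cost_up
    elif theta2 < theta_sd:
        cost = cost_down
    else:
        return L1  # inaction region
    # First-order condition: alpha * theta2 * L**(alpha - 1) = cost
    return (alpha * theta2 / cost) ** (1.0 / (1.0 - alpha))


params = dict(alpha=0.6, w=1.0, beta=0.95, T_bar=10, P_H=0.1, P_F=0.1)
theta_sd, theta_su = adjustment_frontiers(1.0, **params)
```

By construction the policy is continuous at both frontiers: at  \theta_{2}^{\ast}=\theta^{SU}\left( L_{1}\right) the hiring target equals  L_{1}, and likewise for the firing target at  \theta^{SD}\left( L_{1}\right) .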

We then have the following result concerning the effects of changes in  P^{H} and  P^{F} on the cohort's average log growth rate of employment.

Proposition 8  
Assuming that  F\left( \cdot\right) is a power function and that  \theta^{SD}>\nu_{1} and  \theta^{SU}<\nu_{2}:
(a) The marginal effect of  P^{H} on  g\left( \theta_{1}^{\ast}\right) , assuming  P^{F}=0, is positive for  \bar{T} sufficiently high.
(b) The marginal effect of  P^{F} on  g\left( \theta_{1}^{\ast}\right) , assuming  P^{H}=0, is positive for all  \bar{T}.

Proof. See appendix A.


Consider first the hiring cost. In the proof, we show that an increase in  P^{H} decreases both  L_{1}^{\ast} and  L_{2}^{\ast SU}. The impact of  P^{H} on the growth rate depends on two opposing effects. First, while in the case of  L_{2}^{\ast SU} the cost of hiring can be equally spread out over  \bar{T}-1 periods with certainty, in the case of  L_{1}^{\ast} it will be spread out over either  \bar{T} periods or one period, depending on whether the firm learns in period  2 that it has overhired. Therefore, ex ante a proportionately greater part of  P^{H} is attached to period  1 in the case of  L_{1}^{\ast} than in the case of  L_{2}^{\ast SU}, affecting more  L_{1}^{\ast} than  L_{2}^{\ast SU}. This explains the positive effect on growth of  P^{H} for  \bar{T}=\infty. Second, the hiring cost on  L_{1}^{\ast} can possibly be spread out over  \bar{T} periods, while the hiring cost on  L_{2}^{\ast SU} can only be spread out over  \bar{T}-1 periods. This affects more  L_{2}^{\ast SU} than  L_{1}^{\ast}, and explains why the effect of  P^{H} on growth is not necessarily positive for finite  \bar{T}. However, as  \bar{T} increases the first effect dominates so that  P^{H} decreases  L_{1}^{\ast} more than  L_{2}^{\ast SU} and growth increases.29

With respect to  P^{F} there is always a positive effect on growth, independently of the lifetime horizon. This occurs because an increase in  P^{F} decreases  L_{1}^{\ast} and increases  L_{2}^{\ast SD}. This positive effect always dominates the uncertain effect due to the fact that  L_{2}^{\ast SU} also decreases with  P^{F}.

When there are both hiring and firing costs and these costs are identical (  P^{H}=P^{F}=P), then an increase in  P has a positive effect on  g\left( \theta_{1}^{\ast}\right) , for sufficiently high  \bar{T}, where the required  \bar{T} is lower than in item (a) of proposition 8.

5 Calibration/Estimation Under Finite Learning Horizon

In the previous two sections, we developed heuristic and some formal arguments about the effect of adjustment costs on the incentives for firms to start smaller and grow faster after entry. In this section, we assess the contribution of adjustment costs to explain some of the basic facts on firm dynamics found in section 2, both for the overall economy and for the manufacturing and services sectors. To accomplish this, we perform a calibration/estimation of the model using computational methods.

To simulate the infinite learning horizon model we follow the suggestion of Ljungqvist and Sargent (2004) and consider an approximation where firms live forever, but learn their ex post true productivity component,  \mu_{1}, with certainty at some age  T.30

We assume that  F\left( \cdot\right) is a power function, i.e.,  F\left( L\right) =L^{\alpha},  \alpha\in\left( 0,1\right) . Under this assumption, when adjustment is costless, optimal employment conditional on survival is a convex function of  \theta_{\tau}^{\ast}, so that Jensen's inequality implies growth of employment even if there is no selection. As in the previous section, in order to avoid any growth due to Jensen's inequality, we take logs of all variables and analyze the effects of adjustment costs on the log-growth rate.

Concerning the productivity distribution, we assume that  \theta_{\tau} is lognormally distributed, i.e.,  \xi\left( \eta\right) =\exp\left\{ \eta\right\} .31 This assumption is made for computational simplicity, and it seems reasonable on empirical grounds (see Aw et al., 2004) . In addition, this assumption is not critical as the results in section 4 suggest that the distribution of productivity mostly affects the intensity of the effect of adjustment costs on firm growth, but not the sign. In fact, proposition 8 is derived independently of the particular distribution of  \theta_{\tau}^{\ast}. With the log-normal distribution assumption, the transition law for the  \theta^{\ast}s is as follows.

Proposition 9  
Let  \theta_{\tau}=\exp\left\{ \eta_{\tau}\right\} be generated as in assumption 1. Then,
(a) The posterior distribution of  \theta_{\tau+j} ( j\geq0), given the information set at time  \tau,  \Omega_{\tau}=\left\{ \mu_{0},\left\{ \eta_{s}\right\} _{s=0}^{\tau -1}\right\} if  \tau<T, and  \Omega_{\tau}=\left\{ \mu_{0},\mu _{1}\right\} if  \tau\geq T, is
\displaystyle \theta_{\tau+j}\mid_{\Omega_{\tau}}\sim\log N\left( Y_{\tau},Z_{\tau} +\sigma^{2}\right)   ,
where, for  \tau<T,  Y_{\tau} and  Z_{\tau} are defined in (2), and, for  \tau\geq T,  Y_{\tau}=\mu and  Z_{\tau}=0. Let  \theta_{\tau}^{\ast }=E\left( \theta_{\tau}\mid\Omega_{\tau}\right) =E\left( \theta_{\tau} \mid\theta_{\tau}^{\ast},\tau\right) . Then the distribution of  \theta_{\tau+j}^{\ast} ( j\geq1) given  \left( \theta_{\tau}^{\ast} ,\tau\right) is
\displaystyle \theta_{\tau+j}^{\ast}\mid_{\left( \theta_{\tau}^{\ast},\tau\right) } \sim\log N\left( \ln\left( \theta_{\tau}^{\ast}\right) -\frac{1}{2}\left( Z_{\tau}-Z_{\tau+j}\right) ,Z_{\tau}-Z_{\tau+j}\right)   .
Also, the unconditional distribution of  \theta_{\tau}^{\ast} ( \tau\geq0) is
\displaystyle \theta_{\tau}^{\ast}\sim\log N\left( \bar{\mu}+\frac{1}{2}\left( Z_{\tau }+\sigma^{2}\right) ,\sigma_{\mu}^{2}-Z_{\tau}\right) \text{,}
where  \sigma_{\mu}^{2}=\sigma_{\mu_{0}}^{2}+\sigma_{\mu_{1}}^{2}.

Proof. See appendix A.
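The transition law for  \theta^{\ast} in proposition 9 can be checked numerically. The sketch below is illustrative only: it takes the posterior variances  Z_{\tau} and  Z_{\tau+j} as given inputs (their formula is in (2), which is not reproduced here), and the numerical values are hypothetical.

```python
import math
import random


def next_belief_params(theta_star, z_now, z_later):
    """Mean and variance of ln(theta*_{tau+j}) given (theta*_tau, tau)."""
    variance = z_now - z_later
    mean = math.log(theta_star) - 0.5 * variance
    return mean, variance


def lognormal_mean(mean, variance):
    """E[X] when ln(X) ~ N(mean, variance)."""
    return math.exp(mean + 0.5 * variance)


def draw_next_belief(theta_star, z_now, z_later, rng):
    """Draw theta*_{tau+j} from the transition law."""
    mean, variance = next_belief_params(theta_star, z_now, z_later)
    return rng.lognormvariate(mean, math.sqrt(variance))


# Hypothetical values: the posterior variance falls from 0.4 to 0.3.
m, v = next_belief_params(theta_star=1.5, z_now=0.4, z_later=0.3)
expected_next = lognormal_mean(m, v)  # equals theta_star up to rounding
sample = draw_next_belief(1.5, 0.4, 0.3, random.Random(0))
```

The  -\frac{1}{2}\left( Z_{\tau}-Z_{\tau+j}\right) term in the log-mean is exactly what offsets the lognormal variance correction, so that  E\left( \theta_{\tau+j}^{\ast}\mid\theta_{\tau}^{\ast},\tau\right) =\theta_{\tau}^{\ast}.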


Since we assume that the firm enters the industry already knowing its ex ante productivity component  \mu_{0} (see assumption 1), we will get a non-degenerate distribution of initial size. This occurs because  L_{0}=L_{0}^{\ast}\left( \theta_{0}^{\ast}\right) , and  \theta_{0}^{\ast} has positive variance in the cohort's initial distribution. The next proposition analyzes the properties of the optimization problem after  \mu is revealed to the firm in period  T.

Proposition 10  
If  \mu is revealed to the firm at period  T, then all adjustments are made at period  T, and the firm will not change its exit and employment decisions after that period. This means that
\displaystyle V^{S}\left( \theta_{T}^{\ast},L_{T-1},T\right) =\max_{L}\left\{ \frac {1}{1-\beta}\Pi\left( L,\theta_{T}^{\ast}\right) -C^{S}\left( L,L_{T-1}\right) \right\}   , (11)
\displaystyle L_{s}^{\ast}=L^{\ast}\left( \theta_{T}^{\ast},L_{T-1},T\right) , \quad s\geq T ,
\displaystyle \chi_{T}^{\ast}=\mathbf{1}\left[ V^{S}\left( \theta _{T}^{\ast},L_{T-1},T\right) <V^{EX}\left( L_{T-1}\right) \right]   .
Proof. See appendix A.


This result allows a considerable simplification of the computational algorithm, since it implies a finite horizon dynamic programming problem. In appendix B, we present some details concerning the computational algorithm used to simulate and estimate the model. In the following subsections we calibrate and estimate the model and do a sensitivity analysis.
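The terminal-period problem in proposition 10 is simple enough to solve in closed form under a power production function, which gives a sense of the last step of the backward recursion. The sketch below is not the algorithm of appendix B; it assumes  F(L)=L^{\alpha},  \Pi(L,\theta)=\theta F(L)-wL with the output price normalized to one, and a symmetric cost  P with  (1-\beta)P<w. All names and values are hypothetical.

```python
def terminal_decision(theta, L_prev, alpha, w, beta, P, W):
    """Solve (11) and the period-T exit rule for one firm.

    Returns (employment if staying, value of staying, exit indicator)."""

    def target(unit_cost):
        # First-order condition: alpha * theta * L**(alpha - 1) = unit_cost
        return (alpha * theta / unit_cost) ** (1.0 / (1.0 - alpha))

    # A one-off cost P is equivalent to a per-period cost (1 - beta) * P.
    hire_target = target(w + (1.0 - beta) * P)
    fire_target = target(w - (1.0 - beta) * P)  # requires (1 - beta) * P < w

    # Hire up to hire_target, fire down to fire_target, otherwise do nothing.
    L = min(max(L_prev, hire_target), fire_target)

    profit = theta * L ** alpha - w * L
    value_stay = profit / (1.0 - beta) - P * abs(L - L_prev)
    value_exit = W - P * L_prev  # V^EX(L_{T-1}) = W - C^EX(L_{T-1})
    return L, value_stay, value_stay < value_exit


L, v_stay, exits = terminal_decision(
    theta=2.0, L_prev=1.5, alpha=0.6, w=1.0, beta=0.95, P=0.5, W=1.0
)
```

Since the objective in (11) is concave in  L on each side of  L_{T-1}, clamping  L_{T-1} between the hiring and firing targets yields the maximizer in this special case.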

5.1 Calibration and Estimation of Model with Costly Adjustment

We calibrate and estimate our model to match statistics from the 1988 cohort of entering firms, both for the overall economy and the manufacturing and service sectors. We first calibrate parameters related to inputs directly from the data. We then use the simulated method of moments to estimate the parameters associated with the learning process and the adjustment cost. These estimates are obtained so that the model generated moments match the evolution of firm size, of exit rates, and of the survivor component observed in the data. As discussed in appendix B, we find the set of parameter values that minimize the method of moments objective function by using a simulated annealing method. This optimization method is robust to local minima, to discontinuities, and to the discretization implemented in order to simulate the model.32 A central element of our estimation strategy is a decomposition of the change in the cohort's average size into a survivor component and a non-survivor component. This decomposition forces the model to match not only the growth in the cohort's average size but also the contribution of surviving and non-surviving firms to that growth. As in section 2, with  l_{\tau}\equiv\ln\left( L_{\tau}\right) , our decomposition is defined as

\displaystyle E\left[ l_{\tau}\mid S_{\tau}\right] -E\left[ l_{0}\mid S_{0}\right] =\underset{\text{Survivor Component}}{\underbrace{E\left[ l_{\tau}-l_{0}\mid S_{\tau}\right] }}+\underset{\text{Non-Survivor Component}}{\underbrace {\Pr\left( D^{\tau}\mid S_{0}\right) \left\{ E\left[ l_{0}\mid S_{\tau }\right] -E\left[ l_{0}\mid D^{\tau}\right] \right\} }} (12)
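The decomposition in (12) is an accounting identity, which the following sketch verifies on a toy cohort. The function name and the data are hypothetical and are not taken from Quadros de Pessoal.

```python
from statistics import mean


def decompose_cohort_growth(l0, l_tau, survived):
    """Split the change in average log size into the two components of (12).

    l0       : log employment at entry, one value per entrant
    l_tau    : log employment at age tau (ignored for firms that exited)
    survived : True if the firm is still active at age tau
    """
    surv = [i for i, alive in enumerate(survived) if alive]
    dead = [i for i, alive in enumerate(survived) if not alive]

    total = mean([l_tau[i] for i in surv]) - mean(l0)
    survivor_component = mean([l_tau[i] - l0[i] for i in surv])

    exit_rate = len(dead) / len(survived)  # Pr(D^tau | S_0)
    if dead:
        selection_gap = mean([l0[i] for i in surv]) - mean([l0[i] for i in dead])
    else:
        selection_gap = 0.0
    non_survivor_component = exit_rate * selection_gap
    return total, survivor_component, non_survivor_component


# Toy cohort of four entrants; the smallest one exits before age tau.
total, surv_part, exit_part = decompose_cohort_growth(
    l0=[1.0, 2.0, 3.0, 4.0],
    l_tau=[None, 2.5, 3.5, 5.0],
    survived=[False, True, True, True],
)
```

In this toy cohort the non-survivor component is positive because the exiting firm was smaller at entry than the survivors, which is the selection effect the decomposition is designed to isolate.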

Prior to estimation, we calibrate some parameters. The parameters  \alpha and  w are calibrated with data from INE (1997) containing the Inquérito Annual às Empresas from 1990 to 1995. These data are considered reliable and cover all firms in the Portuguese economy, with sampling among firms with fewer than 20 workers. We measure  \alpha as the 1990-1995 average of the cost share of labor in value added, and  w as the 1990-1995 average cost per worker. We can also obtain these values at the sectoral level. We deflate all nominal variables using the GDP sectoral price indices available in the updated version of Séries Longas para a Economia Portuguesa in Banco de Portugal (1997). The real interest rate is calibrated as the 1990-1995 average of the implicit real interest rate on public debt transactions on the secondary market of the Lisbon Stock Exchange. These data were also taken from Banco de Portugal (1997). We deflate the nominal interest rates using the December-to-December consumer price index from INE (1990-5). The discount factor is then obtained as  \beta=\frac{1}{1+r}, where  r is the average real interest rate.

The remaining parameters,  \bar{\mu},  \sigma_{\mu_{0}},  \sigma_{\mu_{1}},  \sigma,  W, and  P are estimated using a simulated method of moments estimator, which attempts to make the model match closely the evidence on cohort dynamics presented in section 2. In particular, the estimates are selected to minimize a weighted sum of the distance between the following moments in the model and the data: (a) the time-series of the mean of log-employment conditional on survival,  E\left[ l_{\tau}\mid S_{\tau }\right] ; (b) the time-series of the cumulative change in the standard deviation of log-employment conditional on survival,  SD\left[ l_{\tau}\mid S_{\tau}\right] ;33 (c) the time-series of the cumulative exit rate,  \Pr\left( D^{\tau}\mid S_{0}\right) ; and (d) the time-series of the survivor component, as defined in (12). In estimating the above parameters, the output price is normalized to  1, and the initial research cost,  I, is obtained by the equilibrium condition  I=E\left( V^{EN}\left( \theta_{0}^{\ast}\right) \right) .34
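The structure of such an objective can be sketched as follows. This is a generic illustration, not the estimator of appendix B: it assumes a diagonal weighting of squared moment gaps, and `toy_simulator` is a hypothetical stand-in for the model simulation that maps parameters into the stacked moments (a) to (d).

```python
def smm_objective(params, simulate_moments, data_moments, weights):
    """Weighted sum of squared gaps between simulated and empirical moments."""
    model_moments = simulate_moments(params)
    return sum(
        weight * (model - data) ** 2
        for weight, model, data in zip(weights, model_moments, data_moments)
    )


def toy_simulator(params):
    """Hypothetical stand-in for the model: maps two parameters into three moments."""
    a, b = params
    return [a, a + b, a * b]


data = [1.0, 3.0, 2.0]        # hypothetical empirical moments
weights = [1.0, 1.0, 1.0]     # hypothetical diagonal weights
q_at_truth = smm_objective((1.0, 2.0), toy_simulator, data, weights)
q_elsewhere = smm_objective((1.5, 2.0), toy_simulator, data, weights)
```

An optimizer such as simulated annealing would then search over `params` for the minimum of this function; the search routine itself is omitted here.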

The decision to estimate  W, instead of calibrating it, deserves some discussion. First, the main purpose of this parameter is to induce endogenous exit, as it represents the firm's opportunity cost of remaining in activity. In Hopenhayn (1992), the same result is accomplished using a fixed per period operating cost. Since the per period operating cost can be seen as the periodic payment in an annuity with a present discounted value of  W, the two mechanisms are equivalent. Second, because Quadros de Pessoal lacks any reliable capital stock variable, we consider the capital decision to be exogenous. A rough estimate of the magnitude of  W is the present discounted value of an annuity with annual payments equal to the 1990-1995 average of value added minus labor costs, using the same deflators as for  w.35 Because the sample is biased towards surviving firms, this measure overestimates the value of  W, and we cannot use it as a reference to calibrate  W. Consequently, we estimate  W jointly with the remaining parameters in the model.
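
The annuity equivalence can be illustrated with a short sketch. It assumes end-of-period payments and, by default, an infinite horizon, neither of which is stated in the text; the function names are hypothetical.

```python
def annuity_present_value(payment, r, periods=None):
    """Present discounted value of a constant end-of-period payment.

    Assumption: payments arrive at the end of each period. With
    periods=None the annuity is a perpetuity, so the value is payment / r.
    """
    if periods is None:
        return payment / r
    return payment * (1.0 - (1.0 + r) ** (-periods)) / r


def equivalent_operating_cost(W, r):
    """Per-period cost whose perpetuity value equals the opportunity cost W."""
    return W * r
```

Under these assumptions a fixed operating cost of  rW per period and a one-time opportunity cost of  W carry the same present value, which is the sense in which the two exit mechanisms coincide.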

We present in table 6 the calibrated and estimated parameters for the three cohorts, both for the model with (AC) and without (NAC) adjustment costs, and in figures 2 and 3 we plot the data and simulated moments in the estimated AC and NAC models. We start by making some general remarks on our estimates. First, more information is revealed ex post  (\sigma_{\mu_{1} }>\sigma_{\mu_{0}}) and there is significant noise in the learning process (  \sigma>\sigma_{\mu_{0}},  \sigma>\sigma_{\mu_{1}}). Second, consistent with our expectations, all estimates of  W are below the rough estimates presented in footnote 35. Third, the inferred values for  I are close to  10\% of  W. Finally, the standard errors of the estimated parameters are relatively small, suggesting that the parameters are estimated with good precision.36

For the overall economy cohort, the AC model implies an estimate for the proportional cost of about  6.3\% of the annual wage. By comparing the estimated NAC and AC models, in terms of the objective function  Q^{\ast} and the simulated moments in figure 2, we conclude that the proportional cost clearly improves the overall fit of the model, with a particularly notable improvement in the fit of the survivor component. Although the NAC model can generate moments on firm size and exit rates that are close to equivalent empirical moments, it cannot satisfactorily match the empirical survivors' contribution. That is, the NAC model cannot explain the main source of growth in the cohort's average size, since survivors contribute much more to growth in the data than in the NAC model. This shortcoming is especially intense in the initial years after entry, when the distance between the survivor component in the simulated NAC model and in the data is largest, suggesting that in the absence of adjustment costs learning has a larger initial impact on the exit of small inefficient firms than on growth of survivors.

In discussing the results for the manufacturing and services sectors we consider the estimates for the AC model. Manufacturing firms learn relatively less initially about their efficiency than services firms (  \sigma_{\mu_{0}}/\sigma_{\mu_{1}} is smaller in manufacturing). Moreover, in order to account for the higher survivor component, adjustment costs in manufacturing are larger than in services (proportional costs amount to  16.6\% and  0.8\% of the annual wage, respectively). Because of larger adjustment costs and lesser relative knowledge about efficiency at entry, manufacturing firms have higher incentives to start relatively small and to gradually adjust to optimal size as they survive and their uncertainty is resolved. In addition, the smaller relative knowledge at entry in manufacturing explains the lower relative initial research cost ( I/W is lower in manufacturing) and the higher entry rate (  1-\Pr(S_{0}) equals  27.4\% in manufacturing and  47.9\% in services).

As can be seen from the last row of table 6, while the introduction of adjustment costs markedly improves the fit of the model for the overall economy and manufacturing cohorts, it improves the fit for the services cohort only marginally.37 Therefore, the form of adjustment costs considered in the paper seems more relevant for the average firm in the manufacturing sector and the overall economy than for the average firm in the services sector. More generally, although the introduction of adjustment costs clearly improves the overall fit of the model, especially with respect to the survivor component, the model cannot entirely explain the path of the survivor component in the data. In fact, in all three cohorts the initial growth of survivors seems larger than what the AC model can explain. This might be a consequence of the discretization scheme adopted for  \theta^{\ast} in the simulation.38 More importantly, this might reflect other explanations for firm growth that cannot be captured by adjustment costs in our model, such as some mechanisms through which financing constraints operate.

5.2 Sensitivity Analysis

In this subsection we explain some aspects of the calibration/estimation exercise and provide a detailed sensitivity analysis to all parameters in the model. First, we do not attempt to match the level of the cross-sectional variance of log-employment, but only its change over time. This is because to fit the dispersion in employment, the model would require substantially larger values for both  \sigma_{\mu_{0}} and  \sigma_{\mu_{1}}. This would allow the model to match  Var\left[ l_{\tau}\mid S_{\tau }\right] and  \Pr\left( D^{\tau}\mid S_{0}\right) but would also imply an excessive rate of growth in  E\left[ l_{\tau}\mid S_{\tau }\right] . However, this shortcoming is not a serious problem. It implies that only a fraction of the observed cohort's employment dispersion can be attributed to a Bayesian learning process about efficiency. The remaining part could be attributed to heterogeneity in the initial choice of technology.

For instance, consider a model where capital is endogenous and suppose that a firm chooses its initial stock of capital,  K_{0}, based on the realization of a random variable indexing technology choice. Assume further that, after selecting  K_{0}, the firm keeps its capital stock unchanged for the remainder of its life. If the production function has constant returns to scale, and if the total opportunity cost is proportional to  K_{0}, i.e.,  \tilde{W}=WK_{0}, then we can easily prove that  \tilde{V}\left( K_{0},L_{\tau-1},\theta_{\tau}^{\ast},\tau\right) =K_{0}V\left( \frac{L_{\tau-1}}{K_{0}},\theta_{\tau}^{\ast},\tau\right) , where  \tilde{V} is the value function conditional on the chosen  K_{0}. Therefore, in this alternative framework, dispersion in  K_{0} would govern the initial dispersion in employment and only the subsequent evolution in employment dispersion would depend on the Bayesian learning process.39 This is the reason why we attempt to match only the evolution of  SD\left[ l_{\tau}\mid S_{\tau}\right] , but not its level. In the estimated models presented in table 6, less than  40\% of the observed dispersion in the cohort's log-employment can be attributed to the learning process, with the percentage smaller in the manufacturing sector and higher in the services sector.40

Second, the value of  \sigma_{\mu_{0}}/\sigma_{\mu_{1}} affects the long-run contribution of survivors, since a relatively smaller initial dispersion would make the average size of exiting firms closer to the average size of surviving firms in the entry period, and in this case most growth would be due to survivors. In the aforementioned extended model with an initial choice over  K_{0}, if we had  \sigma_{\mu_{0}}=0 we would have a non-degenerate initial distribution of size, entirely due to the heterogeneity in  K_{0}, but the survivors' component would be  100\% in each period. This would occur because the distribution of initial size among exiting firms would be equal to the distribution of initial size among surviving firms. This also explains why even with heterogeneity over  K_{0}, we would still need to assume  \sigma_{\mu_{0}}>0 in order to match the empirical facts on the importance of the survivor component.

While we could increase the long-run contribution of survivors by tinkering with the ratio  \sigma_{\mu_{0}}/\sigma_{\mu_{1}}, without adjustment costs the model cannot match satisfactorily the observed flatness in the path of the survivor component. For any choice of  \sigma_{\mu_{0}} and  \sigma_{\mu_{1}}, it will always be the case that the survivor component will exhibit a substantially increasing path in the absence of adjustment costs. Note also that the ratio  \sigma_{\mu_{0}}/\sigma_{\mu_{1}} affects both the exit rate and the evolution of the firm size dispersion. If this ratio becomes too small, post-entry exit rates become excessively high and the size dispersion increases too fast. This is the reason why in the NAC model we cannot find a value for this ratio that attains the long-run contribution of survivors found in the data, and simultaneously matches the behavior of the cumulative exit rate and the evolution of the size dispersion. Therefore, the value we select for this ratio is disciplined by the exit rates and the evolution of firm size dispersion in the cohort.

Third, to show that proportional adjustment costs are crucial to fit the evidence on the contribution of survivors to growth in the cohort's average size, in figure 4 we perform a sensitivity analysis with respect to each parameter in the model. We take as benchmark the estimated NAC model for the overall economy cohort in table 6. We vary each parameter around its benchmark value and plot the implied cumulative exit rate and survivor component at four different ages of the cohort (ages  1,  2,  5, and  9). We present both the exit rate and the survivor component to show that the essential shortcoming of the NAC model is the inability to increase the survivor component without an unreasonable increase in exit rates.

From figure 4, we see that the model with costless adjustment cannot match satisfactorily the contribution of survivors to growth, even if we allow parameters (except for the proportional cost) to vary one by one from their benchmark values. In fact, no other parameter besides the proportional cost (in the lower-right plot) can increase the survivor component without changing much the exit rate. In addition, adjustment costs imply more than just a mere level effect on the survivor component, as the increase in the survivor component at age  1 is larger than the increase at age  9, shrinking the distance between the survivor component at different ages of the cohort. In summary, the main effect of these costs is to put more emphasis on individual firm growth in the initial years of life, when exit of inefficient firms is very intense.41

To emphasize the role of the proportional adjustment cost in replicating the evidence on the contribution of survivors, in figure 5 we present the impact of changes in  P on the survivor component, using as benchmark the estimates for the AC model in the overall economy cohort. We conclude that allowing for even a small value of  P has a substantial impact on the survivors' contribution, with a larger effect in the initial years after entry.

6 Conclusion

In this paper, we show that a model with linear adjustment costs and learning about efficiency generates incentives for firms to enter smaller and, if successful, expand faster after entry. For a cohort of entrant firms in the Portuguese economy, we present evidence showing that growth in the cohort's average size is driven largely by growth of survivors rather than by pruning of small inefficient firms, with rapid growth of survivors in the initial post-entry years and significant cross-sector differences in the contribution of survivors. A calibration and estimation of the model reveals that the proportional cost is the key parameter to explain the high contribution of survivors to growth in the cohort's average size. Furthermore, due to a higher contribution of survivors, adjustment costs need to be substantially higher in manufacturing than in services.

The empirical success of our model in better approximating the growth of survivors as the main source for growth suggests that adjustment costs do play a significant role in post-entry firm size adjustments. Our results suggest that selection theories are more relevant to explain firm exit than growth of survivors. Our specification of adjustment costs assumes that they are proportional to the adjustment size and apply equally at entry, exit, and during regular job creation and destruction. These costs could capture aspects such as costs to the organization, layout, and optimization of the production process, and hiring and firing costs. Although we have not collected evidence documenting the nature of these costs, we would expect them to be larger in sectors employing more complex technologies, such as in manufacturing industries. Potentially, part of the adjustment costs we estimate could also reflect the impact of financial frictions, although we would expect these to be concentrated in entry costs associated with the acquisition of capital, and not so much in regular job creation and destruction costs.

More generally, financing constraints theories should also play a role in explaining growth of survivors, besides what can possibly be captured by adjustment costs in our model, although there is not much evidence that financing constraints can explain cross-sector differences. Notwithstanding this, Angelini and Generale's (2008) conclusion that financing constraints are not the main determinant behind the evolution of the firm size distribution suggests that any government intervention to eliminate financing constraints might not change the lifetime dynamics of firm size we find in this paper. In addition, this paper suggests that, in sectors where adjustment costs are high and learning is important, government policies aimed at curbing financing constraints might not produce the intended results, as firms under those circumstances have incentives to start smaller and, if successful, expand faster, even if financing constraints are eliminated.

A. Appendix: Proofs

Lemma 2    \Omega_{\tau}\equiv\left\{ \mu_{0},\left\{ \eta_{s}\right\} _{s=0}^{\tau-1}\right\} can be summarized by  \left( \theta_{\tau}^{\ast} ,\tau\right) , and the distribution function  F\left( \theta_{\tau+1}^{\ast}\mid\theta_{\tau}^{\ast} ,\tau\right) is a continuous and strictly decreasing function of  \theta_{\tau}^{\ast}.

Proof. From (2) we have
\displaystyle \theta_{\tau}^{\ast}=g\left( Y_{\tau},\tau\right) =E\left( \xi\left( \eta_{\tau}\right) \mid Y_{\tau},\tau\right) =\nu_{1}+\int_{-\infty} ^{\infty}\left[ 1-F_{\eta}\left( \eta_{\tau}\mid Y_{\tau},\tau\right) \right] d\xi\left( \eta_{\tau}\right)   ,
where  F_{\eta}\left( \cdot\mid Y_{\tau},\tau\right) is the posterior distribution of  \eta_{\tau}. Because  F_{\eta}\left( \eta_{\tau}\mid Y_{\tau},\tau\right) is continuous and strictly decreasing in  Y_{\tau}, and  \xi\left( \eta_{\tau}\right) is strictly increasing in  \eta_{\tau}, we conclude that  g\left( Y_{\tau},\tau\right) is continuous and strictly increasing in  Y_{\tau} (see theorem 3.4.1 in Swartz, 1994) . Therefore, for the purpose of predicting  \theta_{\tau},  \Omega_{\tau }\equiv\left\{ \mu_{0},\left\{ \eta_{s}\right\} _{s=0}^{\tau-1}\right\} \equiv\left\{ Y_{\tau},\tau\right\} \equiv\left\{ \theta_{\tau}^{\ast} ,\tau\right\} , since  Y_{\tau}=g_{Y}^{-1}\left( \theta_{\tau}^{\ast} ,\tau\right) , where  g_{Y}^{-1} is the inverse function of  g with respect to  Y_{\tau}. Using the recursion
\displaystyle Y_{\tau+1}=\frac{\sigma^{-2}}{Z_{\tau+1}^{-1}}\eta_{\tau}+\frac{Z_{\tau}^{-1} }{Z_{\tau+1}^{-1}}Y_{\tau}\text{,}
the conditional distribution of  \theta_{\tau+1}^{\ast} can be represented as
\displaystyle F\left( \theta_{\tau+1}^{\ast}\mid\theta_{\tau}^{\ast},\tau\right) =F_{\eta }\left[ \frac{Z_{\tau+1}^{-1}}{\sigma^{-2}}g_{Y}^{-1}\left( \theta_{\tau +1}^{\ast},\tau+1\right) -\frac{Z_{\tau}^{-1}}{\sigma^{-2}}g_{Y}^{-1}\left( \theta_{\tau}^{\ast},\tau\right) \mid g_{Y}^{-1}\left( \theta_{\tau}^{\ast },\tau\right) ,\tau\right] \text{,}
since we need to integrate the density of  \eta_{\tau} over the domain where  g\left( Y_{\tau+1},\tau+1\right) \leq\theta_{\tau+1}^{\ast}. From this, we conclude that  F\left( \theta_{\tau+1}^{\ast}\mid\theta_{\tau }^{\ast},\tau\right) is a continuous and strictly decreasing function of  \theta_{\tau}^{\ast}. Therefore, the transition function associated with  F\left( \theta_{\tau+1}^{\ast}\mid\theta_{\tau }^{\ast},\tau\right) is monotone and satisfies the Feller property (see pp. 376-9 in Stokey et al., 1989) .
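
The recursion for  Y_{\tau} used in the proof can be sketched numerically. The snippet treats  Y_{\tau} as the posterior mean and  Z_{\tau} as the posterior variance of  \mu, and assumes the usual normal-normal precision update  Z_{\tau+1}^{-1}=Z_{\tau}^{-1}+\sigma^{-2}; the function names are hypothetical.

```python
def update_beliefs(Y, Z, eta, sigma2):
    """One Bayesian step after observing the signal eta.

    Y, Z: current posterior mean and variance of mu.
    sigma2: variance of the noise in the signal.
    Assumption: precisions add, Z_next^{-1} = Z^{-1} + sigma2^{-1}.
    """
    precision_next = 1.0 / Z + 1.0 / sigma2
    Z_next = 1.0 / precision_next
    # Precision-weighted average of the new signal and the old mean.
    Y_next = Z_next * (eta / sigma2 + Y / Z)
    return Y_next, Z_next


def update_many(Y, Z, signals, sigma2):
    """Apply the one-step update to a sequence of signals."""
    for eta in signals:
        Y, Z = update_beliefs(Y, Z, eta, sigma2)
    return Y, Z


def closed_form(Y, Z, signals, sigma2):
    """j-step formulas: Z_j = (1/Z + j/sigma2)^{-1}, Y_j as a weighted sum."""
    j = len(signals)
    Z_j = 1.0 / (1.0 / Z + j / sigma2)
    Y_j = Z_j * sum(signals) / sigma2 + (Z_j / Z) * Y
    return Y_j, Z_j
```

Iterating the one-step update reproduces the closed-form  j-step expressions, so only the running pair  \left( Y_{\tau},\tau\right) needs to be tracked rather than the full signal history.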

Proof of proposition 4. We use the following notation: (i)  X\equiv\mathbb{R}_{+}\times\Theta\times\mathbb{N}_{0} and  x\equiv\left( L,\theta,\tau\right) \in X, where  \Theta\equiv\left[ \nu_{1},\nu _{2}\right] \subset\mathbb{R}_{+},  \nu_{1}\geq0,  \nu_{2}<\infty; (ii)  T is the operator associated with (4); (iii)  M denotes the following operator
\displaystyle \left( MV^{S}\right) \left( L_{\tau},\theta_{\tau}^{\ast},\tau\right) =\int_{\nu_{1}}^{\nu_{2}}\max\left\{ V^{EX}\left( L_{\tau}\right) ,V^{S}\left( L_{\tau},\theta_{\tau+1}^{\ast},\tau+1\right) \right\} dF\left( \theta_{\tau+1}^{\ast}\mid\theta_{\tau}^{\ast},\tau\right)   ;
(iv)  V_{O}^{S},  V_{O}^{SD}, and  V_{O}^{SU} denote the objective functions associated with  V^{S},  V^{SD}, and  V^{SU}, that is, for  j=S,SD,SU
\displaystyle V_{O}^{j}\left( L_{\tau};L_{\tau-1},\theta_{\tau}^{\ast},\tau\right) =\Pi\left( L_{\tau},\theta_{\tau}^{\ast}\right) -C^{j}\left( L_{\tau },L_{\tau-1}\right) +\beta\left( MV^{S}\right) \left( L_{\tau} ,\theta_{\tau}^{\ast},\tau\right) \text{.}

We prove the proposition in several steps.

Existence and Uniqueness: This follows from the Contraction Mapping Theorem and Blackwell's sufficient conditions (see theorems 3.2 and 3.3 in Stokey et al., 1989) .
Continuity in  \left( L_{\tau-1},\theta_{\tau}^{\ast}\right) : Let  C_{12}\left( X\right) be the space of bounded functions on  X which are continuous in  \left( L_{\tau-1},\theta_{\tau}^{\ast}\right) . This is clearly a closed subset of  B\left( X\right) , the space of bounded functions  V^{S}:X\rightarrow\mathbb{R}. Since  B\left( X\right) with the sup norm  \left\Vert V^{S}\right\Vert =\sup_{x\in X}\left\vert V^{S}\left( x\right) \right\vert is a Banach space, then  C_{12}\left( X\right) is also a Banach space. Now consider  V^{S}\in C_{12}\left( X\right) . Because  \max\left\{ V^{EX},V^{S}\right\} is also continuous and  F\left( \theta_{\tau+1}^{\ast}\mid\theta_{\tau}^{\ast },\tau\right) satisfies the Feller property (see lemma 2), then  MV^{S} is continuous in  \left( L_{\tau},\theta_{\tau}^{\ast}\right) (see lemma 9.5 in Stokey et al., 1989). Since  \Pi\left( L_{\tau},\theta_{\tau}^{\ast}\right) -C^{S}\left( L_{\tau},L_{\tau-1}\right) is continuous, then  V_{O}^{S}\left( L_{\tau };L_{\tau-1},\theta_{\tau}^{\ast},\tau\right) is continuous in  \left( L_{\tau};L_{\tau-1},\theta_{\tau}^{\ast}\right) . Therefore, applying the maximum theorem, we conclude that  V^{S}\left( L_{\tau -1},\theta_{\tau}^{\ast},\tau\right) is continuous in  \left( L_{\tau-1},\theta_{\tau}^{\ast}\right) . Note that the set of admissible values for employment can be made compact. First, only non-negative values are acceptable for employment. Second, we can choose a value for  L_{\tau} high enough, say  L^{UB}, such that  L_{\tau}^{SU\ast}\left( L_{\tau-1},\theta_{\tau}^{\ast },\tau\right) \leq L^{UB}, for all  L_{\tau-1}\leq L^{UB}, so that all values of interest are considered.  L^{UB} is finite since  F^{\prime}\left( \infty\right) =0, and  MV^{S} is bounded. Therefore,  V^{S} as defined by (4) is continuous in  \left( L_{\tau-1},\theta_{\tau}^{\ast}\right) .
Strict Monotonicity in  \theta_{\tau}^{\ast}: From lemma 2 (the transition function associated with  F\left( \theta_{\tau+1}^{\ast}\mid\theta_{\tau }^{\ast},\tau\right) is monotone) if  V^{S}\left( L_{\tau },\theta_{\tau+1}^{\ast},\tau+1\right) is weakly increasing in  \theta_{\tau+1}^{\ast}, then  \left( MV^{S}\right) \allowbreak\left( L_{\tau},\theta_{\tau}^{\ast},\tau\right) is also weakly increasing in  \theta_{\tau}^{\ast}. Then, because  \Pi\left( L_{\tau },\theta_{\tau}^{\ast}\right) is strictly increasing in  \theta_{\tau}^{\ast} (and the constraint set is not affected by  \theta_{\tau}^{\ast}),  V^{S}\left( L_{\tau -1},\theta_{\tau}^{\ast},\tau\right) is strictly increasing in  \theta_{\tau}^{\ast} (see theorem 9.11 in Stokey et al., 1989) .
Exit Policy: The exit policy is determined by the condition \displaystyle V^{EX}\left( L_{\tau-1}\right) \equiv V^{S}\left( L_{\tau-1},\theta_{\tau }^{\ast},\tau\right) \text{.} Because, for each  L_{\tau-1},  V^{EX} is constant and  V^{S} is strictly increasing in  \theta_{\tau}^{\ast}, then it is obvious that  \theta ^{EX}\left( L_{\tau-1},\tau\right) is a unique function defined by the value of  \theta^{\ast}\in\left[ \nu_{1},\nu_{2}\right] that satisfies the above equation, if it exists, or by  \nu_{1}, when  V^{EX}\left( L\right) <V^{S}\left( L,\nu_{1},\tau\right) , or by  \nu_{2} , when  V^{EX}\left( L\right) >\allowbreak V^{S}\left( L,\nu_{2},\tau\right) . Because both  V^{EX} and  V^{S} are continuous functions, then  \theta^{EX} is also a continuous function in  L. \qedsymbol
Proposition 5  
Let  \bar{T} be the maximum allowed age, so that a firm entering in period 0 must exit the industry at the end of period  \bar{T}. Then  \Pr\left( \theta_{\tau+1}^{\ast}\in\Theta_{\bar{T}}^{D}\left( L_{\tau },\tau+1\right) \mid\theta_{\tau}^{\ast},\tau\right) =1, for all  L_{\tau }\in\mathbb{R}_{+},  \tau\in\left\{ 0,\dots,\bar{T}-1\right\} where \displaystyle \Theta_{\bar{T}}^{D}\left( L_{\tau},\tau+1\right) =\left\{ \theta_{\tau +1}^{\ast}\in\Theta:V_{\bar{T}}^{S}\left( L_{\tau},\theta_{\tau+1}^{\ast },\tau+1\right) \text{ is differentiable at }L_{\tau}\right\} \text{.} Consequently, the objective functions associated with  V_{\bar{T}}^{SD} and  V_{\bar{T}}^{SU} are continuously differentiable in  L, and all optima are interior in the region of their definition.42

Proof of proposition 5. We prove this by induction. In period  \bar{T}, we have \displaystyle V_{\bar{T}}^{S}\left( L_{\bar{T}-1},\theta_{\bar{T}}^{\ast},\bar{T}\right) =\max_{L_{\bar{T}}}\left\{ \Pi\left( L_{\bar{T}},\theta_{\bar{T}}^{\ast }\right) -C^{S}\left( L_{\bar{T}},L_{\bar{T}-1}\right) +\beta V^{EX}\left( L_{\bar{T}}\right) \right\} \text{,} so that  V_{\bar{T},O}^{SD},  V_{\bar{T}}^{SN} and  V_{\bar{T},O}^{SU} are continuously differentiable functions of  L_{\bar{T}},  L_{\bar{T}-1}, and  L_{\bar{T}}, respectively. Since  V_{\bar{T},O}^{SD}\left( L;L,\theta ^{\ast},\bar{T}\right) =V_{\bar{T}}^{SN}\left( L,\theta^{\ast},\bar {T}\right) ,  V_{\bar{T},O}^{SU}\left( L;L,\theta^{\ast},\bar{T}\right) =V_{\bar{T}}^{SN}\left( L,\theta^{\ast},\bar{T}\right) ,  F^{\prime }\left( 0^{+}\right) =\infty,  F^{\prime}\left( \infty\right) =0, and  V^{EX} is bounded above, then  V_{\bar{T},O}^{SD} and  V_{\bar{T},O}^{SU} have interior optima in the regions of definition of  V_{\bar{T}}^{SD} and  V_{\bar{T}}^{SU}. Therefore, those optima are independent of  L_{\bar{T}-1}, and we must have  \partial V_{\bar{T}}^{SD}/\partial L_{\bar{T}-1}=-P,  \partial V_{\bar{T}}^{SU}/\partial L_{\bar{T}-1}=P, in the regions of their definition, and \displaystyle \frac{\partial V_{\bar{T}}^{SN}}{\partial L_{\bar{T}-1}}=F^{\prime}\left( L_{\bar{T}-1}\right) \theta_{\bar{T}}^{\ast}-w-\beta P. We conclude that  V_{\bar{T}}^{S}\left( L_{\bar{T}-1},\theta_{\bar{T}}^{\ast },\bar{T}\right) is continuously differentiable at  L_{\bar{T}-1} \in\mathbb{R}_{+}, with probability one (given  F\left( \cdot\mid \theta_{\bar{T}-1}^{\ast},\bar{T}-1\right) and for all  \theta_{\bar{T} -1}^{\ast}\in\Theta).
Now consider a generic period  \tau\in\left\{ 1,\dots,\bar{T}-1\right\} , and assume that  V_{\bar{T}}^{S}\left( L_{\tau},\theta_{\tau+1}^{\ast} ,\tau+1\right) is continuously differentiable at  L_{\tau }\in\mathbb{R}_{+} with probability one. Because  \theta^{EX}\left( L_{\tau} ,\tau+1\right) is a unique continuous function of  L, we can apply the dominated convergence theorem to conclude that  \left( MV_{\bar{T}} ^{S}\right) \left( L_{\tau},\theta_{\tau}^{\ast},\tau\right) is continuously differentiable at  L_{\tau}, for all  \theta_{\tau}^{\ast}\in\Theta (see theorems 3.2.16 and 3.4.3 in Swartz, 1994) . Consequently, the same argument used for period  \bar{T} can be repeated here.  \qedsymbol

Proof of proposition 6. For given  \left( L_{\tau-1},\tau\right) we partition the state-space associated with  \theta_{\tau}^{\ast},  \Theta, into regions of exit,  \Theta^{EX}, downward adjustment,  \Theta^{SD}, non-adjustment,  \Theta^{SN}, and upward adjustment,  \Theta^{SU}:43
\displaystyle \Theta^{EX}\left( L_{\tau-1},\tau\right) =\left\{ \theta:V^{EX} >V^{S}\right\} \text{,}
\displaystyle \Theta^{SD}\left( L_{\tau-1},\tau\right) =\left\{ \theta\in\Theta :V^{SD}>V^{SN}\text{, }V^{SD}\geq V^{SU}\text{, }V^{SD}\geq V^{EX}\right\} \text{,}
\displaystyle \Theta^{SN}\left( L_{\tau-1},\tau\right) =\left\{ \theta\in\Theta :V^{SN}\geq V^{SD}\text{, }V^{SN}\geq V^{SU}\text{, }V^{SN}\geq V^{EX} \right\} \text{,}
\displaystyle \Theta^{SU}\left( L_{\tau-1},\tau\right) =\left\{ \theta\in\Theta :V^{SU}>V^{SN}\text{, }V^{SU}\geq V^{SD}\text{, }V^{SU}\geq V^{EX}\right\} \text{.}

If it is optimal for the firm to adjust upwards, then we must solve

\displaystyle A_{SU}=\left[ F^{\prime}\left( L_{\tau}^{\ast}\right) \theta_{\tau}^{\ast }-\left( w+P\right) \right] +\beta\frac{\partial\left( MV^{S}\right) \left( L_{\tau}^{\ast},\theta_{\tau}^{\ast},\tau\right) }{\partial L}=0\text{,}
and if it is optimal for the firm to adjust downwards, we must solve
\displaystyle A_{SD}=\left[ F^{\prime}\left( L_{\tau}^{\ast}\right) \theta_{\tau}^{\ast }-\left( w-P\right) \right] +\beta\frac{\partial\left( MV^{S}\right) \left( L_{\tau}^{\ast},\theta_{\tau}^{\ast},\tau\right) }{\partial L}=0\text{.}
Now, the derivative can be rewritten as
\begin{multline*} \frac{\partial\left( MV^{S}\right) \left( L_{\tau},\theta_{\ta... ...ta_{\tau+1}^{\ast}\mid \theta_{\tau}^{\ast},\tau\right) \text{,} \end{multline*}

where some of the regions might be empty, and in separating the integrals we have taken into account the continuity of the integrand in  MV^{S} at the frontiers.

For each of the above derivatives we have

\displaystyle \frac{\partial V^{EX}\left( L_{\tau}\right) }{\partial L}=-P\text{,}
\displaystyle \left. \frac{\partial V^{SD}\left( \cdot\right) }{\partial L}\right\vert _{\theta_{\tau+1}^{\ast}\in\Theta^{SD}\left( L_{\tau},\tau+1\right) }=-P=\left[ F^{\prime}\left( L_{\tau+1}^{\ast}\right) \theta_{\tau+1} ^{\ast}-w\right] +\beta\frac{\partial\left( MV^{S}\right) \left( L_{\tau+1}^{\ast},\theta_{\tau+1}^{\ast},\tau+1\right) }{\partial L}\text{,}
\displaystyle \frac{\partial V^{SN}\left( \cdot\right) }{\partial L_{\tau}}=\left[ F^{\prime}\left( L_{\tau}\right) \theta_{\tau+1}^{\ast}-w\right] +\beta\frac{\partial\left( MV^{S}\right) \left( L_{\tau},\theta_{\tau +1}^{\ast},\tau+1\right) }{\partial L}\text{,}
\displaystyle \left. \frac{\partial V^{SU}\left( \cdot\right) }{\partial L}\right\vert _{\theta_{\tau+1}^{\ast}\in\Theta^{SU}\left( L_{\tau},\tau+1\right) }=P=\left[ F^{\prime}\left( L_{\tau+1}^{\ast}\right) \theta_{\tau+1}^{\ast }-w\right] +\beta\frac{\partial\left( MV^{S}\right) \left( L_{\tau +1}^{\ast},\theta_{\tau+1}^{\ast},\tau+1\right) }{\partial L}\text{,}
where we have used the fact that  A_{SU}=0, when it is optimal to adjust upwards, and  A_{SD}=0, when it is optimal to adjust downwards. Therefore, we have
\begin{multline*} \frac{\partial\left( MV^{S}\right) \left( L_{\tau}^{\ast},\the... ...u+1}^{\ast},\tau+1\right) }{\partial L}\right\} \right) \text{,} \end{multline*}
Using the law of iterated expectations, we can rewrite the above as
\displaystyle \frac{\partial\left( MV^{S}\right) \left( L_{\tau}^{\ast},\theta_{\tau }^{\ast},\tau\right) }{\partial L}=\sum_{s=1}^{\infty}E_{\tau}\beta ^{s-1}\left\{ \tilde{\chi}_{\tau+s}^{\ast}\left( -P\right) +\hat{\chi }_{\tau+s}^{\ast}\left[ F^{\prime}\left( L_{\tau+s}^{\ast}\right) \theta_{\tau+s}^{\ast}-w\right] \right\} \text{.}
The result now follows by plugging this expression into  A_{SU} and  A_{SD}.  \qedsymbol

Proof of corollary 7. We can rewrite the LHS of (6) and (7) as follows \begin{multline*} MB_{\tau}=\left( F^{\prime}\left( L_{\tau}^{\ast}\right) \thet... ...ast}\right) \theta_{\tau+1+s}^{\ast}-w\right) \right] \right\} . \end{multline*}
Taking into account that  \hat{\chi}_{\tau+1}^{\ast}\tilde{\chi}_{\tau +1+s}^{\ast}=\tilde{\chi}_{\tau+1+s}^{\ast},  \hat{\chi}_{\tau+1}^{\ast} \hat{\chi}_{\tau+1+s}^{\ast}=\hat{\chi}_{\tau+1+s}^{\ast},  \hat{\chi} _{\tau+1}^{\ast}=1-\chi_{\tau+1}^{\ast}, and  \tilde{\chi}_{\tau+1}^{\ast }=\chi_{\tau+1}^{\ast}, then we get the stated result.  \qedsymbol

Proof of proposition 8. With proportional costs, optimal employment at entry is determined by
\begin{multline*} F^{\prime}\left( L_{1}\right) \theta_{1}^{\ast}-\left( w+P^{H}\right) +\beta\left( \int_{\nu_{1}}^{\theta^{SD}}-P^{F}dF_{\theta_{1}^{\ast}}\left( \theta_{2}^{\ast}\right) +\right. \\ \left. \int_{\theta^{SD}}^{\theta^{SU}}\left\{ \delta_{\bar{T}}\left[ F^{\prime}\left( L_{1}\right) \theta_{2}^{\ast}-w\right] -\beta^{\bar{T} -1}P^{F}\right\} dF_{\theta_{1}^{\ast}}\left( \theta_{2}^{\ast}\right) +\int_{\theta^{SU}}^{\nu_{2}}P^{H}dF_{\theta_{1}^{\ast}}\left( \theta _{2}^{\ast}\right) \right) =0\text{.} \end{multline*}

(a) In the case of a proportional hiring cost, assuming  P^{F}=0, we have
\displaystyle L_{1}=F^{\prime-1}\left( \frac{w}{\theta^{SD}}\right) =F^{\prime-1}\left( \frac{w+\frac{P^{H}}{\delta_{\bar{T}}}}{\theta^{SU}}\right)   ,    
\displaystyle \frac{\partial L_{1}^{\ast}}{\partial P^{H}}=\frac{F^{\prime}\left( L_{1}^{\ast}\right) }{F^{\prime\prime}\left( L_{1}^{\ast}\right) } \frac{\tilde{w}_{H}}{w\tilde{w}_{w}+P^{H}\tilde{w}_{H}},    
\displaystyle \tilde{w}_{w}=1+\beta\delta_{\bar{T}}\left[ F_{\theta_{1}^{\ast}}\left( \theta^{SU}\right) -F_{\theta_{1}^{\ast}}\left( \theta^{SD}\right) \right]   ,   \displaystyle \tilde{w}_{H}=1-\beta\left[ 1-F_{\theta_{1}^{\ast}}\left( \theta^{SU}\right) \right]   .    

After some algebra we get
\begin{multline*} \frac{\partial g}{\partial P^{H}}=-\frac{1}{F_{el}\left( L_{1}^{\ast}\right) }\frac{\tilde{w}_{H}}{w\tilde{w}_{w}+P^{H}\tilde{w}_{H}}F_{\theta_{1}^{\ast} }\left( \theta^{SD}\right) -\\ \int_{\theta^{SU}}^{\nu_{2}}\left\{ \frac{1}{F_{el}\left( L_{1}^{\ast }\right) }\frac{\tilde{w}_{H}}{w\tilde{w}_{w}+P^{H}\tilde{w}_{H}}-\frac {1}{F_{el}\left( L_{2}^{\ast SU}\right) }\frac{1}{w\delta_{\bar{T}}+P^{H} }\right\} dF_{\theta_{1}^{\ast}}\left( \theta_{2}^{\ast}\right) \text{,} \end{multline*}

where  F_{el}\left( L\right) =F^{\prime\prime}\left( L\right) L/F^{\prime }\left( L\right) stands for the elasticity of the marginal product of labor. If  F\left( L\right) =AL^{\alpha}, we have  F_{el}=\left( \alpha-1\right) , and the above expression simplifies to
\begin{multline*} \frac{\partial g}{\partial P^{H}}=\left( 1-\alpha\right) ^{-1}\left( \frac{\tilde{w}_{H}}{w\tilde{w}_{w}+P^{H}\tilde{w}_{H}}\left\{ F_{\theta _{1}^{\ast}}\left( \theta^{SD}\right) +\left[ 1-F_{\theta_{1}^{\ast} }\left( \theta^{SU}\right) \right] \right\} -\right. \\ \left. \frac{1}{w\delta_{\bar{T}}+P^{H}}\left[ 1-F_{\theta _{1}^{\ast}}\left( \theta^{SU}\right) \right] \right) \text{,} \end{multline*}

which is positive when  \bar{T} is high enough so that \displaystyle w\left\{ \delta_{\bar{T}}F_{\theta_{1}^{\ast}}\left( \theta^{SD}\right) -\beta^{\bar{T}-1}\left[ 1-F_{\theta_{1}^{\ast}}\left( \theta^{SU}\right) \right] \right\} +P^{H}\left\{ 1-\beta\left[ 1-F_{\theta_{1}^{\ast} }\left( \theta^{SU}\right) \right] \right\} F_{\theta_{1}^{\ast}}\left( \theta^{SD}\right) >0

(b) In the case of a proportional firing cost, assuming  P^{H}=0, we get similarly \begin{multline*} \frac{\partial g}{\partial P^{F}}=\int_{\nu_{1}}^{\theta^{SD}}... ... dF_{\theta_{1}^{\ast} }\left( \theta_{2}^{\ast}\right) \text{,} \end{multline*}

\displaystyle \tilde{w}_{w}=1+\beta\delta_{\bar{T}}\left[ F_{\theta_{1}^{\ast}}\left( \theta^{SU}\right) -F_{\theta_{1}^{\ast}}\left( \theta^{SD}\right) \right]    
\displaystyle \tilde{w}_{F}=F_{\theta_{1}^{\ast}}\left( \theta^{SD}\right) +\beta^{\bar {T}-1}\left[ F_{\theta_{1}^{\ast}}\left( \theta^{SU}\right) -F_{\theta _{1}^{\ast}}\left( \theta^{SD}\right) \right]    

Under the assumption that marginal productivity is always positive, we need  P^{F}<\frac{w}{1-\beta}, since otherwise the firm would prefer to pay the worker his lifetime salary rather than fire him. If  F\left( L\right) =AL^{\alpha}, we have  F^{\prime}/\left( LF^{\prime\prime}\right) =\left( \alpha-1\right) ^{-1}, and the above expression simplifies to \begin{multline*} \frac{\partial g}{\partial P^{F}}=\left( 1-\alpha\right) ^{-1}... ...1-F_{\theta_{1}^{\ast}}\left( \theta^{SU}\right) \right] \right) \end{multline*}

which is positive for all  \bar{T}.  \qedsymbol

Proof of proposition 9. The result concerning the posterior distribution of  \theta_{\tau+j} follows directly from \displaystyle \ln\left( \theta_{\tau+j}\right) \mid_{\Omega_{\tau}}=\mu\mid_{\Omega_{\tau }}+\varepsilon_{\tau+j}\text{, }\mu\mid_{\Omega_{\tau}}\sim N\left( Y_{\tau },Z_{\tau}\right) \text{.}
For the distribution of  \theta_{\tau+j}^{\ast} conditional on  \left( \theta_{\tau}^{\ast} ,\tau\right) , we use the fact that
\displaystyle \ln\left( \theta_{\tau+j}^{\ast}\right) \mid_{\Omega_{\tau}}=Y_{\tau+j} \mid_{\Omega_{\tau}}+\frac{1}{2}\left( Z_{\tau+j}+\sigma^{2}\right)    
\displaystyle Y_{\tau+j}=\sigma^{-2}Z_{\tau+j}\sum_{s=\tau}^{\tau+j-1}\eta_{s}+\frac {Z_{\tau+j}}{Z_{\tau}}Y_{\tau},    
\displaystyle Z_{\tau+j}=Z_{\tau}-\sigma^{-2}Z_{\tau+j}Z_{\tau}j,    
\displaystyle \eta_{s}\mid_{\Omega_{\tau}}\sim N\left( Y_{\tau},Z_{\tau}+\sigma^{2}\right)   , \displaystyle Cov\left( \eta_{s},\eta_{s^{\prime}}\mid\Omega_{\tau}\right) =Var\left( \mu\mid\Omega_{\tau}\right) =Z_{\tau}\text{, for }s,s^{\prime}\geq\tau\text{, }s\neq s^{\prime}\text{,}

so that, in the end, we get
\displaystyle E\left[ \ln\left( \theta_{\tau+j}^{\ast}\right) \mid\Omega_{\tau}\right] =Y_{\tau}+\frac{1}{2}\left( Z_{\tau+j}+\sigma^{2}\right)   ,    
\displaystyle Var\left[ \ln\left( \theta_{\tau+j}^{\ast}\right) \mid\Omega_{\tau}\right] =Z_{\tau}-Z_{\tau+j}.    

From here the result follows by noting that  \ln\left( \theta_{\tau}^{\ast }\right) =Y_{\tau}+\frac{1}{2}\left( Z_{\tau}+\sigma^{2}\right) .
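The two conditional moments above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it builds the covariance matrix of the signals \eta_{s} from the expressions in the proof and confirms that E[Y_{\tau+j}\mid\Omega_{\tau}]=Y_{\tau} and Var[Y_{\tau+j}\mid\Omega_{\tau}]=Z_{\tau}-Z_{\tau+j}. The function names and the parameter values are illustrative assumptions.

```python
import numpy as np

def posterior_variance(z_tau, sigma2, j):
    """Closed form of Z_{tau+j} implied by Z_{tau+j} = Z_tau - Z_{tau+j} * Z_tau * j / sigma2."""
    return z_tau * sigma2 / (sigma2 + j * z_tau)

def moments_of_future_mean(y_tau, z_tau, sigma2, j):
    """Mean and variance of Y_{tau+j} given Omega_tau, built from the signal moments."""
    z_next = posterior_variance(z_tau, sigma2, j)
    # The signals eta_s share the common component mu, so any two have covariance Z_tau;
    # each has variance Z_tau + sigma2.
    cov_eta = np.full((j, j), z_tau) + sigma2 * np.eye(j)
    weights = np.full(j, z_next / sigma2)          # weight on each new signal
    mean = weights.sum() * y_tau + (z_next / z_tau) * y_tau
    var = float(weights @ cov_eta @ weights)
    return mean, var, z_next

# Illustrative values only (not the estimates reported in the paper).
y_tau, z_tau, sigma2, j = 3.0, 0.06, 0.45, 4
mean, var, z_next = moments_of_future_mean(y_tau, z_tau, sigma2, j)
print(mean, var, z_tau - z_next)
```

The printed variance should coincide with Z_{\tau}-Z_{\tau+j} up to rounding error, which is the variance used in the conditional distribution of \ln\left( \theta_{\tau+j}^{\ast}\right).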

For the unconditional distribution, just note that  \ln\left( \theta_{\tau }^{\ast}\right) is a sum of normal random variables, and that

\displaystyle E\left[ \ln\left( \theta_{\tau}^{\ast}\right) \right] =\bar{\mu}+\frac {1}{2}\left( Z_{\tau}+\sigma^{2}\right)    
\displaystyle Var\left[ \ln\left( \theta_{\tau}^{\ast}\right) \right] =\sigma_{\mu_{0} }^{2}+\left( Z_{0}-Z_{\tau}\right)    
Proof of proposition 10. After period  T-1 the optimization problem is time invariant, since there is no uncertainty concerning  E\left( \theta\right) . Therefore, for periods  s,  s\geq T, we have
\begin{multline*} V^{S}\left( \theta_{T}^{\ast},L_{s-1},T\right) =\max_{L_{s}\geq0,\chi_{s} \in\left\{ 0,1\right\} }\left\{ \left[ \Pi\left( L_{s},\theta_{T}^{\ast }\right) -C^{S}\left( L_{s},L_{s-1}\right) \right] +\right. \ \left. \beta\left\{ \chi_{s}\left[ W-C^{EX}\left( L_{\tau+s}\right) \right] +\left( 1-\chi_{s}\right) V^{S}\left( \theta_{T}^{\ast} ,L_{s},T\right) \right\} \right\} \text{.} \end{multline*}

Consider a firm that is in the industry at time  s,  s\geq T. We now prove that this firm will not change its employment level in period  s+1. For this, we use the easily proven fact that it is less costly to adjust in one step than in two steps, i.e.,
\displaystyle C^{S}\left( L_{s+1},L_{s}^{\ast}\right) +C^{S}\left( L_{s}^{\ast} ,L_{s-1}\right) \geq C^{S}\left( L_{s+1},L_{s-1}\right)   ,
where  L_{s}^{\ast}=L_{s}\left( \theta_{T}^{\ast},L_{s-1},T\right) . We then have
  \displaystyle \Pi\left( L_{s+1},\theta_{T}^{\ast}\right) -C^{S}\left( L_{s+1} ,L_{s}^{\ast}\right) +\beta\max\left\{ V^{EX}\left( L_{s+1}\right) ,V^{S}\left( \theta_{T}^{\ast},L_{s+1}\right) \right\}    
  \displaystyle \leq\Pi\left( L_{s+1},\theta_{T}^{\ast}\right) -C^{S}\left( L_{s+1},L_{s-1}\right) +\beta\max\left\{ V^{EX}\left( L_{s+1}\right) ,V^{S}\left( \theta_{T}^{\ast},L_{s+1}\right) \right\} +C^{S}\left( L_{s}^{\ast},L_{s-1}\right)    
  \displaystyle \leq V^{S}\left( \theta_{T}^{\ast},L_{s-1}\right) +C^{S}\left( L_{s}^{\ast},L_{s-1}\right)    
  \displaystyle =\Pi\left( L_{s}^{\ast},\theta_{T}^{\ast}\right) -C^{S}\left( L_{s} ^{\ast},L_{s-1}\right) +\beta\max\left\{ V^{EX}\left( L_{s}^{\ast}\right) ,V^{S}\left( \theta_{T}^{\ast},L_{s}^{\ast}\right) \right\} +C^{S}\left( L_{s}^{\ast},L_{s-1}\right)    
  \displaystyle =V^{SN}\left( \theta_{T}^{\ast},L_{s}^{\ast}\right)   .    

Therefore, at time  s+1 it is optimal to set  L_{s+1}^{\ast}=L_{s}^{\ast}.

We now prove that the firm does not exit at time  s+1 after remaining in the industry at time  s,  s\geq T. Because the firm stays at time  s, then  V^{S}\left( \theta_{T}^{\ast},L_{s-1}^{\ast},T\right) \geq V^{EX}\left( L_{s-1}^{\ast}\right) . Now assume that in period  s+1 the firm exits, so that

\displaystyle V^{S}\left( \theta_{T}^{\ast},L_{s}^{\ast},T\right) <V^{EX}\left( L_{s}^{\ast}\right) \Leftrightarrow\Pi\left( L_{s}^{\ast},\theta_{T}^{\ast }\right) <\left( 1-\beta\right) V^{EX}\left( L_{s}^{\ast}\right) \text{.}
This then implies
\displaystyle V^{S}\left( \theta_{T}^{\ast},L_{s-1}^{\ast},T\right) \displaystyle <\left( 1-\beta\right) V^{EX}\left( L_{s}^{\ast}\right) -C^{S}\left( L_{s}^{\ast },L_{s-1}^{\ast}\right) +\beta V^{EX}\left( L_{s}^{\ast}\right)    
  \displaystyle =V^{EX}\left( L_{s}^{\ast}\right) -C^{S}\left( L_{s}^{\ast},L_{s-1} ^{\ast}\right) \leq V^{EX}\left( L_{s-1}^{\ast}\right)    

which is a contradiction.  \qedsymbol

B. Appendix: Simulation and Estimation Algorithm

(i) Discretization and transition probability matrices associated with  \theta^{\ast}:
We discretize  \theta_{\tau}^{\ast} with a uniform discrete approximation (with  25 mass points) to the distribution  \log N\left( \mu+\frac{1}{2}\left( Z_{\tau}+\sigma^{2}\right) ,\sigma_{\mu} ^{2}-Z_{\tau}\right) .
We then use Tauchen's (1986) method to build the transition matrices, computing integrals via Gauss-Legendre quadrature.
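As an illustration of step (i), the sketch below builds one Tauchen-style transition matrix between the grids of two consecutive periods. It is a simplified stand-in rather than the procedure used here: it assumes evenly spaced grids in \ln\theta^{\ast}, places bin edges at midpoints, and uses differences of the normal CDF instead of Gauss-Legendre quadrature. The conditional distribution \ln\theta_{\tau+1}^{\ast}\mid\theta_{\tau}^{\ast}\sim N\left( \ln\theta_{\tau}^{\ast}-\frac{1}{2}\left( Z_{\tau}-Z_{\tau+1}\right) ,Z_{\tau}-Z_{\tau+1}\right) is the one implied by the proof of proposition 9; the function names, grid bounds, and parameter values are hypothetical.

```python
import math
import numpy as np

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def tauchen_transition(log_grid_now, log_grid_next, z_now, z_next):
    """Row i holds Pr(theta*_{tau+1} in bin j | ln theta*_tau = log_grid_now[i])."""
    sd = math.sqrt(z_now - z_next)
    # Bin edges are midpoints between neighbouring next-period grid points.
    edges = 0.5 * (log_grid_next[1:] + log_grid_next[:-1])
    P = np.empty((len(log_grid_now), len(log_grid_next)))
    for i, x in enumerate(log_grid_now):
        cond_mean = x - 0.5 * (z_now - z_next)
        cdf = np.array([normal_cdf((e - cond_mean) / sd) for e in edges])
        P[i, 0] = cdf[0]                 # mass below the first edge
        P[i, 1:-1] = np.diff(cdf)        # mass between consecutive edges
        P[i, -1] = 1.0 - cdf[-1]         # mass above the last edge
    return P

# Illustrative inputs: 25 mass points, as in the text; bounds and variances are made up.
z_now, sigma2 = 0.06, 0.45
z_next = z_now * sigma2 / (sigma2 + z_now)
log_grid = np.linspace(2.0, 4.0, 25)
P = tauchen_transition(log_grid, log_grid, z_now, z_next)
```

Each row of P is a probability distribution over next-period grid points, so it can be used directly to evaluate the expectation E_{\tau} in the value functions.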

(ii) Discretization of  L
Based on the decision rules for problem (11) we consider

\displaystyle l\equiv\ln\left( L\right) \sim N\left( \mu_{L},\sigma_{L}^{2}\right),
\displaystyle \mu_{L}=\frac{1}{1-\alpha}\left\{ \bar{\mu}+\frac{1}{2}\sigma^{2}+\ln\left( \frac{\alpha p}{\sqrt{\left[ w+P^{SU}+\beta P^{EX}\right] \left[ w-\left( 1-\beta\right) P^{SD}\right] }}\right) \right\},
\displaystyle \sigma_{L}^{2}=\frac{1}{\left( 1-\alpha\right) ^{2}}\sigma_{\mu}^{2}.
For  \mu_{L} we assume that at the upper end of the grid the firm decreases employment, and at the lower end of the grid the firm increases employment and exits next period. We then discretize L similarly to  \theta^{\ast} (with  800 mass points) using a unique grid for all periods.

(iii) Choice for  T
We choose  T=15, and display results until period  10.

(iv) Model simulation
For a given set of parameters we numerically compute the optimal entry, employment, and exit policy rules.
First, for each  \theta^{\ast} grid point, we compute optimal employment in
\displaystyle \tilde{V}^{SU}\left( \theta_{\tau}^{\ast}\right) =\max_{L_{\tau}}\left\{ \Pi\left( L_{\tau},\theta_{\tau}^{\ast}\right) -PL_{\tau}+\beta E_{\tau} \max\left\{ V^{EX}\left( L_{\tau}\right) ,V^{S}\left( L_{\tau} ,\theta_{\tau+1}^{\ast}\right) \right\} \right\}   ,    
\displaystyle \tilde{V}^{SD}\left( \theta_{\tau}^{\ast}\right) =\max_{L_{\tau}}\left\{ \Pi\left( L_{\tau},\theta_{\tau}^{\ast}\right) +PL_{\tau}+\beta E_{\tau} \max\left\{ V^{EX}\left( L_{\tau}\right) ,V^{S}\left( L_{\tau} ,\theta_{\tau+1}^{\ast}\right) \right\} \right\}   ,    

as these do not depend on  L_{\tau-1}. We first find the maximizer on the grid for  L and then use a golden section method to obtain a more precise maximizer.44 Second, we get  V^{SU} and  V^{SD}, determine the inaction regions, and get  V^{S}. Third, we compute all endogenous decisions associated with each realization of the  N_{s}=150,000 random lifetime histories of  \left\{ \theta_{\tau}^{\ast}\right\} . Fourth, we compute the (simulated) moments.
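The two-stage maximization described above (coarse grid, then golden section refinement) can be sketched as follows. This is a generic illustration under stated assumptions, not the code used for the paper: the objective is a static stand-in, \theta L^{\alpha}-\left( w+P\right) L, that omits the continuation value, and the grid and parameter values are hypothetical.

```python
import math

def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximize a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:                      # maximizer lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                            # maximizer lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

# Illustrative static objective (assumption): revenue minus wage and hiring cost.
alpha, theta, w, P = 0.56, 20.0, 11.8, 0.74

def objective(L):
    return theta * L ** alpha - (w + P) * L

# Stage 1: best point on a coarse grid for L.
grid = [0.05 * k for k in range(1, 101)]
k_best = max(range(len(grid)), key=lambda k: objective(grid[k]))

# Stage 2: refine inside the bracket formed by the neighbouring grid points.
lo = grid[max(k_best - 1, 0)]
hi = grid[min(k_best + 1, len(grid) - 1)]
L_star = golden_section_max(objective, lo, hi)
print(L_star)
```

For this concave stand-in the refined maximizer can be compared with the closed form \left[ \alpha\theta/\left( w+P\right) \right] ^{1/\left( 1-\alpha\right) }. In the full problem the objective need not be concave, so the bracket around the best grid point is only a local guarantee.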

(v) Moments used in estimation
Let  \tilde{T}=\left\{ 1,2,3,4,5,6,7,8,9,10\right\} . We consider four sets of moments:
(a) Exit rate: For  \tau\in\tilde{T}, \displaystyle f_{a\tau i}=\mathbf{1}\left( \chi_{\tau,i}=1\right) -\Pr\left( D^{\tau}\mid S_{0}\right)

(b) Average current size conditional on survival: For  \tau \in\left\{ 0\right\} \cup\tilde{T}, \displaystyle f_{b\tau i}=l_{\tau,i}\mathbf{1}\left( \chi_{\tau,i}=0\right) -E\left[ l_{\tau}\mid S_{\tau}\right] \Pr\left( S_{\tau}\mid S_{0}\right)
(c) Relative change in variance of current size conditional on survival: For  \tau\in\tilde{T},45\begin{multline*} f_{c\tau i}=\left\{ l_{\tau,i}-E\left[ l_{\tau}\mid S_{\tau}\right] \right\} ^{2}\mathbf{1}\left( \chi_{\tau,i}=0\right) -\left\{ l_{0,i} ^{2}-E\left[ l_{0}\mid S_{0}\right] ^{2}\right\} \times\ \left\{ E[l_{\tau}^{2}\mid S_{\tau}]-E\left[ l_{\tau}\mid S_{\tau}\right] ^{2}\right\} \Pr\left( S_{\tau}\mid S_{0}\right) /\left\{ E[l_{0}^{2}\mid S_{0}]-E\left[ l_{0}\mid S_{0}\right] ^{2}\right\} \end{multline*}
(d) Average entry size conditional on survival: For  \tau\in\tilde{T},46\displaystyle f_{d\tau i}=l_{0,i}\mathbf{1}\left( \chi_{\tau,i}=0\right) -E\left[ l_{0}\mid S_{\tau}\right] \Pr\left( S_{\tau}\mid S_{0}\right)

(vi) Weighting matrix

The weighting matrix is estimated as the sample covariance matrix of the moments in (v), adjusted for the simulation size, \displaystyle \Sigma=\left( 1+\frac{1}{N_{s}/N}\right) Var\left( f_{\cdot\cdot i}\right) \text{,} where \displaystyle f_{\cdot\cdot i}=[f_{a\cdot i}^{\prime}~~f_{b\cdot i}^{\prime }~~f_{c\cdot i}^{\prime}~~f_{d\cdot i}^{\prime}]^{\prime}\text{.}

(vii) Estimation method

We use a simulated annealing method to search for the set of parameter values  b=\left( \bar{\mu},\sigma_{\mu_{0} },\sigma_{\mu_{1}},\right. \allowbreak\left. \sigma,W,P\right) ^{\prime} that minimizes the method of moments objective function,47\displaystyle Q=\left( \frac{1}{N}\sum\nolimits_{i=1}^{N}f_{\cdot\cdot i}\right) ^{\prime }\left( \frac{1}{N}\Sigma\right) ^{-1}\left( \frac{1}{N}\sum\nolimits_{i=1} ^{N}f_{\cdot\cdot i}\right)
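To make steps (vi) and (vii) concrete, the sketch below evaluates the objective Q for a given matrix of moment conditions. It is a minimal illustration under stated assumptions: the array f stands for the stacked f_{\cdot\cdot i} (one row per firm), the data are random placeholders, and the function name is hypothetical. The simulated annealing search itself is not shown.

```python
import numpy as np

def mm_objective(f, n_sim):
    """Q = g' (Sigma / N)^{-1} g, with g the sample mean of the rows of f.

    f     : (N, K) array, row i holds the stacked moment conditions for firm i
    n_sim : number of simulated histories N_s used to compute the model moments
    """
    n_obs = f.shape[0]
    g_bar = f.mean(axis=0)
    # Sample covariance of the moments, inflated by 1 + N / N_s for simulation error.
    sigma = (1.0 + n_obs / n_sim) * np.cov(f, rowvar=False)
    return float(g_bar @ np.linalg.solve(sigma / n_obs, g_bar))

# Placeholder data (assumption): 500 firms, 3 moment conditions.
rng = np.random.default_rng(0)
f = rng.normal(size=(500, 3))
q = mm_objective(f, n_sim=150_000)
print(q)
```

Because \Sigma scales with 1+N/N_{s}, a larger number of simulated histories lowers the simulation correction and raises Q for the same moment deviations.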

(viii) Standard errors

The standard errors of the estimated parameters are obtained as follows \displaystyle std(\hat{b})=\left[ \left( \frac{\partial\left( N^{-1}\sum_{i=1} ^{N}f_{\cdot\cdot i}\left( \hat{b}\right) \right) }{\partial b^{\prime} }\right) ^{\prime}\left( \frac{1}{N}\Sigma\right) ^{-1}\left( \frac{\partial\left( N^{-1}\sum_{i=1}^{N}f_{\cdot\cdot i}\left( \hat {b}\right) \right) }{\partial b^{\prime}}\right) \right] ^{-1}\text{,}
where the matrix of derivatives is computed numerically.
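A possible implementation of step (viii) is sketched below. It computes the matrix of derivatives by central differences and then the bracketed inverse in the formula above. Two points are assumptions of this sketch rather than statements from the text: the step-size rule, and the reading that reported standard errors are the square roots of the diagonal of that inverse. The moment function used in the example is a linear placeholder.

```python
import numpy as np

def numerical_jacobian(g, b, rel_step=1e-5):
    """Central-difference Jacobian of the mean moment vector g(b) with respect to b."""
    b = np.asarray(b, dtype=float)
    g0 = np.asarray(g(b), dtype=float)
    jac = np.empty((g0.size, b.size))
    for k in range(b.size):
        h = rel_step * max(1.0, abs(b[k]))
        up, dn = b.copy(), b.copy()
        up[k] += h
        dn[k] -= h
        jac[:, k] = (np.asarray(g(up)) - np.asarray(g(dn))) / (2.0 * h)
    return jac

def parameter_covariance(jac, sigma, n_obs):
    """[D' (Sigma / N)^{-1} D]^{-1}, the bracketed expression in step (viii)."""
    weight = np.linalg.inv(sigma / n_obs)
    return np.linalg.inv(jac.T @ weight @ jac)

# Placeholder example (assumption): linear moments g(b) = A b - 1.
A = np.array([[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]])
g = lambda b: A @ b - 1.0
jac = numerical_jacobian(g, [0.3, 0.7])
cov = parameter_covariance(jac, np.eye(3), n_obs=100)
std_err = np.sqrt(np.diag(cov))   # assumed reading of std(b-hat)
print(std_err)
```

With the actual model, each evaluation of g requires re-solving and re-simulating the model, so the Jacobian costs two simulations per parameter.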


Albuquerque, R., Hopenhayn, H.A., 2004.
Optimal lending contracts and firm dynamics. Review of Economic Studies 71, 285-315.
Angelini, P., Generale, A., 2008.
On the evolution of firm size distributions. American Economic Review 98, 426-438.
Aw, B.Y., Chen, X., Roberts, M.J., 2004.
Firm-level evidence on productivity differentials, turnover, and exports in Taiwanese manufacturing. Journal of Development Economics 66, 51-86.
Banco de Portugal, 1997.
Séries Longas para a Economia Portuguesa, Pós II Guerra Mundial, Vol. I and II Séries Estatísticas. Banco de Portugal, Lisboa.
Bartelsman, E., Scarpetta, S., Schivardi, F., 2005.
Comparative analysis of firm demographics and survival: Evidence from micro-level sources in OECD countries. Industrial and Corporate Change 14, 365-391.
Bentolila, S., Bertola, G., 1990.
Firing costs and labour demand: How bad is eurosclerosis? Review of Economic Studies 57, 381-402.
Cabral, L.M.B., 1995.
Sunk costs, firm size and firm growth. Journal of Industrial Economics 43, 161-172.
Cabral, L.M.B., Mata, J., 2003.
On the evolution of the firms size distribution: Facts and theory. American Economic Review 93, 1075-1090.
Campbell, J.R., Fisher, J.D.M., 2000.
Aggregate employment fluctuations with microeconomic asymmetries. American Economic Review 90, 1323-1345.
Clementi, G.L., Hopenhayn, H.A., 2006.
A theory of financing constraints and firm dynamics. Quarterly Journal of Economics 121, 229-265.
Cooley, T.F., Quadrini, V., 2001.
Financial markets and firm dynamics. American Economic Review 91, 1286-1310.
Cooper, R.W., Haltiwanger, J.C., 2006.
On the nature of capital adjustment costs. Review of Economic Studies 73, 611-633.
Davis, S.J., Haltiwanger, J.C., 1992.
Gross job creation, gross job destruction, and employment reallocation. Quarterly Journal of Economics 107, 819-863.
Dunne, T., Roberts, M.J., Samuelson, L., 1989a.
Plant turnover and gross employment flows in the U.S. manufacturing sector. Journal of Labor Economics 7, 48-71.
Dunne, T., Roberts, M.J., Samuelson, L., 1989b.
The growth and failure of U.S. manufacturing plants. Quarterly Journal of Economics 104, 671-98.
Markov-perfect industry dynamics: A framework for empirical work. Review of Economic Studies 62, 53-82.
Global optimization of statistical functions with simulated annealing. Journal of Econometrics 60, 65-99.
Adjustment costs in factor demand. Journal of Economic Literature 34, 1264-1292.
Entry, exit, and firm dynamics in long run equilibrium. Econometrica 60, 1127-1150.
Job turnover and policy evaluation: A general equilibrium analysis. Journal of Political Economy 101, 915-938.
Indice de Preços no Consumidor, December 1990-96. Instituto Nacional de Estatística, Lisboa.
Empresas em Portugal, 1990-1995. Instituto Nacional de Estatística, Lisboa.
Selection and the evolution of industry. Econometrica 50, 649-670.
Recursive Macroeconomic Theory, Second Edition. MIT Press, Cambridge, MA.
Selection, growth, and the size distribution of firms. Quarterly Journal of Economics 122, 1103-1144.
Life duration of new firms. Journal of Industrial Economics 42, 227-245.
Dynamic models of labour demand. In: Ashenfelter, O.C., Layard, R. (Eds.), Handbook of Labor Economics, vol. 1. Elsevier Science, New York, pp. 473-522.
Numerical Recipes: The Art of Scientific Computing, Third Edition. Cambridge University Press, Cambridge.
Establishment size dynamics in the aggregate economy. American Economic Review 97, 1639-1666.
Measure, Integration and Function Spaces. World Scientific, Farrer Road, Singapore.
Recursive Methods in Economic Dynamics. Harvard University Press, Cambridge, MA.
Finite state markov-chain approximation to univariate and vector autoregressions. Economics Letters 20, 177-181.
An Introduction to Bayesian Inference in Econometrics. John Wiley & Sons, New York.

Table 1: 1988 Firm Cohort: All Economy
Year CumEx AvEmp CGrEmp SurComp
1988   1.11    
1989 15.6 1.27 15.2 69.5
1990 24.4 1.36 24.8 70.4
1991 30.8 1.43 31.2 69.7
1992 35.4 1.46 34.2 69.3
1993 40.0 1.46 34.6 68.9
1994 43.4 1.47 35.3 69.1
1995 46.7 1.48 36.1 68.7
1996 49.9 1.49 37.4 67.2
1997 52.7 1.51 39.6 68.5
1998 55.5 1.52 40.7 68.3
1999 58.5 1.54 43.0 68.9
Notes: CumEx is the cumulative exit rate,  100 \times N(D^{\tau}) / N(S_{0}); AvEmp is the mean of log-employment among survivors,  N(S_{\tau})^{-1}\sum_{i \in S_{\tau}}l_{\tau }; CGrEmp is the cumulative log-growth rate (in %) of employment among survivors,  100\times[N(S_{\tau})^{-1}\sum_{i \in S_{\tau}}l_{\tau}- N(S_{0})^{-1}\sum_{i \in S_{0}}l_{0}];  SurComp is the survivor component (in %),  100\times[N(S_{\tau})^{-1}\sum_{i \in S_{\tau}}l_{\tau} - N(S_{\tau})^{-1}\sum_{i \in S_{\tau}}l_{0}] / [N(S_{\tau})^{-1}\sum_{i \in S_{\tau}}l_{\tau} - N(S_{0})^{-1}\sum_{i \in S_{0}}l_{0}].
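The four statistics defined in the notes can be computed from firm-level data as in the sketch below. The function name and the four-firm data set are hypothetical and serve only to show how the definitions translate into code.

```python
import numpy as np

def cohort_stats(log_emp_entry, log_emp_now, survives):
    """CumEx, AvEmp, CGrEmp and SurComp for one cohort at a given age.

    log_emp_entry : log-employment l_0 of every entrant
    log_emp_now   : current log-employment l_tau (ignored for firms that exited)
    survives      : True for firms still in S_tau
    """
    l0 = np.asarray(log_emp_entry, dtype=float)
    lt = np.asarray(log_emp_now, dtype=float)
    s = np.asarray(survives, dtype=bool)
    av_emp = lt[s].mean()                       # mean of l_tau over survivors
    total_change = av_emp - l0.mean()           # relative to all entrants
    survivor_change = av_emp - l0[s].mean()     # relative to survivors' entry size
    return {
        "CumEx": 100.0 * (1.0 - s.mean()),
        "AvEmp": av_emp,
        "CGrEmp": 100.0 * total_change,
        "SurComp": 100.0 * survivor_change / total_change,
    }

# Hypothetical cohort of four entrants; the third firm has exited.
l0 = [1.0, 1.2, 0.8, 1.4]
lt = [1.3, 1.5, np.nan, 1.6]
survives = [True, True, False, True]
stats = cohort_stats(l0, lt, survives)
print(stats)
```

The remainder, 100 minus SurComp, is the part of the cumulative change due to the exit of firms that entered smaller than the eventual survivors.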

Table 2: 1988 Firm Cohort: Summary Characteristics by Sector
Sector EmpSh 88 CumEx 89 CumEx 92 CumEx 99 AvEmp 88 CGrEmp 89 CGrEmp 92 CGrEmp 99 SurComp 88-89
All 100.0 15.6 35.4 58.5 1.11 15.2 34.2 43.0 69.0
Manu 41.8 14.6 35.9 58.9 1.58 17.4 38.7 45.5 82.8
Serv 20.1 17.1 36.6 58.0 0.99 11.7 30.8 40.2 61.7
Notes: EmpSh is the employment share of the sector in the overall economy cohort; CumEx, CGrEmp, SurComp are as defined in table 6.

Table 3: 1988 Firm Cohort: Characteristics of Labor Adjustment by Sector
Sector N30_{89} NA_{89} P30_{89} N30_{93} NA_{93} P30_{93}
All 7.9 43.0 13.7 13.7 45.3 17.1
Manu 10.8 31.5 20.7 20.9 33.4 24.6
Serv 7.3 47.7 11.8 11.3 50.3 15.0
Notes: N30 is the fraction of firms with an adjusted growth rate of employment, conditional on survival, in the interval  (-30\%,0\%); NA is the fraction of firms that do not adjust employment, conditional on survival; P30 is the fraction of firms with an adjusted growth rate of employment, conditional on survival, in the interval  (0\%,30\%).

Table 4: Calibration/ Estimation: 1988 Firm Cohort
Parameter All NAC All AC Manu NAC Manu AC Serv NAC Serv AC
 \alpha 0.56 0.56 0.57 0.57 0.73 0.73
 \beta 0.956 0.956 0.956 0.956 0.956 0.956
 w 11.8 11.8 13.1 13.1 7.5 7.5
 \bar{\mu} 2.923 3.196 3.368 3.755 2.393 2.419
 \bar{\mu} (Standard deviation) (0.016) (0.038) (0.032) (0.012) (0.018) (0.014)
 \sigma_{\mu_{0}} 0.245 0.188 0.236 0.083 0.133 0.123
 \sigma_{\mu_{0}} (Standard deviation) (0.005) (0.008) (0.009) (0.004) (0.006) (0.006)
 \sigma_{\mu_{1}} 0.319 0.250 0.296 0.186 0.166 0.154
 \sigma_{\mu_{1}} (Standard deviation) (0.005) (0.008) (0.010) (0.006) (0.006) (0.006)
 \sigma 0.884 0.661 0.707 0.440 0.436 0.402
 \sigma (Standard deviation) (0.016) (0.036) (0.034) (0.016) (0.021) (0.018)
 W 767.5 815.2 1294.5 1494.2 200.2 198.7
 W (Standard deviation) (5.2) (8.2) (18.6) (21.9) (3.1) (3.5)
 P 0 0.74 0 2.17 0 0.06
 P (Standard deviation)   (0.05)   (0.07)   (0.02)
 I 89.1 64.0 167.6 90.0 19.8 17.8
 Q^{*} 1866.9 1516.9 700.0 423.9 356.1 351.5
Notes: NAC refers to no-adjustment-costs case; AC refers to proportional-adjustment-costs case; numbers in ( \cdot) are standard deviations of the parameters;  Q^{*} is the value of the objective function.

Figure 1: Proportional Hiring/Entry Cost
Figure 1: Proportional Hiring/Entry Cost. Data plotted as a curve. X axis displays $\theta _{\tau }^{\ast }$, Y axis displays $MB_{\tau }$. The figure plots a hypothetical function $MB_{\tau }$ against $\theta _{\tau }^{\ast }$ for a given $L_{\tau -1}$. Two vertical lines drawn at $\theta ^{SD}$ and $\theta ^{SU}$ identify the frontiers between adjustment and non-adjustment. For $\theta _{\tau }^{\ast }$ smaller than $\theta ^{SD}$ the firm destroys jobs, i.e., $L_{\tau }^{\ast }\left( \theta _{\tau }^{\ast },L_{\tau -1}\right) <L_{\tau -1}$, and $MB_{\tau }=0$. For $\theta _{\tau }^{\ast }$ higher than $\theta ^{SU}$ the firm creates jobs, i.e., $L_{\tau }^{\ast }\left( \theta _{\tau }^{\ast },L_{\tau -1}\right) >L_{\tau -1}$, and $MB_{\tau }=P^{H}$. For $\theta _{\tau }^{\ast }$ higher than or equal to $\theta ^{SD}$ but smaller than or equal to $\theta ^{SU}$, the firm does not adjust, i.e., $L_{\tau }^{\ast }\left( \theta _{\tau }^{\ast },L_{\tau -1}\right) =L_{\tau -1}$, and $MB_{\tau }$ is increasing in $\theta _{\tau }^{\ast }$, but contained in the interval from $0$ to $P^{H}$.

Figure 2: Firm Dynamics for Overall Economy Cohort
Figure 2: Firm Dynamics for Overall Economy Cohort. Four panels. The figure plots the empirical moments and the estimated moments in the NAC and AC models for the overall economy cohort. Top-left panel: Mean of log-employment conditional on survival. Data plotted as a curve. X axis displays age of the cohort, Y axis displays mean of P. This panel shows that average cohort size increases as the cohort gets older, with faster growth in the initial years. Average cohort size is close to the data in the AC model, slightly below initially and slightly above afterwards, but it is a little below the data in the NAC model case. Top-right panel: Cumulative exit rate. Data plotted as a curve. X axis displays age of the cohort, Y axis displays exit rate in percentage. This panel shows that the cumulative exit rate increases as the cohort gets older and that in both models the cumulative exit rate is close to the data. Bottom-left panel: Standard deviation of log-employment conditional on survival. Data plotted as a curve. X axis displays age of the cohort, Y axis displays standard deviation of P. This panel shows that the dispersion of firm size in the cohort increases as firms get older. Dispersion in the NAC model is in line with dispersion in the data in the initial periods after entry but is higher than dispersion in the data in later periods. The opposite occurs in the AC model, with dispersion in the model lower than dispersion in the data in the initial years after entry, but in line with dispersion in the data in later periods. Bottom-right panel: Survivor component. Data plotted as a curve. X axis displays age of the cohort, Y axis displays survivor component in percentage. This panel shows that the survivor component is essentially flat over the life of the cohort. The NAC model is too far below the data in this respect and implies a survivor component that increases with age. The AC model does a substantially better job, although it is slightly below the data in the initial years.

Figure 3: Firm Dynamics for Manufacturing and Services Cohorts
Figure 3: Firm Dynamics for Manufacturing and Services Cohorts. Six panels. The figure plots the empirical moments and the estimated moments in the NAC and AC models for the manufacturing (top three panels) and services (bottom three panels) cohorts. Left panels: Mean of log-employment conditional on survival. Data plotted as a curve. X axis displays age of the cohort, Y axis displays mean of P. Middle panels: Cumulative exit rate. Data plotted as a curve. X axis displays age of the cohort, Y axis displays exit rate in percentage. Right panels: Survivor component. Data plotted as a curve. X axis displays age of the cohort, Y axis displays survivor component in percentage. Overall the comparison between the estimated moments in the NAC and AC models and the empirical moments for the manufacturing and services cohorts is very similar to that presented in figure 2 for the overall economy cohort.

Figure 4: Sensitivity Analysis
Figure 4: Sensitivity Analysis. Nine panels. The figure plots a sensitivity analysis of the cumulative exit rate and the survivor component to changes in each of the following parameters: $\alpha $, $\beta $, $w$, $E(\mu )$, $\sigma _{\mu _{0}}$, $\sigma _{\mu _{1}}$, $\sigma $, $W$, and $P$. Each panel, left side: Exit Rate ($\%$). Data plotted as a curve. X axis displays the parameter value, Y axis displays exit rate in percentage. Each panel, right side: Survivor Component ($\%$). Data plotted as a curve. X axis displays the parameter value, Y axis displays survivor component in percentage. The figure shows that $P$ is the key parameter to match the survivor component (see text).

Figure 5: Sensitivity to Proportional Adjustment Cost
Figure 5: Sensitivity to Proportional Adjustment Cost. Data plotted as a curve. X axis displays the value of W, Y axis displays survivor component in percentage. The figure shows that as W increases from 0 to 1.5, the survivor component increases at all ages, but substantially more in the initial years after entry.


1. Board of Governors of the Federal Reserve System, 20th Street and Constitution Avenue NW, Stop 80, Washington, DC 20551 (e-mail: [email protected]). The views expressed in this paper are those of the author and do not necessarily reflect the views of the Board of Governors of the Federal Reserve System or its staff. I am grateful for valuable comments from John Shea, John Haltiwanger, Michael Pries, Borağan Aruoba, and Rachel Kranton. All remaining errors are my own. I would also like to thank the Direcção-Geral de Estudos, Estatística e Planeamento - Ministério da Segurança Social, da Família e da Criança, for kindly allowing me to access the Quadros de Pessoal database, the University of Minho for their hospitality, and João Cerejeira and Miguel Portela for their help in extracting results. Return to Text
5. For example, firm growth would occur in our model even if exit was random with a constant probability for all firms, whereas that would not be true in a pure selection model. Return to Text
7. Although we could have included fixed adjustment costs they seem more relevant in the case of capital. Return to Text
8. Hopenhayn and Rogerson (1993) did consider that the firing cost applied also at exit, but in their model there is no learning process and they did not analyze the effect of the firing cost on firm growth. Return to Text
9. Rossi-Hansberg and Wright (2007) advance an alternative theory based on mean reversion in the accumulation of industry-specific human capital. However, their model only deals with size-dependence of firm dynamics, and has nothing to say about age-dependence of firm dynamics, which is our main concern. Return to Text
10. Our interpretation of the age-size effect is close to the alternative explanation proposed by Cabral and Mata (2003, footnote 14), in the sense that young business owners would be subject to more intense learning. Return to Text
11. Throughout the paper we will assume that firms enter in some generic period 0. Therefore,  \tau will represent both the firm's age and the period (after entry) we are analyzing. Return to Text
12. For the weighted decomposition, the cumulative change would be  \sum_{i\in S_{\tau} }\omega_{i,\tau}^{S_{\tau}}l_{i,\tau}-\sum_{i\in S_{0}}\omega_{i,0}^{S_{0} }l_{i,0}, where  \omega_{i,\tau}^{X} is the weight of firm  i in period  \tau in set  X, with  \omega_{i,\tau}^{X}=L_{i,\tau}/\sum_{i\in X} L_{i,\tau}. The weighted survivor component can be further decomposed as
\displaystyle \sum_{i\in S_{\tau}}\omega_{i,\tau}^{S_{\tau}}l_{i,\tau}-\sum_{i\in S_{\tau} }\omega_{i,0}^{S_{\tau}}l_{i,0}=\sum_{i\in S_{\tau}}\omega_{i,0}^{S_{\tau} }\left( l_{i,\tau}-l_{i,0}\right) +\sum_{i\in S_{\tau}}\left( \omega_{i,\tau}^{S_{\tau}}-\omega_{i,0}^{S_{\tau}}\right) l_{i,0}+\sum_{i\in S_{\tau}}\left( \omega_{i,\tau}^{S_{\tau}}-\omega_{i,0}^{S_{\tau}}\right) \left( l_{i,\tau}-l_{i,0}\right) \text{.}
The first term is a within-firm component, measuring average growth weighted by initial size; the second term is a between-firm component, measuring the contribution of changes in employment shares; and the third is a cross component. For the unweighted decomposition, the last two terms are zero, since in this case  \omega_{i,\tau}^{X}=N\left( X\right) ^{-1}. Return to Text
13. We identify entering firms in year  t as those firms that have not been in the database before  t. Given the high incidence of temporarily missing firms, we select the 1988 entering cohort, using 1985 and 1986 to detect false entries. Similarly, we identify exiting firms in the  \tau-th period (after entry in 1988) as those firms that are present in the database in period  \tau-1, but do not reappear in any of the following periods. Therefore, we display results only up to 1999, using 2000 to detect false exits. This procedure eliminates most false entries and false exits. Return to Text
14. We adopt the following procedures concerning temporarily missing firms. In calculating the exit rate we do not exclude temporarily missing firms, considering them as survivors. In calculating the cohort's mean log-employment at period  \tau, we scale it with a factor that compares that mean in period 0 between all firms and those not temporarily missing in period  \tau. We also adjust the data in 1994, when the survey moved from March to October, to correct for a higher than normal exit rate and average growth in this year. Return to Text
15. When we use employment-weighted data, we find that larger firms have smaller exit rates and, as a consequence, average employment increases more intensely than in the unweighted data. This and the fact that high-growth firms increase their weight over time, explains a larger survivor component in the employment-weighted decomposition. A similar exercise for labor productivity reveals that survivors account for about  90\% of the change in the cohort's unweighted average productivity. Return to Text
16. In order to obtain equivalent one-digit SIC87 sectors, we use the following correspondence in terms of CAE Rev. 1 codes: manufacturing  \left( =3\right) and services  \left( =6.3+8.3.2+8.3.3+9.2+9.3+9.4+9.5\right) . Return to Text
17. Reflecting our previous argument about the greater cyclical sensitivity of the decomposition based on the cohort's annual growth rate, we observe a substantial reduction in the annual survivor component associated with the 1988 and 1991 cohorts during the 1992-1994 slow growth period. However, a similar pattern does not occur with the 1994 cohort. This is one of the reasons why we choose a decomposition based on cumulative growth rates. Note also that the annual non-survivor component is not as sensitive to the business cycle as the annual survivor component. Return to Text
18. Following Davis and Haltiwanger (1992), the adjusted growth rate in period  \tau is defined as  100\times\left( L_{\tau}-L_{\tau-1}\right) /\tilde{L}_{\tau-1}, where  \tilde{L}_{\tau -1}=\frac{1}{2}\left( L_{\tau}+L_{\tau-1}\right) . Return to Text
19. In this model we do not consider the possibility that as firms get older they might decay or become obsolete. This could be achieved by assuming exogenous probabilities for those two events. This could generate both a decrease in size of old firms (decay) and the exit of old firms (decay and obsolescence). Return to Text
20. Because in general  V^{S} is not concave, we cannot prove the usual differentiability properties of the value function. Therefore, in what follows, we implicitly assume that  V^{S}\left( L_{\tau },\theta_{\tau+1}^{\ast},\tau+1\right) is differentiable at  L_{\tau} with probability one, in terms of  F\left( \theta_{\tau+1}^{\ast}\mid\theta_{\tau }^{\ast},\tau\right) for all  \theta_{\tau}^{\ast}\in\Theta. By part (b) of proposition 4 and the dominated convergence theorem, this implies that the objective functions associated with  V^{SD},  V^{SN} and  V^{SU} are continuously differentiable in  L, so that marginal conditions can be applied to find interior optima. This assumption also implies that  V^{S}\left( L_{\tau -1},\theta_{\tau}^{\ast},\tau\right) is differentiable at  L_{\tau-1} with probability one. In proposition 5 of appendix A, we prove that this property holds both in a model with a finite lifetime horizon and a model with infinite-lived firms that face a finite learning horizon (as in sections 4 and 5). Return to Text
21. In general, from the optimal employment condition,  F^{\prime}\left( L\right) \theta^{\ast}=w, we have
\displaystyle \bar{L}^{\prime\prime}\left( \theta^{\ast}\right) =\left( \frac {F^{\prime\prime\prime}\left( \bar{L}\right) F^{\prime}\left( \bar {L}\right) -2F^{\prime\prime}\left( \bar{L}\right) ^{2}}{F^{\prime\prime }\left( \bar{L}\right) ^{2}\theta^{\ast}}\right) \bar{L}^{\prime}\left( \theta^{\ast}\right) \text{, }F^{\prime\prime}\left( \bar{L}\right) <0\text{, }\bar{L}^{\prime}\left( \theta^{\ast}\right) >0\text{,}
whose sign depends on  F^{\prime\prime\prime}\left( \bar{L}\right) . Therefore, if decreasing returns to labor do not decrease too fast, that is,  F^{\prime\prime\prime}\left( \bar{L}\right) <2F^{\prime\prime}\left( \bar{L}\right) ^{2}/F^{\prime}\left( \bar{L}\right) , then we will have  \bar{L}^{\prime\prime}\left( \theta^{\ast}\right) <0. When  F\left( L\right) =\ln\left( L\right) , then  \bar{L}^{\prime\prime}\left( \theta^{\ast}\right) =0, and when  F\left( L\right) =L^{\alpha},  \alpha\in\left( 0,1\right) , then  \bar{L}^{\prime\prime}\left( \theta^{\ast}\right) >0. Return to Text
22. In the discussion that follows, the hiring cost applies both to regular hiring and to hiring at entry and the firing cost applies both to regular firing and to firing at exit. Return to Text
23. When exit is not allowed, we can prove that  V^{S} is concave (and continuously differentiable) in  L, so that  L_{0} must decrease for  MB_{0} to increase. Return to Text
24. This effect is similar to that of Cabral (1995). Return to Text
25. This result is formalized in proposition 10 below. Return to Text
26. As we saw above, when optimal employment is a convex function of  \theta^{\ast}, Jensen's inequality implies positive expected growth, even in the absence of adjustment costs. Because the log transformation is concave, it will offset the convexity of the optimal employment function. For example, when  F\left( \cdot\right) is a power function, the log growth rate eliminates the effect of Jensen's inequality, since  \ln\left( L_{\tau}^{\ast}\right) becomes linear in  \theta_{\tau}^{\ast}. Return to Text
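A small numerical illustration of the point in footnote 26, under assumptions that are ours rather than the paper's: the production function is  F\left( L\right) =L^{\alpha}, employment is set by the static condition of footnote 21, and next-period productivity takes two equally likely values that are symmetric in logs around the current value. The parameter values and names below are hypothetical.

```python
import math

def employment(theta, alpha, w):
    """Static optimum for F(L) = L**alpha: alpha * L**(alpha - 1) * theta = w."""
    return (alpha * theta / w) ** (1.0 / (1.0 - alpha))

# Hypothetical parameter values, for illustration only.
ALPHA, W, THETA, D = 0.5, 1.0, 2.0, 0.2

# Assumption of this sketch: two equally likely productivities, symmetric in logs.
theta_next = [THETA * math.exp(-D), THETA * math.exp(D)]

L_now = employment(THETA, ALPHA, W)
L_next = [employment(t, ALPHA, W) for t in theta_next]

# Growth in levels is positive because employment is convex in productivity.
expected_level_growth = sum(L_next) / len(L_next) / L_now - 1.0

# Growth in logs is zero because log employment is linear in log productivity.
expected_log_growth = (
    sum(math.log(L) for L in L_next) / len(L_next) - math.log(L_now)
)
```

In this setup expected growth in levels equals  \cosh\left( D/\left( 1-\alpha\right) \right) -1>0, whereas expected log growth is zero up to rounding error.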
27. In proposition 8 below, if  F\left( \cdot\right) is a power function the indirect price effects cancel out. Return to Text
28. In the proof, we consider a general production function and then specify a power function in order to obtain the sign of the effect. From that general setup, we can say that the form of  F\left( \cdot\right) should not be decisive for these results when the elasticity of the marginal product of labor does not vary much with the amount of labor used. Return to Text
29. In our simulations,  \bar{T}=3 was enough to generate a positive effect on growth. Return to Text
30. Note that this  T differs from the lifetime horizon  \bar{T} used in section 4, with  \bar{T}\geq T. In this section, we assume an infinite horizon, so that  \bar{T}=\infty. In our simulations below, we assume that  T=15 (years), and present results until year  10. Assuming a higher value for  T would slow down the algorithm's execution without significantly improving the accuracy of the model simulations. Return to Text
31. Under log-normality,  \nu_{1}=0 and  \nu_{2} =\infty. Although this violates assumption 1, this is not a problem in this section, since we will be using a discrete approximation to the productivity distribution. Return to Text
32. The objective function is defined as  Q=\left( N^{-1}\sum_{i=1}^{N}f_{i}\right) ^{\prime}\left( N^{-1} \Sigma\right) ^{-1}\left( N^{-1}\sum_{i=1}^{N}f_{i}\right) , where  N^{-1}\sum f_{i} are the differences between the sample and model simulated moments, and  \Sigma is the estimated covariance matrix of  f_{i}. Return to Text
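The quadratic form in footnote 32 can be sketched as follows. This is a minimal illustration in plain Python, not the estimation code used in the paper: the function names, the toy data, and the choice of dividing by  N (rather than  N-1) when estimating  \Sigma are assumptions made here.

```python
def mean_vector(rows):
    """Column means of a list of equal-length vectors: N^{-1} * sum_i f_i."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance(rows):
    """Sample covariance matrix of the f_i (divisor N is an assumption)."""
    n = len(rows)
    m = mean_vector(rows)
    k = len(m)
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows) / n
             for b in range(k)] for a in range(k)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            factor = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= factor * M[c][j]
    x = [0.0] * k
    for r in reversed(range(k)):
        x[r] = (M[r][k] - sum(M[r][j] * x[j] for j in range(r + 1, k))) / M[r][r]
    return x

def objective(f):
    """Q = g' (Sigma / N)^{-1} g, where g is the mean of the f_i."""
    n = len(f)
    g = mean_vector(f)
    sigma_over_n = [[s / n for s in row] for row in covariance(f)]
    return sum(gi * xi for gi, xi in zip(g, solve(sigma_over_n, g)))

# Toy per-observation moment differences (hypothetical numbers).
f_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
Q_toy = objective(f_toy)
```

For the toy data the mean is  (0.5,0.5) and the estimated covariance matrix is  0.25 times the identity, so  Q=4\times0.5/0.25=8.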
33. We use the change in dispersion rather than its level because imputing all the dispersion in size to the learning process would make it difficult to capture the level of growth in the model, as we discuss below. Return to Text
34. To make the computation of equilibrium easier for a given set of parameters, instead of changing the output price we change the fixed research cost so that the equilibrium condition is satisfied. Return to Text
35. The estimates are  1373.8 for the overall economy cohort,  3317.1 for the manufacturing cohort, and  269.1 for the services cohort. Return to Text
36. The magnitudes we obtain for both  Q^{\ast} and the standard errors resemble those obtained by Cooper and Haltiwanger (2006) in a study about capital adjustment costs using a similar estimation methodology. Return to Text
37. This is reflected in the small estimate for the proportional cost in the services cohort when compared with the same estimate for the two other cohorts. Return to Text
38. From point (i) in appendix B, we see that the range implicit in the support of the uniform discrete approximation to the  \theta_{\tau}^{\ast} distribution increases with  \tau. Return to Text
39. We implicitly assume that  P would also be proportional to  K. If this is not the case, then the proportional adjustment cost would be less important for large- K firms, intensifying the tendency for higher growth among small firms. Although augmenting the model with this capital decision would allow us to match the level of size dispersion in the data, the numerical complexity of the model simulation would increase even further. Return to Text
40. In figure 2, we rescale the estimated initial values of  SD\left[ l_{\tau}\mid S_{\tau}\right] to the level found in the data. Return to Text
41. Note that we would not be able to identify the hiring/entry cost,  P^{H}, and the firing/exit cost,  P^{F}, separately because  P^{H} and  P^{F} produce almost identical results. This should be expected as the incentives created by proportional hiring/entry and firing/exit costs differ only in the displacement of time by one period. Return to Text
42. A similar result would hold for the case of infinite-lived firms that face a finite learning horizon, as in sections 4 and 5. However, in this case we would need to use proposition 10 first. Return to Text
43. In  \Theta^{SD} and  \Theta^{SU} we need to use  V^{SD}>V^{SU} and  V^{SU}>V^{SD} because  V^{S}, in general, is not concave in  L. Return to Text
45. This moment condition can be expressed in terms of the ratio of the time- \tau and time-0 variances. Return to Text
46. This moment condition together with condition (b) can be expressed in terms of the survivor component. Return to Text

This version is optimized for use by screen readers. Descriptions for all mathematical expressions are provided in LaTeX format. A printable PDF version is available. Return to Text