Figure 1: Share of Employment by Size Category: India vs. US
Notes: The graph plots the share of total employment in establishments of different size categories for India and the US. The data for India combines two sources, the Annual Survey of Industries (ASI) and the Survey of Unorganized Manufacturing (SUM) for 2005-06. The data for the US is taken from the County Business Patterns Database for 2006.
Figure 2: Size Distribution of Manufacturing Establishments: Across Indian States
Notes: The graph plots the share of employment in plants of size five or less in a state against per-capita NDP of the state relative to the poorest state. The data for the states combines two sources, the Annual Survey of Industries (ASI) and the Survey of Unorganized Manufacturing (SUM). Only the 15 largest states are included to keep the graph readable.
Figure 3: The relationship between education and the price paid for each product
This figure plots the estimated coefficients from the regression of log(price) on a set of education dummies and additional controls. The controls are the same as those used in Table 1, column 4, and include the triple interaction of state, rural, and product fixed effects. The omitted education category is "illiterate." The lines denote the 95% confidence intervals. Standard errors are clustered at the household level.
Figure 4: Frequency Histogram of Product-Specific Price Elasticity to Expenditure
This figure plots the frequency distribution of the coefficient on "log(per-capita expenditure)" when regressing "log(price)" on "log(per-capita expenditure)" for each individual product while including rural interacted with state fixed effects. Specifically, we run the regression $$\log(\text{price})_{p,h} = β_{p} \log(\text{per-capita expenditure})_{p,h} + α_{r,s} + ε_{p,h}$$ for each product and collect each $$β_{p}$$ (188 regressions, since there are 188 individual consumer products in our dataset). We then plot the frequency histogram of these $$β_{p}$$.
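The product-by-product regression described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the rural-by-state fixed effects are absorbed by demeaning within each (rural, state) cell, and the names `within_slope`, `slopes_by_product`, and the row layout are hypothetical.

```python
from collections import defaultdict

def within_slope(y, x, groups):
    """OLS slope of y on x after absorbing one fixed effect per group.

    Demeaning y and x within each group is equivalent to including a
    dummy for every group (here: each rural-by-state cell).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum_y, sum_x, count]
    for yi, xi, g in zip(y, x, groups):
        s = sums[g]
        s[0] += yi
        s[1] += xi
        s[2] += 1
    sxy = sxx = 0.0
    for yi, xi, g in zip(y, x, groups):
        ybar = sums[g][0] / sums[g][2]
        xbar = sums[g][1] / sums[g][2]
        sxy += (xi - xbar) * (yi - ybar)
        sxx += (xi - xbar) ** 2
    if sxx == 0.0:
        raise ValueError("no within-group variation in x")
    return sxy / sxx

def slopes_by_product(rows):
    """rows: iterable of (product, log_price, log_pce, cell), where cell is a
    hypothetical (rural, state) tuple. Returns {product: estimated beta_p}."""
    by_product = defaultdict(list)
    for product, log_price, log_pce, cell in rows:
        by_product[product].append((log_price, log_pce, cell))
    betas = {}
    for product, obs in by_product.items():
        y, x, g = zip(*obs)
        betas[product] = within_slope(y, x, g)
    return betas
```

The values of the returned dictionary are the coefficients whose frequency histogram the figure displays; the same routine would apply to the firm-size regressions with log employment in place of log per-capita expenditure.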
Figure 5: Non-parametric Estimate: Larger Plants Produce Higher Price Goods
Notes: The data is from the ASI and the SUM of 2005-06. The graph plots the kernel-smoothed local linear regression of residualized log prices charged by a plant for its products on residualized log employment of that plant (the residualization removes product fixed effects and the interaction of state and urban-rural fixed effects). Products which have the units problem discussed in footnote 10 and in Appendix F are split into two product categories. The 1 percent tails of residualized log employment are excluded. An Epanechnikov kernel with a bandwidth of 0.502 is used. The grey region is the 95 percent confidence interval for the non-parametric estimate.
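A local linear regression with an Epanechnikov kernel, as named in these notes, can be sketched as below. This is only an illustration of the general technique under simple assumptions: the function names are hypothetical, the inputs are taken to be already residualized, and the confidence band is not computed.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def local_linear(x, y, x0, bandwidth):
    """Local linear estimate of E[y | x = x0].

    Fits a kernel-weighted least-squares line of y on (x - x0) and
    returns its intercept, which is the fitted value at x0.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x, y):
        d = xi - x0
        w = epanechnikov(d / bandwidth)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:
        raise ValueError("too few observations within the bandwidth")
    # Intercept from the 2x2 weighted normal equations.
    return (s2 * t0 - s1 * t1) / det
```

Evaluating `local_linear` on a grid of residualized log employment values with `bandwidth=0.502` would trace out a curve of the kind shown in the figure.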
Figure 6: Relationship between firm size and capital
This figure shows the binned scatterplot and line of best fit for the relationship between the size of the firm and the amount of capital used in the firm. Specifically, we plot the residuals for the natural logarithm of employees in a firm (x-axis) and for the natural logarithm of amount of capital in the firm relative to the firm’s output (y-axis) after controlling for the triple interaction of product, state, and rural fixed effects. To be precise, a binned scatterplot is a non-parametric method of plotting the conditional expectation function (which describes the average y-value for each x-value).
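The binned scatterplot construction described above can be sketched in two steps: residualize both variables on the fixed effects, then average within equal-sized bins of the x residual. The sketch assumes the product-by-state-by-rural interaction is absorbed by demeaning within each cell; the function names and the choice of bin count are hypothetical.

```python
from collections import defaultdict

def demean_by_group(values, groups):
    """Residualize values on group fixed effects by subtracting group means."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum, count]
    for v, g in zip(values, groups):
        totals[g][0] += v
        totals[g][1] += 1
    return [v - totals[g][0] / totals[g][1] for v, g in zip(values, groups)]

def binned_means(x, y, n_bins):
    """Sort observations by x, split them into n_bins groups of (nearly)
    equal size, and return the (mean x, mean y) point for each bin."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    points = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than observations
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        points.append((mean_x, mean_y))
    return points
```

Each returned point approximates the conditional mean of the y residual at that bin's average x residual, which is what the plotted markers represent.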
Figure 7: Relationship between capital and prices
This figure shows the binned scatterplot and line of best fit for the relationship between firm capital and the price of a product. Specifically, we plot the residuals for the natural logarithm of the total capital to employee ratio (x-axis) and for the natural logarithm of the price charged (y-axis) after controlling for the triple interaction of product, state, and rural fixed effects. To be precise, a binned scatterplot is a non-parametric method of plotting the conditional expectation function (which describes the average y-value for each x-value).
Figure 8: Frequency Histogram of Product-Specific Price Elasticity to Firm Size
Using the ASI dataset, this figure plots the frequency distribution of the coefficient on "log(number of employees)" when regressing "log(price)" on "log(number of employees)" for each individual product while including rural interacted with state fixed effects. Specifically, we run the regression $$\log(\text{price})_{f,g} = γ_{g} \log(\text{number of employees})_{f} + α_{r,s} + ε_{f,g}$$ for each product and collect each $$γ_{g}$$ (1217 regressions, since there are 1217 individual products in our ASI dataset), where the f, g, r and s subscripts refer to firm, good, rural and state, respectively. We then plot the frequency histogram of these $$γ_{g}$$. To ensure the figure is not distorted by outliers, we omit the extreme 2 percent of the distribution.
Figure 9: Quality Engel Curve
Notes: The figure plots the share of households who purchase the high-quality product at different wage levels. There are only two quality levels (N = 2), which have prices $$P_{q1} = 1$$. The quality index for the low quality is set to one, that is, $$q1 = 1$$. The three lines correspond to three different values of Δ, where $$q2 = 1+Δ$$. The constant for the high quality, $$a_{q2}$$, is chosen such that 30 percent of households with wage equal to one choose the high quality.
Figure 10: Size Distribution - Data vs Model
Notes: The figure plots the share of employment in different size categories in the data and in the calibrated baseline of the model. The data is for the manufacturing sector in India for 2005-06. It combines the ASI and the SUM (same as Figure 1).
Figure 11: Counterfactual Across Indian States - Data vs Model
Notes: The figure plots the share of employment in plants of size five or less across Indian states in the data and for the counterfactual exercise in the model. The blue line is the linear regression line of share of employment in plants of size five or less in different Indian states on log of per-capita GDP of the state. The red line is the model predicted share of employment in plants of size five or less when conducting the counterfactual exercise.
Figure 12: Counterfactual: Changes in Distribution for 3 Richest vs 3 Poorest States
Notes: The figure plots the share of employment in the three poorest states minus the share in the three richest states for different size categories in the data and in the model (when productivity and skill levels are varied to match the differences in per-capita income across these groups of states). The data is from the ASI and SUM for 2005-06.
Figure 13: Counterfactual India Over Time - Data vs Model
Notes: The red bars in the figure plot the share of employment in plants of size five or less for five years for India. The data for each year pools the SUM and the ASI for that year. The blue line plots the model-predicted share of employment for each year when productivity and skill levels are varied to match the differences in per-capita income in India over time.
Figure 14: Distribution of prices and employees for firms that produce "Finished Cotton Cloth"
This figure shows the histogram of average price charged (left) and the number of employees (right) by each firm that produces "finished cotton cloth" in the Annual Survey of Industries.
Figure 15: Finished cotton cloth: Relationship between size of firm and prices charged
This figure shows the binned scatterplot and line of best fit for the relationship between the price charged and the number of employees in each firm in the finished cotton cloth industry, using the ASI. We plot the residuals for log("price") and log("labor employed") after controlling for state and rural fixed effects.
Figure 16: Finished cotton cloth: Relationship between size of firm and capital stock
This figure shows the binned scatterplot and line of best fit for the relationship between the price charged and the number of employees in each firm in the finished cotton cloth industries using the ASI. We plot the residuals for log ("price") and log ("labor employed") after controlling for state and rural fixed effects.