On Wednesday, June 4, the Congressional Budget Office (CBO) released its official budget impact score for the “One Big Beautiful Bill” and estimated that the bill will increase the federal deficit by $2.4 trillion over the ten-year budget window.
This score by the CBO, the official budget analyst of the US Congress, has informed calls to “kill the bill.” It even played a role in the recent Trump-Musk breakup. Many Republicans, meanwhile, have dismissed the estimate because they distrust the scorers.
This controversy has highlighted a deeper problem. The official scorekeepers are the CBO and the Joint Committee on Taxation (JCT). The JCT is responsible for providing estimates of the budgetary impact of tax legislation, which the CBO then incorporates into broader cost estimates.
Revenue estimates by these scorekeepers play an important role in shaping how representatives, voters, and agencies understand and react to potential legislation. When one party questions the scorekeepers’ credibility, it undermines the legislative process and taxpayers’ ability to hold elected representatives accountable for fiscal irresponsibility.
In every major fiscal debate over the past few decades, whether over the Affordable Care Act, the Inflation Reduction Act, or the Tax Cuts and Jobs Act, the official scores exerted substantial influence over both the legislation’s viability for passage and the public’s perception of it. Additionally, revenue estimates may determine whether a bill qualifies for the reconciliation process, which bypasses the Senate filibuster and therefore requires only a simple majority rather than 60 votes to pass. One key criterion is that a bill passed through reconciliation may not increase the deficit in years beyond the ten-year budget window.
This is even more relevant now than it was several decades ago because the filibuster, combined with the inability of either party to get 60 votes in the Senate (for now), has made reconciliation the primary vehicle for federal policy implementation by both parties. In that context, if (and now, when) members of Congress and segments of the public do not trust the score or the scorekeepers, trust in the legislative process itself declines. That distrust may in turn lead lawmakers to dismiss legitimate concerns over their legislation’s deficit effects.
The response by many Republicans to the new score was that the CBO, and by extension the JCT, are biased: over the past several decades, they argue, the scorekeepers have consistently misestimated the effects of major tax and spending legislation in a manner that seems to favor legislation proposed by Democrats. For example, in 2022, the CBO and JCT estimated the energy-related subsidies of the Inflation Reduction Act would cost about $370 billion over the 10-year budget window. A later analysis by Goldman Sachs indicated the true cost of those uncapped subsidies was closer to $1.2 trillion over the 10-year budget window.
Defenders of the scorekeepers make the valid point that the JCT score was constrained by baseline scoring conventions and focused on the cost of increasing existing credits, and that agency discretion later expanded the scope and cost of the bill. However, even accounting for these limitations, the more than three-fold difference in estimated costs raises legitimate questions about forecasting accuracy.
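As a quick check on the size of that gap, the short sketch below compares the two estimates cited above. It uses the rounded figures from the text ($370 billion and $1.2 trillion); the variable names are illustrative only, and the result is only as precise as those rounded inputs.

```python
# Figures as cited in the text, in billions of dollars (rounded).
official_estimate_bn = 370.0   # CBO/JCT estimate of the energy-related subsidies
later_estimate_bn = 1200.0     # later Goldman Sachs analysis

# Ratio and absolute gap between the later and the official estimate.
ratio = later_estimate_bn / official_estimate_bn
gap_bn = later_estimate_bn - official_estimate_bn

print(f"ratio: {ratio:.2f}x, gap: ${gap_bn:,.0f} billion")
```

On these rounded inputs, the later estimate is a little over three times the official one.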
To take another example, in 2009 the CBO estimated the Affordable Care Act would create a net reduction in the federal deficit over the 10-year budget window (i.e., that it would pay for itself). This score was built on optimistic assumptions about healthcare cost growth and price controls, and on timing gimmicks that masked long-term costs. These assumptions have not aged well, and neither has the resulting misestimation of costs, because large social programs, once enacted, have proven nearly impossible to repeal. Thus, the fiscal consequences of these scores have proven, to some extent, permanent. [1]
These cost estimates for the Inflation Reduction Act and Affordable Care Act, which involved substantial expansions of social expenditures with minimal tax increases, yielded deficit-neutral or deficit-reducing scores. That raises important questions about whether current scoring frameworks and statutory limitations allow these agencies to adequately capture the true fiscal impact of major spending expansions.
Additionally, the CBO has so far underestimated the resulting increase in corporate tax revenues after the passage of the Tax Cuts and Jobs Act by 17 percent, with the 10-year budget window not yet complete, and it underestimated economic growth under the first Trump administration. [2] The contention that the increased revenue came from increased public expenditures during Covid has some merit. However, that contention ignores the fact that the Tax Cuts and Jobs Act broadened the tax base from which those Covid-era increases were collected. These misestimations on such substantial pieces of legislation have created a legitimate perception that scoring “errors” tend to favor Democratic priorities: tax cuts appear more expensive than they prove to be, while the costs of new spending are substantially understated.
The deeper problem with these repeated scoring errors is that they have not been followed by meaningful external reforms to the scoring process, such as legislative changes to the scoring conventions or scorekeeper constraints that may have contributed to past misestimations. As a result, Republican trust in the CBO and JCT estimates has steadily eroded to the point where Republicans understandably dismiss the score. Criticism from major players has included President Trump calling the CBO “very hostile” and Senate Majority Leader John Thune (R-S.D.) saying the CBO has “a long history of just being flat wrong.”
How necessary, then, are the CBO and JCT if one major party no longer considers their work reliable or credible? In that context, and in the context of layoffs in the Washington, DC area over the past few months, the scorekeepers themselves seem to have a strong incentive to pursue reforms that restore bipartisan trust. The alternative, if they dig in their heels, is that these agencies risk becoming sidelined in the very process they were created to inform.
This erosion of trust is not just about politics or partisan rhetoric. It reflects deeper, and in our view valid, concerns about the transparency of budgetary impacts from modifying CBO and JCT assumptions, the replicability of their models, and the accountability of their scoring process to outside scrutiny. Key modeling choices, such as assumptions about the behavioral response of taxpayers, are either unpublished or inadequately documented for external review. Assumptions about GDP growth are fixed by the CBO rather than presented as a range of adjustable parameters.
Additionally, current conventional scoring methods used by CBO and JCT hold GNP constant and incorporate only partial behavioral responses, excluding the labor supply changes that affect economic growth. Dynamic scoring, when used, does attempt to model macroeconomic effects, but the internal reconciliation between the conventional and dynamic scores is either unpublished or inadequately documented for external review and replication. In the dynamic scoring process, three different models are used, and the JCT then decides on the “blend” of those scores for the final estimate. As a result, it is difficult for outside experts or policymakers to verify whether assumptions included in scores are consistent with the economic literature or with observed outcomes. It is essentially impossible for outsiders to stress-test the scores against different assumptions.
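To make concrete why the choice of “blend” matters, here is a minimal sketch of the kind of stress test outsiders cannot currently run. It assumes, purely for illustration, that the blend is a simple weighted average of three model outputs; the actual JCT blending method is not documented here, and every model name, weight, and dollar figure below is a hypothetical placeholder rather than agency output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEstimate:
    name: str                 # hypothetical model label
    deficit_change_bn: float  # ten-year deficit change, $ billions (placeholder)


def blended_score(estimates, weights):
    """Weighted average of model estimates (an assumed blending rule)."""
    if len(estimates) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * e.deficit_change_bn for e, w in zip(estimates, weights))


# Placeholder figures only; these are not actual CBO or JCT results.
estimates = [
    ModelEstimate("model_a", 2400.0),
    ModelEstimate("model_b", 2000.0),
    ModelEstimate("model_c", 1600.0),
]

# Stress test: recompute the final score under alternative blend weights.
weightings = {
    "equal": (1 / 3, 1 / 3, 1 / 3),
    "favor_a": (0.6, 0.2, 0.2),
    "favor_c": (0.2, 0.2, 0.6),
}
results = {label: blended_score(estimates, w) for label, w in weightings.items()}

for label, value in results.items():
    print(f"{label}: ${value:,.0f} billion")
```

Even in this toy setup, shifting the weights moves the headline number by hundreds of billions of dollars, which is why publishing the weights and the underlying model outputs would matter for outside replication.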
This is the introductory post in a series, “Checking the Scorekeepers.” Its purpose is to explore how the CBO and JCT currently score individual income tax legislation, describe what the CBO and JCT scores can and cannot incorporate because of statutory and conventional limitations, identify potential gaps or inconsistencies in the treatment of behavioral responses or assumptions on growth, and suggest reforms to improve transparency and trust in the scoring process.
These reforms may include legislative changes such as clarifying or changing scoring conventions, or requiring greater public documentation to be made available. They may also include requirements for internal improvements by the CBO and JCT to provide more benchmarking of their models to academic research and publicly accessible data. The goal of the series and reforms is to restore confidence in the official scoring of legislation, without which the political process loses its compass for understanding the true effect of legislation on the federal government’s worryingly large debt and deficits.
[1] The Affordable Care Act has proven to be politically permanent, and time will tell how many of the Inflation Reduction Act credits prove permanent.