Data Critique – Wealth Education Gap

The Dataset and Who Made It

For our project, we chose to analyze the Federal Reserve’s Distributional Financial Accounts (DFA) dataset, a comprehensive source for studying U.S. wealth inequality. It provides quarterly estimates of household wealth from the late 1980s through part of 2025, broken down by groups such as race, income, age, and education. The DFA integrates two data products produced by the Federal Reserve Board: the Financial Accounts of the United States and the Survey of Consumer Finances (SCF).

The Financial Accounts provide quarterly data on the aggregate balance sheets of the major sectors of the U.S. economy, distinguishing among households and non-profit organizations, federal, state, and local governments, and financial and non-financial businesses. The Survey of Consumer Finances provides comprehensive triennial microdata on the assets and liabilities of a representative sample of U.S. households, from which weighted estimates of family income, net worth, credit use, and other financial outcomes are produced (Gibson-Davis and Percheski 2018).

Because the two sources measure wealth slightly differently, they are first reconciled so that both use common definitions for household balance sheet items. Then, because the SCF is triennial, the DFA’s authors interpolate between SCF survey years to produce quarter-by-quarter data, using correlations between SCF data, Financial Accounts data, and consumer participation in relevant markets (Chetty et al., 2022).
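To make the idea of filling in the quarters between two surveys concrete, here is a minimal sketch using plain linear interpolation. This is a deliberately simpler stand-in, not the DFA’s actual method, which draws on additional quarterly information as described above. The function name and the share values are hypothetical and chosen only for illustration.

```python
def interpolate_quarterly(year_a, value_a, year_b, value_b):
    """Linearly interpolate between two survey years in quarterly steps.

    Returns a list of (time, value) pairs, where time is the year plus
    a fraction for the quarter (0.25 per quarter).
    """
    n_quarters = (year_b - year_a) * 4
    step = (value_b - value_a) / n_quarters
    return [(year_a + q / 4, value_a + step * q) for q in range(n_quarters + 1)]


# Hypothetical wealth shares (in percent) observed in two survey years.
series = interpolate_quarterly(2016, 30.0, 2019, 36.0)

for time, value in series:
    print(f"{time:.2f}: {value:.1f}%")
```

A straight line assumes the share changed at a constant pace between surveys, which is exactly the kind of assumption a more informed interpolation tries to avoid.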

The result is a raw data set distributed as a ZIP file containing about 24 CSV files, all following the same structure (quarter in the first column, category in the second, and wealth measures in the remaining columns), which makes them easy to merge and analyze. Some files report levels, meaning total dollar amounts of wealth, while others report shares, meaning the percentage of total U.S. wealth each group owns.
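The relationship between levels and shares can be sketched in a few lines of Python. The sample rows, the column names (Date, Category, Net worth), and the quarter format below are assumptions made for illustration and are not copied from the actual files. The sketch also assumes a share is a group’s level divided by the sum of all groups’ levels in the same quarter.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows only: column names and numbers are placeholders,
# not values taken from the DFA files.
SAMPLE_LEVELS = """Date,Category,Net worth
2020:Q1,College,100
2020:Q1,SomeCollege,50
2020:Q1,HS,40
2020:Q1,NoHS,10
2020:Q2,College,120
2020:Q2,SomeCollege,40
2020:Q2,HS,30
2020:Q2,NoHS,10
"""


def read_levels(text, value_col):
    """Parse CSV text into {quarter: {category: level}}."""
    table = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        table[row["Date"]][row["Category"]] = float(row[value_col])
    return dict(table)


def levels_to_shares(levels):
    """Convert each quarter's levels into percentages of that quarter's total."""
    shares = {}
    for quarter, by_group in levels.items():
        total = sum(by_group.values())
        shares[quarter] = {g: 100 * v / total for g, v in by_group.items()}
    return shares


levels = read_levels(SAMPLE_LEVELS, "Net worth")
shares = levels_to_shares(levels)

for quarter in sorted(shares):
    print(quarter, {g: round(s, 1) for g, s in shares[quarter].items()})
```

Because every file shares the quarter and category columns, the same two keys can be used to join any pair of files. The same logic carries over directly to a data-frame library if one is preferred.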

Who Was Excluded? For What Purpose?

The dataset is produced by the Federal Reserve, headed by Jerome Powell. The institution represents the institutional and policy perspectives of the U.S. government, which shapes how wealth and inequality are presented. Because the Federal Reserve wields monetary authority, the dataset embodies a technocratic perspective in which wealth inequality is strictly an economic phenomenon rather than a result of historical power relations. If this dataset were our only source, the audience would be unaware of the lived experiences of different demographic groups with regard to wealth inequality. The focus on quantifiable metrics such as net worth, real estate, and private businesses omits the effects of the broader political and social developments that have influenced the accumulation of capital in the United States.

Furthermore, race is recorded only as White, Black, Hispanic, and Other. Many groups, such as Asian, Native American, Middle Eastern, and multiracial individuals, aren’t represented separately, and by merging all of them into “Other,” the dataset loses nuance and makes it difficult to analyze these groups over time. This obscures a large part of the U.S. population, and since some of these groups have likely seen major economic changes in recent decades, it isn’t possible to study how their wealth has shifted over time.

Additionally, the education variable in the dataset only includes NoHS, HS, SomeCollege, and College. It doesn’t distinguish among types of degrees, such as associate’s, bachelor’s, master’s, professional, or doctoral degrees. It also excludes other forms of education like trade schools, vocational training, and alternative education programs. As a result, any analysis that compares education and wealth will be limited, because it treats college graduates and non-college individuals as uniform groups, even though income, jobs, and wealth can vary widely depending on the type of education or training received.

Similarly, the age variable includes four categories: age70plus, age55to69, age40to54, and ageunder40. However, these groups are quite broad. The label “ageunder40” combines everyone below forty into a single group, which prevents meaningful comparisons among people in their twenties, early thirties, and late thirties. This grouping overlooks how wealth typically changes during these years, when people begin their careers, take on student debt, buy homes, or start families. By merging these life stages together, the dataset conceals important transitions in savings, investment, and debt that influence long-term wealth growth. As a result, it becomes difficult to trace how financial stability develops across the different stages of early and mid-adulthood.

In addition, the DFA dataset has no variables for gender, marital status, or household size. These factors strongly influence financial outcomes, and leaving them out prevents comparisons along key social dimensions. The dataset also doesn’t include any regional or state-level information, which prevents analysis of how wealth distribution varies across different parts of the country. This makes it impossible to study geographic patterns, such as differences in wealth between urban and rural areas or regional economic disparities across the U.S.

Lastly, the data are not adjusted for inflation over the time period, and the dataset includes no measures of the consumer price index (CPI), housing availability or affordability, hours worked, or fluctuations in the valuation of assets such as real estate, money markets, bonds, and loans. The lack of economic quality-of-life indicators limits the dataset’s ability to capture the lived realities behind wealth inequality.
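One way an analyst could work around the missing inflation adjustment is to bring in a price index from another source and deflate the dollar levels. The sketch below assumes a CPI series is available for the same quarters. The function name, the quarter labels, and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def to_real_dollars(nominal, cpi, base_period):
    """Express nominal amounts in base-period dollars.

    Applies real = nominal * CPI_base / CPI_t for each period t.
    """
    base = cpi[base_period]
    return {period: amount * base / cpi[period] for period, amount in nominal.items()}


# Hypothetical nominal wealth levels and a hypothetical price index.
nominal_net_worth = {"2000:Q1": 100.0, "2020:Q1": 200.0}
cpi_index = {"2000:Q1": 100.0, "2020:Q1": 150.0}

real = to_real_dollars(nominal_net_worth, cpi_index, base_period="2020:Q1")

nominal_growth = nominal_net_worth["2020:Q1"] / nominal_net_worth["2000:Q1"] - 1
real_growth = real["2020:Q1"] / real["2000:Q1"] - 1

print(f"Nominal growth: {nominal_growth:.1%}")
print(f"Real growth:    {real_growth:.1%}")
```

In this made-up example, wealth doubles in nominal terms but grows by only about a third once prices are held constant, which shows how unadjusted levels can overstate gains over long periods.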