# Ho Proficiency Thresholds: Oregon 2024-25 Exploratory Read

Generated: 2026-04-26

## Question

Andrew Ho's 2008 paper argues that Percentage of Proficient Students statistics can be technically correct yet misleading as summaries of score distributions. This note asks what that concern means for our Oregon 2024-25 school achievement analyses, which often use `Percent Proficient` as the main outcome.

## Scope

- Data: Oregon 2024-25 school achievement rows in `data/processed/*WithAddressesAndCensusSES.csv`.
- Rows: `Total Population (All Students)`, grade-specific rows only. `All Grades` science rows are excluded to avoid double-counting.
- Filters: charter schools and virtual schools excluded, matching the main non-charter/non-virtual analysis convention.
- Aggregation: school-subject totals across tested grades, using achievement-level counts.
- Inclusion threshold: at least 30 tested participants after aggregation.
- Unit: school-subject rows. Participant counts are subject-counts, not unique students across subjects.

## Ho's Concern, Applied Here

Ho's central statistical point is that proficiency percentages depend on the selected cut score and on how many students are near that cut (Ho, 2008). A trend, gap, or school comparison can look large, small, or even directionally different depending on which threshold is used. He recommends distribution-wide views such as averages, effect sizes, percentiles, and multi-point graphical summaries.

Oregon's public school-level data do not give us scale-score means, student percentiles, or full score distributions. They do give us counts in Levels 1-4. That means we cannot fully solve Ho's problem, but we can reduce it by checking whether `Percent Proficient` tells the same story as Level 1, Level 4, and a simple four-level index.

## Statewide Level Mix

- English Language Arts: 42.3% proficient, 34.2% Level 1, 22.3% Level 2, 25.2% Level 3, 17.2% Level 4; Level 2+3 adjacent-band share is 47.5%.
- Mathematics: 31.6% proficient, 43.0% Level 1, 24.3% Level 2, 17.0% Level 3, 14.6% Level 4; Level 2+3 adjacent-band share is 41.2%.
- Science: 29.6% proficient, 40.2% Level 1, 29.3% Level 2, 24.4% Level 3, 5.1% Level 4; Level 2+3 adjacent-band share is 53.7%.

Interpretation: the proficiency cut is not summarizing the same distributional shape in each subject. Science, for example, has a large Level 2+3 adjacent-band share but very little Level 4, so a single `Level 3+4` rate summarizes a different underlying distribution than it does in ELA.

## Same Percent Proficient, Different Distribution

- English Language Arts: Marshfield Senior High School and Riverview School are both around 39.4-40.0% proficient, but their four-level indexes are 2.05 and 2.30.
- Mathematics: Guy Lee Elementary School and Sutherlin Middle School are both around 20.0-20.2% proficient, but their four-level indexes are 1.56 and 1.86.
- Science: Irrigon Junior/Senior High School and Richardson Elementary School are both around 13.8-14.2% proficient, but their four-level indexes are 1.39 and 1.77.

These are not claims about the named schools themselves; they are diagnostic examples. The point is that the official threshold metric can be almost identical while the underlying Level 1-4 mix is materially different.

## Effect on Our Core Context Analyses

- English Language Arts: Percent proficient correlations are BA+ +0.64, poverty -0.75, attendance +0.57; the four-level index is similar at +0.63, -0.75, +0.59.
- Mathematics: Percent proficient correlations are BA+ +0.66, poverty -0.66, attendance +0.69; the four-level index is similar at +0.64, -0.65, +0.70.
- Science: Percent proficient correlations are BA+ +0.58, poverty -0.67, attendance +0.44; the four-level index is similar at +0.54, -0.64, +0.53.

This is reassuring but not a free pass.
The big context story does not disappear when we replace `Percent Proficient` with the crude four-level index: poverty, attendance, and adult BA+ context still point in similar directions. But the sensitivity plot shows that Level 4 and Level 1 are not interchangeable with proficiency. Top-end performance, severe low-end performance, and near-cut concentration each emphasize different parts of the distribution.

## Stress Test: Does Equal Spacing Drive the Finding?

Question answered: if the four-level index is only a rough proxy because Level 1, Level 2, Level 3, and Level 4 may not be evenly spaced, do the SES associations disappear when the level scores are changed?

I tested several alternative scoring rules: the original proficient/not-proficient threshold, equal `1-2-3-4` spacing, extra premium for Level 4, stronger premium for Level 4, stronger penalty for Level 1, a compressed middle, and a near-pass-friendly rule that gives Level 2 more credit. These are not official scales; they are robustness checks.

- English Language Arts: across the core scoring alternatives, BA+ ranges +0.61 to +0.65, poverty -0.75 to -0.74, and attendance +0.57 to +0.60.
- Mathematics: across the core scoring alternatives, BA+ ranges +0.62 to +0.66, poverty -0.66 to -0.64, and attendance +0.69 to +0.71.
- Science: across the core scoring alternatives, BA+ ranges +0.51 to +0.58, poverty -0.67 to -0.60, and attendance +0.44 to +0.56.

Interpretation: the exact index value is not sacred, but the main SES story is not driven by the equal-spacing assumption. Adult BA+, poverty, and attendance remain strongly associated with the achievement distribution under the core alternate scoring rules.

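
The alternative scoring rules above amount to replacing the `1-2-3-4` level scores with a different set of weights and recomputing a weighted mean per school-subject row. The following is a minimal sketch of that idea, not the code behind the reported correlations: the rule names, the weight values, and the two example schools are all illustrative assumptions.

```python
# Illustrative level weights for (Level 1, Level 2, Level 3, Level 4).
# ASSUMPTION: these values are made up for this sketch; the weights actually
# used in the stress test are not stated in this note.
SCORING_RULES = {
    "proficient_threshold": (0.0, 0.0, 1.0, 1.0),  # Level 3+4 share
    "equal_spacing": (1.0, 2.0, 3.0, 4.0),         # the four-level index
    "level4_premium": (1.0, 2.0, 3.0, 5.0),        # extra credit at the top
    "level1_penalty": (0.0, 2.0, 3.0, 4.0),        # extra penalty at the bottom
    "near_pass_friendly": (1.0, 2.5, 3.0, 4.0),    # more credit for Level 2
}


def level_index(counts, weights):
    """Weighted mean of level scores for one school-subject row."""
    total = sum(counts)
    if total == 0:
        raise ValueError("row has no tested participants")
    return sum(c * w for c, w in zip(counts, weights)) / total


# Two hypothetical schools (not real Oregon rows) with the same Level 3+4
# share of 40% but a different Level 1-4 mix.
school_a = (50, 10, 30, 10)
school_b = (20, 40, 35, 5)

for name, weights in SCORING_RULES.items():
    a = level_index(school_a, weights)
    b = level_index(school_b, weights)
    print(f"{name:22s} A={a:.2f}  B={b:.2f}")
```

In this toy case both schools score 0.40 under the threshold rule, while equal spacing separates them at 2.00 and 2.25. Every non-threshold rule in the sketch ranks school B higher, which mirrors the document's point that the threshold metric can hide differences the level mix reveals.
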
I also included two edge-case lenses, not as preferred indexes but as reminders that "success" can mean different parts of the distribution:

- English Language Arts: the edge-case `Level 4 only` view has BA+ +0.68, poverty -0.67, attendance +0.59; the edge-case `Avoid Level 1` view has BA+ +0.54, poverty -0.73, attendance +0.55.
- Mathematics: the edge-case `Level 4 only` view has BA+ +0.69, poverty -0.65, attendance +0.63; the edge-case `Avoid Level 1` view has BA+ +0.56, poverty -0.60, attendance +0.72.
- Science: the edge-case `Level 4 only` view has BA+ +0.48, poverty -0.57, attendance +0.24; the edge-case `Avoid Level 1` view has BA+ +0.44, poverty -0.50, attendance +0.60.

Interpretation: Level 4-only and avoiding Level 1 are different questions. Level 4-only puts more weight on excellence; avoiding Level 1 puts more weight on severe low-end performance. These variants change some magnitudes, especially in science, which reinforces the caution about precise school rankings and tail-specific claims.

## Practical Read

1. `Percent Proficient` is acceptable as a public-facing snapshot, but it should not carry the whole evidentiary load.
2. For school comparisons, pair it with stacked Level 1-4 distributions or at least Level 1 and Level 4 rates.
3. For "overperforming" or "underperforming" claims, rerun the model with Level 4, Level 1, the four-level index, and at least one alternate level-spacing rule as sensitivity checks.
4. For trend claims, be especially cautious. Without scale-score means or percentiles, Oregon school-level data cannot tell whether students are moving across the whole distribution or mostly around the Level 2/3 threshold.
5. The best long-run fix would be ODE release of school-level scale-score summaries or percentile points, not just achievement levels.
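
The sensitivity rerun recommended in item 3 can be sketched as a loop over outcome definitions, each correlated with the same context variable. This is a rough illustration under stated assumptions: the rows, the poverty values, and the names `ROWS`, `OUTCOMES`, and `pearson` are hypothetical, and a plain Pearson correlation stands in for whatever model the main analysis uses. Only the 30-participant inclusion threshold comes from the Scope section.

```python
from math import sqrt

MIN_TESTED = 30  # inclusion threshold from the Scope section

# Hypothetical school-subject rows: (Level 1-4 counts, poverty rate).
# ASSUMPTION: invented numbers for illustration, not Oregon data.
ROWS = [
    ((10, 15, 30, 25), 0.08),
    ((20, 25, 25, 10), 0.15),
    ((35, 25, 15, 5), 0.24),
    ((45, 20, 10, 5), 0.31),
    ((8, 6, 4, 2), 0.40),  # only 20 tested, so the threshold drops it
]

# Each outcome maps a tuple of Level 1-4 counts to a single number.
OUTCOMES = {
    "percent_proficient": lambda c: (c[2] + c[3]) / sum(c),
    "level_4_rate": lambda c: c[3] / sum(c),
    "level_1_rate": lambda c: c[0] / sum(c),
    "four_level_index": lambda c: sum((i + 1) * n for i, n in enumerate(c)) / sum(c),
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Apply the inclusion threshold, then correlate every outcome with poverty.
kept = [(counts, pov) for counts, pov in ROWS if sum(counts) >= MIN_TESTED]
poverty = [pov for _, pov in kept]
results = {
    name: pearson([fn(counts) for counts, _ in kept], poverty)
    for name, fn in OUTCOMES.items()
}

for name, r in results.items():
    print(f"{name:20s} r = {r:+.2f}")
```

If the sign or rough size of the association changes sharply between `percent_proficient` and the other outcomes, that is a signal the claim depends on the cut score rather than on the distribution as a whole. Note that `level_1_rate` is expected to carry the opposite sign from the other three, since a higher value means worse performance.
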
## Figures

- `ho_proficiency_thresholds_statewide_levels_2026-04-26.png`
- `ho_proficiency_thresholds_pp_vs_index_2026-04-26.png`
- `ho_proficiency_thresholds_same_pp_pairs_2026-04-26.png`
- `ho_proficiency_thresholds_context_correlation_sensitivity_2026-04-26.png`

## Machine-Readable Outputs

- `ho_proficiency_thresholds_school_subject_2026-04-26.csv`
- `ho_proficiency_thresholds_summary_2026-04-26.csv`
- `ho_proficiency_thresholds_same_pp_examples_2026-04-26.csv`
- `ho_proficiency_thresholds_correlation_sensitivity_2026-04-26.csv`
- `ho_proficiency_thresholds_level_spacing_stress_test_2026-04-26.csv`

## Caveats

The four-level index treats Level 1, Level 2, Level 3, and Level 4 as evenly spaced. That is a useful sensitivity check, not an official scale score. The Level 2+3 share is an adjacent-band proxy for threshold sensitivity, not a count of students literally near the cut score. The underlying public data remain aggregate, not student-level.

## Reference

Ho, A. D. (2008). The problem with "proficiency": Limitations of statistics and policy under No Child Left Behind. *Educational Researcher, 37*(6), 351-360. https://doi.org/10.3102/0013189X08323842