Conducted hands-on reproducibility testing for a comprehensive study of computational reproducibility in machine learning security research published at Tier 1 security conferences from 2013 to 2022.
Reproducibility is fundamental to scientific advancement, yet many fields have faced reproducibility crises. Computer security has an advantage here: its research produces computational artifacts (code, data, figures) that should make results easier to reproduce. However, no comprehensive study had measured the actual state of reproducibility in the security community, particularly for machine learning papers.
As a research assistant, I tested nearly 750 machine learning papers from Tier 1 security conferences (CCS, S&P, USENIX Security, NDSS) for reproducibility. For each paper, I systematically attempted to run the provided codebase and reproduce the reported results, then documented the outcome so the study could measure the success rate of computational reproducibility across a decade of research.