Multivariate Mean Comparison Under Differential Privacy
2022
Research Hub A: Kryptographie der Zukunft
RC 3: Foundations of Privacy
The comparison of multivariate population means is a central task of statistical inference. While statistical theory provides a variety of analysis tools, these tools usually do not protect individuals’ privacy. Awareness of this lack of protection can create incentives for participants in a study to conceal their true data (especially when their values are outliers), which can distort the analysis. In this paper, we address this problem by developing a hypothesis test for multivariate mean comparisons that guarantees differential privacy to users. The test statistic is based on the popular Hotelling’s T²-statistic, which has a natural interpretation in terms of the Mahalanobis distance. To control the type I error, we present a bootstrap algorithm under differential privacy that provably yields a reliable test decision. In an empirical study, we demonstrate the applicability of this approach.
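For orientation, the classical, non-private two-sample Hotelling's T² statistic can be sketched as below. This is not the differentially private test developed in the paper: it adds no noise and gives no privacy guarantee. The function name `hotelling_t2`, the pooled-covariance form, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np


def hotelling_t2(x, y):
    """Classical (non-private) two-sample Hotelling T^2 statistic.

    x, y: array-likes of shape (n1, d) and (n2, d), one observation per row.
    Assumes a common covariance matrix and an invertible pooled estimate.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]

    # Difference of the two sample mean vectors.
    diff = x.mean(axis=0) - y.mean(axis=0)

    # Pooled sample covariance (atleast_2d handles the d = 1 case).
    s1 = np.atleast_2d(np.cov(x, rowvar=False))
    s2 = np.atleast_2d(np.cov(y, rowvar=False))
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)

    # Squared Mahalanobis distance between the sample means,
    # computed with a linear solve instead of an explicit inverse.
    maha_sq = diff @ np.linalg.solve(pooled, diff)

    # Scale by the effective sample size to obtain T^2.
    return (n1 * n2) / (n1 + n2) * maha_sq
```

As a small check, for `x = [[0, 0], [1, 0], [0, 1], [1, 1]]` and `y` equal to `x` shifted by one unit in the first coordinate, the pooled covariance is one third of the identity matrix, the squared Mahalanobis distance between the means is 3, and the statistic is 6 up to floating-point rounding. The statistic is thus a scaled squared Mahalanobis distance between the two sample means, which is the interpretation the abstract refers to.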