Many studies on research productivity and performance suggest that men consistently outperform women. However, women and men are spread unevenly throughout the academy both horizontally (e.g., by scientific field) and vertically (e.g., by academic position), suggesting that aggregate numbers (comparing all men with all women) may reflect the different publication practices in different corners of the academy rather than gender per se. We use Norwegian bibliometric data to examine how the “what” (which publication practices are measured) and the “who” (how the population sample is disaggregated) matter in assessing apparent gender differences among academics in Norway. We investigate four clusters of indicators related to publication volume, publication type, authorship, and impact or quality (12 indicators in total) and explore how disaggregating the population by scientific field, institutional affiliation, academic position, and age changes the gender gaps that appear at the aggregate level. For most (but not all) indicators, we find that gender differences disappear or are strongly reduced after disaggregation. This suggests a composition effect, whereby apparent gender differences in productivity can to a considerable degree be ascribed to the composition of the group examined and the different publication practices common to specific groups. We argue that aggregate figures can exaggerate some gender disparities while obscuring others. Our study illustrates the situated nature of research productivity and the importance of comparing men and women within similar academic positions or scientific fields—of comparing apples with apples—when using bibliometric indicators to identify gender disparities in research productivity.
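The composition effect described above can be illustrated with a minimal sketch. The records below are invented toy values, not data from the study, and the field names, the single "publications per year" indicator, and the helper functions `gap` and `gaps_by_field` are all assumptions made for illustration. The sketch shows only the general mechanism: when men and women are distributed unevenly across fields with different publication practices, a gap can appear in the aggregate even though none exists within any field.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records (NOT data from the study):
# (gender, field, publications per year).
# Men are concentrated in the high-volume field, women in the low-volume one.
records = [
    ("M", "medicine", 4.0), ("M", "medicine", 4.0), ("M", "medicine", 4.0),
    ("W", "medicine", 4.0),
    ("M", "humanities", 1.0),
    ("W", "humanities", 1.0), ("W", "humanities", 1.0), ("W", "humanities", 1.0),
]


def gap(rows):
    """Return men's mean output minus women's mean output for the given rows."""
    by_gender = defaultdict(list)
    for gender, _field, output in rows:
        by_gender[gender].append(output)
    return mean(by_gender["M"]) - mean(by_gender["W"])


def gaps_by_field(rows):
    """Disaggregate by field and return {field: gender gap within that field}."""
    by_field = defaultdict(list)
    for row in rows:
        by_field[row[1]].append(row)
    return {field: gap(group) for field, group in by_field.items()}


aggregate_gap = gap(records)        # all men compared with all women
field_gaps = gaps_by_field(records)  # men and women compared within each field

print("aggregate gap:", aggregate_gap)
print("gaps by field:", field_gaps)
```

In this constructed case the aggregate gap is 1.5 publications per year, while the gap within each field is zero. Real bibliometric data would rarely be this clean: as noted above, disaggregation reduces or removes the gap for most indicators but not all, and the same approach could in principle be repeated for institutional affiliation, academic position, and age.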