Whether qualitative or quantitative, contemporary civil-war studies tend to over-aggregate empirical evidence. To open the black box of the state, it is necessary to pinpoint the location of key conflict parties. As a contribution to this task, this article describes a data project that geo-references ethnic groups around the world. Relying on maps and data drawn from the classical Soviet Atlas Narodov Mira (ANM), the ‘Geo-referencing of ethnic groups’ (GREG) dataset employs geographic information systems (GIS) to represent group territories as polygons. This article introduces the structure of the GREG dataset and illustrates its application by examining the impact of group concentration on conflict. In line with previous findings, the authors show that groups with a single territorial cluster according to GREG have a significantly higher risk of conflict. This example demonstrates how the GREG dataset can be processed in the R statistical package without specific skills in GIS. The authors also provide a detailed discussion of the shortcomings of the GREG dataset, which result from the datedness of the ANM and its unclear coding conventions. By comparing GREG to other datasets on ethnicity, the article illustrates the strengths and weaknesses of the GREG dataset.