(Note: British English spelling - ‘harmonisation’; North American English spelling - ‘harmonization’)

There are many definitions of data harmonisation, but a good working definition is provided by Maelstrom Research.

Harmonisation involves achieving or improving comparability of similar measures collected by separate studies or databases for different individuals. Some research programs foster prospective implementation of harmonised measures to collect data across studies, while others turn their efforts to retrospective harmonisation and co-analysis of existing datasets.

In summary, harmonisation seeks to bring together various types, levels and sources of data, which represent measurement of the same latent construct(s), in such a way that they can be made compatible and comparable (see Figure C.8.1 for an example on wine consumption).

Harmonisation differs from standardisation in that it does not impose a single methodology or norm, but rather seeks to find ways of integrating or making "an agreeable effect" from information gathered through disparate methodologies (Harris et al., 2012).

Figure C.8.1 Example of harmonisation of a common format wine consumption variable using data collected in different ways.

When the purpose of an investigation requires bringing data from multiple sources together in one analysis (e.g. meta-analysis), a preceding step to the main analysis involves converting the variable(s) to a common format which should adhere to the principle of inferential equivalence (see Figure C.8.2).

Figure C.8.2 Principle of inferential equivalence for harmonised common format variables. The inferences about the latent truth are equivalent regardless of the method originally used to collect the data. It is important to note that the ‘method’ refers to initial measurement by a tool/instrument, plus any stages of inference involved in post-processing and derivation of estimates; flexibility and variation during these steps may also influence the degree to which data can be harmonised and be deemed inferentially equivalent.

Researchers may, for example, conduct an analysis of multiple cohorts in order to understand the genetic, lifestyle, social and physical environmental factors associated with disease by bringing together data from diverse countries and regions [2]. As there are significant financial, technical and time burdens associated with developing and maintaining large population health studies which have matured to generate disease events, researchers are making use of data sets consisting of information from multiple studies.

Analyses using data from multiple studies are constrained by the quality and compatibility of their data [3]. Variables are often assessed using different methods, limiting their compatibility. In longitudinal studies, there can be changes in methods of assessment between phases of data collection. Standardisation and harmonisation are related approaches which facilitate analyses by improving the compatibility of data; however, there are important differences between the two.


Standardisation refers to the implementation of uniform processes for prospective collection, storage and transformation of data [4]. Standardisation implies that precisely the same methods, protocols and standard operating procedures are used in every study or study phase contributing to the analyses [1]. (Note: in statistics, ‘standardisation’ has a different meaning: rescaling a variable by its standard deviation, e.g. to obtain a z-score.)

Using standardised methods across multiple studies greatly facilitates analyses of datasets from separate cohorts. However, imposing identical procedures is also very challenging due to varying:

  1. Study designs
  2. Participant characteristics
  3. Equipment availability
  4. Staff training
  5. Cultural differences
  6. Ethical considerations
  7. Scientific interests or priorities

Even if rigorous standards are implemented, some level of human error and variation between studies is inevitable [4].


Harmonisation is a more flexible approach that is more realistic than standardisation in a collaborative context [3]. Harmonisation refers to the practices that improve the comparability of variables from separate studies, permitting the pooling of data collected in different ways, and reducing study heterogeneity [5].

The harmonisation process involves deriving target variables formatted in a specified way from existing data collected using methods which are diverse across studies. Data can be recoded, transformed or combined with additional information to achieve harmonisation, but the process requires compatibility of both the methods used and the pre-existing data. The degree of similarity required is not absolute or easily defined; it varies according to the target variable to be derived and the scientific context (i.e. the research question). What is important is that the data are ‘inferentially equivalent’, i.e. conclusions about the latent true values from the derived target variable are valid regardless of the method by which the data were originally collected.

For example, when harmonising a ‘total daily energy intake’ target variable, there would be little scope to harmonise data from a study with only fruit and vegetable intakes. In contrast, if ‘total daily physical activity’ were the target variable, data from diverse methods such as questionnaires and accelerometers could conceivably be harmonised.

Harmonisation includes practices that enable the pooling of data from multiple cohorts/biobanks at a level of precision that is scientifically adequate, yet accommodates the heterogeneity of those studies. The key challenge of harmonisation is to increase sample size by combining an adequate number of studies, whilst limiting inclusion to those that are satisfactorily harmonised [3]. Compared to standardisation, advantages of this more flexible approach include the potential to include a broader range of studies with greater variety of information, and the ability to use existing data, which could lead to more rapid scientific impact [6]. However, this work is challenging and time-consuming, and requires access to measurement expertise and resources. Resources such as the InterConnect and Maelstrom Research registries therefore aim to capture and share the algorithms and processes used during harmonisation, so that others may utilise this information for future work.

Prospective vs. retrospective harmonisation

Prospective harmonisation

Ideally, researchers would agree in advance on a series of practices to collect data in such a way as to directly enable pooled analysis [4]. This prospective harmonisation does not necessarily denote complete standardisation of methods, since a degree of plurality is accepted where necessary, but this would be planned and justified before the data are collected. A prospective harmonisation approach provides comparable output across methods of inference, despite differences in measurement, without the need for further harmonisation steps. However, it involves significant planning, as well as adherence to those plans across studies.

Retrospective harmonisation

In contrast, retrospective harmonisation occurs after the data have been collected; this is the most common scenario. The quantity and quality of data that can be pooled is limited by the pre-existing differences between study methods and protocols [3]. The retrospective harmonisation process involves steps which can be summarised as follows:

  1. Define the target variable(s)
  2. Assess harmonisation potential
  3. Derive common format data

The target variable is the desired common format to be derived using harmonisation from the existing raw data in the different studies. There may be several variables in any given analysis that require harmonisation, and multiple target variables must therefore be defined. The definition of any single target variable should include its unit. Examples of target variables include:

  1. Total quantity of a specific food consumed per day (e.g. g/day)
  2. Leisure-time energy expenditure per day (e.g. kJ/kg/day)
  3. Percentage body fat (%)

The target variable should be suitable for the purposes of answering a research question but is also dependent upon the methods used and data available from the various studies. Some studies may already report the target variable in the desired units with no requirement for modification or transformation. The target variable and its units may need to be reconsidered when assessing harmonisation potential; this is a balance between what is desirable for the purposes of answering the research question, and what is feasible considering the data available. Please refer to case study 2 on the derivation of leisure-time physical activity target variables for the InterConnect project for more information.

As indicated above, the aim of harmonisation is to produce a target variable using data from different studies in such a way that the data can be considered inferentially equivalent. Since the level of harmonisation potential is determined by the methods used (and the resulting data), it is essential to scrutinise the methods used across different studies to establish whether metrics can be harmonised and how this may be achieved.

The first step in the harmonisation process is, therefore, to acquire relevant metadata from studies, such as:

  1. The methods and instruments used in the studies (for example to assess diet, physical activity or anthropometry)
    1. Type or name of the specific instruments
    2. Format of the raw data
    3. Observation period(s), whether retrospective or prospective
    4. Subcomponents measured
    5. Assumptions made during processing and derivation
  2. Evidence of criterion validity
  3. Other sources of information, e.g. convergent validity
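One way to keep this metadata comparable across studies is to record it in a fixed structure. The following is a minimal sketch in Python; the class name `MethodMetadata`, its field names and the helper `undocumented` are illustrative assumptions rather than part of any published harmonisation tool, and the example values are loosely based on the first study in Figure C.8.3.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class MethodMetadata:
    """Hypothetical record of the method metadata listed above, for one study."""
    study: str
    instrument: Optional[str] = None
    raw_data_format: Optional[str] = None
    observation_period: Optional[str] = None
    subcomponents: list[str] = field(default_factory=list)
    processing_assumptions: list[str] = field(default_factory=list)
    criterion_validity: Optional[str] = None
    other_evidence: Optional[str] = None


def undocumented(meta: MethodMetadata) -> list[str]:
    """Return the names of fields that are still empty (None or an empty list)."""
    return [f.name for f in fields(meta) if getattr(meta, f.name) in (None, [])]


# Illustrative values only; a real record would be completed with the study team.
study1 = MethodMetadata(
    study="Online survey of pregnant women",
    instrument="online questionnaire",
    raw_data_format="categorical (hrs/week bands)",
    observation_period="retrospective ('at present')",
    subcomponents=["jogging", "swimming", "brisk walking"],
)

# Lists the metadata that still needs to be requested from the study.
print(undocumented(study1))
```

Listing the empty fields makes gaps in the documentation explicit before harmonisation potential is judged.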

Methods and method components

Each method has a number of components, including the instrument used to make the initial measurement and how it is administered, plus any data storage, processing and derivation stages. The use of additional information such as energy cost tables or nutritional databases also forms part of the method. These components vary between methods used in different studies; however, depending upon the target variable, this variation may not always impact inferential equivalence.

It is therefore useful to document the methods used and assess which components impact inferential equivalence. For example, if the target variable for a study was daily physical activity energy expenditure, then variation in the domains captured by two different questionnaires (e.g. leisure-time activity vs occupational and travel-related activity) would have greater impact on compatibility than variation in administration mode (e.g. electronic vs. pen and paper).  

When assessing harmonisation potential, the components should be examined in detail. For example, when questionnaires are used, assessment items relating to the target variable should be identified and compared. As shown in Figure C.8.3, specific items relating to the target variable of interest are highlighted alongside the units and categories used. This information can be used to assess not only whether the items relate to the variable of interest (e.g. are the activities queried relevant to the research question?), but also whether the existing data can be transformed to the common format.

Study 1: Online survey of pregnant women

In total, how much of the following do you do at present?

• Jogging
• Aerobic
• Ante-natal exercises
• Keep fit exercises
• Yoga
• Squash
• Tennis/badminton
• Swimming
• Brisk walking
• Weight training
• Cycling
• Other exercises

Categorical: >7 hrs/week, 2-6 hrs/week, <1 hr/week, 0 hrs/week

Study 2: Interviews of older adults

In your spare time, how much time in the past week did you spend on:

• walking for fun?
• riding a bicycle?
• playing sports (for example: tennis, handball, gymnastics, fitness, skating, and swimming)?
• doing any other physical exercise in your spare time, for example working in the garden or doing odd jobs around the house (do not include household activities)?

For each question: At what pace do you usually do this?

• relaxed pace
• average pace
• brisk pace

Continuous: Hours per week of light (relaxed), moderate and vigorous intensity physical activity

Study 3: Postal survey of general population

Nowadays, at least one hour per week, do you engage in any regular activity like brisk walking, gardening, housework, jogging, cycling, etc. intense enough to work up a sweat?

Binary: Yes/No

Figure C.8.3 Overview of questionnaire items of leisure-time physical activity from three different studies which can potentially be used to derive a harmonised target variable.

Harmonisation using simple unit conversion

Data are sometimes collected using methods which are sufficiently harmonised but expressed in units which are not directly compatible. If the mathematical relationship between two variables is known, then a conversion factor can be used to harmonise the data to the same units; this process does not introduce any uncertainty and, all other things being equal, the result is fully inferentially equivalent.

One example of this approach is the use of different units for rate of energy turnover, such as kilocalories (kcal) per day, kilojoules (kJ) per day, or watts (W). The relationship between kcal and kJ is known (1 kcal = 4.184 kJ), and 1 watt is 1 joule per second (there are 86,400 seconds per day). Units can therefore be harmonised using conversion factors as required.
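As a minimal sketch of such a conversion, the Python snippet below expresses rates of energy turnover in a common unit of kJ/day. The function and dictionary names are illustrative assumptions; the factors are those stated above (1 kcal = 4.184 kJ; 1 W = 1 J/s, with 86,400 seconds per day).

```python
# Conversion factors to the common unit kJ/day.
KJ_PER_KCAL = 4.184
SECONDS_PER_DAY = 86_400

TO_KJ_PER_DAY = {
    "kJ/day": 1.0,
    "kcal/day": KJ_PER_KCAL,
    # 1 W = 1 J/s, so 1 W sustained for a day is 86,400 J = 86.4 kJ.
    "W": SECONDS_PER_DAY / 1000.0,
}


def to_kj_per_day(value: float, unit: str) -> float:
    """Convert a rate of energy turnover from `unit` to kJ/day."""
    if unit not in TO_KJ_PER_DAY:
        raise ValueError(f"Unsupported unit: {unit!r}")
    return value * TO_KJ_PER_DAY[unit]


# Example: the same quantity reported in three different units.
for value, unit in [(2000, "kcal/day"), (8368, "kJ/day"), (100, "W")]:
    print(f"{value} {unit} = {to_kj_per_day(value, unit):.1f} kJ/day")
```

Raising an error for unknown units, rather than passing values through unchanged, avoids silently pooling data that are still in incompatible units.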

A more complex example may involve energy expenditure data which have been adjusted for body mass (kcal/kg/day), or not (kcal/day). If individual-level data of both energy and body mass are available, it is possible to convert adjusted data to unadjusted data, or vice versa, depending on the analysis to be conducted. If individual-level data are not available, assumptions about the homogeneity of these variables within the strata to be analysed are necessary.
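Where individual-level body mass is available, the conversion between adjusted and unadjusted values is a simple multiplication or division per participant. The sketch below assumes hypothetical function names and made-up participant records, purely for illustration.

```python
def to_absolute(kcal_per_kg_day: float, body_mass_kg: float) -> float:
    """Convert kcal/kg/day to kcal/day using the individual's own body mass."""
    if body_mass_kg <= 0:
        raise ValueError("Body mass must be positive")
    return kcal_per_kg_day * body_mass_kg


def to_relative(kcal_per_day: float, body_mass_kg: float) -> float:
    """Convert kcal/day to kcal/kg/day using the individual's own body mass."""
    if body_mass_kg <= 0:
        raise ValueError("Body mass must be positive")
    return kcal_per_day / body_mass_kg


# Made-up records: one study reports adjusted values alongside body mass.
participants = [
    {"id": "p1", "ee_kcal_kg_day": 10.0, "mass_kg": 70.0},
    {"id": "p2", "ee_kcal_kg_day": 8.0, "mass_kg": 85.0},
]

for p in participants:
    p["ee_kcal_day"] = to_absolute(p["ee_kcal_kg_day"], p["mass_kg"])
    print(p["id"], p["ee_kcal_day"])
```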

Simple conversion should occur in tandem with proper assessment of harmonisation potential (see above). Variables representing the same exposure (e.g. total energy intake) in the same units (e.g. kcal/day) may not be inferentially equivalent due to differences in the methods used. Where differences between methods are too great, further harmonisation using algorithms or validation data is required.

Harmonisation by collapsing to least common denominator

The harmonisation process often requires more complex recoding, modification or transformation of existing data in order to achieve a common format. There is therefore a degree of inference involved. Separate processing rules must be formulated to transform the variables from each study into the common target variable format. These rules, or algorithms, depend upon the data available in each study; the following dimensions may be available to different degrees across studies according to the methods used:

  1. Type (e.g. food or physical activity type)
  2. Frequency (e.g. servings per day of a particular food or occasions of participation in a type of activity per week)
  3. Duration (e.g. time spent participating in a type or intensity of physical activity)
  4. Intensity (e.g. rate of physical activity energy expenditure during a physical activity type)
  5. Quantity (e.g. food portion size)
  6. Sub-totals or features (e.g. physical activity domains, food groups, or fat mass)
  7. Totals (e.g. energy expenditure, intake per day, or total body mass)

The various dimensions of the variable of interest can be combined or modified to produce the target variable. For examples of processing rules for deriving harmonised target variables, please see the three case studies:

  1. Case study 1: fish consumption
  2. Case study 2: leisure-time physical activity
  3. Case study 4: percent body fat at birth

Depending on the type of data available in each study (e.g. continuous, categorical, ordinal, interval), assumptions will likely be needed in order to derive the target variable. For example, in Figure C.8.3 (above), responses of 2-6 hours per week in Study 1 could be recoded as 4 hours per week (240 mins/week) to yield a target variable of weekly activity duration comparable with the continuous duration data available in Study 2. To derive another target variable of activity energy expenditure (e.g. in MET * minutes per week), duration information will need to be combined with:

  1. Study 1: Intensity-by-type data, for example from an energy cost table
  2. Study 2: Intensity-by-type and/or self-rated intensity with an assigned energy cost
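The recoding described for Study 1 can be sketched as follows. Only the 4 hours per week value for the 2-6 hrs/week category comes from the text above; the values assigned to the other categories, and the MET values per activity, are placeholder assumptions that would in practice be justified explicitly (for example from an energy cost table).

```python
# Assumed duration (hours/week) for each response category in Study 1.
ASSUMED_HOURS_PER_WEEK = {
    "0 hrs/week": 0.0,
    "<1 hr/week": 0.5,    # assumption: midpoint of 0-1 hours
    "2-6 hrs/week": 4.0,  # midpoint, as in the example above
    ">7 hrs/week": 8.0,   # assumption: open-ended category needs a chosen value
}

# Placeholder intensities (METs) per activity type; NOT taken from a real table.
PLACEHOLDER_METS = {"swimming": 6.0, "brisk walking": 4.0}


def weekly_minutes(category: str) -> float:
    """Recode a categorical response to minutes per week."""
    return ASSUMED_HOURS_PER_WEEK[category] * 60.0


def met_minutes_per_week(responses: dict[str, str]) -> float:
    """Sum duration x intensity over the activity types reported."""
    return sum(
        weekly_minutes(category) * PLACEHOLDER_METS[activity]
        for activity, category in responses.items()
    )


# One hypothetical participant's responses.
responses = {"swimming": "2-6 hrs/week", "brisk walking": "<1 hr/week"}
print(weekly_minutes("2-6 hrs/week"))   # 240.0
print(met_minutes_per_week(responses))  # 1560.0
```

Keeping the assumed values in named tables, rather than embedding them in the calculation, makes the assumptions explicit and easy to vary in sensitivity analyses.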

External information or normative data can be used to inform and support the assumptions made when developing harmonisation algorithms, such as:

  1. External data on average portion sizes could be used in combination with frequency and food type data to derive quantities consumed.
  2. Activity cost tables, such as the compendium of physical activities, may be used to assign intensity values to each type of activity.
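The first of these examples can be sketched as follows, assuming a hypothetical average portion size. The 120 g value and all names below are placeholders for illustration and are not drawn from any reference source.

```python
# Hypothetical external data: average portion size in grams per serving.
ASSUMED_PORTION_G = {"fish": 120.0}


def grams_per_day(food: str, servings_per_week: float,
                  portions: dict[str, float] = ASSUMED_PORTION_G) -> float:
    """Derive quantity consumed (g/day) from frequency and an assumed portion size."""
    return servings_per_week * portions[food] / 7.0


# A participant reporting two servings of fish per week.
print(round(grams_per_day("fish", 2), 1))
```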

Caution is advised when using additional information such as this, as the degree of generalisability to the population may vary by participating study; making assumptions explicit allows better evaluation of inferential equivalence.

Some algorithms may result in the loss of more granular data (see Figure C.8.4 for an example of potential variation in granularity of data). If one study provides data in binary format (e.g. low/high), and another provides data in a continuous metric, then a potential harmonisation approach is to reduce the more granular data to the binary format (a ‘reductionist approach’). For more detail on this issue, please see case study 3 on simulation of harmonisation of physical activity exposure using validation data.
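A minimal sketch of this reductionist approach is given below. The threshold of 1 hour per week is an assumption chosen for illustration; in practice the cut-point must match the definition of the binary variable in the less granular study, and the participant data here are made up.

```python
def collapse_to_binary(hours_per_week: float, threshold_hours: float = 1.0) -> str:
    """Reduce a continuous duration to Yes/No using an assumed cut-point."""
    return "Yes" if hours_per_week >= threshold_hours else "No"


# Made-up continuous data from a more granular study.
continuous_study = {"p1": 0.0, "p2": 0.5, "p3": 4.0}

harmonised = {pid: collapse_to_binary(h) for pid, h in continuous_study.items()}
print(harmonised)
```

After collapsing, participants p2 and p3 differ only by category, so the information that p3 reported eight times the duration of p2 is no longer available to the analysis.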

Figure C.8.4 Potential differences in granularity of data to be harmonised in three participating studies.

Harmonisation using validation data

The use of collapsing algorithms leads to loss of information when richer, more detailed data are reduced to the less granular level of another variable in order to achieve harmonisation (e.g. coding a continuous variable as ordinal categories of low, medium and high exposure). This loss in information generally weakens statistical power to detect associations.

An alternative approach can preserve the more detailed information from some participating studies while still enabling harmonisation of less granular data from others. It is based on the relationship between the estimates in each included study and the unobservable, or latent, true values. This relationship can be estimated in method comparison (validation) studies, in which a criterion measure is used alongside the method in question, as shown in Figure C.8.5.

Figure C.8.5 Hypothetical relationships between estimates from three participating studies with the latent truth. Data from a suitable criterion method best estimate the latent true values of the target variable. If the relationship (mapping relationship 1) between estimates from a criterion method and estimates from Method A is known from a validation study, then it may be possible to transform data from Method A so that they are harmonised with data from the criterion method.

The above approach relies upon the existence of a suitable criterion for the target variable, and the availability of validation data for a given ‘Method X’ against that criterion. Ideally this validation work would be conducted in a population similar to that which is providing the data being harmonised.

Sourcing applicable validation data for multiple studies across heterogeneous populations may be challenging; for some methods (e.g. Method B in Figure C.8.5), no validation data are available (mapping relationship 2). In this scenario, it may be possible to map via a third method (Method C) if two additional sets of validation data are available, namely criterion validity of Method C (mapping relationship 3) and convergent validity between Method B and C (mapping relationship 4).
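If both mapping relationships can reasonably be approximated as linear (an assumption made here purely for illustration), mapping via a third method amounts to composing the two relationships. The class `LinearMap`, the function `compose` and the coefficients below are hypothetical.

```python
from typing import NamedTuple


class LinearMap(NamedTuple):
    """A linear mapping y = intercept + slope * x (illustrative only)."""
    intercept: float
    slope: float

    def apply(self, x: float) -> float:
        return self.intercept + self.slope * x


def compose(first: LinearMap, second: LinearMap) -> LinearMap:
    """Return the mapping equivalent to applying `first`, then `second`."""
    return LinearMap(
        intercept=second.intercept + second.slope * first.intercept,
        slope=second.slope * first.slope,
    )


# Made-up coefficients standing in for mapping relationships 4 and 3.
b_to_c = LinearMap(intercept=1.0, slope=2.0)          # Method B -> Method C
c_to_criterion = LinearMap(intercept=3.0, slope=0.5)  # Method C -> criterion

b_to_criterion = compose(b_to_c, c_to_criterion)
print(b_to_criterion)
print(b_to_criterion.apply(10.0))
```

This sketch handles only the central mapping; the uncertainty from both validation studies accumulates along the chain and would need to be carried through separately.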

When validation data are available, the transformation using validation data consists of:

  1. The mapping of estimates from Method ‘X’ to estimates from a suitable criterion
  2. An estimate of the uncertainty of the mapping

Subsequent association analysis should then ideally provide:

  1. Central estimate of association using mapped variable (required)
  2. Confidence intervals which incorporate propagation of uncertainty in the mapping (desired)
  3. Adjustment for measurement error (desired)

For a worked example of this harmonisation approach, please see case study 3 on simulation of harmonisation of physical activity exposure using validation data.

  1. Harris JR, Burton P, Knoppers BM, Lindpaintner K, Bledsoe M, Brookes AJ, Budin-Ljøsne I, Chisholm R, Cox D, Deschênes M, et al. Toward a roadmap in global biobanking for health. Eur J Hum Genet. 2012;20:1105-11.
  2. Fortier I, Doiron D, Little J, Ferretti V, L'Heureux F, Stolk RP, Knoppers BM, Hudson TJ, Burton PR, International Harmonization Initiative, et al. Is rigorous retrospective harmonization possible? Application of the DataSHaPER approach across 53 large studies. Int J Epidemiol. 2011;40:1314-28.
  3. Fortier I, Burton PR, Robson PJ, Ferretti V, Little J, L'Heureux F, Deschênes M, Knoppers BM, Doiron D, Keers JC, et al. Quality, quantity and harmony: the DataSHaPER approach to integrating data across bioclinical studies. Int J Epidemiol. 2010;39:1383-93.
  4. Doiron D, Burton P, Marcon Y, Gaye A, Wolffenbuttel BHR, Perola M, Stolk RP, Foco L, Minelli C, Waldenberger M, et al. Data harmonization and federated analysis of population-based studies: the BioSHaRE project. Emerg Themes Epidemiol. 2013;10:12.
  5. Granda P, Blasczyk E. Data Harmonisation. Guidelines for Best Practice in Cross-Cultural Surveys. 3rd ed. Ann Arbor, MI: Survey Research Centre, Institute for Social Research, University of Michigan; 2011.
  6. Fortier I, Doiron D, Burton P, Raina P. Invited commentary: consolidating data harmonization--how to obtain quality and applicability? Am J Epidemiol. 2011;174:261-4.