Linking databases with health data on breast cancer: opportunities and challenges
More and more health data are being collected from a growing number of sources, such as electronic patient records, health insurance systems, smartwatches and apps. These data offer healthcare researchers a wealth of information, especially when data from different databases can be linked. At the same time, privacy and security must be carefully managed. Nivel and IKNL linked data from the Dutch Cancer Registry (NKR, managed by IKNL) and the Primary Care Database (managed by Nivel) to conduct research on breast cancer and its treatment effects up to 14 years after diagnosis. They described their findings in BMC Medical Research Methodology.
The challenges of data linking mainly concern setting up an efficient infrastructure and reducing privacy and security risks.
Opportunities: easy access to detailed and unique information
A major advantage of linking large health databases is that valuable and unique information can be accessed easily and efficiently, because the data no longer need to be collected separately. This saves money and time. Furthermore, because the databases often contain data on a large number of people, sometimes collected over long periods of time, researchers can study patient groups with rare diseases or rare forms of a disease, and follow patients through different phases of their illness. Moreover, linking data from different databases makes very detailed and comprehensive information about a patient available.
Challenges: regulating process management and ensuring privacy and security
Linking data from different sources requires great care and careful consideration of various interests. Data are often stored by different organisations, so good mutual agreements are needed about the management and control of the data. This takes time. In addition, before a link is established, the importance of the research must be carefully weighed against the privacy and security risks. After all, linking data makes it easier to trace records back to individuals. Finally, not every database has a unique identifier such as a BSN (citizen service number), which means that links are instead made on the basis of, for example, date of birth, gender and postal code. It is therefore important to find ways to prevent the data of different people from being linked to each other by mistake.
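To illustrate the last point, the following minimal Python sketch links records on date of birth, gender and postal code, and accepts a link only when that combination occurs exactly once in each source. This is one possible safeguard against linking different people, not the procedure the researchers used. The field names (`id`, `dob`, `gender`, `postal_code`), the function names and the sample records are all hypothetical.

```python
from collections import defaultdict


def link_key(record):
    # Hypothetical linkage key: date of birth, gender and normalised postal code.
    postal_code = record["postal_code"].replace(" ", "").upper()
    return (record["dob"], record["gender"], postal_code)


def link_unique(source_a, source_b):
    """Return (id_a, id_b) pairs whose key occurs exactly once in each source."""
    ids_by_key_a = defaultdict(list)
    ids_by_key_b = defaultdict(list)
    for record in source_a:
        ids_by_key_a[link_key(record)].append(record["id"])
    for record in source_b:
        ids_by_key_b[link_key(record)].append(record["id"])

    links = []
    for key, ids_a in ids_by_key_a.items():
        ids_b = ids_by_key_b.get(key, [])
        # Ambiguous keys (several candidates on either side) are skipped,
        # so records of different people are not linked by accident.
        if len(ids_a) == 1 and len(ids_b) == 1:
            links.append((ids_a[0], ids_b[0]))
    return links


# Fictional example records, for illustration only.
cancer_registry = [
    {"id": "a1", "dob": "1960-05-17", "gender": "F", "postal_code": "1234 AB"},
    {"id": "a2", "dob": "1972-11-02", "gender": "F", "postal_code": "5678 CD"},
]
primary_care = [
    {"id": "b1", "dob": "1960-05-17", "gender": "F", "postal_code": "1234ab"},
    {"id": "b2", "dob": "1972-11-02", "gender": "F", "postal_code": "5678CD"},
    {"id": "b3", "dob": "1972-11-02", "gender": "F", "postal_code": "5678CD"},
]

links = link_unique(cancer_registry, primary_care)
```

In this example only the first patient is linked. The second key matches two primary care records, so it is left unlinked rather than guessed. The trade-off is that stricter rules reduce false links but also leave more patients out of the linked dataset.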
About the study
The study on the opportunities and challenges of linking data from different sources is based on a research project about the long-term effects of breast cancer. To this end, researchers linked two large health databases: the Dutch Cancer Registry (NKR), with detailed information about the diagnosis and treatment of all breast cancer patients in the Netherlands, and the Nivel Primary Care Database, with detailed information about the GP visits of about 1.5 million Dutch citizens. The newly compiled database, called the Primary-Secondary Cancer Care Registry (PSCCR), has been used to study health effects up to 14 years after diagnosis. It also makes it possible to examine the period before diagnosis. In addition to their findings on health effects, the researchers described the opportunities and challenges that come with linking such large databases.