Data query and analysis
The data warehouse can be used to define cohorts of patients and patient cases with specific inclusion and exclusion criteria. These criteria are defined as constraints on parameters from various domains (e.g. heart rate > 100). In addition to these restricting query entries, the cohort can be enriched with further parameters that do not restrict the query but are only reported. The data of such a query (cohort, inclusion/exclusion criteria, and added parameters) can be exported (after ethics committee approval) and used for retrospective studies and analyses.
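The distinction between restricting criteria and purely reported parameters can be illustrated with a minimal Python sketch. It is not the warehouse's actual query interface: the in-memory records, the field names (`heart_rate`, `icd_codes`, `sodium`), the exclusion rule, and the function `define_cohort` are all assumptions made for illustration.

```python
# Hypothetical in-memory stand-in for warehouse records; all field names are assumptions.
PATIENT_CASES = [
    {"case_id": 1, "age": 67, "heart_rate": 112, "icd_codes": ["I50.1"], "sodium": 138},
    {"case_id": 2, "age": 45, "heart_rate": 88, "icd_codes": ["I10"], "sodium": 141},
    {"case_id": 3, "age": 72, "heart_rate": 105, "icd_codes": ["C34.9"], "sodium": 135},
]


def define_cohort(cases, include, exclude, report):
    """Keep cases that pass every inclusion and no exclusion criterion.

    The `report` parameters are attached to each result row but never
    influence which cases are selected.
    """
    cohort = []
    for case in cases:
        passes_inclusion = all(criterion(case) for criterion in include)
        hits_exclusion = any(criterion(case) for criterion in exclude)
        if passes_inclusion and not hits_exclusion:
            row = {"case_id": case["case_id"]}
            row.update({name: case.get(name) for name in report})
            cohort.append(row)
    return cohort


cohort = define_cohort(
    PATIENT_CASES,
    # Restricting entry: constraint on a structured parameter.
    include=[lambda c: c["heart_rate"] > 100],
    # Illustrative exclusion: any ICD-10 code starting with "C".
    exclude=[lambda c: any(code.startswith("C") for code in c["icd_codes"])],
    # Enrichment only: reported, not used for filtering.
    report=["age", "sodium"],
)
print(cohort)
```

Only case 1 remains: case 2 fails the inclusion constraint and case 3 is removed by the exclusion criterion, while `age` and `sodium` are simply carried along in the output.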
Example: The prevalence of heart failure within a hospital cannot simply be reported using a few existing parameters from the hospital information system. Codes of the International Classification of Diseases (ICD) would provide a simple parameter but are often optimized for accounting purposes. Thus, a clinical expert defined multiple parameters that identify heart failure within the data warehouse. These parameters include text searches in discharge letters and echocardiographic reports, as well as constraints on structured parameters and on parameters extracted into structured form from free text. The resulting patient cohort was verified against a manually defined cohort and then analyzed.
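A rough Python sketch of this idea follows: a text search and a structured constraint are combined into one rule, and the resulting cohort is compared with a manually defined one. The records, the search pattern, the `lvef` field with its threshold of 40, and the manual cohort are illustrative assumptions, not the parameters the clinical expert actually defined.

```python
import re

# Hypothetical records; field names, texts, and values are illustrative assumptions.
CASES = [
    {"case_id": 1, "discharge_letter": "Known chronic heart failure, NYHA III.", "lvef": 30},
    {"case_id": 2, "discharge_letter": "No signs of cardiac decompensation.", "lvef": 60},
    {"case_id": 3, "discharge_letter": "Admitted with pneumonia.", "lvef": 35},
    {"case_id": 4, "discharge_letter": "Heart failure was ruled out.", "lvef": 58},
]

HF_PATTERN = re.compile(r"heart failure", re.IGNORECASE)


def is_heart_failure(case):
    """Flag a case if the text search or the structured constraint matches."""
    text_hit = bool(HF_PATTERN.search(case["discharge_letter"]))
    structured_hit = case["lvef"] is not None and case["lvef"] < 40  # assumed threshold
    return text_hit or structured_hit


def compare_cohorts(query_ids, manual_ids):
    """Summarize agreement between the query cohort and a manually defined cohort."""
    true_positives = len(query_ids & manual_ids)
    return {
        "sensitivity": true_positives / len(manual_ids) if manual_ids else float("nan"),
        "ppv": true_positives / len(query_ids) if query_ids else float("nan"),
        "false_positives": sorted(query_ids - manual_ids),
        "missed": sorted(manual_ids - query_ids),
    }


query_ids = {case["case_id"] for case in CASES if is_heart_failure(case)}
manual_ids = {1, 3}  # hypothetical manually defined cohort
result = compare_cohorts(query_ids, manual_ids)
print(query_ids, result)
```

Case 4 is a deliberate false positive: a plain text search also matches the negated phrase "ruled out", which is one reason a comparison against a manually defined cohort is useful.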
The data warehouse can also be used for prospective studies by defining study cohorts via data warehouse queries (as described above). Once such a query is defined, it can be executed regularly (e.g. daily) to continuously find patients for study inclusion. After specific ethics committee approval, these patients can also be contacted for study inclusion.
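The recurring execution can be sketched in Python as a stored query that is re-run on the current data and reports only cases not seen in earlier runs. The function `run_screening`, the example query, and the record fields are hypothetical; how the warehouse schedules and stores queries is not described here.

```python
def run_screening(query, cases, already_reported):
    """Run the stored query and return only case IDs not reported in earlier runs.

    `already_reported` is updated in place so the next run skips these cases.
    """
    matches = {case["case_id"] for case in cases if query(case)}
    new_matches = matches - already_reported
    already_reported |= new_matches
    return sorted(new_matches)


def study_query(case):
    """Hypothetical stored study query: adults with a heart rate above 100."""
    return case["age"] >= 18 and case["heart_rate"] > 100


reported = set()

# Data as seen on two consecutive days (illustrative values).
day1 = [
    {"case_id": 1, "age": 70, "heart_rate": 110},
    {"case_id": 2, "age": 16, "heart_rate": 120},
]
day2 = day1 + [{"case_id": 3, "age": 55, "heart_rate": 104}]

first_run = run_screening(study_query, day1, reported)
second_run = run_screening(study_query, day2, reported)
print(first_run, second_run)
```

Keeping the set of already reported cases is one simple way to avoid proposing the same patient twice; in practice the daily trigger would come from a scheduler, which is outside this sketch.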
Example: A researcher needs to find patients matching multiple inclusion/exclusion criteria, but the data warehouse currently covers only a subset of them. The researcher therefore creates a query to find patient cases from previous years (01/2008 to 01/2012) with an age >= 18 and mitral regurgitation (e.g. based on a text search within discharge letters or on ICD diagnoses). The identified patients can then be de-pseudonymized (after ethics committee approval), enriched with further details from the hospital information system, and analyzed.
Reference: Wallenborn J, Störk S, Herrmann S, Kukuy O, Fette G, Puppe F, Gorski A, Hu K, Voelker W, Ertl G, Weidemann F. Prevalence of severe mitral regurgitation eligible for edge-to-edge mitral valve repair (MitraClip). Clin Res Cardiol. 2016 Aug;105(8):699-709.
Routine data transfer to study databases
Medical and epidemiological studies may require large amounts of data to be documented in verified study databases (EDE, electronic data entry systems). While this EDE documentation was, and often still is, done manually, it can be improved by automated data transfer pipelines, e.g. by using the structured and harmonized data of a data warehouse.
Example: The Acute Heart Failure Registry requires documentation of the full initial (index) hospitalization of patients hospitalized with acute heart failure, as well as their follow-ups. The index hospitalization is documented in a similar form for every single day of the stay, which can last more than 40 days. Various Case Report Forms (CRFs) were defined for each visit and each index hospitalization day. A pipeline was implemented that exports the full data of all registry patients from the data warehouse. The data feeds into an R framework that creates the variables of each Case Report Form and produces a single file per CRF. These CRF files are then used to pre-fill the EDE system. The whole import process runs in multiple steps, including several data verification steps.
Reference: Kaspar M, Ertl M, Fette G, Dietrich G, Toepfer M, Angermann CA, Störk S, Puppe F. Data Linkage from Clinical to Study Databases via an R Data Warehouse User Interface. Experiences from a Large Clinical Follow-up Study. Methods Inf Med. 2016 Aug 5;55(4):381-6.
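The transfer step from the registry example above (warehouse export, one file per CRF, verification) can be sketched as follows. The registry pipeline uses an R framework; Python is swapped in here purely for illustration, and the long-format export layout, the CRF names, the variables, and all function names are assumptions rather than the registry's actual definitions.

```python
import csv
import io
from collections import defaultdict

# Hypothetical long-format export: (patient_id, crf, day, variable, value).
EXPORT_ROWS = [
    ("P001", "daily_vitals", 1, "heart_rate", "98"),
    ("P001", "daily_vitals", 1, "weight_kg", "82.5"),
    ("P001", "daily_vitals", 2, "heart_rate", "91"),
    ("P001", "lab", 1, "nt_probnp", "4200"),
    ("P002", "daily_vitals", 1, "heart_rate", "105"),
]


def build_crf_tables(rows):
    """Group exported values by CRF, then by (patient, day)."""
    tables = defaultdict(dict)
    for patient, crf, day, variable, value in rows:
        tables[crf].setdefault((patient, day), {})[variable] = value
    return tables


def crf_to_csv(records):
    """Render one CRF as CSV text: one row per patient and day, one column per variable."""
    variables = sorted({name for record in records.values() for name in record})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["patient_id", "day", *variables])
    for (patient, day), record in sorted(records.items()):
        writer.writerow([patient, day, *(record.get(name, "") for name in variables)])
    return buffer.getvalue()


def find_missing(records, required):
    """Verification step: list (patient, day, variable) triples lacking a required value."""
    return [
        (patient, day, name)
        for (patient, day), record in sorted(records.items())
        for name in required
        if name not in record
    ]


tables = build_crf_tables(EXPORT_ROWS)
crf_files = {crf: crf_to_csv(records) for crf, records in tables.items()}
missing = find_missing(tables["daily_vitals"], ["heart_rate", "weight_kg"])
print(crf_files["daily_vitals"])
print(missing)
```

Producing one table per CRF mirrors the "single file per CRF" step, and the missing-value check stands in for one of the verification steps that would run before the EDE system is pre-filled.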