Liver fat prediction

Single patient form

Information submitted here will be used to estimate the likelihood of elevated liver fat for a single patient. The variables you provide determine which of three models is used for the estimate.

Physical measurements (required for all models)
Alcohol consumption (required for all models)
Diabetes status (required for all models)
Blood pressure, mm Hg (required for model 1)
Blood biomarkers (required for models 2 and 3; fasting insulin only for model 3)

Multiple patients

We are working on a bulk-test option, which will allow you to enter values for multiple patients in a template file and receive the results in a similar file. In the meantime, please use the single patient form.


Non-alcoholic fatty liver disease (NAFLD) is highly prevalent and causes serious health complications in type 2 diabetes (T2D) and beyond. Early diagnosis of NAFLD is important, as it can help prevent irreversible liver damage and, ultimately, hepatocellular carcinoma.

The prediction models presented here were developed for early-stage diagnosis of fatty liver and may be useful for both clinical diagnosis and research. The models were originally developed using data from the IMI DIRECT consortium, which includes a multicenter prospective cohort study of 3029 adults recently diagnosed with T2D or at high risk of developing the disease. See the citation[1] below for more details. The models were successfully validated in UK Biobank, where data permitted.

Prediction models

We developed three different models for NAFLD prediction, some of which are designed for use by clinicians. The models were trained to classify MRI-derived liver fat content (< 5% or >= 5%) using a machine learning method called random forest.

Model 1 includes six non-serological input variables: waist circumference, body mass index (BMI), systolic blood pressure, diastolic blood pressure, alcohol consumption and diabetes status. Model 2 includes eight input variables: waist circumference, BMI, triglycerides (TG), alanine aminotransferase (ALT), aspartate aminotransferase (AST), fasting glucose (or hemoglobin A1c (HbA1c) if fasting glucose is not available), alcohol consumption and diabetes status. Model 3 includes nine input variables: waist circumference, BMI, TG, ALT, AST, fasting glucose, fasting insulin, alcohol consumption and diabetes status.

The models are assessed based on their “predictive accuracy”, measured here as the area under the receiver operating characteristic curve (ROC AUC). An ROC AUC of 50% is equivalent to a coin toss, whereas an ROC AUC of 100% means the model separates patients with and without elevated liver fat perfectly. In clinical settings where one is screening patients for a condition like elevated liver fat, an ROC AUC of 80% is generally considered sufficient. However, models with lower values (70-80%) may still be useful for research studies and for public health interventions designed to raise awareness of possible liver disease. The ROC AUC is 73% for model 1, 79% for model 2 and 82% for model 3.


The prediction model used for your calculation is selected according to the minimum information each model requires. Even if you provided several variables, a more basic model may have been selected because one or more of the additional variables needed by a more complex model were missing. The accuracy of the prediction depends on the model used, with the more complex models being more accurate than the basic ones.
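The selection rule described above can be sketched in a few lines of Python. This is only an illustration, not the code running behind the form: the field names (`waist_circumference`, `systolic_bp`, `hba1c` and so on) and the `select_model` function are hypothetical, and the sketch assumes that the most complex model whose required variables are all present is the one chosen.

```python
from typing import Any, Mapping, Optional

# Hypothetical field names; the actual form may label these differently.
COMMON = ("waist_circumference", "bmi", "alcohol_consumption", "diabetes_status")
BLOOD_PRESSURE = ("systolic_bp", "diastolic_bp")
BIOMARKERS = ("tg", "alt", "ast")


def has_all(values: Mapping[str, Any], names) -> bool:
    """True if every named variable is present and not None."""
    return all(values.get(name) is not None for name in names)


def select_model(values: Mapping[str, Any]) -> Optional[int]:
    """Return 3, 2 or 1 for the most complex model the inputs support, else None.

    Assumption: more complex models take priority when their inputs are complete.
    """
    if not has_all(values, COMMON):
        return None
    if has_all(values, BIOMARKERS):
        if has_all(values, ("fasting_glucose", "fasting_insulin")):
            return 3
        # Model 2 accepts HbA1c when fasting glucose is not available.
        if values.get("fasting_glucose") is not None or values.get("hba1c") is not None:
            return 2
    if has_all(values, BLOOD_PRESSURE):
        return 1
    return None


# Illustrative values only, not real patient data.
patient = {
    "waist_circumference": 102, "bmi": 31.0,
    "alcohol_consumption": 2, "diabetes_status": True,
    "systolic_bp": 130, "diastolic_bp": 85,
    "tg": 1.8, "alt": 40, "ast": 30, "hba1c": 48,
}
print(select_model(patient))  # prints 2: fasting insulin is missing, so model 3 is ruled out
```

In this sketch a patient with complete blood biomarkers but no fasting insulin falls back to model 2, and a patient with no blood biomarkers falls back to model 1 if both blood pressure values are present.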

In our analysis, we estimate the probability that fatty liver is present. To make a class prediction, it is necessary to impose a cut-off above which fatty liver is deemed probable and below which it is considered improbable. The choice of cut-off influences sensitivity (the true positive rate), specificity (the true negative rate) and balanced accuracy (the average of sensitivity and specificity, which reflects how well individuals in each class are correctly classified). The suggested optimal cut-off, considering sensitivity, specificity and balanced accuracy, is 0.4 for the prediction models given here.
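To make the role of the cut-off concrete, the following minimal Python sketch turns predicted probabilities into class predictions and computes the three quantities named above. The function names and the example numbers are invented for illustration and are not results from the models; treating a probability exactly equal to the cut-off as a positive prediction is also an assumption.

```python
from typing import Dict, Sequence


def classify(probabilities: Sequence[float], cutoff: float = 0.4) -> list:
    """Label each probability as fatty liver probable (True) or improbable (False).

    Assumption: a probability equal to the cut-off counts as positive.
    """
    return [p >= cutoff for p in probabilities]


def evaluate(probabilities: Sequence[float], labels: Sequence[bool],
             cutoff: float = 0.4) -> Dict[str, float]:
    """Compute sensitivity, specificity and balanced accuracy at a given cut-off."""
    predictions = classify(probabilities, cutoff)
    tp = sum(1 for p, y in zip(predictions, labels) if p and y)
    fn = sum(1 for p, y in zip(predictions, labels) if not p and y)
    tn = sum(1 for p, y in zip(predictions, labels) if not p and not y)
    fp = sum(1 for p, y in zip(predictions, labels) if p and not y)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present to compute these metrics.")
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Made-up probabilities and labels, for illustration only.
probabilities = [0.9, 0.6, 0.45, 0.2, 0.1, 0.3, 0.5, 0.05]
labels = [True, True, True, True, False, False, False, False]
print(evaluate(probabilities, labels, cutoff=0.4))
```

Raising the cut-off in this sketch generally lowers sensitivity and raises specificity, while lowering it does the opposite, which is the trade-off a single cut-off has to balance.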


  1. Discovery of biomarkers for glycaemic deterioration before and after the onset of type 2 diabetes: descriptive characteristics of the epidemiological studies within the IMI DIRECT Consortium.
    Koivula RW, Forgie IM, Kurbasic A, Viñuela A, Heggie A, Giordano GN, Hansen TH, Hudson M, Koopman ADM, Rutters F, Siloaho M, Allin KH, Brage S, Brorsson CA, Dawed AY, De Masi F, Groves CJ, Kokkola T, Mahajan A, Perry MH, Rauh SP, Ridderstråle M, Teare HJA, Thomas EL, Tura A, Vestergaard H, White T, Adamski J, Bell JD, Beulens JW, Brunak S, Dermitzakis ET, Froguel P, Frost G, Gupta R, Hansen T, Hattersley A, Jablonka B, Kaye J, Laakso M, McDonald TJ, Pedersen O, Schwenk JM, Pavo I, Mari A, McCarthy MI, Ruetten H, Walker M, Pearson E, Franks PW; IMI DIRECT Consortium.
    Diabetologia. 2019 Sep;62(9):1601-1615. doi: 10.1007/s00125-019-4906-1. Epub 2019 Jun 15.
    PMID: 31203377


The work leading to this has received support from the Innovative Medicines Initiative Joint Undertaking under grant agreement n°115317 (DIRECT), resources of which are composed of financial contribution from the European Union's Seventh Framework Programme (FP7/2007-2013) and EFPIA companies’ in kind contribution.

This work was supported by a European Research Council award ERC-2015-CoG - 681742_NASCENT.