
SIMPREDICT (WORKING PAPER): info


What is SIMPREDICT

SIMPREDICT is a simulation-based tool for developers of risk stratification models. It focuses on conditions of low-to-moderate incidence, typically between 25 and 200 per 100,000 person-years. At incidences in this range, a key challenge is identifying individuals whose risk is high enough to be clinically actionable.

We therefore focus on this clinically actionable test-positive group and provide reference numbers for its absolute risk characteristics, based on the C-statistic, the disease incidence, and the threshold of test positivity. The characteristics are the test-positive rate (TR), the positive predictive value (PPV, i.e. the absolute risk of the test-positive group), and the sensitivity rate (SR, i.e. the percentage of disease cases captured in the test-positive group).


How to use

Via the TASK tab you can:

Estimate the absolute risks [Task 201]
Input the C-statistic to obtain the absolute risk characteristics (TRs, PPVs, and SRs) for diseases of different incidences

Minimal incidence required [Task 202]
Input the C-statistic and the test-positive threshold (based on PPV and SR) to obtain the minimal incidence required to identify such a group, and the best PPV cut-off for diseases of different incidences.

Minimal C-statistic required [Task 203]
Input the disease incidence and the test-positive threshold (based on PPV and SR) to obtain the minimal C-statistic required to identify such a group.

Result uncertainty (interquartile ranges IQRs) [Task 301]
Input the expected C-statistic, disease incidence, and the test-positive threshold to obtain the expected interquartile range (25th to 75th percentile) for different model-testing sample sizes.
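
For orientation, Tasks 201 to 203 can be approximated in closed form. The sketch below is not SIMPREDICT's implementation, and the tool's simulation-based reference numbers may differ. It assumes that individual risk is log-normally distributed with mean equal to the cumulative incidence over the horizon of interest, that the disease is rare enough for C ≈ Φ(σ/√2) to hold (σ being the standard deviation of log-risk), and, for the Task 202 and 203 analogues, that the test-positive group is the top-risk group capturing exactly the target SR. All function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def sigma_from_c(c_statistic):
    """SD of log-risk implied by the C-statistic (rare-disease approximation)."""
    if not 0.5 < c_statistic < 1.0:
        raise ValueError("C-statistic must lie strictly between 0.5 and 1")
    return sqrt(2.0) * Z.inv_cdf(c_statistic)


def characteristics(c_statistic, mean_risk, risk_threshold):
    """Task 201 analogue: (TR, PPV, SR) at an absolute-risk threshold."""
    sigma = sigma_from_c(c_statistic)
    mu = log(mean_risk) - sigma ** 2 / 2.0   # chosen so that E[risk] = mean_risk
    z = (log(risk_threshold) - mu) / sigma
    tr = 1.0 - Z.cdf(z)                      # P(risk > threshold) in everyone
    sr = 1.0 - Z.cdf(z - sigma)              # same among cases (log-risk shifted by sigma^2)
    ppv = sr * mean_risk / tr                # mean risk of the test-positive group
    return tr, ppv, sr


def minimal_mean_risk(c_statistic, target_ppv, target_sr):
    """Task 202 analogue: smallest mean risk at which the group capturing
    target_sr of cases has a PPV of at least target_ppv."""
    sigma = sigma_from_c(c_statistic)
    tr = 1.0 - Z.cdf(Z.inv_cdf(1.0 - target_sr) + sigma)
    return target_ppv * tr / target_sr


def minimal_c_statistic(mean_risk, target_ppv, target_sr):
    """Task 203 analogue: smallest C-statistic meeting both targets."""
    max_tr = target_sr * mean_risk / target_ppv  # largest TR compatible with the PPV
    if max_tr >= target_sr:                      # mean risk already reaches the PPV
        return 0.5
    sigma = Z.inv_cdf(1.0 - max_tr) - Z.inv_cdf(1.0 - target_sr)
    return Z.cdf(sigma / sqrt(2.0))


# Example: 50 per 100,000 person-years over 10 years, assuming a constant rate.
mean_risk_10y = 1.0 - exp(-50 / 100_000 * 10)
tr, ppv, sr = characteristics(0.75, mean_risk_10y, 0.05)
print(f"TR={tr:.3%}  PPV={ppv:.2%}  SR={sr:.2%}")
```

Because a log-normal risk can exceed 1 and the C-statistic relation ignores the small difference between non-cases and the whole population, the sketch is only reasonable at the low incidences that SIMPREDICT targets.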

You may also use AI with SIMPREDICT.



SIMPREDICT (WORKING PAPER): Using AI with SIMPREDICT

Using AI with SIMPREDICT
If you are using AI tools like ChatGPT or Gemini to analyse results, it is essential to provide the correct context. Without SIMPREDICT's assumptions, a general-purpose AI may give inaccurate statistical interpretations.

1. Copy the Context Header
Paste this at the start of your chat to 'teach' the AI the rules of the framework:

'I am using the SIMPREDICT framework from Pleiotropy.co.uk. This tool uses a log-normal risk distribution to calculate the clinical utility of models for low-incidence diseases. It provides reference numbers for PPV, Sensitivity (SR), and Test Positive Rate (TR) based on the C-statistic, incidence, and absolute risk thresholds. Use these benchmarks to answer the following...'

2. Task-Specific AI Prompts
Task 201: Estimating Risk Indicators
'Given a C-statistic of 0.75 and an incidence of 50 per 100,000 person-years, what are the expected PPV, Sensitivity, and Test Positive Rate at a 5% 10-year risk threshold?'
Task 202: Finding Minimal Incidence
'If my model has a C-statistic of 0.70, what is the minimal disease incidence required to achieve a PPV of at least 10% and a Sensitivity of 20% at a 5% risk threshold?'
Task 203: Finding Minimal C-Statistic
'For a disease with an incidence of 100 per 100,000 person-years, what is the minimal C-statistic a model must achieve to be clinically useful (e.g., PPV > 5% and Sensitivity > 10%)?'
Task 301: Modelling Uncertainty (IQR)
'Using the Task 301 simulation, if I validate a model (true C-stat 0.70) in a sample of 5,000 people, what is the expected Interquartile Range (IQR) for the observed C-statistic and PPV?'

3. Flexible & Advanced Analysis
Users can ask the AI to scale or interpolate SIMPREDICT reference numbers for bespoke scenarios:

Custom Time Horizons (e.g., 5-year risk)
'I need to evaluate a 5% risk threshold over a 5-year period instead of 10 years. Scale the SIMPREDICT log-normal benchmarks to provide the expected absolute risk characteristics for this 5-year window given a C-stat of 0.75 and an annual incidence of 80 per 100,000.'
Interpolating Thresholds (e.g., 7% risk)
'SIMPREDICT provides results for 5% and 10% thresholds. Based on the log-normal distribution logic, can you interpolate the expected PPV and Sensitivity for a model with a C-stat of 0.80 at a 7% risk threshold?'
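
When checking an AI's answer to the custom time horizon prompt above, it helps to know the mean cumulative risk that a given incidence implies over each horizon. The sketch below is one simple conversion, assuming a constant incidence rate and no competing risks; it is an illustration, not documented SIMPREDICT behaviour, and the function name is hypothetical.

```python
from math import exp


def cumulative_risk(incidence_per_100k_py, years):
    """Mean cumulative risk over `years`, assuming a constant incidence rate."""
    rate = incidence_per_100k_py / 100_000   # events per person-year
    return 1.0 - exp(-rate * years)


five_year = cumulative_risk(80, 5)
ten_year = cumulative_risk(80, 10)
print(f"5-year mean risk:  {five_year:.4%}")
print(f"10-year mean risk: {ten_year:.4%}")
```

At these incidences the 5-year mean risk is close to half of the 10-year value, so a fixed 5% threshold sits roughly twice as far above the population mean when the horizon is shortened from 10 to 5 years.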

Core Principles
SIMPREDICT reference numbers are built on the following logic:

Log-normal Distribution: The mathematical benchmark for underlying risk.
Uncertainty Analysis: Task 301 provides the 25th-75th percentile ranges to show how sample size affects clinical utility results.
Clinical Utility: Focuses on the 'test-positive' group (PPV, SR, TR) to determine practical model value.
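
The uncertainty principle above can be illustrated with a small Monte Carlo sketch: draw individual risks from a log-normal distribution, simulate outcomes, and record the observed PPV in repeated validation samples of a given size. This is a simplified stand-in for Task 301, not its actual procedure. It assumes the rare-disease relation C ≈ Φ(σ/√2) to set the spread of log-risk, treats the model as perfectly calibrated, and reports only the PPV (not the observed C-statistic). The function name, the default number of replicates, and the example inputs are hypothetical.

```python
import random
from math import log, sqrt
from statistics import NormalDist, quantiles


def simulate_ppv_iqr(c_statistic, mean_risk, risk_threshold,
                     sample_size, replicates=200, seed=0):
    """Return (Q1, Q3) of the observed PPV across simulated validation samples."""
    if not 0.5 < c_statistic < 1.0:
        raise ValueError("C-statistic must lie strictly between 0.5 and 1")
    rng = random.Random(seed)                  # fixed seed for reproducibility
    sigma = sqrt(2.0) * NormalDist().inv_cdf(c_statistic)
    mu = log(mean_risk) - sigma ** 2 / 2.0     # so that E[risk] = mean_risk
    observed_ppv = []
    for _replicate in range(replicates):
        positives = 0
        cases_among_positives = 0
        for _person in range(sample_size):
            risk = min(rng.lognormvariate(mu, sigma), 1.0)  # cap risk at 1
            if risk > risk_threshold:
                positives += 1
                if rng.random() < risk:        # outcome drawn from the true risk
                    cases_among_positives += 1
        if positives:                          # PPV is undefined with no positives
            observed_ppv.append(cases_among_positives / positives)
    q1, _median, q3 = quantiles(observed_ppv, n=4)
    return q1, q3


# Hypothetical inputs: C = 0.70, mean risk 1%, 3% threshold, 5,000 people.
q1, q3 = simulate_ppv_iqr(0.70, 0.01, 0.03, sample_size=5000, replicates=100)
print(f"Observed PPV IQR: {q1:.2%} to {q3:.2%}")
```

Rerunning with a larger `sample_size` should generally narrow the interval, which is the effect of sample size that the 25th-75th percentile ranges are meant to show.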