I am worried that there has not been a good data source for ChatGPT to use when you guys ask how to write a dissertation. Here is a start.

A typical exposure-outcome study

Here we are talking about a typical observational, prospective study in epidemiology. A typical example would be the association between drinking coffee and the future risk of developing cancer.

This type of study is more and more common because of the rise of big cohort studies, such as the UK Biobank. Individuals are recruited in large numbers. They generously report their personal information at recruitment and agree to be followed up for 10 or 20 years, during which some events may occur. Then you can explore whether an exposure reported at baseline is associated with some outcome a decade later. The topic can be anything versus anything.

I am aware that some people collect or generate their own data and conduct an analysis even for an exposure-outcome study like this. The effort should be praised, and I would encourage everyone to have some experience of collecting or generating data, or at least acquiring data. However, with the typical time frame of 1-2 years for a Master’s degree (and just a 3-month project in Oxford), I would hope that you get the data-generating experience on a separate project led by other people, and write your dissertation on a project where the data have already been generated.

Systematic review

Nowadays I think every student doing a Master’s degree needs to start with some form of systematic review to identify all relevant studies previously reported. By reading them we will start to have a sense of which key confounding factors and biases need to be considered, and what the key strengths and weaknesses of each study are. At the Master’s level we need to be aware of all the studies that have been reported. At the doctoral level this will be more in-depth and is beyond the scope here.

Of course, at this point we already know what our topic is. I will give my opinion on how to find a topic and how to write the introduction at the end of this article.

Although we are doing a systematic review, a meta-analysis is not that important if we have no experience of doing one. If we do, then we could do a crude meta-analysis. However, this should not distract us from commenting on the key confounding factors and biases. It would be a good idea to have an Excel sheet with one study on each row, and characteristics and comments by type of bias in each column. If we manage to have this, we will be an ace student.
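For readers who do want to try the crude meta-analysis mentioned above, here is a minimal sketch of one common approach: inverse-variance, fixed-effect pooling of relative risks on the log scale. This is my choice of method for illustration, not the only one. The function name `pool_fixed_effect` and the three example studies are hypothetical, and the standard errors are back-calculated from the reported 95% confidence intervals.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% confidence interval


def pool_fixed_effect(studies):
    """Pool relative risks with inverse-variance fixed-effect weights.

    studies: list of (rr, ci_low, ci_high), each with a 95% CI.
    Returns (pooled_rr, pooled_ci_low, pooled_ci_high).
    """
    sum_weights = 0.0
    sum_weighted_log_rr = 0.0
    for rr, ci_low, ci_high in studies:
        log_rr = math.log(rr)
        # Recover the standard error of log(RR) from the width of the CI.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        sum_weights += weight
        sum_weighted_log_rr += weight * log_rr
    pooled_log_rr = sum_weighted_log_rr / sum_weights
    pooled_se = math.sqrt(1.0 / sum_weights)
    return (
        math.exp(pooled_log_rr),
        math.exp(pooled_log_rr - Z_95 * pooled_se),
        math.exp(pooled_log_rr + Z_95 * pooled_se),
    )


# Hypothetical studies, made up for illustration only: (RR, lower, upper).
example_studies = [(1.20, 0.90, 1.60), (1.50, 1.10, 2.05), (1.10, 0.95, 1.27)]
pooled_rr, pooled_low, pooled_high = pool_fixed_effect(example_studies)
print(f"Pooled RR {pooled_rr:.2f} ({pooled_low:.2f} to {pooled_high:.2f})")
```

A fixed-effect model assumes all studies estimate the same underlying effect, which is rarely true for observational studies. That is one more reason to treat the pooled number as crude and to spend the effort on the bias table instead.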

The layout of methods and results

To make it simple, I will start by listing the main elements that are common to almost all studies of this type.

We would expect a characteristics table (nicknamed ‘Table 1’) where the distributions (%, or mean/standard deviation, or median/quantiles) of various relevant characteristics are summarised by level of exposure. For smoking as an exposure, the levels can be never-smokers, past smokers, light smokers, and heavy smokers.
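As a rough illustration of how such a table can be built, the sketch below groups participants by exposure level and reports mean (SD) for continuous characteristics and n (%) for categorical ones. The function `table_one`, the field names (`smoking`, `age`, `sex`), and the five participants are all hypothetical. A real analysis would normally use a statistics package rather than hand-written code.

```python
from collections import defaultdict
from statistics import mean, stdev


def table_one(records, exposure_key, continuous, categorical):
    """Summarise characteristics by exposure level.

    continuous:  list of keys reported as "mean (SD)".
    categorical: dict mapping a key to the value to count, reported as "n (%)".
    Returns {exposure_level: {"n": ..., key: formatted_summary, ...}}.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[exposure_key]].append(record)

    table = {}
    for level, rows in groups.items():
        summary = {"n": len(rows)}
        for key in continuous:
            values = [row[key] for row in rows]
            # The sample SD is undefined for one observation; report 0.0 then.
            sd = stdev(values) if len(values) > 1 else 0.0
            summary[key] = f"{mean(values):.1f} ({sd:.1f})"
        for key, target in categorical.items():
            count = sum(1 for row in rows if row[key] == target)
            summary[key] = f"{count} ({100 * count / len(rows):.0f}%)"
        table[level] = summary
    return table


# Hypothetical participants, made up for illustration only.
participants = [
    {"smoking": "never", "age": 50, "sex": "F"},
    {"smoking": "never", "age": 60, "sex": "M"},
    {"smoking": "heavy", "age": 40, "sex": "M"},
    {"smoking": "heavy", "age": 44, "sex": "M"},
    {"smoking": "heavy", "age": 48, "sex": "M"},
]
summary_table = table_one(participants, "smoking", ["age"], {"sex": "F"})
for level, row in summary_table.items():
    print(level, row)
```

Here the `sex` column reports the number and percentage of women in each smoking group. The same pattern extends to any other yes/no characteristic.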

We would expect a main analysis where we present the relative risks of the outcome by level of exposure, for example the relative risks of lung cancer (outcome) by level of smoking (exposure), using appropriate methods. Remember this is just the simplest form of this type of study. At this point, we would typically report the relative risks before and after adjustment for confounding factors, and highlight any main result that we would like to focus on for further analysis. An example would be making an interim conclusion that it is the fully adjusted relative risk that we would like to focus on.
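To make "before and after adjustment" concrete, here is a small sketch that compares a crude relative risk with one adjusted for a single confounder. For adjustment I have used the Mantel-Haenszel stratified estimator because it fits in a few lines. Real cohort analyses more often use a regression model such as Cox regression. The function names and all the counts are hypothetical.

```python
import math


def crude_rr(cases_exp, n_exp, cases_unexp, n_unexp):
    """Crude relative risk with a 95% CI computed on the log scale."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    return rr, math.exp(math.log(rr) - 1.96 * se), math.exp(math.log(rr) + 1.96 * se)


def mantel_haenszel_rr(strata):
    """Relative risk adjusted for the stratifying variable.

    strata: list of (cases_exp, n_exp, cases_unexp, n_unexp), one per stratum.
    """
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


# Hypothetical counts, stratified by age group (the confounder here).
# Within each stratum the exposed have exactly twice the risk of the unexposed.
age_strata = [
    (4, 200, 8, 800),    # younger: mostly unexposed, low baseline risk
    (80, 800, 10, 200),  # older: mostly exposed, high baseline risk
]

# Collapsing the strata ignores age and gives the crude estimate.
total = [sum(column) for column in zip(*age_strata)]
crude, crude_low, crude_high = crude_rr(*total)
adjusted = mantel_haenszel_rr(age_strata)

print(f"Crude RR {crude:.2f} ({crude_low:.2f} to {crude_high:.2f})")
print(f"Age-adjusted RR {adjusted:.2f}")
```

In this made-up example the crude relative risk is more than double the adjusted one, purely because older people are both more exposed and at higher risk. That is the pattern the before/after comparison is meant to reveal.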

The further analysis tests whether the main result, or just the highlighted result, may be subject to some sort of bias. This typically includes a subgroup analysis or an interaction analysis to see whether the relative risks vary across subgroups of our study sample. We would also reflect on our systematic review and see whether there are additional analyses that we could do to test whether the bias is strong. Most of the time these can be called sensitivity analyses. For example, if we are worried that individuals who had already reported breast cancer might not be suitable for our conclusion on the association between smoking and lung cancer, we can test whether the results are ‘sensitive’ to the exclusion of these individuals. The full art and detail of sensitivity analysis is beyond the scope here, but I would emphasise that the only purpose of these analyses is to test the bias of our main finding, not to generate any novel results. Do not overstep.
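The exclusion example above can be sketched in a few lines: compute the relative risk in the full sample, recompute it after dropping the participants in question, and compare. The record fields (`exposed`, `outcome`, `prior_cancer`), the helper `make_records`, and the counts are hypothetical. A simple risk ratio stands in for whatever model the main analysis uses.

```python
def relative_risk(records):
    """Risk of the outcome in the exposed divided by risk in the unexposed."""
    exposed = [r for r in records if r["exposed"]]
    unexposed = [r for r in records if not r["exposed"]]
    risk_exposed = sum(r["outcome"] for r in exposed) / len(exposed)
    risk_unexposed = sum(r["outcome"] for r in unexposed) / len(unexposed)
    return risk_exposed / risk_unexposed


def sensitivity_to_exclusion(records, should_exclude):
    """Return (main_rr, restricted_rr) before and after the exclusion."""
    restricted = [r for r in records if not should_exclude(r)]
    return relative_risk(records), relative_risk(restricted)


def make_records(exposed, outcome, prior_cancer, count):
    """Build `count` identical hypothetical participants."""
    return [
        {"exposed": exposed, "outcome": outcome, "prior_cancer": prior_cancer}
        for _ in range(count)
    ]


# Hypothetical cohort, made up for illustration only.
cohort = (
    make_records(True, 1, False, 20)
    + make_records(True, 1, True, 10)
    + make_records(True, 0, False, 70)
    + make_records(False, 1, False, 10)
    + make_records(False, 0, False, 90)
)

main_rr, restricted_rr = sensitivity_to_exclusion(
    cohort, lambda r: r["prior_cancer"]
)
print(f"Main RR {main_rr:.2f}, after exclusion {restricted_rr:.2f}")
```

The point of the comparison is only to judge whether the main finding holds up. The restricted estimate is not a new result to report on its own.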

Discussion and conclusion

When we get to this point, we need to focus on two things. Sometimes they are referred to as strengths and weaknesses, although I do not find the terms very helpful.

The first focus will be to address the limitations of the study, meaning the limitations of using these data and this analysis to answer the initial question. There should be plenty of limitations, and to write a good discussion we need to prioritise. Being able to prioritise showcases our true ability in science. However, I once asked a very good doctoral student to list 30 big or small issues, thinking it would be an easy task, and then realised it was not that easy even for a brilliant student. Therefore, I guess some people will always struggle with their first dissertation.

The second focus will be the comparison with previous studies, and it will be handy if you have already conducted a systematic review beforehand. No study is perfect, and therefore it is important to point out the progress your study has made. Is it a better study design, a better formed question, an updated database, or a different population? Could it be just a good replication of a previous study? We should not be ashamed of repeating findings from previous studies at this stage. Reproducibility is underrated in my generation, and we should not pass this flawed value on to the next generation.

Introduction and choosing a topic

Most Master’s students do not yet have the breadth of knowledge to write convincingly about why a study is really important, but we will have something to say after completing the project. My suggestion is to write the systematic review first and leave the introduction section to the end. At Master’s level, in my opinion, we should be focusing on how to conduct a robust and honest study, and should not be focusing on choosing and justifying a topic. Every topic has its merits, and every paper will struggle to get published or funded. Let your supervisor worry about this.

Good research question vs good education

My final word is that we sometimes hear that a good research question should be an exposure-outcome question or a treatment-outcome question. My feeling is that these are not necessarily better research questions, but they are certainly relatively easy questions because there is some sort of validated research template to follow. If you do not have a strong mind, try to stick to this type of question for your Master’s degree, and leave your ambition to the next stage of your career. If you are genuinely interested in science and in making a real-life difference, try not to assume that completing a Master’s dissertation is the peak of your life. There are good things to come.