How to do your dissertation secondary research in 4 steps

If you are reading this guide, you are very likely doing secondary research for your dissertation, rather than primary. If this is indeed you, then here's the good news: secondary research is the easiest type of research! Congratulations!

In a nutshell, secondary research is far simpler than primary research. So simple, in fact, that we have been able to explain how to do it in just 4 steps (see below). If nothing else, secondary research avoids the oh-so-tiring efforts usually involved with primary research, such as recruiting your participants, choosing and preparing your measures, and spending days (or months) collecting your data.

That said, you do still need to know how to do secondary research. Which is what you're here for. So, go make a decent-sized mug of your favourite hot beverage (consider a glass of water, too) then come back and get comfy.

Here's what we'll cover in this guide:

The basics: What's secondary research all about?
Understanding secondary research
Advantages of secondary research
Disadvantages of secondary research
Methods and purposes of secondary research
Types of secondary data
Sources of secondary data
Secondary research process in 4 steps
Step 1: Develop your research question(s)
Step 2: Identify a secondary data set
Step 3: Evaluate a secondary data set
Step 4: Prepare and analyse secondary data
Summary

The basics: What's secondary research all about?

Understanding secondary research

So, what exactly do we mean when we say “secondary research”?

To answer this question, let’s first recall what we mean by primary research. As you probably already know, primary research is when the researcher collects the data himself or herself. The researcher uses so-called “real-time” data, which means that the data is collected during the course of a specific research project and is under the researcher’s direct control.

In contrast, secondary research involves data that has been collected by somebody else previously. This type of data is called “past data” and is usually accessible via past researchers, government records, and various online and offline resources.

So to recap, secondary research involves re-analysing, interpreting, or reviewing past data. The role of the researcher is always to specify how this past data informs his or her current research.

In contrast to primary research, secondary research is easier, particularly because the researcher is less involved with the actual process of collecting the data. Furthermore, secondary research requires less time and less money (i.e., you don’t need to provide your participants with compensation for participating or pay for any other costs of the research).

TABLE 1 outlines the differences between primary and secondary research:

Comparison basis	PRIMARY RESEARCH	SECONDARY RESEARCH
Definition	Involves collecting factual, first-hand data at the time of the research project	Involves the use of data that was collected by somebody else in the past
Type of data	Real-time data	Past data
Conducted by	The researcher himself/herself	Somebody else
Needs	Addresses specific needs of the researcher	May not directly address the researcher’s needs
Involvement	Researcher is very involved	Researcher is less involved
Completion time	Long	Short
Cost	High	Low

Advantages of secondary research

Whatever type of research you are conducting, always be aware of its strengths and limitations. If you look at the table above, you should already be able to discern some advantages of secondary research.

One of the most obvious advantages is that, compared to primary research, secondary research is inexpensive. Primary research usually requires spending a lot of money. For instance, members of the research team need to be paid salaries. There are often travel and transportation costs. You may need to pay for office space and equipment, and compensate your participants for taking part. There may be other overhead costs too.

These costs do not exist when doing secondary research. Although researchers may need to purchase secondary data sets, this is usually less costly than conducting the research from scratch.

As an undergraduate or graduate student, you won't need your dissertation project to be an expensive endeavour. It is therefore useful to know that you can reduce costs further by using freely available secondary data sets.

But this is far from the only consideration.

Most students value another important advantage of secondary research, which is that secondary research saves you time. Primary research usually requires months spent recruiting participants, providing them with questionnaires, interviews, or other measures, cleaning the data set, and analysing the results. With secondary research, you can skip most of these daunting tasks; instead, you merely need to select, prepare, and analyse an existing data set.

Moreover, you probably won’t need a lot of time to obtain your secondary data set, because secondary data is usually easily accessible. In the past, students needed to go to libraries and spend hours trying to find a suitable data set. New technologies make this process much less time-consuming. In most cases, you can find your secondary data through online search engines or by contacting previous researchers via email.

A third important advantage of secondary research is that you can base your project on a large scope of data. If you wanted to obtain a large data set yourself, you would need to dedicate an immense amount of effort. What's more, if you were doing primary research, you would be very unlikely to be able to use longitudinal data in your graduate or undergraduate project, since collecting it would take you years. This is because longitudinal data involves assessing and re-assessing a group of participants over long periods of time.

When using secondary data, however, you have an opportunity to work with immensely large data sets that somebody else has already collected. Thus, you can also deal with longitudinal data, which may allow you to explore trends and changes of phenomena over time.

With secondary research, you are relying not only on a large scope of data, but also on professionally collected data. This is yet another advantage of secondary research. For instance, data that you will use for your secondary research project has been collected by researchers who are likely to have had years of experience in recruiting representative participant samples, designing studies, and using specific measurement tools.

If you had collected this data yourself, your own data set would probably have more flaws, simply because of your lower level of expertise when compared to these professional researchers.

Disadvantages of secondary research

By now you may have concluded that using secondary data is a perfect option for your graduate or undergraduate dissertation. However, let’s not underestimate the disadvantages of doing secondary research.

The first such disadvantage is that your secondary data may be, to a greater or lesser extent, inappropriate for your own research purposes. This is simply because you have not collected the data yourself.

When you collect your data personally, you do so with a specific research question in mind. This makes it easy to obtain the relevant information. However, secondary data was always collected for the purposes of fulfilling other researchers’ goals and objectives.

Thus, although secondary data may provide you with a large scope of professionally collected data, this data is unlikely to be fully appropriate to your own research question. There are several reasons for this. For instance, you may be interested in the data of a particular population, in a specific geographic region, and collected during a specific time frame. However, your secondary data may have focused on a slightly different population, may have been collected in a different geographical region, or may have been collected a long time ago.

Apart from being potentially inappropriate for your own research purposes, secondary data could have a different format than you require. For instance, you might have preferred participants’ age to be in the form of a continuous variable (i.e., you want your participants to have indicated their specific age). But the secondary data set may contain a categorical age variable; for example, participants might have indicated an age group they belong to (e.g., 20-29, 30-39, 40-49, etc.). Or another example: A secondary data set may contain too few ethnic categories (e.g., “White” and “Other”), while you would ideally want a wider range of ethnic categories (e.g., “White”, “Black or African American”, “American Indian”, and “Asian”). Differences such as these mean that secondary data may not be perfectly appropriate for your research.
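To illustrate the format problem with a minimal sketch: exact ages can always be collapsed into age bands, but bands cannot be turned back into exact ages. So if you combine your own data with a banded secondary data set, it is your data that has to be recoded. The function name, the band limits, and the example ages below are illustrative assumptions, not part of any particular data set.

```python
def age_to_group(age, lowest=20, highest=49, width=10):
    """Map an exact age to a band label such as '20-29'.

    The default bands (20-29, 30-39, 40-49) mirror the example in the text;
    ages outside them are labelled 'other'.
    """
    if age < lowest or age > highest:
        return "other"
    start = lowest + ((age - lowest) // width) * width
    return f"{start}-{start + width - 1}"


# Hypothetical exact ages from your own (primary) data collection.
exact_ages = [23, 31, 47, 29, 40]
age_groups = [age_to_group(age) for age in exact_ages]
print(age_groups)
```

Recoding in this direction loses detail, so it is worth noting in your methodology that age was analysed as a categorical variable because of the format of the secondary data.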

The above two disadvantages may lead to yet another one: the existing data set may not answer your own research question(s) in an ideal way. As noted above, secondary data was collected with a different research question in mind, and this may limit its application to your own research purpose.

Unfortunately, the list of disadvantages does not end here. An additional weakness of secondary data is that you lack control over the quality of the data. All researchers need to establish that their data is reliable and valid. But if the original researchers did not establish the reliability and validity of their data, this may limit its reliability and validity for your research as well. To establish reliability and validity, you are usually advised to critically evaluate how the data was gathered, analysed, and presented.

But here lies the final disadvantage of doing secondary research: original researchers may fail to provide sufficient information on how their research was conducted. You might be faced with a lack of information on recruitment procedures, sample representativeness, data collection methods, employed measurement tools and statistical analyses, and the like. This may require you to take extra steps to obtain such information, if that is possible at all.

TABLE 2 provides a full summary of advantages and disadvantages of secondary research:

ADVANTAGES	DISADVANTAGES
Inexpensive: Conducting secondary research is much cheaper than doing primary research	Inappropriateness: Secondary data may not be fully appropriate for your research purposes
Saves time: Secondary research takes much less time than primary research	Wrong format: Secondary data may have a different format than you require
Accessibility: Secondary data is usually easily accessible from online sources	May not answer your research question: Secondary data was collected with a different research question in mind
Large scope of data: You can rely on immensely large data sets that somebody else has collected	Lack of control over the quality of data: Secondary data may lack reliability and validity, which is beyond your control
Professionally collected data: Secondary data has been collected by researchers with years of experience	Lack of sufficient information: Original authors may not have provided sufficient information on various research aspects

Methods and purposes of secondary research

So far, we have defined secondary research and outlined its advantages and disadvantages.

At this point, we should ask: “What are the methods of secondary research?” and “When do we use each of these methods?” Here, we can differentiate between three methods of secondary research: using a secondary data set in isolation, combining two secondary data sets, and combining secondary and primary data sets. Let’s outline each of these separately, and also explain when to use each of these methods.

First, you can use a secondary data set in isolation – that is, without combining it with other data sets. You dig and find a data set that is useful for your research purposes and then base your entire research on that set of data. You do this when you want to re-assess a data set with a different research question in mind.

Let’s illustrate this with a simple example. Suppose that, in your research, you want to investigate whether pregnant women of different nationalities experience different levels of anxiety during different pregnancy stages. Based on the literature, you have formed an idea that nationality may matter in this relationship between pregnancy and anxiety.

If you wanted to test this relationship by collecting the data yourself, you would need to recruit many pregnant women of different nationalities and assess their anxiety levels throughout their pregnancy. It would take you at least a year to complete this research project.

Instead of undertaking this long endeavour, you thus decide to find a secondary data set – one that investigated (for instance) a range of difficulties experienced by pregnant women in a nationwide sample. The original research question that guided this research could have been: “to what extent do pregnant women experience a range of mental health difficulties, including stress, anxiety, mood disorders, and paranoid thoughts?” The original researchers might have outlined women’s nationality, but weren’t particularly interested in investigating the link between women’s nationality and anxiety at different pregnancy stages. You are, therefore, re-assessing their data set with your own research question in mind.

Your research may, however, require you to combine two secondary data sets. You will use this kind of methodology when you want to investigate the relationship between certain variables in two data sets or when you want to compare findings from two past studies.

To take an example: One of your secondary data sets may focus on a target population’s tendency to smoke cigarettes, while the other data set focuses on the same population’s tendency to drink alcohol. In your own research, you may thus be looking at whether there is a correlation between smoking and drinking among this population.
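As a rough sketch of how such an analysis might look, the Python below links two data sets by a shared participant identifier and computes a Pearson correlation. It assumes that both data sets contain the same individuals and a common ID, which is not guaranteed for real secondary data. The IDs and values are made up purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Invented data: cigarettes per day and alcohol units per week, keyed by participant ID.
smoking = {"p01": 0, "p02": 5, "p03": 10, "p04": 20}
drinking = {"p02": 3, "p03": 6, "p04": 14, "p05": 2}

# Only participants who appear in BOTH data sets can be analysed together.
shared_ids = sorted(smoking.keys() & drinking.keys())
r = pearson([smoking[i] for i in shared_ids], [drinking[i] for i in shared_ids])
print(shared_ids, round(r, 2))
```

If the two data sets cannot be linked at the level of individual participants, you can only compare group-level figures rather than correlate the two variables.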

Here is a second example: Your two secondary data sets may focus on the same outcome variable, such as the degree to which people go to Greece for a summer vacation. However, one data set could have been collected in Britain and the other in Germany. By comparing these two data sets, you can investigate which nation tends to visit Greece more.

Finally, your research project may involve combining primary and secondary data. You may decide to do this when you want to obtain existing information that would inform your primary research.

Let’s use another simple example and say that your research project focuses on American versus British people’s attitudes towards racial discrimination. Let’s say that you were able to find a recent study that investigated Americans’ attitudes of this kind, which were assessed with a certain set of measures. However, your search finds no recent studies on Britons’ attitudes. Let’s also say that you live in London and that it would be difficult for you to assess Americans’ attitudes on the topic, but clearly much more straightforward to conduct primary research on British attitudes.

In this case, you can simply reuse the data from the American study and adopt exactly the same measures with your British participants. Your secondary data is being combined with your primary data. Alternatively, you may combine these types of data when the role of your secondary data is to outline descriptive information that supports your research. For instance, if your project is focusing on attitudes towards McDonald’s food, you may want to support your primary research with secondary data that outlines how many people eat McDonald’s in your country of choice.

TABLE 3 summarises particular methods and purposes of secondary research:

METHOD	PURPOSE
Using secondary data set in isolation	Re-assessing a data set with a different research question in mind
Combining two secondary data sets	Investigating the relationship between variables in two data sets or comparing findings from two past studies
Combining secondary and primary data sets	Obtaining existing information that informs your primary research

Types of secondary data

The two most common types of secondary data are, as with data in general, quantitative and qualitative. Secondary research can, therefore, be conducted by using either quantitative or qualitative data sets.

We have already provided several examples of using quantitative secondary data above. This type of data is used when the original study has investigated a population’s tendency to smoke or drink alcohol, the degree to which people from different nationalities go to Greece for their summer vacation, or the degree to which pregnant women experience anxiety.

In all these examples, outcome variables were assessed by questionnaires, and thus the obtained data was numerical.

Quantitative secondary research is much more common than qualitative secondary research. However, this is not to say that you cannot use qualitative secondary data in your research project. This type of secondary data is used when you want the previously-collected information to inform your current research. More specifically, it is used when you want to test the information obtained through qualitative research by implementing a quantitative methodology.

For instance, a past qualitative study might have focused on the reasons why people choose to live on boats. This study might have interviewed some 30 participants and noted the four most important reasons people live on boats: (1) they can lead a transient lifestyle, (2) they have an increased sense of freedom, (3) they feel that they are “world citizens”, and (4) they can more easily visit their family members who live in different locations. In your own research, you can therefore reuse this qualitative data to form a questionnaire, which you then give to a larger population of people who live on boats. This will help you to generalise the previously-obtained qualitative results to a broader population.

Importantly, you can also re-assess a qualitative data set in your research, rather than using it as a basis for your quantitative research. Let’s say that your research focuses on the kind of language that people who live on boats use when describing their transient lifestyles. The original research did not focus on this research question per se – however, you can reuse the information from interviews to “extract” the types of descriptions of a transient lifestyle that were given by participants.

TABLE 4 highlights the two main types of secondary data and their associated purposes:

TYPES	PURPOSES
Quantitative	Both can be used when you want to (a) inform your current research with past data, and (b) re-assess a past data set
Qualitative	Both can be used when you want to (a) inform your current research with past data, and (b) re-assess a past data set

Sources of secondary data

The two most common types of secondary data sources are labelled as internal and external.

Internal sources of data are those that are internal to the organisation in question. For instance, if you are doing a research project for an organisation (or research institution) where you are an intern, and you want to reuse some of their past data, you would be using internal data sources.

The benefit of using these sources is that they are easily accessible and there is no associated financial cost of obtaining them.

External sources of data, on the other hand, are those that are external to an organisation or a research institution. This type of data has been collected by “somebody else”, in the literal sense of the term. The benefit of external sources of data is that they provide comprehensive data – however, you may sometimes need more effort (or money) to obtain it.

Let’s now focus on different types of internal and external secondary data sources.

There are several types of internal sources. For instance, if your research focuses on an organisation’s profitability, you might use their sales data. Each organisation keeps track of its sales records, and thus your data may provide information on sales by geographical area, types of customer, product prices, types of product packaging, time of the year, and the like.

Alternatively, you may use an organisation’s financial data. The purpose of using this data could be to conduct a cost-benefit analysis and understand the economic opportunities or outcomes of hiring more people, buying more vehicles, investing in new products, and so on.

Another type of internal data is transport data. Here, you may focus on outlining the safest and most effective transportation routes or vehicles used by an organisation.

Alternatively, you may rely on marketing data, where your goal would be to assess the benefits and outcomes of different marketing operations and strategies.

Some other ideas would be to use customer data to ascertain the ideal type of customer, or to use safety data to explore the degree to which employees comply with an organisation’s safety regulations.

The list of internal sources of secondary data can be extensive; the most important thing to remember is that this data comes from the very organisation within which you are conducting your research.

The list of external secondary data sources can be just as extensive. One example is the data obtained through government sources. These can include social surveys, health data, agricultural statistics, energy expenditure statistics, population censuses, import/export data, production statistics, and the like. Government agencies tend to conduct a lot of research and therefore cover almost any kind of topic you can think of.

National and international institutions, including banks, trade unions, universities, and health organisations, are another external source of secondary data. As with government, such institutions dedicate a lot of effort to conducting up-to-date research, so you simply need to find an organisation that has collected the data on your own topic of interest.

Alternatively, you may obtain your secondary data from trade, business, and professional associations. These usually have data sets on business-related topics and are likely to be willing to provide you with secondary data if they understand the importance of your research. If your research is built on past academic studies, you may also rely on scientific journals as an external data source.

Once you have specified what kind of secondary data you need, you can contact the authors of the original study.

As a final example of a secondary data source, you can rely on data from commercial research organisations. These usually focus their research on media statistics and consumer information, which may be relevant if, for example, your research is within media studies or you are investigating consumer behaviour.

TABLE 5 summarises the two sources of secondary data and associated examples:

INTERNAL SOURCES	EXTERNAL SOURCES
Definition: Internal to the organisation or research institution where you conduct your research	Definition: External to the organisation or research institution where you conduct your research
Examples: • Sales data • Financial data • Transport data • Marketing data • Customer data • Safety data	Examples: • Government sources • National and international institutions • Trade, business, and professional associations • Scientific journals • Commercial research organisations

Secondary research process in 4 steps

In previous sections of this guide, we have covered some basic aspects of doing secondary research. We have defined secondary data, outlined its advantages and disadvantages, introduced the methods and purposes of secondary research, and outlined the types and sources of secondary data.

At this point, you should have a clearer understanding of secondary research in general terms.

Now it may be useful to focus on the actual process of doing secondary research. This next section is organised to introduce you to each step of this process, so that you can rely on this guide while planning your study. At the end of this blog post, in Table 6, you will find a summary of all the steps of doing secondary research.

Step 1: Develop your research question(s)

Secondary research begins exactly like any type of research: by developing your research question(s).

For an undergraduate thesis, you are often provided with a specific research question by your supervisor. But for most other types of research, and especially if you are doing your graduate thesis, you need to arrive at a research question yourself.

The first step here is to specify the general research area in which your research will fall. For example, you may be interested in the topic of anxiety during pregnancy, or tourism in Greece, or transient lifestyles. Since we have used these examples previously, it may be useful to rely on them again to illustrate our discussion.

Once you have identified your general topic, your next step consists of reading through existing papers to see whether there is a gap in the literature that your research can fill. At this point, you may discover that previous research has not investigated national differences in the experiences of anxiety during pregnancy, or national differences in a tendency to go to Greece for a summer vacation, or that there is no literature generalising the findings on people’s choice to live on boats.

Having found your topic of interest and identified a gap in the literature, you need to specify your research question. In our three examples, research questions would be specified in the following manner: (1) “Do women of different nationalities experience different levels of anxiety during different stages of pregnancy?”, (2) “Are there any differences in an interest in Greek tourism between Germans and Britons?”, and (3) “Why do people choose to live on boats?”.

Step 2: Identify a secondary data set

As we mentioned above, most research begins by specifying what is already known on the topic and what knowledge seems to be missing. This process involves considering the kind of data previously collected on the topic.

It is at this point, after reviewing the literature and specifying your research questions, that you may decide to rely on secondary data. You will do this if you discover that there is past data that would be perfectly reusable in your own research, therefore helping you to answer your research question more thoroughly (and easily).

But how do you discover if there is past data that could be useful for your research? You do this through reviewing the literature on your topic of interest. During this process, you will identify other researchers, organisations, agencies, or research centres that have explored your research topic.

Somewhere there, you may discover a useful secondary data set. You then need to contact the original authors and ask for permission to use their data. Note, however, that this applies only if you are relying on external sources of secondary data. If you are doing your research internally (i.e., within a particular organisation), you don’t need to search through the literature for a secondary data set – you can just reuse some past data that was collected within the organisation itself.

In any case, you need to ensure that a secondary data set is a good fit for your own research question. Once you have established that it is, you need to specify the reasons why you have decided to rely on secondary data.

For instance, your choice to rely on secondary data in the above examples might be as follows: (1) A recent study has focused on a range of mental difficulties experienced by women in a multinational sample and this data can be reused; (2) There is existing data on Germans’ and Britons’ interest in Greek tourism and these data sets can be compared; and (3) There is existing qualitative research on the reasons for choosing to live on boats, and this data can be relied upon to conduct a further quantitative investigation.

Step 3: Evaluate a secondary data set

If you recall our previous discussion on the disadvantages of secondary data, you will remember us specifying that: (1) secondary data may not be fully appropriate for your research purposes, (2) secondary data may have a different format than you require, (3) secondary data may lack reliability and validity, (4) secondary data may not answer your research question, and (5) original authors may have failed to provide sufficient information about their research.

Because such disadvantages of secondary data can limit the effectiveness of your research, it is crucial that you evaluate a secondary data set. To ease this process, we outline here a reflective approach that will allow you to evaluate secondary data in a stepwise fashion.

Step 3(a): What was the aim of the original study?

When evaluating secondary data, you first need to identify the aim of the original study. This is important because the original authors’ goals will have impacted several important aspects of their research, including their population of choice, sample, employed measurement tools, and the overall context of the research.

During this step, you also need to pay close attention to any differences in research purposes and research questions between the original study and your own investigation. As we have discussed previously, you will often discover that the original study had a different research question in mind, and it is important for you to specify this difference.

Let’s put this step of identifying the aim of the original study into practice by referring to our three research examples. The aim of the original study in the first research example was to investigate mental difficulties (e.g., stress, anxiety, mood disorders, and paranoid thoughts) in a multinational sample of pregnant women.

How does this aim differ from your research aim? Well, you are seeking to reuse this data set to investigate national differences in anxiety experienced by women during different pregnancy stages. When it comes to the second research example, you are basing your research on two secondary data sets – one that aimed to investigate Germans’ interest in Greek tourism and the other that aimed to investigate Britons’ interest in Greek tourism.

While these two studies focused on particular national populations, the aim of your research is to compare Germans’ and Britons’ tendency to visit Greece for summer vacation. Finally, in our third example, the original research was a qualitative investigation into the reasons for living on boats. Your research question is different, because, although you are seeking to do the same investigation, you wish to do so by using a quantitative methodology.

Importantly, in all three examples, you conclude that secondary data may in fact answer your research question. If you conclude otherwise, it may be wise to find a different secondary data set or to opt for primary research.

Step 3(b): Who has collected the data?

A further step in evaluating a secondary data set is to ask yourself who has collected the data. With what institution were the authors affiliated? Are the original authors credible enough for you to trust their research? Usually, you will be able to obtain this information through quick online searches.

Let’s say that, in our example of research on pregnancy, data was collected by the UK government; that in our example of research on Greek tourism, the data was collected by a travel agency; and that in our example of research on the reasons for choosing to live on boats, the data was collected by researchers from a UK university.

Let’s also say that you have checked the background of these organisations and researchers, and that you have concluded that they all have a sufficiently professional background, except for the travel agency. Given that this agency’s research did not lead to a publication (for instance), and given that not much can be found about the authors of the research, you conclude that the professionalism of this data source remains unclear.

Step 3(c): Which measures were employed?

If the study on which you are basing your research was conducted in a professional manner, you can expect to have access to all the essential information regarding this research.

Original authors should have documented all their sample characteristics, measures, procedures, and protocols. This information can be obtained either in their final research report or through contacting the authors directly.

It is important for you to know what type of data was collected, which measures were used, and whether such measures were reliable and valid (if they were quantitative measures). You also need to make a clear outline of the type of data collected – and especially the data relevant for your research.
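
If the original authors share item-level responses, one way to check the reliability of a multi-item quantitative measure yourself is to compute an internal-consistency index. The sketch below uses Cronbach's alpha, which is a common choice but not necessarily the index the original authors reported. The response data is made up purely for illustration, and the function assumes there are no missing item scores.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes every respondent answered every item (no missing values).
    """
    n_items = len(responses[0])
    # Variance of each item across respondents.
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(column) for column in item_columns)
    # Variance of respondents' total scores.
    total_variance = variance([sum(person) for person in responses])
    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)

# Made-up scores of four respondents on a three-item measure.
example_responses = [
    [1, 2, 1],
    [2, 2, 2],
    [3, 4, 3],
    [4, 4, 4],
]
alpha = cronbach_alpha(example_responses)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together. Which threshold counts as acceptable is a judgement you would need to justify in your dissertation.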

Let’s say that, in our first example, researchers have (among other assessed variables) used a demographic measure to note women’s nationalities and have used the State Anxiety Inventory to assess women’s anxiety levels during different pregnancy stages, both of which you conclude are valid and reliable tools. In our second example, the authors might have crafted their own measure to assess interest in Greek tourism, but there may be no established validity and reliability for this measure. And in our third example, the authors have employed semi-structured interviews, which cover the most important reasons for wanting to live on boats.

Step 3(d): When was the data collected?

When evaluating secondary data, you should also note when the data was collected. The reason for this is simple: if the data was collected a long time ago, you may conclude that it is outdated. And if the data is outdated, then what’s the point of reusing it?

Ideally, you want your secondary data to have been collected within the last five years. For the sake of our examples, let’s say that all three original studies were conducted within this time range.

Step 3(e): What methodology was used to collect the data?

When evaluating the quality of a secondary data set, the evaluation of the employed methodology may be the most crucial step.

We have already noted that you need to evaluate the reliability and validity of the employed measures. In addition, you need to evaluate how the sample was obtained, whether it was large enough, whether it was representative of the population, whether there were any missing responses on the employed measures, whether confounders were controlled for, and whether the statistical analyses were appropriate. Any drawbacks in the original methodology may limit your own research as well.

For the sake of our examples, let’s say that the study on mental difficulties in pregnant women recruited a representative sample of pregnant women (i.e., they had different nationalities, different economic backgrounds, different education levels, etc.) in maternity wards of seven hospitals; that the sample was large enough (N = 945); that the number of missing values was low; that many confounders were controlled for (e.g., education level, age, presence of partnership, etc.); and that statistical analyses were appropriate (e.g., regression analyses were used).

Let’s further say that our second research example had a somewhat weaker methodology. Although the number of participants in the two samples was high enough (N1 = 453; N2 = 488), the number of missing values was low, and the statistical analyses were appropriate (descriptive statistics), the authors failed to report how they recruited their participants and whether they controlled for any confounders. Let’s say that these authors also failed to provide you with more information via email.

Finally, let’s assume that our third research example also had a sufficient methodology, with a sufficiently large sample size for a qualitative investigation (N = 30), high sample representativeness (participants with different backgrounds, coming from different boat communities), and sufficient analyses (thematic analysis).

Note that, since this was a qualitative investigation, there is no need to evaluate the number of missing values or whether confounders were controlled for.

Step 3(f): Making a final evaluation

Having considered all the things outlined in the steps above, what can you conclude regarding the quality of your secondary data set? Again, let’s consider our three examples.

We would conclude that the secondary data from our first research example is of high quality. The data was recently collected by professionals, the employed measures were both reliable and valid, and the methodology was more than sufficient. We can be confident that our new research question can be sufficiently answered with the existing data. Thus, the data set for our first example is ideal.

The two secondary data sets from our second research example seem, however, less than ideal. Although we can answer our research question on the basis of these recent data sets, the data was collected by a source whose professionalism is uncertain, the reliability and validity of the employed measure are not established, and the employed methodology has a few notable drawbacks.

Finally, the data from our third example seems sufficient both for answering our research question and in terms of the specific evaluations (data was collected recently by a professional source, semi-structured interviews were well made, and the employed methodology was sufficient).

The final question to ask is: “What can be done if our evaluation reveals that the secondary data is not appropriate?” The answer, unfortunately, is “nothing”, at least if you continue with that data set. In this instance, you can only note the drawbacks of the original data set, present its limitations, and conclude that your own research may not be sufficiently well grounded.

Step 4: Prepare and analyse secondary data

During the secondary data evaluation process, you will familiarise yourself with the original research. Having done so, your next step is to prepare a secondary data set.

Your first sub-step here (if you are doing quantitative research) is to outline all variables of interest that you will use in your study. In our first example, you could have at least five variables of interest: (1) women’s nationality, (2) anxiety levels at the beginning of pregnancy, (3) anxiety levels at three months of pregnancy, (4) anxiety levels at six months of pregnancy, and (5) anxiety levels at nine months of pregnancy. In our second example, you will have two variables of interest: (1) participants’ nationality, and (2) the degree of interest in going to Greece for a summer vacation. Once your variables of interest are identified, you need to transfer this data into a new SPSS or Excel file. Remember simply to copy this data into the new file – it is vital that you do not alter it!
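
If your data is available as a CSV export rather than an SPSS or Excel file, the copying step could be sketched as follows. This is a minimal illustration, not a required workflow: the function name `copy_variables` and the column names are assumptions, and the three data rows are invented. The original data is only ever read, so it cannot be altered by accident.

```python
import csv
import io

def copy_variables(source, target, variables):
    """Copy only the listed columns from one CSV stream to another.

    `source` is only read from, never written to.
    """
    reader = csv.DictReader(source)
    available = reader.fieldnames or []
    absent = [name for name in variables if name not in available]
    if absent:
        raise KeyError(f"Columns not found in the original data: {absent}")
    writer = csv.DictWriter(
        target, fieldnames=variables, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)

# Invented extract of an original data set (column names are hypothetical).
original = io.StringIO(
    "id,nationality,anx_t1,mood\n"
    "1,German,40,3\n"
    "2,British,35,2\n"
)
working_copy = io.StringIO()
copy_variables(original, working_copy, ["nationality", "anx_t1"])
print(working_copy.getvalue())
```

With real files, you would pass `open("original.csv", newline="")` as the source and `open("working.csv", "w", newline="")` as the target. Both file names are placeholders.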

Once this is done, you should address missing data (identify and label them) and recode variables if necessary (e.g., giving a value of 1 to German participants and a value of 2 to British participants). You may also need to reverse-score some items, so that higher scores on all items indicate a higher degree of what is being assessed.
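
The three operations described above can be sketched as small helper functions. The nationality codes come from the example in the text; everything else is an assumption made for illustration, including the placeholder markers treated as missing (`""`, `"NA"`, `"-99"`) and the 1 to 4 response scale. You should check the original documentation for the markers and scale actually used.

```python
MISSING = None  # label used for missing responses in the working copy

# Coding from the example: 1 = German, 2 = British.
NATIONALITY_CODES = {"German": 1, "British": 2}

def label_missing(value, missing_markers=("", "NA", "-99")):
    """Return MISSING for blank or placeholder responses, else the value unchanged."""
    if value is None or str(value).strip() in missing_markers:
        return MISSING
    return value

def recode_nationality(label):
    """Map a nationality label to its numeric code; unknown labels become MISSING."""
    return NATIONALITY_CODES.get(label, MISSING)

def reverse_score(score, scale_min=1, scale_max=4):
    """Reverse an item so that higher scores always mean more of what is assessed."""
    if score is MISSING:
        return MISSING
    return scale_max + scale_min - score

print(label_missing(" NA "), recode_nationality("British"), reverse_score(1))
```

Keeping missing responses as an explicit label, rather than as a number such as 0, prevents them from being silently included when scores are summed later.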

Most of the time, you will also need to create new variables – that is, to compute final scores. For instance, in our example of research on anxiety during pregnancy, your data will consist of scores on each item of the State Anxiety Inventory, completed at various times during pregnancy. You will need to calculate final anxiety scores for each time the measure was completed.
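
As a rough sketch, computing a final score per time point amounts to summing the item scores from each administration of the measure. The example below uses four invented items per time point to keep it short, and it assumes that a total is left missing whenever any item is missing. That rule is only one possible choice, and the original scale manual may prescribe another.

```python
def total_score(item_scores):
    """Sum the item scores of one administration; None if any item is missing."""
    if any(score is None for score in item_scores):
        return None
    return sum(item_scores)

def final_scores(participant):
    """Map each time point to the participant's total score at that time."""
    return {time_point: total_score(items) for time_point, items in participant.items()}

# Invented item scores for one participant (time-point names are hypothetical).
participant = {
    "start": [2, 3, 1, 2],
    "month_3": [3, 3, 2, 2],
    "month_6": [3, None, 2, 3],  # one item was not answered
    "month_9": [4, 3, 3, 3],
}
scores = final_scores(participant)
print(scores)
```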

Your final step consists of analysing the data. You will always need to decide on the most suitable analysis technique for your secondary data set. In our first research example, you would rely on MANOVA (to see if women of different nationalities experience different anxiety levels at the beginning, at three months, at six months, and at nine months of pregnancy); and in our second example, you would use an independent samples t-test (to see if interest in Greek tourism differs between Germans and Britons).
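
To make the second analysis concrete, here is a minimal sketch of the independent samples t-test in its classic equal-variance (Student) form. The interest ratings are invented for illustration. The function returns only the t statistic and the degrees of freedom; in practice you would obtain the p-value from SPSS or another statistics package.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent groups.

    Assumes roughly equal variances in the two groups.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_variance = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df

# Invented interest ratings for the two national samples.
german_interest = [4, 5, 6, 5]
british_interest = [2, 3, 4, 3]
t_value, degrees_of_freedom = independent_t(german_interest, british_interest)
print(round(t_value, 3), degrees_of_freedom)
```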

The process of preparing and analysing a secondary data set is slightly different if your secondary data is qualitative. In our example on the reasons for living on boats, you would first need to outline all reasons for living on boats, as recognised by the original qualitative research. Then you would need to craft a questionnaire that assesses these reasons in a broader population and collect responses to it.

Finally, you would need to analyse the questionnaire data statistically.

Note that this example combines qualitative and quantitative data. But what if you are reusing qualitative data, as in our previous example of re-coding the interviews from our study to discover the language used when describing transient lifestyles? Here, you would simply need to recode the interviews and conduct a thematic analysis.

TABLE 6:

STEPS FOR DOING SECONDARY RESEARCH	EXAMPLE 1: USING SECONDARY DATA IN ISOLATION	EXAMPLE 2: COMBINING TWO SECONDARY DATA SETS	EXAMPLE 3: COMBINING SECONDARY AND PRIMARY DATA SETS
1. Develop your research question	Do women of different nationalities experience different levels of anxiety during different stages of pregnancy?	Are there differences in an interest in Greek tourism between Germans and Britons?	Why do people choose to live on boats?
2. Identify a secondary data set	A recent study has focused on a range of mental difficulties experienced by women in a multinational sample and this data can be reused	There is existing data on Germans’ and Britons’ interest in Greek tourism and these data sets can be compared	There is existing qualitative research on the reasons for choosing to live on boats, and this data can be relied upon to conduct a further quantitative investigation
3. Evaluate a secondary data set
(a) What was the aim of the original study?	To investigate mental difficulties (e.g., stress, anxiety, mood disorders, and paranoid thoughts) in a multinational sample of pregnant women	Study 1: To investigate Germans’ interest in Greek tourism; Study 2: To investigate Britons’ interest in Greek tourism	To conduct a qualitative investigation on reasons for choosing to live on boats
(b) Who has collected the data?	UK government (professional source)	Travel agency (uncertain professionalism)	UK university (professional source)
(c) Which measures were employed?	Demographic characteristics (nationality) and State Anxiety Inventory (reliable and valid)	Self-crafted measure to assess interest in Greek tourism (reliability and validity not established)	Semi-structured interviews (well-constructed)
(d) When was the data collected?	2015 (not outdated)	2013 (not outdated)	2014 (not outdated)
(e) What methodology was used to collect the data?	Sample was representative (women from different backgrounds); large sample size (N = 975); low number of missing values; confounders controlled for (e.g., age, education, partnership status); analyses appropriate (regression)	Sample representativeness not reported; sufficient sample sizes (N1 = 453, N2 = 488); low number of missing values; confounders not controlled for; analyses appropriate (descriptive statistics)	Sample was representative (participants of different backgrounds, from different boat communities); sufficient sample size (N = 30); analyses appropriate (thematic analysis)
(f) Making a final evaluation	Sufficiently developed data set	Insufficiently developed data set	Sufficiently developed data set
4. Prepare and analyse secondary data	Outline all variables of interest; Transfer data to a new file; Address missing data; Recode variables; Calculate final scores; Analyse the data	Outline all variables of interest; Transfer data to a new file; Address missing data; Recode variables; Calculate final scores; Analyse the data	Outline all reasons for living on boats; Craft a questionnaire that assesses these reasons in a broader population; Analyse the data

In summary…

This might have been a long read to accompany your cup of coffee or tea, but you should, by now, know how to do your secondary research. Hopefully you will have concluded that doing secondary research is not that hard. Just follow the guidelines summarised in Table 6 and you are all set.
