Table 1 Summary of size estimation methods. This table continues across four pages

From: Summarizing methods for estimating population size for key populations: a global scoping review for human immunodeficiency virus research

| Sampling method | Description | Assumption | Strength | Weakness |
| --- | --- | --- | --- | --- |
| Capture-recapture [15] | Assesses the overlaps between incomplete case lists from multiple independent data sets | 1) The sampled population is a good representation of the whole population; 2) the sample is drawn from a closed population; 3) individuals can be matched across both datasets; 4) individuals have an equal likelihood of being captured | Simple and easy to use for researchers | Capture biases: not everyone has an equal chance of being captured; estimates would be too high if matches were not identified or too low if recaptures were matched incorrectly |
| Multiplier [16] | Two independent data sources are used to make the estimate: an authentic count or list of the population whose size is being estimated, and a survey of that same population | There is accurate demographic and geographic information on the key population | Simple and easy to use | The quality of the data can cause bias; the resulting survey samples may not be fully representative of the key population |
| Delphi [17] | The size of key populations is estimated from the individual judgments of several experts | The expert team's estimate accurately reflects reality | Low cost with high efficiency | The estimate may be subjective and unreliable, depending on the quality of the expert team; strategies for resolving disagreement between the experts are lacking |
| Mapping [18] | The locations of the key population are systematically identified and mapped to estimate the size of the key population | The quality of the data can be guaranteed by the full involvement of the key populations | The estimate is made with transparency | Missing some geographical locations may lead to underestimation of the size of key populations; overestimation may occur if members of the key population frequently attend multiple locations |
| Workbook [19] | The key population is identified first, and the estimates are then combined with the total population to calculate the proportion of the key population in a specific region | Typically used in countries or regions where the epidemic is low and concentrated | The estimate is made with transparency; errors can be prevented by automatic consistency and audit checks | In some countries, data may be limited because of stigma and discrimination among the key populations and legal issues, which may make data unreliable or of poor quality |
| Network scale-up [20] | Respondents are asked about the behaviors of acquaintances in their social networks, and the number of key population members in each respondent's network is used to estimate the size of the key population | The average personal network size is the same for key populations and for the population as a whole; people can accurately report the behaviors of acquaintances in their social networks | The privacy of the key populations is protected because the researchers do not contact them directly | Respondents may be unaware that some of their acquaintances belong to key populations (transmission error); obtaining a representative sample is challenging because of stigma and discrimination |
| Respondent-Driven Sampling [21] | A sample from the key population is selected purposively, and these selected individuals are given coupons to recruit other key population members from their social networks | Recruiters pass coupons at random to members of their social network who belong to the key population; every participant has only one chance to receive a coupon and is equally likely to be recruited | An effective sampling method for hard-to-reach networked populations with no sampling frame | Limited recruitment within the key populations may lead to biased estimates |
| Bayesian Estimation [22] | The key population size is estimated using Bayes' theorem, starting from a prior probability distribution | Suitable when some prior knowledge, such as a prior probability, is available | By using empirical data, it can produce estimates when survey sampling studies provide no direct data on the population size for a specified geographical area | Estimates may be subjective because different researchers hold different prior beliefs |
| Stochastic Simulation [23] | The size of a certain population (e.g., HIV-positive people) is estimated from epidemiologic data using the Monte Carlo method | Parameters are based on data from representative clinical trials or observational cohort studies | Naturally produces plausibility intervals for estimates in the face of uncertainty | Some complex simulation processes are quite time-consuming; because of differing parameter settings and the unknown quality of the observed data, some simulation model estimates are not robust |
| Laska-Meisner-Siegel Estimation [24] | An unbiased estimator of population size based on a single sample taken at a single venue | Only one-time sampling is available | Saves time and resources compared with capture-recapture | Because it requires only a single sample, its accuracy may be lower than that of methods that sample several times |
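
The table describes each method only in words. As a minimal sketch, the Python functions below show the standard textbook point estimators for three of the listed methods: the two-sample Lincoln-Petersen form of capture-recapture, the basic multiplier estimate, and the basic network scale-up estimate. These formulas are not given in the table itself. The function names, parameter names, and all numbers are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(n_first, n_second, n_matched):
    """Two-sample capture-recapture estimate: N = n1 * n2 / m.

    n_first   -- individuals on the first list (hypothetical name)
    n_second  -- individuals on the second list
    n_matched -- individuals found on both lists
    """
    if n_matched <= 0:
        raise ValueError("at least one matched individual is required")
    return n_first * n_second / n_matched


def multiplier_estimate(benchmark_count, survey_proportion):
    """Multiplier estimate: N = M / P.

    benchmark_count   -- count of key population members in the benchmark list (M)
    survey_proportion -- share of survey respondents who report being on that list (P)
    """
    if not 0 < survey_proportion <= 1:
        raise ValueError("survey_proportion must be in (0, 1]")
    return benchmark_count / survey_proportion


def network_scale_up(known_members, network_sizes, total_population):
    """Basic network scale-up estimate.

    known_members    -- per respondent, acquaintances reported as key population members
    network_sizes    -- per respondent, estimated personal network size
    total_population -- size of the general population
    """
    if len(known_members) != len(network_sizes):
        raise ValueError("one network size is needed per respondent")
    return sum(known_members) * total_population / sum(network_sizes)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any study.
    print(lincoln_petersen(200, 150, 30))   # 1000.0
    # Missed matches shrink m and inflate the estimate.
    print(lincoln_petersen(200, 150, 25))   # 1200.0
    # False matches enlarge m and deflate the estimate.
    print(lincoln_petersen(200, 150, 40))   # 750.0
    print(multiplier_estimate(500, 0.25))   # 2000.0
    print(network_scale_up([1, 0, 2, 1], [200, 150, 300, 350], 1_000_000))  # 4000.0
```

The second and third `lincoln_petersen` calls mirror the capture-recapture weakness listed in the table: unidentified matches push the estimate too high, and incorrect matches push it too low.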