In this blog you will find the correct answers to the Coursera quizzes for Introduction to Statistics & Data Analysis in Public Health. Mixsaver always tries to bring you the best blogs and the best coupon codes.
Week- 2
Special case of age
1.
Question 1
Pablo requests the birth records for every individual in his region. He is told that the data set contains everyone’s date of birth so he will be able to calculate their age in days if he wishes. What sort of data will Pablo have:
1 point
- Continuous
- Integer
- Ordinal
2.
Question 2
When Pablo receives the data set he finds that in fact the version of the data set that he has been given contains age group rather than dates of birth. Each individual has been classified as <18 years, 18-44, 45-64 and 65+ years of age. What sort of data does Pablo actually have:
1 point
- Continuous
- Binary
- Ordinal
3.
Question 3
Meghan downloads the following death rate data for the population of England and Wales:
Age | Number of deaths | Number of people | Crude death rate per 1000 people
“Young” (<65) | 80,971 | 47,863,700 | 1.69
“Old” (65+) | 442,886 | 10,517,500 | 42.11
True or false:
The death rate for males aged 65 or older in England and Wales is 42.11.
1 point
- True
- False
4.
Question 4
True or false: The death rate in England and Wales remains constant at 42.11 deaths per 1000 people for ages 0 to 64.
1 point
- True
- False
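As a quick check on the table in Question 3, the crude death rate is the number of deaths divided by the number of people, multiplied by 1,000. Here is a minimal R sketch, taking the smaller figure in each row as the deaths and the larger as the population (the variable names are only illustrative):

```r
# Figures taken from the England and Wales table in Question 3
deaths <- c(young = 80971, old = 442886)
people <- c(young = 47863700, old = 10517500)

# Crude death rate per 1,000 people in each age group
crude_rate <- deaths / people * 1000
print(round(crude_rate, 2))  # 1.69 for "young", 42.11 for "old"
```

Each rate is a single figure for the whole age band and for both sexes combined, so on its own it says nothing about males only or about how the rate changes within the band.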
Week- 2
Well-behaved Distributions
1.
Question 1
Match this distribution to the plot:
Normal with mean 50 and standard deviation 12
Note: There are two correct answers
1 point
c.
d.
2.
Question 2
Match this distribution to the plot:
Poisson with mean 4
Note: There are two correct answers
1 point
a.
c.
3.
Question 3
Match this distribution to the plot:
Normal with mean 50 and standard deviation 4
Note: There are two correct answers
1 point
a.
c.
4.
Question 4
Which of the following plots shows the distribution with the biggest standard deviation?
1 point
a.
5.
Question 5
What proportion of the data lies in the shaded area on the plot below?
1 point
68%
95%
50%
6.
Question 6
What proportion of the data lies in the shaded area on the plot below?
1 point
68%
95%
50%
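The plots for Questions 5 and 6 are not reproduced here, but if the shaded areas mark one or two standard deviations either side of the mean of a normal distribution (an assumption), the proportions can be checked with pnorm in R:

```r
# Proportion of a normal distribution within 1 and 2 standard deviations of the mean
within_1sd <- pnorm(1) - pnorm(-1)
within_2sd <- pnorm(2) - pnorm(-2)

print(round(c(one_sd = within_1sd, two_sd = within_2sd), 3))  # 0.683 and 0.954
```

These are the figures behind the usual "68%" and "95%" rules of thumb.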
7.
Question 7
A drug is given to 100 migraine sufferers to prevent the onset of new migraines. 40% experience a new migraine after taking the drug. What distribution does the outcome (new migraine) follow:
1 point
Binomial
Normal
Poisson
8.
Question 8
A new drug is given to 100 asthma sufferers to reduce the number of hospital admissions due to asthma attack over a 12-month period. After 12 months, the mean number of hospital admissions is 2. What distribution does the outcome (hospital admissions) follow:
1 point
Binomial
Normal
Poisson
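To make the difference between the two scenarios concrete, here is a small R sketch. It assumes that each patient's outcome is independent of the others, which the quiz does not state. A yes/no outcome counted over a fixed number of patients is modelled with dbinom, and a count of events over a period with dpois:

```r
# Yes/no outcome (new migraine) in a fixed number of patients
n <- 100
p <- 0.4
binom_mean <- n * p                     # expected number with a new migraine
print(dbinom(40, size = n, prob = p))   # probability of exactly 40 out of 100

# Count of events (hospital admissions) per patient over 12 months
lambda <- 2
admissions <- 0:4
print(round(dpois(admissions, lambda), 3))  # probability of 0, 1, 2, 3 or 4 admissions
```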
9.
Question 9
The normal distribution is a:
1 point
Discrete distribution
Continuous distribution
10.
Question 10
The Poisson distribution is a:
1 point
Discrete distribution
Continuous distribution
11.
Question 11
The Binomial distribution is a:
1 point
Discrete distribution
Continuous distribution
12.
Question 12
Which of these does not follow a Poisson distribution:
1 point
Asthma exacerbations over a 12-month period
Patients arriving at a hospital emergency department in a one hour time period
Number of patients in disease remission
Number of patient falls on a geriatric ward over a twelve-hour shift.
Ways of Dealing with Weird Data
1.
Question 1
The video introduced the idea that data do not always fit well-behaved distributions. However, this matters to a greater or lesser extent depending on how you plan to use the data. The following will test your understanding of this and the potential solutions available to you when you have “weird” data.
Dev has collected information on the average number of times a month that people viewed a particular public health information website (he has no information on people who did not access the website at all). He plots the data and observes the following:
Dev wants to describe website access in his sample. What would be the best approach for him to do this?
1 point
Try transforming the data to see if it makes the distribution more normal and analyse as a normal distribution.
Dichotomise the data into high and low usage using a cut point such as 5 or more times a month on average and analyse as a binomial distribution.
Present a simple summary table of frequencies and proportion of people by average number of logins.
2.
Question 2
Ji-woo is conducting a study that is looking at the effects of a new drug on vision compared with a group that receive standard care. The vision outcome is measured by the ETDRS (a visual acuity scale), which has a range from 0-100 (complete sight loss to perfect vision). She collects the ETDRS at baseline before the drug/standard care is administered and 6 months later. At baseline, the sample contains patients with very poor vision, including some with complete vision loss. The literature shows that the baseline scores are likely to be positively skewed. Ji-woo wants to compare change scores on the ETDRS between baseline and 6 months across the two treatment groups. How should Ji-woo proceed?
In thinking about your answer, one of the things you should consider is how the doctor might most easily communicate the information to the patient.
1 point
Present the mean change scores by group.
Dichotomise the change scores so that the data follows a binomial distribution.
3.
Question 3
Nisha has data that contain each person’s average daily fruit and vegetable consumption over the course of a year for the last ten years. An extract is given in the table below.
A histogram of the data for year 1 is shown below:
She wants to draw a graph of the trend over this 10-year period. She decides she needs to get a summary measure for each year to compare over time. How can she best summarise the data per year to make a comparison over time:
1 point
Calculate the mean average daily fruit and vegetable consumption for each year.
Calculate the proportion per year that eat above the daily recommended amount.
Sampling
1.
Question 1
Which one of the following defines the standard error of a mean?
1 point
The difference between the population mean and the sample mean
The average difference between the population mean and the sample mean
The average difference between the individual observations and the sample mean
2.
Question 2
Lucy takes a sample of BMI values across her class of 35 students. The sample mean and standard deviation are 23.2 and 2 respectively. What is the estimated standard error of Lucy’s sample:
1 point
0.06
0.34
3.92
3.
Question 3
Lucy wants to calculate the 95% confidence interval for the sample mean. What is Lucy’s estimated 95% confidence interval:
1 point
(22.53, 23.87)
(19.28, 27.12)
(21.20, 25.20)
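Lucy's figures can be worked through in R. This is a sketch using the normal approximation with the multiplier 1.96, which is an assumption, since the quiz does not say which multiplier it uses:

```r
n <- 35
sample_mean <- 23.2
sample_sd <- 2

# Estimated standard error of the mean: SD divided by the square root of n
se <- sample_sd / sqrt(n)
print(round(se, 2))   # 0.34

# Approximate 95% confidence interval: mean plus or minus 1.96 standard errors
ci <- sample_mean + c(-1, 1) * 1.96 * se
print(round(ci, 2))   # 22.54 23.86
```

If the standard error is rounded to 0.34 before building the interval, the limits come out as 22.53 and 23.87, which matches the first option listed in Question 3.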
Week- 3
Distributions and Medians
1.
Question 1
Match the below plot with the correct distribution.
1 point
Poisson(4)
Uniform (0,100)
Normal (75, 10)
Binomial (100, 0.5)
2.
Question 2
Match the below plot with the correct distribution.
1 point
Poisson(4)
Normal (75, 10)
Uniform (0,100)
Binomial (100, 0.5)
3.
Question 3
Match the below plot with the correct distribution.
1 point
Poisson(4)
Normal (75, 10)
Binomial (100, 0.5)
Uniform (0,100)
4.
Question 4
Match the below plot with the correct distribution.
1 point
Poisson(4)
Binomial (100, 0.5)
Uniform (0,100)
Normal (75, 10)
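The plots for these four matching questions are not reproduced here. To see what each candidate distribution looks like, you can simulate them in R. This sketch assumes that "Normal (75, 10)" means mean 75 and standard deviation 10; the sample size and seed are arbitrary choices:

```r
set.seed(1)
n <- 10000

# Simulate each of the four distributions named in the answer options
samples <- list(
  "Poisson(4)"         = rpois(n, lambda = 4),
  "Uniform(0, 100)"    = runif(n, min = 0, max = 100),
  "Normal(75, 10)"     = rnorm(n, mean = 75, sd = 10),
  "Binomial(100, 0.5)" = rbinom(n, size = 100, prob = 0.5)
)

# Compare the centre and spread of each simulated sample
print(round(sapply(samples, function(x) c(mean = mean(x), sd = sd(x))), 2))

# Draw the four histograms in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
for (name in names(samples)) {
  hist(samples[[name]], main = name, xlab = "Value")
}
par(old_par)
```

Looking at the range of the x-axis and the overall shape (flat, symmetric and bell-shaped, or skewed with small counts) is usually enough to tell the four apart.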
5.
Question 5
For the sequence of numbers 3, 4, 5, 5, 7, 36, what is the Mean?
1 point
5
3
4
10
6
6.
Question 6
For the sequence of numbers 3, 4, 5, 5, 7, 36, what is the Median?
1 point
4
3
6
5
10
7.
Question 7
For the sequence of numbers 7, 7, 5, 3, 2, 12, what is the Mean?
1 point
6
10
4
5
3
8.
Question 8
For the sequence of numbers 7, 7, 5, 3, 2, 12, what is the Median?
1 point
5
4
10
6
3
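The means and medians in Questions 5 to 8 can be checked directly in R (the vector names are only illustrative):

```r
x1 <- c(3, 4, 5, 5, 7, 36)
x2 <- c(7, 7, 5, 3, 2, 12)

print(c(mean = mean(x1), median = median(x1)))  # mean 10, median 5
print(c(mean = mean(x2), median = median(x2)))  # mean 6, median 6
```

In the first sequence the single large value (36) pulls the mean well above the median, which is why the median is often preferred for skewed data.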
Week- 4
Results: Running a New Hypothesis Test
1.
Question 1
Suppose you want to compare the proportions of overweight and cancer. First, define your variables:
cancer <- g$cancer
overweight <- ifelse(g$bmi >= 25, 1, 0)
Have a look at your new variable to check everything makes sense:
table(overweight)
overweight
0 1
34 32
Next perform a chi-squared test. For best practice, assign the explanatory variable to x and the dependent variable to y. The “dependent variable” is so named because we are hypothesising that its value depends at least partly on some other variable(s) – called the “explanatory variable(s)”.
chisq.test(x = overweight, y = cancer)
What did you get? What do you conclude?
Enter the p value in the box below (to 2 decimal places) and tick which of the given options for the conclusion you agree with.
1 point
0.65
2.
Question 2
Tick which of the below given options for the conclusion you agree with.
1 point
Being overweight gives you cancer
Being overweight protects you from getting cancer
Being overweight does not give you cancer
There is no association between being overweight and cancer
There is good evidence of an association between being overweight and cancer
There is no evidence of an association between being overweight and cancer anywhere in the world
There is no evidence of an association between being overweight and cancer in this data set
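If you do not have the course data set to hand, the workflow from Question 1 can be tried on made-up data. In this sketch the data frame g is simulated, so its values (and the resulting p value) are hypothetical and will not match the quiz; only the steps are the same:

```r
set.seed(42)

# Hypothetical stand-in for the course data set "g" (66 simulated patients)
g <- data.frame(
  bmi    = round(rnorm(66, mean = 25, sd = 4), 1),
  cancer = rbinom(66, size = 1, prob = 0.3)
)

# Define the variables as in the quiz
cancer <- g$cancer
overweight <- ifelse(g$bmi >= 25, 1, 0)

# Check the new variable, then cross-tabulate it against the outcome
print(table(overweight))
print(table(overweight, cancer))

# Explanatory variable as x, dependent variable as y
result <- chisq.test(x = overweight, y = cancer)
print(result$p.value)
```

A large p value only tells you that this particular data set gives no evidence of an association; it does not show that no association exists.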
Hypothesis Testing
1.
Question 1
In each of the following six questions, you’ll be asked to choose the single correct answer.
David takes 5 samples of 10 patients from the National Cancer Registry. He calculates mean BMI values for each of these 5 samples and obtains the following results – 24.3, 27.9, 25.2, 26.7, 26.4. Why are David’s sample means all different?
1 point
Sampling variation
Population variation
Measurement error
2.
Question 2
Charlotte wants to test the mean BMI value in the National Cancer Registry based on a sample of 100 patients. She hypothesizes that the mean BMI value in her sample will be 27. Before she conducts her experiment, her boss points out an error in her hypothesis. What is wrong with Charlotte’s statement?
1 point
The hypothesis should relate to the population value.
She hasn’t specified her alpha value.
27 is an unreasonable value for mean BMI.
3.
Question 3
Charlotte corrects her hypotheses and randomly selects her sample of 100 patients. She has decided to use a two-sided alpha value of 0.01 instead of the conventional value of 0.05 because she believes that this will decrease her risk of making the wrong conclusion. Will this lower value reduce her risk of concluding the mean population BMI is 27 when in fact it isn’t?
1 point
Yes
No
4.
Question 4
Charlotte’s colleague repeats her experiment but chooses a two-sided alpha value of 0.05. What happens to the chance area (or probability of making a type I error)?
1 point
Becomes larger
Stays the same
Becomes smaller
5.
Question 5
How many degrees of freedom will Charlotte’s test have?
1 point
99
100
0.01
0.05
6.
Question 6
Noah has the following data and wants to test whether age-group is associated with the presence or absence of cancer. He decides to perform a chi-squared test.
How many degrees of freedom does his test have?
1 point
919
4
920
10
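Noah's table is not reproduced here, so the sketch below uses a made-up table with five age groups and two cancer categories; both the layout and the counts are assumptions. It shows the general rules: a one-sample t-test has n - 1 degrees of freedom, and a chi-squared test on a table has (rows - 1) x (columns - 1).

```r
# One-sample t-test on 100 patients: degrees of freedom are n - 1
t_df <- 100 - 1

# Hypothetical counts: 5 age groups (rows) by cancer status (columns)
counts <- matrix(
  c(180, 20,
    170, 30,
    160, 40,
    150, 50,
     90, 30),
  nrow = 5, byrow = TRUE,
  dimnames = list(age_group = paste0("group", 1:5),
                  cancer = c("no", "yes"))
)

# Degrees of freedom by hand, and as reported by chisq.test
df_by_hand <- (nrow(counts) - 1) * (ncol(counts) - 1)
test <- chisq.test(counts)
print(c(t_df = t_df, by_hand = df_by_hand, from_r = unname(test$parameter)))
```

Note that the degrees of freedom for the chi-squared test depend on the shape of the table, not on the number of people in it.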
End-of-course Assessment
1.
Question 1
Part of the success of the UN’s Millennium Development Goals was due to the statistical monitoring of data on measures such as infant mortality and living in extreme poverty.
1 point
True
False
2.
Question 2
As long as a research question is interesting, it is scientifically testable as a hypothesis – the more interesting, the more testable.
1 point
true
false
3.
Question 3
In the study published in the Journal of the American College of Cardiology on the effect of taking supplements of vitamins and minerals that you read earlier in this course, they concluded that, in simple terms, there’s no health benefit in taking such supplements (with the exception of folic acid) and there might even be some risk.
1 point
true
false
4.
Question 4
That phrase that I wrote in the previous question, “there’s no health benefit in taking supplements”, uses precise enough language to be used in a hypothesis test.
1 point
true
false
5.
Question 5
The responsibility for accurate reporting of medical research always lies solely with the journalist. If there’s a misinterpretation of the results, the scientist is never to blame.
1 point
true
false
6.
Question 6
The next set of questions concern data types and exploratory analyses in R. A histogram is a useful but rough way to assess whether a variable is normally distributed.
1 point
true
false
7.
Question 7
When undertaking a t-test in R, it is fine to use “t.test” before “hist” and “summary”
1 point
true
false
8.
Question 8
You want to see whether patients with cancer have different mean BMIs from those without, and you type t.test(cancer~bmi). Please select all that apply.
1 point
You have written “cancer” and “bmi” the wrong way round
BMI should be roughly normally distributed for a t-test to be valid
You should have done a chi-squared test instead
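In R's formula interface the outcome goes on the left of the tilde and the grouping variable on the right. Here is a sketch on simulated data (the values are made up, so the result means nothing clinically) that also looks at the data before testing:

```r
set.seed(7)

# Simulated data: cancer is 0/1, BMI is roughly normal in both groups
cancer <- rbinom(100, size = 1, prob = 0.4)
bmi <- rnorm(100, mean = 26 + cancer, sd = 4)

# Look at the data first
print(summary(bmi))
hist(bmi, main = "BMI", xlab = "BMI")

# Outcome (bmi) on the left, grouping variable (cancer) on the right
res <- t.test(bmi ~ cancer)
print(res)
```

Written the other way round, t.test(cancer~bmi) would ask R to treat BMI as the grouping variable, which is not the comparison described in the question.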
9.
Question 9
Your boss reminds you that BMI is often categorised, with underweight, normal weight etc as categories. Which of the following is/are correct?
1 point
Making categories from a normally distributed variable loses a lot of information, and it’s more efficient to compare means instead of proportions
If you did categorise BMI, you could do a chi-squared test using “chisq.test” in R
The chi-squared statistic that R gives you in the output is really useful and should always be reported
10.
Question 10
You decide to turn BMI into categories because they are of public interest, even though it loses information. Before running the above chi-squared test, you make the variable “bmi.group”. You should run these commands in R first and for the reason given…
1 point
table(cancer) in order to check how many values “cancer” has
table(bmi.group, exclude=NULL) to check your code for grouping BMI gives sensible results
hist(bmi) to check that BMI is roughly normally distributed
summary(bmi) to check that BMI is roughly normally distributed
table(bmi) in order to see how common each BMI value is
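Here is one way the grouping and the check could look in R. The BMI values are made up, and the cut points (18.5, 25 and 30) are the commonly used ones rather than anything stated in the quiz:

```r
# Made-up BMI values, including one missing value
bmi <- c(17.2, 22.5, 27.8, 31.4, NA, 24.9, 29.9, 35.0)

# Group BMI into categories; right = FALSE makes each interval include its lower bound
bmi.group <- cut(
  bmi,
  breaks = c(-Inf, 18.5, 25, 30, Inf),
  right  = FALSE,
  labels = c("underweight", "normal", "overweight", "obese")
)

# exclude = NULL keeps missing values in the table so they are not silently dropped
counts <- table(bmi.group, exclude = NULL)
print(counts)
```

Checking that the counts add up to the number of rows, and that any missing values appear where you expect them, is a quick way to catch mistakes in the grouping code.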
11.
Question 11
The next set of questions concern the interpretation of official mortality figures from India. These figures are publicly available from https://data.gov.in/catalog/estimated-age-specific-death-rates-sex and in the reading before this test (pdf download) and give the rates of death per 1,000 population in each age-gender group.
datafile (est age-sex death rates 2006-11 in India).pdf
PDF File
True or false?
These data were published in April 2014, but they only go up to 2011. Such delays in releasing official data are common in many countries.
1 point
True
False
12.
Question 12
In 2011 in India according to official statistics, the estimated death rate for girls aged under 1 was higher than that for every older age group until 75-79.
1 point
true
false
13.
Question 13
The lack of a 95% confidence interval for either of these estimates means that we can safely say that 49.7 is statistically significantly higher than 42.5.
1 point
true
false
14.
Question 14
To see whether these two rates (42.5 and 49.7 per 1,000) are statistically significantly different from one another, we would carry out a t-test and interpret its p value.
1 point
true
false
15.
Question 15
As these rates are in fact based on proportions (they’re proportions of the population of each age group that died that are then multiplied by 10 to make them easier to read), the appropriate test is a chi-squared test. We have enough information to carry out this test.
1 point
true
false