STAT 402: Homework 4 Solved


Exercise 1: I want to survey a group of students on how much time they are spending on homework and studying. I have a frame of students that I can contact, along with their age, gender, major and year in college. Do you think it would be worth stratifying on one of these variables? Why or why not? If you were to use a variable to make strata, which would you pick and why?

Solution:

I think it is absolutely worth stratifying on one, if not two, of these variables. First, considering a student’s major, it is easy to see that study time is likely to differ substantially between majors. There is also likely to be a substantial difference in study time between upperclassmen and lowerclassmen, as classes tend to scale up in difficulty and required study time. Study time should therefore vary much less within a stratum than across the whole frame, so an SRS that mixes lowerclassmen and upperclassmen, rigorous and non-rigorous majors, will almost assuredly give an estimate of study time with a higher variance than a stratified sample of the same size.
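A small simulation can make the variance argument concrete. The sketch below assumes a hypothetical population of 4,000 students in which mean weekly study hours rise with year in college. The stratum sizes, means, standard deviation, and sample size are all invented for illustration and do not come from the exercise.

```r
# Hypothetical population: study hours per week by year in college.
# Every number here is made up for illustration.
set.seed(402)
years      <- c("freshman", "sophomore", "junior", "senior")
N_h        <- c(1000, 1000, 1000, 1000)   # stratum sizes (assumed)
true_means <- c(8, 11, 14, 17)            # mean hours by year (assumed)
pop <- data.frame(
  year  = rep(years, times = N_h),
  hours = rnorm(sum(N_h), mean = rep(true_means, times = N_h), sd = 3)
)

n <- 80  # total sample size under both designs

# One estimate of the mean from a simple random sample.
srs_mean <- function() mean(sample(pop$hours, n))

# One estimate from a proportionally allocated stratified sample (n / 4 per year).
strat_mean <- function() {
  W_h    <- N_h / sum(N_h)
  ybar_h <- sapply(years, function(y) mean(sample(pop$hours[pop$year == y], n / 4)))
  sum(W_h * ybar_h)
}

srs_est   <- replicate(2000, srs_mean())
strat_est <- replicate(2000, strat_mean())

# Variance of each estimator across repeated samples.
c(srs = var(srs_est), stratified = var(strat_est))
```

Under these assumptions the stratified estimator should show the smaller variance, because the differences between years no longer contribute to the sampling error.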

Exercise 2: In the Alaska Department of Fish and Game paper (included with this HW) they use either stratification or post-stratification. Which one did they use? What were the strata? Why did they do this?

Solution:

In the 2002 ADF&G survey for Chinook salmon, the primary sampling method was regular stratification, with some secondary use of post-stratification. The paper states that the area of interest was split into three separate strata: the lower, middle, and upper portions of the Kuskokwim River. The stated reason for dividing the area into these strata is “differing proportions in gear type usage”. Beyond that, the SRS in each stratum was designed as an “opportunistic” sample, i.e. samples were taken across time with a variety of gear, on the assumption that the samples would still be unbiased and independent. Later in the sample design section it is also stated that the SRS in each stratum would be “post stratify(ed) by time and gear”. Under an “opportunistic” sampling scheme, this further post-stratification seems necessary to account for variation across sampling time and gear. The details of the post-stratification are defined in the second paragraph of the “Data Processing, Analysis, and Reporting” section.


> Strata_means <- c(2500, 4000, 5500)
> Strata_SSquared <- c(500, 750, 750)
> Strata_n <- c(120, 150, 130)
> Strata_N <- c(20000, 30000, 30000)
> Strata_Proportion <- Strata_n / 400
> Mean_estimator <- sum(Strata_Proportion * Strata_means)
> Mean_estimator
[1] 4037.5
> # The finite population correction is (N_h - n_h) / N_h
> Margin_error <- 2 * sqrt(sum(Strata_Proportion^2 * ((Strata_N - Strata_n) / Strata_N) * (Strata_SSquared / Strata_n)))
> CI95 <- c(Mean_estimator - Margin_error, Mean_estimator + Margin_error)
> round(CI95, 2)
[1] 4034.91 4040.09

Finally we get an estimated mean of 4037.5 and a 95 percent confidence interval of (4034.91, 4040.09). A 95 percent confidence interval means that if the sampling were repeated many times, about 95% of the intervals constructed this way would contain the true mean.
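The 95 percent figure can be read as a long-run coverage rate, and that reading can be checked by simulation. The sketch below draws repeated simple random samples from a hypothetical population (its size, mean, and spread are assumptions chosen only for illustration) and counts how often a "mean plus or minus 2 standard errors" interval contains the true mean.

```r
# Repeated-sampling check of a "mean +/- 2 SE" interval.
# The population below is hypothetical and only serves the illustration.
set.seed(1)
pop       <- rnorm(80000, mean = 4000, sd = 1200)
true_mean <- mean(pop)
n         <- 400

covers <- replicate(2000, {
  y  <- sample(pop, n)
  se <- sqrt((1 - n / length(pop)) * var(y) / n)  # SRS standard error with fpc
  abs(mean(y) - true_mean) <= 2 * se              # TRUE if the interval covers
})

mean(covers)  # share of intervals that contain the true mean
```

The share should land close to 0.95. Any single interval either contains the true mean or it does not; the 95 percent describes the procedure, not one realized interval.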

  1. Suppose that we complete the study and decide, before beginning the analysis, to just analyze it without sorting into strata. Would this be valid? Why or why not?

Solution:

This would be a valid technique. When post-stratification is an option, the sample was usually drawn as one big SRS, so analyzing it as an SRS is legitimate. However, the point of post-stratification is that we have information about how the data can be stratified, and ignoring that information means leaving accuracy on the table.
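How much accuracy is left on the table can be gauged from the summary statistics alone. The sketch below rebuilds the pooled sample variance from the stratum means, variances, and sample sizes used in the code for this exercise, then compares the SRS standard error with the stratified one. The variable names are illustrative, the weights n_h / n follow the calculation above, and the finite population correction is taken to be the standard 1 - n_h / N_h.

```r
ybar_h <- c(2500, 4000, 5500)      # stratum sample means
s2_h   <- c(500, 750, 750)         # stratum sample variances
n_h    <- c(120, 150, 130)         # stratum sample sizes
N_h    <- c(20000, 30000, 30000)   # stratum population sizes
n <- sum(n_h)
N <- sum(N_h)

# Pooled sample mean and variance, as if the 400 units were one SRS.
# The pooled sum of squares is the within-stratum part plus the between-stratum part.
ybar   <- sum(n_h * ybar_h) / n
s2_all <- (sum((n_h - 1) * s2_h) + sum(n_h * (ybar_h - ybar)^2)) / (n - 1)
se_srs <- sqrt((1 - n / N) * s2_all / n)

# Stratified standard error with weights n_h / n.
W_h      <- n_h / n
se_strat <- sqrt(sum(W_h^2 * (1 - n_h / N_h) * s2_h / n_h))

c(se_srs = se_srs, se_strat = se_strat)
```

The pooled standard error comes out far larger, because the large differences between the stratum means enter the pooled variance but not the within-stratum variances.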


Exercise 4: We want to estimate the concentration of available nitrogen in the soils of a region. Cold, wet soils generally have more available nitrogen, but soils with nitrogen-fixing (alder) plants or areas showing high productivity might also have higher soil nitrogen. We think we can very easily classify plots of ground into either low or high nitrogen plots just by looking at them, but the actual soil sampling and analysis is expensive.

To lower cost, we’ll do the following:

  1. Divide the region into N = 20000 reasonably sized plots.

  2. Take an SRS of size m = 500 plots, which we will visit and rapidly classify into either the high-nitrogen stratum or the low-nitrogen stratum (actually this would probably be done as a systematic sample, not an SRS, which we’ll see later).

  3. We find that m1 = 300 of these plots are classified as low nitrogen and m2 = 200 plots as high nitrogen.

  4. Now we take an SRS of size n1 = 30 from the low nitrogen (we hope) plots and, independently, n2 = 50 from the high nitrogen plots. We get the following:

Classified as low nitrogen: sample mean x1 = 30 ppm, s1 = 10 ppm

Classified as high nitrogen: sample mean x2 = 40 ppm, s2 = 15 ppm

  1. Does it appear that the stratification will help us much? If we decide it doesn’t, can we just pretend we took an SRS of size 500 and ignore the stratification?

  2. Find a 95 percent confidence interval for the average nitrogen concentration.

Solution:

Code:

> Strata_means <- c(30, 40)
> # s1 and s2 are standard deviations, so square them to get variances
> Strata_SSquared <- c(10, 15)^2
> Strata_n <- c(30, 50)
> Strata_m <- c(300, 200)
> Strata_Proportion <- Strata_m / sum(Strata_m)
> Mean_estimator <- sum(Strata_Proportion * Strata_means)
> Mean_estimator
[1] 34
> # The correction for sampling n_h of the m_h classified plots is (m_h - n_h) / m_h
> Margin_error <- 2 * sqrt(sum(Strata_Proportion^2 * ((Strata_m - Strata_n) / Strata_m) * (Strata_SSquared / Strata_n)))
> Margin_error
[1] 2.545584
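Because the stratum weights here are themselves estimated from the first-phase sample of 500 plots, the design is a two-phase (double) sample, and the interval can also be computed with a two-phase variance estimator. The sketch below uses the large-population approximation found in standard sampling texts, which adds a term for the uncertainty in the estimated weights. Treating this as the formula the course intends is an assumption, as are the variable names.

```r
ybar_h <- c(30, 40)     # phase-two sample means (ppm)
s_h    <- c(10, 15)     # phase-two sample standard deviations (ppm)
n_h    <- c(30, 50)     # phase-two sample sizes
m_h    <- c(300, 200)   # phase-one counts per stratum
m      <- sum(m_h)

w_h      <- m_h / m               # estimated stratum weights
ybar_str <- sum(w_h * ybar_h)     # two-phase stratified mean

# Within-stratum part, plus the part due to estimating the weights.
v_within  <- sum((m_h - 1) / (m - 1) * w_h * s_h^2 / n_h)
v_between <- sum(w_h * (ybar_h - ybar_str)^2) / (m - 1)
se        <- sqrt(v_within + v_between)

ybar_str + c(-2, 2) * se          # approximate 95 percent interval
```

The point estimate is unchanged at 34 ppm; only the width of the interval depends on which variance formula is used.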
