ANALYSIS of DATASET

Description

The purpose of this paper is to show your expertise in performing descriptive and inferential SAS analyses. Additionally, writing your paper will give you some experience of how to correctly discuss your work and results in a clear and concise manner.

My SAS account information is:

User ID: Zozolott0

Password: EDUcation19*

https://odamid.oda.sas.com/SASLogon/login?service=…

I have also attached the code I used to pull the data.

I wanted to see whether there is an association between weight, smoking, and cause of death in the study group.

Honestly, if you don't think that would yield a strong analysis, or you want to do something easier for time purposes, I'll be okay with it.

The SAS Code to identify the variables is:

PROC CONTENTS DATA = SASHELP.HEART; RUN;
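
As a rough starting point for the proposed question (weight, smoking, and cause of death), the sketch below shows one possible first look. It assumes the variable names Smoking_Status, DeathCause, and Weight that PROC CONTENTS lists for SASHELP.HEART; the choice of procedures is only an illustration, not a required approach.

```sas
/* Sketch only: one possible first look at the proposed question.        */
/* Rows with a missing DeathCause (participants still alive) are left    */
/* out of the table by default.                                          */

/* Categorical vs. categorical: smoking status by cause of death */
PROC FREQ DATA = sashelp.heart;
  TABLES Smoking_Status * DeathCause / CHISQ;
RUN;

/* Numeric vs. categorical: weight summarized within each cause of death */
PROC MEANS DATA = sashelp.heart N MEAN STD MEDIAN;
  CLASS DeathCause;
  VAR Weight;
RUN;
```

If the chi-square table has sparse cells, it may be worth collapsing categories before interpreting the test.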

Your completed final analysis project will be graded based on the following criteria:

  • Introduction (10 points)
  • Methods (20 points)
  • Discussion (45 points)
  • Conclusion (10 points)
  • Proper grammar and sentence structure used for clear and concise writing (5 points)
  • SAS results printed out in RTF format and results being reported in paper highlighted (5 points)
  • SAS code copied and pasted into the same Word document as your paper (5 points)


Your paper should be in APA format: double-spaced, with a title page, a header with page number, and 1-inch margins on all sides. You can find a good reference to APA style here:

https://owl.english.purdue.edu/owl/resource/560/01/





Parts 1-4 of your paper should be between 6 and 10 pages, double-spaced (not including tables). After 10 pages I will stop reading (seriously). If it is much less than 6 pages, you probably left something important out. If it is over 10 pages, you are probably going off-topic.



Part 7 should include approximately 25 pages of SAS results printed out in RTF format, with the specific sections where you obtained the results you are reporting in the paper highlighted, so that it is very clear where your answers are coming from. This part is essential, so that I can double-check and confirm that you have run your analyses in the correct manner and that you are reporting the appropriate results. If your results are not highlighted, and I have to hunt for your answers, you will lose points on this section.
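
One way to get SAS results into RTF format is the ODS RTF destination. The sketch below is a minimal example under assumptions: the output path is hypothetical and must be replaced with a folder you can write to, and the PROC FREQ step is only a placeholder for your own analyses.

```sas
/* Sketch only: route procedure output to an RTF file.               */
/* The file path below is hypothetical; point it at your own folder. */
ODS RTF FILE = "/home/your_user_id/heart_results.rtf";

PROC FREQ DATA = sashelp.heart;
  TABLES Smoking_Status;   /* placeholder analysis */
RUN;

ODS RTF CLOSE;             /* closing the destination finishes writing the file */
```

The resulting file opens in Word, where the reported values can then be highlighted.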


Descriptive Analysis

Make sure your paper has four distinct parts (shown in the table below) that
flow well and transition as seamlessly as possible into one another.
The main point of this final paper is to state the conclusions you’ve
reached from all the various analyses you have performed, and use these
conclusions to answer your overall research question. You should write
the paper as if your reader does not have an extensive background in
biostatistics. When you write as if your reader is a novice, this will
force you to be very detailed in your explanations, and will help you to
write in a much clearer and well-thought-out manner.



Inferential Analysis

After you have written up the findings for your descriptive data analysis, your next step is to conduct inferential data analysis and report those findings. Now you are moving from just describing the individual variables of interest to investigating whether there are statistically significant relationships among your variables of interest. These inference tests will help you answer your main research question.

The purpose of this assignment is to provide a second rough draft of another main section of your analysis paper (the inferential statistics), so that you can receive feedback on your methods. Your inferential data analysis should include more than one inferential test. This could just be an ANOVA with an examination of the overall F-test and discussion of the post-hoc tests. It could also be a chi-square and a logistic regression. Any of the tests of inference discussed in Chapters 7 and 9 of your textbook are fair game for you to use.
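
The chi-square plus logistic regression pairing mentioned above could look roughly like the sketch below. It is an illustration under assumptions: the outcome is taken to be Status in SASHELP.HEART with the level 'Dead', and the reference level 'Non-smoker' is taken from the data dictionary; check both against a PROC FREQ listing before relying on them.

```sas
/* Sketch only: a chi-square test followed by a logistic regression.  */
/* The levels 'Dead' and 'Non-smoker' are assumptions to verify.      */

/* Test 1: is smoking status associated with vital status? */
PROC FREQ DATA = sashelp.heart;
  TABLES Smoking_Status * Status / CHISQ;
RUN;

/* Test 2: odds of death by smoking status, adjusting for weight */
PROC LOGISTIC DATA = sashelp.heart;
  CLASS Smoking_Status (REF = 'Non-smoker') / PARAM = REF;
  MODEL Status (EVENT = 'Dead') = Smoking_Status Weight;
RUN;
```

With reference coding, each odds ratio compares one smoking group against non-smokers, which tends to be easier to explain to a novice reader.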


Please use the following as a guideline for what you are required to include in the paper and how your paper will be graded:


Criteria (maximum points possible)

Introduction (10 points):

  • Short description of your research question and its relevance to public health. It is recommended to do a quick literature review and include information from other peer-reviewed journal articles to support the statements you are making regarding this paper. [Remember that if you use any outside sources for this paper, you must cite them, APA-style, in the Reference section at the end of the paper (before the Appendix).] (5 points)
  • Statement of the dataset you will use to answer the question and a short description of the parent study that this dataset originated from (5 points)


Methods (20 points):

  • Names of the variables you will use to answer the research question (5 points)
  • Descriptions of the variables you will use to answer the research question (5 points)
  • Description of the descriptive statistics methodology (5 points):
    (1) Describe the SAS analyses you did to produce descriptive statistics that summarize the study sample based upon the characteristics (variables) you are interested in.
    (2) Discuss why you chose these methods (should be based upon the variables you are using and the question you are answering).
  • Description of the inferential statistics methodology (5 points):
    (1) Describe the SAS analyses you did to produce the inferential statistics that assess the relationships between/among the characteristics (variables) that you are investigating in order to answer your research question for the study sample.
    (2) Discuss why you chose these methods (this should be based upon the variables you are using and the question you are answering).


Discussion (45 points):

  • Discussion of the results of the descriptive statistics analysis done using SAS (20 points):
    (1) Report the appropriate statistics for numeric variables, based upon whether the variable is normal or skewed.
    (2) Report the appropriate statistics for categorical variables.
    (3) Present graphical (visual) representations of each variable of interest, AND make sure to discuss/explain what the graph means within the discussion section. You can copy and paste this graph or chart directly from SAS into the body of your paper.
    (4) Present a table of results within the body of your paper that summarizes all the descriptive statistics you have discussed in your results section. Make sure to create your own table; DO NOT just copy and paste a table from SAS into your paper.
  • Discussion of the results of the inferential statistics analysis done using SAS (25 points). This must be done for EACH test you run in SAS:
    (1) State the null and alternate hypotheses for each test of inference.
    (2) State the appropriate test statistic SAS will calculate for that particular test (i.e., t-values, z-values, chi-square values, F-values, r-values, or β-values).
    (3) State the decision rule. When working in SAS you should be using either the p-value decision rule OR the confidence interval decision rule.
    (4) Report the test-statistic value produced by SAS.
    (5) State the conclusion regarding the null hypothesis and how you reached this conclusion (the statistical evidence); then, based on your conclusion, answer the original research question.


Conclusion (10 points):

  • State the final conclusions of what you’ve learned about your study sample, based upon the findings from your descriptive statistics analysis; discuss how these findings relate to the original research question (5 points)
  • State the final conclusions of what you’ve learned about your study sample from your inferential statistics analysis, and answer the original research question based upon these findings (5 points)


Proper grammar and sentence structure used for clear and concise writing (5 points)

SAS Results that show where the results you are actually reporting came from should be printed out into RTF format, and the sections where you directly obtained results should be highlighted, so that it is clear to see where you obtained your final answers (5 points)

SAS Codes used for your analyses (copied and pasted into the Appendix of your paper) (5 points)

TOTAL SCORE: 100 points





PLEASE REMEMBER:



  • Your paper should be in APA format: double-spaced, with a title page, a header with page number, and 1-inch margins on all sides. You can find a good reference to APA style here: https://owl.english.purdue.edu/owl/resource/560/01/
  • A data dictionary has been created for these three datasets, which will give you detailed information about the variables available for you to use in these datasets for your analysis project. The data dictionary can be found on the class website.
  • Additional resources to help you with the data analysis (SAS coding) AND writing process for your analysis project can be found on Blackboard. You have many resources to help you with this project. It is up to you to use them.


Paper Length

The length of your paper should be a minimum of 7 pages, double-spaced (not including tables, graphs, or charts). After 10 pages I will stop reading (seriously). If it is much less than 7 pages, you probably left something important out. If it is over 10 pages, you are probably going off-topic.


Descriptive Tables

You can, and should, add tables (that you create) to your paper, but make sure you have achieved your minimum of 6 pages of text. In other words, do not use visual representations of your findings as filler for your paper. I will be able to notice any attempts to pad your paper. Tables are a good way to illustrate what you are writing about in your Discussion section, and you must make sure to clearly discuss any tables presented in your paper.

Any results or findings you present in a table, within the body of your paper, must be explained and discussed.

EXAMPLE 2

Running head: FRAMINGHAM ANALYSIS PROJECT

Framingham Analysis Project: Faster Death with Smoking
{Student Name}
COH 602: Biostatistics
Professor Wosu
{Date}

Abstract
In this analysis I aim to collate statistical facts on cardiovascular disease by examining the effects of smoking on participants of the Framingham Heart Study, and how smoking affects blood pressure levels that could increase their lifetime risk of developing cardiovascular disease. Smokers will be compared to non-smokers, and both groups will be analyzed at death. This project uses the “Heart” dataset, which, according to Sullivan (2012), comes from a longitudinal cohort study that began in 1948 with an enrollment of over five thousand participants who were free of cardiovascular disease in the town of Framingham, Massachusetts. It will be used to identify risk factors for cardiovascular disease such as smoking and blood pressure.
Commented [KW1]: It is not necessary to write an abstract.

Framingham Analysis Project: Faster Death with Smoking
Cardiovascular diseases (CVDs) are diseases of the heart and blood vessels that can restrict the supply of blood to the brain. CVDs are the number one cause of death globally, as more people die each year from CVDs than from any other disease. “An estimated 17.5 million people died from CVDs in 2012, representing 31% of all global deaths” (WHO, 2014). CVDs are noncommunicable diseases associated with poor lifestyle choices and related risk factors; many can be prevented if risk factors such as heavy tobacco use are addressed. Because these risk factors are modifiable, addressing them has become a major task and concern for public health prevention efforts. “The current focus is providing information on the impact of unhealthy lifestyle choices as risk factors for preventable chronic diseases and encouraging individual responsibility for one’s health” (Koenig, 2014). Education on these risk factors and on the need to reduce tobacco consumption has aided in the prevention of CVD, but much remains to be done on a large scale to prevent deaths and encourage lifestyle changes that may decrease morbidity.
Analysis introduction
This analysis asks whether smoking increases the risk of CVD via increases in systolic and diastolic blood pressure, and whether smokers with cardiovascular disease die earlier as a result. Data from the original Framingham Heart Study cohort will be analyzed in SAS, using PROC FREQ, PROC MEANS, and PROC UNIVARIATE for descriptive statistics and PROC GLM for inferential statistics. The analysis focuses on the individuals represented in this study, their smoking status, their blood pressure levels, and how these relate to length of life.
Descriptive Analysis Methods
To identify the variables available for a crude analysis of the association between smoking and cardiovascular disease in the Framingham data set, PROC CONTENTS was run in SAS to view the contents of the “heart” data set. The output shows 5,209 observed participants and 17 variables. To answer the study question, this analysis focuses on age at death, blood pressure, cause of death, and smoking status. The SASHELP.HEART (Framingham Heart Study) data dictionary, shown below, was used to determine the response values of the chosen variables: Age at Death (36-93); Smoking Status, categorized as Non-smoker, Light (1-5), Moderate (6-15), Heavy (16-25), and Very Heavy (>25); Systolic Blood Pressure (82-300); and Diastolic Blood Pressure (50-160).
Data from Framingham Heart Study Data dictionary

Variable Title    Variable Label                 Response Values
AGEATDEATH        Age at Death                   36-93
SMOKING           Number of cigarettes smoked    0-60
SMOKING_STATUS    Smoking Status                 1 = Heavy (16-25)
                                                 2 = Light (1-5)
                                                 3 = Moderate (6-15)
                                                 4 = Non-smoker
                                                 5 = Very Heavy (>25)
DIASTOLIC         Diastolic Blood Pressure       50-160
SYSTOLIC          Systolic Blood Pressure        82-300

Commented [KW2]: Good job on the tables!

PROC FREQ is used to tabulate the categorical variable smoking status. PROC UNIVARIATE is used to produce descriptive statistics for the continuous variables age at death, systolic blood pressure, and diastolic blood pressure, including the mean, standard deviation, minimum, and maximum, together with a wider array of summary statistics such as the median, mode, range, interquartile range, and skewness. Finally, PROC SORT is used to order the data by smoking status so that the statistics can be stratified by group, which addresses the research question of how smoking affects blood pressure and age at death.
Descriptive Analysis Results

Table 1: Smoking status, Frequency and Percentage

Smoking Status       Frequency    Percent
Non-smoker           2501         48.35 %
Light (1-5)          579          11.19 %
Moderate (6-15)      576          11.13 %
Heavy (16-25)        1046         20.22 %
Very Heavy (>25)     471          9.10 %

Table 1 shows that non-smokers are the largest group in the Framingham study at 48.35%, followed by heavy smokers at 20.22%, light smokers at 11.19%, moderate smokers at 11.13%, and very heavy smokers at 9.10%. Because nearly half of the participants are non-smokers, risk factors other than smoking must also contribute to the risk of developing cardiovascular disease, as is already known from the Framingham Heart Study.

Table 2: Descriptive Statistics of Continuous Variables

Variable              N       Mean      STDEV    Median    Mode      Minimum    Maximum
Age At Death          1991    70.54     10.56    71.00     68.00     36         93
Diastolic Pressure    5209    85.36     12.97    84.00     80.00     50         160
Systolic Pressure     5209    136.91    23.74    132.00    120.00    82         300

Table 2 contains the descriptive statistics for the continuous variables. The mean and median of age at death, systolic pressure, and diastolic pressure are close to one another, differing by no more than 5 units. Age at death has a mean of 70.54 and a median of 71.00, systolic pressure has a mean of 136.91 and a median of 132.00, and diastolic pressure has a mean of 85.36 and a median of 84.00. Further representation is shown in Graphs 1-3.
Table 3. Further Summary Statistics for Framingham Heart Study Participants

Variable              Range    IQR*    Skewness    Distribution (Pos./Neg./Normal)
Age at Death          57.0     16      -0.32       Normal
Diastolic Pressure    110.0    16      0.88        Normal
Systolic Pressure     218.0    28      1.49        Normal

*IQR = Interquartile Range

Graph 1: Normal distribution and Histogram for Age at Death
Graph 2: Normal Distribution and Histogram for Diastolic Pressures
Graph 3: Normal Distribution and Histogram for Systolic Pressures

Graphs 1-3 show the skewness and the direction of asymmetry of each distribution, and Table 3 lists the corresponding skewness values. Because the mean and median of each variable are similar, all three distributions are treated as approximately normal. Graph 1 shows that Age at Death is slightly skewed to the left, with a skewness of -0.32. Diastolic Pressure (Graph 2) and Systolic Pressure (Graph 3) both have positive skewness, with values of 0.88 and 1.49 respectively (Table 3, Skewness column).
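
Since the rubric asks for statistics chosen according to whether a variable is normal or skewed, a formal check can complement the mean/median comparison. The sketch below is one possible way to do this and is not part of the example paper; it assumes the same SASHELP.HEART variable names used in the appendix.

```sas
/* Sketch only: normality tests and Q-Q plots for the three variables. */
/* The NORMAL option on the PROC statement requests normality tests;   */
/* the QQPLOT reference line uses the estimated mean and std. dev.     */
PROC UNIVARIATE DATA = sashelp.heart NORMAL;
  VAR AgeAtDeath Diastolic Systolic;
  QQPLOT AgeAtDeath Diastolic Systolic / NORMAL(MU = EST SIGMA = EST);
RUN;
```

With samples this large, the tests can flag even small departures from normality, so the plots and the skewness values are usually the more informative evidence.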
Table 4: Mean of Age at Death, Diastolic and Systolic Pressure categorized by smoking

Variable              Non-Smoker    Light (1-5)    Moderate (6-15)    Heavy (16-25)    Very Heavy (>25)
Age at Death          73.76         70.52          68.59              68.02            65.41
Diastolic Pressure    86.91         83.78          82.61              83.85            85.67
Systolic Pressure     140.38        134.14         131.71             133.36           136.00

Table 4 breaks down the mean age at death, diastolic pressure, and systolic pressure by smoking status, computed using PROC SORT and PROC MEANS. Comparing smokers with non-smokers, the mean diastolic pressure is 86.91 in non-smokers and 85.67 in very heavy smokers, and the mean systolic pressure is 140.38 in non-smokers and 136.00 in very heavy smokers. The blood pressure means are slightly lower in every smoking group than in non-smokers, and the differences are small, which runs counter to the expectation that smoking would be associated with higher blood pressure. Mean age at death, on the other hand, decreases steadily from 73.76 in non-smokers to 65.41 in very heavy smokers. This suggests that smokers tend to die at a younger age, which is consistent with the research question of whether smoking affects the length of life.
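
A stratified summary like Table 4 can also be produced in a single step. The sketch below is an alternative to the sort-then-BY approach, shown under the assumption that Smoking_Status is the grouping variable; the CLASS statement does not require the data to be sorted first.

```sas
/* Sketch only: group means without a prior PROC SORT.               */
/* CLASS forms the groups; MAXDEC = 2 rounds the printed statistics. */
PROC MEANS DATA = sashelp.heart N MEAN STD MAXDEC = 2;
  CLASS Smoking_Status;
  VAR AgeAtDeath Diastolic Systolic;
RUN;
```

Participants with a missing smoking status are dropped from the class levels by default, so the group counts may not add up to 5,209.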
Inferential Statistics Methods
A hypothesis test will be used to address the research question of whether age at death differs among non-smokers and light, moderate, heavy, and very heavy smokers with cardiovascular disease, with heavier smokers expected to die earlier. Because the analysis involves one continuous variable (age at death) and one categorical variable with more than two categories (smoking status), analysis of variance (ANOVA) will be used. According to Sullivan (2012), the ANOVA technique applies when there are more than two independent comparison groups; it compares the means of the comparison groups and is conducted using a five-step approach. A one-factor ANOVA will be used to compare the mean age at death across the levels of the factor, which represent the different smoking levels.
Inferential Statistics Results
1. Set up hypotheses and determine the level of significance
u1 = non-smoker, u2 = light, u3 = moderate, u4 = heavy, u5 = very heavy
H0: u1 = u2 = u3 = u4 = u5
H1: Means are not all equal
a = 0.05
2. Select the appropriate test statistic
$$F = \frac{\sum_{j} n_j (\bar{X}_j - \bar{X})^2 / (k - 1)}{\sum\sum (X - \bar{X}_j)^2 / (N - k)} = \frac{MSB}{MSE}$$
3. Set up the decision rule
To determine the critical value of F, we need the degrees of freedom, with k = 5 groups:
df1 = k - 1 = 5 - 1 = 4
df2 = N - k = 1971 - 5 = 1966
Using the critical value table at the end of this paper, derived from the textbook appendix, with a = 0.05, df1 = 4, and df2 = 1966, we will reject H0 if F > 2.46.
4. Compute the test statistic
F = 47.21
5. Conclusion
Reject H0 because 47.21 > 2.46. We have statistically significant evidence at a = 0.05 to show that the mean ages at death for non-smokers, light smokers, moderate smokers, heavy smokers, and very heavy smokers are not all equal.
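
The decision rule above can be cross-checked in SAS rather than read from a printed table. The sketch below is an illustration only, assuming N = 1971, k = 5 groups, and F = 47.21 as reported in the example paper; FINV returns the F quantile and PROBF the cumulative probability, so the p-value decision rule requested in the rubric can be stated alongside the critical value.

```sas
/* Sketch only: critical value and p-value for the reported F test. */
/* n, k, and f_obs are taken from the example paper.                */
DATA _NULL_;
  alpha = 0.05;
  k     = 5;        /* number of smoking groups           */
  n     = 1971;     /* observations used in the ANOVA     */
  f_obs = 47.21;    /* F value reported from PROC GLM     */
  df1   = k - 1;    /* numerator degrees of freedom       */
  df2   = n - k;    /* denominator degrees of freedom     */
  f_crit  = FINV(1 - alpha, df1, df2);      /* reject H0 if f_obs > f_crit */
  p_value = 1 - PROBF(f_obs, df1, df2);     /* reject H0 if p_value < alpha */
  PUT df1= df2= f_crit= p_value=;
RUN;
```

The values are written to the SAS log. A printed table that stops at a smaller denominator degrees of freedom gives a slightly larger, more conservative critical value.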
Commented [KW3]: Excellent work explaining what the subscripts represent.
Commented [KW4]: Shows how you can do symbols if you do not know how to use the “Symbols” option in Microsoft Word.
Commented [KW5]: Formula was copied and pasted from lecture PowerPoints or some other course material, which is perfectly acceptable and resourceful.

PROC GLM was run with the level of significance set at a=0.05. The SAS output provided the sample size (N), which I used with Appendix Table 4 of the textbook by Sullivan (2012) to obtain a critical value of 2.46. The test statistic F was taken from the F value in the SAS results. Because F > 2.46, I rejected the null hypothesis, which supports the conclusion that the mean ages at death for non-smokers, light smokers, moderate smokers, heavy smokers, and very heavy smokers are not all equal.
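
Rejecting the overall F-test only says that at least one group mean differs; a post-hoc comparison is needed to say which pairs differ, as the reviewer comment in this example points out. The sketch below is one possible way to request it. It uses LSMEANS with a Tukey adjustment, which is a substitution made for illustration because it handles unequal group sizes; the appendix code requests MEANS / TUKEY instead.

```sas
/* Sketch only: one-way ANOVA with Tukey-adjusted pairwise comparisons. */
/* PDIFF prints p-values for every pair of smoking groups;              */
/* CL adds confidence limits for the means and their differences.       */
PROC GLM DATA = sashelp.heart;
  CLASS Smoking_Status;
  MODEL AgeAtDeath = Smoking_Status;
  LSMEANS Smoking_Status / PDIFF ADJUST = TUKEY CL;
RUN;
QUIT;
```

Each adjusted p-value below 0.05, or each confidence interval for a difference that excludes zero, identifies a pair of smoking groups whose mean ages at death differ.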
Commented [KW6]: The major issue with this part is that after the ANOVA result is to reject the null hypothesis, there is no further discussion of the Tukey post-hoc multiple comparisons test, which tells us specifically which groups have means that are not equal. This part is necessary, and you can see how it’s supposed to be done in the third example paper.

Conclusion
This paper presents participant data from the Framingham Heart Study cohort by smoking status. As shown in Table 4, smoking is associated with a shorter life, as the mean age at death decreases across smoking levels relative to non-smokers. The statistics indicate that, over time, non-smokers live longer with cardiovascular disease, as their mean age at death is higher than that of smokers. To relate this back to public health, this research can help support anti-smoking education aimed at reducing diseases caused by lifestyle choices like smoking. It is important for individuals to learn that life expectancy may increase if they stop smoking.

References
Koenig, P. (2014, October 10). Chronic disease as a result of poor lifestyle choices. Retrieved August 21, 2016, from https://www.eastporthealth.org/articles/detail.php?ChronicDisease-As-a-Result-of-Poor-Lifestyle-Choices-6
Sullivan, L. M. (2012). Essentials of biostatistics in public health (2nd ed.). Sudbury, MA: Jones & Bartlett Learning.
World Health Organization. (2014, May). Cardiovascular diseases (CVDs). Retrieved August 21, 2016, from http://www.who.int/mediacentre/factsheets/fs317/en/

Appendix
SAS Codes
Descriptive Analysis Codes For Categorical Variable
/* Straight quotes are required; curly quotes pasted from Word cause errors */
TITLE "Sashelp.heart - Framingham Heart Study";
PROC CONTENTS DATA = sashelp.heart;
RUN;
PROC FREQ DATA = sashelp.heart;
  TABLES status;
  TABLES smoking_status;   /* categorical smoking variable reported in Table 1 */
RUN;
Descriptive Analysis Codes for Continuous Variables
PROC UNIVARIATE DATA = sashelp.heart;
  VAR ageatdeath diastolic systolic;
RUN;
PROC UNIVARIATE DATA = sashelp.heart;
  VAR diastolic systolic ageatdeath;
  HISTOGRAM / NORMAL;      /* histograms with a fitted normal curve */
RUN;
/* BY-group processing requires the data to be sorted by the BY variable */
PROC SORT DATA = sashelp.heart OUT = temp;
  BY smoking_status;
RUN;
PROC UNIVARIATE DATA = temp;
  VAR ageatdeath diastolic systolic;
  BY smoking_status;
RUN;
PROC MEANS DATA = temp;    /* temp2 was never created; use the sorted copy */
  VAR AgeAtDeath Smoking Systolic Diastolic;
  BY smoking_status;
RUN;
Inferential Analysis Codes
PROC GLM DATA = sashelp.heart;
  CLASS smoking_status;
  MODEL ageatdeath = smoking_status;
  MEANS smoking_status / TUKEY;   /* post-hoc pairwise comparisons */
RUN;
QUIT;                             /* PROC GLM stays active until QUIT */
BIOSTATISTICS
SAS Data Analysis Project: Descriptive & Inferential Data Analysis Final Paper
Instructions & Grading Criteria
Purpose & Content of Analysis Paper
This final paper is the culmination of all of your SAS data analysis work in this course. The
purpose of this paper is to show your expertise in performing descriptive and inferential SAS
analyses. Additionally, writing your paper will give you some experience of how to correctly
discuss your work and results in a clear and concise manner.
Your completed final analysis project will be graded based on the following criteria:
1. Introduction (10 points)
2. Methods (20 points)
3. Discussion (45 points)
4. Conclusion (10 points)
5. Proper grammar and sentence structure used for clear and concise writing (5 points)
6. SAS results printed out in RTF format and results being reported in paper highlighted (5
points)
7. SAS code copied and pasted into the same Word document as your paper (5 points)
Your paper should be in APA format – that is, double-spaced, with a title page, header with
page number and 1 inch margin on all sides. You can find a good reference to APA style here.
https://owl.english.purdue.edu/owl/resource/560/01/
Parts 1-4 of your paper should be between 8- 10 pages, double-spaced (not including tables).
After 10 pages I will stop reading (seriously). If it is much less than 8 pages, you probably left
something important out. If it is over 10 pages, you are probably going off-topic.
Part 6 should include about 2 pages of SAS code copied and pasted into the Appendix of your
paper. If it isn’t that long, you probably left something out. The output may be much longer,
depending on your research question and the data set used.
Part 7 should include approximately 25 pages of SAS results printed out in RTF format and the
specific sections where you obtained the results you are reporting in the paper highlighted, so
that it is very clear where your answers are coming from. This part is essential, so that I can
double-check and confirm that you have run your analyses in the correct manner and that you
are reporting the appropriate results. If your results are not highlighted, and I have to hunt for
your answers, you will lose points on this section.
Descriptive Analysis
Make sure your paper has four distinct parts (shown in the table below) that flow well and
transition as seamlessly as possible into one another. The main point of this final paper is to
state the conclusions you’ve reached from all the various analyses you have performed, and
use these conclusions to answer your overall research question. You should write the paper as
if your reader does not have an extensive background in biostatistics. When you write as if
your reader is a novice, this will force you to be very detailed in your explanations, and will
help you to write in a much clearer and well-thought-out manner.
Inferential Analysis
After you have written up the findings for your descriptive data analysis, your next step is to
conduct inferential data analysis and report those findings. Now you are moving from just
describing the individual variables of interest to investigating whether there are statistically
significant relationships among your variables of interest. These inference tests will help you
answer your main research question.
The purpose of this assignment is to provide a second rough draft of another main section of
your analysis paper (the inferential statistics), so that you can receive feedback on your
methods. Your inferential data analysis should include more than one inferential test. This
could just be an ANOVA with an examination of the overall F-test and discussion of the post-hoc tests. It could also be a chi-square and a logistic regression. Any of the tests of inference
discussed in Chapters 7 and 9 of your textbook are fair game for you to use.
Please use the following as a guideline for what you are required to include in the paper and
how your paper will be graded:
Criteria (maximum points possible)
Introduction (10 points):
a) Short description of your research question and its relevance to public health. It is recommended to do a quick literature review and include information from other peer-reviewed journal articles to support the statements you are making regarding this paper. [Remember that if you use any outside sources for this paper, you must cite them, APA-style, in the Reference section at the end of the paper (before the Appendix).] (5 points)
b) Statement of the dataset you will use to answer the question and a short description of the parent study that this dataset originated from (5 points)
Methods (20 points):
a) Names of the variables you will use to answer the research question (5 points)
b) Descriptions of the variables you will use to answer the research question (5 points)
c) Description of the descriptive statistics methodology (5 points):
   (1) Describe the SAS analyses you did to produce descriptive statistics that summarize the study sample based upon the characteristics (variables) you are interested in.
   (2) Discuss why you chose these methods (should be based upon the variables you are using and the question you are answering).
d) Description of the inferential statistics methodology (5 points):
   (1) Describe the SAS analyses you did to produce the inferential statistics that assess the relationships between/among the characteristics (variables) that you are investigating in order to answer your research question for the study sample.
   (2) Discuss why you chose these methods (this should be based upon the variables you are using and the question you are answering).
Discussion (45 points):
a) Discussion of the results of the descriptive statistics analysis done using SAS (20 points):
   (1) Report the appropriate statistics for numeric variables, based upon whether the variable is normal or skewed.
   (2) Report the appropriate statistics for categorical variables.
   (3) Present graphical (visual) representations of each variable of interest, AND make sure to discuss/explain what the graph means within the discussion section. You can copy and paste this graph or chart directly from SAS into the body of your paper.
   (4) Present a table of results within the body of your paper that summarizes all the descriptive statistics you have discussed in your results section. Make sure to create your own table; DO NOT just copy and paste a table from SAS into your paper.
b) Discussion of the results of the inferential statistics analysis done using SAS (25 points). This must be done for EACH test you run in SAS:
   (1) State the null and alternate hypotheses for each test of inference.
   (2) State the appropriate test statistic SAS will calculate for that particular test (i.e., t-values, z-values, chi-square values, F-values, r-values, or β-values).
   (3) State the decision rule. When working in SAS you should be using either the p-value decision rule OR the confidence interval decision rule.
   (4) Report the test-statistic value produced by SAS.
   (5) State the conclusion regarding the null hypothesis and how you reached this conclusion (the statistical evidence); then, based on your conclusion, answer the original research question.
Conclusion (10 points):
a) State the final conclusions of what you’ve learned about
your study sample, based upon the findings from your
descriptive statistics analysis; discuss how these findings
relate to the original research question (5 points)
b) State the final conclusions of what you’ve learned about
your study sample from your inferential statistics analysis,
and answer the original research question based upon
these findings (5 points)
Proper grammar and sentence structure used for clear and
concise writing (5 points)
SAS Results that show where the results you are actually
reporting came from should be printed out into RTF format and
the sections where you directly obtained results should be
highlighted, so that it is clear to see where you obtained your
final answers (5 points)
SAS Codes used for your analyses (copied and pasted into the
Appendix of your paper) (5 points)
TOTAL SCORE: 100 points
PLEASE REMEMBER:
  • Your paper should be in APA format, that is, double-spaced, with a title page, header
    with page number, and 1-inch margins on all sides. You can find a good reference to
    APA style here: https://owl.english.purdue.edu/owl/resource/560/01/
  • A data dictionary has been created for these three datasets, which will give you detailed
    information about the variables available for you to use in these datasets for your
    analysis project. The data dictionary can be found on the class website.
  • Additional resources to help you with the data analysis (SAS coding) AND writing
    process for your analysis project can be found on Blackboard. You have many resources
    to help you with this project. It is up to you to use them.
Paper Length
The length of your paper should be a minimum of 5 pages, double-spaced (not including
tables, graphs, or charts). After 10 pages I will stop reading (seriously). If it is much less than 5
pages, you probably left something important out. If it is over 10 pages, you are probably
going off-topic.
Descriptive Tables
You can, and should, add tables (that you create) to your paper, but make sure you have
achieved your minimum of 6 pages of text. In other words, do not use visual representations of
your findings as filler for your paper. I will be able to notice any attempts to pad your paper.
Tables are a good way to illustrate what you are writing about in your Discussion section. Any
results or findings you present in a table, within the body of your paper, must be explained and
discussed.
SAS Codes & Output Submission
It is important that you submit your SAS Codes and Output, so that I can make sure you have
run the proper analyses. This is a way to double-check your work, and it is necessary. Your
output will take up many pages, but your code should just take a few pages.
Copying Codes
To copy your SAS Codes in order to submit them, highlight the codes you want and copy them
by pressing and holding the “Ctrl” key and pressing the “C” key simultaneously on your
keyboard if using the Windows Operating System.
If you are using the Mac Operating System, highlight the codes you want and copy them by
pressing and holding the “Command” key and pressing the “C” key simultaneously on your
keyboard.
Pasting Codes
Open a Word document and position the mouse cursor where you want to paste the text you
previously copied. Press the “Ctrl” (or “Command”) key and the “V” key simultaneously to paste
the text.
Once you have copied and pasted all your codes to the document, you can save it and submit
it in the Dropbox.
Printing SAS Results in SAS Studio
These steps for printing can be carried out for SAS Results and should be used when
submitting your Results and Code with the final paper.
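One way to get results into RTF directly from code, rather than through the SAS Studio menus, is the ODS RTF destination. The sketch below is a minimal example under assumptions: the file path is hypothetical and must be replaced with a folder you can write to, and the PROC FREQ step stands in for whatever analyses you actually run.

```sas
/* Hypothetical output path: replace with your own writable folder. */
ods rtf file="/home/your_user_id/heart_results.rtf";

/* Every procedure run between the two ODS statements is
   written to the RTF file. */
proc freq data=sashelp.heart;
   tables DeathCause;
run;

ods rtf close;   /* closing the destination finishes writing the file */
```

The resulting file opens in Word, where the sections you report from can be highlighted.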
How to Insert Symbols (Greek Letters) into Your Document (Microsoft Word Only)
How to Create Equations in Your Document (Microsoft Word Only)
Saturday, March 3, 2018 03:29:48 PM
The FREQ Procedure

Cause of Death (DeathCause)

Cause of Death               Frequency  Percent  Cumulative  Cumulative
                                                  Frequency     Percent
Cancer                             539    27.07         539       27.07
Cerebral Vascular Disease          378    18.99         917       46.06
Coronary Heart Disease             605    30.39        1522       76.44
Other                              357    17.93        1879       94.37
Unknown                            112     5.63        1991      100.00

Frequency Missing = 3218

Weight

Weight  Frequency  Percent  Cumulative  Cumulative
                             Frequency     Percent
    67          1     0.02           1        0.02
    71          1     0.02           2        0.04
    72          1     0.02           3        0.06
    82          1     0.02           4        0.08
    83          1     0.02           5        0.10
    85          1     0.02           6        0.12
    87          4     0.08          10        0.19
    89          2     0.04          12        0.23
    90          1     0.02          13        0.25
    91          3     0.06          16        0.31
    92          5     0.10          21        0.40
    94          6     0.12          27        0.52
    95          3     0.06          30        0.58
    96          4     0.08          34        0.65
    97          3     0.06          37        0.71
    98         12     0.23          49        0.94
    99          9     0.17          58        1.11
   100          7     0.13          65        1.25
   101         10     0.19          75        1.44
   102         14     0.27          89        1.71
   103          9     0.17          98        1.88
   104         21     0.40         119        2.29
   105         14     0.27         133        2.56
   106         26     0.50         159        3.06
   107         18     0.35         177        3.40
   108         24     0.46         201        3.86
   109         27     0.52         228        4.38
   110         20     0.38         248        4.77
   111         21     0.40         269        5.17
   112         28     0.54         297        5.71
   113         36     0.69         333        6.40
   114         42     0.81         375        7.21
   115         39     0.75         414        7.96
   116         38     0.73         452        8.69
   117         49     0.94         501        9.63
   118         52     1.00         553       10.63
   119         38     0.73         591       11.36
   120         50     0.96         641       12.32
   121         34     0.65         675       12.97
   122         53     1.02         728       13.99
   123         54     1.04         782       15.03
   124         45     0.86         827       15.89
   125         56     1.08         883       16.97
   126         58     1.11         941       18.09
   127         57     1.10         998       19.18
   128         69     1.33        1067       20.51
   129         60     1.15        1127       21.66
   130         54     1.04        1181       22.70
   131         65     1.25        1246       23.95
   132         72     1.38        1318       25.33
   133         70     1.35        1388       26.68
   134         60     1.15        1448       27.83
   135         76     1.46        1524       29.29
   136         76     1.46        1600       30.75
   137         88     1.69        1688       32.44
   138         93     1.79        1781       34.23
   139         70     1.35        1851       35.58
   140         64     1.23        1915       36.81
   141         78     1.50        1993       38.30
   142         67     1.29        2060       39.59
   143         69     1.33        2129       40.92
   144         72     1.38        2201       42.30
   145         78     1.50        2279       43.80
   146         66     1.27        2345       45.07
   147         68     1.31        2413       46.38
   148         76     1.46        2489       47.84
   149         63     1.21        2552       49.05
   150         75     1.44        2627       50.49
   151         77     1.48        2704       51.97
   152         68     1.31        2772       53.28
   153         70     1.35        2842       54.62
   154         75     1.44        2917       56.06
   155         62     1.19        2979       57.26
   156         67     1.29        3046       58.54
   157         56     1.08        3102       59.62
   158         57     1.10        3159       60.71
   159         70     1.35        3229       62.06
   160         56     1.08        3285       63.14
   161         40     0.77        3325       63.91
   162         63     1.21        3388       65.12
   163         42     0.81        3430       65.92
   164         56     1.08        3486       67.00
   165         55     1.06        3541       68.06
   166         56     1.08        3597       69.13
   167         62     1.19        3659       70.32
   168         61     1.17        3720       71.50
   169         46     0.88        3766       72.38
   170         53     1.02        3819       73.40
   171         57     1.10        3876       74.50
   172         47     0.90        3923       75.40
   173         50     0.96        3973       76.36
   174         47     0.90        4020       77.26
   175         61     1.17        4081       78.44
   176         42     0.81        4123       79.24
   177         44     0.85        4167       80.09
   178         45     0.86        4212       80.95
   179         51     0.98        4263       81.93
   180         52     1.00        4315       82.93
   181         46     0.88        4361       83.82
   182         52     1.00        4413       84.82
   183         35     0.67        4448       85.49
   184         27     0.52        4475       86.01
   185         24     0.46        4499       86.47
   186         31     0.60        4530       87.07
   187         33     0.63        4563       87.70
   188         32     0.62        4595       88.31
   189         29     0.56        4624       88.87
   190         40     0.77        4664       89.64
   191         30     0.58        4694       90.22
   192         22     0.42        4716       90.64
   193         32     0.62        4748       91.26
   194         34     0.65        4782       91.91
   195         26     0.50        4808       92.41
   196         20     0.38        4828       92.79
   197         21     0.40        4849       93.20
   198         18     0.35        4867       93.54
   199         12     0.23        4879       93.77
   200         18     0.35        4897       94.12
   201          8     0.15        4905       94.27
   202         13     0.25        4918       94.52
   203         25     0.48        4943       95.00
   204         16     0.31        4959       95.31
   205         12     0.23        4971       95.54
   206          8     0.15        4979       95.69
   207         11     0.21        4990       95.91
   208         17     0.33        5007       96.23
   209         12     0.23        5019       96.46
   210         15     0.29        5034       96.75
   211          8     0.15        5042       96.91
   212         11     0.21        5053       97.12
   213          8     0.15        5061       97.27
   214          6     0.12        5067       97.39
   215         10     0.19        5077       97.58
   216          7     0.13        5084       97.71
   217          3     0.06        5087       97.77
   218          2     0.04        5089       97.81
   219          7     0.13        5096       97.94
   220          8     0.15        5104       98.10
   221          5     0.10        5109       98.19
   222         10     0.19        5119       98.39
   223          4     0.08        5123       98.46
   224          3     0.06        5126       98.52
   225          4     0.08        5130       98.60
   226          6     0.12        5136       98.71
   227          4     0.08        5140       98.79
   228          5     0.10        5145       98.89
   229          4     0.08        5149       98.96
   230          3     0.06        5152       99.02
   231          3     0.06        5155       99.08
   232          2     0.04        5157       99.12
   234          2     0.04        5159       99.15
   235          4     0.08        5163       99.23
   236          4     0.08        5167       99.31
   237          2     0.04        5169       99.35
   238          3     0.06        5172       99.40
   239          4     0.08        5176       99.48
   240          1     0.02        5177       99.50
   241          2     0.04        5179       99.54
   242          1     0.02        5180       99.56
   243          1     0.02        5181       99.58
   244          3     0.06        5184       99.63
   245          2     0.04        5186       99.67
   246          1     0.02        5187       99.69
   247          1     0.02        5188       99.71
   250          2     0.04        5190       99.75
   255          1     0.02        5191       99.77
   256          1     0.02        5192       99.79
   260          1     0.02        5193       99.81
   261          1     0.02        5194       99.83
   269          1     0.02        5195       99.85
   271          1     0.02        5196       99.87
   273          1     0.02        5197       99.88
   275          1     0.02        5198       99.90
   276          1     0.02        5199       99.92
   281          1     0.02        5200       99.94
   293          1     0.02        5201       99.96
   300          2     0.04        5203      100.00

Frequency Missing = 6

Smoking

Smoking  Frequency  Percent  Cumulative  Cumulative
                              Frequency     Percent
      0       2501    48.35        2501       48.35
      1        113     2.18        2614       50.53
      5        466     9.01        3080       59.54
     10        255     4.93        3335       64.47
     15        321     6.21        3656       70.67
     20        921    17.80        4577       88.48
     25        125     2.42        4702       90.90
     30        215     4.16        4917       95.05
     35         49     0.95        4966       96.00
     40        151     2.92        5117       98.92
     45         13     0.25        5130       99.17
     50         26     0.50        5156       99.67
     55          2     0.04        5158       99.71
     60         15     0.29        5173      100.00

Frequency Missing = 36

[Vertical bar chart: Frequency (axis ticks 100 to 600) by Cause of Death (Cancer, Cerebral Vascular Disease, Coronary Heart Disease, Other, Unknown)]

[Vertical bar chart: Frequency (axis ticks 100 to 700) by Weight Midpoint (70 to 300 in steps of 10)]

[Vertical bar chart: Frequency (axis ticks 300 to 2400) by Smoking Midpoint (0.0 to 60.0 in steps of 2.5)]

The UNIVARIATE Procedure
Variable: Weight

Moments
N                      5203    Sum Weights             5203
Mean             153.086681    Sum Observations      796510
Std Deviation    28.9154261    Variance          836.101866
Skewness         0.55594115    Kurtosis          0.52275608
Uncorrected SS    126284474    Corrected SS      4349401.91
Coeff Variation  18.8882703    Std Error Mean    0.40086919

Basic Statistical Measures
Location               Variability
Mean      153.0867     Std Deviation          28.91543
Median    150.0000     Variance              836.10187
Mode      138.0000     Range                 233.00000
                       Interquartile Range    40.00000

Tests for Location: Mu0=0
Test           Statistic         p Value
Student’s t    t    381.8869     Pr > |t|
Sign           M                 Pr >= |M|
Signed Rank    S                 Pr >= |S|

Kolmogorov-Smirnov    D
Cramer-von Mises      W-Sq
Anderson-Darling      A-Sq
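
The Weight moments above (skewness of about 0.56, mean 153.09 against a median of 150) bear on the rubric's normal-versus-skewed decision. A minimal sketch of how output of this kind, plus supporting plots, might be requested is shown below; it assumes SASHELP.HEART, and the choice of options is illustrative rather than a record of the code that produced the output above.

```sas
/* NORMAL requests tests for normality alongside the moments. */
proc univariate data=sashelp.heart normal;
   var Weight;
   histogram Weight / normal;                   /* histogram with fitted normal curve */
   qqplot Weight / normal(mu=est sigma=est);    /* Q-Q plot against a fitted normal */
run;
```

If the plots and tests point to a skewed variable, the median and interquartile range would be the statistics to report; otherwise the mean and standard deviation apply.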
