R-Code -- Descriptive & Summary Statistics

Let's say you've loaded data from a CSV file (example of how to do this in the link).


Here's the data I'm working with for this example: 
FYI, red is input you would type, and green is output from R.

First, load the data:
    nihs <- read.table("C:\\Documents and Settings\\admin\\My Documents\\temp\\NHIS 2007 data.csv", header = TRUE, sep = ",")
I've named the resulting object "nihs", so R now has an object called nihs.
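As a side note, read.csv() is a convenience wrapper around read.table() that defaults to header = TRUE and sep = ",". Here is a minimal sketch of the two calls side by side. It uses a small made-up inline CSV (passed through the text argument) instead of the real NHIS file, so the names csv_text, toy_a and toy_b are purely illustrative:

```r
# Toy CSV content; the values are made up for illustration, not read from the NHIS file.
csv_text <- "SEX,BMI,SLEEP
1,33.4,8
2,26.5,7
1,32.1,7"

# read.csv() defaults to header = TRUE and sep = ","
toy_a <- read.csv(text = csv_text)

# The equivalent read.table() call, with those arguments spelled out
toy_b <- read.table(text = csv_text, header = TRUE, sep = ",")

# Compare the two results; for simple data like this they should match
all.equal(toy_a, toy_b)
```

With a real file you would pass the file path as the first argument instead of text.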

Next, just type: nihs
This should print your nihs data in the R console window.
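Printing a data frame with thousands of rows will flood the window, so it is often handier to look at just the first or last few rows. A small sketch on a made-up data frame (toy is a hypothetical stand-in for nihs):

```r
# Made-up data frame standing in for nihs
toy <- data.frame(id = 1:100, SLEEP = rep(c(6, 7, 8, 7), 25))

head(toy)       # first 6 rows (the default)
head(toy, 3)    # first 3 rows
tail(toy, 3)    # last 3 rows
```

On the real data, head(nihs) would do the same thing.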

str(nihs)
str stands for structure. It gives you a quick sense of the structure of your new R object: how many observations you have (observations are basically rows) and how many variables (variables are basically columns), along with each variable's type and first few values.
We get the following output:

'data.frame':   4785 obs. of  9 variables:
 $ HHX   : int  16 20 69 87 88 99 101 122 129 134 ...
 $ FMX   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ FPX   : int  2 1 2 1 1 1 1 1 2 2 ...
 $ SEX   : int  1 1 2 1 2 2 1 1 2 2 ...
 $ BMI   : num  33.4 26.5 32.1 26.6 27.1 ...
 $ SLEEP : int  8 7 7 8 8 98 6 7 7 7 ...
 $ educ  : int  16 14 9 14 13 12 13 12 16 18 ...
 $ height: int  74 70 61 68 66 98 99 70 65 64 ...
 $ weight: int  260 185 170 175 168 998 172 170 147 148 ...
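The pieces of information that str reports can also be pulled out one at a time. A short sketch on a made-up data frame (toy is hypothetical; its values are just the first few shown in the output above):

```r
# Made-up data frame standing in for nihs
toy <- data.frame(SEX   = c(1L, 1L, 2L),
                  BMI   = c(33.4, 26.5, 32.1),
                  SLEEP = c(8L, 7L, 7L))

nrow(toy)            # number of observations (rows)
ncol(toy)            # number of variables (columns)
names(toy)           # variable names
sapply(toy, class)   # class of each column (integer, numeric, ...)
```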

fix(nihs)
This opens a spreadsheet-style editor where you can see and edit your new R object.
(Note that you won't be able to work in the R script window until you've closed your 'fix' window.)


Mean, Median, Mode, Variance & Standard Deviation
Say we're curious about basic summary statistics (mean, median, variance, standard deviation) for our SLEEP variable. 

mean(nihs$SLEEP)
the mean is: 9.506792 

median(nihs$SLEEP)
the median is: 7


var(nihs$SLEEP)
the variance is:  217.0364

sd(nihs$SLEEP)  
the standard deviation is: 14.73215
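What about the mode? Base R has no built-in function for the statistical mode (mode() reports an object's storage mode instead). One way around this is to tabulate the values and pick the most frequent. The helper name stat_mode is hypothetical, the data are made up, and the function assumes numeric input and returns every value tied for the top count:

```r
# Hypothetical helper: most frequent value(s) of a numeric vector
stat_mode <- function(x) {
  x <- x[!is.na(x)]                              # drop missing values
  counts <- table(x)                             # frequency of each distinct value
  top <- names(counts)[counts == max(counts)]    # label(s) with the highest count
  as.numeric(top)                                # convert labels back to numbers
}

sleep_toy <- c(8, 7, 7, 8, 8, 6, 7, 7, 7)        # made-up values
stat_mode(sleep_toy)
```

On the real data this would be called as stat_mode(nihs$SLEEP).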



summary(nihs$SLEEP)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  3.000   6.000   7.000   9.507   8.000  99.000 
This gives you a quick sense of your variable's central tendency and range.

If you type summary(nihs), you'll get output for all the variables in your nihs object. 
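One thing the summary hints at: a maximum of 99 hours of sleep (and the 98 visible in the str output) looks like a code for nonresponse rather than a real measurement, and values like that would inflate the mean and standard deviation. Here is a sketch of recoding such values to NA before summarizing. It rests on the assumption that 97, 98 and 99 are missing-value codes for SLEEP, which should be checked against the survey codebook. The vector is just the first ten values from the str output, and the names sleep_toy, missing_codes and sleep_clean are illustrative:

```r
# First ten SLEEP values from the str output above
sleep_toy <- c(8, 7, 7, 8, 8, 98, 6, 7, 7, 7)

# ASSUMPTION: these codes mean "no valid answer"; verify in the codebook
missing_codes <- c(97, 98, 99)

sleep_clean <- sleep_toy
sleep_clean[sleep_clean %in% missing_codes] <- NA   # recode to missing

mean(sleep_toy)                     # pulled upward by the 98
mean(sleep_clean, na.rm = TRUE)     # mean of the valid answers only
median(sleep_clean, na.rm = TRUE)
sd(sleep_clean, na.rm = TRUE)
```

The na.rm = TRUE argument tells these functions to skip NA values; without it they return NA.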






Getting a visual sense of your data - Using R to get a histogram and plot of your data.




