Basics
Tidyverse is a set of R packages that supports sound methods for data management. I will use the tidyverse packages to perform cluster analysis and share the results with other data science teams in the industry.
library(devtools)
install_github("kassambara/factoextra")
Introduction to R
Data Preparation and R Packages
Required Packages
dplyr, tidyr, testthat, cluster, factoextra
Data Standardization
We need the ability to transform the vectors in our data frames into standardized variables.
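As a minimal sketch of what that transformation could look like, the snippet below centers each numeric column on its mean and scales it to unit standard deviation (a z-score). The helper name `standardize` and the choice of the built-in `USArrests` data are assumptions made for illustration; they do not come from the original post.

```r
library(dplyr)

# Hypothetical helper: center a numeric vector on its mean and
# scale it to unit standard deviation (a z-score).
standardize <- function(x) {
  stopifnot(is.numeric(x))
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# USArrests ships with base R and is used here only as example data.
arrests_std <- USArrests %>%
  mutate(across(where(is.numeric), standardize))

# Each standardized column should have a mean close to 0 and a
# standard deviation of 1.
round(colMeans(arrests_std), 10)
sapply(arrests_std, sd)
```

Base R's `scale()` performs the same centering and scaling on a whole matrix; writing the helper by hand simply keeps the step visible inside a dplyr pipeline.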
Overview
The Central Limit Theorem states that when samples drawn from a population are large, the sampling distribution of the sample mean approaches a normal distribution, regardless of the shape of the population from which the samples were drawn. The simulation below illustrates this by comparing the theoretical mean of the exponential distribution with the mean of the simulated samples. The variance between the theoretical mean and the sample mean is .
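A simulation of this kind might look like the sketch below. The rate parameter, sample size, number of simulations, and seed are all assumed values chosen for illustration, since the excerpt does not state the ones the post used.

```r
set.seed(42)     # assumed seed, for reproducibility only
lambda <- 0.2    # assumed rate of the exponential distribution
n      <- 40     # assumed size of each sample
n_sims <- 1000   # assumed number of simulated samples

# Each row of the matrix is one sample of n exponential draws;
# rowMeans() gives the mean of each sample.
draws        <- matrix(rexp(n * n_sims, rate = lambda), nrow = n_sims)
sample_means <- rowMeans(draws)

# Theoretical mean of the exponential distribution and the
# theoretical standard error of the mean of n draws.
theoretical_mean <- 1 / lambda
theoretical_se   <- (1 / lambda) / sqrt(n)

# Compare theory with simulation.
c(theoretical = theoretical_mean, simulated = mean(sample_means))
c(theoretical = theoretical_se,   simulated = sd(sample_means))

# Histogram of the sample means with the normal curve predicted
# by the Central Limit Theorem drawn on top.
hist(sample_means, breaks = 30, freq = FALSE,
     main = "Means of exponential samples", xlab = "Sample mean")
curve(dnorm(x, mean = theoretical_mean, sd = theoretical_se), add = TRUE)
abline(v = theoretical_mean, lty = 2)
```

Although individual exponential draws are strongly right-skewed, the histogram of their means should look roughly bell-shaped and sit close to the dashed line at the theoretical mean.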
Overview
A data set has been created from an experiment in 1952 that demonstrates the impact of Vitamin C on the growth of guinea pigs’ teeth. The response is the length of teeth in each of 10 guinea pigs at each of three dose levels of Vitamin C (0.5, 1, and 2 mg) with each of two delivery methods (orange juice or ascorbic acid). The analysis below tests whether the two supplement types have different effects on the growth of the guinea pigs’ teeth.
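One way to approach that comparison is sketched below, assuming the data set is the `ToothGrowth` data frame bundled with R (columns `len`, `supp`, and `dose`) and assuming a Welch two-sample t-test as the method. The excerpt does not say which test the post uses, so treat this as an illustration rather than the author's analysis.

```r
data(ToothGrowth)

# Mean tooth length for every supplement and dose combination.
aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)

# Welch two-sample t-test of tooth length by supplement type,
# pooling all dose levels.
supp_test <- t.test(len ~ supp, data = ToothGrowth)
supp_test$p.value
supp_test$conf.int

# The same test run separately within each dose level, since the
# effect of the supplement may depend on the dose.
by_dose <- sapply(split(ToothGrowth, ToothGrowth$dose), function(d) {
  t.test(len ~ supp, data = d)$p.value
})
by_dose
```

A small p-value, or a confidence interval that excludes zero, would suggest that the two delivery methods differ in their effect on tooth length.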
I have wanted to write a blog post for a while to describe how I think we can extend the power of the Healthcare Information Technology environment. My focus has been on the development and management of Healthcare I.T. since the late 1980s. I have worked for several health systems, including Sutter Health, St. Luke’s Idaho, and Tahoe Forest, as an employee, and as a consultant I have worked with dozens of other health systems.