R Heads-up and Tips for Beginners


These are some points that may help you if you are starting out with R (maybe you are taking the Coursera Data Science course).

R vectors are 1-indexed

That means the first index is 1 and not 0 as you may have expected.
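Here is a small sketch of what that looks like in practice. The vector name `letters_vec` is just something I made up for illustration. Note that asking for index 0 does not fail; it quietly gives you an empty vector, which can hide bugs.

```r
# a small character vector to index into (hypothetical example data)
letters_vec <- c("a", "b", "c")

# the first element lives at index 1
first <- letters_vec[1]

# the last element lives at index length(x)
last <- letters_vec[length(letters_vec)]

# index 0 raises no error; it returns an empty vector
zero <- letters_vec[0]

print(first)
print(last)
print(length(zero))
```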

How to make an empty data.frame and add data to it

I'm not sure this is the optimal way to handle data frames, but it gets the job done (using rbind changes the names of the labels):

```r
# create a data frame that will hold two numeric values
# with labels "id" and "age"
my_df <- data.frame(id = numeric(0), age = numeric(0))

# now add one row to the data frame
my_df[nrow(my_df) + 1, ] <- c(id = 10, age = 25)

# add another one
my_df[nrow(my_df) + 1, ] <- c(id = 15, age = 35)
```
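For what it's worth, here is a sketch of an alternative that may avoid the label problem with rbind: if each new row is wrapped in its own one-row data frame with the same column names, the labels should carry over unchanged. The variable name `people` is made up for this example, and I'm not claiming this is the best approach either.

```r
# start with the same kind of empty data frame as above
people <- data.frame(id = numeric(0), age = numeric(0))

# each new row is a one-row data frame with matching column names
people <- rbind(people, data.frame(id = 10, age = 25))
people <- rbind(people, data.frame(id = 15, age = 35))

# check that the labels and the row count are what we expect
print(names(people))
print(nrow(people))
```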

Generating sequences

For example, to run a command a given number of times:

```r
# this prints numbers from 1 to 20
number <- 20
for (i in seq(number)) {
  print(i)
}
```
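One thing to watch out for: if the count can ever be 0, `seq()` counts down from 1 to 0 instead of giving you nothing, so the loop body runs twice. A minimal sketch of the safer `seq_len()` variant follows (the variable names `n` and `runs` are mine):

```r
# pretend we were asked to repeat something zero times
n <- 0

# seq(0) yields the two values 1 and 0
print(length(seq(n)))

# seq_len(0) yields an empty sequence
print(length(seq_len(n)))

# so this loop body never runs when n is 0
runs <- 0
for (i in seq_len(n)) {
  runs <- runs + 1
}
print(runs)
```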

Printing to the console (command-line or RStudio)

Use cat():

cat("foo, bar baz")
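A couple of details about `cat()` that are easy to miss, shown as a quick sketch: it does not add a newline at the end, and it joins multiple arguments with a space. The variable `name` is only there for the example.

```r
# cat() does not append a newline, so add "\n" yourself
cat("foo, bar baz\n")

# several arguments are joined with a single space by default
name <- "R"
cat("Hello from", name, "\n")

# print() instead shows the value the way the console would,
# with an index prefix and quotes around strings
print("foo, bar baz")
```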

Sorting a data frame by a column's value

If you have a data frame (in a variable called data) with two columns, "name" and "age", you can sort it by the age column like this:

sorted_data <- data[with(data,order(age)),]

If you want to sort your data by age from highest to lowest, put a minus sign in front of the column name:

sorted_data <- data[with(data,order(-age)),]
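To try this out end to end, here is a small self-contained sketch. The sample names and ages are invented for the example. It also shows `order(..., decreasing = TRUE)`, which I believe is handy when the column is not numeric, since the minus-sign trick only works on numbers.

```r
# made-up sample data with the two columns described above
data <- data.frame(
  name = c("Ann", "Bob", "Cid"),
  age = c(35, 25, 45)
)

# lowest age first
sorted_data <- data[with(data, order(age)), ]

# highest age first, without relying on the minus sign
sorted_desc <- data[order(data$age, decreasing = TRUE), ]

print(sorted_data)
print(sorted_desc)
```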

More info: Dirk Eddelbuettel on sorting in R

Change column names in a data frame

Use the colnames() function to take a data frame with unwieldy column names and give it short names that are easier to work with:

> names(my_data_frame)
[1] "Provider.Number" "Hospital.Name"   "Address.1"
> colnames(my_data_frame) <- c("number","name","address")
> names(my_data_frame)
[1] "number"  "name"    "address"
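If you want something you can paste and run, here is a sketch that builds a tiny data frame with those unwieldy names and then renames the columns. The row values are invented, and the last step (renaming just one column by position, to the made-up name "street") is my own addition.

```r
# made-up rows using the original column names
my_data_frame <- data.frame(
  Provider.Number = c(1, 2),
  Hospital.Name = c("General", "Mercy"),
  Address.1 = c("1 Main St", "2 Oak Ave")
)

# replace all column names at once
colnames(my_data_frame) <- c("number", "name", "address")

# rename a single column by its position
colnames(my_data_frame)[3] <- "street"

print(names(my_data_frame))
```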
