R Heads-up and Tips for Beginners
- R vectors are 1-indexed
- Create empty dataframe and add data to it
- Generate sequences
- Print to console (command-line or RStudio)
- Sort dataframe by column value
- Change column names in dataframe
These are some points that may help you if you are starting out with R (maybe you are taking the Coursera Data Science course).
R vectors are 1-indexed
That means the first index is 1 and not 0 as you may have expected.
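As a quick illustration, here is a minimal sketch of what 1-based indexing looks like in practice. The vector name letters_vec and its values are made up for this example.

```r
# A small character vector to index into (values are illustrative)
letters_vec <- c("a", "b", "c")

# The first element lives at index 1
first <- letters_vec[1]

# The last element lives at index length(x), not length(x) - 1
last <- letters_vec[length(letters_vec)]

# Index 0 is not an error, but it selects nothing (a zero-length vector)
zero <- letters_vec[0]

print(first)
print(last)
print(length(zero))
```

Note that indexing with 0 fails silently rather than raising an error, which can hide off-by-one bugs when porting code from 0-indexed languages.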
Create empty dataframe and add data to it
I'm not sure this is the optimal way to handle data frames, but it gets the job done (using rbind changes the names of the labels):
# create an empty data frame that will hold two numeric values
# with labels "id" and "age"
my_df <- data.frame(id = numeric(0), age = numeric(0))
# now add one row to the dataframe
my_df[nrow(my_df)+1,] <- c(id=10,age=25)
# add another one
my_df[nrow(my_df)+1,] <- c(id=15,age=35)
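If you would rather use rbind, one possible way to keep the labels intact is to bind one-row data frames instead of bare vectors. This is only a sketch of an alternative, not necessarily the best approach; growing a data frame row by row can get slow for large data.

```r
# Start from the same empty data frame with columns "id" and "age"
my_df <- data.frame(id = numeric(0), age = numeric(0))

# Bind one-row data frames that use the same column names
my_df <- rbind(my_df, data.frame(id = 10, age = 25))
my_df <- rbind(my_df, data.frame(id = 15, age = 35))

# Inspect the result: two rows, labels unchanged
print(names(my_df))
print(nrow(my_df))
```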
Generate sequences
Use seq() to generate a sequence of numbers, for example to run a command a given number of times.
# this prints numbers from 1 to 20
number <- 20
for (i in seq(number)) {
  print(i)
}
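A couple of variations on the loop above may be useful. The sketch below uses seq_len(), which behaves more predictably than seq() when the count could be zero, and seq() with a by argument for stepped sequences. The variable names total, evens, empty_len and surprising are just for illustration.

```r
number <- 20

# seq_len(n) yields 1, 2, ..., n and an empty sequence when n is 0
total <- 0
for (i in seq_len(number)) {
  total <- total + i
}

# seq() with "by" generates a stepped sequence
evens <- seq(2, 10, by = 2)

# With a count of 0, seq_len() gives nothing to loop over ...
empty_len <- length(seq_len(0))

# ... whereas seq(0) counts down from 1 to 0, so a loop would run twice
surprising <- seq(0)

print(total)
print(evens)
print(empty_len)
print(surprising)
```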
Print to console (command-line or RStudio)
Use cat():
cat("foo, bar baz")
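One thing that may surprise beginners is that cat() does not add a newline on its own. The sketch below shows a few common patterns; the variables name, count and msg are hypothetical.

```r
name <- "R"
count <- 3

# cat() joins its arguments with a space by default;
# add "\n" yourself to end the line
cat("Language:", name, "\n")

# Use sep = "" when you want full control over spacing
cat("count=", count, "\n", sep = "")

# paste() builds the string without printing it, which is handy
# if you want to reuse the message
msg <- paste("Language:", name)
cat(msg, "\n")
```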
Sort dataframe by column value
If you have a data frame (in variable data) that has two columns, "name" and "age", you can sort data using the age attribute like this:
sorted_data <- data[with(data,order(age)),]
If you want to sort your data by age from highest to lowest, prefix the column name with a minus sign (this works for numeric columns):
sorted_data <- data[with(data,order(-age)),]
more info: Dirk Eddelbuettel on sorting in R
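To see the sorting in action, here is a small self-contained sketch. The sample names and ages are invented for the example, and the decreasing = TRUE variant is shown as an alternative to the minus sign that also works for non-numeric columns.

```r
# Made-up sample data with the two columns described above
data <- data.frame(
  name = c("Ann", "Bob", "Cid"),
  age = c(35, 25, 30),
  stringsAsFactors = FALSE
)

# Ascending by age
sorted_data <- data[with(data, order(age)), ]

# Descending by age, using the "decreasing" argument of order()
sorted_desc <- data[order(data$age, decreasing = TRUE), ]

print(sorted_data)
print(sorted_desc)
```

Note the trailing comma inside the square brackets: it tells R to reorder rows and keep all columns.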
Change column names in dataframe
Use the colnames() function to take a data frame with unwieldy column names and give it names that are easier to work with:
> names(my_data_frame)
[1] "Provider.Number" "Hospital.Name" "Address.1"
> colnames(my_data_frame) <- c("number", "name", "address")
> names(my_data_frame)
[1] "number" "name" "address"
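The session above can be reproduced with a small stand-in data frame. The row values here are invented, and the last step (renaming a single column to "street") is an extra illustration that is not part of the original example.

```r
# A one-row stand-in with the same unwieldy column names (values are made up)
my_data_frame <- data.frame(
  Provider.Number = 1,
  Hospital.Name = "General",
  Address.1 = "1 Main St",
  stringsAsFactors = FALSE
)

# Replace all column names at once; order must match the columns
colnames(my_data_frame) <- c("number", "name", "address")

# Rename just one column by matching on its current name
colnames(my_data_frame)[colnames(my_data_frame) == "address"] <- "street"

print(names(my_data_frame))
```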