Good Coding Format and Practices in R
February 03, 2016
There are many recommended coding standards and layouts. Badly written code is a big pain for any reader, so it is always better to format your code well and follow a few standards. My favorite layout is described below:
- Always start your code with a description, because when you write a lot of code, file names alone get confusing. The description should give a good name, followed by what the code does and then the files needed to run it. This will save you and others a lot of time in the long run.
##Send a mail to all seller managers and make output for dispatch cockpit
#Input files: delisted file from BI, order file from BOB
- Then load all the packages needed for the analysis (this makes it easy to see which packages are required to run the code when you share the file). Always wrap the calls in the suppressPackageStartupMessages function; it keeps the output clean.
#library() stops with an error if a package is missing; require() only warns
suppressPackageStartupMessages(library("mailR"))
suppressPackageStartupMessages(library("lubridate"))
suppressPackageStartupMessages(library("htmlTable"))
suppressPackageStartupMessages(library("googlesheets"))
currentDate = Sys.Date() ##current date, used for the folder name and the file name
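The package list can also check that everything is installed before anything is loaded. Here is a minimal sketch, assuming a hypothetical helper called `load_quietly()` (it is not part of the original script). It is shown with packages that ship with R so that it runs anywhere:

```r
# Hypothetical helper (not from the original script):
# load packages quietly and fail early with a clear message.
load_quietly = function(pkgs) {
  # TRUE for every package that is installed
  installed = vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  not_installed = pkgs[!installed]
  if (length(not_installed) > 0) {
    stop("Install these packages first: ",
         paste(not_installed, collapse = ", "))
  }
  for (p in pkgs) {
    suppressPackageStartupMessages(library(p, character.only = TRUE))
  }
  invisible(pkgs)
}

# Packages bundled with R stand in for your own list here.
loaded = load_quietly(c("stats", "utils", "tools"))
```

Anyone who receives the file then sees one message naming every missing package, instead of hitting the errors one at a time.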
- Set R's working directory to the folder that holds all the input files. If you run R code daily for a repetitive task, always keep separate folders for input and output (for output, you can create a new folder for each day and keep that day's input there).
#set working directory to the required input folder
setwd("M:/R_Script")
filepath = getwd()
setwd(file.path(filepath, "Input"))
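An alternative to changing the working directory is to build the folder paths once with `file.path()` and pass full paths to the read and write functions. This is a minimal sketch: `base_dir`, `input_dir` and `output_dir` are illustrative names, and a temporary folder stands in for `M:/R_Script` so the example runs on any machine:

```r
# Stand-in for "M:/R_Script"; replace with your own project folder.
base_dir = file.path(tempdir(), "R_Script")
input_dir = file.path(base_dir, "Input")
output_dir = file.path(base_dir, "Output")

# Create both folders if they do not exist yet.
for (d in c(input_dir, output_dir)) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Full path to an input file; no setwd() call is needed to read it.
seller_path = file.path(input_dir, "sellers_delisting.csv")
```

Because the script never depends on the current working directory, it behaves the same no matter where it is started from.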
- Whenever you can, import all files at the start of the analysis.
seller = read.csv("sellers_delisting.csv", stringsAsFactors = FALSE)
order = read.csv2("order.csv")
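Keep in mind that `read.csv` expects comma-separated fields with `.` as the decimal mark, while `read.csv2` expects semicolon-separated fields with `,` as the decimal mark. The small sketch below uses made-up data written to temporary files to show the two side by side:

```r
# Made-up data in the two common CSV flavours.
comma_file = tempfile(fileext = ".csv")
semi_file = tempfile(fileext = ".csv")
writeLines(c("Seller.Name,Amount", "A,1.5", "B,2.25"), comma_file)
writeLines(c("Seller.Name;Amount", "A;1,5", "B;2,25"), semi_file)

# Each reader matches one flavour.
seller_demo = read.csv(comma_file, stringsAsFactors = FALSE)
order_demo = read.csv2(semi_file, stringsAsFactors = FALSE)

# Both Amount columns are read as numbers.
str(seller_demo)
str(order_demo)
```

If a numeric column comes in as text, or the whole file lands in a single column, the wrong reader for the file's separator is a likely cause.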
- While writing code, if you are reading a heavy file or pulling from a database, always make a copy of the original data and keep the original untouched as you progress (say I imported a file as bob; I then make a copy of bob and do all the analysis on the copy). You will make mistakes while writing code, and importing the original file again is tedious.
order_new = order
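R copies an object only when it is modified, so keeping the original around costs very little until the copy is changed. Here is a small sketch with made-up data, showing that experiments on the copy leave the original untouched:

```r
# Made-up stand-in for the imported order file.
order = data.frame(id = 1:3, qty = c(2, 5, 1))
order_new = order

# Experiment freely on the copy.
order_new$qty = order_new$qty * 10
order_new = order_new[order_new$qty > 10, ]

# The original is unchanged.
nrow(order)

# Made a mistake? Start over without importing the file again.
order_new = order
```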
- If you are making many subsets of the data, always give them the same name, like "temp", and give the summary of each subset a relevant name.
temp = subset(seller, seller$Date.delisted > Sys.Date()-30 &
              seller$Status == "Delisted", select = c("Seller.Name", "Reason.for.delisting"))
#summarize
seller_delisted = table(temp$Seller.Name, temp$Reason.for.delisting)
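To try the subset-then-summarize pattern without the real file, here is a sketch on a made-up `seller` data frame. The column names follow the snippet above, the values are invented, and the date column is converted explicitly with `as.Date()`:

```r
# Made-up data with the same column names as the seller file.
seller = data.frame(
  Seller.Name = c("A", "B", "A", "C"),
  Reason.for.delisting = c("Late", "Quality", "Late", "Late"),
  Status = c("Delisted", "Delisted", "Delisted", "Active"),
  Date.delisted = as.character(Sys.Date() - c(5, 10, 60, 3)),
  stringsAsFactors = FALSE
)

# Keep sellers delisted in the last 30 days.
temp = subset(seller,
              as.Date(Date.delisted) > Sys.Date() - 30 & Status == "Delisted",
              select = c("Seller.Name", "Reason.for.delisting"))

# Summarize: count of delistings per seller and reason.
seller_delisted = table(temp$Seller.Name, temp$Reason.for.delisting)
```

Only the first two rows pass the filter: the third is older than 30 days and the fourth is not delisted.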
- When you save output, always save it in the output folder or today's folder, with the date in the file name. It will save you from a lot of confusion.
outputDir = paste("M:/Daily/Daily", currentDate, sep="/") #today's output folder
dir.create(outputDir, recursive = TRUE, showWarnings = FALSE) #new folder named after the current date
setwd(outputDir)
csvFileName1 = paste0("Threshold limit and seller delisted ", currentDate, ".csv") #file name with date
write.csv(seller_delisted, file=csvFileName1, row.names = FALSE)
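The dated-output idea can be tried safely with the sketch below. It writes to a temporary folder that stands in for `M:/Daily/Daily`; the `result` data frame and the file name prefix are made up for illustration:

```r
currentDate = Sys.Date()

# Stand-in for "M:/Daily/Daily"; one sub-folder per day.
output_root = file.path(tempdir(), "Daily")
output_dir = file.path(output_root, as.character(currentDate))
dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)

# Made-up result to save.
result = data.frame(Seller.Name = c("A", "B"), Freq = c(1, 1))

# Date goes into the file name; paste0() adds no stray spaces.
csv_path = file.path(output_dir,
                     paste0("seller_delisted_", currentDate, ".csv"))
write.csv(result, file = csv_path, row.names = FALSE)
```

`Sys.Date()` prints as year-month-day, so folders and files named this way sort chronologically in a file browser.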
- When you save code that needs further fine-tuning, always commit it with git or put a version in the file name, like text_v1.R, then text_v2.R, and so on.
- If you are running multiple scripts one after another, always remove all variables from R once a single analysis is completed, so that old variables do not interfere with the variables of the new code.
rm(list=ls())
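Note that `rm(list=ls())` clears the objects in the global environment only; loaded packages, options and the working directory stay as they are. A complementary approach is to wrap each analysis in a function, so its working variables never reach the global environment in the first place. This is a small sketch, and `run_analysis` is a hypothetical name:

```r
# Hypothetical wrapper: everything created inside stays inside.
run_analysis = function() {
  temp = c(1, 2, 3)   # exists only while the function runs
  sum(temp)           # only the result is returned
}

total = run_analysis()

# "temp" was never created outside the function.
"temp" %in% ls()
```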
Now you are ready to write lucid code.