Good Coding Format and Practices in R

February 03, 2016

There are many recommended coding standards and layouts. Badly written code is a pain for any reader, so it is always better to use a consistent format and follow a few standards. My favorite layout is described below:
  • Always start your code with a description, because once you have written many scripts their names become confusing. The description should give a clear name, then what the code does, then the files needed to run it. This will save you and others a lot of time in the long run.
######################Daily_mail and dispatch_cockpit###############################
#######open VPN Client ######
##Send a mail to all seller managers and make output for dispatch cockpit
#delisted file from BI, order from BOB
  • Then load all the packages needed for the analysis, so anyone you share the file with can see at a glance what the code requires. Wrap each call in suppressPackageStartupMessages to keep the output clean.
#################load required packages
# library() stops with an error if a package is missing; require() only returns FALSE
suppressPackageStartupMessages(library("dplyr"))
suppressPackageStartupMessages(library("mailR"))
suppressPackageStartupMessages(library("lubridate"))
suppressPackageStartupMessages(library("htmlTable"))
suppressPackageStartupMessages(library("googlesheets"))
currentDate = Sys.Date() ##current date, used for the folder and file names
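
It can also help to check up front that every package is installed, so the script fails early with a clear message rather than partway through. A minimal sketch, assuming a hypothetical helper named missing_packages (it is not part of any package):

```r
# Hypothetical helper: return the names of packages that are not installed.
missing_packages = function(pkgs) {
  installed = vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required = c("dplyr", "mailR", "lubridate", "htmlTable", "googlesheets")
not_found = missing_packages(required)
if (length(not_found) > 0) {
  message("Install these packages first: ", paste(not_found, collapse = ", "))
}
```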

  • Set R's working directory to the folder that holds all input files. If you run the code daily for a repetitive task, always keep separate folders for input and output (for output, you can create a new folder each day and keep the input there too).
#set working directory to the input folder
setwd("M:/R_Script")
filepath=getwd()
setwd(paste(filepath, "Input", sep="/"))
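
If you would rather not change the working directory at all, the same layout can be built from full paths. This is only a sketch: tempdir() stands in for the real project folder, and the names base_dir, input_dir, output_dir and seller_path are made up for illustration.

```r
# Assumption: tempdir() stands in for the project folder ("M:/R_Script" in this post)
base_dir   = tempdir()
input_dir  = file.path(base_dir, "Input")
output_dir = file.path(base_dir, "Output", as.character(Sys.Date()))

# Create the folders if they do not exist yet
dir.create(input_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)

# Build a full path to an input file instead of relying on setwd()
seller_path = file.path(input_dir, "sellers_delisting.csv")
```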

  • Whenever you can, import all files at the start of the analysis.

seller = read.csv("sellers_delisting.csv", stringsAsFactors = FALSE)
order = read.csv2("order.csv") # read.csv2 expects ";" separators and "," decimals
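
The two import calls above use different readers: read.csv expects comma-separated fields with "." as the decimal mark, while read.csv2 expects semicolon-separated fields with "," as the decimal mark. A small sketch with made-up data, passed through the text argument instead of real files:

```r
# Made-up sample data in the two layouts
comma_text     = "Seller.Name,Amount\nA,1.5\nB,2.25"
semicolon_text = "Seller.Name;Amount\nA;1,5\nB;2,25"

comma_df     = read.csv(text = comma_text, stringsAsFactors = FALSE)
semicolon_df = read.csv2(text = semicolon_text, stringsAsFactors = FALSE)

# Check whether both readers parsed the same values
all.equal(comma_df, semicolon_df)
```

Picking the wrong reader usually gives a data frame with a single column, so it is worth checking the result with str() right after import.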

  • While writing code that reads a heavy file or pulls from a database, always make a copy of the original object and work on the copy (for example, if I import a file as bob, I make a copy of bob and do all the analysis on the copy). You will make mistakes while writing code, and importing the original file again each time is tedious.

order_new = order
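
This works because changes to the copy never touch the original object. A small sketch with a made-up data frame standing in for the imported file:

```r
# Made-up stand-in for the imported order file
order = data.frame(id = 1:3, amount = c(10, 20, 30))

order_new = order                            # working copy
order_new$amount = order_new$amount * 2      # experiment on the copy
order_new = order_new[order_new$id != 2, ]   # drop a row by mistake

# The original still has all its rows and the untouched amounts
nrow(order)
order$amount

# After a mistake, reset the working copy without importing again
order_new = order
```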

  • If you make many subsets of the data, always give them the same name, such as "temp", and give the summary of each subset a relevant name.
# Date.delisted was read as text; as.Date() assumes it is stored as YYYY-MM-DD
temp = subset(seller, as.Date(Date.delisted) > Sys.Date() - 30 &
              Status == "Delisted", select = c("Seller.Name", "Reason.for.delisting"))
#summarize
seller_delisted = table(temp$Seller.Name, temp$Reason.for.delisting)
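
To try the subset-then-summarize pattern without the real file, here is a sketch on made-up data. The column names follow the code above; the sellers, reasons and dates are invented.

```r
# Made-up seller data: D was delisted too long ago, C is still active
seller = data.frame(
  Seller.Name = c("A", "B", "C", "D"),
  Status = c("Delisted", "Delisted", "Active", "Delisted"),
  Reason.for.delisting = c("Late dispatch", "Quality", NA, "Quality"),
  Date.delisted = Sys.Date() - c(5, 10, 3, 90),
  stringsAsFactors = FALSE
)

# Throwaway subset: sellers delisted in the last 30 days
temp = subset(seller, Date.delisted > Sys.Date() - 30 & Status == "Delisted",
              select = c("Seller.Name", "Reason.for.delisting"))

# Named summary: one row per seller, one column per reason
seller_delisted = table(temp$Seller.Name, temp$Reason.for.delisting)
```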

  • When you save output, always save it in the output folder or today's folder, with the date in the file name. It will save you from a lot of confusion.

#Save the file
setwd("M:/Daily/Daily")
dir.create(as.character(currentDate)) #new folder named after the current date
setwd(paste("M:/Daily/Daily", currentDate, sep="/"))
csvFileName1 = paste0("Threshold limit and seller delisted ", currentDate, ".csv") #file name with date, no stray space before .csv
#seller_delisted is a table whose row names hold the seller names, so keep them
write.csv(seller_delisted, file=csvFileName1, row.names = TRUE)
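
The same dated-folder idea can be tried safely before pointing it at a real drive. In this sketch tempdir() stands in for the output location, and out_dir, out_file and summary_df are made-up names and data.

```r
# Assumption: tempdir() stands in for the real output location ("M:/Daily/Daily")
currentDate = Sys.Date()
out_dir = file.path(tempdir(), "Daily", as.character(currentDate))
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Made-up summary to write
summary_df = data.frame(Seller.Name = c("A", "B"), Delisted = c(1, 2))

# The date goes into both the folder name and the file name
out_file = file.path(out_dir, paste0("seller_delisted_", currentDate, ".csv"))
write.csv(summary_df, file = out_file, row.names = FALSE)
```

With showWarnings = FALSE, running the script twice on the same day does not complain that the folder already exists.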

  • When you save code that needs further fine-tuning, always commit it with git or put a version in the file name, like text_v1.R, then text_v2.R, and so on.
  • If you are running several scripts one after another, always remove all variables from R once each analysis is complete, so that old variables do not interfere with the variables of the new code.
rm(list=ls())
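
A gentler option, if you only want to keep scratch variables out of the way, is to run a step inside local(). This is an alternative to clearing the workspace, not what the line above does, and the variable names here are made up.

```r
# Variables created inside local() disappear when the block ends
result = local({
  scratch_values = c(1, 2, 3)   # scratch variable, exists only inside the block
  sum(scratch_values)
})

# Only the returned value is kept; check that the scratch variable is gone
exists("scratch_values", inherits = FALSE)
```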
Now you are ready to write lucid code.
