Monday 7 November 2016

How I made my first "serious" R package

Hello there,

I was working on a Human Activity Recognition system in R when a very good and challenging idea occurred to me.

The idea was not something that could be done with minimal effort. In fact, it was one of the toughest projects that I have ever done.

The idea was to CREATE AN R PACKAGE OF MY OWN.

So, I read some blogs on the internet and got to know how to create a basic R package.

The packages needed to proceed with the work are:

  1. devtools
  2. roxygen2

Both packages can be installed from CRAN with install.packages(c("devtools", "roxygen2")).



Once you are done with the above step, load both packages with library(devtools) and library(roxygen2).

Now, we are all set up.

One thing I would like to point out is that, at its core, a package is little more than a collection of functions with some documentation and metadata around them.

So, let's write a function for our package.

I would like to write a function for getting the Benford Score of a given digit.

I don't know if "Benford score" is a legit term, but what I mean by it is this: given a digit (say $i$),
what is the probability that the first digit of a number picked from a real-life dataset will be $i$?

From the Wikipedia article on Benford's law:

Benford's law, also called the first-digit law, is an observation about the frequency distribution of leading digits in many real-life sets of numerical data. The law states that in many naturally occurring collections of numbers, the leading significant digit is likely to be small. For example, in sets which obey the law, the number 1 appears as the most significant digit about 30% of the time, while 9 appears as the most significant digit less than 5% of the time. By contrast, if the digits were distributed uniformly, they would each occur about 11.1% of the time. Benford's law also makes (different) predictions about the distribution of second digits, third digits, digit combinations, and so on.

Mathematically, the probability of $i$ being the first digit of any number is given by,

$$P(i) = \log_{10}\left(1 + \frac{1}{i}\right)$$

Writing it as a function isn't a big deal.

Let's start then. There isn't much to explain: the function takes a digit, applies the formula above, and returns the result as a percentage rounded to three decimal places.
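
A minimal sketch of such a function could look like the one below. The name benford_score, the input check, and the rounding to three decimals are my assumptions, chosen so that the result matches the output shown in this post.

```r
# Hypothetical helper: Benford "score" of a leading digit, as a percentage.
benford_score <- function(i) {
  # Only the digits 1 to 9 can be a leading digit.
  stopifnot(is.numeric(i), all(i %in% 1:9))
  # P(i) = log10(1 + 1/i), scaled to a percentage and rounded.
  round(100 * log10(1 + 1 / i), 3)
}

print(benford_score(1))
```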


Now, when you run the function for the digit $1$, you should get something like this:

    [1] 30.103

This is the probability, expressed as a percentage, of $1$ being the first digit.

Go ahead and try for other digits as well, and you will see the probabilities decreasing as you go from $1$ to $9$.
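
To see the whole trend at once, here is a small sketch that applies the formula directly to all nine digits. It does not depend on any particular function name; the variable names digits and scores are just illustrative.

```r
# Apply P(i) = log10(1 + 1/i) to every possible leading digit.
digits <- 1:9
scores <- round(100 * log10(1 + 1 / digits), 3)

# Name each value by its digit so the printout is easy to read.
names(scores) <- digits
print(scores)

# The percentages shrink from digit 1 to digit 9 and add up to 100.
print(sum(scores))
```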

Yeah, this is a bit like Zipf's law, but for leading digits.

Okay, enough with the small talk.

Let's make some babies now,

Oops! Sorry. Packages.

Follow these steps:
  1. Save the function file as main.R in a folder.
    No specific reason, I just like the name "main".
  2. Press Ctrl+Shift+H in RStudio, navigate to the folder in which main.R is saved, and select that folder as the working directory.
  3. Now, type create("Myfirstpkg").
    You should see something like this,

    Creating package 'Myfirstpkg' in 'C:/Users/user/Desktop/pkgFolder'
    No DESCRIPTION found. Creating with values:

    Package: package
    Title: What the Package Does (one line, title case)
    Version: 0.0.0.9000
    Authors@R: person("First", "Last", email = "first.last@example.com", role = c("aut", "cre"))
    Description: What the package does (one paragraph).
    Depends: R (>= 3.3.2)
    License: What license is it under?
    Encoding: UTF-8
    LazyData: true
    * Creating `package.Rproj` from template.
    * Adding `.Rproj.user`, `.Rhistory`, `.RData` to ./.gitignore

  4. Now, inside the folder, you will see another folder named after your package. Here it is Myfirstpkg.

  5. Go inside the Myfirstpkg folder. There will be another folder named "R". Put your main.R in that folder.
  6. Now, do some documentation: add roxygen2 comments (lines starting with #') above the function in main.R.
  7. After this, set the Myfirstpkg folder as the working directory in RStudio (refer to Step 2) and then type document().

    You should see messages about the documentation files being written.
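
For the documentation step, here is a minimal sketch of what main.R could look like with roxygen2 comments. The function name benford_score and the wording of the comments are my assumptions; the tags used (@param, @return, @examples, @export) are standard roxygen2 tags that document() turns into a help page and a NAMESPACE entry.

```r
#' Benford score of a leading digit
#'
#' Gives the probability, as a percentage, that a digit is the first
#' digit of a number in a dataset that follows Benford's law.
#'
#' @param i A digit from 1 to 9.
#' @return The percentage, rounded to three decimal places.
#' @examples
#' benford_score(1)
#' @export
benford_score <- function(i) {
  stopifnot(is.numeric(i), all(i %in% 1:9))
  round(100 * log10(1 + 1 / i), 3)
}
```

The #' lines are ordinary comments as far as R is concerned, so the file still runs as a normal script.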


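After all this, you need to edit the DESCRIPTION file and fill in fields such as the title, version, author, description and license.

The format of the file is described in Writing R Extensions, from which the original example image was taken.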
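
To give an idea of what a filled-in DESCRIPTION contains, here is a small sketch that writes one to a temporary folder with base R's write.dcf() and prints it. Every field value below (title, author, e-mail, license) is a placeholder I made up for illustration. In practice you would simply edit the DESCRIPTION file inside Myfirstpkg by hand.

```r
# Placeholder values only: replace them with your own details.
desc <- data.frame(
  Package = "Myfirstpkg",
  Title = "Benford Score of a Leading Digit",
  Version = "0.0.0.9000",
  Author = "First Last",
  Maintainer = "First Last <first.last@example.com>",
  Description = "Computes the Benford probability of a leading digit.",
  License = "GPL-3",
  Encoding = "UTF-8",
  stringsAsFactors = FALSE
)

# Write to a temporary location so no real package is touched.
desc_path <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = desc_path)

# Show the resulting "Field: value" lines.
cat(readLines(desc_path), sep = "\n")
```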

After you are done with it, do the following,
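
One common way to finish, and this is my assumption about the step rather than the exact command, is to install the package with install() from devtools. The sketch below assumes the working directory is the parent folder that contains Myfirstpkg.

```r
library(devtools)

# Assumption: the current working directory contains the Myfirstpkg folder.
install("Myfirstpkg")

# Once installed, the package loads like any other.
library(Myfirstpkg)
```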


Yaaaaay!!

You made your first R package. :)

For more serious development, upload your built package (the .tar.gz source file) to win-builder.

If it passes all the checks without any errors, submit it to CRAN. :)

Please have a look at my package, csvFileDescriptor, on GitHub. Install it, and please star it if you like the functionality.

Cheers.