
Friday, 8 December 2017

3D Animated Graph - Rotation

For scatter plot:


library(car)
library(rgl)


dat <- iris

set.seed(1)  # kmeans() starts from random centres; fixing the seed makes the clusters reproducible
KM <- kmeans(dat[, 1:4], centers = 3)

scatter3d(
x = dat$Sepal.Length, 
y = dat$Sepal.Width, 
z = dat$Petal.Length, 
bg.col = c("white"), 
ellipsoid.alpha = 0.2, 
xlab = "Sepal.Length", 
ylab = "Sepal.Width", 
zlab = "Petal.Length", 
surface.col = colorRampPalette(c("blue", "yellow", "red"))(length(levels(factor(KM$cluster)))), 
groups = factor(KM$cluster),
grid = FALSE,
surface = FALSE,
ellipsoid = TRUE
)

aspect3d(1,1,1)
          
# convert = FALSE writes the individual PNG frames only; no GIF is assembled
movie3d(
spin3d(axis = c(0, 1, 0)), 
duration = 25, 
convert = FALSE,
fps = 7, 
movie = "Iris_Cluster", 
dir = getwd()
)


To save the animation as a single GIF file, ImageMagick needs to be installed first.

The 'rgl' package looks for the convert.exe file in the ImageMagick installation, so make sure to tick the option to install legacy utilities when installing ImageMagick. (Note: later updates to the packages may have fixed this issue, but older versions of Windows could still run into it.)

Note 2: If you are on 32-bit Windows, install the 32-bit version of ImageMagick.

movie3d(
spin3d(axis = c(0, 1, 0)), 
duration = 25, 
convert = TRUE,
fps = 7, 
movie = "Iris_Cluster", 
dir = getwd()
)
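
Before setting convert = TRUE, it can help to confirm that R can see an ImageMagick executable at all. The snippet below is a minimal sketch using base R's Sys.which(). The helper name find_imagemagick is made up for illustration, and looking for "magick" in addition to "convert" is an assumption about how newer ImageMagick releases name their command-line tool.

```r
# Hypothetical helper: return the path of the first candidate executable
# found on the PATH, or "" if none of them is found.
find_imagemagick <- function(candidates = c("convert", "magick")) {
  paths <- Sys.which(candidates)   # named character vector, "" when missing
  found <- paths[nzchar(paths)]
  if (length(found) > 0) unname(found[1]) else ""
}

im_path <- find_imagemagick()
if (nzchar(im_path)) {
  message("Candidate executable found at: ", im_path)
} else {
  message("No ImageMagick executable found on the PATH.")
}
```

On Windows there is also a system utility called convert.exe, so a hit for "convert" is only useful if the reported path points into the ImageMagick folder.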




For bar/column chart:

library(lattice)
library(latticeExtra)
library(animation)

Dat <- aggregate(Sepal.Length~Species+Petal.Length, iris, mean)

RotPlot<-function(A){
   for(i in A){
      print(
         cloud(Sepal.Length ~ Species + Petal.Length, Dat, panel.3d.cloud = panel.3dbars,
                screen = list(z = i, x = -60), 
                col.facet=colorRampPalette(c("dodgerblue", "salmon"))(nrow(Dat)),
                xbase = 0.4, ybase = 0.4, scales = list(arrows = FALSE, col = 1),
                col = "white", par.settings = list(axis.line = list(col = "transparent"))
         )
      )
   }
}

ROT<-seq(0,360,by=10)

saveGIF(RotPlot(ROT), interval = 0.5, movie.name = "Iris.gif",
          ani.height = 640, ani.width = 640)
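
Since a rotation of 360 degrees gives the same view as 0 degrees, the sequence above draws that frame twice per loop. The lines below are a minimal sketch of an angle sequence without the duplicate, assuming a seamless loop is wanted. The names step, ROT_loop and interval are illustrative only.

```r
step <- 10

# Stop one step short of 360 so the starting view is not drawn twice
ROT_loop <- seq(0, 360 - step, by = step)

n_frames <- length(ROT_loop)      # frames in one full turn
interval <- 0.5                   # seconds per frame, as passed to saveGIF()
loop_seconds <- n_frames * interval

print(n_frames)
print(loop_seconds)
```

ROT_loop can be passed to RotPlot() in place of ROT.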















Connect to Google Maps API from R

The code below enables geocoding from R using the Google Maps API.

The relevant website is:

https://developers.google.com/maps/

Free usage is limited to a certain number of requests per day; the website above provides the necessary information and guidelines.

While it is not necessary, you can obtain your own key for additional benefits, including usage tracking. Please see the link below:

https://developers.google.com/maps/documentation/javascript/get-api-key


EXAMPLE:


library(RJSONIO)

#if you are using a key
#(stringsAsFactors = FALSE keeps the key as text rather than a factor code)
KEY <- as.character(read.table("//[path to your Google Key]/[name of file containing the key (e.g. '.GoogleApiKey.txt')]", header = FALSE, sep = "", stringsAsFactors = FALSE)[1, 1])

#the address in the URL must be URL-encoded (e.g. spaces written as %20)
search_url <- paste("https://maps.googleapis.com/maps/api/geocode/json?sensor=true&address=[insert address or location searched]&key=", KEY, sep = "")


#if you are NOT using a key
search_url <- "https://maps.googleapis.com/maps/api/geocode/json?sensor=true&address=[insert address or location searched]"
  
  
con <- url(search_url, open="r")
  
geo_info <- fromJSON(paste(readLines(con), collapse=""))

close(con)

#Flatten the received JSON
geo_info  <- unlist(geo_info )

latitude <- geo_info["results.geometry.location.lat"]
longitude <- geo_info["results.geometry.location.lng"]
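
Addresses usually contain spaces and commas, which cannot go into a URL as they are. The function below is a minimal sketch of building the request URL with base R's URLencode(). The name build_geocode_url and the example address are made up for illustration, and the query string simply mirrors the one used above.

```r
# Hypothetical helper for illustration; not part of any package.
build_geocode_url <- function(address, key = NULL) {
  base <- "https://maps.googleapis.com/maps/api/geocode/json?sensor=true"
  # reserved = TRUE also encodes characters such as "," and "&"
  url <- paste0(base, "&address=", utils::URLencode(address, reserved = TRUE))
  if (!is.null(key)) {
    url <- paste0(url, "&key=", key)   # append the key only when one is given
  }
  url
}

search_url <- build_geocode_url("10 Downing Street, London")
print(search_url)
```

The resulting search_url can be passed to url() exactly as in the example above.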








Wednesday, 6 September 2017

Reading files into R

Below are some of the options available for importing files into R.

For csv files, use read.csv() as below.
 
  
read.csv("test.csv")
 
  V1         V2
1  F -0.5786439
2  E  0.2472908
3  U  0.2748309
4  R  1.1791559
5  K -0.1258598
6  X -0.8898289
7  L  0.4627274
8  C -0.7088007
 
To select the first 3 rows,
 
read.csv("test.csv", nrows = 3)
 
   V1         V2
1  F -0.5786439
2  E  0.2472908
3  U  0.2748309

 
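To skip the first 2 lines and read the next 3 rows (note that skip counts the header as a line, so the third line of the file is taken as the header),
 
read.csv("test.csv", nrows = 3, skip = 2)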
 
  E X0.247290774914572
1 U          0.2748309
2 R          1.1791559
3 K         -0.1258598
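
If the original column names should survive the skip, one option is to read the header separately and pass it back in. The lines below are a minimal sketch on a temporary file with made-up values in the same layout as test.csv. The variable names tmp, hdr and out are illustrative.

```r
# Recreate a small file similar to test.csv in a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c('"V1","V2"', '"F",-0.58', '"E",0.25', '"U",0.27',
             '"R",1.18', '"K",-0.13', '"X",-0.89'), tmp)

# Read the header plus one row just to capture the column names
hdr <- names(read.csv(tmp, nrows = 1))

# Skip the header line plus the first 2 data rows, then read 3 rows;
# header = FALSE stops the first line read from being used as names
out <- read.csv(tmp, skip = 1 + 2, nrows = 3,
                header = FALSE, col.names = hdr)
print(out)
```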
 
If the file is not csv, use read.table and specify the separator/delimiter. Since read.table defaults to header = FALSE, the first line read after skipping is kept as data.
 

read.table("test.csv", sep = ",", nrows = 3, skip = 2)
 
  V1        V2
1  E 0.2472908
2  U 0.2748309
3  R 1.1791559
 
For large files, use the fread() function in the data.table package for faster import.
 

data.table::fread("test.csv", sep = ",")
 
   V1         V2
1:  F -0.5786439
2:  E  0.2472908
3:  U  0.2748309
4:  R  1.1791559
5:  K -0.1258598
6:  X -0.8898289
7:  L  0.4627274
8:  C -0.7088007

 
To read the first 3 rows,
 
data.table::fread("test.csv", sep = ",", nrows = 3)
 
   V1         V2
1:  F -0.5786439
2:  E  0.2472908
3:  U  0.2748309
 

To skip the first 2 lines and read the next 3 rows, do the following. Note that fread counts the header as a line when skipping.
 
data.table::fread("test.csv", sep = ",", nrows = 3, skip = 2)

   V1        V2
1:  E 0.2472908
2:  U 0.2748309
3:  R 1.1791559
 
readLines() is useful for checking the contents and delimiter of a file prior to importing, as it returns each line as plain text without trying to split it on a delimiter.
 
readLines("test.csv")
 
[1] "\"V1\",\"V2\""            "\"F\",-0.578643919152124"
[3] "\"E\",0.247290774914572"  "\"U\",0.274830888945797"
[5] "\"R\",1.179155856395"     "\"K\",-0.125859842900427"
[7] "\"X\",-0.889828858494609" "\"L\",0.462727351834403"
[9] "\"C\",-0.708800746374982"
 
To read the first 4 lines,
 
readLines("test.csv", n = 4)
 
[1] "\"V1\",\"V2\""            "\"F\",-0.578643919152124"
[3] "\"E\",0.247290774914572"  "\"U\",0.274830888945797" 
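
The lines returned by readLines() can also be used to guess the delimiter before importing. The lines below are a rough sketch that counts a few candidate delimiters in the first lines of a temporary file. The candidate list and the variable names are assumptions for illustration, and a simple count like this can be fooled by delimiters that appear inside quoted text.

```r
tmp <- tempfile(fileext = ".csv")
writeLines(c('"V1","V2"', '"F",-0.58', '"E",0.25'), tmp)

first_lines <- readLines(tmp, n = 3)

# Count how often each candidate delimiter occurs in the sampled lines
candidates <- c(",", ";", "\t", "|")
counts <- sapply(candidates, function(d) {
  sum(lengths(regmatches(first_lines,
                         gregexpr(d, first_lines, fixed = TRUE))))
})

# The most frequent candidate is taken as the likely delimiter
guess <- names(which.max(counts))
print(guess)
```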
 

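scan() is similar to readLines() but treats each item as a separate element, so the output is not grouped by rows. By default items are split on whitespace, which is why the commas stay attached to the values below; specify sep = "," to split on commas.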
 

scan("test.csv", what = "list", nlines = 4)

Read 8 items
[1] "V1"                  ",\"V2\""             "F"                
[4] ",-0.578643919152124" "E"                   ",0.247290774914572"
[7] "U"                   ",0.274830888945797" 
 
To skip the first 2 lines and read the next 4 lines (note that the header is counted as line 1 when skipping),
 
scan("test.csv", what = "list", nlines = 4, skip = 2)
 
Read 8 items
[1] "E"                   ",0.247290774914572"  "U"                
[4] ",0.274830888945797"  "R"                   ",1.179155856395"  
[7] "K"                   ",-0.125859842900427"
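
With sep = "," scan() splits on commas rather than whitespace, and a list passed to what reads the fields as typed columns. The lines below are a minimal sketch on a temporary file with made-up values in the same layout as test.csv. The variable names are illustrative.

```r
tmp <- tempfile(fileext = ".csv")
writeLines(c('"V1","V2"', '"F",-0.58', '"E",0.25', '"U",0.27'), tmp)

# Every comma-separated cell becomes one character item
cells <- scan(tmp, what = character(), sep = ",", quiet = TRUE)
print(cells)

# A list in 'what' reads the fields column by column with the given types;
# skip = 1 leaves out the header line
cols <- scan(tmp, what = list(V1 = character(), V2 = numeric()),
             sep = ",", skip = 1, quiet = TRUE)
str(cols)
```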

 
If the file is compressed, e.g. with gzip, use gzfile().
 
To read the file in,
 
read.csv(gzfile("test.csv.gz"))

 
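To write to the gz file (opening with "w" overwrites any existing content),
  
a <- gzfile("test.csv.gz", "w")
  
cat("New1, 1111 \n New2, 22222\n", file = a)

#close the connection so that the data is flushed to the file
close(a)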
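
Writing and reading can be combined into a round trip to check that the compressed file holds what was intended. The lines below are a minimal sketch, assuming a temporary file path and a small made-up data frame. The connection has to be closed before the file is read back.

```r
gz_path <- file.path(tempdir(), "demo.csv.gz")
original <- data.frame(V1 = c("F", "E"), V2 = c(1.5, 2.5))

# Write through a gzip connection, then close it to flush the data
con <- gzfile(gz_path, "w")
write.csv(original, con, row.names = FALSE)
close(con)

# Read the compressed file back; read.csv opens and closes the connection
back <- read.csv(gzfile(gz_path))
print(back)
```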








Tuesday, 15 August 2017

Creating combinations of elements using expand.grid

To generate all combinations of elements from two or more vectors, use expand.grid().
 
 
expand.grid(c(1:3), LETTERS[1:3]) 
 

  Var1 Var2
1    1    A
2    2    A
3    3    A
4    1    B
5    2    B
6    3    B
7    1    C
8    2    C
9    3    C
 
   
 
 
expand.grid(c(1:3),LETTERS[1:3],letters[1:2])
 
 
   Var1 Var2 Var3
1     1    A    a
2     2    A    a
3     3    A    a
4     1    B    a
5     2    B    a
6     3    B    a
7     1    C    a
8     2    C    a
9     3    C    a
10    1    A    b
11    2    A    b
12    3    A    b
13    1    B    b
14    2    B    b
15    3    B    b
16    1    C    b
17    2    C    b
18    3    C    b
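
Naming the arguments gives the columns meaningful names instead of Var1, Var2 and so on, and the number of rows is the product of the vector lengths. A small sketch follows; the column names num, upper and lower are arbitrary.

```r
grid <- expand.grid(num = 1:3, upper = LETTERS[1:3], lower = letters[1:2],
                    stringsAsFactors = FALSE)  # keep letters as character

# One row per combination: 3 * 3 * 2 rows in total
print(nrow(grid))
print(head(grid, 4))
```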

 
 


Tuesday, 29 November 2016

data.table

The data.table package allows R to handle very large data sets, typically tens or hundreds of millions of rows, efficiently. This includes loading/importing the data and aggregating the data.
 
To import a flat file with a very large number of rows, data.table provides the fread function.
 
library(data.table)
Data <- fread("data.csv", sep = ",", header = TRUE)

  
To aggregate the data set: 
  
Agg <- as.data.table(iris)[, list(Avg_Sepal_Length = mean(Sepal.Length)), by = "Species"]
 
When aggregating multiple columns at the same time:
 
AggMC <- as.data.table(iris)[, list(Avg_Sepal_Length = mean(Sepal.Length), Avg_Petal_Length = mean(Petal.Length)), by = "Species"]
 
When aggregating all columns other than the grouping column:
 
AggAC <- as.data.table(iris)[, lapply(.SD, mean), by = "Species"]
 
   
When aggregating by multiple grouping columns:

AggMCMG <- as.data.table(CO2)[, list(Avg_Conc = mean(conc), Total_Uptake = sum(uptake)), by = c("Plant", "Type")]
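
When a table has several non-numeric columns, applying mean to every column in .SD is not meaningful. The lines below are a minimal sketch that restricts the aggregation to chosen columns with .SDcols, using the built-in CO2 data set. The grouping columns and the object names co2 and AggSD are chosen only for illustration.

```r
library(data.table)

co2 <- as.data.table(CO2)

# Average only the numeric columns conc and uptake within each group
AggSD <- co2[, lapply(.SD, mean),
             by = c("Type", "Treatment"),
             .SDcols = c("conc", "uptake")]
print(AggSD)
```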