Replaced all parentheses ( ) by curly brackets { }
- In this post, author Radford Neal do simple simulation and found there is a big difference between using “()” and “{}” as following:
> a <- 5; b <- 1; c <- 4 > f <- function (n) for (i in 1:n) d <- 1/{a*{b+c}} > g <- function (n) for (i in 1:n) d <- 1/(a*(b+c)) > system.time(f(1000000)) user system elapsed 3.92 0.00 3.94 > system.time(g(1000000)) user system elapsed 4.17 0.00 4.17
In my own function of robust estimation method, I have to do some simply multiplication and matrix multiplication as well. Sometime thresholding algorithm need to manipulate vectors, matrices and numbers. So some parentheses exist in my function. I follow the suggestion of Radford Neal and replace all parentheses by curly brackets. My function improve a lot.
# All parentheses:
user system elapsed
18.036 0.513 20.257
# All brackets
user system elapsed
16.190 0.475 16.728
The elapsed time is surprisingly improved. Based on the defination of these three times, (User time is seconds the computer spent doing calculations. System time is time the operating system core spent responding to the program’s requests. Elapsed time is the sum of those two, plus whatever “waiting around” your program and/or the OS had to do.) the improvement of curly brackets improve the waiting time of program. Another post about What caused elapsed time much longer than user time? gives me some clues. My case is almost all the time spent was spent in user, not sys(which is only 0.513). That means that my OS is counting the time my program has to wait for data against my program, not against the operating system. In this case, the point that I can improve is decrease the IO frequency.
Use “cmpfun” compile function to Byte Code.
With this in mind, I try simulations to show the differences and an improvement by using “cmpfun”. The improvement is significant and worth trying.
a <- 1:10000
f <- function (n) {
for (i in 1:n)
b <- sqrt(sum(a^2))
return(b)
}
g <- function (n) {
for (i in 1:n)
b <- sqrt(sum(as.numeric(a*a)))
return(b)}
k <- function (n) {
for (i in 1:n)
b <- F2norm(a)
return(b)}
# Try cmpfun compiler
TryF <- cmpfun(
function(n){
for (i in 1:n)
b <- sqrt(sum(a^2))
return(b)
})
system.time(T_f <- f(10000))
# user system elapsed
# 0.784 0.146 0.941
system.time(T_g <- g(10000))
# user system elapsed
# 1.066 0.181 1.265
system.time(T_k <- k(10000))
# user system elapsed
# 0.769 0.177 0.954
system.time(T_try <- TryF(10000))
# user system elapsed
# 0.565 0.124 0.690
T_f; T_g; T_k; T_try
# [1] 577393.6
# [1] 577393.6
# [1] 577393.6
# [1] 577393.6
Another solution is by this post. Tal mentioned that without modification of every function and add cmpfun() on them, you can simply run the following code once, in the beginning of your code.
require(compiler)
enableJIT(3)
理解统计学将有助于更好的理解这个世界的运行与发展!
–Xiaorui (Jeremy) Zhu
Knowledge is what we know. Also, what we know we do not know. We discover what we do not know. Essentially by what we know.
Thus knowledge expands.
With more knowledge we come to know More of what we do not know. Thus knowledge expands endlessly.
…
All knowledge is, in final analysis, history.
All sciences are, in the abstract, mathematics.
All judgements are, in their rationale, statistics.
—-C Radhakrishna Rao
“STATISTICS AND TRUTH Putting Chance to Work”