Hi! I did write an implementation in R at one stage, but I've not been maintaining it. The computation is inherently iterative (at least as far as I've been able to figure out), and doesn't suit R's preferred-for-efficiency apply() paradigm, so calculations on long lists can be very slow in R. If you have a C compiler available for your system, I'd suggest compiling the C program and then calling out to it from R using the pipe() function. (And yes, having written this, I've realized that I should write an R package for RBO. I've added it to my list of TODOs.)

William

]]>Thanks for the C code. I was able to get it working. However, I often use R, and I was wondering whether you have had a chance to port the code to R.

Ritesh

]]>

(LaTeX:

RBO_{EXT} = \frac{1 - p}{p} \left( \sum_{d=1}^{l} \frac{2X_d}{|S_{:d}| + |L_{:d}|} p^d + \sum_{d=s+1}^{l} \frac{2X_s(d-s)}{(|S_{:s}| + |L_{:s}|)d} p^d \right) + \left( \frac{X_l - X_s}{l} + \frac{2X_s}{|S_{:s}| + |L_{:s}|} \right) p^l

)

where all the $X_k$ include any ties that go over rank $k$.

]]>RBO_{EXT} = \frac{p}{1-p}\left(\sum_{d=1}^{l}\frac{2X_d}{S_{:d}+L_{:d}}p^d+\sum_{d=s+1}^{l}\frac{2X_s(d-s)}{(S_{:d}+L_{:d})d}p^d\right)+\left(\frac{X_l-X_s}{s}+\frac{2X_s}{S_{:d}+L_{:d}}\right)p^l

]]>In Equation 32, the value of $l$ (the length of the longer list) must be unambiguous as a normalizing divisor. Calculate $X_s / s = A_s$ and $X_d / d = A_d$ as in Equation 28. There's then a single $X_s$ that doesn't have $s$ as a divisor. This is defined as implied in Equation 28, viz. if the longer list is tied over rank $s$ (the shorter one can't be), then include all the ties in calculating $X_s$.

Hope this is clear. I've now added tie handling to the todo list for my RBO implementation.

]]>http://www.staff.amu.edu.pl/~andrzejz/files/rbo_calc.py

Should I replace the X_d/d part of Equation 32 with Equation 29 to handle ties?

Thank you.

]]>I'm doing pairwise comparisons between ranked lists of genes returned by different methods. Since all these methods are implemented in Python, I will try to rewrite the RBO_EXT function.

Again, many thanks!

]]>I have to admit, I'd not willingly attempt to calculate extrapolated RBO on uneven rankings by hand; I'd just use the implementation at http://www.umiacs.umd.edu/~wew/downloads/rbo-0.2.1.tar.gz. Let me know if you have trouble installing or using this; I'm happy to reimplement in either Python or R if you like.

If you did want to calculate RBO_EXT(<A, B, C, D, E, H>, <D, B, F, A>) by hand, you'd need to calculate the overlaps at ranks 1 through 6 (the length of the longer of the two lists), then plug the values into Equation 32 from the paper. Here are the overlaps, and the various intermediate values we need for Equation 32 (s = 4 is the length of the shorter list, l = 6 the length of the longer):

     d   R1  R2  X_d  X_d/d  X_s(d-s)/(sd)
    ---------------------------------------
     1   A   D   0    0      -
     2   B   B   1    1/2    -
     3   C   F   1    1/3    -
 s   4   D   A   3    3/4    -
     5   E   -   3    3/5    3/20
 l   6   H   -   3    3/6    6/24
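
The X_d column can be checked mechanically. Here is a minimal Python sketch, assuming rankings with no ties and no repeated items; the function name `overlaps` is just an illustrative choice, not something taken from the C implementation:

```python
def overlaps(r1, r2):
    """Return [X_1, ..., X_l]: the number of items common to both
    prefixes at each depth d, up to the length l of the longer list.

    Assumes no ties and no duplicate items within a list.
    """
    l = max(len(r1), len(r2))
    seen1, seen2 = set(), set()
    result = []
    for d in range(l):
        # Extend each prefix by one item while the list still has items.
        if d < len(r1):
            seen1.add(r1[d])
        if d < len(r2):
            seen2.add(r2[d])
        result.append(len(seen1 & seen2))
    return result


print(overlaps(list("ABCDEH"), list("DBFA")))  # [0, 1, 1, 3, 3, 3]
```

Recomputing the set intersection at every depth is quadratic in the worst case, which is fine for a hand check but may be slow for very long lists.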

Now to place them in the formula. I'll do this using a mix of LaTeX and R syntax; I hope that's clear enough. (I'm assuming p = 0.98.)

(1) \sum_{d=1}^l (X_d / d) * p^d

= sum(c(0, 1/2, 1/3, 3/4, 3/5, 3/6) * 0.98 ^ (1:6))

= 2.47

(2) \sum_{d=s+1}^l [(X_s (d - s)) / (sd)] * p^d

= sum(c(3/20, 6/24) * 0.98 ^ (5:6))

= 0.36

(3) [(X_l - X_s) / l + X_s / s] * p^l

= ((3 - 3) / 6 + (3/4)) * 0.98 ^ 6

= 0.66

RBO_EXT = ((1 - 0.98) / 0.98) * ((1) + (2)) + (3)

= ((1 - 0.98) / 0.98) * (2.47 + 0.36) + 0.66

= 0.72

which (thank God!) is the same value as the software implementation gives (though I only got there by hand after correcting several slips).
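
For anyone who wants to repeat the hand calculation in Python, the three terms above can be put together as in the sketch below. It is only a sketch: it assumes untied rankings with no repeated items and a non-empty shorter list, the name `rbo_ext` is an illustrative choice, and it is a direct transcription of the worked steps (1) to (3) rather than a port of the C implementation.

```python
def rbo_ext(r1, r2, p=0.98):
    """Extrapolated RBO for two possibly uneven rankings (no ties).

    Follows the three terms of the hand calculation:
      (1) sum over d = 1..l   of (X_d / d) * p^d
      (2) sum over d = s+1..l of (X_s * (d - s) / (s * d)) * p^d
      (3) ((X_l - X_s) / l + X_s / s) * p^l
    Result: ((1 - p) / p) * ((1) + (2)) + (3)
    """
    short, long_ = sorted((r1, r2), key=len)
    s, l = len(short), len(long_)

    # Overlap X_d at each depth d = 1..l.
    seen_s, seen_l = set(), set()
    x = []
    for d in range(l):
        if d < s:
            seen_s.add(short[d])
        seen_l.add(long_[d])
        x.append(len(seen_s & seen_l))
    x_s, x_l = x[s - 1], x[l - 1]

    term1 = sum(x[d - 1] / d * p ** d for d in range(1, l + 1))
    term2 = sum(x_s * (d - s) / (s * d) * p ** d for d in range(s + 1, l + 1))
    term3 = ((x_l - x_s) / l + x_s / s) * p ** l
    return (1 - p) / p * (term1 + term2) + term3


print(round(rbo_ext(list("ABCDEH"), list("DBFA")), 2))  # 0.72
```

When the two lists have the same length, the second sum is empty and the third term reduces to (X_l / l) * p^l, so two identical rankings score 1.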

William

]]>