Hi Giulio, hi Ettore,
(cc grass5 list)
thanks a lot for your efforts! With this mail I'd like to get
more people interested in this topic... hi developers!
Developers: Prof. Antoniol was kind enough to make a first attempt at
GRASS clone detection. A clone is a piece of code that is very
similar to another piece of code (poor man's definition).
If a clone appears several times, it should become a library
function to simplify maintenance. Detecting clones is not that
easy, though. For an introduction, you may read this paper:
Paolo Tonella, ITC-irst: "An Introduction to Clone Detection"
http://mpa.itc.it/grass2001/tonella2001_clones.ps
I am not re-sending the new GRASS clone analysis here due to bandwidth
limitations, but I have put this preliminary analysis online:
http://mpa.itc.it/markus/tmp/grass.cln
(550k, ASCII)
Please read on in Giulio Antoniol's mail below.
Giulio: let's wait for some comments... I'll forward if needed.
Thanks!
Markus
On Thu, Mar 14, 2002 at 04:28:49PM +0100, antoniol wrote:
Hi Markus
sorry for the delay, we are really busy at the moment...
here is the first shot. I used:
grass5src_cvs_snapshot_experimentalMar_8_2002.tar.gz
Clones were extracted at the function level, i.e., the finest-grained unit
considered was the C function (we did not search for clones between
compound statements).
Out of about 20000 functions there are about 16777 over 5 LOC, with
something like 5000 clones.
Clones were computed in the most stringent way (exact matching).
When you see:
---------------------------------------------
15402 15317 12498 5391
/ponza2/grass/src.contrib/GMSL/ogl3d_linux/gsf/open.c reverse 163 168
/ponza2/grass/src.contrib/GMSL/ogl3d_linux/gsf/gsd_img.c reverse 170 175
/ponza2/grass/src.contrib/CERL/SGI/libimage/open.c reverse 163 168
/ponza2/grass/src/libes/libimage/open.c reverse 214 219
---------------------------------------------
it means that the functions reverse in the 4 files open.c, gsd_img.c,
open.c (SGI) and open.c (libimage)
are clones; the function body starts at line 163 and ends at line 168, etc.
About 2 hours were required to parse the code and produce the result. Our
C parser is not very precise, but it is extremely robust; thus some
functions may have been missed in the computation.
Please consider this a very preliminary result. I did the work in
cooperation with Ettore Merlo, who wrote the clone recognizer. He told me
he will produce a finer-grained classification plus a color HTML
visualization.
We just need to find another couple of hours to process the data.
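As a rough illustration of function-level exact matching as described above, here is a minimal Python sketch. It is not Ettore Merlo's recognizer, whose internals are not described in this mail. It assumes the function bodies have already been extracted as strings, and the names `normalize` and `find_exact_clones`, the whitespace normalization, and the use of `end - start + 1` as the LOC count are all assumptions made for the example.

```python
from collections import defaultdict


def normalize(body):
    """Collapse whitespace so that pure layout differences are ignored.

    This normalization is an assumption; the mail only says "exact matching".
    """
    return " ".join(body.split())


def find_exact_clones(functions, min_loc=5):
    """Group functions whose normalized bodies are identical.

    `functions` is an iterable of (path, name, start, end, body) tuples.
    Functions of `min_loc` lines or fewer are skipped ("over 5 LOC").
    Returns a list of groups, each a list of (path, name, start, end).
    """
    groups = defaultdict(list)
    for path, name, start, end, body in functions:
        if end - start + 1 <= min_loc:
            continue  # too short to be considered
        groups[normalize(body)].append((path, name, start, end))
    # Only bodies that occur more than once are clones.
    return [members for members in groups.values() if len(members) > 1]
```

Each returned group corresponds to one block of the listing: a set of functions, with file and line range, that share the same body.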
In the meantime, I take the liberty of suggesting that we cooperate on
the clone documentation and refactoring process. We are basically very
interested in obtaining data from the GRASS developers on clone effects. For
example, since you have a bug tracking tool, we could correlate clones with
removed bugs: how many times did a removed bug affect a clone, and were
the clones debugged?
Are there cases where automatic refactoring is possible? Or can we use
clone information to cluster functions into libraries?
Let us know, ciao
Giulio
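The clone/bug correlation suggested above could start from something like the following Python sketch. It assumes each file entry in the listing has the form "path function start end", as in the example block, and that bug fixes are available as (path, line) pairs. That fix format is hypothetical and is not something the GRASS bug tracker is known to export. The names `CloneEntry`, `parse_entry` and `fixes_hitting_clones` are made up for the example.

```python
from typing import NamedTuple


class CloneEntry(NamedTuple):
    path: str
    function: str
    start: int
    end: int


def parse_entry(line):
    """Parse one 'path function start end' entry of the clone listing.

    Assumes the path contains no whitespace; raises ValueError otherwise.
    """
    path, function, start, end = line.split()
    return CloneEntry(path, function, int(start), int(end))


def fixes_hitting_clones(entries, fixes):
    """Return the (path, line) fixes that fall inside a cloned function body."""
    hits = []
    for path, line in fixes:
        if any(e.path == path and e.start <= line <= e.end for e in entries):
            hits.append((path, line))
    return hits


entries = [
    parse_entry("/ponza2/grass/src/libes/libimage/open.c reverse 214 219"),
    parse_entry("/ponza2/grass/src.contrib/CERL/SGI/libimage/open.c reverse 163 168"),
]
# Hypothetical fix locations, for illustration only.
fixes = [
    ("/ponza2/grass/src/libes/libimage/open.c", 216),
    ("/ponza2/grass/src/libes/libimage/open.c", 300),
]
hits = fixes_hitting_clones(entries, fixes)
```

For every hit, one could then check whether the other members of the same clone group received the same fix.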