Here is a shell script that computes the number of duplicate code segments and the total number of copied lines from the output of EasyPMD's Copy/Paste Detector. In that output, each duplicate segment starts with a line of the form:
Found a x line (y tokens) duplication in the following files:
The script extracts these header lines, counts them, and adds up their line counts (the x values).
#!/bin/bash
# Usage: pass the Copy/Paste Detector report file as the first argument.
if [[ $# -eq 0 ]]; then
    echo 'Please specify a file name.' >&2
    exit 1   # non-zero status, since a missing argument is an error
fi
# define counters
sum_lines=0
occurrences=0
# Read only the header lines, which look like:
# Found a X line (Y tokens) duplication in the following files:
# The third word is the line count; the remaining words are ignored.
while read -r _ _ lines _
do
    sum_lines=$((sum_lines + lines))
    occurrences=$((occurrences + 1))
done < <(grep '^Found a ' "$1")
echo "$occurrences code duplicates, $sum_lines lines in total."
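For comparison, the same counting can be done in a single awk pass, without a loop in the shell. The following is only a sketch: the function name summarize_cpd and the sample report (file names and numbers) are made up for illustration, and it assumes every header line starts with "Found a" followed by the line count as the third word.

```shell
#!/bin/bash
# Hypothetical helper: summarize a CPD text report in one awk pass.
# For each header line, add the third field (the line count) to the sum
# and increment the number of duplicates.
summarize_cpd() {
    awk '/^Found a [0-9]+ line/ { sum += $3; n++ }
         END { printf "%d code duplicates, %d lines in total.\n", n, sum }' "$1"
}

# Made-up sample report, only to demonstrate the expected input format.
report=$(mktemp)
cat > "$report" <<'EOF'
Found a 12 line (80 tokens) duplication in the following files:
Starting at line 10 of /src/Foo.java
Starting at line 42 of /src/Bar.java

Found a 33 line (239 tokens) duplication in the following files:
Starting at line 5 of /src/Baz.java
Starting at line 77 of /src/Qux.java
EOF

result=$(summarize_cpd "$report")
echo "$result"
rm -f "$report"
```

With the sample report above this prints "2 code duplicates, 45 lines in total.". Anchoring the pattern at the start of the line keeps file paths that happen to contain the word "Found" from being counted.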