The second exam though is optional.
The exam file syntax is:
For sorting we have to be able to identify from which file the entry stems from - so we prepend a "file-id" column - with value 2 for 1st exam and 1 for 2nd exam.
We sort now both files, firstly on the student-id (column 2) and secondly on the file-id (column 1).
i.e.:
marks1.txt:
012 3 --> 2 012 3
marks2.txt:
012 4 --> 1 012 4
sort -k 2,2 -k 1,1 <( sed -e 's/^/2 /' marks1.txt ) <( sed -e 's/^/1 /' marks2.txt )
After sorting we obtain:
1 012 4
2 012 3
If we have duplicates we only want to retain the first line (the most recent mark).
So lets first remove the file-id column again with a
cut --delimiter=' ' --fields=2-
Then remove duplicates, checking on the student-id column (1st column 3 bytes wide):
uniq -w 3Finally, we wan to link the student-ids to the student's names.
Let's say we have a file id_names.txt with the format:
All we have to do is make a join the student-id column.
The whole code:
join <( sort -k 2,2 -k 1,1 <( sed -e 's/^/2 /' marks1.txt ) <( sed -e 's/^/1 /' marks2.txt ) | cut --delimiter=' ' --fields=2- | uniq -w 3 ) <( sort id_names.txt ) > marks.txtd