Friday, February 18, 2011

finding duplicate entries with duplicated()

I'm working with a data set from a long-term study where plants get re-measured every year. Occasionally a new record gets inserted for a plant that already occurs in the data. These repeats can be identified with the duplicated() command.

I can get a vector of TRUE/FALSE values that tells me which entries are duplicated.

duplicated(data$tag)


This is a long vector, so I can screen that output for just the TRUE values by nesting the duplicated() command within a which() command, which returns the positions of the duplicated entries.

which(duplicated(data$tag))
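
Here is a minimal sketch of both calls on a small made-up data frame. The tag and height values are hypothetical, not from my actual data set.

```r
# Hypothetical toy data: tag 102 was entered twice
data <- data.frame(
  tag    = c(101, 102, 103, 102, 104),
  height = c(12.5, 30.1, 8.4, 30.1, 22.0)
)

# Logical vector: TRUE marks the second and later occurrences of a tag
dup <- duplicated(data$tag)

# Row positions of the repeated entries
dup_rows <- which(dup)

# All rows that share a duplicated tag, including the first occurrence
all_copies <- data[data$tag %in% data$tag[dup], ]

print(dup_rows)
print(all_copies)
```

Note that duplicated() flags only the repeat and not the first occurrence of a tag, so the %in% step is one way to pull up every copy of a record for side-by-side checking.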

The unique() command can also be used for a similar task: it returns each distinct value once, with the repeats dropped.
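
As a rough illustration of unique(), again assuming a made-up vector of tags (the values are hypothetical):

```r
# Hypothetical tags: 102 appears twice
tags <- c(101, 102, 103, 102, 104)

# Distinct tag values, in order of first appearance
unique_tags <- unique(tags)

# Number of extra (repeated) entries in the original vector
n_repeats <- length(tags) - length(unique_tags)

print(unique_tags)
print(n_repeats)
```

Comparing the two lengths is a quick way to check whether any repeats exist at all before tracking them down with duplicated().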
