Here's a checklist of functions that I hope to learn how to use. They look very useful for manipulating messy data. Information in quotes is from the R help files.
aggregate()
"Splits the data into subsets, computes summary statistics for each, and returns the result in a convenient form."
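Here is a minimal sketch of how I expect aggregate() to work. The data frame, its column names, and the choice of mean as the summary are made up for illustration.

```r
# Hypothetical data: scores for two groups (not from any real dataset)
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  score = c(10, 20, 1, 2, 3)
)

# Mean score within each group, using the formula interface
means <- aggregate(score ~ group, data = scores, FUN = mean)
print(means)   # one row per group: a = 15, b = 2
```

The result is itself a data frame, which is presumably the "convenient form" the help file mentions.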
complete.cases()
"Return a logical vector indicating which cases are complete, i.e., have no missing values."
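A small sketch of using complete.cases() to drop rows with missing values. The data frame is invented for the example.

```r
# Hypothetical data with some missing values
df <- data.frame(
  x = c(1, NA, 3),
  y = c("a", "b", NA)
)

keep <- complete.cases(df)   # TRUE FALSE FALSE
clean <- df[keep, ]          # keep only the rows with no NA in any column
print(clean)
```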
cumsum(x), cumprod(x), cummax(x), cummin(x)
"Cumulative Sums, Products, and Extremes"
"Returns a vector whose elements are the cumulative sums, products, minima or maxima of the elements of the argument."
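A quick sketch of the four cumulative functions on one made-up vector, so the results can be compared side by side.

```r
# Hypothetical input vector
x <- c(3, 1, 4, 1, 5)

sums  <- cumsum(x)    # 3 4 8 9 14
prods <- cumprod(x)   # 3 3 12 12 60
maxes <- cummax(x)    # 3 3 4 4 5
mins  <- cummin(x)    # 3 1 1 1 1
```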
union(x, y), intersect(x, y), setdiff(x, y), setequal(x, y), is.element(el, set) (help topic: sets)
"Performs set union, intersection, (asymmetric!) difference, equality and membership on two vectors"
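A sketch of the set functions on two made-up vectors. Swapping the arguments to setdiff() shows what "asymmetric" means here.

```r
# Hypothetical vectors
x <- c(1, 2, 3, 4)
y <- c(3, 4, 5)

u  <- union(x, y)        # 1 2 3 4 5
i  <- intersect(x, y)    # 3 4
d1 <- setdiff(x, y)      # 1 2  (in x but not in y)
d2 <- setdiff(y, x)      # 5    (in y but not in x)

same   <- setequal(c(1, 2), c(2, 1, 1))   # TRUE: order and duplicates are ignored
member <- is.element(3, x)                # TRUE
```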
make.names()
"Make Syntactically Valid Names"
"Make syntactically valid names out of character vectors."
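A sketch of make.names() cleaning up the kind of column headers that come out of a spreadsheet. The header strings are made up.

```r
# Hypothetical messy column headers
raw <- c("my var", "2nd", "ok_name", "if")

fixed <- make.names(raw)
print(fixed)   # "my.var" "X2nd" "ok_name" "if."

# unique = TRUE also de-duplicates the result
deduped <- make.names(c("a", "a"), unique = TRUE)   # "a" "a.1"
```

Invalid characters become ".", an "X" is prepended where a name would otherwise start illegally, and a "." is appended to reserved words such as if.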
make.unique()
"Make Character Strings Unique"
"Makes the elements of a character vector unique by appending sequence numbers to duplicates."
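A sketch of make.unique() on a made-up vector with repeated labels.

```r
# Hypothetical labels with duplicates
labels <- c("a", "b", "a", "a")

uniq  <- make.unique(labels)              # "a" "b" "a.1" "a.2"
uniq2 <- make.unique(labels, sep = "_")   # "a" "b" "a_1" "a_2"
```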
chartr()
"Character Translation and Casefolding"
"Translate characters in character vectors, in particular from upper to lower case or vice versa."
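A sketch of chartr() alongside toupper() and tolower(), which are documented on the same help page. The strings are made up.

```r
# chartr maps each character in 'old' to the character at the same position in 'new'
swapped <- chartr("abc", "xyz", "aabbcc")   # "xxyyzz"

# Case folding
up   <- toupper("hello")   # "HELLO"
down <- tolower("HELLO")   # "hello"
```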
grep(), grepl(), regexpr(), gregexpr()
"Pattern Matching and Replacement"
"search for matches to argument pattern within each element of a character vector: they differ in the format of and amount of detail in the results."
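A sketch comparing what each of the four matching functions returns for the same pattern. The vector of fruit names is made up.

```r
# Hypothetical character vector
fruits <- c("apple", "banana", "cherry")

idx   <- grep("an", fruits)                 # 2: indices of matching elements
vals  <- grep("an", fruits, value = TRUE)   # "banana": the matching elements themselves
flags <- grepl("an", fruits)                # FALSE TRUE FALSE: one logical per element

# Position of the first match in each element (-1 means no match)
first <- as.integer(regexpr("an", fruits))  # -1 2 -1

# Positions of all matches; the result is a list with one entry per element
all_in_banana <- as.integer(gregexpr("an", fruits)[[2]])   # 2 4
```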
sub(), gsub()
"sub and gsub perform replacement of the first and all matches respectively."
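A sketch showing the difference between sub() and gsub() on made-up strings.

```r
first_only <- sub("a", "A", "banana")    # "bAnana": only the first match is replaced
every_one  <- gsub("a", "A", "banana")   # "bAnAnA": all matches are replaced

# A typical cleanup: collapse runs of whitespace into a single space
tidy <- gsub("\\s+", " ", "too   many    spaces")   # "too many spaces"
```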
split()
"Divide into Groups and Reassemble"
"split divides the data in the vector x into the groups defined by f. The replacement forms replace values corresponding to such a division. unsplit reverses the effect of split."
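A sketch of split() and unsplit() on a made-up vector and grouping.

```r
# Hypothetical values and the group each one belongs to
values <- c(1, 2, 3, 4, 5, 6)
groups <- c("a", "b", "a", "b", "a", "b")

parts <- split(values, groups)   # a list: $a is 1 3 5, $b is 2 4 6
print(parts)

# unsplit puts the pieces back in their original order
restored <- unsplit(parts, groups)
```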
cut()
"Convert Numeric to Factor"
"cut divides the range of x into intervals and codes the values in x according to which interval they fall. The leftmost interval corresponds to level one, the next leftmost to level two and so on."
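A sketch of cut() binning a made-up vector of ages. The break points and labels are my own choices for illustration.

```r
# Hypothetical ages
ages <- c(5, 17, 30, 65)

# By default the intervals are open on the left and closed on the right
bins <- cut(ages, breaks = c(0, 18, 40, 100))
print(levels(bins))   # "(0,18]" "(18,40]" "(40,100]"

codes <- as.integer(bins)   # 1 1 2 3: the level each value falls into

# Same breaks with readable labels
named <- cut(ages, breaks = c(0, 18, 40, 100),
             labels = c("child", "adult", "senior"))
```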