Composite Datatypes

Arrays, Vectors, Lists, or Ordered Arrays.

A familiar model for storing data so that it can be retrieved later is a row of mailboxes: numbered places, each holding one item.

Likewise, we can make an ordered list of data in which A1 is ‘hello’ and A2 is ‘goodbye’. An element is typically addressed with brackets. The key is that the elements are numbered sequentially. You can fill positions out of order, but the usual expectation is that you push one item onto the end, growing the size of the array by one.

A[1]=0.234234
A[2]=0.3234
A[3]=23.23
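The same idea can be sketched in Python, chosen here only for illustration. Python lists count from 0, so the A[1] above corresponds to a[0] below. The variable names and the appended value 1.5 are assumptions, not part of the original example.

```python
# A list stores values in numbered, sequential slots.
# Python counts from 0, so the first slot is a[0].
a = [0.234234, 0.3234, 23.23]

first = a[0]       # value in the first slot
a.append(1.5)      # push one (made-up) value onto the end; the list grows by one
length = len(a)    # number of slots now in use

print(first, length)
```

Appending is the usual way to grow a list. Assigning to a position that does not exist yet (for example a[10] = 2.0 here) raises an IndexError in Python rather than leaving gaps.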

Arrays can, of course, be multi-dimensional. In general, every element is presumed to be the same type of data. An element of a two-dimensional array is addressed with one index per dimension:

A[1,2]=0.234234
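As a rough Python sketch, a two-dimensional array can be modeled as a list of rows. The name grid, its 2-by-3 shape, and the zero filler values are assumptions for illustration. Only the stored value comes from the line above.

```python
# A two-dimensional array modeled as a list of rows (shape is illustrative).
grid = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

# A[1,2] in 1-based notation (row 1, column 2) is grid[0][1] in 0-based Python.
grid[0][1] = 0.234234

value = grid[0][1]
n_rows, n_cols = len(grid), len(grid[0])
print(value, n_rows, n_cols)
```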

Associative arrays, objects in JavaScript, named arrays, or hashes.

Numbered storage has limitations, so there is another type of storage that works much like a street address: the associative array. Instead of a number, we use a name (a key).

GeneName={"PTEN":"phosphatase and tensin homolog"}

I could have a variable called GeneInfo. I could store GeneInfo{'PTEN'}{'Chr'}='chromosome10', and then store all sorts of information in a way that is logically retrievable. This comes in handy a lot. Again, historically, the values in an unordered list or hash are all the same type of data.
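A minimal Python sketch of the nested lookup described above uses a dictionary of dictionaries. The 'Name' key and the 'UNKNOWN' lookup are hypothetical additions for illustration. The PTEN values are taken from the text.

```python
# An associative array (Python dict) keyed by gene symbol.
# Each value is itself a dict of named attributes.
gene_info = {}
gene_info["PTEN"] = {"Chr": "chromosome10"}

# 'Name' is a hypothetical attribute key added for this sketch.
gene_info["PTEN"]["Name"] = "phosphatase and tensin homolog"

chrom = gene_info["PTEN"]["Chr"]      # look up by name, then by attribute
missing = gene_info.get("UNKNOWN")    # .get returns None when the key is absent

print(chrom, missing)
```

Using .get for uncertain keys avoids the KeyError that gene_info["UNKNOWN"] would raise.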

We can get much more complex and mix these into larger data structures. In R, we use data frames, which can be thought of as hashes of arrays: each named column holds an ordered array of values. For example, let us load up some data!

Tables

Tables can be thought of as a worksheet. Below, we have the table hospital-data with multiple column headers. It comes from a publicly available CSV file, which you can download (link).

CSV & TSV Tables

Comma-separated values (CSV) and tab-separated values (TSV) files are plain text files in which the first line is typically a header and each following line is a row. They are usually encoded in ASCII.
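To make the header-plus-rows layout concrete, here is a small Python sketch using the standard csv module. The column names and rows are made up for illustration and are not taken from the hospital-data file.

```python
import csv
import io

# Made-up CSV text: the first line is the header, each later line is a row.
csv_text = (
    "name,city,beds\n"
    "Hospital A,Springfield,120\n"
    "Hospital B,Shelbyville,85\n"
)

# DictReader uses the header line as the keys for every row.
# For a TSV file, pass delimiter="\t" instead of relying on the comma default.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)          # each row maps column name -> string value

header = reader.fieldnames
beds = [int(row["beds"]) for row in rows]   # values arrive as text; convert as needed

print(header, beds)
```

Each parsed row is an associative array keyed by column name, and the list of rows is an ordered array, so a table combines both structures described earlier.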