--- title: "Introduction to MullerPlot package" author: "F. Farahpour" date: "`r Sys.Date()`" output: rmarkdown::html_vignette bibliography: bibliography.bib vignette: > %\VignetteIndexEntry{Introduction to MullerPlot package} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- Muller plots are plots which show the succession of different OTUs and their relative abundance/population dynamics over time. They are powerful and fascinating tools to visualize diversity and its dynamics, i.e. how diversity emerges and how changes over time. They are called Muller plots in honor of Hermann Joseph Muller which used them to explain his idea of Muller's ratchet [@Muller]. A big difference between Muller plots and normal box plots of abundances is that a Muller plot depicts not only the relative abundances but also succession of OTUs based on their genealogy/parental relation. To see examples of such plots look at [@Herron2013; @Kvitek2013; @Maddamsetti2015]. For generating a Muller plot one needs at least the abundance/population/frequency of OTUs over time/generations and their parental/genealogy/phylogeny relation. ## How to use MullerPlot package This package contains one function called "Muller.plot" and 3 data files in rda format. These data files provide an example of information about an artificial system with dynamics of 8 OTUs over 101 time steps. This artificial data is inspired by figure 2a of [@Herron2013]. To load the package run ```{r, echo=T} library(MullerPlot) ``` ## Inputs (arguments) of Muller.plot function The main inputs of this function are: 1. attributes: Attributes of OTUS which must contain names and parents of OTUs and can contain also optional color for each OTU. As an example you can load Attributes from our artificial example: ```{r, echo=T} data("Attributes") Attributes ``` This could be a matrix or data frame and must have 2 or three columns. The first column contains OTU names. This could be any numeric or character vector. There is no rule for the order of OTU names but as a facility user can control the order of OTUs in the final plot by the order of their names in this column. Order of appearance of OTUs in Muller plot is determined by their Genealogy relation but sister OTUs (OTUs with the same parent) has no preference in the order of appearance. So One can control their appearance by the order of their appearance in this column. The second column determines the parent of each OTU with NA instead of the parent of those OTU which their parents are not known or not present in the dynamics. This column must contain at least one NA. Each OTU in this column must appear also in the first column. The third column (optional) determines the color that we want to be attributed to our OTUs (first column). If this column is empty then a random vector of colors will be attributed to OTUs. This vector with the corresponding name of OTUs will be returned by Muller.plot function. 2. population.data: This input must provide information about population/abundance over time/generation. It could be a matrix/data frame with 3 columns which contains OTU names (numeric or character), time/generation (numeric) and abundance/population (numeric, greater than or equal to 0), respectively. To use this type of population.data you should set data.method="list". Names provided in the first column must be a member of the first column of attributes. There is no rule for the order of the rows. As an example of this type of data you can load PopulationDataList. Here you can see the first 10 rows of this data frame. ```{r, echo=T} data("PopulationDataList") PopulationDataList[1:10,] ``` population.data could also be an OTU table. Then it must be a matrix or data frame with N rows and T columns. N is the number of OTUs and T is the number of time steps. rownames of this matrix must be OTU names (the same as names in attributes, numeric or character, order is not important) and colnames must be time steps or generation (numeric). To use this type of population.data you should set data.method="table". There is no rule for the order of rows and columns as long as the rownames and colnames correspond the correct rows and columns, respectively. As an example of this type of data you can load PopulationDataTable. Here you can see the first 14 columns of this data frame. ```{r, echo=T} data("PopulationDataTable") PopulationDataTable[,1:14] ``` 3. data.method: Type of data provided as population.data. It must be one of "list" or "table". For more information see population.data above. 4. time.interval.method: This must be one of "linear" or "equal". By this input you determine how should be the distance between two successive time steps or generations. If "equal" is determined then the distances would be equal regardless of the value of time steps or generations otherwise the distance is determined by the real difference of two successive time steps or generations. ## How to use Muller.plot function Muller.plot function is easily used by passing its arguments. You can provide any graphical parameter. The function by default do not provide any legend for the plot but you can use output of the function to add legend to your plots. Here, we use data.method="list": ```{r, echo=T, fig.width=7.5,fig.height=3,fig.align='center'} data(PopulationDataList) data(Attributes) layout(matrix(c(1,2), nrow = 1), widths = c(0.8, 0.2)) par(mar=c(3, 3, 1, 0)) m.plot <- Muller.plot(attributes = Attributes, population.data = PopulationDataList, data.method = "list", time.interval.method = "equal",mgp=c(2,0.5,0)) plot(0,0,axes = F) par(mar=c(0, 0, 0, 0)) legend("right", legend = m.plot$name,col = as.character(m.plot$color),pch = 19) ``` If for example, you prefer to have the OTU "rkd D5891 bc" above its sister "ylnC-48" and "BAeR G11" above its sister "HwrK-41" just change their appearance order in the attribute. ```{r, echo=T, fig.width=7.5,fig.height=3,fig.align='center'} Attributes <- Attributes[c(1,3,2,4,7,5,6,8),] layout(matrix(c(1,2), nrow = 1), widths = c(0.8, 0.2)) par(mar=c(3, 3, 1, 0)) m.plot <- Muller.plot(attributes = Attributes, population.data = PopulationDataList, data.method = "list", time.interval.method = "equal",mgp=c(2,0.5,0)) plot(0,0,axes = F) par(mar=c(0, 0, 0, 0)) legend("right", legend = m.plot$name,col = as.character(m.plot$color),pch = 19) ``` In the next plot we use data.method="table". So we should load "PopulationDataTable". ```{r, echo=T, fig.width=7.5,fig.height=3,fig.align='center'} data(PopulationDataTable) par(mar=c(3, 3, 1, 1)) Attributes[,3] <- rainbow(8) m.plot <- Muller.plot(attributes = Attributes, population.data = PopulationDataTable, data.method = "table", time.interval.method = "equal",mgp=c(2,0.5,0)) ``` If you have samples from time/generations with different time intervals you can use time.interval.method="linear" to show the real interval between samples: ```{r, echo=T, fig.width=7.5,fig.height=3,fig.align='center'} colnames(PopulationDataTable) <- c(seq(0,1000,50),seq(1050,3000,100),seq(3500,33000,500)) par(mar=c(3, 3, 1, 1)) Attributes[,3] <- rainbow(8) m.plot <- Muller.plot(attributes = Attributes, population.data = PopulationDataTable, data.method = "table", time.interval.method = "linear",mgp=c(2,0.5,0)) ``` ## How to cite ```{r} citation(package = "MullerPlot") ``` ## References