API

Beluga.BMRevJumpPrior — Type

BMRevJumpPrior

Bivariate autocorrelated rates (Brownian motion) prior inspired by coevol (Lartillot & Poujol 2010) with an Inverse Wishart prior on the unknown covariance 2×2 matrix. Crucially, this is defined for the Node based model, i.e. states at model nodes are assumed to be states at nodes of the phylogeny.

Beluga.BranchKernel — Type

BranchKernel

Reversible jump kernel that introduces a WGD, decreases λ and increases μ on the associated branch.

Beluga.CRRevJumpPrior — Type

CRRevJumpPrior

Constant-rates model.

Beluga.DLWGD — Type

DLWGD{T<:Real,V<:ModelNode{T}}

Duplication, loss and WGD model. This holds a dictionary for easy access of the nodes in the probabilistic graphical model and the leaf names.

Beluga.DLWGD — Method

(m::DLWGD)(row::DataFrameRow)

Instantiate a model based on a row from a trace data frame. This returns a modified copy of the input model.

Beluga.DropKernel — Type

DropKernel

Reversible jump kernel that introduces a WGD and decreases λ on the associated branch.

Beluga.IRRevJumpPrior — Type

IRRevJumpPrior

Bivariate uncorrelated rates prior with an Inverse Wishart prior on the unknown covariance 2×2 matrix. Crucially, this is defined for the Branch based model, i.e. states at model nodes are assumed to be states at branches of the phylogeny.

Beluga.PArray — Type

PArray{T<:Real}

Ditributed array of phylogenetic profiles.

Beluga.PostPredSim — Type

PostPredSim(chain, data::DataFrame, n::Int64)

Perform posterior predictive simulations.

Beluga.Profile — Type

Profile{T<:Real}

Struct for a phylogenetic profile of a single family. Geared towards MCMC applications (temporary storage fields) and parallel applications (using DArrays). See also PArray.

Beluga.RevJumpChain — Type

RevJumpChain

Reversible jump chain struct for DLWGD model inference.

!!! note After construction, an explicit call to init! is required.

Beluga.SimpleKernel — Type

SimpleKernel

Reversible jump kernel that only introduces a WGD while not chnging λ or μ.

Beluga.addwgd! — Method

addwgd!(d::DLWGD, n::ModelNode, t, q)

Insert a WGD node with retention rate q at distance t above node n.

Beluga.addwgds! — Method

addwgds!(m::DLWGD, p::PArray, config::Array)

Add WGDs from array of named tuples e.g. [(lca="ath,cpa", t=rand(), q=rand())] and update the profile array.

Beluga.addwgds! — Method

addwgds!(m::DLWGD, p::PArray, config::String)
addwgds!(m::DLWGD, p::PArray, config::Dict{Int64,Tuple})

Add WGDs from a (stringified) dictionary (as in the wgds column of the trace data frame in rjMCMC applications) and update the profile array.

Beluga.asvector — Method

asvector(d::DLWGD)

Get a parameter vector for the DLWGD model, structured as [λ1, …, λn, μ1, …, μn, q1, …, qk, η].

Beluga.bayesfactors — Method

bayesfactors(trace::DataFrame, model::DLWGD, p::Float64)

Compute Bayes Factors for all branch WGD configurations. Returns a data frame that is more or less self-explanatory.

Beluga.getrates — Method

getrates(model::DLWGD{T})

Get the duplication and loss rate matrix (2 × n).

Beluga.getwgdtrace — Method

getwgdtrace(chain)

Summarize all WGD models from an rjMCMC trace. This provdes the data to evaluate retention rates for WGD models etc. Returns a dict of dicts with data frames (which is a horrible data structure, I know) structured as (branch1 => (1 WGD => trace, 2 WGDs => trace, ...), branch2 => (), ...).

Beluga.gradient — Method

gradient!(d::DLWGD, p::PArray{T})

Accumulate the gradient ∇ℓ(λ,μ,q,η|X) in parallel for the phylogenetic profile matrix p.

Beluga.gradient — Method

gradient(d::DLWGD, x::Vector)

Compute the gradient of the log likelihood under the DLWGD model for a single count vector x, ∇ℓ(λ,μ,q,η|x).

Warning

Currently the gradient seems to only work in NaN safe mode github issue

Beluga.init! — Method

init!(chain::RevJumpChain)

Initialize the chain.

Beluga.posteriorE! — Method

posteriorE!(chain)

Compute E[Xi|Xparent=1,λ,μ] for the joint posterior; i.e. the expected number of lineages at node i under the linear birth-death process given that there was one lineage at the parent of i, for each sample from the posterior. This can give an idea of gene family expansion/contraction.

Beluga.posteriorΣ! — Method

posteriorΣ!(chain)

Sample the covariance matrix of the bivariate process post hoc from the posterior under the Inverse Wishart prior. Based on Lartillot & Poujol 2010.

Beluga.pppvalues — Method

pppvalues(pps::PostPredSim)

Compute posterior predictive p-values based on the posterior predictive distribution and the observed sumary statistics (see e.g. Gelman et al. 2013).

Beluga.removewgd! — Function

removewgd!(d::DLWGD, n::ModelNode, reindex::Bool=true, set::Bool=true)

Remove WGD/T node n from the DLWGD model. If reindex is true, the model nodes are reindexed to be consecutive. If set is true, the model internals (transition and extinction probabilities) are recomputed.

Beluga.removewgds! — Method

removewgds(d::DLWGD)

Remove all WGD nodes from the model.

Beluga.setrates! — Method

setrates!(model::DLWGD{T}, X::Matrix{T})

Set duplication and loss rates for each non-wgd node|branch in the model. Rates should be provided as a 2 × n matrix, where the columns correspond to model node indices.

Distributions.logpdf! — Method

logpdf!(L::Matrix, d::DLWGD, x::Vector{Int64})

Compute the log likelihood under the DLWGD model for a single count vector x ℓ(λ,μ,q,η|x) and update the dynamic programming matrix (L).

Distributions.logpdf! — Method

logpdf!(L::Matrix, n::ModelNode, x::Vector{Int64})

Compute the log likelihood under the DLWGD model for a single count vector x ℓ(λ,μ,q,η|x) and update the dynamic programming matrix (L), only recomputing the matrix above node n.

Distributions.logpdf! — Method

logpdf!(d::DLWGD, p::PArray{T})
logpdf!(n::ModelNode, p::PArray{T})

Accumulate the log-likelihood ℓ(λ,μ,q,η|X) in parallel for the phylogenetic profile matrix. If the first argument is a ModelNode, this will recompute the dynamic programming matrices starting from that node to save computation. Assumes (of course) that the phylogenetic profiles are iid from the same DLWGD model.

Distributions.logpdf — Method

logpdf(d::DLWGD, x::Vector{Int64})

Compute the log likelihood under the DLWGD model for a single count vector x ℓ(λ,μ,q,η|x).

Beluga.AMMProposals — Type

AMMProposals(d)

Adaptive Mixture Metropolis (AMM) proposals, where d is the dimensionality of the rates vector (should be 2 × number of nodes in tree).

Beluga.ConstantDistribution — Type

ConstantDistribution(x)

A 'constant' distribution (Dirac mass), sometimes useful.

Beluga.MWGProposals — Type

MWGProposals

Proposals for the Metropolis-within-Gibbs algorithm. The MWG algorithm iterates over each node in the tree, resulting in very good mixing and fast convergence in terms of number of iterations, but has quite a high computational cost per generation.

Beluga.UpperBoundedGeometric — Type

UpperBoundedGeometric{T<:Real}

An upper bounded geometric distribution, basically a constructor for a DiscreteNonParametric distribution with the relevant probabilities.

Base.rand — Method

rand(d::DLWGD, N::Int64 [; condition::Vector{Vector{Symbol}}])

Simulate N random phylogenetic profiles under the DLWGD model, subject to the constraint that there is at least one gene in each clade specified in the condition array. (by default conditioning is on non-extinction).

Examples:

```julia-repl julia> # include completely extinct families julia> rand(d, N, condition=[])

julia> # condition on at least one lineage in both clades stemming from the root julia> rand(d, N, condition=Beluga.rootclades(d))

Base.rand — Method

rand(d::DLWGD)

Simulate a random phylogenetic profile from the DLWGD model.

Beluga.set! — Method

set!(d::DLWGD)

Compute all model internals in postorder.