API

Beluga.BMRevJumpPriorType
BMRevJumpPrior

Bivariate autocorrelated rates (Brownian motion) prior inspired by coevol (Lartillot & Poujol 2010) with an Inverse Wishart prior on the unknown covariance 2×2 matrix. Crucially, this is defined for the Node based model, i.e. states at model nodes are assumed to be states at nodes of the phylogeny.

source
Beluga.BranchKernelType
BranchKernel

Reversible jump kernel that introduces a WGD, decreases λ and increases μ on the associated branch.

source
Beluga.DLWGDType
DLWGD{T<:Real,V<:ModelNode{T}}

Duplication, loss and WGD model. This holds a dictionary for easy access of the nodes in the probabilistic graphical model and the leaf names.

source
Beluga.DLWGDMethod
(m::DLWGD)(row::DataFrameRow)

Instantiate a model based on a row from a trace data frame. This returns a modified copy of the input model.

source
Beluga.DropKernelType
DropKernel

Reversible jump kernel that introduces a WGD and decreases λ on the associated branch.

source
Beluga.IRRevJumpPriorType
IRRevJumpPrior

Bivariate uncorrelated rates prior with an Inverse Wishart prior on the unknown covariance 2×2 matrix. Crucially, this is defined for the Branch based model, i.e. states at model nodes are assumed to be states at branches of the phylogeny.

source
Beluga.ProfileType
Profile{T<:Real}

Struct for a phylogenetic profile of a single family. Geared towards MCMC applications (temporary storage fields) and parallel applications (using DArrays). See also PArray.

source
Beluga.RevJumpChainType
RevJumpChain

Reversible jump chain struct for DLWGD model inference.

!!! note After construction, an explicit call to init! is required.

source
Beluga.addwgd!Method
addwgd!(d::DLWGD, n::ModelNode, t, q)

Insert a WGD node with retention rate q at distance t above node n.

source
Beluga.addwgds!Method
addwgds!(m::DLWGD, p::PArray, config::Array)

Add WGDs from array of named tuples e.g. [(lca="ath,cpa", t=rand(), q=rand())] and update the profile array.

source
Beluga.addwgds!Method
addwgds!(m::DLWGD, p::PArray, config::String)
addwgds!(m::DLWGD, p::PArray, config::Dict{Int64,Tuple})

Add WGDs from a (stringified) dictionary (as in the wgds column of the trace data frame in rjMCMC applications) and update the profile array.

source
Beluga.asvectorMethod
asvector(d::DLWGD)

Get a parameter vector for the DLWGD model, structured as [λ1, …, λn, μ1, …, μn, q1, …, qk, η].

source
Beluga.bayesfactorsMethod
bayesfactors(trace::DataFrame, model::DLWGD, p::Float64)

Compute Bayes Factors for all branch WGD configurations. Returns a data frame that is more or less self-explanatory.

source
Beluga.getratesMethod
getrates(model::DLWGD{T})

Get the duplication and loss rate matrix (2 × n).

source
Beluga.getwgdtraceMethod
getwgdtrace(chain)

Summarize all WGD models from an rjMCMC trace. This provdes the data to evaluate retention rates for WGD models etc. Returns a dict of dicts with data frames (which is a horrible data structure, I know) structured as (branch1 => (1 WGD => trace, 2 WGDs => trace, ...), branch2 => (), ...).

source
Beluga.gradientMethod
gradient!(d::DLWGD, p::PArray{T})

Accumulate the gradient ∇ℓ(λ,μ,q,η|X) in parallel for the phylogenetic profile matrix p.

source
Beluga.gradientMethod
gradient(d::DLWGD, x::Vector)

Compute the gradient of the log likelihood under the DLWGD model for a single count vector x, ∇ℓ(λ,μ,q,η|x).

Warning

Currently the gradient seems to only work in NaN safe mode github issue

source
Beluga.posteriorE!Method
posteriorE!(chain)

Compute E[Xi|Xparent=1,λ,μ] for the joint posterior; i.e. the expected number of lineages at node i under the linear birth-death process given that there was one lineage at the parent of i, for each sample from the posterior. This can give an idea of gene family expansion/contraction.

source
Beluga.posteriorΣ!Method
posteriorΣ!(chain)

Sample the covariance matrix of the bivariate process post hoc from the posterior under the Inverse Wishart prior. Based on Lartillot & Poujol 2010.

source
Beluga.pppvaluesMethod
pppvalues(pps::PostPredSim)

Compute posterior predictive p-values based on the posterior predictive distribution and the observed sumary statistics (see e.g. Gelman et al. 2013).

source
Beluga.removewgd!Function
removewgd!(d::DLWGD, n::ModelNode, reindex::Bool=true, set::Bool=true)

Remove WGD/T node n from the DLWGD model. If reindex is true, the model nodes are reindexed to be consecutive. If set is true, the model internals (transition and extinction probabilities) are recomputed.

source
Beluga.setrates!Method
setrates!(model::DLWGD{T}, X::Matrix{T})

Set duplication and loss rates for each non-wgd node|branch in the model. Rates should be provided as a 2 × n matrix, where the columns correspond to model node indices.

source
Distributions.logpdf!Method
logpdf!(L::Matrix, d::DLWGD, x::Vector{Int64})

Compute the log likelihood under the DLWGD model for a single count vector x ℓ(λ,μ,q,η|x) and update the dynamic programming matrix (L).

source
Distributions.logpdf!Method
logpdf!(L::Matrix, n::ModelNode, x::Vector{Int64})

Compute the log likelihood under the DLWGD model for a single count vector x ℓ(λ,μ,q,η|x) and update the dynamic programming matrix (L), only recomputing the matrix above node n.

source
Distributions.logpdf!Method
logpdf!(d::DLWGD, p::PArray{T})
logpdf!(n::ModelNode, p::PArray{T})

Accumulate the log-likelihood ℓ(λ,μ,q,η|X) in parallel for the phylogenetic profile matrix. If the first argument is a ModelNode, this will recompute the dynamic programming matrices starting from that node to save computation. Assumes (of course) that the phylogenetic profiles are iid from the same DLWGD model.

source
Distributions.logpdfMethod
logpdf(d::DLWGD, x::Vector{Int64})

Compute the log likelihood under the DLWGD model for a single count vector x ℓ(λ,μ,q,η|x).

source
Beluga.AMMProposalsType
AMMProposals(d)

Adaptive Mixture Metropolis (AMM) proposals, where d is the dimensionality of the rates vector (should be 2 × number of nodes in tree).

source
Beluga.MWGProposalsType
MWGProposals

Proposals for the Metropolis-within-Gibbs algorithm. The MWG algorithm iterates over each node in the tree, resulting in very good mixing and fast convergence in terms of number of iterations, but has quite a high computational cost per generation.

source
Beluga.UpperBoundedGeometricType
UpperBoundedGeometric{T<:Real}

An upper bounded geometric distribution, basically a constructor for a DiscreteNonParametric distribution with the relevant probabilities.

source
Base.randMethod
rand(d::DLWGD, N::Int64 [; condition::Vector{Vector{Symbol}}])

Simulate N random phylogenetic profiles under the DLWGD model, subject to the constraint that there is at least one gene in each clade specified in the condition array. (by default conditioning is on non-extinction).

Examples:

```julia-repl julia> # include completely extinct families julia> rand(d, N, condition=[])

julia> # condition on at least one lineage in both clades stemming from the root julia> rand(d, N, condition=Beluga.rootclades(d))

source
Base.randMethod
rand(d::DLWGD)

Simulate a random phylogenetic profile from the DLWGD model.

source