Factory Methods for NMF Models

Description

nmfModel is an S4 generic function that provides a convenient way to build NMF models. It implements a unified interface for creating NMF objects from any NMF model, and is designed to resolve potential dimension inconsistencies.

nmfModels lists all available NMF models currently defined that can be used to create NMF objects, i.e. -- more or less -- all S4 classes that inherit from class NMF-class.

Usage

nmfModel(rank, target = 0L, ...)

S4 (numeric,numeric)
`nmfModel`(rank, target, ncol = NULL, model = "NMFstd", W, H, ..., force.dim = TRUE, 
  order.basis = TRUE)

S4 (numeric,matrix)
`nmfModel`(rank, target, ..., use.names = TRUE)

S4 (formula,ANY)
`nmfModel`(rank, target, ..., data = NULL, no.attrib = FALSE)

nmfModels(builtin.only = FALSE)

Arguments

rank
specification of the target factorization rank (i.e. the number of components).
target
an object that specifies the dimension of the estimated target matrix.
...
extra arguments to allow extension, which are passed down to the workhorse method nmfModel,numeric,numeric, where they are used to initialise slots specific to the instantiating NMF model class.
ncol
a numeric value that specifies the number of columns of the target matrix fitted by the NMF model. It is used only if not missing and when argument target is a single numeric value.
model
the class of the object to be created. It must be a valid class name that inherits from class NMF. Default is the standard NMF model NMFstd-class.
W
value for the basis matrix. data.frame objects are converted into matrices with as.matrix.
H
value for the mixture coefficient matrix. data.frame objects are converted into matrices with as.matrix.
force.dim
logical that indicates whether the method should try lowering the rank or shrinking dimensions of the input matrices to make them compatible.
order.basis
logical that indicates whether the rows of the mixture coefficient matrix should be reordered to match the order of the basis components, based on their respective names. It is only used if the basis and coefficient matrices have common unique column and row names respectively.
use.names
a logical that indicates whether the dimension names of the target matrix should be set on the returned NMF model.
data
optional argument that specifies where to look for the variables used in the formula.
no.attrib
logical that indicates whether attributes containing data related to the formula should be left out of the returned model. If FALSE, attributes 'target' and 'formula' are attached and contain the target matrix and a list describing each formula part (response, regressors, etc.), respectively.
builtin.only
logical that indicates whether only built-in NMF models, i.e. defined within the NMF package, should be listed.
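
The effect of order.basis can be sketched as follows. This is a minimal illustration that assumes the NMF package is installed; the matrices w and h and their names are hypothetical, and the printed row names are left for the reader to inspect rather than stated here.

```r
library(NMF)

# Hypothetical factors: the basis names appear in a different order in W and H
w <- matrix(as.numeric(1:12), nrow = 4, dimnames = list(NULL, c("a", "b", "c")))
h <- matrix(as.numeric(1:6), nrow = 3, dimnames = list(c("c", "a", "b"), NULL))

# Default: order.basis = TRUE, rows of H may be reordered to follow colnames(w)
m_ordered <- nmfModel(W = w, H = h)
# order.basis = FALSE: rows of H are kept in the order they were given
m_as_is <- nmfModel(W = w, H = h, order.basis = FALSE)

# Compare the row names of the coefficient matrices of both models
rownames(coef(m_ordered))
rownames(coef(m_as_is))
```

Both calls build a model of the same dimensions; only the row order of the coefficient matrix is expected to differ.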

Value

nmfModel returns an object that inherits from class NMF-class.

nmfModels returns a list of the names of the available NMF models.

Details

All nmfModel methods return an object that inherits from class NMF and is suitable for seeding NMF algorithms via arguments rank or seed of the nmf method, in which case the factorisation rank is implicitly set by the number of basis components in the seeding model (see nmf).

For convenience, shortcut methods and internal conversions for working directly on data.frame objects are implemented. However, note that converting a data.frame into a matrix may take a non-negligible amount of time for large datasets. If you use this method or other NMF-related methods several times, consider converting your data.frame object into a matrix once, when the data is first loaded.
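
A minimal sketch of this advice, assuming the NMF package is installed. The data.frame objects w_df and h_df are hypothetical toy factors standing in for data read with a read.delim-like function.

```r
library(NMF)

# Hypothetical toy factors held as data.frames
w_df <- data.frame(b1 = c(1, 2, 3, 4), b2 = c(0.5, 0.1, 0.2, 0.9))
h_df <- data.frame(s1 = c(1, 0), s2 = c(0, 1), s3 = c(2, 2))

# Convert once, up front, and reuse the matrices afterwards
w <- as.matrix(w_df)
h <- as.matrix(h_df)

m_df  <- nmfModel(w_df, h_df)  # data.frames are converted internally on each call
m_mat <- nmfModel(w, h)        # matrices are used as they are

# Both models have the same factor dimensions
dims_df  <- c(dim(basis(m_df)), dim(coef(m_df)))
dims_mat <- c(dim(basis(m_mat)), dim(coef(m_mat)))
```

On data of this size the difference is negligible; the conversion cost only matters when the same large data.frame is passed repeatedly.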

Methods

  1. nmfModel signature(rank = "numeric", target = "numeric"): Main factory method for NMF models

    This method is the workhorse method that is eventually called by all other methods. See section Main factory method for more details.

  2. nmfModel signature(rank = "numeric", target = "missing"): Creates an empty NMF model of a given rank.

    This call is equivalent to nmfModel(rank, 0L, ...), which creates an empty NMF object with a basis and mixture coefficient matrix of dimension 0 x rank and rank x 0 respectively.

  3. nmfModel signature(rank = "missing", target = "ANY"): Creates an empty NMF model of null rank and a given dimension.

    This call is equivalent to nmfModel(0, target, ...).

  4. nmfModel signature(rank = "NULL", target = "ANY"): Creates an empty NMF model of null rank and a given dimension.

    This call is equivalent to nmfModel(0, target, ...), and is meant for internal usage only.

  5. nmfModel signature(rank = "missing", target = "missing"): Creates an empty NMF model, or a model from existing factors.

    This method is equivalent to nmfModel(0, 0, ..., force.dim=FALSE). This means that the dimensions of the NMF model will be taken from the optional basis and mixture coefficient arguments W and H. An error is thrown if their dimensions are not compatible.

    Hence, this method may be used to generate an NMF model from existing factor matrices, by providing the named arguments W and/or H:

    nmfModel(W=w) or nmfModel(H=h) or nmfModel(W=w, H=h)

    Note that the same result may be achieved through the more convenient interface provided by the method nmfModel,matrix,matrix (see its dedicated description).

    See the description of the appropriate method below.

  6. nmfModel signature(rank = "numeric", target = "matrix"): Creates an NMF model compatible with a target matrix.

    This call is equivalent to nmfModel(rank, dim(target), ...). That is, the returned NMF object fits a target matrix of the same dimension as target.

    Only the dimensions of target are used to construct the NMF object. The matrix slots are filled with NA values if these are not specified in arguments W and/or H. However, dimension names are set on the returned NMF model if present in target and argument use.names=TRUE.

  7. nmfModel signature(rank = "matrix", target = "matrix"): Creates an NMF model based on two existing factors.

    This method is equivalent to nmfModel(0, 0, W=rank, H=target, ..., force.dim=FALSE). This allows for a natural shortcut for wrapping existing compatible matrices into NMF models: nmfModel(w, h)

    Note that an error is thrown if their dimensions are not compatible.

  8. nmfModel signature(rank = "data.frame", target = "data.frame"): Same as nmfModel('matrix', 'matrix') but for data.frame objects, which are generally produced by read.delim-like functions.

    The input data.frame objects are converted into matrices with as.matrix.

  9. nmfModel signature(rank = "matrix", target = "ANY"): Creates an NMF model with arguments rank and target swapped.

    This call is equivalent to nmfModel(rank=target, target=rank, ...). It allows calling nmfModel with arguments rank and target swapped, and exists for convenience:

    • allows typing nmfModel(V) instead of nmfModel(target=V) to create a model compatible with a given matrix V (i.e. of dimension nrow(V), 0, ncol(V))
    • one can pass the arguments in any order (the one that comes to the user's mind first) and it still works as expected.

  10. nmfModel signature(rank = "formula", target = "ANY"): Builds a formula-based NMF model that can incorporate fixed basis or coefficient terms.
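
Some of the shortcut methods listed above can be compared side by side. The following is a minimal sketch that assumes the NMF package is installed; the target matrix V and its dimension names are made up for illustration.

```r
library(NMF)

set.seed(1)
V <- rmatrix(6, 4)  # hypothetical 6 x 4 target matrix
dimnames(V) <- list(paste0("f", 1:6), paste0("s", 1:4))

# (numeric, matrix): only the dimensions (and names) of V are used
m1 <- nmfModel(2, V)
# (matrix, ANY): same call with rank and target swapped
m2 <- nmfModel(V, 2)
# (missing, ANY): null rank, dimensions taken from V
m0 <- nmfModel(target = V)

dim(basis(m1))         # rows of V x rank
dim(coef(m1))          # rank x columns of V
all(is.na(basis(m1)))  # factors are not initialised unless W/H are given

# With use.names = TRUE (the default), names from V are set on the model
rownames(basis(m1))
colnames(coef(m1))
```

The swapped form is only a dispatch convenience: m1 and m2 describe models of the same dimensions.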

Main factory method

The main factory engine of NMF models is implemented by the method with signature numeric, numeric. Other factory methods provide convenient ways of creating NMF models from e.g. a given target matrix or known basis/coef matrices (see section Methods).

This method creates an object of class model, using the extra arguments in ... to initialise slots that are specific to the given model.

All NMF models implement get/set methods to access the matrix factors (see basis), which are called to initialise them from arguments W and H. These argument names come from the definition of all built-in models that derive from class NMFstd-class, which has two slots, W and H, to hold the two factors -- following the notations used in Lee et al. (1999).

If argument target is missing, the method creates a standard NMF model of dimension 0 x rank x 0. That is, the basis and mixture coefficient matrices, W and H, have dimension 0 x rank and rank x 0 respectively.

If target dimensions are also provided in argument target as a 2-length vector, then the method creates an NMF object suitable for fitting a target matrix of dimension target[1] x target[2]. That is, the basis and mixture coefficient matrices, W and H, have dimension target[1] x rank and rank x target[2] respectively. The target dimensions can also be specified using both arguments target and ncol to define the number of rows and the number of columns of the target matrix respectively. If no other argument is provided, these matrices are filled with NAs.
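
The dimension rules described above can be checked directly on the factor matrices. The following is a minimal sketch that assumes the NMF package is installed; the object names are illustrative only.

```r
library(NMF)

r <- 3

# target missing: empty model, W is 0 x r and H is r x 0
empty <- nmfModel(r)

# target given as a 2-length vector: W is 20 x r and H is r x 10
sized <- nmfModel(r, c(20, 10))

# same dimensions, with rows in target and columns in ncol
split_dims <- nmfModel(r, 20, ncol = 10)

dim(basis(empty)); dim(coef(empty))
dim(basis(sized)); dim(coef(sized))

# without W or H, the factors are filled with NAs
all(is.na(basis(sized))) && all(is.na(coef(sized)))
```

Here basis() and coef() are the accessors for the W and H factors respectively.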

If arguments W and/or H are provided, the method creates an NMF model where the basis and mixture coefficient matrices, W and H, are initialised using the values of W and/or H.

The dimensions given by target, W and H must be compatible. However, if force.dim=TRUE, the method will reduce the dimensions to achieve compatibility whenever possible.
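
As an illustration of this rule, the sketch below contrasts the default behaviour with a call that does not force dimensions. It assumes the NMF package is installed; w and h are hypothetical random factors, with h deliberately given one row too many.

```r
library(NMF)

set.seed(42)
n <- 20; r <- 3; p <- 10
w <- rmatrix(n, r)
h <- rmatrix(r + 1, p)  # one row too many for rank r

# force.dim = TRUE (the default): H is reduced to its first r rows, with a warning
m <- suppressWarnings(nmfModel(r, W = w, H = h))
dim(coef(m))

# the matrix,matrix shortcut uses force.dim = FALSE, so the mismatch is an error
res <- try(nmfModel(w, h), silent = TRUE)
inherits(res, "try-error")
```

suppressWarnings() is used here only to keep the output short; in practice the warning is the signal that input data was dropped.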

When W and H are both provided, the NMF object created is suitable to seed an NMF algorithm in a call to the nmf method. Note that in this case the factorisation rank is implicitly set by the number of basis components in the seed.

References

Lee DD and Seung HS (1999). "Learning the parts of objects by non-negative matrix factorization." _Nature_, *401*(6755), pp. 788-91. ISSN 0028-0836.

Examples


# data
n <- 20; r <- 3; p <- 10
V <- rmatrix(n, p) # some target matrix

# create an r-ranked NMF model with given target dimensions n x p, as a 2-length vector
nmfModel(r, c(n,p)) # directly
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 10
nmfModel(r, dim(V)) # or from an existing matrix <=> nmfModel(r, V)
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 10
# or alternatively passing each dimension separately
nmfModel(r, n, p)
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 10

# trying to create an NMF object based on incompatible matrices generates an error
w <- rmatrix(n, r)
h <- rmatrix(r+1, p)
try( new('NMFstd', W=w, H=h) )
try( nmfModel(w, h) )
try( nmfModel(r+1, W=w, H=h) )
# The factory method can force the model to match some target dimensions,
# but warnings are thrown
nmfModel(r, W=w, H=h)
## Warning: nmfModel - Objective rank [3] is lower than the number of rows in
## H [4]: only the first 3 rows of H will be used
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 10
nmfModel(r, n-1, W=w, H=h)
## Warning: nmfModel - Number of rows in target is lower than the number of
## rows in W [20]: only the first 19 rows of W will be used
## Warning: nmfModel - Objective rank [3] is lower than the number of rows in
## H [4]: only the first 3 rows of H will be used
## <Object of class:NMFstd>
## features: 19 
## basis/rank: 3 
## samples: 10
## Empty model of given rank
nmfModel(3)
## <Object of class:NMFstd>
## features: 0 
## basis/rank: 3 
## samples: 0
nmfModel(target=10) #square
## <Object of class:NMFstd>
## features: 10 
## basis/rank: 0 
## samples: 10
nmfModel(target=c(10, 5))
## <Object of class:NMFstd>
## features: 10 
## basis/rank: 0 
## samples: 5
# Build an empty NMF model
nmfModel()
## <Object of class:NMFstd>
## features: 0 
## basis/rank: 0 
## samples: 0

# create an NMF object based on one random matrix: the missing matrix is deduced
# Note this only works when using the factory method nmfModel
n <- 50; r <- 3;
w <- rmatrix(n, r)
nmfModel(W=w)
## <Object of class:NMFstd>
## features: 50 
## basis/rank: 3 
## samples: 0

# create a NMF object based on random (compatible) matrices
p <- 20
h <- rmatrix(r, p)
nmfModel(H=h)
## <Object of class:NMFstd>
## features: 0 
## basis/rank: 3 
## samples: 20

# specifies two compatible matrices
nmfModel(W=w, H=h)
## <Object of class:NMFstd>
## features: 50 
## basis/rank: 3 
## samples: 20
# error if not compatible
try( nmfModel(W=w, H=h[-1,]) )
# create an r-ranked NMF model compatible with a given target matrix
obj <- nmfModel(r, V)
all(is.na(basis(obj)))
## [1] TRUE
## From two existing factors

# allows a convenient call without argument names
w <- rmatrix(n, 3); h <- rmatrix(3, p)
nmfModel(w, h)
## <Object of class:NMFstd>
## features: 50 
## basis/rank: 3 
## samples: 20

# Specify the type of NMF model (e.g. 'NMFns' for non-smooth NMF)
mod <- nmfModel(w, h, model='NMFns')
mod
## <Object of class:NMFns>
## features: 50 
## basis/rank: 3 
## samples: 20 
## theta: 0.5

# One can use such an NMF model as a seed when fitting a target matrix with nmf()
V <- rmatrix(mod)
res <- nmf(V, mod)
nmf.equal(res, nmf(V, mod))
## [1] TRUE

# NB: when called only with such a seed, the rank and the NMF algorithm
# are selected based on the input NMF model.
# e.g. here rank was 3 and the algorithm "nsNMF" is used, because it is the default
# algorithm to fit "NMFns" models (See ?nmf).
## swapped arguments `rank` and `target`
V <- rmatrix(20, 10)
nmfModel(V) # equivalent to nmfModel(target=V)
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 0 
## samples: 10
nmfModel(V, 3) # equivalent to nmfModel(3, V)
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 10
# empty 3-rank model
nmfModel(~ 3)
## <Object of class:NMFstd>
## features: 0 
## basis/rank: 3 
## samples: 0

# 3-rank model that fits a given data matrix
x <- rmatrix(20,10)
nmfModel(x ~ 3)
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 3 
## samples: 10

# add fixed coefficient term defined by a factor
gr <- gl(2, 5)
nmfModel(x ~ 3 + gr)
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 5 
## samples: 10 
## fixed coef [2]:
##   gr = <1, 2>

# add fixed coefficient term defined by a numeric covariate
nmfModel(x ~ 3 + gr + b, data=list(b=runif(10)))
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 6 
## samples: 10 
## fixed coef [3]:
##   gr = <1, 2>
##   b = 0.316867901943624, 0.381737500894815, ..., 0.50038011232391

# 3-rank model that fits a given ExpressionSet (with fixed coef terms)
# NB: ExpressionSet() and pData() come from the Biobase package
e <- ExpressionSet(x)
pData(e) <- data.frame(a=runif(10))
nmfModel(e ~ 3 + gr + a) # `a` is looked up in the phenotypic data of e, i.e. pData(e)
## <Object of class:NMFstd>
## features: 20 
## basis/rank: 6 
## samples: 10 
## fixed coef [3]:
##   gr = <1, 2>
##   a = 0.051527991425246, 0.965822806349024, ..., 0.536786474753171
# show all the NMF models available (i.e. the classes that inherit from class NMF)
nmfModels()
## [1] "NMFstd"    "NMFOffset" "NMFns"
# show all the built-in NMF models available
nmfModels(builtin.only=TRUE)
## [1] "NMFstd"    "NMFOffset" "NMFns"