bnn_alexnet – BNN Model Module
Build the TensorFlow model and loss functions
This module contains the functions needed to build the BNN model used in ovejero as well as the loss functions for the different posteriors.
See the script model_trainer.py for examples of how to use these functions.
-
class ovejero.bnn_alexnet.AlwaysDropout(dropout_rate, **kwargs)
Bases: keras.engine.base_layer.Layer
This class applies dropout to an input both during training and inference. This is consistent with the BNN methodology.
-
call(inputs, training=None)
The function that takes the inputs (likely outputs of a previous layer) and applies dropout.
Parameters: - inputs (tf.Keras.Layer) – The inputs to the dropout layer.
- training (bool) – A required input for call. Setting training to True or False has no effect because always dropout behaves the same way in both cases.
Returns: The inputs with dropout applied.
Return type: (tf.Keras.Layer)
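To make the "same behavior in training and inference" point concrete, here is a minimal plain-Python sketch. It is not the ovejero implementation: the function name `always_dropout_sketch`, the `rng` argument, and the inverted-dropout rescaling by the keep probability are assumptions made for illustration.

```python
import random

def always_dropout_sketch(inputs, dropout_rate, training=None, rng=random):
    """Apply dropout to a list of floats, ignoring the `training` flag."""
    keep_prob = 1.0 - dropout_rate
    # Each unit is kept with probability keep_prob and rescaled (assumed
    # inverted dropout) so the expected output matches the input.
    return [x / keep_prob if rng.random() < keep_prob else 0.0 for x in inputs]

rng = random.Random(0)
out_train = always_dropout_sketch([1.0] * 1000, 0.1, training=True, rng=rng)
out_infer = always_dropout_sketch([1.0] * 1000, 0.1, training=False, rng=rng)
```

Because units are still dropped when `training=False`, repeated forward passes on the same input give different outputs, which is what allows the network to be sampled.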
-
-
class ovejero.bnn_alexnet.ConcreteDropout(output_dim, activation=None, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=1e-06, dropout_regularizer=1e-05, init_min=0.1, init_max=0.1, temp=0.1, random_seed=None, **kwargs)
Bases: keras.engine.base_layer.Layer
This class defines a concrete dropout layer that is built around a Keras Dense layer. The dropout is parametrized by a weight that is optimized along with the model’s weights themselves. Heavily inspired by the code for arXiv:1705.07832.
-
build(input_shape=None)
Build the weights and operations that the network will use.
Parameters: input_shape ((int,..)) – The shape of the input to our Dense layer.
-
call(inputs, training=None)
The function that takes the inputs of the layer and conducts the Dense layer multiplication with concrete dropout.
Parameters: - inputs (tf.Keras.Layer) – The inputs to the Dense layer.
- training (bool) – A required input for call. Setting training to true or false does nothing because concrete dropout behaves the same way in both cases.
Returns: The output of the Dense layer.
Return type: (tf.Keras.Layer)
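The concrete relaxation replaces the hard 0/1 dropout mask with a smooth value in (0, 1), which is what lets the dropout probability be optimized by gradient descent. The sketch below follows the form used in the reference code for arXiv:1705.07832, written in plain Python. The function names, the `eps` stabilizer, and the `rng` argument are assumptions for illustration and may not match ovejero's internals.

```python
import math
import random

def concrete_keep_mask(p, temp, u, eps=1e-7):
    """Relaxed keep-mask value for dropout probability p and uniform noise u."""
    logit = (math.log(p + eps) - math.log(1.0 - p + eps)
             + math.log(u + eps) - math.log(1.0 - u + eps))
    # A small temperature pushes the sigmoid towards a hard 0/1 decision.
    drop_prob = 1.0 / (1.0 + math.exp(-logit / temp))
    return 1.0 - drop_prob

def concrete_dropout_sketch(inputs, p, temp=0.1, rng=random):
    # Rescale by the keep probability, as in standard inverted dropout.
    return [x * concrete_keep_mask(p, temp, rng.random()) / (1.0 - p)
            for x in inputs]

sample = concrete_dropout_sketch([1.0] * 5, p=0.1, rng=random.Random(0))
```

As `temp` approaches zero the mask approaches true dropout, which matches the description of the `temp` parameter of concrete_alexnet.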
-
-
class ovejero.bnn_alexnet.LensingLossFunctions(flip_pairs, num_params)
Bases: object
A class used to generate the loss functions for the three types of Bayesian neural network models we have implemented: diagonal covariance, full covariance, and mixture of full covariances. Currently only two Gaussians are allowed in the mixture.
-
construct_precision_matrix(L_mat_elements)
Take the matrix elements for the log Cholesky decomposition and convert them to the precision matrix. Also return the value of the diagonal elements before exponentiation, since we get that for free.
Parameters: L_mat_elements (tf.Tensor) – A tensor of length num_params*(num_params+1)/2 that defines the lower triangular matrix elements of the log Cholesky decomposition.
Returns: Both the precision matrix and the diagonal elements (before exponentiation) of the log Cholesky L matrix. Note that this second value is important for the posterior calculation.
Return type: ((tf.Tensor,tf.Tensor))
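The log Cholesky construction can be sketched in plain Python as follows. This is an illustration, not the ovejero code: the function name `precision_from_log_cholesky` is hypothetical, and the row-major ordering of the lower-triangular elements is an assumption that may differ from the ordering used by the library.

```python
import math

def precision_from_log_cholesky(l_elements, num_params):
    """Return (precision matrix, raw diagonal) from log Cholesky elements."""
    L = [[0.0] * num_params for _ in range(num_params)]
    raw_diag = []
    idx = 0
    for i in range(num_params):
        for j in range(i + 1):            # assumed row-major lower triangle
            value = l_elements[idx]
            idx += 1
            if i == j:
                raw_diag.append(value)    # stored before exponentiation
                value = math.exp(value)   # forces a positive diagonal
            L[i][j] = value
    # P = L L^T is symmetric positive definite by construction.
    prec = [[sum(L[i][k] * L[j][k] for k in range(num_params))
             for j in range(num_params)] for i in range(num_params)]
    return prec, raw_diag

prec, raw_diag = precision_from_log_cholesky([0.0, 0.5, 0.0], num_params=2)
```

Exponentiating the diagonal means any real-valued network output maps to a valid precision matrix, and the raw diagonal gives the log determinant directly: log det P = 2 * sum(raw_diag).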
-
diagonal_covariance_loss(y_true, output)
Return the loss function assuming a diagonal covariance matrix.
Parameters: - y_true (tf.Tensor) – The true values of the lensing parameters
- output (tf.Tensor) – The predicted values of the lensing parameters. This should include 2*self.num_params parameters to account for the diagonal entries of our covariance matrix. Covariance matrix values are assumed to be in log space.
Returns: The loss function (i.e. the tensorflow graph for it).
Return type: (tf.Tensor)
-
full_covariance_loss(y_true, output)
Return the loss function assuming a full covariance matrix.
Parameters: - y_true (tf.Tensor) – The true values of the lensing parameters
- output (tf.Tensor) – The predicted values of the lensing parameters. This should include self.num_params parameters for the prediction and self.num_params*(self.num_params+1)/2 parameters for the lower triangular log cholesky decomposition
Returns: The loss function (i.e. the tensorflow graph for it).
Return type: (tf.Tensor)
-
gm_full_covariance_loss(y_true, output)
Return the loss function assuming a mixture of two Gaussians, each with a full covariance matrix.
Parameters: - y_true (tf.Tensor) – The true values of the lensing parameters
- output (tf.Tensor) – The predicted values of the lensing parameters. This should include two Gaussian components, each consisting of self.num_params parameters for the prediction and self.num_params*(self.num_params+1)/2 parameters for the lower triangular log Cholesky decomposition. It should also include one final parameter for the ratio between the two components.
Returns: The loss function (i.e. the tensorflow graph for it).
Return type: (tf.Tensor)
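The output sizes described for the three loss functions follow directly from the parameter counts above. A small helper makes the arithmetic explicit; the function name `expected_output_size` and the `loss_type` labels are hypothetical and not part of ovejero.

```python
def expected_output_size(num_params, loss_type):
    """Number of network outputs each loss expects, per the descriptions above."""
    # Number of lower-triangular elements of a num_params x num_params matrix.
    n_tri = num_params * (num_params + 1) // 2
    if loss_type == 'diagonal':
        return 2 * num_params                    # means + log variances
    if loss_type == 'full':
        return num_params + n_tri                # means + log Cholesky elements
    if loss_type == 'gmm':
        return 2 * (num_params + n_tri) + 1      # two components + ratio
    raise ValueError('unknown loss_type: %s' % loss_type)

sizes = {kind: expected_output_size(8, kind)
         for kind in ('diagonal', 'full', 'gmm')}
```

For eight lensing parameters this gives 16, 44, and 89 outputs respectively, which is the value to pass as num_params-dependent output width when pairing a model with one of these losses.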
-
log_gauss_diag(y_true, y_pred, std_pred)
Return the negative log posterior of a Gaussian with diagonal covariance matrix.
Parameters: - y_true (tf.Tensor) – The true values of the parameters
- y_pred (tf.Tensor) – The predicted value of the parameters
- std_pred (tf.Tensor) – The predicted diagonal entries of the covariance. Note that std_pred is assumed to be the log of the covariance matrix values.
Returns: The TF graph for calculating the nlp
Return type: (tf.Tensor)
Notes
This loss does not include the constant factor of 1/(2*pi)^(d/2).
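For reference, the quantity described here can be written out in plain Python for a single example. This is a sketch of the standard formula under the stated convention that the third argument holds the log of the diagonal covariance entries; the function name is hypothetical and the batching used in the TensorFlow version is omitted.

```python
import math

def neg_log_gauss_diag(y_true, y_pred, log_var):
    """Negative log density of a diagonal Gaussian, without the 2*pi constant."""
    total = 0.0
    for yt, yp, lv in zip(y_true, y_pred, log_var):
        # Squared error weighted by the inverse variance, plus the log variance.
        total += (yt - yp) ** 2 * math.exp(-lv) + lv
    return 0.5 * total

nll_unit = neg_log_gauss_diag([1.0], [0.0], [0.0])
nll_wide = neg_log_gauss_diag([1.0], [0.0], [math.log(4.0)])
```

Predicting the log of the variance keeps the variance positive for any real-valued network output.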
-
log_gauss_full(y_true, y_pred, prec_mat, L_diag)
Return the negative log posterior of a Gaussian with full covariance matrix.
Parameters: - y_true (tf.Tensor) – The true values of the parameters
- y_pred (tf.Tensor) – The predicted value of the parameters
- prec_mat (tf.Tensor) – The precision matrix
- L_diag (tf.Tensor) – The diagonal (non exponentiated) values of the log cholesky decomposition of the precision matrix
Returns: The TF graph for calculating the nlp
Return type: (tf.Tensor)
Notes
This loss does not include the constant factor of 1/(2*pi)^(d/2).
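The role of L_diag becomes clear when the loss is written out. If the precision matrix is P = L L^T and the diagonal of L is exp(L_diag), then log det P = 2 * sum(L_diag), so the log-determinant term costs nothing extra. A plain-Python sketch for a single example, with a hypothetical function name and no batching:

```python
import math

def neg_log_gauss_full(y_true, y_pred, prec_mat, l_diag):
    """Negative log density of a full Gaussian, without the 2*pi constant."""
    diff = [yt - yp for yt, yp in zip(y_true, y_pred)]
    n = len(diff)
    # Quadratic form (y - mu)^T P (y - mu).
    quad = sum(diff[i] * prec_mat[i][j] * diff[j]
               for i in range(n) for j in range(n))
    # -0.5 * log det P equals -sum(l_diag) for the log Cholesky form.
    return 0.5 * quad - sum(l_diag)

nll_identity = neg_log_gauss_full([1.0, 0.0], [0.0, 0.0],
                                  [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
nll_scaled = neg_log_gauss_full([1.0, 0.0], [0.0, 0.0],
                                [[4.0, 0.0], [0.0, 1.0]], [math.log(2.0), 0.0])
```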
-
log_gauss_gm_full(y_true, y_preds, prec_mats, L_diags, pis)
Return the negative log posterior of a Gaussian mixture model (GMM) with a full covariance matrix for each Gaussian component. Note this code allows for any number of components.
Parameters: - y_true (tf.Tensor) – The true values of the parameters
- y_preds ([tf.Tensor,..]) – A list of the predicted value of the parameters
- prec_mats ([tf.Tensor,..]) – A list of the precision matrices
- L_diags ([tf.Tensor,..]) – A list of the diagonal (non exponentiated) values of the log cholesky decomposition of the precision matrices
- pis ([tf.Tensor,..]) – A list of the mixture weights of the components
Returns: The TF graph for calculating the nlp
Return type: (tf.Tensor)
Notes
This loss does not include the constant factors of 1/(2*pi)^(d/2).
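A mixture loss of this kind is usually evaluated with the log-sum-exp trick to avoid underflow when the per-component likelihoods are tiny. The sketch below assumes the per-component negative log densities have already been computed and that `pis` are normalized mixture weights. The function name is hypothetical and the ovejero implementation may organize the computation differently.

```python
import math

def neg_log_mixture(component_nlls, pis):
    """Negative log of sum_k pi_k * exp(-nll_k), computed stably."""
    log_terms = [math.log(pi) - nll for pi, nll in zip(pis, component_nlls)]
    # Subtract the largest term before exponentiating (log-sum-exp).
    largest = max(log_terms)
    return -(largest + math.log(sum(math.exp(t - largest) for t in log_terms)))

nll_single = neg_log_mixture([3.0], [1.0])
nll_equal = neg_log_mixture([1.0, 1.0], [0.5, 0.5])
nll_far = neg_log_mixture([2.0, 2000.0], [0.5, 0.5])
```

The last call shows why the stabilization matters: the second component underflows to zero, yet the result stays finite and close to the first component's contribution.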
-
mse_loss(y_true, output)
Returns the MSE loss of the predicted parameters. Will ignore parameters associated with the covariance matrix.
Parameters: - y_true (tf.Tensor) – The true values of the parameters
- output (tf.Tensor) – The predicted values of the lensing parameters. This assumes the first num_params entries are the predicted parameter values.
Returns: The mse loss function.
Return type: (tf.Tensor)
Notes
This function should never be used as a loss function. It is useful as a metric to understand what portion of the reduction in the loss function can be attributed to improved parameter accuracy. Also note that for the GMM models the output will default to the first Gaussian for this metric.
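As a metric, the computation amounts to slicing off the leading entries of the output and comparing them with the truth. The plain-Python sketch below works on a single example; the function name is hypothetical, and it assumes the first num_params entries of the output hold the predicted parameter values.

```python
def mse_metric_sketch(y_true, output, num_params):
    """Mean squared error over the first num_params entries of output."""
    # Entries after num_params (covariance terms, mixture ratio) are ignored.
    preds = output[:num_params]
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, preds)) / num_params

# Two predicted parameters followed by two covariance-related outputs.
mse_value = mse_metric_sketch([1.0, 2.0], [1.0, 4.0, 0.3, 0.7], num_params=2)
```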
-
-
class ovejero.bnn_alexnet.SpatialConcreteDropout(filters, kernel_size, strides=(1, 1), padding='valid', activation=None, kernel_regularizer=1e-06, dropout_regularizer=1e-05, init_min=0.1, init_max=0.1, temp=0.1, random_seed=None, **kwargs)
Bases: keras.layers.convolutional.Conv2D
This class defines a spatial concrete dropout layer that is built around a Keras Conv2D layer. The dropout is parametrized by a weight that is optimized along with the model’s weights themselves. Heavily inspired by the code for arXiv:1705.07832.
-
build(input_shape=None)
Build the weights and operations that the network will use.
Parameters: input_shape ((int,..)) – The shape of the input to our Conv2D layer.
-
call(inputs, training=None)
The function that takes the inputs of the layer and conducts the Conv2D layer convolution with concrete dropout.
Parameters: - inputs (tf.Keras.Layer) – The inputs to the Conv2D layer.
- training (bool) – A required input for call. Setting training to true or false does nothing because concrete dropout behaves the same way in both cases.
Returns: The output of the Conv2D layer.
Return type: (tf.Keras.Layer)
-
-
ovejero.bnn_alexnet.cd_regularizer(p, kernel, kernel_regularizer, dropout_regularizer, input_dim)
Calculate the regularization term for concrete dropout.
Parameters: - p (tf.Tensor) – A 1D Tensor containing the p value for dropout (between 0 and 1).
- kernel (tf.Tensor) – A 2D Tensor defining the weights of the Dense layer
- kernel_regularizer (float) – The relative strength of kernel regularization term.
- dropout_regularizer (float) – The relative strength of the dropout regularization term.
- input_dim (int) – The dimension of the input to the layer.
Returns: The tensorflow graph to calculate the regularization term.
Return type: (tf.Tensor)
Notes
This is currently not being used because of issues with the Keras framework. Once it updates this will be employed instead of dividing the loss into two parts.
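For orientation, the regularization term in the concrete dropout reference code (arXiv:1705.07832) has two parts: an L2 penalty on the weights scaled by 1/(1-p), and a term proportional to the negative entropy of the dropout probability. The plain-Python sketch below follows that form for a scalar p; whether ovejero's function matches it term by term is an assumption, and the function name is hypothetical.

```python
import math

def cd_regularizer_sketch(p, kernel, kernel_regularizer,
                          dropout_regularizer, input_dim):
    """Concrete dropout regularization term for a scalar dropout probability p."""
    # L2 penalty on the weights, scaled up as more units are dropped.
    sum_sq = sum(w * w for row in kernel for w in row)
    weight_term = kernel_regularizer * sum_sq / (1.0 - p)
    # Negative entropy of p, scaled by the input dimension.
    entropy_term = dropout_regularizer * input_dim * (
        p * math.log(p) + (1.0 - p) * math.log(1.0 - p))
    return weight_term + entropy_term

reg_value = cd_regularizer_sketch(0.5, [[1.0, 1.0], [1.0, 1.0]],
                                  kernel_regularizer=1.0,
                                  dropout_regularizer=1.0, input_dim=2)
```

The second term is minimized at p = 0.5, so a larger dropout_regularizer pulls a small initial p upwards, while the first term penalizes large p.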
-
ovejero.bnn_alexnet.concrete_alexnet(img_size, num_params, kernel_regularizer=1e-06, dropout_regularizer=1e-05, init_min=0.1, init_max=0.1, temp=0.1, random_seed=None)
Build the TensorFlow graph for the concrete dropout AlexNet BNN.
Parameters: - img_size ((int,int,int)) – A tuple with shape (pix,pix,freq) that describes the size of the input images
- num_params (int) – The number of lensing parameters to predict
- kernel_regularizer (float) – The strength of the l2 norm (associated to the strength of the prior on the weights)
- dropout_regularizer (float) – The stronger it is, the more concrete dropout will tend towards larger dropout rates.
- init_min (float) – The minimum value that the dropout weight p will be initialized to.
- init_max (float) – The maximum value that the dropout weight p will be initialized to.
- temp (float) – The temperature that defines how close the concrete distribution will be to true dropout.
- random_seed (int) – A seed to use in the random function calls. If None no explicit seed will be used.
Returns: The model (i.e. the tensorflow graph for the model)
Return type: (tf.Tensor)
Notes
While the concrete dropout implementation works, the training of the dropout terms is very slow. It’s possible that modifying the learning rate schedule may help.
-
ovejero.bnn_alexnet.dropout_alexnet(img_size, num_params, kernel_regularizer=1e-06, dropout_rate=0.1, random_seed=None)
Build the TensorFlow graph for the AlexNet BNN.
Parameters: - img_size ((int,int,int)) – A tuple with shape (pix,pix,freq) that describes the size of the input images
- num_params (int) – The number of lensing parameters to predict
- kernel_regularizer (float) – The strength of the l2 norm (associated to the strength of the prior on the weights)
- dropout_rate (float) – The dropout rate to use for the layers.
- random_seed (int) – A seed to use in the random function calls. If None no explicit seed will be used.
Returns: The model (i.e. the tensorflow graph for the model)
Return type: (tf.Tensor)