1. Introduction
2. Using Bayesian Optimization
3. Ensembling and Results
4. Code

1. Introduction

In this article we use the Bayesian Optimization (BO) package to determine hyperparameters for a 2D convolutional neural network classifier with Keras.

2. Using Bayesian Optimization

CORRECTION:

In the code below dict_params should be:

dict_params = {'num_cnn_blocks':int(num_cnn_blocks),
               'num_filters':16*int(num_filters),
               'kernel_size':int(kernel_size),
               'num_dense_nodes':64*int(num_dense_nodes),
               'dense_nodes_divisor':2*int(dense_nodes_divisor),
               'batch_size':32*int(batch_size),
               'drop_out':drop_out}

The integer multipliers should have been outside of the cast to integer. BO only supports real-valued intervals, so to obtain discrete integers we must cast the real values to integers and then multiply by an integer. For num_filters, BO draws real values from the range (1, 4.001) (see pbounds below); to convert these to the discrete values 16, 32, 48, 64, we use the revised coding above. The original coding does not cause errors, since Keras only requires that variables such as filters be integers: a value like 17 filters is legal, it is just not one of the intended values. For this reason we did not rerun the code. We apologize for this error.
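As a quick sanity check (this snippet is illustrative only, not part of the original run), we can print both discretizations for a few values drawn from the (1, 4.001) interval:

for x in [1.0, 1.9, 2.5, 3.7, 4.001]:
    print(x, 16*int(x), int(16*x))
# 16*int(x) can only produce 16, 32, 48, 64 (the intended values)
# int(16*x) can produce any integer from 16 to 64, e.g., 40 for x = 2.5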

    model_name = ml_algo_name + '_' + str(np.random.uniform(0,1,))[2:9]
    
    # variable parameters
    dict_params = {'num_cnn_blocks':int(num_cnn_blocks),
                   'num_filters':int(16*num_filters),
                   'kernel_size':int(kernel_size),
                   'num_dense_nodes':int(64*num_dense_nodes),
                   'dense_nodes_divisor':int(2*dense_nodes_divisor),
                   'batch_size':int(32*batch_size),
                   'drop_out':drop_out}
            
    # start of cnn coding   
    input_tensor = Input(shape=input_shape)
    
    # 1st cnn block
    x = BatchNormalization()(input_tensor)
    x = Activation('relu')(x)
    x = Conv2D(filters=dict_params['num_filters'],
               kernel_size=dict_params['kernel_size'],
               strides=1, padding='same')(x)
    #x = MaxPooling2D()(x)
    x = Dropout(dict_params['drop_out'])(x)
    
    # additional cnn blocks
    for iblock in range(dict_params['num_cnn_blocks'] - 1):
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = Conv2D(filters=dict_params['num_filters'],
                   kernel_size=dict_params['kernel_size'],
                   strides=1, padding='same')(x)
        x = MaxPooling2D()(x)
        x = Dropout(dict_params['drop_out'])(x)
                    
    # mlp
    x = Flatten()(x)
    x = Dense(dict_params['num_dense_nodes'], activation='relu')(x)
    x = Dropout(dict_params['drop_out'])(x)
    x = Dense(dict_params['num_dense_nodes']//dict_params['dense_nodes_divisor'], 
              activation='relu')(x)
    output_tensor = Dense(number_of_classes, activation='softmax')(x)
    
    # instantiate and compile model
    cnn_model = Model(inputs=input_tensor, outputs=output_tensor)
    opt = Adam(lr=0.00025)  # default = 0.001
    cnn_model.compile(loss='categorical_crossentropy',
                      optimizer=opt, metrics=['accuracy'])
            
    # callbacks for early stopping and for learning rate reducer
    callbacks_list = [EarlyStopping(monitor='val_loss', patience=early_stop_epochs),                     
                      ReduceLROnPlateau(monitor='val_loss', factor=0.1, 
                                        patience=learning_rate_epochs, 
                                        verbose=0, mode='auto', min_lr=1.0e-6),
                      ModelCheckpoint(filepath=results_dir + model_name + '.h5',
                                      monitor='val_loss', save_best_only=True)]
    
    # fit the model
    h = cnn_model.fit(x=xcalib, y=ycalib,
                      batch_size=dict_params['batch_size'],
                      epochs=maximum_epochs,
                      validation_split=0.25,
                      shuffle=True, verbose=0,
                      callbacks=callbacks_list)
            
    # record actual best epochs and valid loss here, added to bayes opt parameter df below
    list_early_stop_epochs.append(len(h.history['val_loss']) - early_stop_epochs)
    
    validation_loss = np.min(h.history['val_loss'])
    list_validation_loss.append(validation_loss)
    list_saved_model_name.append(model_name)
    
    # bayes opt is a maximization algorithm, to minimize validation_loss, return 1-this
    bayes_opt_score = 1.0 - validation_loss
    
    return bayes_opt_score    
    # end of cnn_function()

optimizer = BayesianOptimization(f=cnn_function,
            pbounds={'num_cnn_blocks':(2, 4.001),
                     'num_filters':(1, 4.001),  # *16
                     'kernel_size':(2, 4.001),
                     'num_dense_nodes':(1, 16.001),  # *64
                     'dense_nodes_divisor':(1, 4.001),  # *2
                     'batch_size':(1, 4.001),  # *32
                     'drop_out': (0.05, 0.5)},
            verbose=2)

optimizer.maximize(init_points=num_random_points, n_iter=num_iterations)

print('\nbest result:', optimizer.max)

We construct a simple CNN since we are using the MNIST handwritten digits data set, which does not require sophisticated architectures such as DenseNet or ResNet. dict_params contains all of the variables exposed to BO. Since BO only accepts continuous numbers, we obtain integers by casting. When we want integers evenly spaced by a step other than 1, we combine a multiplier (16 for the number of filters) with a continuous interval in pbounds ((1, 4.001) for num_filters), yielding [16, 32, 48, 64]. We store the results in dict_params so that they are easy to use in the Keras CNN coding.

This code fragment is a function, cnn_function(num_cnn_blocks, num_filters, kernel_size, num_dense_nodes, dense_nodes_divisor, batch_size, drop_out), nested inside another function (see the Code section at the end of this article). BO calls this function repeatedly, in effect running an implicit loop n_iter times (after the initial random points). Any information from a CNN run that we wish to save can be appended to a list in the enclosing scope, keeping it synchronized with the BO diagnostic output.
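For readers unfamiliar with this pattern, here is a minimal sketch (toy objective, hypothetical names) of a nested objective function appending per-iteration results to lists owned by the enclosing function:

from bayes_opt import BayesianOptimization

def run_search():
    trial_log = []                       # survives across all BO iterations
    def objective(x):
        score = -(x - 2.0)**2            # toy function with its maximum at x = 2
        trial_log.append(score)          # per-iteration bookkeeping
        return score
    opt = BayesianOptimization(f=objective, pbounds={'x': (0.0, 4.0)}, verbose=0)
    opt.maximize(init_points=3, n_iter=5)
    return trial_log, opt.max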

Note the use of model_name = ml_algo_name + '_' + str(np.random.uniform(0,1,))[2:9] within the implicit loop, along with ModelCheckpoint from Keras. ModelCheckpoint saves the best model encountered so far during the loop over epochs. By giving filepath a name that is unique to each BO iteration, and also saving that name in a list, we ensure that each validation loss can be matched to its model name. This is later used in ensembling by setting a threshold on validation loss.
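As an aside, the naming scheme simply takes characters 2 through 8 of the string form of a uniform random draw. A quick check (the seed is used only to make this illustration reproducible; the actual runs are unseeded):

np.random.seed(0)
name = 'cnn' + '_' + str(np.random.uniform(0, 1))[2:9]
print(name)  # cnn_5488135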

A reminder: Bayesian Optimization is a maximization algorithm, so to minimize the validation loss we record 1.0 - validation_loss.
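Any strictly decreasing transform of the loss would serve; we use 1.0 minus the loss, but the plain negative (an alternative we note here, not what the code above does) would work equally well:

bayes_opt_score = 1.0 - validation_loss  # as in the code above
# bayes_opt_score = -validation_loss     # also valid: maximizing either minimizes the loss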

See Hyperparameter Search With Bayesian Optimization for Scikit-learn Classification and Ensembling for an explanation of the other BO parameters.

To collect all results from BO, we have

list_dfs = []
counter = 0
for result in optimizer.res:
    df_temp = pd.DataFrame.from_dict(data=result['params'], orient='index',
                                     columns=['trial' + str(counter)]).T
    df_temp['bayes opt error'] = result['target']
    
    df_temp['epochs'] = list_early_stop_epochs[counter]
    df_temp['validation_loss'] = list_validation_loss[counter]
    df_temp['model_name'] = list_saved_model_name[counter]
    
    list_dfs.append(df_temp)
    
    counter = counter + 1
    
df_results = pd.concat(list_dfs, axis=0)
df_results.to_pickle(results_dir + 'df_bayes_opt_results_parameters.pkl')
df_results.to_csv(results_dir + 'df_bayes_opt_results_parameters.csv')

optimizer.res is a list of dictionaries. The code above puts each dictionary into a DataFrame and then stacks them vertically into a single DataFrame. optimizer.max yields the best result (maximum target); it can also be obtained from the single DataFrame.
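For example, using the column names created above, the best trial can be recovered from the single DataFrame like this:

best_row = df_results.loc[df_results['bayes opt error'].idxmax()]
print(best_row['model_name'], best_row['validation_loss'])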

3. Ensembling and Results

To obtain an ensemble, we apply a threshold on the validation loss, load the corresponding saved models, then process production data (unseen data) to obtain class probabilities, NOT the classes themselves. These probabilities are averaged across the accepted models, and the final predicted class is the argmax of the mean probabilities (soft voting).

accepted_models_num = 0
list_predicted_prob = []
num_models = df_params.shape[0]
for i in range(num_models):
    if df_params.loc[df_params.index[i],'validation_loss'] < threshold:
        
        model_name = df_params.loc[df_params.index[i],'model_name']
        model_final = load_model(models_directory + model_name + '.h5')
        
        # these are class probabilities
        list_predicted_prob.append(model_final.predict(xprod))
        
        accepted_models_num = accepted_models_num + 1

# compute mean probabilities
mean_probabilities = np.mean(list_predicted_prob, axis=0)

# compute predicted class
# argmax uses 1st occurrence in case of a tie
y_predicted_class = np.argmax(mean_probabilities, axis=1)
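As a toy illustration of the averaging (illustrative numbers, two models, three classes):

p1 = np.array([[0.6, 0.3, 0.1]])
p2 = np.array([[0.2, 0.5, 0.3]])
mean_probabilities = np.mean([p1, p2], axis=0)  # [[0.4, 0.4, 0.2]]
print(np.argmax(mean_probabilities, axis=1))    # [0]; the tie resolves to the first index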

CNN Hyperparameters

(raw continuous values sampled by BO; rows are sorted by validation loss)

trial | batch_size | dense_nodes_divisor | drop_out | kernel_size | num_cnn_blocks | num_dense_nodes | num_filters | bayes opt error | epochs | validation_loss | model_name
trial24 | 3.3389 | 4.0010 | 0.0500 | 4.0010 | 4.0010 | 4.8344 | 4.0010 | 0.9700 | 24 | 0.0300 | cnn_7466261
trial21 | 2.5018 | 1.0000 | 0.0500 | 4.0010 | 4.0010 | 11.0982 | 4.0010 | 0.9661 | 14 | 0.0339 | cnn_0522347
trial37 | 4.0010 | 1.0000 | 0.0500 | 4.0010 | 4.0010 | 4.4970 | 4.0010 | 0.9643 | 17 | 0.0357 | cnn_5051812
trial47 | 1.2830 | 2.0793 | 0.0500 | 2.1005 | 4.0010 | 7.0846 | 2.6164 | 0.9624 | 16 | 0.0376 | cnn_1950449
trial42 | 4.0010 | 1.9473 | 0.0500 | 4.0010 | 4.0010 | 5.0035 | 1.7908 | 0.9621 | 17 | 0.0379 | cnn_0833026
trial35 | 3.7624 | 3.9981 | 0.0500 | 3.1735 | 4.0010 | 1.8905 | 1.0849 | 0.9609 | 43 | 0.0391 | cnn_8772981
trial18 | 3.3421 | 4.0010 | 0.5000 | 4.0010 | 4.0010 | 15.3680 | 4.0010 | 0.9606 | 24 | 0.0394 | cnn_2551120
trial9 | 3.6915 | 2.3017 | 0.2560 | 3.1245 | 3.4552 | 2.4900 | 3.7501 | 0.9604 | 25 | 0.0396 | cnn_4124185
trial14 | 4.0010 | 4.0010 | 0.0500 | 2.0000 | 4.0010 | 16.0010 | 1.0000 | 0.9601 | 38 | 0.0399 | cnn_0094828
trial34 | 4.0010 | 3.6555 | 0.0500 | 2.0000 | 4.0010 | 14.8151 | 3.5020 | 0.9594 | 34 | 0.0406 | cnn_2174602
trial16 | 4.0010 | 1.0000 | 0.0500 | 2.0000 | 4.0010 | 16.0010 | 4.0010 | 0.9594 | 16 | 0.0406 | cnn_8600071
trial15 | 4.0010 | 1.0000 | 0.0500 | 4.0010 | 4.0010 | 1.0000 | 1.0000 | 0.9594 | 32 | 0.0406 | cnn_9779994
trial23 | 1.0000 | 1.0000 | 0.0500 | 4.0010 | 4.0010 | 5.1663 | 1.0000 | 0.9590 | 21 | 0.0410 | cnn_5389027
trial33 | 1.0000 | 4.0010 | 0.0500 | 2.4149 | 4.0010 | 8.8738 | 1.0000 | 0.9589 | 34 | 0.0411 | cnn_7401566
trial29 | 4.0010 | 3.2454 | 0.0500 | 3.9663 | 4.0010 | 13.4776 | 1.3420 | 0.9584 | 29 | 0.0416 | cnn_7782207
trial25 | 1.0000 | 4.0010 | 0.0500 | 2.0000 | 4.0010 | 16.0010 | 4.0010 | 0.9584 | 16 | 0.0416 | cnn_1088887
trial49 | 4.0010 | 1.3321 | 0.0500 | 3.7722 | 4.0010 | 13.0380 | 4.0010 | 0.9582 | 17 | 0.0418 | cnn_1277738
trial11 | 3.2915 | 2.2746 | 0.3246 | 3.1829 | 3.8379 | 8.6781 | 1.4269 | 0.9577 | 35 | 0.0423 | cnn_8345552
trial3 | 3.9304 | 2.9875 | 0.2520 | 2.5634 | 3.2125 | 2.5510 | 1.5212 | 0.9571 | 32 | 0.0429 | cnn_1375661
trial8 | 3.2834 | 3.8950 | 0.1270 | 2.9211 | 3.8746 | 8.5245 | 3.7471 | 0.9556 | 10 | 0.0444 | cnn_2874261
trial30 | 4.0010 | 1.0000 | 0.0500 | 2.0000 | 4.0010 | 5.4644 | 1.0000 | 0.9552 | 45 | 0.0448 | cnn_7897795
trial6 | 3.3851 | 1.5713 | 0.4678 | 3.0353 | 3.4220 | 13.1056 | 1.3557 | 0.9541 | 41 | 0.0459 | cnn_1827607
trial12 | 1.0000 | 1.0000 | 0.0500 | 2.0000 | 4.0010 | 1.0000 | 1.0654 | 0.9537 | 38 | 0.0463 | cnn_2804524
trial17 | 4.0010 | 4.0010 | 0.0534 | 2.0000 | 4.0010 | 1.0000 | 4.0010 | 0.9527 | 32 | 0.0473 | cnn_7216191
trial41 | 2.2563 | 3.4701 | 0.2850 | 2.0160 | 3.8976 | 4.3858 | 1.0014 | 0.9525 | 41 | 0.0475 | cnn_5766945
trial4 | 3.4131 | 3.8578 | 0.0629 | 2.8113 | 3.7743 | 7.3985 | 2.2179 | 0.9511 | 8 | 0.0489 | cnn_6586571
trial38 | 1.0324 | 3.7249 | 0.0763 | 3.8798 | 3.8725 | 10.5557 | 3.9936 | 0.9493 | 6 | 0.0507 | cnn_2038463
trial43 | 3.9163 | 3.9900 | 0.1196 | 2.0881 | 3.9042 | 3.3694 | 3.9648 | 0.9489 | 15 | 0.0511 | cnn_2407983
trial40 | 3.2168 | 1.4162 | 0.0536 | 2.2749 | 3.9170 | 12.1045 | 1.0488 | 0.9471 | 17 | 0.0529 | cnn_3200254
trial19 | 1.0000 | 2.0569 | 0.5000 | 2.0000 | 4.0010 | 5.4079 | 4.0010 | 0.9463 | 31 | 0.0537 | cnn_2272907
trial32 | 1.0338 | 3.2372 | 0.0557 | 3.1871 | 3.8629 | 2.6796 | 3.9573 | 0.9461 | 4 | 0.0539 | cnn_8375640
trial36 | 1.6279 | 1.2807 | 0.0650 | 4.0002 | 2.2150 | 3.1657 | 1.2277 | 0.9390 | 4 | 0.0610 | cnn_2177580
trial0 | 1.3710 | 3.5067 | 0.2024 | 3.3116 | 2.4735 | 7.7148 | 3.1912 | 0.9381 | 16 | 0.0619 | cnn_4534673
trial45 | 4.0010 | 4.0010 | 0.0500 | 4.0010 | 2.0000 | 16.0010 | 1.0000 | 0.9356 | 16 | 0.0644 | cnn_6611043
trial26 | 4.0010 | 1.0000 | 0.5000 | 4.0010 | 4.0010 | 15.9693 | 1.0000 | 0.9303 | 38 | 0.0697 | cnn_7027454
trial20 | 1.0000 | 1.9328 | 0.0500 | 4.0010 | 2.0000 | 1.4661 | 4.0010 | 0.9301 | 2 | 0.0699 | cnn_4494844
trial1 | 2.5818 | 1.9285 | 0.3844 | 3.0367 | 2.5969 | 8.3888 | 2.8445 | 0.9300 | 16 | 0.0700 | cnn_4664828
trial5 | 2.1604 | 1.6733 | 0.0673 | 3.9997 | 2.8536 | 14.7074 | 3.2052 | 0.9299 | 5 | 0.0701 | cnn_4679929
trial48 | 2.9484 | 1.0000 | 0.0500 | 4.0010 | 2.0000 | 10.9270 | 1.0000 | 0.9296 | 5 | 0.0704 | cnn_4994236
trial2 | 1.5486 | 1.5576 | 0.4545 | 2.0458 | 2.8034 | 13.4613 | 3.8463 | 0.9229 | 24 | 0.0771 | cnn_4649820
trial46 | 4.0010 | 1.0000 | 0.0500 | 2.0000 | 2.0000 | 16.0010 | 1.0000 | 0.9224 | 14 | 0.0776 | cnn_0834311
trial7 | 1.1423 | 1.3225 | 0.3457 | 2.4675 | 2.9744 | 12.7185 | 2.8241 | 0.9211 | 25 | 0.0789 | cnn_3976373
trial27 | 4.0010 | 4.0010 | 0.0500 | 2.0000 | 2.0000 | 16.0010 | 4.0010 | 0.9208 | 7 | 0.0792 | cnn_5418930
trial31 | 4.0010 | 4.0010 | 0.0500 | 4.0010 | 2.0000 | 1.0000 | 4.0010 | 0.9193 | 9 | 0.0807 | cnn_2041164
trial13 | 1.0000 | 4.0010 | 0.5000 | 4.0010 | 4.0010 | 1.0000 | 1.0305 | 0.9129 | 39 | 0.0871 | cnn_2932650
trial28 | 1.5008 | 4.0010 | 0.0500 | 2.0000 | 2.0000 | 1.0000 | 1.0000 | 0.9104 | 17 | 0.0896 | cnn_3575582
trial22 | 3.9836 | 1.2313 | 0.4435 | 2.1642 | 3.9619 | 1.1961 | 2.5067 | 0.9061 | 22 | 0.0939 | cnn_6597575
trial10 | 3.7392 | 2.7750 | 0.4556 | 2.4944 | 2.0160 | 6.6550 | 3.1538 | 0.8937 | 8 | 0.1063 | cnn_0709821
trial39 | 1.0000 | 4.0010 | 0.5000 | 2.0000 | 4.0010 | 16.0010 | 1.0000 | 0.8935 | 57 | 0.1065 | cnn_1597172
trial44 | 4.0010 | 4.0010 | 0.5000 | 2.6860 | 4.0010 | 11.1510 | 1.0000 | 0.8296 | 56 | 0.1704 | cnn_4295834

balanced accuracy score = 0.9939
accuracy score = 0.994
number of accepted models = 26 for threshold = 0.05


4. Code

import os
import numpy as np
import pandas as pd
from bayes_opt import BayesianOptimization
from keras.layers import Dense, Conv2D, BatchNormalization
from keras.layers import MaxPooling2D
from keras.layers import Input, Flatten, Dropout
from keras.layers import Activation
from keras.optimizers import Adam
from keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint
from keras.models import Model, load_model
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score, accuracy_score
from sklearn.metrics import confusion_matrix, classification_report
import time
from pathlib import Path
import sys
sys.path.append(YOUR DIRECTORY)
import confusion_matrix_plot
    
# *******************************************************************

# raw shapes: x:(num pts, 784), y:(num pts,)

def get_data_mnist(data_type, data_dir):
    
    data_dictionary = {}
    
    if data_type == 'calibration':
        # x data - reshape = (60000, 28, 28, 1) - channels last
        x_calib_raw = np.load(data_dir + 'x_mnist_calibration.npy', allow_pickle=True)
        x_calib_raw = x_calib_raw.astype('float32')/255.0  # normalize
        x_calib = np.reshape(x_calib_raw, (x_calib_raw.shape[0], 28, 28, 1))
        
        # y data - shape = (60000, ) -> (60000, 10)
        y_calib = to_categorical(np.load(data_dir + 'y_mnist_calibration.npy', 
                                         allow_pickle=True))
        
        # split
        x_calib1, x_calib2, y_calib1, y_calib2 = train_test_split(
            x_calib, y_calib, train_size=0.5, shuffle=True, stratify=y_calib)
        
        data_dictionary['x_calib1'] = x_calib1
        data_dictionary['y_calib1'] = y_calib1
        
        data_dictionary['x_calib2'] = x_calib2
        data_dictionary['y_calib2'] = y_calib2
        
    elif data_type == 'production':
        # x data - reshape = (10000, 28, 28, 1) - channels last
        x_prod_raw = np.load(data_dir + 'x_mnist_production.npy', allow_pickle=True)
        x_prod_raw = x_prod_raw.astype('float32')/255.0  # normalize
        x_prod = np.reshape(x_prod_raw, (x_prod_raw.shape[0], 28, 28, 1))
        
        # y data - shape = (10000, )
        y_prod = np.load(data_dir + 'y_mnist_production.npy', allow_pickle=True)
        
        data_dictionary['x_prod'] = x_prod
        data_dictionary['y_prod'] = y_prod
        
    else:
        raise NameError
   
    return data_dictionary
    
# end of get_data_mnist()
    
# *******************************************************************
        
def create_save_models_bayes_opt(xcalib, ycalib,
                                 num_random_points, num_iterations, 
                                 results_dir, ml_algo_name):
    
    input_shape = xcalib.shape[1:] # (28, 28, 1) - channels last
    number_of_classes = ycalib.shape[1]
    
    # parameters for cnn that do NOT have to be saved
    maximum_epochs = 1000
    early_stop_epochs = 10
    learning_rate_epochs = 5
       
    # parameters that change for each iteration that must be saved
    list_early_stop_epochs = []
    list_validation_loss = []
    list_saved_model_name = []
    
    start_time_total = time.time()
    
    def cnn_function(num_cnn_blocks, num_filters, kernel_size,
                     num_dense_nodes, dense_nodes_divisor, batch_size, drop_out):
        
        model_name = ml_algo_name + '_' + str(np.random.uniform(0,1,))[2:9]
        
        # variable parameters
        # corrected coding (see the CORRECTION in Section 2): the integer
        # multipliers belong outside of the cast to integer
        dict_params = {'num_cnn_blocks':int(num_cnn_blocks),
                       'num_filters':16*int(num_filters),
                       'kernel_size':int(kernel_size),
                       'num_dense_nodes':64*int(num_dense_nodes),
                       'dense_nodes_divisor':2*int(dense_nodes_divisor),
                       'batch_size':32*int(batch_size),
                       'drop_out':drop_out}
                
        # start of cnn coding   
        input_tensor = Input(shape=input_shape)
        
        # 1st cnn block
        x = BatchNormalization()(input_tensor)
        x = Activation('relu')(x)
        x = Conv2D(filters=dict_params['num_filters'],
                   kernel_size=dict_params['kernel_size'],
                   strides=1, padding='same')(x)
        #x = MaxPooling2D()(x)
        x = Dropout(dict_params['drop_out'])(x)
        
        # additional cnn blocks
        for iblock in range(dict_params['num_cnn_blocks'] - 1):
            x = BatchNormalization()(x)
            x = Activation('relu')(x)
            x = Conv2D(filters=dict_params['num_filters'],
                       kernel_size=dict_params['kernel_size'],
                       strides=1, padding='same')(x)
            x = MaxPooling2D()(x)
            x = Dropout(dict_params['drop_out'])(x)
                        
        # mlp
        x = Flatten()(x)
        x = Dense(dict_params['num_dense_nodes'], activation='relu')(x)
        x = Dropout(dict_params['drop_out'])(x)
        x = Dense(dict_params['num_dense_nodes']//dict_params['dense_nodes_divisor'], 
                  activation='relu')(x)
        output_tensor = Dense(number_of_classes, activation='softmax')(x)
        
        # instantiate and compile model
        cnn_model = Model(inputs=input_tensor, outputs=output_tensor)
        opt = Adam(lr=0.00025)  # default = 0.001
        cnn_model.compile(loss='categorical_crossentropy',
                          optimizer=opt, metrics=['accuracy'])
        
        
        # callbacks for early stopping and for learning rate reducer
        callbacks_list = [EarlyStopping(monitor='val_loss', patience=early_stop_epochs),                     
                          ReduceLROnPlateau(monitor='val_loss', factor=0.1, 
                                            patience=learning_rate_epochs, 
                                            verbose=0, mode='auto', min_lr=1.0e-6),
                          ModelCheckpoint(filepath=results_dir + model_name + '.h5',
                                          monitor='val_loss', save_best_only=True)]
            
        # fit the model
        h = cnn_model.fit(x=xcalib, y=ycalib,
                          batch_size=dict_params['batch_size'],
                          epochs=maximum_epochs,
                          validation_split=0.25,
                          shuffle=True, verbose=0,
                          callbacks=callbacks_list)
                
        # record actual best epochs and valid loss here, added to bayes opt parameter df below
        list_early_stop_epochs.append(len(h.history['val_loss']) - early_stop_epochs)
        
        validation_loss = np.min(h.history['val_loss'])
        list_validation_loss.append(validation_loss)
        list_saved_model_name.append(model_name)
        
        # bayes opt is a maximization algorithm, to minimize validation_loss, return 1-this
        bayes_opt_score = 1.0 - validation_loss
        
        return bayes_opt_score
    
        # end of cnn_function()
    
    optimizer = BayesianOptimization(f=cnn_function,
                pbounds={'num_cnn_blocks':(2, 4.001),
                         'num_filters':(1, 4.001),  # *16
                         'kernel_size':(2, 4.001),
                         'num_dense_nodes':(1, 16.001),  # *64
                         'dense_nodes_divisor':(1, 4.001),  # *2
                         'batch_size':(1, 4.001),  # *32
                         'drop_out': (0.05, 0.5)},
                verbose=2)
    
    optimizer.maximize(init_points=num_random_points, n_iter=num_iterations)
    
    print('\nbest result:', optimizer.max)
    
    elapsed_time_total = (time.time()-start_time_total)/60
    print('\n\ntotal elapsed time =',elapsed_time_total,' minutes')
    
    # optimizer.res is a list of dicts
    list_dfs = []
    counter = 0
    for result in optimizer.res:
        df_temp = pd.DataFrame.from_dict(data=result['params'], orient='index',
                                         columns=['trial' + str(counter)]).T
        df_temp['bayes opt error'] = result['target']
        
        df_temp['epochs'] = list_early_stop_epochs[counter]
        df_temp['validation_loss'] = list_validation_loss[counter]
        df_temp['model_name'] = list_saved_model_name[counter]
        
        list_dfs.append(df_temp)
        
        counter = counter + 1
        
    df_results = pd.concat(list_dfs, axis=0)
    df_results.to_pickle(results_dir + 'df_bayes_opt_results_parameters.pkl')
    df_results.to_csv(results_dir + 'df_bayes_opt_results_parameters.csv')
        
# end of create_save_models_bayes_opt()
    
# *******************************************************************
            
def make_final_predictions(xprod, yprod, 
                           list_class_names, models_directory, 
                           save_directory, df_params,
                           threshold, ml_name):
    
    # apply threshold
    accepted_models_num = 0
    list_predicted_prob = []
    num_models = df_params.shape[0]
    for i in range(num_models):
        if df_params.loc[df_params.index[i],'validation_loss'] < threshold:
            
            model_name = df_params.loc[df_params.index[i],'model_name']
            model_final = load_model(models_directory + model_name + '.h5')
            
            # these are class probabilities
            list_predicted_prob.append(model_final.predict(xprod))
            
            accepted_models_num = accepted_models_num + 1

    # compute mean probabilities
    mean_probabilities = np.mean(list_predicted_prob, axis=0)
    
    # compute predicted class
    # argmax uses 1st occurrence in case of a tie
    y_predicted_class = np.argmax(mean_probabilities, axis=1)
    
    # compute and save error measures

    # print info to file
    stdout_default = sys.stdout
    sys.stdout = open(save_directory + ml_name + '_prediction_results.txt','w')
    
    print('balanced accuracy score =',balanced_accuracy_score(yprod, y_predicted_class))
    
    print('accuracy score =',accuracy_score(yprod, y_predicted_class))
    
    print('number of accepted models =',accepted_models_num,' for threshold =',threshold)
    
    print('\nclassification report:')
    print(classification_report(yprod, y_predicted_class, digits=3, output_dict=False))
    
    print('\nraw confusion matrix:')
    cm_raw = confusion_matrix(yprod, y_predicted_class)
    print(cm_raw)
    
    print('\nconfusion matrix normalized by prediction:')
    cm_pred = confusion_matrix(yprod, y_predicted_class, normalize='pred')
    print(cm_pred)
    
    print('\nconfusion matrix normalized by true:')
    cm_true = confusion_matrix(yprod, y_predicted_class, normalize='true')
    print(cm_true)
    
    sys.stdout = stdout_default
 
    # plot and save confusion matrices
    figure_size = (12, 8)
    number_of_decimals = 4
    
    confusion_matrix_plot.confusion_matrix_save_and_plot(cm_raw, 
    list_class_names, save_directory, 'Confusion Matrix', 
    ml_name + '_confusion_matrix', False, None, 30, figure_size,
    number_of_decimals)
    
    confusion_matrix_plot.confusion_matrix_save_and_plot(cm_pred, 
    list_class_names, save_directory, 'Confusion Matrix Normalized by Prediction', 
    ml_name + '_confusion_matrix_norm_by_prediction', False, 'pred', 
    30, figure_size, number_of_decimals)
    
    confusion_matrix_plot.confusion_matrix_save_and_plot(cm_true, 
    list_class_names, save_directory, 'Confusion Matrix Normalized by Actual', 
    ml_name + '_confusion_matrix_norm_by_true', False, 'true', 
    30, figure_size, number_of_decimals)

# end of make_final_predictions()
       
# *******************************************************************

if __name__ == '__main__':
    
    ml_algorithm_name = 'cnn'
    dataset_name = 'mnist'
    file_name_stub = ml_algorithm_name + '_' + dataset_name + '_bayes_opt'  
    
    calculation_type = 'production' #'calibration' 'production'
        
    base_directory = YOUR DIRECTORY
    data_directory = base_directory + 'data/'
    
    results_directory_stub = base_directory + file_name_stub + '/'
    if not Path(results_directory_stub).is_dir():
        os.mkdir(results_directory_stub)
                
    # fixed parameters
    threshold_validation_loss = 0.05
    total_number_of_iterations = 50
    number_of_random_points = 12 # random searches to start opt process
    # this is # of bayes iters, thus total=this + # of random pts
    number_of_iterations =  total_number_of_iterations - number_of_random_points
           
    # read data
    dict_data = get_data_mnist(calculation_type, data_directory)
                
    print('\n*** starting at',pd.Timestamp.now())

    # calibration
    if calculation_type == 'calibration':
        
        # using 1/2 of calib data
        x_calib = dict_data['x_calib1']
        y_calib = dict_data['y_calib1']
        
        results_directory = results_directory_stub + calculation_type + '/'
        if not Path(results_directory).is_dir():
            os.mkdir(results_directory)
        
        create_save_models_bayes_opt(x_calib, y_calib,
                                     number_of_random_points, number_of_iterations,
                                     results_directory, ml_algorithm_name)                   
        
    # production
    elif calculation_type == 'production':
        
        # get saved parameters
        models_dir = results_directory_stub + 'calibration/'
        df_parameters = pd.read_pickle(models_dir + 'df_bayes_opt_results_parameters.pkl')
        
        results_directory = results_directory_stub + calculation_type + '/'
        if not Path(results_directory).is_dir():
            os.mkdir(results_directory)
            
        x_prod = dict_data['x_prod']
        y_prod = dict_data['y_prod']
        
        num_classes = np.unique(y_prod).shape[0]
        class_names_list = []
        for i in range(num_classes):
            class_names_list.append('class ' + str(i))
                
        make_final_predictions(x_prod, y_prod, class_names_list, 
                               models_dir, results_directory, 
                               df_parameters, 
                               threshold_validation_loss, ml_algorithm_name)
               
    else:
        print('\ninvalid calculation type:',calculation_type)
        raise NameError