privpack.core.architectures module

The classes defined in this module lay the groundwork for basic data pre- and post-processing. For example, to generate privacy-preserving binary data, the predefined privatizer network maps a two-dimensional binary input to a one-hot encoded (4-dimensional) input. No training-related post-processing is applied; direct predictions, however, are post-processed to their most likely state.
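
The binary pre- and post-processing can be illustrated in plain Python. This is a minimal sketch and not the library's implementation: the function names `one_hot_pair` and `most_likely_state`, the slot ordering `2 * w + x`, and the 0.5 threshold are all assumptions made for illustration.

```python
def one_hot_pair(w, x):
    """Map a pair of binary values to a 4-dimensional one-hot vector.

    Assumed ordering: slot index = 2 * w + x, so (0, 0) fills slot 0
    and (1, 1) fills slot 3.
    """
    if w not in (0, 1) or x not in (0, 1):
        raise ValueError("both inputs must be 0 or 1")
    encoded = [0, 0, 0, 0]
    encoded[2 * w + x] = 1
    return encoded


def most_likely_state(probability, threshold=0.5):
    """Round a predicted probability of releasing 1 to a binary state.

    The threshold of 0.5 is an assumption, not a documented value.
    """
    return 1 if probability >= threshold else 0
```

With this ordering, each of the four possible input pairs occupies its own slot, which is why two binary inputs need a 4-dimensional encoding.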

For the Gaussian case, only post-processing is handled. Because the adversary is defined to learn the parameters of a multivariate Gaussian, functions are implemented to convert the adversary network's output into the parameters mu and sigma. Further functions compute the likelihood that the adversary guesses the private variable given the released output.
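
As a rough illustration of this post-processing, the sketch below splits a flat output vector into mu and sigma and evaluates a Gaussian log-likelihood. It rests on several assumptions that the documentation does not confirm: the output layout (means first, then raw scale values), the use of `exp` to keep sigma positive, and a diagonal covariance. The names `split_output` and `gaussian_log_likelihood` are hypothetical.

```python
import math


def split_output(output, dim):
    """Split a flat output vector into mean and standard deviation.

    Assumed layout: the first `dim` entries are the means, the next
    `dim` entries are log standard deviations (exp keeps sigma > 0).
    """
    mu = list(output[:dim])
    sigma = [math.exp(raw) for raw in output[dim:2 * dim]]
    return mu, sigma


def gaussian_log_likelihood(value, mu, sigma):
    """Log-density of `value` under a Gaussian with diagonal covariance."""
    total = 0.0
    for v, m, s in zip(value, mu, sigma):
        total += -0.5 * math.log(2 * math.pi * s * s)
        total -= (v - m) ** 2 / (2 * s * s)
    return total
```

A higher log-likelihood for the true private value means the adversary's estimate is closer to it, which is the quantity a privacy criterion would try to keep low.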

An example of instantiating the predefined classes:

import torch
from privpack.core.criterion import PGANCriterion

# Instantiating the binary release mechanism
from privpack import BinaryPrivacyPreservingAdversarialNetwork as BinaryGAN
binary_gan = BinaryGAN(torch.device('cpu'), PGANCriterion())

# Instantiating the Gaussian release mechanism
# (privacy_size, public_size and release_size must be defined beforehand)
from privpack import GaussianPrivacyPreservingAdversarialNetwork as GaussGAN
gauss_gan = GaussGAN(torch.device('cpu'), privacy_size, public_size, release_size,
                     PGANCriterion(), no_hidden_units_per_layer=5, noise_size=1)

API Documentation

This module defines generative adversarial networks for releasing binary and Gaussian data. It provides the following classes:

  • GenerativeAdversarialNetwork

  • BinaryPrivacyPreservingAdversarialNetwork

  • GaussianPrivacyPreservingAdversarialNetwork

class privpack.core.architectures.BinaryPrivacyPreservingAdversarialNetwork(device, binary_gan_criterion: privpack.core.criterion.PGANCriterion, lr=0.01)

Bases: privpack.core.architectures.GenerativeAdversarialNetwork

A Binary implementation of the Generative Adversarial Network defined using the PyTorch library.

This class implements the generative adversarial base framework to produce a single privacy-preserving binary output from two binary inputs, using the predefined privatizer network. The adversary network estimates the probability of the original private value. How the networks learn is determined by the user-provided criteria, which are expected to encode an optimal privacy-utility trade-off.

load()

Load the parameters of the privatizer and adversary network.

privatize(data)

Privatize the provided data using the privatizer in the network.

Parameters:

  • data: data to be privatized by the network.

Returns the privatized version of the provided data.

reset() → None

Reset both networks, discarding all learned parameters.

save()

Save the networks' learned parameters.

train(train_data, test_data, epochs, batch_size=1, privatizer_train_every_n=5, adversary_train_every_n=1, verbose=False)

Train the generative adversarial network using the implemented privatizer and adversary networks. Both networks are trained on the supplied train_data. The privatizer network is updated only every nth batch iteration, where n is set by privatizer_train_every_n; likewise, the adversary network is updated only every nth batch iteration, where n is set by adversary_train_every_n. Both values should be divisible by 5 because of the current logging system.

Parameters:

  • train_data: the training data used for training the generative adversarial network.

  • test_data: the testing data used for printing validation results on the generative adversarial network.

  • batch_size: The batch size used when training with the supplied train_data.

  • lr: Learning Rate indicating the step-size of adjusting the network parameters.

  • privatizer_train_every_n: Parameter defining when to update the privatizer network; Default=5.

  • adversary_train_every_n: Parameter defining when to update the adversary network; Default=1.

  • data_sampler: Function used for generating samples by the privatizer network.

  • k: The number of samples which should be generated by the supplied data_sampler function.
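
The alternating update scheme can be sketched as a small helper that reports which network would be updated on each batch iteration. This is an illustrative sketch only: the function name `update_schedule` is hypothetical, and it assumes that iterations are counted from zero and that an update happens when the iteration index is a multiple of n. The library's actual counting may differ.

```python
def update_schedule(num_batches, privatizer_train_every_n=5,
                    adversary_train_every_n=1):
    """Return (iteration, update_privatizer, update_adversary) tuples.

    Assumption: a network is updated when the zero-based iteration
    index is a multiple of its `*_train_every_n` value.
    """
    schedule = []
    for i in range(num_batches):
        update_privatizer = i % privatizer_train_every_n == 0
        update_adversary = i % adversary_train_every_n == 0
        schedule.append((i, update_privatizer, update_adversary))
    return schedule
```

With the defaults shown in the signature above, the adversary is updated on every batch while the privatizer is updated on every fifth one, so the adversary gets several steps to adapt between privatizer updates.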

class privpack.core.architectures.GaussianPrivacyPreservingAdversarialNetwork(device, privacy_size, public_size, release_size, gauss_gan_criterion, observation_model='full', lr=0.001, noise_size=5, no_hidden_units_per_layer=20)

Bases: privpack.core.architectures.GenerativeAdversarialNetwork

A Gaussian implementation of the Generative Adversarial Network defined using the PyTorch library.

This class implements the generative adversarial base framework to produce privatized Gaussian outputs from inputs that are assumed to be Gaussian, using the predefined privatizer network. The adversary network estimates the probability of the original private value. How the networks learn is determined by the user-provided criteria, which are expected to encode an optimal privacy-utility trade-off.

observation_models()

privatize(data)

Privatize the provided data using the privatizer in the network.

Parameters:

  • data: data to be privatized by the network.

Returns the privatized version of the provided data.

reset()

Reset the parameters of both networks.

train(train_data, test_data, epochs, k=1, batch_size=2, privatizer_train_every_n=5, adversary_train_every_n=1, verbose=False)

Train the generative adversarial network using the implemented privatizer and adversary networks. Both networks are trained on the supplied train_data. The privatizer network is updated only every nth batch iteration, where n is set by privatizer_train_every_n; likewise, the adversary network is updated only every nth batch iteration, where n is set by adversary_train_every_n. Both values should be divisible by 5 because of the current logging system.

Parameters:

  • train_data: the training data used for training the generative adversarial network.

  • test_data: the testing data used for printing validation results on the generative adversarial network.

  • batch_size: The batch size used when training with the supplied train_data.

  • lr: Learning Rate indicating the step-size of adjusting the network parameters.

  • privatizer_train_every_n: Parameter defining when to update the privatizer network; Default=5.

  • adversary_train_every_n: Parameter defining when to update the adversary network; Default=1.

  • data_sampler: Function used for generating samples by the privatizer network.

  • k: The number of samples which should be generated by the supplied data_sampler function.

class privpack.core.architectures.GenerativeAdversarialNetwork(device, privacy_size, public_size, gan_criterion: privpack.core.criterion.PGANCriterion, metrics, lr=0.001)

Bases: abc.ABC

A Generative Adversarial Network defined using the PyTorch library.

This abstract class expects subclasses to implement a method that updates the adversary and a method that updates the privatizer. The naming follows the goal of this library: releasing privatized data optimized according to a privacy-utility trade-off.
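
The general shape of such an abstract base class can be sketched with Python's `abc` module. The classes `SketchGAN` and `IdentityGAN` below are hypothetical stand-ins that do not use PyTorch and do not reproduce the library's code; only the method names `privatize`, `reset`, `get_device` and `set_device` are taken from this page.

```python
from abc import ABC, abstractmethod


class SketchGAN(ABC):
    """Hypothetical stand-in for an abstract GAN base class."""

    def __init__(self, device):
        self.device = device

    def get_device(self):
        return self.device

    def set_device(self, device):
        self.device = device

    @abstractmethod
    def privatize(self, data):
        """Return a privatized version of `data`."""

    @abstractmethod
    def reset(self):
        """Discard any learned state."""


class IdentityGAN(SketchGAN):
    """Toy subclass that releases its input unchanged."""

    def privatize(self, data):
        return list(data)

    def reset(self):
        pass
```

Because `privatize` and `reset` are abstract, `SketchGAN` itself cannot be instantiated; only a subclass that implements both, such as `IdentityGAN`, can.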

get_device()

Get the device this network is currently using.

abstract privatize(data)

Privatize the provided data using the privatizer in the network.

Parameters:

  • data: data to be privatized by the network.

Returns the privatized version of the provided data.

abstract classmethod reset()

Reset the parameters of both networks.

set_adversary_class(adversary_class)

set_device(device)

Change the device used by this network.

Parameters:

  • device: the device to be used: CPU or CUDA

set_privatizer_class(privatizer_class)

train(train_data, test_data, epochs, batch_size=1, privatizer_train_every_n=1, adversary_train_every_n=1, data_sampler=None, k=1, verbose=False)

Train the generative adversarial network using the implemented privatizer and adversary networks. Both networks are trained on the supplied train_data. The privatizer network is updated only every nth batch iteration, where n is set by privatizer_train_every_n; likewise, the adversary network is updated only every nth batch iteration, where n is set by adversary_train_every_n. Both values should be divisible by 5 because of the current logging system.

Parameters:

  • train_data: the training data used for training the generative adversarial network.

  • test_data: the testing data used for printing validation results on the generative adversarial network.

  • batch_size: The batch size used when training with the supplied train_data.

  • lr: Learning Rate indicating the step-size of adjusting the network parameters.

  • privatizer_train_every_n: Parameter defining when to update the privatizer network; Default=1.

  • adversary_train_every_n: Parameter defining when to update the adversary network; Default=1.

  • data_sampler: Function used for generating samples by the privatizer network.

  • k: The number of samples which should be generated by the supplied data_sampler function.