Documentation ¶
Overview ¶
Package dpagg contains differentially private aggregation primitives.
Index ¶
- Constants
- Variables
- func ClampFloat64(e, lower, upper float64) (float64, error)
- func ClampInt64(e, lower, upper int64) (int64, error)
- type BoundedMeanFloat64
- func (bm *BoundedMeanFloat64) Add(e float64)
- func (bm *BoundedMeanFloat64) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)
- func (bm *BoundedMeanFloat64) GobDecode(data []byte) error
- func (bm *BoundedMeanFloat64) GobEncode() ([]byte, error)
- func (bm *BoundedMeanFloat64) Merge(bm2 *BoundedMeanFloat64)
- func (bm *BoundedMeanFloat64) Result() float64
- type BoundedMeanFloat64Options
- type BoundedQuantiles
- type BoundedQuantilesOptions
- type BoundedStandardDeviation
- func (bstdv *BoundedStandardDeviation) Add(e float64)
- func (bstdv *BoundedStandardDeviation) GobDecode(data []byte) error
- func (bstdv *BoundedStandardDeviation) GobEncode() ([]byte, error)
- func (bstdv *BoundedStandardDeviation) Merge(bstdv2 *BoundedStandardDeviation)
- func (bstdv *BoundedStandardDeviation) Result() float64
- type BoundedStandardDeviationOptions
- type BoundedSumFloat64
- func (bs *BoundedSumFloat64) Add(e float64)
- func (bs *BoundedSumFloat64) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)
- func (bs *BoundedSumFloat64) GobDecode(data []byte) error
- func (bs *BoundedSumFloat64) GobEncode() ([]byte, error)
- func (bs *BoundedSumFloat64) Merge(bs2 *BoundedSumFloat64)
- func (bs *BoundedSumFloat64) Result() float64
- func (bs *BoundedSumFloat64) ThresholdedResult(thresholdDela float64) *float64
- type BoundedSumFloat64Options
- type BoundedSumInt64
- func (bs *BoundedSumInt64) Add(e int64)
- func (bs *BoundedSumInt64) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)
- func (bs *BoundedSumInt64) GobDecode(data []byte) error
- func (bs *BoundedSumInt64) GobEncode() ([]byte, error)
- func (bs *BoundedSumInt64) Merge(bs2 *BoundedSumInt64)
- func (bs *BoundedSumInt64) Result() int64
- func (bs *BoundedSumInt64) ThresholdedResult(thresholdDelta float64) *int64
- type BoundedSumInt64Options
- type BoundedVariance
- type BoundedVarianceOptions
- type Count
- func (c *Count) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)
- func (c *Count) GobDecode(data []byte) error
- func (c *Count) GobEncode() ([]byte, error)
- func (c *Count) Increment()
- func (c *Count) IncrementBy(count int64)
- func (c *Count) Merge(c2 *Count)
- func (c *Count) Result() int64
- func (c *Count) ThresholdedResult(thresholdDelta float64) *int64
- type CountOptions
- type PreAggSelectPartition
- func (s *PreAggSelectPartition) GetHardThreshold() int
- func (s *PreAggSelectPartition) GobDecode(data []byte) error
- func (s *PreAggSelectPartition) GobEncode() ([]byte, error)
- func (s *PreAggSelectPartition) Increment()
- func (s *PreAggSelectPartition) Merge(s2 *PreAggSelectPartition)
- func (s *PreAggSelectPartition) ShouldKeepPartition() bool
- func (s *PreAggSelectPartition) String() string
- type PreAggSelectPartitionOptions
Constants ¶
const ( DefaultTreeHeight = 4 DefaultBranchingFactor = 16 )
Constants used for QuantileTrees.
Variables ¶
var LargestRepresentableDelta = 1 - math.Pow(2, -53)
LargestRepresentableDelta is the largest delta we could support in 64 bit precision, approximately equal to one.
Functions ¶
func ClampFloat64 ¶
ClampFloat64 clamps e within lower and upper, such that lower is returned if e < lower, and upper is returned if e > upper. Otherwise, e is returned.
func ClampInt64 ¶
ClampInt64 clamps e within lower and upper. Returns lower if e < lower. Returns upper if e > upper.
Types ¶
type BoundedMeanFloat64 ¶
type BoundedMeanFloat64 struct { // State variables NormalizedSum BoundedSumFloat64 Count Count // contains filtered or unexported fields }
BoundedMeanFloat64 calculates a differentially private mean of a collection of float64 values.
The mean is computed by dividing a noisy sum of the entries by a noisy count of the entries. To improve utility, all entries are normalized by setting them to the difference between their actual value and the middle of the input range before summation. The original mean is recovered by adding the midpoint in a post-processing step. This idea is taken from Algorithm 2.4 of "Differential Privacy: From Theory to Practice", by Ninghui Li, Min Lyu, Dong Su and Weining Yang (section 2.5.5, page 28). In contrast to Algorithm 2.4, we do not return the midpoint if the noisy count is less or equal to 1. Instead, we set the noisy count to 1. Since this is a mere post-processing step, the DP bounds are preserved. Moreover, for small numbers of entries, this approach will return results that are closer to the actual mean in expectation.
BoundedMeanFloat64 supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) as well as contribute to the same partition multiple times (via the MaxContributionsPerPartition parameter), by scaling the added noise appropriately.
For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.
Note: Do not use when your results may cause overflows for float64 values. This aggregation is not hardened for such applications yet.
Not thread-safe.
func NewBoundedMeanFloat64 ¶
func NewBoundedMeanFloat64(opt *BoundedMeanFloat64Options) *BoundedMeanFloat64
NewBoundedMeanFloat64 returns a new BoundedMeanFloat64.
func (*BoundedMeanFloat64) Add ¶
func (bm *BoundedMeanFloat64) Add(e float64)
Add an entry to a BoundedMeanFloat64. It skips NaN entries and doesn't count them in the final result because introducing even a single NaN entry will result in a NaN mean regardless of other entries, which would break the indistinguishability property required for differential privacy.
func (*BoundedMeanFloat64) ComputeConfidenceInterval ¶
func (bm *BoundedMeanFloat64) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)
ComputeConfidenceInterval computes a confidence interval that contains the true mean with probability greater than or equal to 1 - alpha. The computation is based exclusively on noised data and the privacy parameters. Thus no privacy budget is consumed by this operation.
Result() needs to be called before ComputeConfidenceInterval, otherwise this will return an error.
func (*BoundedMeanFloat64) GobDecode ¶
func (bm *BoundedMeanFloat64) GobDecode(data []byte) error
GobDecode decodes Count.
func (*BoundedMeanFloat64) GobEncode ¶
func (bm *BoundedMeanFloat64) GobEncode() ([]byte, error)
GobEncode encodes Count.
func (*BoundedMeanFloat64) Merge ¶
func (bm *BoundedMeanFloat64) Merge(bm2 *BoundedMeanFloat64)
Merge merges bm2 into bm (i.e., adds to bm all entries that were added to bm2). bm2 is consumed by this operation: bm2 may not be used after it is merged into bm.
func (*BoundedMeanFloat64) Result ¶
func (bm *BoundedMeanFloat64) Result() float64
Result returns a differentially private estimate of the average of bounded elements added so far. The method can be called only once.
Note that the returned value is not an unbiased estimate of the raw bounded mean.
type BoundedMeanFloat64Options ¶
type BoundedMeanFloat64Options struct { Epsilon float64 // Privacy parameter ε. Required. Delta float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise. MaxPartitionsContributed int64 // How many distinct partitions may a single user contribute to? Defaults to 1. MaxContributionsPerPartition int64 // How many times may a single user contribute to a single partition? Required. // Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper. Lower, Upper float64 Noise noise.Noise // Type of noise used in BoundedMean. Defaults to Laplace noise. }
BoundedMeanFloat64Options contains the options necessary to initialize a BoundedMeanFloat64.
type BoundedQuantiles ¶
BoundedQuantiles calculates a differentially private quantiles of a collection of float64 values using a quantile tree mechanism. See https://github.com/google/differential-privacy/blob/main/common_docs/Differentially_Private_Quantile_Trees.pdf.
It supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) as well as contribute to the same partition multiple times (via the MaxContributionsPerPartition parameter), by scaling the added noise appropriately.
For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.
Note: Do not use when your results may cause overflows for float64 values. This aggregation is not hardened for such applications yet.
Not thread-safe.
func NewBoundedQuantiles ¶
func NewBoundedQuantiles(opt *BoundedQuantilesOptions) *BoundedQuantiles
NewBoundedQuantiles returns a new BoundedQuantiles.
func (*BoundedQuantiles) Add ¶
func (bq *BoundedQuantiles) Add(e float64)
Add adds an entry to BoundedQuantiles. It skips (ignores) NaN values because their contribution to the final result is not well defined.
func (*BoundedQuantiles) GobDecode ¶
func (bq *BoundedQuantiles) GobDecode(data []byte) error
GobDecode decodes BoundedQuantiles.
func (*BoundedQuantiles) GobEncode ¶
func (bq *BoundedQuantiles) GobEncode() ([]byte, error)
GobEncode encodes BoundedQuantiles.
func (*BoundedQuantiles) Merge ¶
func (bq *BoundedQuantiles) Merge(bq2 *BoundedQuantiles)
Merge merges bq2 into bq (i.e., adds to bq all entries that were added to bq2). bq2 is consumed by this operation: bq2 may not be used after it is merged into bq.
func (*BoundedQuantiles) Result ¶
func (bq *BoundedQuantiles) Result(rank float64) float64
Result calculates and returns a differentially private quantile of the values added. The specified rank must be between 0.0 and 1.0.
This function can be called multiple times to compute different quantiles. Privacy budget is paid only once, on its first invocation. Calling this method repeatedly for the same rank will return the same result. The results of repeated calls are guaranteed to be monotonically increasing in the sense that r_1 < r_2 implies that Result(r_1) <= Result(r_2).
Note that the returned values is not an unbiased estimate of the raw bounded quantile.
type BoundedQuantilesOptions ¶
type BoundedQuantilesOptions struct { Epsilon float64 // Privacy parameter ε. Required. Delta float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise. MaxPartitionsContributed int64 // How many distinct partitions may a single privacy unit contribute to? Defaults to 1. MaxContributionsPerPartition int64 // How many times may a single user contribute to a single partition? Required. // Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper. Lower, Upper float64 Noise noise.Noise // Type of noise used in BoundedSum. Defaults to Laplace noise. // It is not recommended to set TreeHeight and BranchingFactor since they require // implementation-specific insight to modify and they only apply to QuantileTree // algorithm, which might become obsolote if another algorithm is used. TreeHeight int // Height of the QuantileTree. Defaults to defaultTreeHeight. BranchingFactor int // Number of children of every non-leaf node. Defaults to defaultBranchingFactor. }
BoundedQuantilesOptions contains the options necessary to initialize a BoundedQuantiles.
type BoundedStandardDeviation ¶ added in v1.1.0
type BoundedStandardDeviation struct { // State variables Variance BoundedVariance // contains filtered or unexported fields }
BoundedStandardDeviation calculates a differentially private standard deviation of a collection of float64 values.
The output will be clamped between 0 and (upper - lower).
The implementation simply computes the bounded variance and takes the square root, which is differentially private by the post-processing theorem. It relies on the fact that the bounded variance algorithm guarantees that the output is non-negative.
For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.
Note: Do not use when your results may cause overflows for float64 values. This aggregation is not hardened for such applications yet.
Not thread-safe.
func NewBoundedStandardDeviation ¶ added in v1.1.0
func NewBoundedStandardDeviation(opt *BoundedStandardDeviationOptions) *BoundedStandardDeviation
NewBoundedStandardDeviation returns a new BoundedStandardDeviation.
func (*BoundedStandardDeviation) Add ¶ added in v1.1.0
func (bstdv *BoundedStandardDeviation) Add(e float64)
Add an entry to a BoundedStandardDeviation. It skips NaN entries and doesn't count them in the final result because introducing even a single NaN entry will result in a NaN standard deviation regardless of other entries, which would break the indistinguishability property required for differential privacy.
func (*BoundedStandardDeviation) GobDecode ¶ added in v1.1.0
func (bstdv *BoundedStandardDeviation) GobDecode(data []byte) error
GobDecode decodes BoundedStandardDeviation.
func (*BoundedStandardDeviation) GobEncode ¶ added in v1.1.0
func (bstdv *BoundedStandardDeviation) GobEncode() ([]byte, error)
GobEncode encodes BoundedStandardDeviation.
func (*BoundedStandardDeviation) Merge ¶ added in v1.1.0
func (bstdv *BoundedStandardDeviation) Merge(bstdv2 *BoundedStandardDeviation)
Merge merges bstdv2 into bstdv (i.e., adds to bstdv all entries that were added to bstdv2). bstdv2 is consumed by this operation: bstdv2 may not be used after it is merged into bstdv.
func (*BoundedStandardDeviation) Result ¶ added in v1.1.0
func (bstdv *BoundedStandardDeviation) Result() float64
Result returns a differentially private estimate of the standard deviation of bounded elements added so far. The method can be called only once.
Note that the returned value is not an unbiased estimate of the raw bounded standard deviation.
type BoundedStandardDeviationOptions ¶ added in v1.1.0
type BoundedStandardDeviationOptions struct { Epsilon float64 // Privacy parameter ε. Required. Delta float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise. MaxPartitionsContributed int64 // How many distinct partitions may a single user contribute to? Defaults to 1. MaxContributionsPerPartition int64 // How many times may a single user contribute to a single partition? Required. // Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper. Lower, Upper float64 Noise noise.Noise // Type of noise used in BoundedStandardDeviation. Defaults to Laplace noise. }
BoundedStandardDeviationOptions contains the options necessary to initialize a BoundedStandardDeviation.
type BoundedSumFloat64 ¶
BoundedSumFloat64 calculates a differentially private sum of a collection of float64 values. It supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) by scaling the added noise appropriately. However, it assumes that for each BoundedSumFloat64 instance (partition), each privacy unit contributes at most one value. If a privacy unit contributes more, the contributions should be pre-aggregated before passing them to BoundedSumFloat64.
The provided differentially private sum is an unbiased estimate of the raw bounded sum meaning that its expected value is equal to the raw bounded sum.
For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions,
Not thread-safe.
func NewBoundedSumFloat64 ¶
func NewBoundedSumFloat64(opt *BoundedSumFloat64Options) *BoundedSumFloat64
NewBoundedSumFloat64 returns a new BoundedSumFloat64, whose sum is initialized at 0.
func (*BoundedSumFloat64) Add ¶
func (bs *BoundedSumFloat64) Add(e float64)
Add adds a new summand to the BoundedSumFloat64. It ignores NaN summands because introducing even a single NaN summand will result in a NaN sum regardless of other summands, which would break the indistinguishability property required for differential privacy.
func (*BoundedSumFloat64) ComputeConfidenceInterval ¶
func (bs *BoundedSumFloat64) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)
ComputeConfidenceInterval computes a confidence interval that contains the true sum with a probability greater than or equal to 1 - alpha using the noised sum computed by Result(). The computation is based exclusively on the noised sum returned by Result(). Thus no privacy budget is consumed by this operation.
Result() needs to be called before ComputeConfidenceInterval, otherwise this will return an error.
See https://github.com/google/differential-privacy/tree/main/common_docs/confidence_intervals.md.
func (*BoundedSumFloat64) GobDecode ¶
func (bs *BoundedSumFloat64) GobDecode(data []byte) error
GobDecode decodes BoundedSumInt64.
func (*BoundedSumFloat64) GobEncode ¶
func (bs *BoundedSumFloat64) GobEncode() ([]byte, error)
GobEncode encodes BoundedSumInt64.
func (*BoundedSumFloat64) Merge ¶
func (bs *BoundedSumFloat64) Merge(bs2 *BoundedSumFloat64)
Merge merges bs2 into bs (i.e., adds to bs all entries that were added to bs2). bs2 is consumed by this operation: bs2 may not be used after it is merged into bs.
func (*BoundedSumFloat64) Result ¶
func (bs *BoundedSumFloat64) Result() float64
Result returns a differentially private estimate of the sum of bounded elements added so far. The method can be called only once.
The returned value is an unbiased estimate of the raw bounded sum.
The returned value may sometimes be outside the set of possible raw bounded sums, e.g., the differentially private bounded sum may be positive although neither the lower nor the upper bound are positive. This can be corrected by the caller of this method, e.g., by snapping the result to the closest value representing a bounded sum that is possible. Note that such post processing introduces bias to the result.
func (*BoundedSumFloat64) ThresholdedResult ¶
func (bs *BoundedSumFloat64) ThresholdedResult(thresholdDela float64) *float64
ThresholdedResult is similar to Result() but applies thresholding to the result. So, if the result is less than the threshold specified by the noise, mechanism, it returns nil. Otherwise, it returns the result.
type BoundedSumFloat64Options ¶
type BoundedSumFloat64Options struct { Epsilon float64 // Privacy parameter ε. Required. Delta float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise. MaxPartitionsContributed int64 // How many distinct partitions may a single privacy unit contribute to? Defaults to 1. // Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper. Lower, Upper float64 Noise noise.Noise // Type of noise used in BoundedSum. Defaults to Laplace noise. // contains filtered or unexported fields }
BoundedSumFloat64Options contains the options necessary to initialize a BoundedSumFloat64.
type BoundedSumInt64 ¶
BoundedSumInt64 calculates a differentially private sum of a collection of int64 values. It supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) by scaling the added noise appropriately. However, it assumes that for each BoundedSumInt64 instance (partition), each privacy unit contributes at most one value. If a privacy unit contributes more, the contributions should be pre-aggregated before passing them to BoundedSumInt64.
The provided differentially private sum is an unbiased estimate of the raw bounded sum in the sense that its expected value is equal to the raw bounded sum.
For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.
Note: Do not use when your results may cause overflows for int64 values. This aggregation is not hardened for such applications yet.
Not thread-safe.
func NewBoundedSumInt64 ¶
func NewBoundedSumInt64(opt *BoundedSumInt64Options) *BoundedSumInt64
NewBoundedSumInt64 returns a new BoundedSumInt64, whose sum is initialized at 0.
func (*BoundedSumInt64) Add ¶
func (bs *BoundedSumInt64) Add(e int64)
Add adds a new summand to the BoundedSumInt64.
func (*BoundedSumInt64) ComputeConfidenceInterval ¶
func (bs *BoundedSumInt64) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)
ComputeConfidenceInterval computes a confidence interval with integer bounds that contains the true sum with a probability greater than or equal to 1 - alpha using the noised sum computed by Result(). The computation is based exclusively on the noised sum returned by Result(). Thus no privacy budget is consumed by this operation.
Result() needs to be called before ComputeConfidenceInterval, otherwise this will return an error.
See https://github.com/google/differential-privacy/tree/main/common_docs/confidence_intervals.md.
func (*BoundedSumInt64) GobDecode ¶
func (bs *BoundedSumInt64) GobDecode(data []byte) error
GobDecode decodes BoundedSumInt64.
func (*BoundedSumInt64) GobEncode ¶
func (bs *BoundedSumInt64) GobEncode() ([]byte, error)
GobEncode encodes BoundedSumInt64.
func (*BoundedSumInt64) Merge ¶
func (bs *BoundedSumInt64) Merge(bs2 *BoundedSumInt64)
Merge merges bs2 into bs (i.e., adds to bs all entries that were added to bs2). bs2 is consumed by this operation: bs2 may not be used after it is merged into bs.
func (*BoundedSumInt64) Result ¶
func (bs *BoundedSumInt64) Result() int64
Result returns a differentially private estimate of the sum of bounded elements added so far. The method can be called only once.
The returned value is an unbiased estimate of the raw bounded sum.
The returned value may sometimes be outside the set of possible raw bounded sums, e.g., the differentially private bounded sum may be positive although neither the lower nor the upper bound are positive. This can be corrected by the caller of this method, e.g., by snapping the result to the closest value representing a bounded sum that is possible. Note that such post processing introduces bias to the result.
func (*BoundedSumInt64) ThresholdedResult ¶
func (bs *BoundedSumInt64) ThresholdedResult(thresholdDelta float64) *int64
ThresholdedResult is similar to Result() but applies thresholding to the result. So, if the result is less than the threshold specified by the parameters of BoundedSumInt64 as well as thresholdDelta, it returns nil. Otherwise, it returns the result.
Note that the nil results should not be published when the existence of a partition in the output depends on private data.
type BoundedSumInt64Options ¶
type BoundedSumInt64Options struct { Epsilon float64 // Privacy parameter ε. Required. Delta float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise. MaxPartitionsContributed int64 // How many distinct partitions may a single privacy unit contribute to? Defaults to 1. // Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper. Lower, Upper int64 Noise noise.Noise // Type of noise used in BoundedSum. Defaults to Laplace noise. // contains filtered or unexported fields }
BoundedSumInt64Options contains the options necessary to initialize a BoundedSumInt64.
type BoundedVariance ¶ added in v1.1.0
type BoundedVariance struct { // State variables NormalizedSumOfSquares BoundedSumFloat64 NormalizedSum BoundedSumFloat64 Count Count // contains filtered or unexported fields }
BoundedVariance calculates a differentially private variance of a collection of float64 values.
The output will be clamped between 0 and (upper - lower)². Since the result is guaranteed to be positive, this algorithm can be used to compute a differentially private standard deviation.
The algorithm is a variation of the algorithm for differentially private mean from "Differential Privacy: From Theory to Practice", section 2.5.5: https://books.google.com/books?id=WFttDQAAQBAJ&pg=PA24#v=onepage&q&f=false
BoundedVariance supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) as well as contribute to the same partition multiple times (via the MaxContributionsPerPartition parameter), by scaling the added noise appropriately.
For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.
Note: Do not use when your results may cause overflows for float64 values. This aggregation is not hardened for such applications yet.
Not thread-safe.
func NewBoundedVariance ¶ added in v1.1.0
func NewBoundedVariance(opt *BoundedVarianceOptions) *BoundedVariance
NewBoundedVariance returns a new BoundedVariance.
func (*BoundedVariance) Add ¶ added in v1.1.0
func (bv *BoundedVariance) Add(e float64)
Add an entry to a BoundedVariance. It skips NaN entries and doesn't count them in the final result because introducing even a single NaN entry will result in a NaN variance regardless of other entries, which would break the indistinguishability property required for differential privacy.
func (*BoundedVariance) GobDecode ¶ added in v1.1.0
func (bv *BoundedVariance) GobDecode(data []byte) error
GobDecode decodes BoundedVariance.
func (*BoundedVariance) GobEncode ¶ added in v1.1.0
func (bv *BoundedVariance) GobEncode() ([]byte, error)
GobEncode encodes BoundedVariance.
func (*BoundedVariance) Merge ¶ added in v1.1.0
func (bv *BoundedVariance) Merge(bv2 *BoundedVariance)
Merge merges bv2 into bv (i.e., adds to bv all entries that were added to bv2). bv2 is consumed by this operation: bv2 may not be used after it is merged into bv.
func (*BoundedVariance) Result ¶ added in v1.1.0
func (bv *BoundedVariance) Result() float64
Result returns a differentially private estimate of the variance of bounded elements added so far. The method can be called only once.
Note that the returned value is not an unbiased estimate of the raw bounded variance.
type BoundedVarianceOptions ¶ added in v1.1.0
type BoundedVarianceOptions struct { Epsilon float64 // Privacy parameter ε. Required. Delta float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise. MaxPartitionsContributed int64 // How many distinct partitions may a single user contribute to? Defaults to 1. MaxContributionsPerPartition int64 // How many times may a single user contribute to a single partition? Required. // Lower and Upper bounds for clamping. Default to 0; must be such that Lower < Upper. Lower, Upper float64 Noise noise.Noise // Type of noise used in BoundedVariance. Defaults to Laplace noise. }
BoundedVarianceOptions contains the options necessary to initialize a BoundedVariance.
type Count ¶
Count calculates a differentially private count of a collection of values using the Laplace or Gaussian mechanism
It supports privacy units that contribute to multiple partitions (via the MaxPartitionsContributed parameter) by scaling the added noise appropriately. However, it does not support multiple contributions to a single partition from the same privacy unit. For that use case, BoundedSumInt64 should be used instead.
The provided differentially private count is an unbiased estimate of the raw count meaning that its expected value is equal to the raw count.
For general details and key definitions, see https://github.com/google/differential-privacy/blob/main/differential_privacy.md#key-definitions.
Note: Do not use when your results may cause overflows for int64 values. This aggregation is not hardened for such applications yet.
Not thread-safe.
func NewCount ¶
func NewCount(opt *CountOptions) *Count
NewCount returns a new Count, initialized at 0.
func (*Count) ComputeConfidenceInterval ¶
func (c *Count) ComputeConfidenceInterval(alpha float64) (noise.ConfidenceInterval, error)
ComputeConfidenceInterval computes a confidence interval with integer bounds that contains the true count with a probability greater than or equal to 1 - alpha using the noised count computed by Result(). The computation is based exclusively on the noised count returned by Result(). Thus no privacy budget is consumed by this operation.
Result() needs to be called before ComputeConfidenceInterval, otherwise this will return an error.
See https://github.com/google/differential-privacy/tree/main/common_docs/confidence_intervals.md.
func (*Count) IncrementBy ¶
IncrementBy increments the count by the given value. Note that this shouldn't be used to count multiple contributions to a single partition from the same privacy unit.
func (*Count) Merge ¶
Merge merges c2 into c (i.e., adds to c all entries that were added to c2). c2 is consumed by this operation: it may not be used after it is merged into c.
func (*Count) Result ¶
Result returns a differentially private estimate of the current count. The method can be called only once.
The returned value is an unbiased estimate of the raw count.
The returned value may sometimes be negative. This can be corrected by setting negative results to 0. Note that such post processing introduces bias to the result.
func (*Count) ThresholdedResult ¶
ThresholdedResult is similar to Result() but applies thresholding to the result. So, if the result is less than the threshold specified by the parameters of Count as well as thresholdDelta, it returns nil. Otherwise, it returns the result.
Note that the nil results should not be published when the existence of a partition in the output depends on private data.
type CountOptions ¶
type CountOptions struct { Epsilon float64 // Privacy parameter ε. Required. Delta float64 // Privacy parameter δ. Required with Gaussian noise, must be 0 with Laplace noise. MaxPartitionsContributed int64 // How many distinct partitions may a single privacy unit contribute to? Defaults to 1. Noise noise.Noise // Type of noise used. Defaults to Laplace noise. // contains filtered or unexported fields }
CountOptions contains the options necessary to initialize a Count.
type PreAggSelectPartition ¶
type PreAggSelectPartition struct {
// contains filtered or unexported fields
}
PreAggSelectPartition is used to compute an (ε,δ)-differentially private decision of whether to materialize a partition.
Many differential privacy mechanisms work by performing an aggregation and adding noise. They achieve (ε_m,δ_m)-differential privacy under the assumption that partitions are chosen in advance. In other words, they assume that even if no data is associated with a partition, noise is added to the empty aggregation, and the noisy result is materialized. However, when only partitions containing data are materialized, such mechanisms fail to protect privacy for partitions containing data from a single privacy ID (e.g., user). To fix this, partitions with small numbers of privacy IDs must sometimes be dropped in order to maintain privacy. This process of partition selection incurs an additional (ε,δ) differential privacy budget resulting in a total differential privacy budget of (ε+ε_m,δ+δ_m) being used for the aggregation with partition selection.
Depending on the l0sensitivity, the PreAggSelectPartition uses one of two differentially private partition selection algorithms.
When l0sensitivity ≤ 3, the partition selection process is made (ε,δ) differentially private by applying the definition of differential privacy to the count of privacy IDs. Supposing l0Sensitivity bounds the number of partitions a privacy ID may contribute to, we define:
pε := ε/l0Sensitivity pδ := δ/l0Sensitivity
to be the per-partition differential privacy losses incurred by the partition selection process. Letting n denote the number of privacy IDs in a partition, the probability of selecting a partition is given by the following recurrence relation:
keepPartitionProbability(n) = min( keepPartitionProbability(n-1) * exp(pε) + pδ, (1) 1 - exp(-pε) * (1-keepPartitionProbability(n-1)-pδ), (2) 1 (3) )
with base case keepPartitionProbability(0) = 0. This formula is optimal in terms of maximizing the likelihood of selecting a partition under (ε,δ)-differential privacy, with the caveat that the input values for pε and pδ are lower bound approximations. For efficiency, we use a closed-form solution to this recurrence relation. See [Differentially private partition selection paper] https://arxiv.org/pdf/2006.03684.pdf for details on the underlying mathematics.
When l0sensitivity > 3, the partition selection process is made (ε,δ) differentially private by using the ThresholdedResult() of the Count primitive with Gaussian noise. Count computes a (ε,δ/2) differentially private count of the privacy IDs in a partition by adding Gaussian noise. Then, it computes a threshold T for which the probability that a (ε,δ/2) differentially private count of a single privacy ID can exceed T is δ/2. It keeps the partition iff differentially private count exceeds the threshold.
The reason two different algorithms for deciding whether to keep a partition is used is because the first algorithm ("magic partition selection") is optimal when l0sensitivity ≤ 3 but is outperformed by Gaussian-based thresholding when l0sensitivity > 3.
PreAggSelectPartition is a utility for maintaining the count of IDs in a single partition and then determining whether the partition should be materialized. Use Increment() to increment the count of IDs and ShouldKeepPartition() to decide if the partition should be materialized.
func NewPreAggSelectPartition ¶
func NewPreAggSelectPartition(opt *PreAggSelectPartitionOptions) *PreAggSelectPartition
NewPreAggSelectPartition constructs a new PreAggSelectPartition from opt.
func (*PreAggSelectPartition) GetHardThreshold ¶
func (s *PreAggSelectPartition) GetHardThreshold() int
GetHardThreshold returns a threshold k, where if there are at least k privacy units in a partition, we are guaranteed to keep that partition.
This is the conceptual equivalent of the post-aggregation threshold of the noise.Noise interface with the difference that here there is 0 probability of not keeping the partition if it has at least k privacy units, whereas with the post-aggregation threshold there is a non-zero probability (however small).
func (*PreAggSelectPartition) GobDecode ¶
func (s *PreAggSelectPartition) GobDecode(data []byte) error
GobDecode decodes PreAggSelectPartition.
func (*PreAggSelectPartition) GobEncode ¶
func (s *PreAggSelectPartition) GobEncode() ([]byte, error)
GobEncode encodes PreAggSelectPartition.
func (*PreAggSelectPartition) Increment ¶
func (s *PreAggSelectPartition) Increment()
Increment increments the ids count by one. The caller must ensure this methods called at most once per privacy ID.
func (*PreAggSelectPartition) Merge ¶
func (s *PreAggSelectPartition) Merge(s2 *PreAggSelectPartition)
Merge merges s2 into s (i.e., add the idCount of s2 to s). This implicitly assumes that s and s2 act on distinct privacy IDs. s2 is consumed by this operation: s2 may not be used after it is merged into s.
Preconditions: s and s2 must have the same privacy parameters. In addition, ShouldKeepPartition() may not be called yet for either s or s2.
func (*PreAggSelectPartition) ShouldKeepPartition ¶
func (s *PreAggSelectPartition) ShouldKeepPartition() bool
ShouldKeepPartition returns whether the partition should be materialized.
func (*PreAggSelectPartition) String ¶
func (s *PreAggSelectPartition) String() string
type PreAggSelectPartitionOptions ¶
type PreAggSelectPartitionOptions struct { // Epsilon and Delta specify the (ε,δ)-differential privacy budget used for // partition selection. Required. Epsilon float64 Delta float64 // MaxPartitionsContributed is the number of distinct partitions a single // privacy unit can contribute to. Defaults to 1. MaxPartitionsContributed int64 }
PreAggSelectPartitionOptions is used to set the privacy parameters when constructing a PreAggSelectPartition.