MRT Generalized Quantization API¶
Quantizer API¶
Optimizor definition for MRT calibration. Quantizer definition for MRT calibration and quantization. Feature Types Definition for MRT calibration and quantization. Buffers Types Definition for MRT quantization. Granularity constant vars definition.
- class mrt.V2.tfm_types.Feature(*args)¶
The data structure which specifies the object of data sampling in calibration stage.
Feature can be manipulated in quantization stage.
- get()¶
Get the value of feature.
- Returns
ret – The feature value.
- Return type
float or tuple
- get_threshold()¶
Get the threshold of the feature.
- serialize()¶
Serialize the feature into list to be compatible with json.
- Returns
ret – list of serialized features.
- Return type
list
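The `get()` / `serialize()` contract described above can be illustrated with a small stand-in class. This is only a sketch: `FeatureSketch` and the `[type name, values]` list layout are assumptions made for illustration and are not taken from `mrt.V2.tfm_types`.

```python
import json

class FeatureSketch:
    """Hypothetical stand-in for a feature; not the MRT implementation."""

    def __init__(self, *args):
        self.args = args

    def get(self):
        # A single stored value is returned as-is, several as a tuple.
        return self.args[0] if len(self.args) == 1 else self.args

    def serialize(self):
        # Plain lists, strings and floats survive a JSON round trip.
        return [type(self).__name__, list(self.args)]

single = FeatureSketch(3.5)
pair = FeatureSketch(-1.0, 4.0)
round_trip = json.loads(json.dumps(pair.serialize()))
```

Keeping the serialized form to lists and numbers is what makes it safe to pass to `json.dumps` without a custom encoder.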
- class mrt.V2.tfm_types.AFeature(*args)¶
AFeature is designed for uniform symmetric quantization. absmax stands for the max of the absolute value of every entry in the input tensor.
- class mrt.V2.tfm_types.MMFeature(*args)¶
MMFeature is designed for uniform affine quantization. minv and maxv respectively stand for the min and max entries of the input tensor.
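As a rough illustration of what the two feature types sample, the statistics can be computed on a flat list of values. The helper names below are hypothetical and are not part of `mrt.V2.tfm_types`; the real classes operate on tensors.

```python
# Hypothetical helpers for illustration only.
def absmax_feature(values):
    """Largest absolute value, the statistic described for AFeature."""
    return max(abs(v) for v in values)

def minmax_feature(values):
    """(minv, maxv) pair, the statistics described for MMFeature."""
    return min(values), max(values)

sample = [-3.5, 0.25, 2.0, 1.75]
absmax = absmax_feature(sample)
minv, maxv = minmax_feature(sample)
```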
- class mrt.V2.tfm_types.Buffer(*args)¶
Quantization buffer used to store the scale. For uniform affine quantizers, the zero point is also stored.
- get()¶
Get the value of buffer.
- Returns
ret – The buffer value.
- Return type
float or tuple
- serialize()¶
Serialize the buffer into list to be compatible with json.
- Returns
ret – list of serialized buffers.
- Return type
list
- class mrt.V2.tfm_types.SBuffer(*args)¶
SBuffer is designed for uniform symmetric quantizers, where scale is stored.
- class mrt.V2.tfm_types.SZBuffer(*args)¶
SZBuffer is designed for uniform affine quantizers, where both scale and zero point are stored.
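The difference between the two buffer types can be sketched with the textbook formulas for uniform quantization. The formulas and function names here are assumptions for illustration (with the scale defined as the factor that maps float values to integers, and a non-degenerate input range); the values MRT actually stores may be derived differently.

```python
# Hypothetical helpers; textbook formulas, not confirmed MRT behaviour.
def symmetric_scale(absmax, prec):
    """Scale mapping [-absmax, absmax] onto signed integers of `prec` bits."""
    return (2 ** (prec - 1) - 1) / absmax

def affine_scale_zero_point(minv, maxv, prec):
    """Scale and zero point mapping [minv, maxv] onto [0, 2**prec - 1]."""
    scale = (2 ** prec - 1) / (maxv - minv)
    zero_point = minv  # float value that is mapped to integer 0
    return scale, zero_point

s_buffer = symmetric_scale(absmax=4.0, prec=8)
sz_buffer = affine_scale_zero_point(minv=-1.0, maxv=4.0, prec=8)
```

A symmetric quantizer needs only the scale because zero always maps to zero; an affine quantizer must also remember where the range starts.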
- class mrt.V2.tfm_types.Quantizer¶
Helper class to execute quantization process.
- Current quantizer types supported by MRT GEN:
Uniform Symmetric Quantization
Uniform Affine Quantization
- get_prec(val)¶
Get the quantizer precision with respect to the given value.
For quantizers such as uniform symmetric quantizers, the returned precision should be ‘int’.
For quantizers such as uniform affine quantizers, the returned precision should be ‘uint’.
- Parameters
val (float) – The quantize precision of the node.
- Returns
ret – The quantizer precision.
- Return type
int
- get_range(prec)¶
Get the quantizer range with respect to the given precision.
- Parameters
prec (int) – The specified precision.
- Returns
ret – The minimal and maximal possible value.
- Return type
tuple
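A minimal sketch of how a precision translates into a (min, max) range, assuming a symmetric signed range for ‘int’ precision and a zero-based range for ‘uint’ precision. Both functions are hypothetical; the exact bounds returned by `get_range` should be checked against the MRT source.

```python
# Hypothetical range helpers; the bounds are an assumption.
def signed_range(prec):
    """Symmetric range for a signed integer of `prec` bits."""
    bound = 2 ** (prec - 1) - 1
    return -bound, bound

def unsigned_range(prec):
    """Range for an unsigned integer of `prec` bits."""
    return 0, 2 ** prec - 1

int8_range = signed_range(8)
uint8_range = unsigned_range(8)
```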
- get_scale(oprec, ft)¶
Get the quantizer scale.
- Parameters
oprec (int) – The quantize precision of the node.
ft (mrt.V2.Feature) – The feature of the node to be quantized.
- Returns
ret – The quantizer scale.
- Return type
float
- int_realize(data, prec, **kwargs)¶
Realize the given input with respect to the given precision bound.
- Parameters
data (mxnet.NDArray) – The float weight to be realized.
prec (int) – The output precision bound.
- Returns
ret – The realized result and the tight precision.
- Return type
tuple
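The idea of realizing data under a precision bound and reporting the tight precision can be sketched on plain Python lists. This is an assumption-laden illustration: `int_realize_sketch` is a hypothetical name, the rounding and clipping rules are the generic ones, and the real method works on `mxnet.NDArray` inputs.

```python
def int_realize_sketch(data, prec):
    """Round to integers, clip to the signed `prec`-bit bound, and report
    the smallest signed precision that still holds the result."""
    bound = 2 ** (prec - 1) - 1
    realized = [max(-bound, min(bound, round(x))) for x in data]
    absmax = max(abs(v) for v in realized)
    # Magnitude bits plus one sign bit; at least one bit for all-zero data.
    tight_prec = 1 if absmax == 0 else absmax.bit_length() + 1
    return realized, tight_prec

clipped = int_realize_sketch([12.3, -40.7, 300.0], prec=8)
small = int_realize_sketch([1.2, -3.6], prec=8)
```

In the second call the data fits comfortably inside the 8-bit bound, so the reported tight precision is lower than the requested one.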
- quantize(sym, oprec, oscale=None, **kwargs)¶
The interface where operator quantization is performed.
- Parameters
sym (mxnet.symbol) – The expansion symbol or float weight symbol to be quantized.
oprec (int) – The output precision of the quantized symbol.
oscale (float or NoneType) – The output scale of the quantized symbol. If it is not None, the expansion operator will be quantized by output scale. Otherwise, it will be quantized by output precision.
- Returns
ret – The quantized symbol, the output precision and the output scale, in that order. For quantizers such as the uniform affine quantizer, the zero point is also returned.
- Return type
tuple
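The choice between quantizing by output scale and by output precision can be sketched for the symmetric case as follows. Everything here is hypothetical: `quantize_sketch` works on a list of floats rather than an `mxnet.symbol`, and the scale formula is the textbook one, not necessarily what MRT computes.

```python
def quantize_sketch(values, oprec, oscale=None):
    """Symmetric quantization of a list of floats (illustration only)."""
    if oscale is None:
        # No scale given: derive it from the requested output precision.
        absmax = max(abs(v) for v in values)
        oscale = (2 ** (oprec - 1) - 1) / absmax
    bound = 2 ** (oprec - 1) - 1
    quantized = [max(-bound, min(bound, round(v * oscale))) for v in values]
    return quantized, oprec, oscale

by_prec = quantize_sketch([0.5, -2.0, 1.5], oprec=8)
by_scale = quantize_sketch([0.5, -2.0, 1.5], oprec=8, oscale=10.0)
```

Passing an explicit scale is useful when several inputs must share the same scale, whereas deriving it from the precision uses the full integer range for that tensor alone.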
- class mrt.V2.tfm_types.USQuantizer¶
Uniform symmetric quantizer
- class mrt.V2.tfm_types.UAQuantizer¶
Uniform affine quantizer
- class mrt.V2.tfm_types.Optimizor(**attrs)¶
- Currently supported optimizor types intended for sampling optimization:
historical value
moving average
kl divergence
- Optimizor types to be implemented:
outlier removal
- Notice:
Users can implement customized optimizors for specific features, e.g. by designing different optimizors for different components of a feature.
- get_opt(raw_ft, out, **kwargs)¶
Get the optimized value of the calibrated feature.
- Parameters
raw_ft (float) – The calibrated feature.
out (mxnet.NDArray) – The original data from which the raw_ft is calibrated.
- Returns
ret – The optimized feature.
- Return type
float
- static list_supported_quant_types()¶
List the supported quantizer types.
- class mrt.V2.tfm_types.HVOptimizor(**attrs)¶
Generalized historical value optimizor
- class mrt.V2.tfm_types.MAOptimizor(**attrs)¶
Generalized moving average optimizor
- class mrt.V2.tfm_types.KLDOptimizor(**attrs)¶
KL divergence optimizor for AFeature
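The historical value and moving average strategies can be sketched as update rules applied to a feature sampled batch by batch. The function names, the `prev`/`current` arguments and the weight `c` are assumptions for illustration; the actual optimizors take their settings through `**attrs` and may combine values differently.

```python
# Hypothetical update rules; not the MRT optimizor implementations.
def historical_value(prev, current):
    """Keep the largest value observed so far."""
    return current if prev is None else max(prev, current)

def moving_average(prev, current, c=0.5):
    """Exponential moving average with weight `c` on the new sample."""
    return current if prev is None else (1 - c) * prev + c * current

hv, ma = None, None
for batch_absmax in [2.0, 5.0, 3.0]:  # absmax sampled from three batches
    hv = historical_value(hv, batch_absmax)
    ma = moving_average(ma, batch_absmax)
```

The historical value never shrinks, so a single outlier batch fixes the threshold, while the moving average lets later batches pull it back down.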
Graph API¶
Collection of MRT GEN pass functions. Stage-level symbol pass designation for MRT. Compatible with MRT architecture.
- mrt.V2.tfm_pass.sym_config_infos(symbol, params, cfg_dict={}, logger=logging)¶
Customized graph-level topo pass definition.
Interface for MRT main2 configuration. Creates customized samplers and optimizors.
Use it just before calibration.
- mrt.V2.tfm_pass.deserialize(cfg_groups)¶
Interface for MRT main2 configuration
Check the validity and compatibility of feature, sampler and optimizor configurations.
- Parameters
cfg_groups (dict) – configuration information (quantizer type, optimizor information) maps to node names (before calibration).
- mrt.V2.tfm_pass.sym_calibrate(symbol, params, data, cfg_dict, **kwargs)¶
Customized graph-level topo pass definition. Interface for MRT GEN Calibration.
- mrt.V2.tfm_pass.sym_separate_pad(symbol, params)¶
Separate pad attribute as an independent symbol in rewrite stage.
- mrt.V2.tfm_pass.sym_separate_bias(symbol, params)¶
Separate bias attribute as an independent symbol in rewrite stage.
- mrt.V2.tfm_pass.sym_slice_channel(symbol, params, cfg_dict={})¶
Customized graph-level topo pass definition.
Interface for granularity control. Features are layer-wise by default; MRT also supports channel-wise features, as specified in cfg_dict.
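The difference between the two granularities can be seen by computing an absmax statistic over a small weight laid out as a list of channels. This is a sketch with hypothetical helper names and does not show how `cfg_dict` selects the granularity.

```python
# Hypothetical helpers; each inner list stands for one channel.
def layer_wise_absmax(tensor):
    """One statistic for the whole layer."""
    return max(abs(v) for channel in tensor for v in channel)

def channel_wise_absmax(tensor):
    """One statistic per channel."""
    return [max(abs(v) for v in channel) for channel in tensor]

weights = [[0.1, -0.4], [2.0, -1.0], [-0.05, 0.02]]
per_layer = layer_wise_absmax(weights)
per_channel = channel_wise_absmax(weights)
```

With a single layer-wise statistic the small third channel is quantized against a range forty times larger than it needs, which is the motivation for channel-wise features.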
- mrt.V2.tfm_pass.quantize(symbol, params, features, precs, buffers, cfg_dict, op_input_precs, restore_names, shift_bits, softmax_lambd)¶
Customized graph-level topo pass definition. Interface for MRT GEN Quantization.
- Parameters
symbol (mxnet.symbol) – the grouped output symbol representing the graph to be quantized.
params (dict) – symbol name maps to mxnet.NDArray, representing the graph parameters
features (dict) – symbol name maps to mrt.V2.Feature
precs (dict) – symbol name maps to precision dict
buffers (dict) – symbol name maps to mrt.V2.Buffer
cfg_dict (dict) – symbol name maps to configuration dict
op_input_precs (dict) – symbol name maps to input precision
restore_names (set) – set of symbol names representing symbols to be restored
shift_bits (int) – hyperparameter for quantize precision control
softmax_lambd (float) – hyperparameter for feature optimization
Transformer API¶
Customized Symbolic Pass Interfaces. Base passes with default operation settings for MRT GEN. Collection of transformer management functions.
- class mrt.V2.tfm_base.Transformer¶
Generalized transformer which provides default slice_channel and quantize interfaces for specific ops. Other default transformer interfaces, such as fuse_transpose, are inherited.
- quantize(op, **kwargs)¶
Generalized version of operator quantization.
- slice_channel(op, **kwargs)¶
Operators are split into multiple channels for quantization at channel-wise granularity.
Do nothing by default.
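The "do nothing by default, override per operator" pattern can be sketched with two stand-in classes. All names below, including the `num_channels` keyword and the string-based operator, are hypothetical; the real `Transformer` works on MXNet symbols and is registered per operator type.

```python
class TransformerSketch:
    """Hypothetical base class: slice_channel leaves the operator as-is."""

    def slice_channel(self, op, **kwargs):
        return op

class ConvSketch(TransformerSketch):
    """Hypothetical override that splits an operator into channel pieces."""

    def slice_channel(self, op, **kwargs):
        num_channels = kwargs.get("num_channels", 1)
        return [f"{op}_ch{i}" for i in range(num_channels)]

unchanged = TransformerSketch().slice_channel("relu0")
split = ConvSketch().slice_channel("conv0", num_channels=3)
```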