convert_fx

torch.quantization.quantize_fx.convert_fx(graph_module, is_reference=False, convert_custom_config_dict=None, _remove_qconfig=True, qconfig_dict=None, backend_config_dict=None)

Convert a calibrated or trained model to a quantized model.
Parameters
graph_module – A prepared and calibrated/trained model (GraphModule)
is_reference – flag for whether to produce a reference quantized model, which serves as a common interface between PyTorch quantization and other backends such as accelerators
convert_custom_config_dict – dictionary of custom configurations for the convert function (a usage sketch is given after the parameter list):

    convert_custom_config_dict = {
        # user will manually define the corresponding quantized
        # module class which has a from_observed class method that converts
        # the observed custom module to the quantized custom module
        "observed_to_quantized_custom_module_class": {
            "static": {
                ObservedCustomModule: QuantizedCustomModule
            },
            "dynamic": {
                ObservedCustomModule: QuantizedCustomModule
            },
            "weight_only": {
                ObservedCustomModule: QuantizedCustomModule
            }
        },
        # Attributes that are not used in the forward function will
        # be removed when constructing GraphModule; this is a list of attributes
        # to preserve as attributes of the GraphModule even when they are
        # not used in the code
        "preserved_attributes": ["preserved_attr"],
    }
_remove_qconfig – option to remove the qconfig attributes from the model after convert.
qconfig_dict – a qconfig_dict with either the same keys as the qconfig_dict passed to the prepare_fx API (with the same values or None), or additional keys with values set to None.

For each entry whose value is set to None, we skip quantizing that entry in the model (a usage sketch is given below):

    qconfig_dict = {
        # used for object_type, skip quantizing torch.nn.functional.add
        "object_type": [
            (torch.nn.functional.add, None),
            (torch.nn.functional.linear, qconfig_from_prepare),
            ...,
        ],
        # used for module names, skip quantizing "foo.bar"
        "module_name": [
            ("foo.bar", None),
            ...,
        ],
    }
backend_config_dict – a configuration for the backend which describes how operators should be quantized in the backend. This includes quantization mode support (static/dynamic/weight_only), dtype support (quint8/qint8, etc.), and observer placement for each operator and for fused operators. Detailed documentation can be found in torch/ao/quantization/backend_config/README.md
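The custom module mapping described under convert_custom_config_dict can be exercised roughly as follows. This is a minimal sketch rather than the library's own example: ObservedCustomModule and QuantizedCustomModule are hypothetical classes, and the from_observed body simply reuses the observed submodule, whereas a real conversion would typically build quantized submodules from the collected observer statistics:

    import torch

    # Hypothetical observed custom module produced by prepare_fx
    # (registered earlier via prepare_custom_config_dict).
    class ObservedCustomModule(torch.nn.Module):
        def __init__(self, linear):
            super().__init__()
            self.linear = linear

        def forward(self, x):
            return self.linear(x)

    # Hypothetical quantized counterpart; convert_fx calls from_observed()
    # to swap each observed instance for an instance of this class.
    class QuantizedCustomModule(torch.nn.Module):
        def __init__(self, linear):
            super().__init__()
            self.linear = linear

        def forward(self, x):
            return self.linear(x)

        @classmethod
        def from_observed(cls, observed_module):
            # In a real implementation, construct quantized submodules here
            # using the observers attached during calibration.
            return cls(observed_module.linear)

    convert_custom_config_dict = {
        "observed_to_quantized_custom_module_class": {
            "static": {ObservedCustomModule: QuantizedCustomModule},
        },
    }
    quantized_model = convert_fx(
        prepared_model, convert_custom_config_dict=convert_custom_config_dict)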
Returns
A quantized model (GraphModule)
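As a usage sketch of the qconfig_dict argument, the snippet below keeps the submodule "foo.bar" (the name used in the example above) in floating point at convert time while every other prepared entry is quantized as usual; prepared_model is assumed to come from prepare_fx plus calibration:

    from torch.quantization.quantize_fx import convert_fx

    # Value None means "skip quantizing this entry"; entries not listed
    # keep the qconfig they were prepared with.
    convert_qconfig_dict = {
        "module_name": [
            ("foo.bar", None),
        ],
    }
    quantized_model = convert_fx(prepared_model, qconfig_dict=convert_qconfig_dict)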
Example:
    # prepared_model: the model after prepare_fx/prepare_qat_fx and
    # calibration/training
    quantized_model = convert_fx(prepared_model)
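For context, a fuller sketch of the surrounding post-training static quantization workflow is shown below; float_model and calibration_data are placeholders for the user's own model and representative inputs, and the "fbgemm" qconfig is only one possible choice:

    import torch
    from torch.quantization import get_default_qconfig
    from torch.quantization.quantize_fx import prepare_fx, convert_fx

    float_model.eval()                                   # post-training static quantization
    qconfig_dict = {"": get_default_qconfig("fbgemm")}   # qconfig for the whole model

    prepared_model = prepare_fx(float_model, qconfig_dict)

    # calibrate with a few batches of representative data
    with torch.no_grad():
        for example_inputs in calibration_data:
            prepared_model(example_inputs)

    quantized_model = convert_fx(prepared_model)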