CLIP

deepke.name_entity_re.multimodal.models.clip.configuration_clip module

CLIP model configuration

class deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPTextConfig(vocab_size=49408, hidden_size=512, intermediate_size=2048, num_hidden_layers=12, num_attention_heads=8, max_position_embeddings=77, hidden_act='quick_gelu', layer_norm_eps=1e-05, dropout=0.0, attention_dropout=0.0, initializer_range=0.02, initializer_factor=1.0, pad_token_id=1, bos_token_id=0, eos_token_id=2, **kwargs)[source]

Bases: transformers.configuration_utils.PretrainedConfig

This is the configuration class to store the configuration of a [CLIPModel]. It is used to instantiate a CLIP model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a similar configuration to that of the CLIP [openai/clip-vit-base-patch32](https://huggingface.co/openai/clip-vit-base-patch32) architecture.

Configuration objects inherit from [PretrainedConfig] and can be used to control the model outputs. Read the documentation from [PretrainedConfig] for more information.

Parameters
  • vocab_size (int, optional, defaults to 49408) – Vocabulary size of the CLIP text model. Defines the number of different tokens that can be represented by the inputs_ids passed when calling [CLIPModel].

  • hidden_size (int, optional, defaults to 512) – Dimensionality of the encoder layers and the pooler layer.

  • intermediate_size (int, optional, defaults to 2048) – Dimensionality of the “intermediate” (i.e., feed-forward) layer in the Transformer encoder.

  • num_hidden_layers (int, optional, defaults to 12) – Number of hidden layers in the Transformer encoder.

  • num_attention_heads (int, optional, defaults to 8) – Number of attention heads for each attention layer in the Transformer encoder.

  • max_position_embeddings (int, optional, defaults to 77) – The maximum sequence length that this model might ever be used with. Typically set this to something large just in case (e.g., 512 or 1024 or 2048).

  • hidden_act (str or function, optional, defaults to “quick_gelu”) – The non-linear activation function (function or string) in the encoder and pooler. If string, “gelu”, “relu”, “selu”, “gelu_new” and “quick_gelu” are supported.

  • layer_norm_eps (float, optional, defaults to 1e-5) – The epsilon used by the layer normalization layers.

  • attention_dropout (float, optional, defaults to 0.0) – The dropout ratio for the attention probabilities.

  • dropout (float, optional, defaults to 0.0) – The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.

  • initializer_range (float, optional, defaults to 0.02) – The standard deviation of the truncated_normal_initializer for initializing all weight matrices.

  • initializer_factor (float, optional, defaults to 1) – A factor for initializing all weight matrices (should be kept to 1, used internally for initialization testing).

Example:

```python
>>> from transformers import CLIPTextModel, CLIPTextConfig

>>> # Initializing a CLIPTextConfig with openai/clip-vit-base-patch32 style configuration
>>> configuration = CLIPTextConfig()
>>> # Initializing a CLIPTextModel (with random weights) from the openai/clip-vit-base-patch32 style configuration
>>> model = CLIPTextModel(configuration)
>>> # Accessing the model configuration
>>> configuration = model.config
```
model_type: str = 'clip_text_model'
class deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPVisionConfig(hidden_size=768, intermediate_size=3072, num_hidden_layers=12, num_attention_heads=12, image_size=224, patch_size=32, hidden_act='quick_gelu', layer_norm_eps=1e-05, dropout=0.0, attention_dropout=0.0, initializer_range=0.02, initializer_factor=1.0, **kwargs)[source]

Bases: transformers.configuration_utils.PretrainedConfig

This is the configuration class to store the configuration of a [CLIPModel]. It is used to instantiate a CLIP model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a similar configuration to that of the CLIP [openai/clip-vit-base-patch32](https://huggingface.co/openai/clip-vit-base-patch32) architecture.

Configuration objects inherit from [PretrainedConfig] and can be used to control the model outputs. Read the documentation from [PretrainedConfig] for more information.

Parameters
  • hidden_size (int, optional, defaults to 768) – Dimensionality of the encoder layers and the pooler layer.

  • intermediate_size (int, optional, defaults to 3072) – Dimensionality of the “intermediate” (i.e., feed-forward) layer in the Transformer encoder.

  • num_hidden_layers (int, optional, defaults to 12) – Number of hidden layers in the Transformer encoder.

  • num_attention_heads (int, optional, defaults to 12) – Number of attention heads for each attention layer in the Transformer encoder.

  • image_size (int, optional, defaults to 224) – The size (resolution) of each image.

  • patch_size (int, optional, defaults to 32) – The size (resolution) of each patch.

  • hidden_act (str or function, optional, defaults to “quick_gelu”) – The non-linear activation function (function or string) in the encoder and pooler. If string, “gelu”, “relu”, “selu”, “gelu_new” and “quick_gelu” are supported.

  • layer_norm_eps (float, optional, defaults to 1e-5) – The epsilon used by the layer normalization layers.

  • dropout (float, optional, defaults to 0.0) – The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.

  • attention_dropout (float, optional, defaults to 0.0) – The dropout ratio for the attention probabilities.

  • initializer_range (float, optional, defaults to 0.02) – The standard deviation of the truncated_normal_initializer for initializing all weight matrices.

  • initializer_factor (float, optional, defaults to 1) – A factor for initializing all weight matrices (should be kept to 1, used internally for initialization testing).

Example:

```python
>>> from transformers import CLIPVisionModel, CLIPVisionConfig

>>> # Initializing a CLIPVisionConfig with openai/clip-vit-base-patch32 style configuration
>>> configuration = CLIPVisionConfig()
>>> # Initializing a CLIPVisionModel (with random weights) from the openai/clip-vit-base-patch32 style configuration
>>> model = CLIPVisionModel(configuration)
>>> # Accessing the model configuration
>>> configuration = model.config
```
model_type: str = 'clip_vision_model'
class deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPConfig(text_config_dict=None, vision_config_dict=None, projection_dim=512, logit_scale_init_value=2.6592, **kwargs)[source]

Bases: transformers.configuration_utils.PretrainedConfig

[CLIPConfig] is the configuration class to store the configuration of a [CLIPModel]. It is used to instantiate a CLIP model according to the specified arguments, defining the text model and vision model configs.

Configuration objects inherit from [PretrainedConfig] and can be used to control the model outputs. Read the documentation from [PretrainedConfig] for more information.

Parameters
  • text_config_dict (dict, optional) – Dictionary of configuration options used to initialize [CLIPTextConfig].

  • vision_config_dict (dict, optional) – Dictionary of configuration options used to initialize [CLIPVisionConfig].

  • projection_dim (int, optional, defaults to 512) – Dimensionality of text and vision projection layers.

  • logit_scale_init_value (float, optional, defaults to 2.6592) – The initial value of the logit_scale parameter. Default is used as per the original CLIP implementation.

  • kwargs (optional) – Dictionary of keyword arguments.

model_type: str = 'clip'
is_composition = True
classmethod from_text_vision_configs(text_config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPTextConfig, vision_config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPVisionConfig, **kwargs)[source]

Instantiate a [CLIPConfig] (or a derived class) from clip text model configuration and clip vision model configuration.

Returns

An instance of a configuration object

Return type

[CLIPConfig]
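
Since [CLIPConfig] itself has no example in this reference, here is a minimal sketch of composing one from separate text and vision configurations, assuming the vendored classes mirror the transformers CLIP configuration API:

```python
>>> from deepke.name_entity_re.multimodal.models.clip.configuration_clip import (
...     CLIPConfig, CLIPTextConfig, CLIPVisionConfig,
... )

>>> # Text and vision configs with their default (clip-vit-base-patch32 style) values
>>> text_config = CLIPTextConfig()
>>> vision_config = CLIPVisionConfig()

>>> # Combine them into a single CLIPConfig
>>> configuration = CLIPConfig.from_text_vision_configs(text_config, vision_config, projection_dim=512)
>>> configuration.model_type
'clip'
```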

to_dict()[source]

Serializes this instance to a Python dictionary. Overrides the default [~PretrainedConfig.to_dict].

Returns

Dictionary of all the attributes that make up this configuration instance.

Return type

Dict[str, any]

deepke.name_entity_re.multimodal.models.clip.feature_extraction_clip module

Feature extractor class for CLIP.

class deepke.name_entity_re.multimodal.models.clip.feature_extraction_clip.CLIPFeatureExtractor(do_resize=True, size=224, resample=Resampling.BICUBIC, do_center_crop=True, crop_size=224, do_normalize=True, image_mean=None, image_std=None, **kwargs)[source]

Bases: deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils.FeatureExtractionMixin, deepke.name_entity_re.multimodal.models.clip.image_utils.ImageFeatureExtractionMixin

Constructs a CLIP feature extractor.

This feature extractor inherits from [FeatureExtractionMixin] which contains most of the main methods. Users should refer to this superclass for more information regarding those methods.

Parameters
  • do_resize (bool, optional, defaults to True) – Whether to resize the input to a certain size.

  • size (int, optional, defaults to 224) – Resize the input to the given size. Only has an effect if do_resize is set to True.

  • resample (int, optional, defaults to PIL.Image.BICUBIC) – An optional resampling filter. This can be one of PIL.Image.NEAREST, PIL.Image.BOX, PIL.Image.BILINEAR, PIL.Image.HAMMING, PIL.Image.BICUBIC or PIL.Image.LANCZOS. Only has an effect if do_resize is set to True.

  • do_center_crop (bool, optional, defaults to True) – Whether to crop the input at the center. If the input size is smaller than crop_size along any edge, the image is padded with 0’s and then center cropped.

  • crop_size (int, optional, defaults to 224) – Desired output size when applying center-cropping. Only has an effect if do_center_crop is set to True.

  • do_normalize (bool, optional, defaults to True) – Whether or not to normalize the input with image_mean and image_std.

  • image_mean (List[float], optional, defaults to [0.485, 0.456, 0.406]) – The sequence of means for each channel, to be used when normalizing images.

  • image_std (List[float], optional, defaults to [0.229, 0.224, 0.225]) – The sequence of standard deviations for each channel, to be used when normalizing images.

model_input_names = ['pixel_values']
center_crop(image, size)[source]

Crops image to the given size using a center crop. Note that if the image is too small to be cropped to the given size, it will be padded (so the returned result has the size asked).

Parameters
  • image (PIL.Image.Image or np.ndarray or torch.Tensor) – The image to resize.

  • size (int or Tuple[int, int]) – The size to which to crop the image.

resize(image, size, resample=Resampling.BICUBIC)[source]

Resizes image. Note that this will trigger a conversion of image to a PIL Image.

Parameters
  • image (PIL.Image.Image or np.ndarray or torch.Tensor) – The image to resize.

  • size (int or Tuple[int, int]) – The size to use for resizing the image. If an int, the shorter side of the image will be resized to this value.

  • resample (int, optional, defaults to PIL.Image.BILINEAR) – The filter to use for resampling.
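
A minimal preprocessing sketch using the documented methods; the image path is hypothetical and PIL is assumed to be installed:

```python
>>> from PIL import Image
>>> from deepke.name_entity_re.multimodal.models.clip.feature_extraction_clip import CLIPFeatureExtractor

>>> feature_extractor = CLIPFeatureExtractor()              # size=224, crop_size=224 by default
>>> image = Image.open("example.jpg")                       # hypothetical local image
>>> image = feature_extractor.resize(image, size=224)       # shorter side resized to 224
>>> image = feature_extractor.center_crop(image, size=224)  # 224 x 224 center crop
>>> pixel_values = feature_extractor.to_numpy_array(image)  # channel-first NumPy array
```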

deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils module

Feature extraction saving/loading class for common feature extractors.

deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils.is_offline_mode()[source]
class deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils.ExplicitEnum(value)[source]

Bases: enum.Enum

Enum with more explicit error message for missing values.

class deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils.TensorType(value)[source]

Bases: deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils.ExplicitEnum

Possible values for the return_tensors argument in [PreTrainedTokenizerBase.__call__]. Useful for tab-completion in an IDE.

PYTORCH = 'pt'
TENSORFLOW = 'tf'
NUMPY = 'np'
JAX = 'jax'
class deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils.BatchFeature(data: Optional[Dict[str, Any]] = None, tensor_type: Union[None, str, deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils.TensorType] = None)[source]

Bases: collections.UserDict

Holds the output of the pad() and feature extractor specific __call__ methods.

This class is derived from a python dictionary and can be used as a dictionary.

Parameters
  • data (dict) – Dictionary of lists/arrays/tensors returned by the __call__/pad methods (‘input_values’, ‘attention_mask’, etc.).

  • tensor_type (Union[None, str, TensorType], optional) – You can give a tensor_type here to convert the lists of integers in PyTorch/TensorFlow/Numpy Tensors at initialization.

keys() a set-like object providing a view on D's keys[source]
values() an object providing a view on D's values[source]
items() a set-like object providing a view on D's items[source]
convert_to_tensors(tensor_type: Union[None, str, deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils.TensorType] = None)[source]

Convert the inner content to tensors.

Parameters

tensor_type (str or TensorType, optional) – The type of tensors to use. If str, should be one of the values of the enum TensorType. If None, no modification is done.

to(device: Union[str, torch.device]) BatchFeature[source]

Send all values to device by calling v.to(device) (PyTorch only).

Parameters

device (str or torch.device) – The device to put the tensors on.

Returns

The same instance after modification.

Return type

BatchFeature
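
A short sketch of building a BatchFeature and converting its contents to PyTorch tensors; the key name is illustrative:

```python
>>> from deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils import BatchFeature, TensorType

>>> # Wrap plain Python lists and convert them to torch tensors at construction time
>>> batch = BatchFeature(data={"pixel_values": [[0.1, 0.2], [0.3, 0.4]]}, tensor_type=TensorType.PYTORCH)
>>> batch["pixel_values"].shape
torch.Size([2, 2])

>>> # Move every tensor in the batch to a device (PyTorch only)
>>> batch = batch.to("cpu")
```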

class deepke.name_entity_re.multimodal.models.clip.feature_extraction_utils.FeatureExtractionMixin(**kwargs)[source]

Bases: object

This is a feature extraction mixin used to provide saving/loading functionality for sequential and image feature extractors.

classmethod from_pretrained(pretrained_model_name_or_path: Union[str, os.PathLike], **kwargs) SequenceFeatureExtractor[source]

Instantiate a type of FeatureExtractionMixin from a feature extractor, e.g. a derived class of SequenceFeatureExtractor.

Parameters
  • pretrained_model_name_or_path (str or os.PathLike) –

    This can be either:

    • a string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.

    • a path to a directory containing a feature extractor file saved using the save_pretrained() method, e.g., ./my_model_directory/.

    • a path or url to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.

  • cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.

  • force_download (bool, optional, defaults to False) – Whether or not to force to (re-)download the feature extractor files and override the cached versions if they exist.

  • resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Attempts to resume the download if such a file exists.

  • proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • use_auth_token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).

  • revision (str, optional, defaults to "main") – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

  • return_unused_kwargs (bool, optional, defaults to False) – If False, then this function returns just the final feature extractor object. If True, then this function returns a Tuple(feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.

  • kwargs (Dict[str, Any], optional) – The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.

Note

Passing use_auth_token=True is required when you want to use a private model.

Returns

A feature extractor of type FeatureExtractionMixin.

Examples:

# We can't instantiate directly the base class `FeatureExtractionMixin` nor `SequenceFeatureExtractor` so let's show the examples on a
# derived class: `Wav2Vec2FeatureExtractor`
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained('facebook/wav2vec2-base-960h')    # Download feature_extraction_config from huggingface.co and cache.
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained('./test/saved_model/')  # E.g. feature_extractor (or model) was saved using `save_pretrained('./test/saved_model/')`
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained('./test/saved_model/preprocessor_config.json')
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained('facebook/wav2vec2-base-960h', return_attention_mask=False, foo=False)
assert feature_extractor.return_attention_mask is False
feature_extractor, unused_kwargs = Wav2Vec2FeatureExtractor.from_pretrained('facebook/wav2vec2-base-960h', return_attention_mask=False,
                                                   foo=False, return_unused_kwargs=True)
assert feature_extractor.return_attention_mask is False
assert unused_kwargs == {'foo': False}
save_pretrained(save_directory: Union[str, os.PathLike])[source]

Save a feature_extractor object to the directory save_directory, so that it can be re-loaded using the from_pretrained() class method.

Parameters

save_directory (str or os.PathLike) – Directory where the feature extractor JSON file will be saved (will be created if it does not exist).

classmethod get_feature_extractor_dict(pretrained_model_name_or_path: Union[str, os.PathLike], **kwargs) Tuple[Dict[str, Any], Dict[str, Any]][source]

From a pretrained_model_name_or_path, resolve to a dictionary of parameters, to be used for instantiating a feature extractor of type FeatureExtractionMixin using from_dict.

Parameters

pretrained_model_name_or_path (str or os.PathLike) – The identifier of the pre-trained checkpoint from which we want the dictionary of parameters.

Returns

The dictionary(ies) that will be used to instantiate the feature extractor object.

Return type

Tuple[Dict, Dict]

classmethod from_dict(feature_extractor_dict: Dict[str, Any], **kwargs) SequenceFeatureExtractor[source]

Instantiates a type of FeatureExtractionMixin from a Python dictionary of parameters.

Parameters
  • feature_extractor_dict (Dict[str, Any]) – Dictionary that will be used to instantiate the feature extractor object. Such a dictionary can be retrieved from a pretrained checkpoint by leveraging the to_dict() method.

  • kwargs (Dict[str, Any]) – Additional parameters from which to initialize the feature extractor object.

Returns

The feature extractor object instantiated from those parameters.

Return type

FeatureExtractionMixin

to_dict() Dict[str, Any][source]

Serializes this instance to a Python dictionary.

Returns

Dictionary of all the attributes that make up this feature extractor instance.

Return type

Dict[str, Any]

classmethod from_json_file(json_file: Union[str, os.PathLike]) SequenceFeatureExtractor[source]

Instantiates a feature extractor of type FeatureExtractionMixin from the path to a JSON file of parameters.

Parameters

json_file (str or os.PathLike) – Path to the JSON file containing the parameters.

Returns

The feature_extractor object instantiated from that JSON file.

Return type

A feature extractor of type FeatureExtractionMixin

to_json_string() str[source]

Serializes this instance to a JSON string.

Returns

String containing all the attributes that make up this feature_extractor instance in JSON format.

Return type

str

to_json_file(json_file_path: Union[str, os.PathLike])[source]

Save this instance to a JSON file.

Parameters

json_file_path (str or os.PathLike) – Path to the JSON file in which this feature_extractor instance’s parameters will be saved.
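
A hedged round-trip sketch of these saving/loading helpers, using the vendored CLIPFeatureExtractor as a concrete subclass; the target directory is hypothetical:

```python
>>> from deepke.name_entity_re.multimodal.models.clip.feature_extraction_clip import CLIPFeatureExtractor

>>> feature_extractor = CLIPFeatureExtractor(size=224, crop_size=224)

>>> # Serialize to a dict and a JSON string, then write the config to a directory
>>> config_dict = feature_extractor.to_dict()
>>> json_string = feature_extractor.to_json_string()
>>> feature_extractor.save_pretrained("./my_feature_extractor/")

>>> # Reload either from the saved directory or from the raw dict
>>> reloaded = CLIPFeatureExtractor.from_pretrained("./my_feature_extractor/")
>>> rebuilt = CLIPFeatureExtractor.from_dict(config_dict)
```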

deepke.name_entity_re.multimodal.models.clip.file_utils module

Utilities for working with the local dataset cache. Parts of this file are adapted from the AllenNLP library at https://github.com/allenai/allennlp.

class deepke.name_entity_re.multimodal.models.clip.file_utils.EmptyTqdm(*args, **kwargs)[source]

Bases: object

Dummy tqdm which doesn’t do anything.

deepke.name_entity_re.multimodal.models.clip.file_utils.is_offline_mode()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_torch_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_pyctcdecode_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_librosa_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_torch_cuda_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_torch_bf16_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_torch_tf32_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_torch_fx_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_torch_onnx_dict_inputs_support_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_tf_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_coloredlogs_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_tf2onnx_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_onnx_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_flax_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_torch_tpu_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_datasets_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_detectron2_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_rjieba_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_psutil_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_py3nvml_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_apex_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_faiss_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_scipy_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_sklearn_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_sentencepiece_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_protobuf_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_tokenizers_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_vision_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_pytesseract_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_spacy_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_ftfy_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_in_notebook()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_scatter_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_pytorch_quantization_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_tensorflow_probability_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_pandas_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_sagemaker_dp_enabled()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_sagemaker_mp_enabled()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_training_run_on_sagemaker()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_soundfile_availble()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_timm_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_torchaudio_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_speech_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_phonemizer_available()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.torch_only_method(fn)[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.requires_backends(obj, backends)[source]
class deepke.name_entity_re.multimodal.models.clip.file_utils.DummyObject[source]

Bases: type

Metaclass for the dummy objects. Any class inheriting from it will return the ImportError generated by requires_backends each time a user tries to access any method of that class.

deepke.name_entity_re.multimodal.models.clip.file_utils.add_start_docstrings(*docstr)[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.add_start_docstrings_to_model_forward(*docstr)[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.add_end_docstrings(*docstr)[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.add_code_sample_docstrings(*docstr, processor_class=None, checkpoint=None, output_type=None, config_class=None, mask='[MASK]', model_cls=None, modality=None, expected_output='', expected_loss='')[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.replace_return_docstrings(output_type=None, config_class=None)[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_remote_url(url_or_filename)[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.hf_bucket_url(model_id: str, filename: str, subfolder: Optional[str] = None, revision: Optional[str] = None, mirror=None) str[source]

Resolve a model identifier, a file name, and an optional revision id, to a huggingface.co-hosted url, redirecting to Cloudfront (a Content Delivery Network, or CDN) for large files. Cloudfront is replicated over the globe so downloads are way faster for the end user (and it also lowers our bandwidth costs). Cloudfront aggressively caches files by default (default TTL is 24 hours), however this is not an issue here because we migrated to a git-based versioning system on huggingface.co, so we now store the files on S3/Cloudfront in a content-addressable way (i.e., the file name is its hash). Using content-addressable filenames means cache can’t ever be stale. In terms of client-side caching from this library, we base our caching on the objects’ ETag. An object’s ETag is: its sha1 if stored in git, or its sha256 if stored in git-lfs. Files cached locally from transformers before v3.5.0 are not shared with those new files, because the cached file’s name contains a hash of the url (which changed).

deepke.name_entity_re.multimodal.models.clip.file_utils.url_to_filename(url: str, etag: Optional[str] = None) str[source]

Convert url into a hashed filename in a repeatable way. If etag is specified, append its hash to the url’s, delimited by a period. If the url ends with .h5 (Keras HDF5 weights), ‘.h5’ is appended to the name so that TF 2.0 can identify it as an HDF5 file (see https://github.com/tensorflow/tensorflow/blob/00fad90125b18b80fe054de1055770cfb8fe4ba3/tensorflow/python/keras/engine/network.py#L1380)

deepke.name_entity_re.multimodal.models.clip.file_utils.filename_to_url(filename, cache_dir=None)[source]

Return the url and etag (which may be None) stored for filename. Raise EnvironmentError if filename or its stored metadata do not exist.

deepke.name_entity_re.multimodal.models.clip.file_utils.get_cached_models(cache_dir: Optional[Union[str, pathlib.Path]] = None) List[Tuple][source]

Returns a list of tuples representing model binaries that are cached locally. Each tuple has shape (model_url, etag, size_MB). Filenames in cache_dir are used to get the metadata for each model; only urls ending with .bin are added.

Parameters

cache_dir (Union[str, Path], optional) – The cache directory to search for models within. Will default to the transformers cache if unset.

Returns

List of tuples each with shape (model_url, etag, size_MB)

Return type

List[Tuple]

deepke.name_entity_re.multimodal.models.clip.file_utils.cached_path(url_or_filename, cache_dir=None, force_download=False, proxies=None, resume_download=False, user_agent: Optional[Union[Dict, str]] = None, extract_compressed_file=False, force_extract=False, use_auth_token: Optional[Union[bool, str]] = None, local_files_only=False) Optional[str][source]

Given something that might be a URL (or might be a local path), determine which. If it’s a URL, download the file and cache it, and return the path to the cached file. If it’s already a local path, make sure the file exists and then return the path.

Parameters
  • cache_dir – specify a cache directory to save the file to (overwrite the default cache dir).

  • force_download – if True, re-download the file even if it’s already cached in the cache dir.

  • resume_download – if True, resume the download if an incompletely received file is found.

  • user_agent – Optional string or dict that will be appended to the user-agent on remote requests.

  • use_auth_token – Optional string or boolean to use as Bearer token for remote files. If True, will get the token from ~/.huggingface.

  • extract_compressed_file – if True and the path points to a zip or tar file, extract the compressed file in a folder along the archive.

  • force_extract – if True when extract_compressed_file is True and the archive was already extracted, re-extract the archive and override the folder where it was extracted.

Returns

Local path (string) of file or if networking is off, last version of file cached on disk.

Raises

In case of non-recoverable file (non-existent or inaccessible url + no cache on disk).
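
A minimal sketch of resolving a remote file to a locally cached copy; the URL follows the standard huggingface.co resolve scheme and is shown for illustration only:

```python
>>> from deepke.name_entity_re.multimodal.models.clip.file_utils import cached_path, is_remote_url

>>> url = "https://huggingface.co/openai/clip-vit-base-patch32/resolve/main/preprocessor_config.json"
>>> is_remote_url(url)
True

>>> # Downloads on the first call, then returns the cached file on later calls
>>> local_path = cached_path(url)

>>> # For an existing local path, the path itself is returned
>>> local_path = cached_path(local_path)
```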

deepke.name_entity_re.multimodal.models.clip.file_utils.define_sagemaker_information()[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.http_user_agent(user_agent: Optional[Union[Dict, str]] = None) str[source]

Formats a user-agent string with basic info about a request.

exception deepke.name_entity_re.multimodal.models.clip.file_utils.RepositoryNotFoundError(*args, **kwargs)[source]

Bases: requests.exceptions.HTTPError

Raised when trying to access a hf.co URL with an invalid repository name, or with a private repo name the user does not have access to.

exception deepke.name_entity_re.multimodal.models.clip.file_utils.EntryNotFoundError(*args, **kwargs)[source]

Bases: requests.exceptions.HTTPError

Raised when trying to access a hf.co URL with a valid repository and revision but an invalid filename.

exception deepke.name_entity_re.multimodal.models.clip.file_utils.RevisionNotFoundError(*args, **kwargs)[source]

Bases: requests.exceptions.HTTPError

Raised when trying to access a hf.co URL with a valid repository but an invalid revision.

deepke.name_entity_re.multimodal.models.clip.file_utils.http_get(url: str, temp_file: BinaryIO, proxies=None, resume_size=0, headers: Optional[Dict[str, str]] = None)[source]

Download remote file. Do not gobble up errors.

deepke.name_entity_re.multimodal.models.clip.file_utils.get_from_cache(url: str, cache_dir=None, force_download=False, proxies=None, etag_timeout=10, resume_download=False, user_agent: Optional[Union[Dict, str]] = None, use_auth_token: Optional[Union[bool, str]] = None, local_files_only=False) Optional[str][source]

Given a URL, look for the corresponding file in the local cache. If it’s not there, download it. Then return the path to the cached file.

Returns

Local path (string) of file or if networking is off, last version of file cached on disk.

Raises

In case of non-recoverable file (non-existent or inaccessible url + no cache on disk).

deepke.name_entity_re.multimodal.models.clip.file_utils.get_file_from_repo(path_or_repo: Union[str, os.PathLike], filename: str, cache_dir: Optional[Union[str, os.PathLike]] = None, force_download: bool = False, resume_download: bool = False, proxies: Optional[Dict[str, str]] = None, use_auth_token: Optional[Union[bool, str]] = None, revision: Optional[str] = None, local_files_only: bool = False)[source]

Tries to locate a file in a local folder and repo, downloads and caches it if necessary.

Parameters
  • path_or_repo (str or os.PathLike) –

    This can be either:

    • a string, the model id of a model repo on huggingface.co.

    • a path to a directory potentially containing the file.

  • filename (str) – The name of the file to locate in path_or_repo.

  • cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

  • force_download (bool, optional, defaults to False) – Whether or not to force to (re-)download the configuration files and override the cached versions if they exist.

  • resume_download (bool, optional, defaults to False) – Whether or not to delete incompletely received files. Attempts to resume the download if such a file exists.

  • proxies (Dict[str, str], optional) – A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.

  • use_auth_token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).

  • revision (str, optional, defaults to “main”) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

  • local_files_only (bool, optional, defaults to False) – If True, will only try to load the tokenizer configuration from local files.

Note

Passing use_auth_token=True is required when you want to use a private model.

Returns

The resolved file (to the cache folder if downloaded from a repo) or None if the file does not exist.

Return type

Optional[str]

Examples:

# Download a tokenizer configuration from huggingface.co and cache.
tokenizer_config = get_file_from_repo("bert-base-uncased", "tokenizer_config.json")
# This model does not have a tokenizer config so the result will be None.
tokenizer_config = get_file_from_repo("xlm-roberta-base", "tokenizer_config.json")

deepke.name_entity_re.multimodal.models.clip.file_utils.has_file(path_or_repo: Union[str, os.PathLike], filename: str, revision: Optional[str] = None, mirror: Optional[str] = None, proxies: Optional[Dict[str, str]] = None, use_auth_token: Optional[Union[bool, str]] = None)[source]

Checks if a repo contains a given file without downloading it. Works for remote repos and local folders.

Note

This function will raise an error if the repository path_or_repo is not valid or if revision does not exist for this repo, but will return False for regular connection errors.

deepke.name_entity_re.multimodal.models.clip.file_utils.get_list_of_files(path_or_repo: Union[str, os.PathLike], revision: Optional[str] = None, use_auth_token: Optional[Union[bool, str]] = None, local_files_only: bool = False) List[str][source]

Gets the list of files inside path_or_repo.

Parameters
  • path_or_repo (str or os.PathLike) – Can be either the id of a repo on huggingface.co or a path to a directory.

  • revision (str, optional) – The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.

  • use_auth_token (str or bool, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).

  • local_files_only (bool, optional, defaults to False) – Whether or not to only rely on local files and not to attempt to download any files.

Note

This API is not optimized, so calling it a lot may result in connection errors.

Returns

The list of files available in path_or_repo.

Return type

List[str]

class deepke.name_entity_re.multimodal.models.clip.file_utils.cached_property(fget=None, fset=None, fdel=None, doc=None)[source]

Bases: property

Descriptor that mimics @property but caches output in a member variable. Adapted from tensorflow_datasets; built into functools from Python 3.8.

deepke.name_entity_re.multimodal.models.clip.file_utils.torch_required(func)[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.tf_required(func)[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_torch_fx_proxy(x)[source]
deepke.name_entity_re.multimodal.models.clip.file_utils.is_tensor(x)[source]

Tests if x is a torch.Tensor, tf.Tensor, jaxlib.xla_extension.DeviceArray or np.ndarray.

deepke.name_entity_re.multimodal.models.clip.file_utils.to_py_obj(obj)[source]

Convert a TensorFlow tensor, PyTorch tensor, Numpy array or python list to a python list.

deepke.name_entity_re.multimodal.models.clip.file_utils.to_numpy(obj)[source]

Convert a TensorFlow tensor, PyTorch tensor, Numpy array or python list to a Numpy array.

class deepke.name_entity_re.multimodal.models.clip.file_utils.ModelOutput[source]

Bases: collections.OrderedDict

Base class for all model outputs as dataclass. Has a __getitem__ that allows indexing by integer or slice (like a tuple) or strings (like a dictionary) that will ignore the None attributes. Otherwise behaves like a regular python dictionary.

Note

You can’t unpack a ModelOutput directly. Use the [~file_utils.ModelOutput.to_tuple] method to convert it to a tuple first.

setdefault(*args, **kwargs)[source]

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

pop(k[, d])[source]

Remove the specified key and return the corresponding value. If the key is not found, d is returned if given, otherwise KeyError is raised.

update([E, ]**F)[source]

Update D from dict/iterable E and F. If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].

to_tuple() Tuple[Any][source]

Convert self to a tuple containing all the attributes/keys that are not None.
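
A brief sketch of the indexing behavior using a hypothetical dataclass subclass (the field names and values are illustrative only):

```python
>>> import torch
>>> from dataclasses import dataclass
>>> from typing import Optional
>>> from deepke.name_entity_re.multimodal.models.clip.file_utils import ModelOutput

>>> @dataclass
... class ToyOutput(ModelOutput):                # hypothetical subclass, for illustration only
...     last_hidden_state: torch.FloatTensor = None
...     attentions: Optional[torch.FloatTensor] = None

>>> out = ToyOutput(last_hidden_state=torch.ones(1, 3))
>>> out["last_hidden_state"] is out.last_hidden_state   # string and attribute access agree
True
>>> len(out.to_tuple())                                  # None attributes (attentions) are dropped
1
```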

class deepke.name_entity_re.multimodal.models.clip.file_utils.ExplicitEnum(value)[source]

Bases: enum.Enum

Enum with more explicit error message for missing values.

class deepke.name_entity_re.multimodal.models.clip.file_utils.PaddingStrategy(value)[source]

Bases: deepke.name_entity_re.multimodal.models.clip.file_utils.ExplicitEnum

Possible values for the padding argument in [PreTrainedTokenizerBase.__call__]. Useful for tab-completion in an IDE.

LONGEST = 'longest'
MAX_LENGTH = 'max_length'
DO_NOT_PAD = 'do_not_pad'
class deepke.name_entity_re.multimodal.models.clip.file_utils.TensorType(value)[source]

Bases: deepke.name_entity_re.multimodal.models.clip.file_utils.ExplicitEnum

Possible values for the return_tensors argument in [PreTrainedTokenizerBase.__call__]. Useful for tab-completion in an IDE.

PYTORCH = 'pt'
TENSORFLOW = 'tf'
NUMPY = 'np'
JAX = 'jax'
deepke.name_entity_re.multimodal.models.clip.file_utils.copy_func(f)[source]

Returns a copy of a function f.

deepke.name_entity_re.multimodal.models.clip.file_utils.is_local_clone(repo_path, repo_url)[source]

Checks if the folder in repo_path is a local clone of repo_url.

class deepke.name_entity_re.multimodal.models.clip.file_utils.PushToHubMixin[source]

Bases: object

A Mixin containing the functionality to push a model or tokenizer to the hub.

push_to_hub(repo_path_or_name: Optional[str] = None, repo_url: Optional[str] = None, use_temp_dir: bool = False, commit_message: Optional[str] = None, organization: Optional[str] = None, private: Optional[bool] = None, use_auth_token: Optional[Union[bool, str]] = None, **model_card_kwargs) str[source]

Upload the {object_files} to the 🤗 Model Hub while synchronizing a local clone of the repo in repo_path_or_name.

Parameters
  • repo_path_or_name (str, optional) – Can either be a repository name for your {object} in the Hub or a path to a local folder (in which case the repository will have the name of that local folder). If not specified, will default to the name given by repo_url and a local directory with that name will be created.

  • repo_url (str, optional) – Specify this in case you want to push to an existing repository in the hub. If unspecified, a new repository will be created in your namespace (unless you specify an organization) with repo_name.

  • use_temp_dir (bool, optional, defaults to False) – Whether or not to clone the distant repo in a temporary directory or in repo_path_or_name inside the current working directory. This will slow things down if you are making changes in an existing repo since you will need to clone the repo before every push.

  • commit_message (str, optional) – Message to commit while pushing. Will default to “add {object}”.

  • organization (str, optional) – Organization in which you want to push your {object} (you must be a member of this organization).

  • private (bool, optional) – Whether or not the repository created should be private (requires a paying subscription).

  • use_auth_token (bool or str, optional) – The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface). Will default to True if repo_url is not specified.

Returns

The url of the commit of your {object} in the given repository.

Return type

str

Examples:

from transformers import {object_class}

{object} = {object_class}.from_pretrained("bert-base-cased")

# Push the {object} to your namespace with the name "my-finetuned-bert" and have a local clone in the
# *my-finetuned-bert* folder.
{object}.push_to_hub("my-finetuned-bert")

# Push the {object} to your namespace with the name "my-finetuned-bert" with no local clone.
{object}.push_to_hub("my-finetuned-bert", use_temp_dir=True)

# Push the {object} to an organization with the name "my-finetuned-bert" and have a local clone in the
# *my-finetuned-bert* folder.
{object}.push_to_hub("my-finetuned-bert", organization="huggingface")

# Make a change to an existing repo that has been cloned locally in *my-finetuned-bert*.
{object}.push_to_hub("my-finetuned-bert", repo_url="https://huggingface.co/sgugger/my-finetuned-bert")

deepke.name_entity_re.multimodal.models.clip.file_utils.get_full_repo_name(model_id: str, organization: Optional[str] = None, token: Optional[str] = None)[source]
class deepke.name_entity_re.multimodal.models.clip.file_utils.ContextManagers(context_managers: List[AbstractContextManager])[source]

Bases: object

Wrapper for contextlib.ExitStack which enters a collection of context managers. Adaptation of ContextManagers in the fastcore library.

deepke.name_entity_re.multimodal.models.clip.image_utils module

deepke.name_entity_re.multimodal.models.clip.image_utils.is_torch_tensor(obj)[source]
class deepke.name_entity_re.multimodal.models.clip.image_utils.ImageFeatureExtractionMixin[source]

Bases: object

Mixin that contains utilities for preparing image features.

to_pil_image(image, rescale=None)[source]

Converts image to a PIL Image. Optionally rescales it and puts the channel dimension back as the last axis if needed.

Parameters
  • image (PIL.Image.Image or numpy.ndarray or torch.Tensor) – The image to convert to the PIL Image format.

  • rescale (bool, optional) – Whether or not to apply the scaling factor (to make pixel values integers between 0 and 255). Will default to True if the image type is a floating type, False otherwise.

to_numpy_array(image, rescale=None, channel_first=True)[source]

Converts image to a numpy array. Optionally rescales it and puts the channel dimension as the first dimension.

Parameters
  • image (PIL.Image.Image or np.ndarray or torch.Tensor) – The image to convert to a NumPy array.

  • rescale (bool, optional) – Whether or not to apply the scaling factor (to make pixel values floats between 0. and 1.). Will default to True if the image is a PIL Image or an array/tensor of integers, False otherwise.

  • channel_first (bool, optional, defaults to True) – Whether or not to permute the dimensions of the image to put the channel dimension first.

normalize(image, mean, std)[source]

Normalizes image with mean and std. Note that this will trigger a conversion of image to a NumPy array if it’s a PIL Image.

Parameters
  • image (PIL.Image.Image or np.ndarray or torch.Tensor) – The image to normalize.

  • mean (List[float] or np.ndarray or torch.Tensor) – The mean (per channel) to use for normalization.

  • std (List[float] or np.ndarray or torch.Tensor) – The standard deviation (per channel) to use for normalization.

resize(image, size, resample=Resampling.BILINEAR)[source]

Resizes image. Note that this will trigger a conversion of image to a PIL Image.

Parameters
  • image (PIL.Image.Image or np.ndarray or torch.Tensor) – The image to resize.

  • size (int or Tuple[int, int]) – The size to use for resizing the image.

  • resample (int, optional, defaults to PIL.Image.BILINEAR) – The filter to use for resampling.

center_crop(image, size)[source]

Crops image to the given size using a center crop. Note that if the image is too small to be cropped to the size given, it will be padded (so the returned result has the size asked).

Parameters
  • image (PIL.Image.Image or np.ndarray or torch.Tensor) – The image to resize.

  • size (int or Tuple[int, int]) – The size to which to crop the image.
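
A short sketch of the mixin’s conversion utilities on a synthetic NumPy image; the shapes, mean and std values are illustrative only:

```python
>>> import numpy as np
>>> from deepke.name_entity_re.multimodal.models.clip.image_utils import ImageFeatureExtractionMixin

>>> mixin = ImageFeatureExtractionMixin()
>>> image = np.random.rand(3, 32, 32).astype(np.float32)   # channel-first float image in [0, 1]

>>> normalized = mixin.normalize(image, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
>>> pil_image = mixin.to_pil_image(image)                   # rescales floats to 0-255 by default
>>> resized = mixin.resize(pil_image, size=(16, 16))
```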

deepke.name_entity_re.multimodal.models.clip.modeling_clip module

PyTorch CLIP model.

deepke.name_entity_re.multimodal.models.clip.modeling_clip.contrastive_loss(logits: torch.Tensor) torch.Tensor[source]
deepke.name_entity_re.multimodal.models.clip.modeling_clip.clip_loss(similarity: torch.Tensor) torch.Tensor[source]
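
For reference, the CLIP contrastive objective is cross-entropy over the image-text similarity matrix in both directions, averaged. The sketch below shows equivalent logic under that assumption; the vendored implementation may differ in details:

```python
>>> import torch
>>> import torch.nn.functional as F

>>> def contrastive_loss_sketch(logits: torch.Tensor) -> torch.Tensor:
...     # Each row should match the entry on its diagonal, so the targets are 0..N-1
...     return F.cross_entropy(logits, torch.arange(len(logits), device=logits.device))

>>> def clip_loss_sketch(similarity: torch.Tensor) -> torch.Tensor:
...     # Symmetric loss over the text->image and image->text directions
...     caption_loss = contrastive_loss_sketch(similarity)
...     image_loss = contrastive_loss_sketch(similarity.t())
...     return (caption_loss + image_loss) / 2.0

>>> similarity = torch.randn(4, 4)   # e.g. logit_scale * text_embeds @ image_embeds.t()
>>> loss = clip_loss_sketch(similarity)
```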
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPBaseModelOutput[source]

Bases: transformers.file_utils.ModelOutput

last_hidden_state: torch.FloatTensor = None
hidden_states: Optional[Tuple[torch.FloatTensor]] = None
attentions: Optional[Tuple[torch.FloatTensor]] = None
qks: Optional[Tuple[torch.FloatTensor]] = None
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPBaseModelOutputWithPooling[source]

Bases: transformers.file_utils.ModelOutput

last_hidden_state: torch.FloatTensor = None
pooler_output: torch.FloatTensor = None
hidden_states: Optional[Tuple[torch.FloatTensor]] = None
attentions: Optional[Tuple[torch.FloatTensor]] = None
qks: Optional[Tuple[torch.FloatTensor]] = None
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPOutput[source]

Bases: transformers.file_utils.ModelOutput

Parameters
  • loss (torch.FloatTensor of shape (1,), optional, returned when return_loss is True) – Contrastive loss for image-text similarity.

  • logits_per_image (torch.FloatTensor of shape (image_batch_size, text_batch_size)) – The scaled dot product scores between image_embeds and text_embeds. This represents the image-text similarity scores.

  • logits_per_text (torch.FloatTensor of shape (text_batch_size, image_batch_size)) – The scaled dot product scores between text_embeds and image_embeds. This represents the text-image similarity scores.

  • text_embeds (torch.FloatTensor of shape (batch_size, output_dim)) – The text embeddings obtained by applying the projection layer to the pooled output of CLIPTextModel.

  • image_embeds (torch.FloatTensor of shape (batch_size, output_dim)) – The image embeddings obtained by applying the projection layer to the pooled output of CLIPVisionModel.

  • text_model_output (BaseModelOutputWithPooling) – The output of the CLIPTextModel.

  • vision_model_output (BaseModelOutputWithPooling) – The output of the CLIPVisionModel.

loss: Optional[torch.FloatTensor] = None
logits_per_image: torch.FloatTensor = None
logits_per_text: torch.FloatTensor = None
text_embeds: torch.FloatTensor = None
image_embeds: torch.FloatTensor = None
text_model_output: transformers.modeling_outputs.BaseModelOutputWithPooling = None
vision_model_output: transformers.modeling_outputs.BaseModelOutputWithPooling = None
to_tuple() Tuple[Any][source]

Convert self to a tuple containing all the attributes/keys that are not None.

class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPVisionEmbeddings(config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPVisionConfig)[source]

Bases: torch.nn.modules.module.Module

forward(pixel_values, aux_embeddings=None, rcnn_embeddings=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPTextEmbeddings(config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPTextConfig)[source]

Bases: torch.nn.modules.module.Module

forward(input_ids=None, position_ids=None, inputs_embeds=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPAttention(config)[source]

Bases: torch.nn.modules.module.Module

Multi-headed attention from ‘Attention Is All You Need’ paper

forward(hidden_states: torch.Tensor, attention_mask: Optional[torch.Tensor] = None, causal_attention_mask: Optional[torch.Tensor] = None, output_attentions: bool = False, output_qks: bool = False) Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]][source]

Input shape: Batch x Time x Channel

training: bool
deepke.name_entity_re.multimodal.models.clip.modeling_clip.quick_gelu(x)[source]
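
quick_gelu is the sigmoid-based GELU approximation used by the original CLIP code; a minimal sketch of that formulation:

```python
>>> import torch

>>> def quick_gelu_sketch(x: torch.Tensor) -> torch.Tensor:
...     # GELU approximated as x * sigmoid(1.702 * x)
...     return x * torch.sigmoid(1.702 * x)

>>> y = quick_gelu_sketch(torch.randn(2, 4))
```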
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPMLP(config)[source]

Bases: torch.nn.modules.module.Module

forward(hidden_states)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPEncoderLayer(config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPConfig)[source]

Bases: torch.nn.modules.module.Module

forward(hidden_states: torch.Tensor, attention_mask: torch.Tensor, causal_attention_mask: torch.Tensor, output_attentions: bool = False, output_qks: bool = False)[source]
Parameters
  • hidden_states (torch.FloatTensor) – input to the layer of shape (seq_len, batch, embed_dim)

  • attention_mask (torch.FloatTensor) – attention mask of size (batch, 1, tgt_len, src_len) where padding elements are indicated by very large negative values.

  • layer_head_mask (torch.FloatTensor) – mask for attention heads in a given layer of size (config.encoder_attention_heads,).

  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned tensors for more detail.

training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPPreTrainedModel(config: transformers.configuration_utils.PretrainedConfig, *inputs, **kwargs)[source]

Bases: transformers.modeling_utils.PreTrainedModel

An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.

config_class

alias of deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPConfig

base_model_prefix = 'clip'
supports_gradient_checkpointing = True
training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPEncoder(config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPConfig)[source]

Bases: torch.nn.modules.module.Module

Transformer encoder consisting of config.num_hidden_layers self attention layers. Each layer is a CLIPEncoderLayer.

Parameters
  • config – CLIPConfig

  • embed_tokens (nn.Embedding) – output embedding

forward(inputs_embeds, attention_mask=None, causal_attention_mask=None, output_attentions=None, output_hidden_states=None, return_dict=None, output_qks=False)[source]
Parameters
  • inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation. This is useful if you want more control over how to convert input_ids indices into associated vectors than the model’s internal embedding lookup matrix.

  • attention_mask (torch.Tensor of shape (batch_size, sequence_length), optional) –

    Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:

    • 1 for tokens that are not masked,

    • 0 for tokens that are masked.

    What are attention masks?

  • causal_attention_mask (torch.Tensor of shape (batch_size, sequence_length), optional) –

    Causal mask for the text model. Mask values selected in [0, 1]:

    • 1 for tokens that are not masked,

    • 0 for tokens that are masked.

    What are attention masks?

  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned tensors for more detail.

  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for more detail.

  • return_dict (bool, optional) – Whether or not to return a ModelOutput instead of a plain tuple.

training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPTextTransformer(config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPTextConfig)[source]

Bases: torch.nn.modules.module.Module

forward(input_ids=None, attention_mask=None, position_ids=None, output_attentions=None, output_hidden_states=None, return_dict=None)[source]
Returns

A BaseModelOutputWithPooling (if return_dict=True is passed or when config.return_dict=True) or a tuple of torch.FloatTensor comprising various elements depending on the configuration and inputs.

  • last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.

  • pooler_output (torch.FloatTensor of shape (batch_size, hidden_size)) – Last layer hidden-state of the first token of the sequence (classification token) further processed by a Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.

  • hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – Tuple of torch.FloatTensor (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).

    Hidden-states of the model at the output of each layer plus the initial embedding outputs.

  • attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).

    Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.

Return type

BaseModelOutputWithPooling or tuple(torch.FloatTensor)
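
Example (a minimal added sketch with a randomly initialized transformer; the token ids below are arbitrary placeholders smaller than config.vocab_size, not a real tokenization):

>>> import torch
>>> from deepke.name_entity_re.multimodal.models.clip.configuration_clip import CLIPTextConfig
>>> from deepke.name_entity_re.multimodal.models.clip.modeling_clip import CLIPTextTransformer

>>> config = CLIPTextConfig()
>>> text_transformer = CLIPTextTransformer(config)

>>> input_ids = torch.tensor([[49406, 320, 1125, 49407]])
>>> outputs = text_transformer(input_ids=input_ids, return_dict=True)
>>> outputs.last_hidden_state.shape  # (batch_size, sequence_length, hidden_size)
torch.Size([1, 4, 512])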

training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPTextModel(config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPTextConfig)[source]

Bases: deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPPreTrainedModel

config_class

alias of deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPTextConfig

get_input_embeddings() torch.nn.modules.module.Module[source]

Returns the model’s input embeddings.

Returns

A torch module mapping vocabulary to hidden states.

Return type

nn.Module

set_input_embeddings(value)[source]

Set model’s input embeddings.

Parameters

value (nn.Module) – A module mapping vocabulary to hidden states.
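
Example (an added sketch of inspecting and swapping the token embedding module; the replacement nn.Embedding is purely illustrative):

>>> import torch.nn as nn
>>> from transformers import CLIPTextModel

>>> model = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")
>>> embeddings = model.get_input_embeddings()  # an nn.Embedding mapping vocabulary ids to hidden states
>>> model.set_input_embeddings(nn.Embedding(embeddings.num_embeddings, embeddings.embedding_dim))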

forward(input_ids=None, attention_mask=None, position_ids=None, output_attentions=None, output_hidden_states=None, return_dict=None, output_qks=False)[source]
Returns

A BaseModelOutputWithPooling (if return_dict=True is passed or when config.return_dict=True) or a tuple of torch.FloatTensor comprising various elements depending on the configuration ([CLIPTextConfig]) and inputs.

  • last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.

  • pooler_output (torch.FloatTensor of shape (batch_size, hidden_size)) – Last layer hidden-state of the first token of the sequence (classification token) further processed by a Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.

  • hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – Tuple of torch.FloatTensor (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).

    Hidden-states of the model at the output of each layer plus the initial embedding outputs.

  • attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).

    Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.

Examples:

>>> from transformers import CLIPTokenizer, CLIPTextModel

>>> model = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")
>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

>>> inputs = tokenizer(["a photo of a cat", "a photo of a dog"],  padding=True, return_tensors="pt")

>>> outputs = model(**inputs)
>>> last_hidden_state = outputs.last_hidden_state
>>> pooled_output = outputs.pooler_output # pooled (EOS token) states

Return type

BaseModelOutputWithPooling or tuple(torch.FloatTensor)

training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPVisionTransformer(config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPVisionConfig)[source]

Bases: torch.nn.modules.module.Module

forward(pixel_values=None, output_attentions=None, output_hidden_states=None, return_dict=None, aux_embeddings=None, rcnn_embeddings=None, output_qks=False)[source]
Returns

A BaseModelOutputWithPooling (if return_dict=True is passed or when config.return_dict=True) or a tuple of torch.FloatTensor comprising various elements depending on the configuration ([CLIPVisionConfig]) and inputs.

  • last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.

  • pooler_output (torch.FloatTensor of shape (batch_size, hidden_size)) – Last layer hidden-state of the first token of the sequence (classification token) further processed by a Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.

  • hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – Tuple of torch.FloatTensor (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).

    Hidden-states of the model at the output of each layer plus the initial embedding outputs.

  • attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).

    Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.

Return type

BaseModelOutputWithPooling or tuple(torch.FloatTensor)

training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPVisionModel(config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPVisionConfig)[source]

Bases: deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPPreTrainedModel

config_class

alias of deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPVisionConfig

get_input_embeddings() torch.nn.modules.module.Module[source]

Returns the model’s input embeddings.

Returns

A torch module mapping vocabulary to hidden states.

Return type

nn.Module

forward(pixel_values=None, output_attentions=None, output_hidden_states=None, return_dict=None)[source]
Returns

A BaseModelOutputWithPooling (if return_dict=True is passed or when config.return_dict=True) or a tuple of torch.FloatTensor comprising various elements depending on the configuration ([CLIPVisionConfig]) and inputs.

  • last_hidden_state (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.

  • pooler_output (torch.FloatTensor of shape (batch_size, hidden_size)) – Last layer hidden-state of the first token of the sequence (classification token) further processed by a Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.

  • hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – Tuple of torch.FloatTensor (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).

    Hidden-states of the model at the output of each layer plus the initial embedding outputs.

  • attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).

    Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.

Examples:

>>> from PIL import Image
>>> import requests
>>> from transformers import CLIPProcessor, CLIPVisionModel

>>> model = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")
>>> processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

>>> url = "http://images.cocodataset.org/val2017/000000039769.jpg"
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> inputs = processor(images=image, return_tensors="pt")

>>> outputs = model(**inputs)
>>> last_hidden_state = outputs.last_hidden_state
>>> pooled_output = outputs.pooler_output # pooled CLS states

Return type

BaseModelOutputWithPooling or tuple(torch.FloatTensor)

training: bool
class deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPModel(config: deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPConfig)[source]

Bases: deepke.name_entity_re.multimodal.models.clip.modeling_clip.CLIPPreTrainedModel

This model is a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

Parameters

config (CLIPConfig) – Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.

config_class

alias of deepke.name_entity_re.multimodal.models.clip.configuration_clip.CLIPConfig

get_text_features(input_ids=None, attention_mask=None, position_ids=None, output_attentions=None, output_hidden_states=None, return_dict=None)[source]
Returns

The text embeddings obtained by applying the projection layer to the pooled output of CLIPTextModel.

Return type

text_features (torch.FloatTensor of shape (batch_size, output_dim))

Examples:

>>> from transformers import CLIPTokenizer, CLIPModel

>>> model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

>>> inputs = tokenizer(["a photo of a cat", "a photo of a dog"],  padding=True, return_tensors="pt")
>>> text_features = model.get_text_features(**inputs)
get_image_features(pixel_values=None, output_attentions=None, output_hidden_states=None, return_dict=None)[source]
Returns

The image embeddings obtained by applying the projection layer to the pooled output of CLIPVisionModel.

Return type

image_features (torch.FloatTensor of shape (batch_size, output_dim))

Examples:

>>> from PIL import Image
>>> import requests
>>> from transformers import CLIPProcessor, CLIPModel

>>> model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
>>> processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

>>> url = "http://images.cocodataset.org/val2017/000000039769.jpg"
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> inputs = processor(images=image, return_tensors="pt")

>>> image_features = model.get_image_features(**inputs)
forward(input_ids=None, pixel_values=None, attention_mask=None, position_ids=None, return_loss=None, output_attentions=None, output_hidden_states=None, return_dict=None)[source]
Returns

A CLIPOutput (if return_dict=True is passed or when config.return_dict=True) or a tuple of torch.FloatTensor comprising various elements depending on the configuration ([CLIPConfig]) and inputs.

  • loss (torch.FloatTensor of shape (1,), optional, returned when return_loss is True) – Contrastive loss for image-text similarity.

  • logits_per_image (torch.FloatTensor of shape (image_batch_size, text_batch_size)) – The scaled dot product scores between image_embeds and text_embeds. This represents the image-text similarity scores.

  • logits_per_text (torch.FloatTensor of shape (text_batch_size, image_batch_size)) – The scaled dot product scores between text_embeds and image_embeds. This represents the text-image similarity scores.

  • text_embeds (torch.FloatTensor of shape (batch_size, output_dim)) – The text embeddings obtained by applying the projection layer to the pooled output of CLIPTextModel.

  • image_embeds (torch.FloatTensor of shape (batch_size, output_dim)) – The image embeddings obtained by applying the projection layer to the pooled output of CLIPVisionModel.

  • text_model_output (BaseModelOutputWithPooling) – The output of the CLIPTextModel.

  • vision_model_output (BaseModelOutputWithPooling) – The output of the CLIPVisionModel.

Examples:

>>> from PIL import Image
>>> import requests
>>> from transformers import CLIPProcessor, CLIPModel

>>> model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
>>> processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

>>> url = "http://images.cocodataset.org/val2017/000000039769.jpg"
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> inputs = processor(text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True)

>>> outputs = model(**inputs)
>>> logits_per_image = outputs.logits_per_image # this is the image-text similarity score
>>> probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities

Return type

CLIPOutput or tuple(torch.FloatTensor)

training: bool

deepke.name_entity_re.multimodal.models.clip.processing_clip module

Image/Text processor class for CLIP

class deepke.name_entity_re.multimodal.models.clip.processing_clip.CLIPProcessor(feature_extractor, tokenizer)[source]

Bases: object

Constructs a CLIP processor which wraps a CLIP feature extractor and a CLIP tokenizer into a single processor.

[CLIPProcessor] offers all the functionalities of [CLIPFeatureExtractor] and [CLIPTokenizer]. See the [~CLIPProcessor.__call__] and [~CLIPProcessor.decode] for more information.

Parameters
  • feature_extractor ([CLIPFeatureExtractor]) – The feature extractor is a required input.

  • tokenizer ([CLIPTokenizer]) – The tokenizer is a required input.
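
Example (an added sketch of building the processor from its two components; loading them from the openai/clip-vit-base-patch32 checkpoint is an illustrative choice):

>>> from transformers import CLIPFeatureExtractor, CLIPTokenizer
>>> from deepke.name_entity_re.multimodal.models.clip.processing_clip import CLIPProcessor

>>> feature_extractor = CLIPFeatureExtractor.from_pretrained("openai/clip-vit-base-patch32")
>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
>>> processor = CLIPProcessor(feature_extractor=feature_extractor, tokenizer=tokenizer)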

save_pretrained(save_directory)[source]

Save a CLIP feature extractor object and CLIP tokenizer object to the directory save_directory, so that they can be re-loaded using the [~CLIPProcessor.from_pretrained] class method.

<Tip>

This class method is simply calling [~PreTrainedFeatureExtractor.save_pretrained] and [~tokenization_utils_base.PreTrainedTokenizer.save_pretrained]. Please refer to the docstrings of the methods above for more information.

</Tip>

Parameters

save_directory (str or os.PathLike) – Directory where the feature extractor JSON file and the tokenizer files will be saved (directory will be created if it does not exist).
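
Example (an added sketch of a save/reload round trip; ./my_clip_processor is a hypothetical local directory):

>>> from deepke.name_entity_re.multimodal.models.clip.processing_clip import CLIPProcessor

>>> processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
>>> processor.save_pretrained("./my_clip_processor")
>>> reloaded = CLIPProcessor.from_pretrained("./my_clip_processor")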

classmethod from_pretrained(pretrained_model_name_or_path, **kwargs)[source]

Instantiate a [CLIPProcessor] from a pretrained CLIP processor.

<Tip>

This class method is simply calling CLIPFeatureExtractor’s [~PreTrainedFeatureExtractor.from_pretrained] and CLIPTokenizer’s [~tokenization_utils_base.PreTrainedTokenizer.from_pretrained]. Please refer to the docstrings of the methods above for more information.

</Tip>

Parameters
  • pretrained_model_name_or_path (str or os.PathLike) –

    This can be either:

    • a string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like clip-vit-base-patch32, or namespaced under a user or organization name, like openai/clip-vit-base-patch32.

    • a path to a directory containing a feature extractor file saved using the [~PreTrainedFeatureExtractor.save_pretrained] method, e.g., ./my_model_directory/.

    • a path or url to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.

  • **kwargs – Additional keyword arguments passed along to both [PreTrainedFeatureExtractor] and [PreTrainedTokenizer]

batch_decode(*args, **kwargs)[source]

This method forwards all its arguments to CLIPTokenizer’s [~PreTrainedTokenizer.batch_decode]. Please refer to the docstring of this method for more information.

decode(*args, **kwargs)[source]

This method forwards all its arguments to CLIPTokenizer’s [~PreTrainedTokenizer.decode]. Please refer to the docstring of this method for more information.
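
Example (an added sketch; the decoded text is expected to come back lower-cased, since the wrapped CLIP tokenizer lower-cases by default):

>>> from deepke.name_entity_re.multimodal.models.clip.processing_clip import CLIPProcessor

>>> processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
>>> input_ids = processor.tokenizer(["A photo of a cat"])["input_ids"]
>>> processor.batch_decode(input_ids, skip_special_tokens=True)
['a photo of a cat']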

deepke.name_entity_re.multimodal.models.clip.tokenization_clip module

Tokenization classes for CLIP.

deepke.name_entity_re.multimodal.models.clip.tokenization_clip.bytes_to_unicode()[source]

Returns a mapping between utf-8 bytes and unicode strings. We specifically avoid mapping to whitespace/control characters that the bpe code barfs on.

The reversible bpe codes work on unicode strings. This means you need a large number of unicode characters in your vocab if you want to avoid UNKs. When you’re at something like a 10B token dataset you end up needing around 5K for decent coverage. This is a significant percentage of your normal, say, 32K bpe vocab. To avoid that, we want lookup tables between utf-8 bytes and unicode strings.
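
Example (an added sketch; the table covers all 256 byte values, and printable ASCII bytes map to themselves):

>>> from deepke.name_entity_re.multimodal.models.clip.tokenization_clip import bytes_to_unicode

>>> byte_encoder = bytes_to_unicode()
>>> len(byte_encoder)
256
>>> byte_encoder[ord("a")]
'a'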

deepke.name_entity_re.multimodal.models.clip.tokenization_clip.get_pairs(word)[source]

Return set of symbol pairs in a word.

Word is represented as tuple of symbols (symbols being variable-length strings).
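
Example (an added sketch; the result is sorted below because get_pairs returns an unordered set):

>>> from deepke.name_entity_re.multimodal.models.clip.tokenization_clip import get_pairs

>>> sorted(get_pairs(("l", "o", "w", "e", "r")))
[('e', 'r'), ('l', 'o'), ('o', 'w'), ('w', 'e')]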

deepke.name_entity_re.multimodal.models.clip.tokenization_clip.whitespace_clean(text)[source]
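
Example (an added sketch, assuming the usual CLIP helper that collapses runs of whitespace and strips the ends):

>>> from deepke.name_entity_re.multimodal.models.clip.tokenization_clip import whitespace_clean

>>> whitespace_clean("  a   photo \n of a cat ")
'a photo of a cat'
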
class deepke.name_entity_re.multimodal.models.clip.tokenization_clip.CLIPTokenizer(vocab_file, merges_file, errors='replace', unk_token='<|endoftext|>', bos_token='<|startoftext|>', eos_token='<|endoftext|>', pad_token='<|endoftext|>', add_prefix_space=False, do_lower_case=True, **kwargs)[source]

Bases: transformers.tokenization_utils.PreTrainedTokenizer

Construct a CLIP tokenizer. Based on byte-level Byte-Pair-Encoding.

This tokenizer has been trained to treat spaces like parts of the tokens (a bit like sentencepiece), so a word will be encoded differently depending on whether it is at the beginning of the sentence (without a space) or not.

You can get around that behavior by passing add_prefix_space=True when instantiating this tokenizer or when you call it on some text, but since the model was not pretrained this way, it might yield a decrease in performance.

<Tip>

When used with is_split_into_words=True, this tokenizer will add a space before each word (even the first one).

</Tip>

This tokenizer inherits from [PreTrainedTokenizer] which contains most of the main methods. Users should refer to this superclass for more information regarding those methods.

Parameters
  • vocab_file (str) – Path to the vocabulary file.

  • merges_file (str) – Path to the merges file.

  • errors (str, optional, defaults to “replace”) – Paradigm to follow when decoding bytes to UTF-8. See [bytes.decode](https://docs.python.org/3/library/stdtypes.html#bytes.decode) for more information.

  • unk_token (str, optional, defaults to <|endoftext|>) – The unknown token. A token that is not in the vocabulary cannot be converted to an ID and is set to be this token instead.

  • bos_token (str, optional, defaults to <|startoftext|>) – The beginning of sequence token.

  • eos_token (str, optional, defaults to <|endoftext|>) – The end of sequence token.

  • add_prefix_space (bool, optional, defaults to False) – Whether or not to add an initial space to the input. This allows treating the leading word just like any other word (the CLIP tokenizer detects the beginning of words by the preceding space).
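
Example (an added sketch of basic usage; exact token ids depend on the openai/clip-vit-base-patch32 vocabulary, so no outputs are asserted):

>>> from transformers import CLIPTokenizer

>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
>>> inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
>>> input_ids, attention_mask = inputs["input_ids"], inputs["attention_mask"]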

vocab_files_names: Dict[str, str] = {'merges_file': 'merges.txt', 'vocab_file': 'vocab.json'}
pretrained_vocab_files_map: Dict[str, Dict[str, str]] = {'merges_file': {'openai/clip-vit-base-patch32': 'https://huggingface.co/openai/clip-vit-base-patch32/resolve/main/merges.txt'}, 'vocab_file': {'openai/clip-vit-base-patch32': 'https://huggingface.co/openai/clip-vit-base-patch32/resolve/main/vocab.json'}}
max_model_input_sizes: Dict[str, Optional[int]] = {'openai/clip-vit-base-patch32': 77}
model_input_names: List[str] = ['input_ids', 'attention_mask']
property pad_token_id: Optional[int]

Id of the padding token in the vocabulary. Returns None if the token has not been set.

Type

Optional[int]

property vocab_size

Size of the base vocabulary (without the added tokens).

Type

int

get_vocab()[source]

Returns the vocabulary as a dictionary of token to index.

tokenizer.get_vocab()[token] is equivalent to tokenizer.convert_tokens_to_ids(token) when token is in the vocab.

Returns

The vocabulary.

Return type

Dict[str, int]

build_inputs_with_special_tokens(token_ids_0: List[int], token_ids_1: Optional[List[int]] = None) List[int][source]

Build model inputs from a sequence or a pair of sequence for sequence classification tasks by concatenating and adding special tokens. A CLIP sequence has the following format:

  • single sequence: <|startoftext|> X <|endoftext|>

Pairs of sequences are not the expected use case, but they will be handled without a separator.

Parameters
  • token_ids_0 (List[int]) – List of IDs to which the special tokens will be added.

  • token_ids_1 (List[int], optional) – Optional second list of IDs for sequence pairs.

Returns

List of input IDs with the appropriate special tokens.

Return type

List[int]
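
Example (an added sketch, assuming the standard openai/clip-vit-base-patch32 vocabulary where <|startoftext|> has id 49406 and <|endoftext|> has id 49407; 320 and 1125 are placeholder token ids):

>>> from transformers import CLIPTokenizer

>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
>>> tokenizer.build_inputs_with_special_tokens([320, 1125])
[49406, 320, 1125, 49407]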

get_special_tokens_mask(token_ids_0: List[int], token_ids_1: Optional[List[int]] = None, already_has_special_tokens: bool = False) List[int][source]

Retrieve sequence ids from a token list that has no special tokens added. This method is called when adding special tokens using the tokenizer prepare_for_model method.

Parameters
  • token_ids_0 (List[int]) – List of IDs.

  • token_ids_1 (List[int], optional) – Optional second list of IDs for sequence pairs.

  • already_has_special_tokens (bool, optional, defaults to False) – Whether or not the token list is already formatted with special tokens for the model.

Returns

A list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.

Return type

List[int]
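
Example (an added sketch; with already_has_special_tokens left at False, the mask marks the positions where <|startoftext|> and <|endoftext|> would be added around the placeholder ids):

>>> from transformers import CLIPTokenizer

>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
>>> tokenizer.get_special_tokens_mask([320, 1125])
[1, 0, 0, 1]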

bpe(token)[source]
convert_tokens_to_string(tokens)[source]

Converts a sequence of tokens (strings) into a single string.

save_vocabulary(save_directory: str, filename_prefix: Optional[str] = None) Tuple[str][source]

Save only the vocabulary of the tokenizer (vocabulary + added tokens).

This method won’t save the configuration and special token mappings of the tokenizer. Use save_pretrained() to save the whole state of the tokenizer.

Parameters
  • save_directory (str) – The directory in which to save the vocabulary.

  • filename_prefix (str, optional) – An optional prefix to add to the names of the saved files.

Returns

Paths to the files saved.

Return type

Tuple[str]
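
Example (an added sketch; ./my_tokenizer is a hypothetical directory that must exist before the call, and the returned tuple holds the paths of the written vocab.json and merges.txt):

>>> import os
>>> from transformers import CLIPTokenizer

>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
>>> os.makedirs("./my_tokenizer", exist_ok=True)
>>> vocab_files = tokenizer.save_vocabulary("./my_tokenizer")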

prepare_for_tokenization(text, is_split_into_words=False, **kwargs)[source]

Performs any necessary transformations before tokenization.

This method should pop the arguments from kwargs and return the remaining kwargs as well. We test the kwargs at the end of the encoding process to be sure all the arguments have been used.

Parameters
  • text (str) – The text to prepare.

  • is_split_into_words (bool, optional, defaults to False) – Whether or not the text has been pretokenized.

  • kwargs – Keyword arguments to use for the tokenization.

Returns

The prepared text and the unused kwargs.

Return type

Tuple[str, Dict[str, Any]]
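
Example (an added sketch; only the call signature is illustrated, and no particular normalization of the returned text is asserted):

>>> from transformers import CLIPTokenizer

>>> tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
>>> text, unused_kwargs = tokenizer.prepare_for_tokenization("A photo of a cat", is_split_into_words=False)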