Module hdlib.vector

Implementation of hyperdimensional Vector.

hdlib provides the Vector class under hdlib.vector for building the abstract representation of hyperdimensional vectors.

Classes

class Vector (name: Optional[str] = None, size: int = 10000, vector: Optional[numpy.ndarray] = None, vtype: str = 'bipolar', tags: Optional[Set[Union[str, int, float]]] = None, seed: Optional[int] = None, warning: bool = False, from_file: Optional[] = None)

Vector object.

Initialize a Vector object.

Parameters

name : str, optional
The unique identifier of the Vector object. A random UUID v4 is generated if not specified.
size : int, optional, default 10000
The size of the vector. It is 10,000 by default and cannot be less than that.
vector : numpy.ndarray, optional, default None
The actual vector. A random vector is created if not specified.
vtype : {'binary', 'bipolar'}, default 'bipolar'
The vector type.
tags : set, default None
An optional set of vector tags. Tags can be str, int, and float.
seed : int, default None
An optional seed for reproducibly generating the vector numpy.ndarray randomly.
warning : bool, default False
Print warning messages if True.
from_file : str, default None
Path to a pickle file. Used to load a Vector object from file.

Returns

Vector
A new Vector object.

Raises

Exception
If the pickle object in from_file is not instance of Vector.
FileNotFoundError
If from_file is not None but the file does not exist.
TypeError
  • if the vector name is not instance of a primitive;
  • if tags is not an instance of set;
  • if vector is not an instance of numpy.ndarray;
  • if size is not an integer number.
ValueError
  • if vtype is different than 'binary' or 'bipolar';
  • if size is lower than 1,000.

Examples

>>> from hdlib.space import Vector
>>> vector = Vector()
>>> type(vector)
<class 'hdlib.space.Vector'>

A new bipolar vector with a size of 1,000 is created by default.

>>> vector = Vector(size=10)
ValueError: Vector size must be greater than or equal to 1000

This throws a ValueError since the vector size cannot be less than 1,000.

>>> vector1 = Vector()
>>> vector1.dump(to_file='~/my_vector.pkl')
>>> vector2 = Vector(from_file='~/my_vector.pkl')
>>> type(vector2)
<class 'hdlib.space.Vector'>

This creates a random bipolar vector vector1, dumps the object to a pickle file under the home directory, and finally create a new vector object vector2 from the pickle file.

Expand source code
class Vector(object):
    """Vector object."""

    def __init__(
        self,
        name: Optional[str]=None,
        size: int=10000,
        vector: Optional[np.ndarray]=None,
        vtype: str="bipolar",
        tags: Optional[Set[Union[str, int, float]]]=None,
        seed: Optional[int]=None,
        warning: bool=False,
        from_file: Optional[os.path.abspath]=None,
    ) -> "Vector":
        """Initialize a Vector object.

        Parameters
        ----------
        name : str, optional
            The unique identifier of the Vector object. A random UUID v4 is generated if not specified.
        size : int, optional, default 10000
            The size of the vector. It is 10,000 by default and cannot be less than that.
        vector : numpy.ndarray, optional, default None
            The actual vector. A random vector is created if not specified.
        vtype : {'binary', 'bipolar'}, default 'bipolar'
            The vector type.
        tags : set, default None
            An optional set of vector tags. Tags can be str, int, and float.
        seed : int, default None
            An optional seed for reproducibly generating the vector numpy.ndarray randomly.
        warning : bool, default False
            Print warning messages if True.
        from_file : str, default None
            Path to a pickle file. Used to load a Vector object from file.

        Returns
        -------
        Vector
            A new Vector object.

        Raises
        ------
        Exception
            If the pickle object in `from_file` is not instance of Vector.
        FileNotFoundError
            If `from_file` is not None but the file does not exist.
        TypeError
            - if the vector name is not instance of a primitive;
            - if `tags` is not an instance of set;
            - if `vector` is not an instance of numpy.ndarray;
            - if `size` is not an integer number.
        ValueError
            - if `vtype` is different than 'binary' or 'bipolar';
            - if `size` is lower than 1,000.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector = Vector()
        >>> type(vector)
        <class 'hdlib.space.Vector'>

        A new bipolar vector with a size of 1,000 is created by default.

        >>> vector = Vector(size=10)
        ValueError: Vector size must be greater than or equal to 1000

        This throws a ValueError since the vector size cannot be less than 1,000.

        >>> vector1 = Vector()
        >>> vector1.dump(to_file='~/my_vector.pkl')
        >>> vector2 = Vector(from_file='~/my_vector.pkl')
        >>> type(vector2)
        <class 'hdlib.space.Vector'>

        This creates a random bipolar vector `vector1`, dumps the object to a pickle file under the home directory,
        and finally create a new vector object `vector2` from the pickle file.
        """

        # Conditions on vector name or ID
        # Vector name is casted to string. For this reason, only Python primitives are allowed
        # A random name is assigned if not specified
        try:
            if name is None:
                name = str(uuid.uuid4())

            else:
                name = str(name)

            self.name = name

        except:
            raise TypeError("Vector name must be instance of a primitive")

        # Register random seed for reproducibility
        self.seed = seed

        # Take track of the hdlib version
        self.version = __version__

        if tags and not isinstance(tags, set):
            raise TypeError("Tags must be a set")

        # Add tags
        self.tags = tags if tags else set()

        # Add links
        # Used to link Vectors by their names or IDs
        self.parents = set()
        self.children = set()

        # Conditions on vector
        # It must be a numpy.ndarray
        # A random vector is generated if not specified
        if vector is not None:
            if not isinstance(vector, np.ndarray):
                raise TypeError("Vector must be instance of numpy.ndarray")

            self.vector = vector

            self.size = len(self.vector)

            if self.size < 1000:
                raise ValueError("Vector size must be greater than or equal to 1000")

            self.vtype = vtype

            # Try to infer the vector type from the content of the vector itself
            # only in the case where the elements are not all 1s
            if not (self.vector == 1).all():
                if ((self.vector == 0) | (self.vector == 1)).all():
                    self.vtype = "binary"

                elif ((self.vector == -1) | (self.vector == 1)).all():
                    self.vtype = "bipolar"

                else:
                    if warning:
                        print("Vector type can be binary or bipolar only")

        elif from_file:
            if not os.path.isfile(from_file):
                raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), from_file)

            else:
                # Load vector from pickle file
                with open(from_file, "rb") as pkl:
                    from_file_obj = pickle.load(pkl)

                    if not isinstance(from_file_obj, type(self)):
                        raise Exception("Pickle object is not instance of {}".format(type(self)))

                    self.__dict__.update(from_file_obj.__dict__)

                    if self.version != __version__:
                        print("Warning: the specified Vector has been created with a different version of hdlib")

        else:
            # Conditions on vector size
            # It must be an integer number greater than or equal to 10000
            # This size makes sure that vectors are quasi-orthogonal in space
            if not isinstance(size, int):
                raise TypeError("Vector size must be an integer number")

            if size < 1000:
                raise ValueError("Vector size must be greater than or equal to 1000")

            self.size = size

            if vtype not in ("bipolar", "binary"):
                raise ValueError("Vector type can be binary or bipolar only")

            # Add vector type
            self.vtype = vtype.lower()

            if seed is None:
                rand = np.random.default_rng()

            else:
                # Conditions on random seed for reproducibility
                # numpy allows integers as random seeds
                if not isinstance(seed, int):
                    raise TypeError("Seed must be an integer number")

                rand = np.random.default_rng(seed=self.seed)

            """
            # Use the following statements to generate random vectors with real numbers
            self.vector = rand.uniform(low=-1.0, high=1.0, size=(self.size,))
            self.vector /= np.linalg.norm(self.vector)
            """

            # Build a random binary vector
            self.vector = rand.integers(2, size=size)

            if vtype == "bipolar":
                # Build a random bipolar vector
                self.vector = 2 * self.vector - 1

    def __len__(self) -> int:
        """Get the vector size.

        Returns
        -------
        int
            The vector size.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector = Vector()
        >>> len(vector)
        10000

        Return the vector size, which is 10,000 by default here
        """

        return self.size

    def __str__(self) -> str:
        """Print the Vector object properties.

        Returns
        -------
        str
            A description of the Vector object. It reports the name, seed, size,
            vector type, tags, and the actual vector.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector = Vector()
        >>> print(vector)

                Class:   hdlib.space.Vector
                Version: 0.1.17
                Name:    89ea628b-3d29-47e1-9d10-34bdbfce8d40
                Seed:    None
                Size:    10000
                Type:    bipolar
                Tags:

                []

                Vector:

                [ 1 -1 -1 ... -1  1 -1]

        Print the Vector object properties. The name has been generated as a UUID v4, while
        the vector size and type are 10,000 and 'bipolar' by default. No tags have been specified.
        Thus, the set of vector tags is empty.
        """

        return """
            Class:   hdlib.space.Vector
            Version: {}
            Name:    {}
            Seed:    {}
            Size:    {}
            Type:    {}
            Tags:

            {}

            Vector:

            {}
        """.format(
            self.version,
            self.name,
            self.seed,
            self.size,
            self.vtype,
            np.array(list(self.tags)),
            self.vector
        )

    def __add__(self, vector: "Vector") -> "Vector":
        """Implement the addition operator between two Vector objects as bundle.

        Returns
        -------
        Vector
            A new vector object as the result of the bundle operator on the two input vectors.

        Raises
        ------
        TypeError
            If the input `vector` is not instance of the Vector class.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector1 = Vector()
        >>> vector2 = Vector()
        >>> vector3 = vector1 + vector2
        >>> type(vector3)
        <class 'hdlib.space.Vector'>

        The bundle function returns a new Vector object whose content is computed as the element-wise sum 
        of the two input vectors.
        """

        if not isinstance(vector, type(self)):
            raise TypeError("Cannot apply the bundle operator to non-Vector objects")

        # Import arithmetic.bundle here to avoid circular imports
        from hdlib.arithmetic import bundle as bundle_operator

        return bundle_operator(self, vector)

    def __sub__(self, vector: "Vector") -> "Vector":
        """Implement the subtraction operator between two Vector objects.

        Returns
        -------
        Vector
            A new vector object as the result of the subtraction operator on the two input vectors.

        Raises
        ------
        TypeError
            If the input `vector` is not instance of the Vector class.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector1 = Vector()
        >>> vector2 = Vector()
        >>> vector3 = vector1 - vector2
        >>> type(vector3)
        <class 'hdlib.space.Vector'>

        The subtraction operation returns a new Vector object whose content is computed as the element-wise 
        subtraction of the two input vectors.
        """

        if not isinstance(vector, type(self)):
            raise TypeError("Cannot apply the subtraction operator to non-Vector objects")

        # Import arithmetic.bind here to avoid circular imports
        from hdlib.arithmetic import subtraction as subtraction_operator

        return subtraction_operator(self, vector)

    def __mul__(self, vector: "Vector") -> "Vector":
        """Implement the multiplication operator between two Vector objects as bind.

        Returns
        -------
        Vector
            A new vector object as the result of the bind operator on the two input vectors.

        Raises
        ------
        TypeError
            If the input `vector` is not instance of the Vector class.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector1 = Vector()
        >>> vector2 = Vector()
        >>> vector3 = vector1 * vector2
        >>> type(vector3)
        <class 'hdlib.space.Vector'>

        The bind function returns a new Vector object whose content is computed as the element-wise 
        multiplication of the two input vectors.
        """

        if not isinstance(vector, type(self)):
            raise TypeError("Cannot apply the bind operator to non-Vector objects")

        # Import arithmetic.bind here to avoid circular imports
        from hdlib.arithmetic import bind as bind_operator

        return bind_operator(self, vector)

    def dist(self, vector: "Vector", method: str="cosine") -> float:
        """Compute distance between vectors.

        Parameters
        ----------
        vector : Vector
            A Vector object from which the distance must be computed.
        method : {'cosine', 'euclidean', 'hamming'}, optional, default 'cosine'
            The distance method.

        Returns
        -------
        float
            The distance between the current Vector object and the input `vector`.

        Raises
        ------
        Exception
            If the current vector has a different size or vector type than the input vector.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector1 = Vector(seed=1)
        >>> vector2 = Vector(seed=2)
        >>> vector1.dist(vector2, method='cosine')
        0.996

        Generate two random bipolar vectors and compute the distance between them.
        """

        if self.size != vector.size:
            raise Exception("Vectors must have the same size")

        if self.vtype != vector.vtype:
            raise Exception("Vectors must be of the same type")

        if method.lower() == "cosine":
            return 1 - np.dot(self.vector, vector.vector) / (np.linalg.norm(self.vector) * np.linalg.norm(vector.vector))

        elif method.lower() == "hamming":
            return np.count_nonzero(self.vector != vector.vector)

        elif method.lower() == "euclidean":
            return np.linalg.norm(self.vector - vector.vector)

        else:
            raise ValueError("Distance method \"{}\" is not supported".format(method))

    def normalize(self) -> None:
        """Normalize a vector after a binding or bundling with another vector.

        Raises
        ------
        Exception
            If the vector type is not supported (i.e., is different from binary and bipolar).

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> from hdlib.arithmetic import bind
        >>> vector1 = Vector()
        >>> vector2 = Vector()
        >>> vector3 = bind(vector1, vector2)
        >>> vector3.normalize()
        >>> ((vector3.vector == -1) | (vector3.vector == 1)).all()
        True

        Binding or bundling two vectors can produce a new vector whose vtype is different from the
        one of the two input vector. This function normalizes the vector content in accordance to
        its vector type.
        """

        if self.vtype not in ("bipolar", "binary"):
            raise Exception("Vector type is not supported")

        self.vector[self.vector > 0] = 1

        self.vector[self.vector <= 0] = 0 if self.vtype == "binary" else -1

    def bind(self, vector: "Vector") -> None:
        """Bind the current vector with another vector object inplace.

        Parameters
        ----------
        vector : Vector
            The input Vector object.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector1 = Vector()
        >>> vector2 = Vector()
        >>> vector1.bind(vector2)

        It overrides the actual vector content of `vector1` with the result of the binding with `vector2`.
        Refers to hdlib.arithmetic.bind for additional information.
        """

        # Import arithmetic.bind here to avoid circular imports
        from hdlib.arithmetic import bind as bind_operator

        self.__override_object(bind_operator(self, vector))

    def bundle(self, vector: "Vector") -> None:
        """Bundle the current vector with another vector object inplace.

        Parameters
        ----------
        vector : Vector
            The input Vector object.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector1 = Vector()
        >>> vector2 = Vector()
        >>> vector1.bundle(vector2)

        It overrides the actual vector content of `vector1` with the result of the bundling with `vector2`.
        Refers to hdlib.arithmetic.bundle for additional information.
        """

        # Import arithmetic.bundle here to avoid circular imports
        from hdlib.arithmetic import bundle as bundle_operator

        self.__override_object(bundle_operator(self, vector))

    def subtraction(self, vector: "Vector") -> None:
        """Subtract a vector from the current vector object inplace.

        Parameters
        ----------
        vector : Vector
            The input Vector object.

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector1 = Vector()
        >>> vector2 = Vector()
        >>> vector1.subtract(vector2)

        It overrides the actual vector content of `vector1` with the result of the subtraction with `vector2`.
        Refers to hdlib.arithmetic.subtraction for additional information.
        """

        # Import arithmetic.subtraction here to avoid circular imports
        from hdlib.arithmetic import subtraction as subtraction_operator

        self.__override_object(subtraction_operator(self, vector))

    def permute(self, rotate_by: int=1) -> None:
        """Permute the current vector inplace.

        Parameters
        ----------
        rotate_by : int
            Rotate the input vector by `rotate_by` positions (the default is 1).

        Examples
        --------
        >>> from hdlib.space import Vector
        >>> vector = Vector()
        >>> vector.permute(rotate_by=2)

        It overrides the actual vector content of `vector` with the result of applying the permute function inplace.
        Refers to hdlib.arithmetic.permute for additional information.
        """

        # Import arithmetic.permute here to avoid circular imports
        from hdlib.arithmetic import permute as permute_operator

        self.__override_object(permute_operator(self, rotate_by=rotate_by))

    def __override_object(self, vector: "Vector") -> None:
        """Override the Vector object with another Vector object. This is a private method.

        Parameters
        ----------
        vector : Vector
            The input vector from which properties are inherited to the current vector.
        """

        self.name = vector.name
        self.size = vector.size
        self.seed = vector.seed
        self.tags = vector.tags

        self.parents = vector.parents
        self.children = vector.children

        self.vtype = vector.vtype
        self.vector = vector.vector

        self.version = vector.version

    def dump(self, to_file: Optional[os.path.abspath]=None) -> None:
        """Dump the Vector object to a pickle file.

        Parameters
        ----------
        to_file
            Path to the file used to dump the Vector object to.

        Raises
        ------
        Exception
            If the `to_file` file already exists.

        Examples
        --------
        >>> import os
        >>> from hdlib.space import Vector
        >>> vector = Vector()
        >>> vector.dump(to_file='~/my_vector.pkl')
        >>> os.path.isfile('~/my_vector.pkl')
        True

        Create a Vector object and dump it to a pickle file under the home directory.
        """

        if not to_file:
            # Dump the vector to a pickle file in the current working directory if not file path is provided
            to_file = os.path.join(os.getcwd(), "{}.pkl".format(self.name))

        if os.path.isfile(to_file):
            raise Exception("The output file already exists!\n{}".format(to_file))

        with open(to_file, "wb") as pkl:
            pickle.dump(self, pkl)

Methods

def bind(self, vector: Vector) ‑> None

Bind the current vector with another vector object inplace.

Parameters

vector : Vector
The input Vector object.

Examples

>>> from hdlib.space import Vector
>>> vector1 = Vector()
>>> vector2 = Vector()
>>> vector1.bind(vector2)

It overrides the actual vector content of vector1 with the result of the binding with vector2. Refers to hdlib.arithmetic.bind for additional information.

def bundle(self, vector: Vector) ‑> None

Bundle the current vector with another vector object inplace.

Parameters

vector : Vector
The input Vector object.

Examples

>>> from hdlib.space import Vector
>>> vector1 = Vector()
>>> vector2 = Vector()
>>> vector1.bundle(vector2)

It overrides the actual vector content of vector1 with the result of the bundling with vector2. Refers to hdlib.arithmetic.bundle for additional information.

def dist(self, vector: Vector, method: str = 'cosine') ‑> float

Compute distance between vectors.

Parameters

vector : Vector
A Vector object from which the distance must be computed.
method : {'cosine', 'euclidean', 'hamming'}, optional, default 'cosine'
The distance method.

Returns

float
The distance between the current Vector object and the input vector.

Raises

Exception
If the current vector has a different size or vector type than the input vector.

Examples

>>> from hdlib.space import Vector
>>> vector1 = Vector(seed=1)
>>> vector2 = Vector(seed=2)
>>> vector1.dist(vector2, method='cosine')
0.996

Generate two random bipolar vectors and compute the distance between them.

def dump(self, to_file: Optional[] = None) ‑> None

Dump the Vector object to a pickle file.

Parameters

to_file
Path to the file used to dump the Vector object to.

Raises

Exception
If the to_file file already exists.

Examples

>>> import os
>>> from hdlib.space import Vector
>>> vector = Vector()
>>> vector.dump(to_file='~/my_vector.pkl')
>>> os.path.isfile('~/my_vector.pkl')
True

Create a Vector object and dump it to a pickle file under the home directory.

def normalize(self) ‑> None

Normalize a vector after a binding or bundling with another vector.

Raises

Exception
If the vector type is not supported (i.e., is different from binary and bipolar).

Examples

>>> from hdlib.space import Vector
>>> from hdlib.arithmetic import bind
>>> vector1 = Vector()
>>> vector2 = Vector()
>>> vector3 = bind(vector1, vector2)
>>> vector3.normalize()
>>> ((vector3.vector == -1) | (vector3.vector == 1)).all()
True

Binding or bundling two vectors can produce a new vector whose vtype is different from the one of the two input vector. This function normalizes the vector content in accordance to its vector type.

def permute(self, rotate_by: int = 1) ‑> None

Permute the current vector inplace.

Parameters

rotate_by : int
Rotate the input vector by rotate_by positions (the default is 1).

Examples

>>> from hdlib.space import Vector
>>> vector = Vector()
>>> vector.permute(rotate_by=2)

It overrides the actual vector content of vector with the result of applying the permute function inplace. Refers to hdlib.arithmetic.permute for additional information.

def subtraction(self, vector: Vector) ‑> None

Subtract a vector from the current vector object inplace.

Parameters

vector : Vector
The input Vector object.

Examples

>>> from hdlib.space import Vector
>>> vector1 = Vector()
>>> vector2 = Vector()
>>> vector1.subtract(vector2)

It overrides the actual vector content of vector1 with the result of the subtraction with vector2. Refers to hdlib.arithmetic.subtraction for additional information.