Everything to Know About the Python ULID Module

In this article, we will discuss the concept of ULIDs and how they are implemented in Python.

ULID is short for Universally Unique Lexicographically Sortable Identifier. The primary goal of ULIDs is to serve as an alternative to UUIDs (Universally Unique Identifiers) that keeps uniqueness while adding sortability. Sortability comes from embedding the creation time of the identifier at millisecond precision, and uniqueness within the same millisecond comes from a random component. Let’s look at some notable features of ULIDs.

Features of Python ULID

  • Provides 128-bit compatibility with UUID.
  • Allows about 1.21e+24 (2^80) unique ULIDs per millisecond.
  • Is lexicographically sortable.
  • Is encoded as a 26-character string.
  • Uses Crockford’s base32 for better readability and efficiency.
  • Is case insensitive.
  • Contains no special characters (safe for URLs).
  • Supports a monotonic sort order for ULIDs created within the same millisecond.

About the Module

As of Python 3.10, ULID support is not part of the Python standard library. The ulid-py package can be installed from PyPI using pip.

$ pip install ulid-py

Structure of a ULID

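A ULID is a 128-bit (16-byte) value that is written as a 26-character string. The first 48 bits hold the timestamp and the remaining 80 bits hold the randomness. The value is stored most significant byte first, so the timestamp occupies the highest-order bits of the integer.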
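The 128-bit layout can be sketched with the standard library alone. This is a minimal illustration, not the ulid-py implementation; the helper names make_ulid_int and split_ulid_int are hypothetical, and the 48/80-bit split is the one used for the timestamp and randomness parts of a ULID.

```python
import os
import time

TIMESTAMP_BITS = 48   # milliseconds since the Unix epoch
RANDOMNESS_BITS = 80  # random component


def make_ulid_int(timestamp_ms: int, randomness: int) -> int:
    """Pack a 48-bit timestamp and 80-bit randomness into one 128-bit integer."""
    if not 0 <= timestamp_ms < 2 ** TIMESTAMP_BITS:
        raise ValueError("timestamp does not fit in 48 bits")
    if not 0 <= randomness < 2 ** RANDOMNESS_BITS:
        raise ValueError("randomness does not fit in 80 bits")
    # The timestamp goes in the highest-order bits, so newer values are larger.
    return (timestamp_ms << RANDOMNESS_BITS) | randomness


def split_ulid_int(value: int) -> tuple:
    """Return (timestamp_ms, randomness) from a 128-bit ULID integer."""
    return value >> RANDOMNESS_BITS, value & (2 ** RANDOMNESS_BITS - 1)


now_ms = int(time.time() * 1000)
rand = int.from_bytes(os.urandom(10), "big")  # 10 bytes = 80 bits
value = make_ulid_int(now_ms, rand)

print(value.to_bytes(16, "big"))  # the 16-byte form, most significant byte first
print(split_ulid_int(value) == (now_ms, rand))
```

Because the timestamp sits in the high bits, any ULID created in a later millisecond compares greater than one created earlier, whatever the random parts are.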

Creating and Displaying ULIDs in Different Representations using the .new() function

A ULID can be represented in the following ways:

  • Integer
  • String
  • Bytes
  • UUID value

Creating a ULID value

Using the .new() function, we create a ULID object from the current time and a fresh random value.

import ulid

myValue = ulid.new()
myValue

Output

<ULID('01G3P1A562FB6X1GC3718P8M58')>

ULID As String

import ulid

myValue = ulid.new()
myValue.str

Output

'01G3P1FHX5M52YF4RQQGHEHQN7'

ULID As Integer

import ulid

myValue = ulid.new()
myValue.int

Output

1998630654934113521890591912315554430

ULID as Bytes

import ulid

myValue = ulid.new()
myValue.bytes

Output

b'\x01\x80\xec\x1b7\xfdp\xc4\x11=?L\xb5\xa1a\xdc'

ULID as UUID value

import ulid

myValue = ulid.new()
myValue.uuid

Output

UUID('0180ec1c-2797-7f02-8925-c6c24cbcf161')
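All four representations describe the same 128 bits. As a rough illustration using only the standard library, the byte string from the "ULID as Bytes" output above can be converted to an integer and to a UUID by hand; ulid-py performs these conversions for you through the attributes shown above.

```python
import uuid

# The 16 bytes from the "ULID as Bytes" output above
raw = b'\x01\x80\xec\x1b7\xfdp\xc4\x11=?L\xb5\xa1a\xdc'

as_int = int.from_bytes(raw, "big")   # most significant byte first
as_uuid = uuid.UUID(bytes=raw)        # same 128 bits, UUID formatting

print(len(raw))    # 16
print(as_int)
print(as_uuid)     # 0180ec1b-37fd-70c4-113d-3f4cb5a161dc
```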

Creating and Displaying Timestamps in Different Representations using the .timestamp() function

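Calling .timestamp() on a ULID returns a Timestamp object built from the first 48 bits of the ULID, which store the Unix time in milliseconds. The .timestamp property of that object gives the Unix time in seconds from the epoch as a Python float.

Returns - A Timestamp object from the first 48 bits
Return Type - Timestamp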

Creating a Timestamp from our ULID

Using the .timestamp() function, we can get the timestamp of our ULID.

import ulid

myValue = ulid.new()
myTS = myValue.timestamp()
myTS

Output

<Timestamp('01G3P83TRD')>

Let’s look at the various representations of a timestamp.

Timestamp as a String

import ulid

myTS = myValue.timestamp()
myTS.str

Output

'01G3P83TRD'

Timestamp as an Integer

import ulid

myTS = myValue.timestamp()
myTS.int

Output

1653235378957

Timestamp as Bytes

import ulid

myTS = myValue.timestamp()
myTS.bytes

Output

b'\x01\x80\xec\x81\xeb\r'

Date and Time of the Timestamp

import datetime
import ulid

myTS = myValue.timestamp()
myTS.datetime

Output

datetime.datetime(2022, 5, 23, 3, 22, 26, 80000)

Timestamp of a Timestamp

import ulid

myTS = myValue.timestamp()
myTS.timestamp

Output

1653235378.957
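To see how these representations relate, the sketch below decodes the 6 bytes from the "Timestamp as Bytes" output above using only the standard library. It assumes the bytes hold milliseconds since the Unix epoch, most significant byte first, and it interprets the result as UTC.

```python
from datetime import datetime, timedelta, timezone

# The 6 bytes from the "Timestamp as Bytes" output above
ts_bytes = b'\x01\x80\xec\x81\xeb\r'

millis = int.from_bytes(ts_bytes, "big")   # milliseconds since the epoch
seconds = millis / 1000                    # Unix time in seconds, as a float

# Adding a timedelta avoids float rounding in the millisecond part.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
created = epoch + timedelta(milliseconds=millis)

print(millis)    # 1653235378957
print(seconds)   # 1653235378.957
print(created)   # 2022-05-22 16:02:58.957000+00:00
```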

Creating and Displaying Randomness in Different Representations using the .randomness() function in Python ULID

A Randomness object holds the last 80 bits (10 bytes) of a ULID, which are written as 16 characters. It is a random value that is cryptographically secure.

Returns - A Randomness object from the last 80 bits
Return Type - Randomness
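For illustration, 80 bits of cryptographically secure randomness can be drawn from the operating system with the standard library. This sketch assumes os.urandom as the source; it is not taken from the ulid-py source code.

```python
import os

RANDOMNESS_BYTES = 10  # 80 bits / 8 bits per byte

rand_bytes = os.urandom(RANDOMNESS_BYTES)     # OS-provided secure random bytes
rand_int = int.from_bytes(rand_bytes, "big")  # the same value as an integer

print(len(rand_bytes) * 8)   # 80
print(rand_int < 2 ** 80)    # True
print(2 ** 80)               # 1208925819614629174706176 possible values
```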

Creating a Randomness Value from our ULID

With the help of the .randomness() function, we can get the randomness value of a ULID.

myRND = myValue.randomness()
myRND

Output

<Randomness('E3212F9Z9JTT2REW')>

Randomness as a String

myRND = myValue.randomness()
myRND.str

Output

'E3212F9Z9JTT2REW'

Randomness as Integer

myRND = myValue.randomness()
myRND.int

Output

532521850138518257492444

Randomness as Bytes

myRND = myValue.randomness()
myRND.bytes

Output

b'p\xc4\x11=?L\xb5\xa1a\xdc'

Crockford’s Base32

This form of encoding uses 5 bits for each character, one bit more per character than hexadecimal, so the same value needs fewer characters. It excludes the letters I, L, O, and U: I, L, and O to avoid visual confusion with the digits 1 and 0, and U to avoid accidental obscenity.

Characters Used in Crockford’s Base32

ulid.base32.ENCODING

Output

'0123456789ABCDEFGHJKMNPQRSTVWXYZ'

Crockford’s Base32 encoding is case insensitive and encodes everything to uppercase letters.

The base32 module provides multiple encoding and decoding functions. The encode_{knownPart} and decode_{knownPart} functions (for example, encode_ulid or encode_timestamp) are used when the kind of data being worked on is known. When it is not known, the generic encode and decode functions are used.

Using the byte values of a ULID and of its timestamp and randomness, we can encode each of them in base32 format.

myValue.bytes
myValue.timestamp().bytes
myValue.randomness().bytes

Output

b'\x01\x80\xec\x1b7\xfdp\xc4\x11=?L\xb5\xa1a\xdc'
b'\x01\x80\xec\x1b7\xfd'
b'p\xc4\x11=?L\xb5\xa1a\xdc'

Let’s encode the above bytes through base32 using encode_{knownPart} .

ulid.base32.encode_ulid(myValue.bytes)
ulid.base32.encode_timestamp(myValue.timestamp().bytes)
ulid.base32.encode_randomness(myValue.randomness().bytes)

Output

'01G3P1PDZXE3212F9Z9JTT2REW'
'01G3P1PDZX'
'E3212F9Z9JTT2REW'

Now, let’s encode using the encode function.

ulid.base32.encode(myValue.bytes)
ulid.base32.encode(myValue.timestamp().bytes)
ulid.base32.encode(myValue.randomness().bytes)

Output

'01G3P1PDZXE3212F9Z9JTT2REW'
'01G3P1PDZX'
'E3212F9Z9JTT2REW'

We can see that both methods produce the same output. The only difference is a small performance optimization: the encode_{knownPart} functions skip the step of working out what kind of data was passed.
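To make the encoding concrete, here is a minimal pure-Python sketch of Crockford’s base32 for byte strings. It is an illustration rather than the ulid-py implementation, and the function names crockford_encode and crockford_decode are hypothetical. It left-pads the value with zero bits to a multiple of 5, which is how a 128-bit ULID becomes 26 characters.

```python
ALPHABET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ'  # no I, L, O, or U


def crockford_encode(data: bytes) -> str:
    """Encode bytes as Crockford base32, 5 bits per character."""
    value = int.from_bytes(data, "big")
    length = -(-len(data) * 8 // 5)  # ceil(bits / 5) characters
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[value & 31])  # lowest 5 bits first
        value >>= 5
    return ''.join(reversed(chars))


def crockford_decode(text: str, size: int) -> bytes:
    """Decode a Crockford base32 string into `size` bytes (case insensitive)."""
    value = 0
    for char in text.upper():
        value = (value << 5) | ALPHABET.index(char)
    return value.to_bytes(size, "big")


print(crockford_encode(b'\x01\x80\xec\x1b7\xfd'))   # 6 bytes  -> 10 characters
print(len(crockford_encode(bytes(16))))             # 16 bytes -> 26 characters
print(crockford_decode('01g3p1pdzx', 6))            # lowercase input is accepted
```

The sketch does not map the excluded letters (for example, treating O as 0) when decoding; a fuller decoder might add that.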

Sorting Python ULIDs

The timestamp occupies the first 48 bits of a ULID value, so ULIDs can be sorted lexicographically by creation time with millisecond precision.

myULID = ulid.new()
myULID
myULID2 = ulid.new()
myULID2
myULID3 = ulid.from_timestamp(2678249158)
myULID3.timestamp().datetime
myULID<myULID2<myULID3

Output

<ULID('01G3P1A562FB6X1GC3718P8M58')>
<ULID('01G3P1FHX5M52YF4RQQGHEHQN7')>
datetime.datetime(2054, 11, 14, 6, 5, 58)
True
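String sorting works because the base32 alphabet is in ascending ASCII order and the timestamp comes first. A small check with the standard library, using two ULID strings shown earlier in this article (the helper name ulid_str_to_int is hypothetical):

```python
ALPHABET = '0123456789ABCDEFGHJKMNPQRSTVWXYZ'

# Two ULID strings from the earlier examples; the second has the later timestamp.
first = '01G3P1A562FB6X1GC3718P8M58'
second = '01G3P1FHX5M52YF4RQQGHEHQN7'


def ulid_str_to_int(text: str) -> int:
    """Read a 26-character ULID string as a 128-bit integer."""
    value = 0
    for char in text:
        value = value * 32 + ALPHABET.index(char)
    return value


# The alphabet itself is sorted, so comparing strings compares values.
print(list(ALPHABET) == sorted(ALPHABET))                 # True
print(sorted([second, first]) == [first, second])         # True
print(ulid_str_to_int(first) < ulid_str_to_int(second))   # True
```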

Python UUID String Without Dashes

When creating a Python UUID, you get a UUID object. Its .hex attribute returns the UUID as a 32-character string without any dashes.

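import uuid

from django.db import models


class myClass(models.Model):
    # Pass the callable uuid.uuid4, not uuid.uuid4().hex: calling it here would
    # run once at import time and give every row the same default value.
    myVar = models.UUIDField(default=uuid.uuid4, editable=False, unique=True)

    def __str__(self):
        # .hex is the 32-character hexadecimal form without dashes
        return self.myVar.hex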
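Outside of a Django model, the same attribute can be tried with the standard library alone:

```python
import uuid

my_uuid = uuid.uuid4()

print(my_uuid)       # canonical form with dashes (36 characters)
print(my_uuid.hex)   # 32 hexadecimal characters, no dashes

# The dashless form is the canonical string with the dashes removed.
print(my_uuid.hex == str(my_uuid).replace('-', ''))   # True
```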

Is Collision Possible in Python UUID?

Let’s discuss how UUID is generated.

The following factors are taken into consideration to create a version 1 UUID (uuid.uuid1()):

  • time
  • host’s ID
  • Randomizing Component

Therefore, if we were to create two UUIDs at the same time on the same host, the only remaining factor is the randomizing component. This component is 14 bits, which gives 2^14 = 16384 possible values, so two such IDs have a 1 in 16384 chance of colliding.
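The arithmetic behind that figure can be checked with the standard library, together with a look at the corresponding field of a uuid.uuid1() value. The birthday-style estimate at the end is an added illustration; it assumes every ID shares the same time and host and picks its 14-bit component uniformly at random, and the helper name collision_probability is hypothetical.

```python
import uuid

CLOCK_SEQ_BITS = 14
possible_values = 2 ** CLOCK_SEQ_BITS
print(possible_values)        # 16384

# uuid1() exposes the 14-bit component as clock_seq
u = uuid.uuid1()
print(u.clock_seq < possible_values)   # True


def collision_probability(n: int, space: int = possible_values) -> float:
    """Chance that at least two of n uniformly random values coincide."""
    p_unique = 1.0
    for i in range(n):
        p_unique *= (space - i) / space
    return 1.0 - p_unique


print(collision_probability(2))     # 1 in 16384
print(collision_probability(100))   # grows quickly as more IDs share a timestamp
```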

[ERROR] Python UUID is Not Defined

When creating a UUID model, you may come across the following error.

Cannot successfully create field 'myField' for model 'myModel': name 'UUID' is not defined.

A quick fix is to create a helper function within the model like so:

from uuid import uuid4

def createUUID():
    return str(uuid4()) # returns UUID string

How to Convert a Valid UUID string to a UUID type

Let’s take the following sample data.

{
        "product": "iPhone12",
        "parent": "AppleINC",
        "uuid": "0e61200e-2452-1334-746c-7f007d3b6f42"
    },

The Python uuid module allows us to create UUID objects. Passing the string to uuid.UUID converts it into a UUID object, and the .hex attribute returns its hexadecimal form without dashes.

import uuid

data = {
        "product": "iPhone12",
        "parent": "AppleINC",
        "uuid": "0e61200e-2452-1334-746c-7f007d3b6f42"
    }

# uuid.UUID raises ValueError if the string is not 32 hexadecimal digits
myUUID = uuid.UUID(data['uuid'])
print(myUUID)      # the UUID object, printed with dashes
print(myUUID.hex)  # the same value as a string without dashes
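Because uuid.UUID raises ValueError for malformed input, it can be useful to wrap the conversion. A minimal sketch follows; the helper name parse_uuid is hypothetical.

```python
import uuid
from typing import Optional


def parse_uuid(text: str) -> Optional[uuid.UUID]:
    """Return a UUID object, or None if the string is not a valid UUID."""
    try:
        return uuid.UUID(text)
    except ValueError:
        return None


print(parse_uuid("0180ec1c-2797-7f02-8925-c6c24cbcf161"))   # a UUID object
print(parse_uuid("not-a-uuid"))                             # None
```

Returning None keeps the caller in control of what to do with bad input, such as skipping the record or reporting it.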

Difference Between UUIDs and GUIDs

UUID | GUID
UUID stands for Universally Unique Identifier. | GUID stands for Globally Unique Identifier.
The terms UUID and GUID are used synonymously with each other. | UUID is the more common term of the two.
UUIDs are 128-bit labels for creating unique identification in computer systems. The chances of a UUID being replicated are slim. | Companies generate GUIDs whenever a unique number representation is required. It can be to reference a network or a product.

Pros and Cons of ULIDs

Pros | Cons
Sortability is required | Sortability should have sub-millisecond precision
The length of the Identifier should be limited | Creation time in the ULID can cause exposure to information leakage
No requirements for metadata such as creation time for retrieval | Not platform, language, or architecture-independent.

FAQs

How to fix Python UUID is not JSON serializable?

When serializing UUID objects to JSON, you may come across a TypeError saying that the UUID object is not JSON serializable. We can fix this by creating a custom UUIDEncoder, a subclass of json.JSONEncoder that converts UUID objects to strings.
Finally, we can pass the UUID object like so:

json.dumps(UUIDobj, cls=UUIDEncoder)
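One possible UUIDEncoder, offered as a sketch rather than the only option, subclasses json.JSONEncoder and converts UUID objects to strings:

```python
import json
import uuid


class UUIDEncoder(json.JSONEncoder):
    """JSON encoder that writes UUID objects as their string form."""

    def default(self, obj):
        if isinstance(obj, uuid.UUID):
            return str(obj)
        # Let the base class raise TypeError for other unsupported types.
        return super().default(obj)


UUIDobj = uuid.UUID("0180ec1c-2797-7f02-8925-c6c24cbcf161")
print(json.dumps({"id": UUIDobj}, cls=UUIDEncoder))
# {"id": "0180ec1c-2797-7f02-8925-c6c24cbcf161"}
```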

Can you Decode UUID?

A UUID cannot be decoded back into arbitrary input data, but its fields can be read. In particular, the variant bits identify the layout of the UUID. For example, if the variant bits begin with 110, the UUID uses the Microsoft GUID layout.
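Python’s uuid module exposes the variant and version fields directly. In the short example below, the second value is the well-known COM interface ID IID_IUnknown, chosen here only because its variant bits begin with 110.

```python
import uuid

random_uuid = uuid.uuid4()
print(random_uuid.variant == uuid.RFC_4122)   # True
print(random_uuid.version)                    # 4

# Variant bits 110 mark the Microsoft GUID layout.
ms_guid = uuid.UUID("00000000-0000-0000-c000-000000000046")
print(ms_guid.variant == uuid.RESERVED_MICROSOFT)   # True
print(ms_guid.version)                              # None (only defined for RFC 4122)
```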

Is Python UUID thread-safe?

For the most part, Python’s uuid module is thread-safe. However, uuid.uuid1() in Python 2.5 compares the current timestamp with a globally saved previous timestamp, and without a lock two threads can end up producing colliding values.

A simple solution is to use uuid.uuid4(), which relies on random data and makes collisions far less likely.
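As a small demonstration (not a proof of thread safety), the sketch below generates version 4 UUIDs from several threads and checks that none of them repeat. The thread count and the number of IDs per thread are arbitrary choices for illustration.

```python
import threading
import uuid

results = []


def worker(count: int) -> None:
    for _ in range(count):
        results.append(uuid.uuid4())   # list.append is atomic in CPython


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every generated value should be distinct.
print(len(results), len(set(results)))
```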

Conclusion

We have looked at Python’s ULID module and how we can create unique, sortable identifiers using the creation time. We have gone through the various classes of the module and how they represent the timestamp and randomness parts of a ULID.
