Create custom datatypes using Pydantic module in Python
We often need to pass a long list of variables to a function, and listing all of them in the function signature gets messy. Problems also arise when the values passed in need validation. With many variables, validating the data inside the main function body is hard to maintain and is not good practice. A better approach is to group the variables into separate classes. Here, we demonstrate how you can use pydantic to create models along with your own custom validations. First, let's discuss the use case.
Suppose we receive some data from an API call and need to run some analysis on it. An API typically sends its response as JSON, so we want our models to be able to serialize and deserialize JSON (1).
We also assume the types of certain variables. For example, if we are passing an address, we assume the pincode is an integer. This is type checking (2).
To perform the analysis, we make some assumptions about the data, for example that the pincode matches the district name provided. This is validation (3).
We might also require that certain fields, such as the state, take a value from a fixed list (say, the states of India) rather than any arbitrary value. This falls under cleaning (4).
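As a rough illustration of requirement (4), here is a minimal sketch that uses only Python's built-in Enum, without pydantic. The helper name clean_state and the strip/upper-case normalisation are assumptions made for this example and are not part of the models built later.

```python
from enum import Enum


class StateTypes(str, Enum):
    # A small, illustrative subset of state codes
    DELHI = "DLH"
    UTTAR_PRADESH = "UP"
    WEST_BENGAL = "WB"


def clean_state(raw: str) -> StateTypes:
    """Return the matching enum member; raise ValueError for unknown codes."""
    # Normalising whitespace and case is an assumption of this sketch
    return StateTypes(raw.strip().upper())


state = clean_state(" up ")
print(state.name, state.value)

try:
    clean_state("XX")
except ValueError as exc:
    # Looking up a value that is not in the Enum raises ValueError
    print("rejected:", exc)
```

Pydantic performs the same kind of lookup when a field is annotated with an Enum class, which is why an unknown state code fails validation.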
So, with these four requirements, let's start coding our model. We assume that you have Python installed on your system. To install pydantic, simply run:
pip install pydantic
With that set, create a file called models.py and paste the code below into it. We have added detailed inline comments to the code to make it easier to follow.
Python3
# import required modules
# note: validator and root_validator are the pydantic v1 decorators;
# pydantic v2 deprecates them in favour of field_validator and model_validator
from enum import Enum
from typing import Optional

from pydantic import BaseModel, PositiveInt, validator, root_validator, constr


# custom class used as choices for state
# pydantic supports choices through Python's built-in Enum,
# which removes the need for additional packages
class StateTypes(str, Enum):
    DELHI = "DLH"
    UTTAR_PRADESH = "UP"
    BENGALURU = "BLR"
    WEST_BENGAL = "WB"


# class to hold personal details
class PersonalDetails(BaseModel):
    id: int
    # constr lets us specify the min and max length
    name: constr(min_length=2, max_length=15)
    phone: PositiveInt

    # validation at field level
    @validator("phone")
    def phone_length(cls, v):
        # a phone number should typically have 10 digits
        if len(str(v)) != 10:
            raise ValueError("Phone number must be of ten digits")
        return v


# class to hold an address
class Address(BaseModel):
    id: int
    address_line_1: constr(max_length=50)
    # marking a field as optional
    address_line_2: Optional[constr(max_length=50)] = None
    pincode: PositiveInt
    city: constr(max_length=30)
    # using choices is this simple:
    # create an Enum class with the choices
    # and pass that class as the type of the field
    state: StateTypes

    @validator("pincode")
    def pincode_length(cls, v):
        if len(str(v)) != 6:
            raise ValueError("Pincode must be of six digits")
        return v


# using BaseModels as custom datatypes
# in the User class
class User(BaseModel):
    personal_details: PersonalDetails
    address: Address

    # skip_on_failure=True skips this validator when
    # validation of the individual fields has already failed
    @root_validator(skip_on_failure=True)
    def check_id(cls, values):
        # custom validation ensuring personal_details.id
        # is the same as address.id
        personal_details: PersonalDetails = values.get("personal_details")
        address: Address = values.get("address")
        if personal_details.id != address.id:
            raise ValueError(
                "ID field of both personal_details as well as address should match"
            )
        return values


# Driver Code
if __name__ == "__main__":
    # testing models
    validated_data = {
        "personal_details": {
            "id": 1,
            "name": "w3wiki",
            "phone": 9999999999,
        },
        "address": {
            "id": 1,
            "address_line_1": "Sector- 136",
            "pincode": 201305,
            "city": "Noida",
            "state": "UP",
        },
    }
    # this works without any error because
    # no validation fails
    user = User(**validated_data)
    # prints the standard __str__ value for the model
    print(user)

    unvalidated_data = {
        "personal_details": {
            "id": 1,
            "name": "w3wiki",
            "phone": 9999999999,
        },
        "address": {
            "id": 2,
            "address_line_1": "Sector- 136",
            "pincode": 201305,
            "city": "Noida",
            "state": "UP",
        },
    }
    # this raises a validation error since the IDs
    # are different
    user = User(**unvalidated_data)
    print(user)
Output:
When you run this, the first print statement executes successfully. The second initialization of the User model raises a ValidationError (a subclass of ValueError), because the IDs in the personal details and the address do not match.
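In practice you would usually catch the error instead of letting the script stop. The following is a minimal sketch, assuming a simplified Address model (plain int and str fields) and a hypothetical helper parse_address; neither is part of models.py above. It uses only BaseModel and ValidationError from pydantic and parses the JSON payload with the standard json module.

```python
import json

from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    # Simplified model for illustration only
    id: int
    pincode: int
    city: str


def parse_address(raw_json: str):
    """Return (address, errors); exactly one of the two is None."""
    try:
        return Address(**json.loads(raw_json)), None
    except ValidationError as exc:
        # exc.errors() returns a list of dicts, one per failed field
        return None, exc.errors()


address, errors = parse_address('{"id": 1, "pincode": 201305, "city": "Noida"}')
print(address.city)

address, errors = parse_address('{"id": 1, "pincode": "abc", "city": "Noida"}')
for err in errors:
    # "loc" names the offending field, "msg" describes the problem
    print(err["loc"], err["msg"])
```

Returning the error list rather than re-raising is just one possible design; it lets the caller decide whether to log the bad record, skip it, or abort.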