UPC Validation in Python: Making it Easy

This is turning into a bit of a rabbit hole, but today I noticed that UPCs (Universal Product Codes) also have a check digit. UPCs are the numbers (usually accompanied by a bar code) that appear on just about every item and enable those cool scanners at the checkout of your local supermarket.

Interestingly enough, the check digit for a UPC-A code (the “normal” UPC) is generated almost like the check digit for ISBN-13. Here are the major differences:

UPCs only have 12 digits (including the check digit) versus 13 for ISBN-13
The check digit is calculated by weighting the digits in ODD positions by 3 (versus the digits in EVEN positions in ISBN-13), counting positions from 1 starting at the left

Other than that, it’s the same process!
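
To make the difference concrete, here is a minimal sketch that computes both check digits with one function. The name `check_digit` and the `triple_first` flag are my own illustration and are not part of the validator class; the sample numbers are a commonly cited UPC-A (036000291452) and ISBN-13 (978-0-306-40615-7).

```python
def check_digit(payload: str, triple_first: bool) -> int:
    """Return the mod-10 check digit for a string of digits (hypothetical helper).

    triple_first=True  weights positions 1, 3, 5, ... by 3 (UPC-A).
    triple_first=False weights positions 2, 4, 6, ... by 3 (ISBN-13).
    """
    total = 0
    for index, char in enumerate(payload):
        # index is 0-based, so an even index is an odd (1-based) position.
        in_odd_position = index % 2 == 0
        weight = 3 if in_odd_position == triple_first else 1
        total += int(char) * weight
    # Smallest digit that brings the weighted sum up to a multiple of 10.
    return (10 - total % 10) % 10


upc_check = check_digit("03600029145", triple_first=True)     # first 11 digits of the UPC
isbn_check = check_digit("978030640615", triple_first=False)  # first 12 digits of the ISBN
print(upc_check, isbn_check)  # prints "2 7"
```

For the UPC, the odd-position digits sum to 14 (times 3 is 42) and the even-position digits sum to 16, giving 58, so the check digit is 2.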

Since I’m SO close to also having a UPC validator, I think I’ll go for it, and I’ll also do the implementation a bit differently. Specifically, I’m going to validate the UPC by calculating the check digit from the first 11 digits and comparing it to the 12th, instead of computing the checksum for the whole number and making sure it is evenly divisible by 10. Here’s what my code looks like. Pay special attention to the line that computes weight, where I’ve changed the formula slightly to weight the odd positions:

class ISBNValidator:
    class FormatException(Exception):
        pass

    @staticmethod
    def prepare_code_string(code_string: str) -> str:
        retval = code_string.replace("-", "")
        retval = retval.replace(" ", "")
        return retval
...
    @staticmethod
    def calculate_upc_checkdigit(first_11_numbers: str) -> str:
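        # isdecimal() accepts only characters that int() can convert; isnumeric() also
        # accepts characters such as "½" that would make int() raise ValueError.
        if len(first_11_numbers) != 11 or not first_11_numbers.isdecimal():
            raise ISBNValidator.FormatException("Improper format in first 11 numbers of UPC")
        checksum = 0
        for count, digit in enumerate(first_11_numbers):
            # count is 0-based, so count+1 is the 1-based position:
            # odd positions get weight 3, even positions get weight 1.
            weight = 1 + (((count+1) % 2) * 2)
            checksum += int(digit) * weight
        # Smallest digit that brings the weighted sum up to a multiple of 10.
        checkdigit = (10 - (checksum % 10)) % 10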
        return str(checkdigit)

    @staticmethod
    def validate_upc(code_string: str) -> bool:
        upc_string = ISBNValidator.prepare_code_string(code_string)
        if len(upc_string) != 12:
            return False
        retval = ISBNValidator.calculate_upc_checkdigit(upc_string[:-1]) == upc_string[-1:]
        return retval
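
For comparison, here is a minimal sketch of the other strategy: weight all 12 digits, check digit included, and test whether the sum is evenly divisible by 10. The function `upc_checksum_is_valid` is hypothetical and not part of my class; it assumes spaces and dashes have already been removed.

```python
def upc_checksum_is_valid(upc_string: str) -> bool:
    """Validate a 12-digit UPC-A via the full weighted sum (hypothetical helper)."""
    if len(upc_string) != 12 or not upc_string.isdecimal():
        return False
    checksum = 0
    for count, digit in enumerate(upc_string):
        # Same weighting as the check digit calculation: 3 for positions
        # 1, 3, 5, ... and 1 otherwise. The check digit sits in position 12,
        # so it is added with weight 1.
        weight = 1 + (((count + 1) % 2) * 2)
        checksum += int(digit) * weight
    return checksum % 10 == 0


# The weight formula alternates 3, 1, 3, 1, ... across the 12 positions.
weights = [1 + (((count + 1) % 2) * 2) for count in range(12)]
valid = upc_checksum_is_valid("036000291452")
invalid = upc_checksum_is_valid("036000291453")
print(weights, valid, invalid)
```

Both strategies should accept and reject the same codes, because adding the correct check digit (with weight 1) is exactly what rounds the weighted sum of the first 11 digits up to a multiple of 10.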

Notice that I also updated the prepare_code_string() method to not only trim spaces, but also remove them from inside the code string. This is because UPCs often have embedded spaces instead of dashes.

I also updated my unit tests accordingly to exercise the new methods and all looks good.
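
My actual tests aren't shown here, but a sketch of the kind of cases worth covering might look like the following. To keep it runnable on its own, it uses a small stand-in function, `validate_upc`, in place of `ISBNValidator.validate_upc`; the stand-in is an assumption for illustration and, unlike the class method, simply returns False for non-digit input.

```python
import unittest


def validate_upc(code_string: str) -> bool:
    """Stand-in for ISBNValidator.validate_upc so this sketch runs on its own."""
    digits = code_string.replace("-", "").replace(" ", "")
    if len(digits) != 12 or not digits.isdecimal():
        return False
    # Digits in positions 1, 3, 5, ... (indices 0, 2, 4, ...) are weighted by 3.
    checksum = sum(
        int(d) * (3 if i % 2 == 0 else 1) for i, d in enumerate(digits[:-1])
    )
    return (10 - checksum % 10) % 10 == int(digits[-1])


class TestValidateUPC(unittest.TestCase):
    def test_valid_upcs(self):
        self.assertTrue(validate_upc("036000291452"))
        self.assertTrue(validate_upc("012345678905"))

    def test_embedded_spaces_and_dashes(self):
        self.assertTrue(validate_upc("0 36000 29145 2"))
        self.assertTrue(validate_upc("0-36000-29145-2"))

    def test_wrong_check_digit(self):
        self.assertFalse(validate_upc("036000291453"))

    def test_wrong_length(self):
        self.assertFalse(validate_upc("03600029145"))
```

Saved to a file, the tests can be run with `python -m unittest`.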

Adding this extra capability was fun, but now I’m a bit concerned because my ISBNValidator class is no longer only validating ISBNs… It’s also doing UPCs! Also, I feel like I’ve got some duplicate code that I can potentially streamline. Tomorrow, I’m going to go through a process called refactoring to clean this up and make this into something that I’m a bit prouder of.
