Burak's Blog

Blog about various subjects


Enhancing Code Quality with Abstract Classes: The Power of Abstraction

Introduction

In the world of object-oriented programming, abstract classes play a crucial role in improving code quality, promoting code reuse, and enabling flexible design. By encapsulating common behavior and providing a blueprint for derived classes, abstract classes facilitate the creation of well-organized, maintainable, and extensible code. In this article, we will explore the benefits of using abstract classes to enhance code and demonstrate how they contribute to building efficient and robust software solutions.

I often observe that developers frequently forget or lack knowledge of how to effectively utilize object-oriented programming in Python. In the fast-paced tech world, Python empowers developers to rapidly write intricate code, test it, and bring it into production. However, the emphasis on features and speed often overshadows the importance of code quality and testability.

Strongly-typed languages such as Java and C# impose a stricter adherence to object-oriented coding practices, prompting developers to consider code organization more thoroughly. On the other hand, weakly-typed languages like Python or Ruby do not impose strict object-oriented requirements, allowing developers to build entire services using only functions. As a result, it is quite common to encounter Python projects where classes, for instance, are utilized in a minimalistic manner.

Not long ago, I came across a code that one of my colleagues had pushed into production. However, I noticed that the code lacked adherence to Pythonic principles and proved to be challenging to test. Whenever writing tests for a piece of code becomes complex and extends beyond 50 lines, it typically indicates a strong signal that the code requires improvements and a reconsideration of its overall structure. The code below is a simplified version of the original.

fp = tempfile.TemporaryFile()
fp.write(b'Hello world!')
try:
    with ClamdVirusScan() as virus_scanner:
        virus_scan_result = VirusScanningStatus.CLEAN

        for result in virus_scanner.scan_files(fp.name):
            if result.status != "ERROR":
                virus_scan_result = VirusScanningStatus.INFECTED
                os.remove(fp.name)
            else:
                virus_scan_result = VirusScanningStatus.PENDING

except VirusScanConnectionError as error:
    virus_scan_result = VirusScanningStatus.PENDING

Abstract Classes and Pythonic Code Refactoring

In the provided code snippet, we have a section of code responsible for scanning files for viruses. However, as we attempted to write tests for this code, we encountered challenges in the testing process. The complexity of the code made it difficult to write clean and understandable tests without excessive use of mocking. This is a signal that the code may require some refactoring to improve testability. One of the first steps we took to address this issue was to abstract the virus scanner functionality. Abstraction allows us to define a common interface that multiple virus scanning implementations can adhere to, providing a standard set of methods while still allowing the flexibility to switch between different scanning services.>

class VirusScanConnectionError(Exception):
    """
    To handle generic virus scan exceptions
    """

    def __init__(self, message: str = None):
        super().__init__(message)


@attr.s(frozen=True)
class VirusScanFiles(ABC):
    """Scanned file status list"""

    @abstractmethod
    def __len__(self):
        """virus scanned file list size"""
        raise NotImplementedError

    @abstractmethod
    def __getitem__(self, index):
        """retrieve file by index"""
        raise NotImplementedError


@attr.s(frozen=True)
class VirusScan(ABC):
    """Virus Scan Files list representation"""

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type == VirusScanConnectionError:
            logging.info("No virus scanner found")
        else:
            self.close()

    @abstractmethod
    def open(self):
        """Open connection with the virus scanner"""
        raise NotImplementedError

    @abstractmethod
    def close(self):
        """Close connection with the virus scanner"""
        raise NotImplementedError

    @abstractmethod
    def scan_files(self, path) -> VirusScanFiles:
        """Scan one or more files"""
        raise NotImplementedError

The abstraction layer consists of three main components:

  1. VirusScanConnectionError This is a custom exception class that we use to handle generic virus scan exceptions. By defining this exception class, we can raise specific errors when dealing with issues related to the virus scanner.
  2. VirusScanFiles (Abstract Base Class) This class represents a list of scanned files’ statuses. We declare this as an abstract class using the ABC (Abstract Base Class) module to indicate that it defines an interface without providing an implementation. The abstract methods __len__ and __getitem__ are declared, indicating that any concrete implementation of this class must provide these methods. This abstract class serves as a blueprint for the concrete classes that will represent the scanned file list
  3. VirusScan (Abstract Base Class) This class represents the interface for the virus scanner itself. Like the VirusScanFiles class, it is an abstract class defined using the ABC module. It defines the context management methods __enter__ and __exit__, which allow us to use the with statement to open and close connections with the virus scanner. The abstract methods open, close, and scan_files are declared, leaving the implementation details to the concrete classes that inherit from this abstract class.

By abstracting the virus scanner, we achieve several key benefits. First, it allows us to decouple the code from a specific virus scanning service. This means we are no longer tied to a single implementation and can easily switch between different virus scanners if needed. Additionally, the use of abstract classes and methods provides a clear contract for the required behavior, ensuring that any concrete implementation adheres to the same standard.

In the next section, we will demonstrate how we apply these abstract classes to refactor the original code, resulting in a more organized, maintainable, and testable solution. Stay tuned for the refactored version and a detailed explanation of the improvements achieved through abstraction.

Refactoring the Code Using Abstract Classes

Now that we have our abstraction layer in place with the abstract classes, VirusScan and VirusScanFiles, let’s see how we can refactor the original code to take advantage of these abstractions and improve its structure and testability.

The original code that we encountered was responsible for scanning files for viruses using a specific virus scanning service. However, it lacked proper organization and was challenging to test without heavy use of mocking.

To address these issues, we’ll introduce new classes, ClamdVirusScanFile, ClamdVirusScanFiles, and ClamdVirusScan, which will implement the abstract classes and handle the virus scanning logic accordingly.

  1. ClamdVirusScanFile - Virus Scanned File Result The ClamdVirusScanFile class represents the result of a scanned file. It contains attributes such as filename, status, and reason. This class encapsulates the scanned file’s details and provides a cleaner representation for the scanned file results.
  2. ClamdVirusScanFiles - Scanned File Status List The ClamdVirusScanFiles class implements the VirusScanFiles abstract class. It takes the results from the virus scanning service and processes them to create a list of ClamdVirusScanFile instances. By doing so, it provides a clear and standardized representation of the scanned file status list. The implementation of __len__ and __getitem__ allows us to access the scanned files in a more Pythonic way, improving code readability and making it easier to work with the scanned file results.
  3. ClamdVirusScan - Clamd Virus Scan Connection The ClamdVirusScan class implements the VirusScan abstract class, providing the concrete implementation for connecting to the ClamAV virus scanning service. It uses the popular pyclamd library to interact with ClamAV via Unix sockets. The open method initializes the connection, and the close method closes it when necessary. The scan_files method takes a file path as input, checks whether it is a file or a directory, and performs the appropriate scan operation. It then returns the results in the form of ClamdVirusScanFiles, adhering to the abstraction provided by the VirusScanFiles abstract class.

Applying the Abstractions:

By refactoring the code to utilize these abstract classes and implementing concrete classes for the ClamAV virus scanning service, we achieve several benefits. The code becomes more organized, and the responsibilities of each class are well-defined, making the codebase easier to understand and maintain. Additionally, the abstract classes provide a clear contract for the required methods, which helps ensure that the code adheres to a standardized structure.

Improved Testability:

With the refactored code, testing becomes more straightforward and less reliant on heavy mocking. We can now create test cases that interact with the concrete classes representing the virus scanner, making our unit tests more focused and readable. This improved testability helps catch issues earlier in the development process and promotes a more robust and reliable codebase.

In conclusion, refactoring the original code using abstract classes and concrete implementations has greatly enhanced the code’s organization, maintainability, and testability. By abstracting the virus scanner and providing a clear interface, we ensure that the code can easily adapt to different virus scanning services, while also adhering to Pythonic principles for improved readability and maintainability.

The Benefits of Abstraction

Abstracting the virus scanner through the use of abstract classes in Python brings significant advantages to our code. It is important to note that these abstractions are applicable to all object-oriented languages, not just Python. By decoupling the code from a specific virus scanning service, we gain the flexibility to switch between different implementations while still adhering to a standard set of methods. This promotes code reusability, ensuring that our codebase remains scalable and adaptable. Moreover, the clear separation of concerns achieved through abstraction improves code organization and maintainability. With standardized interfaces, writing focused unit tests becomes more straightforward, allowing us to detect issues early and create a more robust and reliable software solution. Embracing abstraction empowers developers to build efficient and extensible code, aligning with Pythonic principles for cleaner, more readable codebases across various object-oriented programming languages.

Conclusion

In conclusion, abstract classes play a pivotal role in the world of object-oriented programming, offering numerous benefits when used strategically. Through this article, we explored how abstract classes enhance code quality, promote code reuse, and enable flexible design. Refactoring the original code by abstracting the virus scanner demonstrated the power of abstraction in Python.

By encapsulating common behavior and providing a standardized blueprint for derived classes, abstract classes lead to well-organized, maintainable, and extensible code. The refactored code showcased how abstract classes, such as VirusScan and VirusScanFiles, allowed us to decouple the code from a specific virus scanning service, making it possible to switch between different implementations seamlessly.

Moreover, the clear separation of concerns facilitated cleaner code organization and improved testability. Through standardized interfaces, unit tests became more focused, aiding in the early detection of issues and contributing to a more robust software solution.

It is worth emphasizing that the benefits of abstraction extend beyond Python; these abstractions are applicable to all object-oriented languages. Embracing abstraction empowers developers to create efficient and scalable codebases that adhere to best practices across various programming languages.

In the fast-paced tech world, where speed and functionality are often prioritized, it is crucial not to overlook the importance of code quality and maintainability. By leveraging the power of abstraction and adopting Pythonic principles, developers can build elegant and readable code that withstands the test of time and evolves with changing requirements.

As developers, let us strive to harness the full potential of abstract classes in Python and other object-oriented languages, as they pave the way for crafting robust, efficient, and adaptable software solutions. With a solid foundation in abstraction, we unlock a world of possibilities, enabling us to build exceptional applications that stand as a testament to the art and science of programming.