ClamAV antivirus scanning for file uploads in Python applications
Introduction
ClamAV is an open-source antivirus engine for detecting trojans, viruses, malware, and other malicious threats. Initially developed for email scanning on Unix-based systems such as Linux, it has evolved into a general-purpose antivirus solution that also runs on Windows and macOS. Known for its reliability, ease of use, and frequent updates, ClamAV has become a popular choice for both individual users and organizations seeking effective protection against cyber threats.
clamd is a portable Python module that uses the ClamAV antivirus engine on Windows, Linux, macOS, and other platforms. It shares its name with the ClamAV daemon (clamd) and requires a running instance of that daemon. The steps below show how to install ClamAV and how to use it from a Python application.
Technical implementation
Step 1: Open the terminal and install ClamAV locally. On macOS, run “brew install clamav”.
Step 2: Go to the ClamAV configuration directory, which on macOS (Homebrew on Apple Silicon) is /opt/homebrew/etc/clamav/. Run “cd /opt/homebrew/etc/clamav/” and you will find two sample files:
- clamd.conf.sample
- freshclam.conf.sample
Step 3: Run “cp freshclam.conf.sample freshclam.conf” to create the config file, then open the newly created file and comment out the Example line (Example -> #Example).
Step 4: Open the terminal and update the ClamAV database using “freshclam -v”.
Step 5: Run “cp clamd.conf.sample clamd.conf”, then open the newly created file and apply the following changes:
- Comment out Example -> #Example
- Uncomment TCPSocket
- Uncomment TCPAddr
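As a quick sanity check for Steps 3 and 5, the sketch below parses the text of a ClamAV config file and reports which directives are active (not commented out). The helper name active_directives and the sample excerpt are illustrative assumptions, not part of ClamAV, and the parsing assumes the simple one-directive-per-line format with # comments used in the sample files.

```python
def active_directives(config_text):
    """Return {directive: value} for every line that is not commented out.

    If a directive appears more than once, the last occurrence wins.
    """
    directives = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # split on the first run of whitespace
        directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return directives


# Hypothetical excerpt of an edited clamd.conf (the values are assumptions).
sample = """
#Example
TCPSocket 3310
TCPAddr localhost
"""

found = active_directives(sample)
print("Example still active:", "Example" in found)
print("TCPSocket:", found.get("TCPSocket"))
print("TCPAddr:", found.get("TCPAddr"))
```

To check a real file, read it with open(path).read() and pass the text to the helper. If "Example" is still reported as active, the config has not been edited yet.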
Step 6: With ClamAV installed and configured, start the daemon locally with “clamd --foreground”. This runs the ClamAV service on
- HOST = localhost
- PORT = 3310
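Before wiring the scanner into an application, it can help to confirm that the daemon is actually listening. The sketch below uses only the standard library and sends clamd's PING command over TCP, expecting PONG back. The function names clamd_is_up and is_pong are hypothetical helpers, and the host and port are the values assumed in Step 6. The clamd Python package installed in the next step offers a ping() method that serves the same purpose.

```python
import socket


def is_pong(reply):
    """Return True if the raw reply bytes are clamd's PONG answer."""
    return reply.strip() == b"PONG"


def clamd_is_up(host="127.0.0.1", port=3310, timeout=5.0):
    """Return True if a clamd daemon answers PING with PONG on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # clamd accepts newline-terminated commands prefixed with "n".
            sock.sendall(b"nPING\n")
            reply = sock.recv(64)
    except OSError:
        # Connection refused, timeout, unreachable host, and similar errors.
        return False
    return is_pong(reply)


if __name__ == "__main__":
    print("clamd reachable:", clamd_is_up())
```

If this prints False, check that “clamd --foreground” is still running and that TCPSocket and TCPAddr are uncommented in clamd.conf.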
Step 7: Now, in your application, install the clamd Python package using the command below.
pip install clamd==1.0.2
Step 8: Once installed, import the package in your app as below to scan files in a specific path.
    import os

    import clamd


    def scan_file_using_file_path():
        # Connect to the clamd daemon started in Step 6.
        cd = clamd.ClamdNetworkSocket(host="127.0.0.1", port=3310, timeout=100)
        cwd = os.getcwd()
        for file in os.listdir("files"):
            # clamd expects an absolute path that the daemon itself can read.
            path = os.path.join(cwd, "files", file)
            res = cd.scan(path)
            print(res)


    scan_file_using_file_path()

Results:
    {'/Users/PycharmProjects/ClamAVAntivirus/files/test_xls.xlsx': ('OK', None)}
    {'/Users//PycharmProjects/ClamAVAntivirus/files/users.txt': ('OK', None)}
    {'/Users/PycharmProjects/ClamAVAntivirus/files/testfile.txt': ('FOUND', 'Win.Test.EICAR_HDB-1')}
    {'/Users/PycharmProjects/ClamAVAntivirus/files/test_pdf.pdf': ('OK', None)}
    {'/Users/PycharmProjects/ClamAVAntivirus/files/image.jpg': ('OK', None)}
    Process finished with exit code 0
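Each scan call returns a dictionary that maps a file path to a (status, signature) pair, as in the results above. A minimal sketch of how an application might pick out only the flagged files is shown here. The helper name infected_files is an assumption for illustration, and the sample paths are shortened stand-ins for the real ones.

```python
def infected_files(scan_results):
    """Map each flagged path to the signature name reported for it.

    scan_results is a dict shaped like the scan output:
    {path: (status, signature_or_None)}.
    """
    return {
        path: signature
        for path, (status, signature) in scan_results.items()
        if status == "FOUND"
    }


# Sample data in the same shape as the scan results (paths shortened).
results = {
    "files/users.txt": ("OK", None),
    "files/testfile.txt": ("FOUND", "Win.Test.EICAR_HDB-1"),
}
print(infected_files(results))
# prints {'files/testfile.txt': 'Win.Test.EICAR_HDB-1'}
```

Because every call returns its own one-entry dictionary, the per-file results can be merged with dict.update() before being passed to a helper like this.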
After scanning the files, ClamAV reports whether any malicious item was found. Below is an example of scanning a file when the input is a byte stream. It relies on the antivirus_scanner object and the exception classes defined in the section "Class to connect to ClamAV service" below.

    from io import BytesIO

    # antivirus_scanner, AntivirusScannerException and MaliciousContentException
    # come from the ClamAvScanner module shown in the next section.


    def scan_file_using_byte_stream():
        with open("files/testfile.txt", "rb") as fh:
            buf = BytesIO(fh.read())
        try:
            res = antivirus_scanner.scan_stream(stream=buf)
        except (AntivirusScannerException, MaliciousContentException) as e:
            print(e)
        else:
            if res == "OK":
                print("File scanned successfully, no potential malware found")
            else:
                print("Something went wrong!!!")


    scan_file_using_byte_stream()

Results:
    ***** Malicious content found in file scanning, reason: Win.Test.EICAR_HDB-1 *****
    Process finished with exit code 0
Class to connect to ClamAV service
    from io import BytesIO
    from typing import Optional, Tuple

    import clamd


    class AntivirusScannerException(Exception):
        pass


    class MaliciousContentException(AntivirusScannerException):
        def __init__(self, reason: str):
            super().__init__(f"Malicious content found in file scanning, reason: {reason}")


    class InvalidScanResultStatus(AntivirusScannerException):
        def __init__(self, status: str):
            super().__init__(f"Undefined status returned from antivirus scanner: {status}")


    class ClamAvScanner:
        _CONNECTION_TIMEOUT = 60
        _DEFAULT_PORT = 3310
        _MALICIOUS_STATUSES = frozenset(("ERROR", "FOUND"))

        def __init__(self, hostname: str, port: Optional[int]):
            self._hostname = hostname
            self._port = port or self._DEFAULT_PORT

        def scan_stream(self, stream: BytesIO):
            connection = self._init_connection()
            try:
                output = connection.instream(stream)
            except (clamd.BufferTooLongError, clamd.ConnectionError) as exc:
                raise AntivirusScannerException(
                    "Unable to scan stream due to internal issues"
                ) from exc
            return self._output(output["stream"])

        def _init_connection(self) -> clamd.ClamdNetworkSocket:
            return clamd.ClamdNetworkSocket(
                host=self._hostname,
                port=self._port,
                timeout=self._CONNECTION_TIMEOUT,
            )

        def _output(self, output: Tuple[str, str]):
            status, reason = output
            if status in self._MALICIOUS_STATUSES:
                raise MaliciousContentException(reason)
            if status != "OK":
                raise InvalidScanResultStatus(status)
            return "OK"


    antivirus_scanner = ClamAvScanner(hostname="127.0.0.1", port=3310)
To determine which virus ClamAV detected in a file, check the scan result. With the clamd module, each result is a (status, signature) pair: for an infected file the status is FOUND and the second element is the signature name, for example ('FOUND', 'Win.Test.EICAR_HDB-1') in the path-based scan results above. You can then search for that signature name online to learn more about the threat's characteristics and potential impact.
Links:
- https://pypi.org/project/clamd/
Some best practices:
- Allowing only the required file formats for upload (e.g. .pdf, .jpeg, .xlsx).
- Scanning files for potential malware before storing.
- Limiting max file size.
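The first and third practices can run as a cheap pre-check before a file is handed to the scanner. The sketch below is one possible shape for that check. The function name precheck_upload, the allowed extensions, and the 10 MiB limit are assumptions chosen for illustration, not values prescribed by ClamAV or the clamd package.

```python
import os

# Assumed policy values for illustration; tune them for your application.
ALLOWED_EXTENSIONS = frozenset({".pdf", ".jpeg", ".xlsx"})
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # 10 MiB


def precheck_upload(filename, size_bytes):
    """Return (ok, reason) for an upload before it is sent to the scanner."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return False, f"file type {extension or '(none)'} is not allowed"
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "file exceeds the maximum allowed size"
    return True, "ok"


print(precheck_upload("report.pdf", 2048))   # (True, 'ok')
print(precheck_upload("script.exe", 2048))   # (False, 'file type .exe is not allowed')
```

An extension check only looks at the file name, so it complements the malware scan rather than replacing it. Files that pass the pre-check should still be scanned before they are stored.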
Conclusion
In conclusion, ClamAV continues to be a trusted antivirus solution, providing essential protection against a wide range of malicious threats across multiple platforms. With its open-source nature, frequent updates, and robust detection capabilities, ClamAV remains a valuable tool in the fight against malware. Whether you’re an individual user or a large organization, integrating ClamAV into your cybersecurity strategy can help safeguard your systems and data from the ever-evolving landscape of cyber threats.