Pytest for Data Scientists

Motivation

As a data scientist, one way to test your Python code is by using an interactive notebook to verify the accuracy of the outputs.

However, this approach does not guarantee that your code works as intended in all cases.

A better approach is to identify the expected behavior of the code in various scenarios, and then verify if the code executes accordingly.

For example, testing a function used to extract the sentiment of a text might include checking whether:

The function returns a value that is greater than 0 if the text is positive.
The function returns a value that is less than 0 if the text is negative.

#sentiment.py

def test_extract_sentiment_positive():

    text = "I think today will be a great day"

    sentiment = extract_sentiment(text)

    assert sentiment > 0

def test_extract_sentiment_negative():

    text = "I do not think this will turn out well"

    sentiment = extract_sentiment(text)

    assert sentiment < 0

Besides ensuring that your code works as intended, incorporating testing in a data science project also provides the following benefits:

Identifies edge cases.
Enables safe replacement of existing code with enhanced versions, without risking disruption of the entire process.
Makes it easier for your teammates to understand the behaviors of your functions.

While Python offers various testing tools, pytest is one of the most user-friendly options.

The source code for this article can be found here.

Get Started with Pytest

Pytest is a framework that makes it easy to write small tests in Python. I like pytest because it helps me write tests with minimal code. If you are not familiar with testing, pytest is a great tool to get started with.

To install pytest, run

pip install -U pytest

To test the extract_sentiment function, create a function that starts with test_ followed by the name of the tested function.

#sentiment.py
from textblob import TextBlob

def extract_sentiment(text: str):
    '''Extract sentiment using textblob.
    Polarity is within range [-1, 1]'''

    text = TextBlob(text)

    return text.sentiment.polarity

def test_extract_sentiment():

    text = "I think today will be a great day"

    sentiment = extract_sentiment(text)

    assert sentiment > 0

That’s it! Now we are ready to run the test.

To test the sentiment.py file, run:

pytest sentiment.py

Pytest will run all functions in sentiment.py whose names start with test. The output of the test above will look like this:

========================================= test session starts ==========================================
sentiment.py .                                                                                    [100%]
========================================= 1 passed in 0.68s ===========================================

To see what a failing test looks like, flip the assertion to sentiment < 0. Pytest will then produce the following output:

#sentiment.py

def test_extract_sentiment():

    text = "I think today will be a great day"

    sentiment = extract_sentiment(text)

    assert sentiment < 0

$ pytest sentiment.py
========================================= test session starts ==========================================
sentiment.py F                                                                                    [100%]
=============================================== FAILURES ===============================================
________________________________________ test_extract_sentiment ________________________________________
def test_extract_sentiment():
    
        text = "I think today will be a great day"
    
        sentiment = extract_sentiment(text)
    
>       assert sentiment < 0
E       assert 0.8 < 0
sentiment.py:17: AssertionError
======================================= short test summary info ========================================
FAILED sentiment.py::test_extract_sentiment - assert 0.8 < 0
========================================== 1 failed in 0.84s ===========================================

The test failed because the sentiment returned by the function is 0.8, which is not less than 0. Knowing why a test fails points us to what needs fixing, whether that is the function or the test itself.

Multiple Tests for the Same Function

With pytest, we can also create multiple tests for the same function.

#sentiment.py

def test_extract_sentiment_positive():

    text = "I think today will be a great day"

    sentiment = extract_sentiment(text)

    assert sentiment > 0

def test_extract_sentiment_negative():

    text = "I do not think this will turn out well"

    sentiment = extract_sentiment(text)

    assert sentiment < 0

$ pytest sentiment.py
========================================= test session starts ==========================================
sentiment.py .F                                                                                   [100%]
=============================================== FAILURES ===============================================
___________________________________ test_extract_sentiment_negative ____________________________________
def test_extract_sentiment_negative():
    
        text = "I do not think this will turn out well"
    
        sentiment = extract_sentiment(text)
    
>       assert sentiment < 0
E       assert 0.0 < 0
sentiment.py:25: AssertionError
======================================= short test summary info ========================================
FAILED sentiment.py::test_extract_sentiment_negative - assert 0.0 < 0
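===================================== 1 failed, 1 passed in 0.80s ======================================

The first test passes, but the second fails: the function returns 0.0 for this sentence, which is not less than 0. This is exactly the kind of edge case that tests help uncover.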
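
The idea of one test per expected behavior also covers edge cases such as neutral text and the documented [-1, 1] range. The sketch below is only an illustration: to stay runnable without TextBlob, it uses a hypothetical stand-in called toy_sentiment with made-up word lists, which is not how TextBlob computes polarity.

```python
import pytest

# Made-up word lists, purely for illustration (an assumption, not TextBlob's lexicon).
POSITIVE_WORDS = {"great", "happy", "good"}
NEGATIVE_WORDS = {"bad", "sad", "terrible"}


def toy_sentiment(text: str) -> float:
    """Hypothetical stand-in for extract_sentiment; returns a score in [-1, 1]."""
    words = text.lower().split()
    if not words:
        return 0.0
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    return (positive - negative) / len(words)


def test_positive_text():
    assert toy_sentiment("I think today will be a great day") > 0


def test_negative_text():
    assert toy_sentiment("This is a terrible and sad day") < 0


def test_text_without_sentiment_words_is_neutral():
    # None of these words appear in either list, so the score is exactly neutral.
    score = toy_sentiment("I do not think this will turn out well")
    assert score == pytest.approx(0.0)


def test_score_stays_in_range():
    # Extreme and empty inputs should still respect the documented range.
    for text in ["great great great", "bad bad", ""]:
        assert -1 <= toy_sentiment(text) <= 1
```

Writing the neutral case as its own test makes the boundary behavior explicit instead of leaving it to be discovered through a failing "negative" test.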

Parametrization: Combining Tests

Since the two test functions above test the same function, we can combine them into one test function with parametrization.

Parametrize with a List of Samples

pytest.mark.parametrize() allows us to execute a test with different examples by providing a list of examples in the argument.

# sentiment.py

from textblob import TextBlob
import pytest

def extract_sentiment(text: str):
    '''Extract sentiment using textblob.
    Polarity is within range [-1, 1]'''

    text = TextBlob(text)

    return text.sentiment.polarity

testdata = ["I think today will be a great day", "I do not think this will turn out well"]

@pytest.mark.parametrize('sample', testdata)
def test_extract_sentiment(sample):

    sentiment = extract_sentiment(sample)

    assert sentiment > 0

========================== test session starts ===========================
platform linux -- Python 3.8.3, pytest-5.4.2, py-1.8.1, pluggy-0.13.1
collected 2 items
sentiment.py .F                                                    [100%]
================================ FAILURES ================================
_____ test_extract_sentiment[I do not think this will turn out well] _____
sample = 'I do not think this will turn out well'
@pytest.mark.parametrize('sample', testdata)
    def test_extract_sentiment(sample):
    
        sentiment = extract_sentiment(sample)
    
>       assert sentiment > 0
E       assert 0.0 > 0
sentiment.py:19: AssertionError
======================== short test summary info =========================
FAILED sentiment.py::test_extract_sentiment[I do not think this will turn out well]
====================== 1 failed, 1 passed in 0.80s ===================

Parametrize with a List of Examples and Expected Outputs

What if we expect different examples to have different outputs?

For example, we might want to check if the function text_contain_word:

Returns True if word="duck" and text="There is a duck in this text"
Returns False if word="duck" and text="There is nothing here"

def text_contain_word(word: str, text: str):
    '''Find whether the text contains a particular word'''
    
    return word in text

To create a test for multiple examples with different expected outputs, we can use parametrize('sample, expected_output', testdata) with testdata = [(<sample1>, <output1>), (<sample2>, <output2>)].

# process.py
import pytest
def text_contain_word(word: str, text: str):
    '''Find whether the text contains a particular word'''
    
    return word in text

testdata = [
    ('There is a duck in this text',True),
    ('There is nothing here', False)
    ]

@pytest.mark.parametrize('sample, expected_output', testdata)
def test_text_contain_word(sample, expected_output):

    word = 'duck'

    assert text_contain_word(word, sample) == expected_output

$ pytest process.py
========================================= test session starts ==========================================
platform linux -- Python 3.8.3, pytest-5.4.2, py-1.8.1, pluggy-0.13.1
plugins: hydra-core-1.0.0, Faker-4.1.1
collected 2 items
process.py ..                                                                                    [100%]
========================================== 2 passed in 0.04s ===========================================

Awesome! Both tests passed!
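
The same pattern extends to more than two arguments. The sketch below parametrizes the word as well as the text, and uses pytest.param with an id so each case gets a short label in the report. The argument name text, the id strings, and the third (case-sensitivity) example are illustrative additions, not part of the original process.py.

```python
import pytest


def text_contain_word(word: str, text: str) -> bool:
    """Find whether the text contains a particular word."""
    return word in text


# Each case supplies all three arguments; the id strings are illustrative labels.
testdata = [
    pytest.param("duck", "There is a duck in this text", True, id="word-present"),
    pytest.param("duck", "There is nothing here", False, id="word-absent"),
    # The `in` operator is case-sensitive, so a capitalized "Duck" is not matched.
    pytest.param("duck", "There is a Duck in this text", False, id="case-sensitive"),
]


@pytest.mark.parametrize("word, text, expected_output", testdata)
def test_text_contain_word(word, text, expected_output):
    assert text_contain_word(word, text) == expected_output
```

With ids set this way, pytest reports the cases as test_text_contain_word[word-present] and so on, rather than building the label from the full sample sentence.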

Test One Function at a Time

To run a single test function, use pytest file.py::function_name. Suppose process.py contains the following two tests:

testdata = ["I think today will be a great day","I do not think this will turn out well"]

@pytest.mark.parametrize('sample', testdata)
def test_extract_sentiment(sample):

    sentiment = extract_sentiment(sample)

    assert sentiment > 0


testdata = [
    ('There is a duck in this text',True),
    ('There is nothing here', False)
    ]

@pytest.mark.parametrize('sample, expected_output', testdata)
def test_text_contain_word(sample, expected_output):

    word = 'duck'

    assert text_contain_word(word, sample) == expected_output

For example, to run only test_text_contain_word, type:

pytest process.py::test_text_contain_word
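
The same selection can be made from Python, which may be handy inside a notebook. The sketch below passes a file.py::function_name node ID to pytest.main. The helper run_single_test and the temporary test module are hypothetical and exist only to keep the example self-contained.

```python
import tempfile
import textwrap
import uuid
from pathlib import Path

import pytest

# Hypothetical test module, written to a temporary file for this demo.
TEST_SOURCE = textwrap.dedent(
    '''
    def text_contain_word(word, text):
        return word in text

    def test_text_contain_word():
        assert text_contain_word("duck", "There is a duck in this text")

    def test_deliberately_failing():
        # Fails on purpose, to show it does not run when another test is selected.
        assert text_contain_word("duck", "There is nothing here")
    '''
)


def run_single_test(function_name: str) -> int:
    """Run one test function by node ID and return pytest's exit code."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # A unique file name avoids module-name clashes across repeated calls.
        test_file = Path(tmp_dir) / f"test_demo_{uuid.uuid4().hex}.py"
        test_file.write_text(TEST_SOURCE)
        node_id = f"{test_file}::{function_name}"  # <file>::<function>
        # "-p no:cacheprovider" keeps pytest from writing a .pytest_cache folder.
        return int(pytest.main(["-q", "-p", "no:cacheprovider", node_id]))
```

Calling run_single_test("test_text_contain_word") returns exit code 0 because only the selected test runs, while selecting test_deliberately_failing returns 1.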

Fixtures: Use the Same Data to Test Different Functions

We can also use the same data to test different functions with a pytest fixture.

In the code below, we use the pytest.fixture decorator to turn the sentence “Today I found a duck and I am happy” into a reusable fixture and use it in multiple tests.

@pytest.fixture
def example_data():
    return 'Today I found a duck and I am happy'


def test_extract_sentiment(example_data):

    sentiment = extract_sentiment(example_data)

    assert sentiment > 0

def test_text_contain_word(example_data):

    word = 'duck'

    assert text_contain_word(word, example_data)
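
One property worth knowing: by default a fixture has function scope, so pytest calls it again for every test that requests it. The sketch below illustrates this with a hypothetical fixture named example_sentences that returns a list. A test that mutates the list does not affect the data seen by other tests.

```python
import pytest


def text_contain_word(word: str, text: str) -> bool:
    """Find whether the text contains a particular word."""
    return word in text


@pytest.fixture
def example_sentences():
    # Rebuilt for every test because the default fixture scope is "function".
    return [
        "Today I found a duck and I am happy",
        "There is nothing here",
    ]


def test_count_sentences_with_duck(example_sentences):
    matches = [s for s in example_sentences if text_contain_word("duck", s)]
    assert len(matches) == 1


def test_mutation_stays_local(example_sentences):
    # Appending here changes only this test's copy of the data.
    example_sentences.append("Another duck appeared")
    assert len(example_sentences) == 3
```

Both tests pass in either order, since each one receives a freshly built list from the fixture.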

Structure your Projects

Last but not least, when our code grows bigger, we should organize the code by storing functions and their tests in two different folders. Conventionally, source code is kept in the “src” folder, while tests are stored in the “tests” folder.

To let pytest discover tests automatically, name your test files either “test_<name>.py” or “<name>_test.py”. Pytest will then find and run all files that match either pattern.

This is what the two files look like:

# src/process.py
from textblob import TextBlob

def extract_sentiment(text: str):
    '''Extract sentiment using textblob.
    Polarity is within range [-1, 1]'''

    text = TextBlob(text)

    return text.sentiment.polarity

# tests/test_process.py
from src.process import extract_sentiment
import pytest


def test_extract_sentiment():

    text = 'Today I found a duck and I am happy'

    sentiment = extract_sentiment(text)

    assert sentiment > 0

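To run all tests, type pytest tests in the root directory (if Python cannot find the src package, python -m pytest tests also adds the current directory to sys.path):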

========================== test session starts ===========================
platform linux -- Python 3.8.3, pytest-5.4.2, py-1.8.1, pluggy-0.13.1
collected 1 item
tests/test_process.py .                                            [100%]
=========================== 1 passed in 0.69s ============================

Conclusion

Congratulations! You have just learned about pytest. I hope this article gives you a good overview of why testing is important and how to incorporate testing in your data science projects with pytest. With testing, you are not only able to know whether your function works as expected but also have the confidence to transition to new tools or code structures.