Use case

For example, I am working on a small language training app. One of the data structures I use is what I call a “Cloze”, inspired by Anki. It has two strings, a to-be-solved version and a solved version, plus a boolean flag that records whether it has been solved. These Clozes appear all over the app. I could create a mock, and I will when I need to call its methods or check whether some of the code does. But most of the time, I just need data in the right shape to pass around. Faker is great for that.
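The real Cloze class is not shown in this post. As a minimal sketch, assuming it is a plain dataclass with the three fields the provider below uses (text, obfuscated, guessed), and assuming text holds the solved version, it could look something like this:

```python
from dataclasses import dataclass


@dataclass
class Cloze:
    """Hypothetical stand-in for models.cloze.Cloze (an assumption, not the app's code)."""

    text: str  # assumed to be the solved version
    obfuscated: str  # assumed to be the to-be-solved version
    guessed: bool = False  # whether the learner has solved it yet


# Hand-written instance, the kind of boilerplate Faker will generate for us
example = Cloze(text="the cat sat", obfuscated="the ___ sat")
```

Writing these by hand in every test gets tedious quickly, which is where a Faker provider helps.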

Creating a provider in Faker

At the time of writing, the Faker docs don’t quite explain how to create your own provider. Luckily, reverse engineering some of the existing ones is not too hard. It boils down to extending the BaseProvider class. Every public method then becomes a generator. For example:

from typing import Optional

from faker.providers import BaseProvider
from models.cloze import Cloze

class Provider(BaseProvider):

    def cloze(
        self,
        text: Optional[str] = None,
        obfuscated: Optional[str] = None,
        guessed: Optional[bool] = False,
    ):
        """Model a Cloze entity"""
        return Cloze(
            text=text or "".join(self.random_letters()),
            obfuscated=obfuscated or "".join(self.random_letters()),
            guessed=guessed,
        )

    def clozes(self, nb=3):
        """Model a list of nb Cloze entities"""
        return [self.cloze() for _ in range(nb)]

    def exercise(... etc

Note how I can use BaseProvider’s methods in my own provider, for example random_letters in "".join(self.random_letters()). Once I add the provider to Faker (more on that below), I can use it in my tests. In the example above, I can create a single Cloze, or a list of them, with


cloze = faker_instance.cloze()
# Cloze(text='zTbJuPgkCvtGJtms', obfuscated='LkqKBZBNLRYbAvXj', guessed=False)

cloze = faker_instance.cloze(text="TEXT")
# Cloze(text='TEXT', obfuscated='pdyQXUBhSyflBXvZ', guessed=False)

list_of_clozes = faker_instance.clozes(nb=2)
# [
#   Cloze(text='EQKHvVUIJFllEZUg', obfuscated='ymEaSvqcYPdsGeFw', guessed=False)
#   Cloze(text='IvSMtFGVXEGdODoC', obfuscated='pPTkcmlvCOizEqZe', guessed=False)
# ]

Accessing Faker’s standard providers’ methods in a custom provider

In the example above I could use self.random_letters from BaseProvider, as well as all the other BaseProvider fakes, but not the fakes from other providers. In fact, I used random_letters because I didn’t know how to access the one I really wanted: words, from the Lorem provider. It turns out it can be done, although it is a bit hacky.

from faker.providers.lorem.en_US import Provider as LoremProvider
from models.cloze import Cloze

class Provider(LoremProvider):
    def words_words(self, nb=3):
        return [
            twice
            for w in self.words(nb=nb)  # outer loop
            for twice in [w, w]         # inner loop
        ]

In the example above I import a specific locale of the provider (faker.providers.lorem.en_US), so all of its methods are available on self. Here I am using a nested list comprehension to repeat each list item twice. Using a fixed locale works well because my fakes rarely need localisation.

Adding a provider to Faker

Providers in Faker follow an odd structure. A provider is a class called Provider which inherits from Faker’s BaseProvider. The code lives in the package’s __init__.py, not in a provider.py file. That Provider class is all you need for a basic use case with no locales. If you need localised versions, they are always packages named after the locale. They live one level down from the main Provider, and each consists of a class also called Provider, which inherits from the Provider one level up.

src/language_learning/
│
 ...
└── faker_providers
    ├── __init__.py
    ├── en_US
    │   └── __init__.py
    ├── ... other locales ..
 ...
│

I import my providers in my conftest.py file and add them to the centralised Faker instance, which is passed to the tests as a fixture, with

# tests/conftest.py
import pytest
from faker import Faker

from app.fake_providers.language_training import Provider as LanguageTrainingProvider

fake = Faker()
A_RANDOM_SEED = 1369
Faker.seed(A_RANDOM_SEED)
fake.add_provider(LanguageTrainingProvider)

@pytest.fixture(name="fake")
def fixture_fake():
    """Pass a seeded Faker instance as a fixture"""
    return fake

Now I can use the provider in my tests with


def test_something(fake):
    # fake.words is from a standard Faker provider
    assert isinstance(fake.words(), list)

    # fake.clozes is the one I just created
    assert len(fake.clozes(nb=7)) == 7

A Faker provider for all locales

If I never use locales, a simple class like the one in the example above will Just Work™. But in the rare cases when I use different locales for my fakes, it won’t. Since I’ll never create a different version for each locale, I need a workaround. What I can do is create a single locale, en_US, which simply inherits from my main provider. Then Faker will fall back to it when it can’t find any of the other locales.

└── language_learning
    ├── __init__.py
    └── en_US
        └── __init__.py

The class is minimal

# src/fake_providers/language_learning/en_US/__init__.py
from .. import Provider as LanguageLearningProvider


class Provider(LanguageLearningProvider):
    """Fallback locale"""

Conclusion

I find using Faker keeps the code quite clean. Of course, I cannot call any of Cloze’s methods. For that I would need to create mocks, but that is a different story.