
Repository Pattern, Revisited


Motivation

I first encountered the repository pattern in a Go backend codebase, where files and packages named "repo" were used to supply information from data sources. This usage puzzled me because until then I had only known "repository" as a term related to Git and GitHub. After some research online, I realized that the repository pattern is a popular abstraction layer between the business logic and the data sources.

A succinct description of the repository pattern by Edward Hieatt and Rob Mee (P of EAA):

Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.

UML illustration, source: martinfowler.com

This pattern is often discussed when talking about Domain Driven Design, which is

an approach to software development that centers the development on programming a domain model that has a rich understanding of the processes and rules of a domain. - Martin Fowler DomainDrivenDesign

In this article, I hope to consolidate some of the excellent resources that discuss the repository pattern in depth, and to provide my own examples and reflections on this pattern.

Uncovering the Repository Pattern

The idea of the repository pattern is to have a layer between a business domain and a data store. This can be done via an interface that defines data store handling logic, which the domain service logic can depend on.

Let's discuss a simplified example in Go, in the context of a URL-shortener application.

1. Create a repository interface in the service layer

The example application provides a URL-shortening service, which essentially takes in any URL and returns a shortened version that redirects visitors to the original address.

Let's assume that the URL-shortener service needs

  • a way to create a mapping of the original URL and the shortened URL
  • a way to query for the original URL for redirection
  • anything else (for simplicity we will only focus on the two above, CR of CRUD)

We mentioned that a repository interface needs to be created, but where?

The short answer is that we can define it alongside the service layer. This is because the service knows what needs to be done with the data store (it does not need to know how). The repository interface, therefore, specifies the operations required by the service without the underlying details. One possible arrangement in Go is to have the service domain struct hold a reference to the repository interface, which is passed in through the constructor.

For example, we can have the following in a service/urlshortener.go file:

package service

import (
	"context"
	"fmt"
)

// URLShortenerRepository is the interface to be implemented by the actual data store.
type URLShortenerRepository interface {
	Create(ctx context.Context, url string) error
	Find(ctx context.Context, url string) (string, error)
}

// URLShortener is the domain struct; it depends only on the interface.
type URLShortener struct {
	repo URLShortenerRepository
}

// NewURLShortener receives the repository through the constructor.
func NewURLShortener(repo URLShortenerRepository) *URLShortener {
	return &URLShortener{repo: repo}
}

// Create and Find illustrate the wrapping methods on the domain struct.
func (u *URLShortener) Create(ctx context.Context, url string) error {
	if err := u.repo.Create(ctx, url); err != nil {
		fmt.Println(err)
		return err
	}
	return nil
}

func (u *URLShortener) Find(ctx context.Context, url string) (string, error) {
	result, err := u.repo.Find(ctx, url)
	if err != nil {
		fmt.Println(err)
	}
	return result, err
}

2. Implement the repository interface in the data store layer

So far we have the service layer interacting with the repository interface, and we can now focus on implementing the actual handling logic in the data store layer. This typically involves a persistent database, either relational or NoSQL. We will use MongoDB, a NoSQL database, in this example.

Now, let's implement the handling logic in a mongoDB/mongo.go file:

package mongoDB

import (
	"context"
	"errors"
)

// MongoDBRepository satisfies service.URLShortenerRepository.
// Note that in Go, interfaces are implemented implicitly.
type MongoDBRepository struct {
	connectionString string
}

func NewMongoDBRepository(connectionString string) *MongoDBRepository {
	return &MongoDBRepository{connectionString: connectionString}
}

func (m *MongoDBRepository) Create(ctx context.Context, url string) error {
	// Insert a URL pair into the datastore via some MongoDB-specific query
	return errors.New("not implemented") // placeholder so the sketch compiles
}

func (m *MongoDBRepository) Find(ctx context.Context, url string) (string, error) {
	// Find from the datastore via some MongoDB-specific query
	return "", errors.New("not implemented") // placeholder so the sketch compiles
}

3. Connecting the repository interface with the implementation

The last step in the process is to utilize what we have implemented so far.

We can imagine a central place where the service is initialized along with the data store, perhaps in a main.go file:

// Inside func main(), with the context, mongoDB and service packages imported
repo := mongoDB.NewMongoDBRepository("db connection url here")
urlShortenerService := service.NewURLShortener(repo)

// Example usage
if err := urlShortenerService.Create(context.Background(), "some long url here"); err != nil {
	panic(err)
}

Diagram: summary of the repository pattern

Analyzing the Repository Pattern

In the above section, we discussed a possible repository pattern implementation. In this part, we will highlight some of the benefits achieved.

Abstraction

The repository interface separates the contract from the implementation. This reduces complexity at the service layer, as it only cares about the behaviors the underlying data store supports and not how they are actually implemented. It also reduces code duplication, as all other services can share a consistent way to access data via the repository interface.

In his article on why you should use the repository pattern, Joe Petrakovich uses the analogy of a universal adapter to describe how the repository pattern sits between services and the data, so that changes to how data is accessed or stored are less likely to affect the business logic code.

Encapsulation

Closely related to abstraction, encapsulation here means that the repository interface controls access in a standardized way. Regardless of the underlying data store, the interface exposes only the essential and expected ways to interact with it. As a result, consistent error handling or logging can be performed at this layer, and changes to the underlying data store are unlikely to affect the service layer code.

Separation of concerns

The separation created by the repository layer reduces coupling, as the service layer code does not depend on the data store directly. Likewise, the data store can change independently of the business requirements.

Facilitate unit testing via dependency injection

A crucial benefit of the repository pattern is that it allows for easy mocking and quicker unit tests. Because main.go in our example passes the repository into the service constructor, a mock repository can be implemented and passed in instead. During testing, a mock repository removes the need to establish a database connection or query a database, hence isolating the service layer logic.

Diagram: testing with the repository pattern

For example:

// Note that in Go, interfaces are implemented implicitly
type MockRepository struct{}

func NewMockRepository() *MockRepository {
	return &MockRepository{}
}

func (m *MockRepository) Create(ctx context.Context, url string) error {
	// Simulate insertion
	return nil
}

func (m *MockRepository) Find(ctx context.Context, url string) (string, error) {
	// Simulate read
	return "https://short.com/url", nil
}

// In a test, pass the mock to the constructor instead of the MongoDB repository
repo := NewMockRepository()
urlShortenerService := service.NewURLShortener(repo)

// Example usage
if err := urlShortenerService.Create(context.Background(), "some long url here"); err != nil {
	panic(err)
}

To understand dependency injection better, read more here

Drawbacks and Considerations

As with all patterns, there are drawbacks, and there are even vocal critics who argue against using the repository pattern at all. Here are some of my observations and thoughts on the matter.

Is it cost-effective?

Implementing a software design pattern typically adds boilerplate code to "set it up". The repository pattern is no exception: it can mean adding more structural code for the sake of "writing more code now so as to not repeat ourselves down the line". If, however, the project is small-scale and unlikely to see further development, as with a demo or playground application, the investment in the repository pattern may never pay off.

Is another layer of indirection really necessary?

A fairly famous quote in computer science, often attributed to David Wheeler, states:

Any problem in computer science can be solved with another layer of indirection. But that usually will create another problem.

I am very cautious whenever I need to build a new layer of abstraction because, more often than not, abstractions turn out to be "leaky" or "hasty". Such layers of abstraction don't deliver on their promise of simplicity and, in extreme cases, make the code harder to understand for ourselves and even more so for future maintainers.

Better or worse testing?

Together with dependency injection, the repository pattern can help speed up unit testing by abstracting away the database. However, it does not remove the need for integration tests, because the responses from a mock repository may not match what the real data store would return. To gain confidence in the system, integration tests are still necessary.

Conclusion

Design patterns such as the repository pattern are useful to understand because even if we choose not to use them, we are likely to come across them in existing codebases. As with all design patterns, the key is to plan well and find the right context before rushing headlong into implementation. That's all, and I hope you enjoyed reading this article!

References