DAO vs Repository

DAO

DAO stands for Data Access Object. Its goal is to abstract and encapsulate all access to the data source and to expose that access through an interface. A DAO can usually create an instance of a data object (“to read data”) and persist one (“to save data”) to the data source. Use it when you want to separate a data resource’s client interface from its data access mechanisms; this gives a clean separation of concerns.

[Figure: BookDAO inheritance hierarchy, UML diagram]

The Data Access Layer (DAL) is the layer of a system that exists between the business logic layer and the persistence / storage layer. A DAL might be a single class, or it might be composed of multiple Data Access Objects (DAOs). It may have a facade over the top for the business layer to talk to, hiding the complexity of the data access logic. It might be a third-party object-relational mapping tool (ORM) such as Hibernate.

DAL is an architectural term; DAOs are a design detail.

In computer software, a data access object (DAO) is an object that provides an abstract interface to some type of database or other persistence mechanism. By mapping application calls to the persistence layer, a DAO provides specific data operations without exposing details of the database. This isolation supports the single responsibility principle. It separates what data access the application needs, in terms of domain-specific objects and data types (the public interface of the DAO), from how these needs can be satisfied with a specific DBMS, database schema, etc. (the implementation of the DAO).
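
The split between the public interface and the implementation can be sketched roughly as follows. This is a minimal illustration, not a reference design: the names Book, BookDao and InMemoryBookDao are hypothetical, and a map stands in for the DBMS so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Domain-specific data type exposed by the DAO's public interface (hypothetical).
record Book(String isbn, String title) {}

// The "what": the data operations the application needs, with no storage details.
interface BookDao {
    Optional<Book> findByIsbn(String isbn);
    List<Book> findAll();
    void save(Book book);
    boolean delete(String isbn);
}

// The "how": one possible implementation. A JDBC-backed class could implement
// the same interface without any change to the calling code.
final class InMemoryBookDao implements BookDao {
    private final Map<String, Book> rows = new LinkedHashMap<>();

    @Override
    public Optional<Book> findByIsbn(String isbn) {
        return Optional.ofNullable(rows.get(isbn));
    }

    @Override
    public List<Book> findAll() {
        return new ArrayList<>(rows.values()); // copy, so callers cannot mutate storage
    }

    @Override
    public void save(Book book) {
        rows.put(book.isbn(), book); // insert, or overwrite the row with the same key
    }

    @Override
    public boolean delete(String isbn) {
        return rows.remove(isbn) != null; // true only if a row was actually removed
    }
}
```

Calling code would hold a BookDao reference only, so swapping the in-memory class for a database-backed one is a wiring change rather than a change to the application logic.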

Although this design pattern is equally applicable to most programming languages, most types of software with persistence needs, and most types of databases, it is traditionally associated with Java EE applications and with relational databases (accessed via the JDBC API) because of its origin in Sun Microsystems’ best practice guidelines “Core J2EE Patterns” for that platform.

But a DAL is more than a group of DAOs. It contains EVERYTHING related to persistence: the DAOs, the entities which model how the data is stored (if you’re using a (micro) ORM), and other internal services used by the DAOs. However, the app accesses the DAL only via DAOs, which can be considered the ‘entry points’ of the DAL.

Note that the DAO itself is just a concept, and it’s used as an abstraction: the application doesn’t know about the concrete object, only about an interface providing the desired functionality. The DAO has intimate knowledge of the storage system, but it exposes only behaviour which makes sense for the application, i.e. a DAO should never expose or require information that is tied to a specific storage system. A Repository is also a concept, but it is implemented as a DAO, at least from the application’s point of view. In fact, every object used to deal with the storage is a DAO, and the Repository is a specialized DAO: it deals only with Business Objects and acts as a facade for other lower-level DAOs (such as an ORM).

The ORM tries to present a relational database in an object-oriented way. It abstracts the actual database access (that’s why it’s a DAO) but still deals with database-specific concepts, because the entities it defines model the storage structure, i.e. the way data is saved. For many (CRUD) applications that can be enough, and the application can use the objects returned by the ORM without caring that they model persistence. For applications with complex behaviour, usually business applications, the Repository is a better choice, as most of the time a business object differs from the way it’s persisted.
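
To make the difference between a persistence entity and a business object concrete, here is a rough sketch of a Repository acting as a facade over a lower-level DAO. Everything in it is assumed for illustration: the AccountRow schema, the "O"/"C" status codes, and all class names are hypothetical, and an in-memory map plays the role the ORM would normally play.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Persistence entity: mirrors the stored columns (hypothetical schema).
record AccountRow(long id, long balanceCents, String statusCode) {}

// Lower-level DAO, standing in for an ORM session or table gateway.
interface AccountRowDao {
    Optional<AccountRow> selectById(long id);
    void upsert(AccountRow row);
}

// Business object: carries behaviour and knows nothing about status codes.
final class Account {
    private final long id;
    private long balanceCents;
    private final boolean closed;

    Account(long id, long balanceCents, boolean closed) {
        this.id = id;
        this.balanceCents = balanceCents;
        this.closed = closed;
    }

    long id() { return id; }
    long balanceCents() { return balanceCents; }
    boolean isClosed() { return closed; }

    void withdraw(long cents) {
        if (closed) throw new IllegalStateException("account is closed");
        if (cents <= 0 || cents > balanceCents) {
            throw new IllegalArgumentException("invalid amount: " + cents);
        }
        balanceCents -= cents;
    }
}

// Repository: the only persistence type the business code depends on.
interface AccountRepository {
    Optional<Account> get(long id);
    void save(Account account);
}

// Facade over the lower-level DAO; all row <-> business object mapping lives here.
final class DaoBackedAccountRepository implements AccountRepository {
    private static final String OPEN = "O";   // hypothetical storage codes
    private static final String CLOSED = "C";
    private final AccountRowDao dao;

    DaoBackedAccountRepository(AccountRowDao dao) { this.dao = dao; }

    @Override
    public Optional<Account> get(long id) {
        return dao.selectById(id)
                .map(r -> new Account(r.id(), r.balanceCents(), CLOSED.equals(r.statusCode())));
    }

    @Override
    public void save(Account a) {
        dao.upsert(new AccountRow(a.id(), a.balanceCents(), a.isClosed() ? CLOSED : OPEN));
    }
}

// In-memory stand-in so the sketch runs without a database.
final class InMemoryAccountRowDao implements AccountRowDao {
    private final Map<Long, AccountRow> table = new HashMap<>();

    @Override
    public Optional<AccountRow> selectById(long id) {
        return Optional.ofNullable(table.get(id));
    }

    @Override
    public void upsert(AccountRow row) {
        table.put(row.id(), row);
    }
}
```

In a plain CRUD application the AccountRow objects might be used directly; the extra mapping step only pays off once rules like withdraw belong on the object the business code works with.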

Repository

A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes.
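
The collection-like behaviour and the declarative query specifications described above might look something like this. It is only a sketch under stated assumptions: Specification, BookSpecs and the repository classes are hypothetical names, and the specification is evaluated in memory, whereas a real repository would more likely translate it into a query for the underlying store.

```java
import java.util.ArrayList;
import java.util.List;

// Domain object (hypothetical).
record Book(String title, String author, int year) {}

// A query specification: a declarative description of which objects are wanted.
interface Specification<T> {
    boolean isSatisfiedBy(T candidate);

    // Combine two specifications; both must hold.
    default Specification<T> and(Specification<T> other) {
        return candidate -> isSatisfiedBy(candidate) && other.isSatisfiedBy(candidate);
    }
}

// Small vocabulary of reusable specifications for books.
final class BookSpecs {
    private BookSpecs() {}

    static Specification<Book> writtenBy(String author) {
        return book -> book.author().equals(author);
    }

    static Specification<Book> publishedAfter(int year) {
        return book -> book.year() > year;
    }
}

// The repository reads like a collection: add, remove, and select by specification.
interface BookRepository {
    void add(Book book);
    boolean remove(Book book);
    List<Book> matching(Specification<Book> spec);
}

// In-memory stand-in for the mapping code a real repository would encapsulate.
final class InMemoryBookRepository implements BookRepository {
    private final List<Book> books = new ArrayList<>();

    @Override
    public void add(Book book) {
        books.add(book);
    }

    @Override
    public boolean remove(Book book) {
        return books.remove(book);
    }

    @Override
    public List<Book> matching(Specification<Book> spec) {
        List<Book> result = new ArrayList<>();
        for (Book book : books) {
            if (spec.isSatisfiedBy(book)) {
                result.add(book);
            }
        }
        return result;
    }
}
```

A client would then write something like repo.matching(BookSpecs.writtenBy("Fowler").and(BookSpecs.publishedAfter(2000))) and never see how or where the books are stored.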

Purpose: persistence ignorance. Use a repository to separate the logic that retrieves the data and maps it to the entity model from the business logic that acts on the model. The business logic should be agnostic to the type of data that comprises the data source layer. For example, the data source layer can be a database, a SharePoint list, or a Web service. Repositories remove dependencies that the calling clients have on specific technologies. A repository centralizes the access logic for a service and provides a substitution point for unit tests. Services are often expensive to invoke and benefit from caching strategies that are implemented within the repository.
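
Two of the points above, caching inside the repository and the substitution point for unit tests, can be illustrated with a small sketch. The names Product, ProductService, CachingProductRepository and Catalog are all hypothetical, the service is simply assumed to be expensive to call, and the cache is a plain map with no eviction or thread safety.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Domain object (hypothetical).
record Product(String sku, String name) {}

// What the business logic depends on: no hint of where products come from.
interface ProductRepository {
    Optional<Product> findBySku(String sku);
}

// Hypothetical remote source (web service, SharePoint list, ...);
// each call is assumed to be expensive. Returns null for an unknown SKU.
interface ProductService {
    Product fetch(String sku);
}

// Repository that centralizes access to the service and caches every answer,
// including "not found", so each SKU triggers at most one service call.
final class CachingProductRepository implements ProductRepository {
    private final ProductService service;
    private final Map<String, Optional<Product>> cache = new HashMap<>();

    CachingProductRepository(ProductService service) {
        this.service = service;
    }

    @Override
    public Optional<Product> findBySku(String sku) {
        return cache.computeIfAbsent(sku, key -> Optional.ofNullable(service.fetch(key)));
    }
}

// Business logic: agnostic to the data source behind the repository.
final class Catalog {
    private final ProductRepository products;

    Catalog(ProductRepository products) {
        this.products = products;
    }

    String displayName(String sku) {
        return products.findBySku(sku).map(Product::name).orElse("(unknown product)");
    }
}
```

In a unit test, Catalog can be given a stub instead of the real repository, for example new Catalog(sku -> Optional.of(new Product(sku, "Stub"))), so the test never touches the service at all.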

As the repository is an abstraction, it should always return whatever the layer above wants to work with, which in most cases is domain entities, i.e. the objects which encapsulate the logic in your business code.

[Figure: repository interactions with a database and web services (MSDN schema)]