PDO for Elegant PHP Database Access

📁 architecture

Speaking very broadly, every computer application is basically a data manipulation tool. The application accepts data from the outside world, performs some transformation on it and then pushes some different data right back out.

The data inserted into an application can come from a peripheral device (for example, a keyboard or mouse), another system (via an API) or some internally stored state. The output can take the form of text on a screen, a HTML document rendered in a web browser or full motion video. The nature of the data doesn’t matter – the principle is the same for every application.

Most software (including any kind of web-based application) requires that we make some of that data long lasting: we want to be able to access it at a later date. It might be input from a user, content for pages on a web site or system configuration values. Relational databases are useful for this purpose, as they enable us to store data in a flexible format that is easily queried, filtered and transformed.

But there are some significant problems to be solved when integrating our web applications with relational databases. Failing to adequately address them can result in decreased maintainability, scalability and/or security. Luckily, there are a number of design patterns, pre-built tools and best practices that can help us to avoid these potential issues.

Data layer architecture The way that we design our system to access the database (the architecture of our data access layer) plays an important part in determining how easy it is to maintain and test our application. We want to design it in such a way that the system is decoupled from the underlying implementation of the datasource.

Decoupling our application from the datasource means that the components used to implement the business logic and flow of control are not tied to the specific datasource. By utilising a pattern that adheres to this paradigm, we can easily switch out the datasource for a different kind of database or a different kind of storage medium without requiring code changes throughout the rest of our application.

Typically we achieve this by adding layers of abstraction to the data retrieval and storage process. In a web application that follows an MVC-based architecture, all of the code responsible for managing domain entities will be stored in Model classes.

A naive model implementation may interact with a database in a similar way to the following:

While this is a decent first attempt, there are a number of problems with this approach. The most obvious problem here is that the model class is tightly coupled to the implementation of the database. If we wanted to change database vendor from MySQL to another (such as Postgres or a NoSQL solution), we will have to rewrite some (or all) of our models. Another problem is that the cohesion of the model class is significantly degraded using this approach: our models should be responsible for implementing the business logic of our application, not building SQL queries.

Data Access Objects

We can improve on this pattern slightly by introducing a Data Access Object (DAO). Utilising this pattern, our user model is no longer responsible for interacting with the database. Instead, our model is responsible for managing domain entities and coordinating the interactions between these entities (ie: the business logic), and the database interactions are delegated to another class.

This is a better approach, as it means any change to database settings will be confined to the DataAccessObject class. But we still encounter a lot of the same problems: our model is generating SQL.

We can alleviate this problem by inserting another layer of abstraction. This time, we can use the repository pattern. The repository class is responsible for using the DAO to retrieve and persist domain entities to the store, what ever that store may be (database, API, flat file).

As a side note, the repository pattern typically implements some kind of in-memory caching mechanism so that we don’t need to keep going back to the database for subsequent queries for the same entity. We won’t spend too much time on it here as it isn’t relevant to the discussion at hand, but it may look something like the following:

PHP Data Objects (PDO)

So far we’ve discussed the Data Access Object (DAO) layer of our application as if we had to implement the class ourselves. Conveniently, this functionality is provided for us by the PHP Data Objects (PDO) module which is built in to PHP since version 5.1.

The PDO object enables us to interact with most of the popular relational database providers in a somewhat systematic way. Note that the PDO object only abstracts away the connection process – we still need to write vendor-specific SQL. An example of connecting to a MySQL database using PDO is demonstrated below:

There are some important things to note here. The first is that we use something called a data source name (DSN) when instantiating the PDO object. The DSN is a specially formatted string that tells the PDO object how to connect to the database. It includes the database type, the host address as well as the database name. In the example above, it appears as mysql:host=localhost;dbname=someDb. From the PHP documentation:

In general, a DSN consists of the PDO driver name, followed by a colon, followed by the PDO driver-specific connection syntax.

More information about vendor-specific DSNs can be found here.

You’ll also notice that the usage of the query(…) method to escape the content of the variable $name. This will prevent SQL injection, which may be a potential attack vector if the variable is populated by an untrusted source (such as a form field).

Finally, you’re probably wondering what kind of data the PDO object will return when the call to fetch() is made. This is actually configurable by passing different PDO constant values in, but by default the PDO::FETCH_BOTH value is used. As specified in the official documentation, this provides us with a single row from the result set structured as an array indexed both by column name and 0-indexed column number.

The following would be valid:

If there are multiple rows in the result set, we can iterate through them by making subsequent calls to fetch(). When the result set is empty, false will be returned. The following pattern is useful to use when looping through multiple rows:

Handling errors

  • PDO::ERRMODE_SILENT: database-related errors will be ignored.
  • PDO::ERRMODE_WARNING: database-related errors will cause a warning to be emitted, but execution will continue.
  • PDO::ERRMODE_EXCEPTION: database-related errors will cause a PDOException to be thrown.

Using exceptions for PDO errors and wrapping your database code in try {} catch {} blocks is the best approach for a number of reasons. Firstly, from the perspective of object oriented programming, it’s the “best practice” approach. Secondly, it provides the greatest amount of flexibility. Thirdly, a PDOException will get raised if there’s a connectivity problem with the database, so it makes sense to wrap your database access code in try {} catch {} blocks anyway.

Using prepared statements with PDO

PDO provides us with the concept of prepared statements. Using these, we can create precompiled templates for queries which can then be reused.

There are two major benefits to using prepared statements. The first is that it enables us to perform parameter binding, which defers variable substitution (and string escaping) in queries to the database engine. This makes it easier to write queries and protects us against SQL injection attacks. The second benefit is that the database engine can optimise prepared queries, making them run more quickly on subsequent calls.

We create prepared statements using the following syntax:

Our UserModel, version 2.0

Now that we’ve seen different ways of improving the architecture of our database access layer and have been introduced to PHP Data Objects, we can look at improving the UserModel presented earlier.

Conclusion

Using PDO for database access is one of the first steps you should be taking to move your PHP applications towards better security and maintainability. PDO provides us with an abstraction layer over the database and exposes tools we can use to optimise and sanitise our queries.

If you’d like to learn more about PDO, you can access the official documentation for more examples of its usage and a comprehensive description of its API.