Relational Databases and Information Systems

Updated on March 7, 2018
charles griffiths profile image

Charles is a software engineer and college professor interested in technology, medicine, economics, and nutrition.


What are databases good for? Most people will say they store data. Not true.

Databases are terrible at storing data. If you just want to store data, use files. Compress the files and make redundant backups. A database will easily take up 10x the amount of storage.

Databases are good for answering questions. In order to do this, they use extra RAM and drive space to improve speed. Between speed and space, speed always wins until you run out of space.

In this tutorial, I'll cover basic database concepts so you can understand the databases you use and hopefully communicate problems and feature requests more effectively. Let's get started.

Five Types of Data

The smallest part of a database is called a field. Data in a field is one of five basic types.

  • String types are implemented as CHAR, VARCHAR, VARCHAR2. The difference between these types is not important unless you're a DBA. Strings are all used for storing anything you can type on a keyboard. Depending on the software you're using, a string can be up to about 1000 characters. Strings are the most common and important type of data, and you could use them for everything but sometimes there's a better suited, more specialized type available.
  • Integers, also called INT are used for counting, expressing which or how many. They are also often used as indexes, which are explained later.
  • Floating point numbers, also called NUMBER are used for measurements, expressing how much, or for general math.
  • Dates, represented with the DATE or DATETIME types are used for calendar or schedule information. Birthdays, weddings, class schedules, work schedules, appointments, and reservations all use DATEs to indicate a point in time when something happened or is supposed to happen.
  • Binary Large OBjects, also known as BLOBs in database lingo are used for storing anything that's too large for a string, or is in a special format not compatible with text. BLOBs can contain compressed data, pictures, sounds, or movies and can store hundreds of MB.

Three Levels of Organization

Relational databases are made up of tables, which are made up of records, which are made up of fields.

  • Fields are what we just covered with the five types of data. Each field has a name and a type.
  • Records exist in tables. A table may have 0 or more records, but no two records can be exactly the same.
  • Tables each have a name and a list of fields. Each record in the table will be a set of these fields.


By default, any field can be NULL which means the data is missing. If the data in the field is required, you can mark it as NOT NULL. If a field is NOT NULL, then you may also mark it as UNIQUE which means that every record in that table must have a different value for that field.

UNIQUE is highly prized in large databases because it means you can guarantee that you'll get back either one or zero answers, never two or more. In a table with millions of records, lacking this guarantee can easily result in pages and pages of data when you're searching for just one field.


Structured Query Language (SQL) is the means for asking questions to your database. The most important SQL is a SELECT statement which as the name implies allows you to choose a subset of your data and include it in a report.

A SELECT statement has three parts. The first part is the keyword SELECT followed by a list of fields. These fields are what you want your answer to look like. The second part is the keyword FROM followed by a list of tables. These tables tell SQL where to look for data, among the potentially hundreds of tables in your database. The third and optional part is the keyword WHERE which you can think of as meaning "if". If the WHERE clause is missing, then all possible answers are returned. If present, the WHERE clause limits the answers to those that evaluate as true.

Add to this a few wrinkles for convenience. If the list of fields after SELECT is replaced by * then all applicable fields will be part of your answer. If the list of fields is replaced by COUNT(*) then your answer will be an integer count of how many answers passed the logical test in the WHERE clause.


This example will tell you how many records are in the LIBRARY_BOOKS table. Once you understand SELECT, UPDATE works just the same but chooses which records to modify.


A join is simply when you have a where clause that tests for equality of two fields. Usually these are key fields in different tables, but they may also be the same field in the same table. The graphic above shows different kinds of joins based on whether the test found a match or not. This is a heavy subject and you'll study it more if you write a lot of complex SELECT statements.

Keys and Indexes

A key for a table is something you search for in order to find the records you want. A key is often one field, but may be a set of several fields together.

To make this searching fast, add an index to the table. Indexes make searching faster, but adding or changing the table's data is slower because you have to update the table and also update the index. Usually this comes out in favor of having the index, but for efficiency an index should have as few a number and as small (in bytes) fields as possible. INT indexes are the best.

A primary key is a special kind of key where each key value has at most one record in the table. This is a good thing for efficiency because once the database finds a record that matches a primary key, it can stop searching.


Database normalization is very dry, and you don't need to worry about it that much unless you're a DBA. Even most DBAs for most databases don't need to understand much more normalization than the following.

There are different levels of normalization, but the gist of it is this. You only want to store each needed bit of data in your database once, not zero times, and not twice.

Extending the LIBRARY_BOOKS table to illustrate, suppose every book in the library has a binding material that's either leather or paper, and a binding color that is always black for leather and brown for paper. If there are lots of books of each type, material and color are redundant fields and only one of them should be stored for each book. Putting the associations of black leather and brown paper into a separate table prevents you from storing this data more than once.

What happens if you store it more than once? Obviously you waste space, but that's not the most important thing. You might also introduce errors or omissions into your data. What if for some reason the library lost all leather bound books? Then the association of black leather would be nowhere in the database.

This is a trivial example for simplicity, but imagine a database stores a person's name or phone number multiple times. One location might be updated and another left with wrong information.

To solve this problem generally, normalization asks the question: Are there any fields in this table that are completely determined by another set of non-primary-key fields? (Is the book binding color completely determined by the binding material?) If there are, then you might want to normalize those fields out into a separate table.


You made it to the end. Are you a DBA now?

Yes you are! But only if you were a DBA when you started reading. Setting up a small relational database in MySQL or SQLite is very easy, and can be done without even the minimal grounding from this introductory article. Knowing what you do now might just make you less dangerous.

The path from here may lead you to a formal database course, or textbook, or video lecture series, or at the very least a better appreciation of the data in the databases you use and the work done by your friendly neighborhood DBA. Safe travels!

This article is accurate and true to the best of the author’s knowledge. Content is for informational or entertainment purposes only and does not substitute for personal counsel or professional advice in business, financial, legal, or technical matters.


    0 of 8192 characters used
    Post Comment

    No comments yet.


    This website uses cookies

    As a user in the EEA, your approval is needed on a few things. To provide a better website experience, uses cookies (and other similar technologies) and may collect, process, and share personal data. Please choose which areas of our service you consent to our doing so.

    For more information on managing or withdrawing consents and how we handle data, visit our Privacy Policy at:

    Show Details
    HubPages Device IDThis is used to identify particular browsers or devices when the access the service, and is used for security reasons.
    LoginThis is necessary to sign in to the HubPages Service.
    Google RecaptchaThis is used to prevent bots and spam. (Privacy Policy)
    AkismetThis is used to detect comment spam. (Privacy Policy)
    HubPages Google AnalyticsThis is used to provide data on traffic to our website, all personally identifyable data is anonymized. (Privacy Policy)
    HubPages Traffic PixelThis is used to collect data on traffic to articles and other pages on our site. Unless you are signed in to a HubPages account, all personally identifiable information is anonymized.
    Amazon Web ServicesThis is a cloud services platform that we used to host our service. (Privacy Policy)
    CloudflareThis is a cloud CDN service that we use to efficiently deliver files required for our service to operate such as javascript, cascading style sheets, images, and videos. (Privacy Policy)
    Google Hosted LibrariesJavascript software libraries such as jQuery are loaded at endpoints on the or domains, for performance and efficiency reasons. (Privacy Policy)
    Google Custom SearchThis is feature allows you to search the site. (Privacy Policy)
    Google MapsSome articles have Google Maps embedded in them. (Privacy Policy)
    Google ChartsThis is used to display charts and graphs on articles and the author center. (Privacy Policy)
    Google AdSense Host APIThis service allows you to sign up for or associate a Google AdSense account with HubPages, so that you can earn money from ads on your articles. No data is shared unless you engage with this feature. (Privacy Policy)
    Google YouTubeSome articles have YouTube videos embedded in them. (Privacy Policy)
    VimeoSome articles have Vimeo videos embedded in them. (Privacy Policy)
    PaypalThis is used for a registered author who enrolls in the HubPages Earnings program and requests to be paid via PayPal. No data is shared with Paypal unless you engage with this feature. (Privacy Policy)
    Facebook LoginYou can use this to streamline signing up for, or signing in to your Hubpages account. No data is shared with Facebook unless you engage with this feature. (Privacy Policy)
    MavenThis supports the Maven widget and search functionality. (Privacy Policy)
    Google AdSenseThis is an ad network. (Privacy Policy)
    Google DoubleClickGoogle provides ad serving technology and runs an ad network. (Privacy Policy)
    Index ExchangeThis is an ad network. (Privacy Policy)
    SovrnThis is an ad network. (Privacy Policy)
    Facebook AdsThis is an ad network. (Privacy Policy)
    Amazon Unified Ad MarketplaceThis is an ad network. (Privacy Policy)
    AppNexusThis is an ad network. (Privacy Policy)
    OpenxThis is an ad network. (Privacy Policy)
    Rubicon ProjectThis is an ad network. (Privacy Policy)
    TripleLiftThis is an ad network. (Privacy Policy)
    Say MediaWe partner with Say Media to deliver ad campaigns on our sites. (Privacy Policy)
    Remarketing PixelsWe may use remarketing pixels from advertising networks such as Google AdWords, Bing Ads, and Facebook in order to advertise the HubPages Service to people that have visited our sites.
    Conversion Tracking PixelsWe may use conversion tracking pixels from advertising networks such as Google AdWords, Bing Ads, and Facebook in order to identify when an advertisement has successfully resulted in the desired action, such as signing up for the HubPages Service or publishing an article on the HubPages Service.
    Author Google AnalyticsThis is used to provide traffic data and reports to the authors of articles on the HubPages Service. (Privacy Policy)
    ComscoreComScore is a media measurement and analytics company providing marketing data and analytics to enterprises, media and advertising agencies, and publishers. Non-consent will result in ComScore only processing obfuscated personal data. (Privacy Policy)
    Amazon Tracking PixelSome articles display amazon products as part of the Amazon Affiliate program, this pixel provides traffic statistics for those products (Privacy Policy)
    ClickscoThis is a data management platform studying reader behavior (Privacy Policy)