This is Dr. Nancy Zeliff. I will be discussing Chapter 3 with you, which is on Databases and Data Warehouses and how these support the Analytics-Driven Organization. This chapter is divided into two lectures. 3.1 is on Databases
The objectives you should attain in the first part are: List and describe the key characteristics of a relational database, and define five software components of a database management system.
Business intelligence or BI consists of collective information that gives you the ability to make effective, important, and strategic business decisions. It is a resource/component of the overall framework or field of analytics. Analytics is the science of fact-based decision making and has been and continues to be a strong focus for the business world. To create BI, you need data and information and then use the right IT tools to create BI from information. You need data, data from databases, data warehouses, and data-mining tools to get the information you need.
As businesses work with IT tools, two types of information processing are completed. Online transaction processing (OLTP) is the gathering of input information, processing that information, and updating existing information from operational databases. For example, one could query an operational database to see how many products individually sold over $10,000 last month and how much money was spent last month on radio advertising. While the results of these queries would be helpful, you need to combine product and advertising information to perform online analytical processing.
Online analytical processing (OLAP) is the manipulation of information to support decision making. For example, an Australian company has created a data warehouse that houses customer information (including census data and lifestyle codes), its financial and insurance products, and its marketing campaign information. From the data warehouse, agents can view all the products a given customer has purchased and what marketing campaigns a given customer is likely to respond to.
Therefore, online transaction processing supports operational processing such as sales orders and accounts receivable. It is supported by operational databases and database management systems. Online analytical processing helps build business intelligence and is supported by data warehouses and data-mining tools.
Companies and organizations uses databases for organizing and storing basic transaction-oriented information. A database is a collection of information that you organize and access according to the logical structure of that information. A relational database is a series of logically related two-dimensional tables or files to store information. A “relation” exists between data in tables or files. Students at Northwest are assigned a 919 number. Within the Banner database at Northwest, the Registrar’s Office file has the primary key or field of a student’s 919 number, which leads to a file or table about each student, specifically name, local address, phone number, permanent address, Social Security number, birthdate, major, and advisor. The Financial Assistance Office would have a separate file or table with the 919 number as the primary key or field about each student’s financial aid information, such as grants, loans, and scholarships received. Banner or CatPaws as we know it is a relational database, where relations exist between multiple files and tables. Redundancy of data is decreased because of the relationship of data among tables and files.
The characteristics of a database then are that databases are collections of information, created with logical structures that include logical ties within the information and built-in integrity constraints.
A data dictionary contains the logical structure for the information in a database. Before you can enter information into a database, you must define the data dictionary for all the tables and their fields. For example, a student field in the Banner database at Northwest would consist of many fields, including first name, last name, middle name, maiden name, 919 number, local address, permanent address, and others.
In a relational database, you must create ties or relationships in the information that show how the files relate to each other. You must specify the primary key of each file. The primary key is a field or group of fields that uniquely describes each record. At Northwest, the primary key for student records is the 919 number. It is also the primary key for faculty and staff. A foreign key is the primary key of one file that appears in another file. A student’s 919 number would be the foreign key in files or tables within Banner for various academic purposes: Admissions, Registrar’s, Financial Assistance, and Residential Life.
Integrity constraints are rules that help ensure the quality of information. Two built-in integrity constraints are the data dictionary, which defines the information or fields in a database, and the foreign keys. Foreign keys are primary keys that are found in more than one table or file.
In word processing, you create and work with documents; in spreadsheets, you create and work in workbooks. With a database management system or DBMS, you work with databases. A DBMS helps you specify the logical requirements for a database and access and use the information in a database.
The five important software components in a DBMS are an engine, data definition subsystem, data manipulation subsystem, application generation subsystem, and data administration subsystem.
The DBMS engine accepts logical requests and converts them into the physical equivalents and accesses the database and data dictionary on a storage device. The physical view of information deals with how the information is physically arranged, stored, and accessed on a storage device. The DBMS handles the physical view. The logical view of information focuses on how information is arranged and accessed to meet particular business needs and user preferences and experience.
The data definition subsystem helps you create and maintain the data dictionary and define the structure of the files in a database. You will experience this in designing the table structure of a database in Microsoft Access in your casebook. You will define fields in various tables, their data type, and identify which fields are the primary keys. You can also set properties of data in fields, make some fields required, limit the size of fields, and assign validation rules, such as “greater than or equal to 4 and not greater than 8.”
The data manipulation subsystem of a DBMS helps you add, change, and delete information in a database and query the database to find valuable information. This subsystem is the primary interface with users and includes report generators and queries
A view allows a user to see the contents of a database file, make whatever changes wanted, perform simple sorting, and query to find the location of specific information. In Microsoft Access, you can view a table in Datasheet View or Design View.
A report generator helps the user quickly define formats of reports and what information should be seen in a report. Fields that should appear in a report can be selected, grouped and sorted, and subtotaled and totaled. Predefined report formats are available
A query-by-example tool helps a user graphically design the answer to a question. Suppose Janielle Smith from Triple A Homes (Customer Number 4567) called and ordered a delivery of concrete. Although she can’t remember the name of the driver, she would like to have the driver that comes out the most often to deliver concrete to Triple A Homes. Fortunately, QBE tools can answer this question. When a QBE is performed in Microsoft Access, a user identifies the files in which the needed information is located, drags any necessary fields from the files to the QBE grid, and specifies selection criteria. A query performed for Janielle will yield the answer of John Robertson as the driver who usually delivers concrete to Triple A Homes. Without the logical relationships being correctly defined, this QBE query would not have worked properly.
Structured Query Language or SQL is a standardized fourth-generation query language found in most DBMSs. SQL performs the same function as QBE, except you perform the query by creating a statement instead of pointing, clicking, and dragging. The basic form of an SQL statement is Select, From, and Where. After SELECT, list the fields of information you want. After FROM, specify what logical relationships to use. After WHERE, specify any selection criteria.
The application generation subsystem of a DBMS contains facilities to help you develop transaction-intensive applications. These types of applications usually require that you perform a detailed series of tasks to process a transaction. Application generation subsystems include tools to create visually appealing and easy-to-use data entry screens and interfaces. SQL and application generation subsystems are usually used by IT specialists. Information users, like you, need to focus on views, report generators, and QBE tools. These will help you find information in a database and perform queries so you can start to build and use business intelligence.
The data administration subsystem of a DBMS helps users manage the overall database environment by providing facilities for backup and recovery, security management, query optimization, concurrency control, and change management. This subsystem is most often used by the database administrator.
Backup is a simple copy of the information stored on a computer. Recovery is the process of reinstalling the backup information in the event the information was lost. A plan for backup and recovery is called business continuity planning or disaster recovery planning and is discussed in Chapter 7.
Security management facilities allow users to control who has access to what information and what type of access those people have. Remember the acronym CRUD - create, read, update, and delete. Identify who can perform the functions of creating, reading, updating, and deleting. Query optimization facilities take queries from the users (QBE or SQL) and restructures them to minimize response times. They find the shortest route to the information needed. Reorganization facilities maintain statistics concerning how the DBMS engine physically access information and reorganizes how information is physically stored. If a user frequently accesses a file by a specific order, the reorganization facilities may maintain the file in that presorted order by creating an index that maintains the sorted order in that file.
Concurrency control facilities ensure the validity of database updates when multiple users attempt to access and change the same information. Consider the online registration feature of Northwest’s CatPaws. When you and another student are trying to register for a class with just one seat remaining, concurrency management allows the first student to enroll. What happens with the student who does not get his or her desired class? This is concurrency control. Change management facilities allow the user to assess the impact of proposed structural changes to a database. Information managers will not set up these backup and recovery, security management, query optimization, reorganization, or concurrency controls, but they use the database and see the effects of change management and concurrency controls.
This ends lecture 3.1.